wget -e robots=off
It is as simple as the above, and it is something to watch out for when using wget: you may think you have archived or downloaded content when you actually haven't, because a nofollow directive or a robots.txt rule stopped wget from fetching it.
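For example, a mirror run that ignores robots.txt might look like this (example.org is just a placeholder URL):

wget -m -p -k -e robots=off http://example.org/

Without -e robots=off, the very same command silently skips anything robots.txt disallows.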
There are a few ways of doing this, and they all basically involve Apache's reverse proxy ("ProxyPass") feature.
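As a minimal sketch of what the ProxyPass approach looks like (using the originalsite.com and newsite.com names from the example below, and assuming mod_proxy and mod_proxy_http are loaded), the new site's vhost would forward everything to the original:

<VirtualHost *:80>
    ServerName newsite.com
    # Pass every request through to the original site and fix up redirects on the way back
    ProxyPass        / http://originalsite.com/
    ProxyPassReverse / http://originalsite.com/
</VirtualHost>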
1.) Create a normal vhost and simply symlink the root directory of the site you want to mirror.
E.g. originalsite.com and newsite.com, where the original site's document root is:
/vhosts/originalsite.com/httpdocs
You would create the symlink like this:
ln -s /vhosts/originalsite.com/httpdocs vhosts/originalsite.com/........
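The end of that command is cut off above; presumably it points the new site's document root at the original's, along the lines of the following (the /vhosts/newsite.com/httpdocs target is an assumption):

ln -s /vhosts/originalsite.com/httpdocs /vhosts/newsite.com/httpdocs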
This is very handy if you're too busy to download or copy over whatever files you need.
The -D option specifies which domains are allowed. This is needed because I specified -H, which allows wget to follow links to foreign hosts; if you don't restrict the domains, you'll end up crawling the whole internet via ads and other external links, just like a search engine would.
-l 0 specifies unlimited depth, so wget recurses through as many levels as exist.
-e robots=off is important because robots.txt often says you can't view or crawl parts of a site, and wget obeys it unless told otherwise.
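Putting those options together, a full command might look like this (example.com and cdn.example.com are placeholder domains, not from the original):

wget -r -l 0 -H -Dexample.com,cdn.example.com -p -k -e robots=off http://example.com/

Here -r turns on recursion, -p also fetches page requisites such as images and CSS, and -k converts the links so the mirror can be browsed locally.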