Last week a guy asked me if I could develop some protection against bad robots. I had previously improved a script that defends against the Google proxy hack, and another one that serves your robots.txt file only to the robots authorized to read it, so he was very enthusiastic.
A few days ago I did a little survey on the topic and found a lot of .htaccess rules that reject certain hosts, but none of them were automatic, so the challenge was set. Basically I used this resource to create this automatic solution.
1. Open your robots.txt and insert this line into it:
Code:
Disallow: /core
If you don't have a robots.txt yet, create one with this content:
Code:
User-agent: *
Disallow: /core
The name of the restricted folder is not important, but it helps if human attackers find it attractive to look into, since this folder will be the live bait. Well-behaved robots obey the crawling restrictions in robots.txt and stay out; bad robots and hackers ignore them, so they are the ones who walk into the folder.
2. Create the folder specified in the robots.txt file on your hosting space (in my example this is root/core) and upload an index.php file with the following content.
PHP Code:
<?php
$ip = $_SERVER["REMOTE_ADDR"];
$logfile = dirname(__FILE__) . '/bannolnilog.txt';
// collect the IP addresses in the logfile, one per line
$fp = fopen($logfile, 'a');
if ($fp) {
    fputs($fp, $ip . "\n");
    fclose($fp);
}
echo "Your IP was logged for security reasons and your visit is now over.";
?>
3. As you can see, I defined a $logfile where the IP addresses are collected, so we need to upload a blank text file called bannolnilog.txt (permissions 644) to the same (core) folder. The web server must be able to write to it.
4. We need to upload another PHP file that checks whether the visitor is banned whenever a page is requested. I named this file validator.php (in my example it sits in the root, next to the core folder), and its content is the following.
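A bot that keeps hitting the bait folder gets its IP appended on every visit, so the logfile can grow with duplicates. Here is a minimal sketch of logging each address only once. The function name logIpOnce is my own invention for illustration and is not part of the script above.

```php
<?php
// Hypothetical helper (not part of the original script): append $ip to
// $logfile only if it is not already listed, one address per line.
function logIpOnce($ip, $logfile)
{
    $known = is_file($logfile)
        ? file($logfile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
        : array();
    if ($known === false) {
        $known = array(); // unreadable file: treat as empty
    }
    if (in_array($ip, array_map('trim', $known), true)) {
        return false; // already logged, nothing written
    }
    // LOCK_EX keeps two simultaneous requests from interleaving their writes
    file_put_contents($logfile, $ip . PHP_EOL, FILE_APPEND | LOCK_EX);
    return true;
}
```

In index.php you would call it as logIpOnce($ip, $logfile) in place of the fopen/fputs/fclose lines.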
PHP Code:
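<?php
$ip = $_SERVER["REMOTE_ADDR"];
// validator.php is assumed to sit next to the core folder
$logfile = dirname(__FILE__) . "/core/bannolnilog.txt";
$target = file($logfile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
if ($target === false) {
    $target = array(); // logfile missing or unreadable: ban nobody
}
foreach ($target as $item) {
    $item = trim($item);
    // exact match only, so a logged 1.2.3.4 does not also ban 1.2.3.45
    if ($item !== '' && $item === $ip) {
        header("HTTP/1.0 403 Forbidden");
        exit;
    }
}
?>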
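One detail of the check deserves care: a substring comparison such as stristr treats a visitor from 1.2.3.45 as matching a logged 1.2.3.4, and on PHP 8, where an empty search string matches anything, a single blank line in the logfile would lock out everybody. Here is a minimal sketch of an exact-match check. The helper name isBanned and the sample addresses are my own assumptions for illustration, not part of the original script.

```php
<?php
// Hypothetical helper: true only when $ip equals one of the logged lines.
function isBanned($ip, array $lines)
{
    foreach ($lines as $line) {
        $line = trim($line);
        // Skip blank lines; compare whole addresses, not substrings
        if ($line !== '' && $line === $ip) {
            return true;
        }
    }
    return false;
}

// Sample addresses from the documentation ranges, not real log data
$sample = array("192.0.2.4\n", " \n", "198.51.100.7\n");
var_dump(isBanned('192.0.2.4', $sample));  // bool(true)
var_dump(isBanned('192.0.2.45', $sample)); // bool(false)
```

The second call shows the point: the longer address contains the logged one as a prefix, yet it is not banned.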
5. Insert this line at the very top of your script header, before anything is sent to the browser:
PHP Code:
<?php require "/you/need/to/insert/the/path/here/validator.php";?>
This makes validator.php run before the page is displayed.
I guarantee nothing, but it works very well.
You can lift the bans at any time by emptying the logfile, which deletes the collected IPs.
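If you would rather empty the logfile from a script than by hand, something like the following should do it. This is only a sketch: the function name clearBanLog is hypothetical, and you would have to protect whatever page calls it yourself.

```php
<?php
// Hypothetical helper: empty the ban list without deleting the file,
// so its permissions stay as they were set up.
function clearBanLog($logfile)
{
    // Writing an empty string truncates the file to zero bytes
    return file_put_contents($logfile, '', LOCK_EX) !== false;
}
```

Called as clearBanLog(dirname(__FILE__) . "/core/bannolnilog.txt") from the folder that holds validator.php, it leaves an empty logfile in place, so nobody is banned until the bait folder is visited again.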