Blocking VoilaBot
Tuesday, August 19th, 2008
There’s a web crawler out there called VoilaBot that is hammering my site with needless crawls and appears to ignore robots.txt files completely. Apparently it’s a crawler for a French portal/search engine. If you need to block this bot from your site, there are two things you can do:
Firewall
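If you’ve got a firewall on your box, you can deny access to the two IP ranges 81.52.143.0/24 and 193.252.149.0/24. That’ll get them off your back for as long as the bot keeps crawling from those addresses. For Linux machines with an iptables firewall, running the following as root will do the trick (rules added this way are lost on reboot unless you save them):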
iptables -A INPUT --source 193.252.149.0/24 -j DROP
iptables -A INPUT --source 81.52.143.0/24 -j DROP
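If you want to double-check that an address from your access log really falls inside those two ranges before you drop it, here is a minimal Python sketch using the standard ipaddress module. The function name is_voilabot_ip and the sample addresses are made up for illustration; only the two /24 ranges come from above.

```python
import ipaddress

# The two ranges named above; everything else here is illustrative.
VOILABOT_NETWORKS = [
    ipaddress.ip_network("81.52.143.0/24"),
    ipaddress.ip_network("193.252.149.0/24"),
]


def is_voilabot_ip(address):
    """Return True if the address lies in one of the two ranges above."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in VOILABOT_NETWORKS)


if __name__ == "__main__":
    # Hypothetical sample addresses, e.g. copied from an access log.
    for sample in ("81.52.143.17", "193.252.149.200", "192.0.2.10"):
        print(sample, is_voilabot_ip(sample))
```

A /24 covers every address that shares the first three octets, so anything starting with 81.52.143. or 193.252.149. will match.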
htaccess
If you don’t want to firewall the bot and you’re running Apache, you can deny it access to your website by putting a .htaccess file in your web root directory with the following contents:
order allow,deny
allow from all
deny from 81.52.143.
deny from 193.252.149.
Don’t trust VoilaBot to honour your robots.txt file; it won’t.