How can I block specific search engines server-wide?

IT_Architect

Verified User
Joined
Feb 27, 2006
Messages
1,088
I have looked for a maintained list of bot IP addresses, but I have found none. It would be nice if something like this would work.

Code:
################################################################################
#                               Block Bad Bots
################################################################################
<Directory /var/www/>
#
SetEnvIfNoCase User-Agent "^BaiDuSpider" bad_bots
SetEnvIfNoCase User-Agent "^bot*" bad_bots
SetEnvIfNoCase User-Agent "^Cityreview" bad_bots
SetEnvIfNoCase User-Agent "^crawl" bad_bots
SetEnvIfNoCase User-Agent "^Dotbot" bad_bots
SetEnvIfNoCase User-Agent "^Exabot" bad_bots
SetEnvIfNoCase User-Agent "^Java" bad_bots
SetEnvIfNoCase User-Agent "^MJ12bot" bad_bots
SetEnvIfNoCase User-Agent "^NG\ 1.x (Exalead)" bad_bots
SetEnvIfNoCase User-Agent "^Sogou" bad_bots
SetEnvIfNoCase User-Agent "^Sosospider" bad_bots
SetEnvIfNoCase User-Agent "^spider" bad_bots
SetEnvIfNoCase User-Agent "^Twiceler" bad_bots
SetEnvIfNoCase User-Agent "^Yandex" bad_bots
SetEnvIfNoCase User-Agent "^YandexBot" bad_bots
SetEnvIfNoCase User-Agent "^\s+$" bad_bots
SetEnvIfNoCase User-Agent "^$" bad_bots
SetEnvIf Remote_Addr "212\.100\.254\.105" bad_bots
#
<Files *>
Order allow,deny
Allow from all
Deny from env=bad_bots
</Files>
</Directory>
### End Block Bad Bots
but it doesn't.
 
I have looked for a maintained list of bot IP addresses, but I have found none. It would be nice if something like this would work. [...]

Add your code to /etc/httpd/conf/extra/httpd-includes.conf, but change it a little first. I have done that for you, so just add this to httpd-includes.conf and restart Apache:

Code:
SetEnvIfNoCase User-Agent "BaiDuSpider" bad_bots
SetEnvIfNoCase User-Agent "bot*" bad_bots
SetEnvIfNoCase User-Agent "Cityreview" bad_bots
SetEnvIfNoCase User-Agent "crawl" bad_bots
SetEnvIfNoCase User-Agent "Dotbot" bad_bots
SetEnvIfNoCase User-Agent "Exabot" bad_bots
SetEnvIfNoCase User-Agent "Java" bad_bots
SetEnvIfNoCase User-Agent "MJ12bot" bad_bots
SetEnvIfNoCase User-Agent "NG\ 1.x (Exalead)" bad_bots
SetEnvIfNoCase User-Agent "Sogou" bad_bots
SetEnvIfNoCase User-Agent "Sosospider" bad_bots
SetEnvIfNoCase User-Agent "spider" bad_bots
SetEnvIfNoCase User-Agent "Twiceler" bad_bots
SetEnvIfNoCase User-Agent "Yandex" bad_bots
SetEnvIfNoCase User-Agent "YandexBot" bad_bots
SetEnvIfNoCase User-Agent "\s+$" bad_bots
SetEnvIfNoCase User-Agent "$" bad_bots
SetEnvIf Remote_Addr "212\.100\.254\.105" bad_bots
<Location />
Order Allow,Deny
Deny from env=bad_bots
Allow from all
</Location>
 
Hello,
Did you try changing "/var/www/" to "/home/"?
I did, but it didn't work. Actually, I had thought about trying this before but assumed it wouldn't work, so I didn't, which was kind of dumb, because it only takes a minute to try. Thanks for scratching your head with me. I'll try ditto's idea next, but the last time I tried it without the <Directory> tag, it hung Apache. We'll see what happens without the carets (^), as he shows.

Thanks!
 
Add your code to /etc/httpd/conf/extra/httpd-includes.conf, but change it a little first. I have done that for you, so just add this to httpd-includes.conf and restart Apache:
I did. Apache didn't choke on restart, so we'll know in a few hours.

Thanks for your help!
 
Add your code to /etc/httpd/conf/extra/httpd-includes.conf, but change it a little first. I have done that for you, so just add this to httpd-includes.conf and restart Apache:
It didn't work. When I requested a web page on the server, it returned the following error:

Forbidden
You don't have permission to access /listings.php on this server.
Additionally, a 403 Forbidden error was encountered while trying to use an ErrorDocument to handle the request.
Apache/2 Server at www.mydomain.com Port 80

Thanks for trying.
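
A likely cause of the blanket 403, offered as a sketch rather than a confirmed diagnosis: with the leading ^ removed, the pattern "$" matches the end of every User-Agent string, so every request gets bad_bots set and is denied. That includes the request for the ErrorDocument. The unanchored "bot*" (which means "bo" followed by any number of t's) and the generic "crawl" and "spider" patterns are also broader than the bot names they were meant to catch. Assuming the Apache 2.2-style Order/Allow/Deny directives used above, a narrower version might look like this, with only a few of the bot names shown:

```apache
# Match specific bot names anywhere in the User-Agent (case-insensitive).
SetEnvIfNoCase User-Agent "BaiDuSpider" bad_bots
SetEnvIfNoCase User-Agent "MJ12bot" bad_bots
SetEnvIfNoCase User-Agent "Yandex" bad_bots

# The dot and parentheses are regex metacharacters, so escape them
# to match the literal string "NG 1.x (Exalead)".
SetEnvIfNoCase User-Agent "NG 1\.x \(Exalead\)" bad_bots

# Empty or whitespace-only User-Agent: keep BOTH anchors.
# A bare "$" matches every string and would block all visitors.
SetEnvIfNoCase User-Agent "^\s*$" bad_bots

# Anchor the IP so it cannot match inside a longer address.
SetEnvIf Remote_Addr "^212\.100\.254\.105$" bad_bots

<Location />
    # With Order Allow,Deny a matching Deny overrides the Allow,
    # so only requests flagged as bad_bots are refused.
    Order Allow,Deny
    Allow from all
    Deny from env=bad_bots
</Location>
```

On Apache 2.4 these access directives are only available through mod_access_compat. The native form there would be Require all granted together with Require not env bad_bots inside a <RequireAll> block.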
 