Block IP addresses which overload the server

A small improvement for readability over my post #12,
Great. I'm not into regex and these things, so no clue how to write them. And I didn't know that "BrowserMatchNoCase" could be used the same way together with the rewrite.
This indeed gives a much better overview; I'm going to try that one, thank you!!
 
Sorry to bump this post after half a year, but it seems this is not really blocking bots. Bots that are in my list still seem to be hammering my site, getting a 301, which most likely is causing delays for other users visiting.

Short example from my httpd-includes.conf:

Code:
<IfModule mod_rewrite.c>
    RewriteEngine On
    BrowserMatchNoCase "adscanner" bad_bot
    BrowserMatchNoCase "Barkrowler/0.9" bad_bot
    BrowserMatchNoCase "Dotbot" bad_bot
    SetEnvIfNoCase Request_URI "\.env" bad_bot
    SetEnvIfNoCase Request_URI "xmlrpc\.php" bad_bot

    RewriteCond %{ENV:BAD_BOT} !^$
    RewriteRule (.*) - [F,L]
</IfModule>

I don't know exactly what that rewrite rule does. Shouldn't that send them to oblivion? Anyway, this is from my logs:
Code:
54.39.177.48 - - [11/Aug/2025:00:35:58 +0200] "GET /threads/gitarist-george-kooymans-golden-earring-overleden.46714/ HTTP/1.1" 301 825 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:35:59 +0200] "GET /threads/rocklegende-ozzy-osbourne-76-enkele-weken-na-afscheidsshow-overleden.46707/ HTTP/1.1" 301 844 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:36:00 +0200] "GET /threads/google-stopt-eind-volgende-maand-met-url-verkorter-goo-gl.46726/ HTTP/1.1" 301 833 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:36:01 +0200] "GET /threads/gitarist-george-kooymans-golden-earring-overleden.46714/ HTTP/1.1" 301 825 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:36:02 +0200] "GET /threads/rocklegende-ozzy-osbourne-76-enkele-weken-na-afscheidsshow-overleden.46707/ HTTP/1.1" 301 844 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:36:02 +0200] "GET /threads/google-stopt-eind-volgende-maand-met-url-verkorter-goo-gl.46726/ HTTP/1.1" 301 833 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:36:03 +0200] "GET /threads/gitarist-george-kooymans-golden-earring-overleden.46714/ HTTP/1.1" 301 825 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
54.39.177.48 - - [11/Aug/2025:00:36:04 +0200] "GET /threads/rocklegende-ozzy-osbourne-76-enkele-weken-na-afscheidsshow-overleden.46707/ HTTP/1.1" 301 844 "-" "Mozilla/5.0 (compatible; YaK/1.0; http://linkfluence.com/; [email protected])"
216.244.66.231 - - [11/Aug/2025:20:38:27 +0200] "GET /threads/582/ HTTP/1.1" 301 3088 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; [email protected])"
216.244.66.231 - - [11/Aug/2025:20:38:30 +0200] "GET /threads/612/ HTTP/1.1" 301 3078 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; [email protected])"
216.244.66.231 - - [11/Aug/2025:20:38:33 +0200] "GET /threads/6415/ HTTP/1.1" 301 3094 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; [email protected])"
216.244.66.231 - - [11/Aug/2025:20:38:35 +0200] "GET /threads/8724/ HTTP/1.1" 301 3086 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; [email protected])"

So all 301 redirects. But there are so many in the logs. Could this indeed be the reason that posts on my forum are often a bit delayed? Because I see loads of this in the logs.
If yes, is there a better way to block these so they can't cause load?
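If I read the mod_rewrite docs right, those last two lines should answer with a 403 directly and never redirect; a commented sketch of how I understand them (my reading, not verified):

Code:
# Sketch of how I read the final block (my understanding, not verified):
RewriteCond %{ENV:BAD_BOT} !^$   # true only when one of the rules above set the bad_bot variable
RewriteRule (.*) - [F,L]         # "-" means no substitution; [F] answers 403 Forbidden
                                 # directly, so no 301/302 redirect should be involved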
 
Hi Ri,
I have this and it will block it all:

Code:
<IfModule mod_rewrite.c>
    RewriteEngine On

    # TikTok for AI
    BrowserMatchNoCase "Bytespider" bad_bot
    # Claude for AI
    BrowserMatchNoCase "ClaudeBot" bad_bot
    # Huawei search engine
    BrowserMatchNoCase "PetalBot" bad_bot
    # Amazon for Alexa
    BrowserMatchNoCase "Amazonbot" bad_bot
    # Installatron Plugin crawler
    # BrowserMatchNoCase "Installatron Plugin" bad_bot
    # UptimeRobot
    # BrowserMatchNoCase "UptimeRobot" bad_bot
    # Facebook
    # BrowserMatchNoCase "facebookexternalhit" bad_bot
    # Extensive crawling
    # BrowserMatchNoCase "ALittle Client" bad_bot

    # Usually bad HTTP client libraries
    BrowserMatchNoCase "fasthttp" bad_bot
    BrowserMatchNoCase "Linux Mozilla" bad_bot
    BrowserMatchNoCase "scalaj-http" bad_bot
    BrowserMatchNoCase "Go-http-client" bad_bot
    BrowserMatchNoCase "python-requests" bad_bot
    BrowserMatchNoCase "python-urllib" bad_bot
    BrowserMatchNoCase "Apache-HttpClient" bad_bot
    BrowserMatchNoCase "aiohttp" bad_bot

    SetEnvIfNoCase Request_URI "xmlrpc\.php" bad_bot
    SetEnvIfNoCase Request_URI "\.env" bad_bot

    RewriteCond %{ENV:BAD_BOT} !^$
    RewriteRule (.*) - [F,L]
</IfModule>

See:
[12/Aug/2025:15:47:33 +0200] "GET /azie/oktinath/photooktinath37.html HTTP/1.1" 403 4012 "-" "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36"

[12/Aug/2025:15:02:51 +0200] "GET / HTTP/1.1" 403 365 "https://www.seokicks.de/backlinks/www.***.nl" "Mozilla/5.0 (Linux; Android 7.0;) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)"
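If you want to check whether the variable actually gets set on your side, you could also log it (a sketch, untested; the log path is just an example):

Code:
# Sketch (untested): write a debug log showing the bad_bot env var per request
LogFormat "%h %t \"%r\" %>s bad_bot=%{bad_bot}e" botdebug
CustomLog /var/log/httpd/botdebug.log botdebug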
 
Hi Stefan. It's Richard, not Ri. :)

Thank you, but that's the same as what I have, as you can see. As stated, I've shortened the list a little bit.

But then I don't understand why you get the 403s while in my case they manage to get 301s.
Maybe that has to do with a .htaccess in my public folder?

I've got these in there; maybe they overrule something? Maybe I'd better do that via DA itself?
Code:
# Redirect HTTP to HTTPS on the same host
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# Redirect non-www to www (HTTPS only)
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
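If those .htaccess rewrites are interfering, maybe blocking without mod_rewrite would sidestep the whole interaction; a sketch (untested) that reuses the same bad_bot variable:

Code:
# Sketch (untested): deny flagged requests via mod_authz_core,
# independent of any mod_rewrite processing in .htaccess
<Location "/">
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</Location>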
 
You might change the rule's flag from "L" to "END".
I don't think so:
Using the [END] flag terminates not only the current round of rewrite processing (like [L]) but also prevents any subsequent rewrite processing from occurring in per-directory (htaccess) context.
So that would also stop .htaccess files from being used for rewriting on all domains, if I understand that correctly.

Besides that, it's working this same way for Stefan. So for now I'd rather leave the rest of the rewriting in place and only comment out the non-www to www stuff, to see if that makes anything better.
 
The workflow is the same as with the other flags, so if the "Condition" is missing, it will execute the "RewriteRule" without checking any condition.

The L flag still checks whether there are any "R"-flag rules in the current directory context or not.
The END flag ends rewrite processing immediately.
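Roughly (illustration only):

Code:
RewriteRule ^ - [F,L]    # ends this round of rewriting; per-directory
                         # (.htaccess) rounds can still run afterwards
RewriteRule ^ - [F,END]  # ends rewrite processing immediately, including
                         # any later per-directory rounds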
 
Code:
# Redirect HTTP to HTTPS on the same host
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

# Redirect non-www to www (HTTPS only)
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [R=301,L]

I looked at a site with almost the same rules in its .htaccess; there I have this one:

Code:
RewriteCond %{HTTPS} off
# First rewrite to HTTPS:
# Don't put www. here. If it is already there it will be included, if not
# the subsequent rule will catch it.
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]

# Now, rewrite any request to the wrong domain to use www.
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule (.*) https://www.%{HTTP_HOST}%{REQUEST_URI} [L,R]

But in the logs of that site I don't see the 301 redirects, only 403 Forbidden.
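(One difference I do notice, for what it's worth: [R] without an explicit code sends a 302, not a 301:)

Code:
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]      # defaults to 302 Found
RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]  # explicit 301 Moved Permanently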
 
But in the logs of that site I don't see the 301 redirects, only 403 Forbidden.
Odd. This was from this morning, before I disabled the .htaccess redirect to www.

Code:
216.244.66.231 - - [12/Aug/2025:06:02:03 +0200] "GET /threads/1340/ HTTP/1.1" 301 3121 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexp>
216.244.66.231 - - [12/Aug/2025:06:02:07 +0200] "GET /threads/1570/ HTTP/1.1" 301 3082 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexp>
216.244.66.231 - - [12/Aug/2025:06:02:10 +0200] "GET /threads/2187/ HTTP/1.1" 301 3111 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexp
I added DotBot/1.2 to the list now, but still....

I also thought meta-externalagent would be enough, but I see:
Code:
2a03:2880:f800:2:: - - [12/Aug/2025:20:14:04 +0200] "GET /whats-new/posts/283251/ HTTP/2.0" 200 87424 "-" "meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)"
so I added meta-externalagent/1.1 to the list now too. But I thought this wasn't necessary.

Same for ClaudeBot; there is now a ClaudeBot/1.0.
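(As far as I know, BrowserMatchNoCase takes a regex that may match anywhere in the User-Agent string, so the version-less entries should already cover these:)

Code:
# Substring regex match: "meta-externalagent" already matches "meta-externalagent/1.1"
BrowserMatchNoCase "meta-externalagent" bad_bot
# ...and "ClaudeBot" already matches "ClaudeBot/1.0"
BrowserMatchNoCase "ClaudeBot" bad_bot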
 
This is my complete httpd-includes.conf file. It looks good to me, so why does it not load? Is there a way to check whether it's loaded?

Code:
<IfModule mod_rewrite.c>
    RewriteEngine On
    BrowserMatchNoCase "adscanner" bad_bot
    BrowserMatchNoCase "Amazonbot" bad_bot
    BrowserMatchNoCase "Applebot" bad_bot
    BrowserMatchNoCase "aiohttp" bad_bot
    BrowserMatchNoCase "Amazonbot" bad_bot
    BrowserMatchNoCase "Amazonbot/0.1" bad_bot
    BrowserMatchNoCase "anthropic-ai" bad_bot
    BrowserMatchNoCase "Applebot" bad_bot
    BrowserMatchNoCase "AspiegelBot" bad_bot
    BrowserMatchNoCase "Baiduspider" bad_bot
    BrowserMatchNoCase "Barkrowler" bad_bot
    BrowserMatchNoCase "BLEXBot" bad_bot
    BrowserMatchNoCase "BoardReader" bad_bot
    BrowserMatchNoCase "Bytespider" bad_bot
    BrowserMatchNoCase "ChatGPT-User" bad_bot
    BrowserMatchNoCase "ClaudeBot" bad_bot
    BrowserMatchNoCase "Datanyze" bad_bot
    BrowserMatchNoCase "Dotbot" bad_bot
    BrowserMatchNoCase "facebook" bad_bot
    BrowserMatchNoCase "facebookcatalog" bad_bot
    BrowserMatchNoCase "facebookexternalhit" bad_bot
    BrowserMatchNoCase "Go-http-client" bad_bot
    BrowserMatchNoCase "GPTBot" bad_bot
    BrowserMatchNoCase "ImagesiftBot" bad_bot
    BrowserMatchNoCase "Kinza" bad_bot
    BrowserMatchNoCase "LieBaoFast" bad_bot
    BrowserMatchNoCase "MauiBot" bad_bot
    BrowserMatchNoCase "Mb2345Browser" bad_bot
    BrowserMatchNoCase "meta-externalagent" bad_bot
    BrowserMatchNoCase "MicroMessenger" bad_bot
    BrowserMatchNoCase "MJ12bot" bad_bot
    BrowserMatchNoCase "msnbot-media" bad_bot
    BrowserMatchNoCase "msnbot-MM" bad_bot
    BrowserMatchNoCase "nbot" bad_bot
    BrowserMatchNoCase "Petalbot" bad_bot
    BrowserMatchNoCase "Scrapy" bad_bot
    BrowserMatchNoCase "Scrapy *\([a-zA-Z]+\) *(.+)" bad_bot
    BrowserMatchNoCase "SemrushBot" bad_bot
    BrowserMatchNoCase "SemrushBot-BA" bad_bot
    BrowserMatchNoCase "SemrushBot-BM" bad_bot
    BrowserMatchNoCase "SemrushBot-COUB" bad_bot
    BrowserMatchNoCase "SemrushBot-CT" bad_bot
    BrowserMatchNoCase "SemrushBot-SI" bad_bot
    BrowserMatchNoCase "SemrushBot-SWA" bad_bot
    BrowserMatchNoCase "serpstatbot" bad_bot
    BrowserMatchNoCase "SiteAuditBot" bad_bot
    BrowserMatchNoCase "Sogou" bad_bot
    BrowserMatchNoCase "spaziodat" bad_bot
    BrowserMatchNoCase "SplitSignalBot" bad_bot
    BrowserMatchNoCase "YaK" bad_bot
    BrowserMatchNoCase "YandexBot" bad_bot
    BrowserMatchNoCase "YandexImages" bad_bot

    SetEnvIfNoCase Request_URI "\.env" bad_bot
    SetEnvIfNoCase Request_URI "xlmrpc\.php" bad_bot

    RewriteCond %{ENV:BAD_BOT} !^$
    RewriteRule (.*) - [F,L]
</IfModule>
 
This is my complete httpd-includes.conf file. It looks good to me, so why does it not load? Is there a way to check whether it's loaded?
Hi Richard,
Try this command; it will show you all included files:
apachectl -t -D DUMP_INCLUDES


Why don't you use ModSecurity?
 
Hello Hostmavi.

Thank you, it shows correctly.
(162) /etc/httpd/conf/extra/httpd-includes.conf

But why is it ignored then, or why is a 301 or even a 200 shown in the log instead of a 403?
Code:
216.244.66.231 - - [13/Aug/2025:00:28:01 +0200] "GET /threads/iemand-windows-10-al-geinstalleerd.11642/ HTTP/1.1" 200 150513 "-" "Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; [email protected])"

Why don't you use ModSecurity?
I tried that once and then too much got blocked; to be honest, I'm too unfamiliar with it. But the httpd-includes should work, as it does for others like Stefan, so I can't understand why it does not block the bots in my case.
I don't have -any- customisations in Apache, except for this httpd-includes.conf and httpd-info.conf, which has my IP to see the server status.
 
This is my complete httpd-includes.conf file. It looks good to me, so why does it not load? Is there a way to check whether it's loaded?


Try updating the rules with the following:

Code:
        RewriteEngine On
        RewriteBase     /
        RewriteOptions  InheritBefore

just around the line:

Code:
RewriteEngine On
 
RewriteEngine On
RewriteBase /
RewriteOptions InheritBefore
That one stopped my httpd from working.

Code:
Aug 13 15:51:50 server.mydomain.nl httpd[2629507]: AH00526: Syntax error on line 3 of /etc/httpd/conf/extra/httpd-includes.conf:
Aug 13 15:51:50 server.mydomain.nl httpd[2629507]: RewriteBase: only valid in per-directory config files

So I commented out the RewriteBase line, and then httpd restarts again.
Unfortunately that doesn't help either. I just added the Seekportbot. However, something must have changed, because now it's running wild with 301s instead of the 200s from before. It's just not giving 403s yet, for some reason.

Edit: Also showing a lot of 303s now for that Seekportbot.

The last line of the httpd-includes.conf is this:
RewriteRule (.*) - [F,L]
so in fact that is a redirect, right? Could it be that for some reason the F is not working and the redirect to a 403 does not happen? Or something like that?
It's odd, because it's just plain Apache and the same on multiple servers.
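One more thing I may still try (untested, and as far as I can tell it needs Apache 2.4.8+): RewriteOptions InheritDownBefore, which unlike RewriteBase is valid in server context and should force these rules to run before any per-directory (.htaccess) rules:

Code:
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Sketch (Apache 2.4.8+, untested): make these server-level rules run
    # before any per-directory (.htaccess) rewrite rules
    RewriteOptions InheritDownBefore
    BrowserMatchNoCase "Dotbot" bad_bot
    RewriteCond %{ENV:BAD_BOT} !^$
    RewriteRule (.*) - [F,L]
</IfModule>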
 
Ah, I see. This is because I don't use httpd-includes.conf any longer. When it comes to Apache, I block bots in templates:

- /usr/local/directadmin/data/templates/custom/cust_httpd.CUSTOM.2.pre - added in <VirtualHost ...></VirtualHost> context.
- /usr/local/directadmin/data/templates/custom/cust_httpd.CUSTOM.3.pre - added in <Directory ...></Directory> context.

Either of the two can be used.

And I use RewriteCond %{HTTP_USER_AGENT} instead of BrowserMatchNoCase, something like here:

- https://forum.directadmin.com/threa...-spiders-with-common-names.74495/#post-387434
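Roughly like this (a sketch; trim the alternation to the bots you actually see):

Code:
RewriteEngine On
# Sketch: one case-insensitive alternation instead of many BrowserMatchNoCase lines
RewriteCond %{HTTP_USER_AGENT} (Bytespider|ClaudeBot|PetalBot|Amazonbot|DotBot|MJ12bot) [NC]
RewriteRule .* - [F,L]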
 
Oof... I'm not good with regexes, but I'll have a look at it. It should work in the httpd-includes too. I'm just too curious as to why it works for others and not for me.
But thank you for thinking along with me on this.
 
@Richard G
Try debugging with an empty domain, e.g. create a new subdomain to test the filter. That way you make sure no .htaccess conflicts with your filter.
 