Solved Use wildcard to block spam at the end via regexp in bad_sender_hosts, how?

Richard G

I don't know regexp so I need some help.

The various blacklist files in the /etc/virtual directory are based on nwildlsearch, which means that if you use a * wildcard, it has to be at the beginning of the line, for example *.store; you can't use something.* in these files.
At least, that's how I thought the nwildlsearch method worked.

But I thought there should be a better way so I searched and now I found these examples on the internet for nwildlsearch usage:
Code:
[email protected]
*@somespamsite.com
*@someotherspamsite.com
^known-first-part@mail[.].*
^another-known-first-part@.*
with this explanatory text belonging to it:
Note that the * wildcards only work for the leading part of the address. For any other wildcarding – for example, ignoring the trailing part of the address – we have to use a regular expression (cued by starting the line with ^).
That contains the word "regular expression" and then abracadabra starts in my head.

Now we are receiving spam from addresses like this:
mail-koreacentralazolkn19012059.outbound.protection.outlook.com
The mail-koreacentralazolkn part stays the same, but the number often changes.

Am I correct that I would be able to block these too in bad_sender_hosts like this, with a * at the end, when using a ^ at the front of the line?
^mail-koreacentralazolkn*
Or am I understanding this incorrectly? Maybe @zEitEr you have a clue?
 
Hello Richard,

Thank you for tagging me here. I'm under the impression that * wildcards only work for the leading part of the address.

I will need to check the other way of doing this first. Then I will be able to reply.
 
If it in fact switches to regex parsing when the line starts with ^, then * alone would not match "anything else", because in a regex * is a quantifier that repeats whatever precedes it (here, the final n). You'd need to use .* which means "any character" (.) zero or more times (*).

So, maybe this will work: ^mail-koreacentralazolkn.*
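To make the difference between a bare * and .* concrete, here is a minimal sketch using Python's re module. It is only an illustration: the mail server uses its own regex engine, and Python is assumed to behave the same way for these simple patterns. The hostname is the one from the first post.

```python
import re

host = "mail-koreacentralazolkn19012059.outbound.protection.outlook.com"

# "n*" means "zero or more n", so this pattern describes only the prefix.
star_only = re.compile(r"^mail-koreacentralazolkn*")
# ".*" means "any character, zero or more times", so it covers the rest.
dot_star = re.compile(r"^mail-koreacentralazolkn.*")

# fullmatch() requires the pattern to account for the whole string.
print(star_only.fullmatch(host) is None)                              # True
print(dot_star.fullmatch(host) is not None)                           # True
print(star_only.fullmatch("mail-koreacentralazolknnn") is not None)   # True
```

Note that an unanchored search with the n* form would still find the prefix inside the hostname; the .* form simply states the intent explicitly.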
 
Aha! Thank you very much. In the examples the dot was in a logical place, so I didn't know whether it was there as a literal dot or as part of the regular expression (as I don't know about REs).

If I can bother you once more: what would it be if I really wanted the dot included for some reason?
For example, I now block *@something.adomain.com but I don't want to block *@something.adomain.org. Would it be a double dot then, like:
^*@something.*.com or ^.*@something.*.com or something else?
Because in the blacklist_senders file e-mail addresses are used. So I'm still puzzling over wildcards at the front, at the end or in between, and the @ of the email address.
 
Any special character in a regex can be escaped (prefixed with \) to match it literally, so for example:

^.$ will match start of line (^) followed by any single character (.) followed by end of line ($)
^\.$ will match start of line (^) followed by exactly one literal . (\.) followed by end of line ($)

^.*@something\..*\.com$ would match <whatever>@something.<whatever>.com.
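As a small sketch of the escaped-dot pattern above, checked with Python's re module. The sample addresses are made up for illustration, and Python's engine is assumed to agree with the mail server's for this pattern.

```python
import re

# Escaped dots: "\." matches only a literal dot.
strict = re.compile(r"^.*@something\..*\.com$")
# Unescaped dots: "." matches any single character.
loose = re.compile(r"^.*@something.*.com$")

def is_blocked(pattern: re.Pattern, address: str) -> bool:
    """Return True if the address matches the blacklist pattern."""
    return pattern.search(address) is not None

print(is_blocked(strict, "user@something.adomain.com"))   # True
print(is_blocked(strict, "user@something.adomain.org"))   # False
# The loose pattern also accepts a non-dot character before "com":
print(is_blocked(loose, "user@something.adomainXcom"))    # True
print(is_blocked(strict, "user@something.adomainXcom"))   # False
```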

Does that make sense?
 
Richard, you could try a regex preview page like https://regex101.com
It works really well.

I think this line from Kristian is especially what you want:
^.*@something\..*\.com$

Depending on where you put the pattern, you may or may not need a start of line '^' and end of line '$'.
In CSF regex rules I mostly do not need '^' and/or '$', as the input already comes in line by line and I don't care about what comes after the part I'm searching for, so I end with .* which matches <whatever> for the rest of the line, e.g.:
^.*mail-koreacentralazolkn.*\.com.*

In your blacklist case:
^.*mail-koreacentralazolkn.*\.com$
 
In CSF regex rules I mostly do not need '^'
Why not? Because...
and so I end with .* which matches <whatever> for the rest of the line.
Yes, but the information states that when using .* at the end of a line in an nwildlsearch file, a ^ is required at the beginning of the line.
So you do have to use the ^ in those cases too, right?

I'm aware of the link, but as far as I've seen I can only test regexp strings there, and my trouble is creating them. However, the explanation there could help.
I just don't understand what the "insert your test string here" box does; if I enter *@something.* in there, nothing happens. :)
 
Why not? Because...
CSF custom regex is 'fed' single log lines, so the '^' is not strictly necessary: each input already starts at the start of a line and ends at the end of a line.
If the input were a complete logfile or other multiline text, the '^' would matter, because it anchors the match to the start of a line.
'^' is good practice, my shortcut is not. Maybe it's better to ignore my comments about the CSF regexp start of line ;)
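The difference between line-by-line and multiline input can be sketched in Python. This is only an illustration of how '^' anchoring behaves in Python's re module; CSF itself is not written in Python, and the log text below is invented.

```python
import re

# Hypothetical three-line log; only the second line is interesting.
log = "first line\nmail-koreacentralazolkn19012059 rejected\nlast line"
anchored = r"^mail-koreacentralazolkn"

# On the whole text, '^' only matches at the very start by default.
whole_default = re.search(anchored, log)
# With re.MULTILINE, '^' also matches right after each newline.
whole_multiline = re.search(anchored, log, re.MULTILINE)
# Feeding the lines one at a time, as described above, needs no flag.
per_line = [line for line in log.splitlines() if re.search(anchored, line)]

print(whole_default is None)        # True
print(whole_multiline is not None)  # True
print(per_line)                     # ['mail-koreacentralazolkn19012059 rejected']
```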

Yes, but the information states that when using .* at the end of a line in an nwildlsearch file, a ^ is required at the beginning of the line.
So you do have to use the ^ in those cases too, right?
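Yes, but for a different reason than in CSF: in an nwildlsearch file the leading '^' is what marks the line as a regular expression, as the text you quoted says. Without it the line is not parsed as a regex, so a trailing .* would not act as a wildcard.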
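As a rough model of the rule quoted earlier in the thread (a leading * for "ends with", a leading ^ for a regular expression), the sketch below shows how one list entry might be compared with a host or address. The function name entry_matches is hypothetical, the case-insensitive comparison is an assumption, and the real lookup has more features than this; it is not the mail server's actual code.

```python
import re

def entry_matches(entry: str, subject: str) -> bool:
    """Simplified, assumed model of matching one blacklist line."""
    if entry.startswith("^"):
        # Leading ^: the whole entry is treated as a regular expression.
        return re.search(entry, subject, re.IGNORECASE) is not None
    if entry.startswith("*"):
        # Leading *: the subject must end with the rest of the entry.
        return subject.lower().endswith(entry[1:].lower())
    # Anything else: plain string comparison, so ".*" has no special meaning.
    return subject.lower() == entry.lower()

host = "mail-koreacentralazolkn19012059.outbound.protection.outlook.com"

print(entry_matches("*.store", "shop.example.store"))        # True
print(entry_matches("^mail-koreacentralazolkn.*", host))     # True
print(entry_matches("mail-koreacentralazolkn.*", host))      # False
```

The last call shows why the ^ matters in this model: without it, the trailing .* is just two ordinary characters.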

I'm aware of the link, but as far as I've seen I can only test regexp strings there, and my trouble is creating them. However, the explanation there could help.
I just don't understand what the "insert your test string here" box does; if I enter *@something.* in there, nothing happens. :)
Ah, yes! The 'test string' is your search source file (haystack), and 'regular expression' your check against it (needle).

for example:
on the regex101 page put the following lines in 'test string':
[email protected]
[email protected]
[email protected]
[email protected]


now in 'regular expression' start to test your expressions:
^*domain.com - will show 'pattern error'; see Kristian's explanation above about wildcards
^.*domain - will highlight ALL items with 'domain', which is too much; we need the top-level domain as well
^.*domain\.com - will highlight exactly what we want, ignoring the .org (see Kristian's explanation about the literal '.').

Now for the spamxxxx:
^.*spam - will highlight too much
^.*spam.*@anotherdomain.xyz - will highlight what we want.
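The same experiment can be reproduced offline with a short Python sketch. The test addresses below are invented stand-ins (the originals are not readable here), Python's re module is assumed to behave like the regex101 default for these patterns, and the last pattern escapes the dot before xyz to be strict.

```python
import re

# Hypothetical "test string" lines (the haystack).
test_strings = [
    "user@domain.com",
    "user@domain.org",
    "spam1234@anotherdomain.xyz",
    "spam5678@domain.org",
]

def compiles(pattern: str) -> bool:
    """Return False when the pattern itself is invalid."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

def highlighted(pattern: str) -> list[str]:
    """Return the test strings the pattern (the needle) matches."""
    rx = re.compile(pattern)
    return [s for s in test_strings if rx.search(s)]

print(compiles(r"^*domain.com"))                    # False: '*' has nothing to repeat
print(highlighted(r"^.*domain"))                    # all four lines
print(highlighted(r"^.*domain\.com"))               # ['user@domain.com']
print(highlighted(r"^.*spam"))                      # both spam lines
print(highlighted(r"^.*spam.*@anotherdomain\.xyz")) # ['spam1234@anotherdomain.xyz']
```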

Use of '$':
As Kristian explained, '$' marks the end of the line.
In the above examples, putting a '$' after the regex will work, as there is indeed an end of line right after the expression.
But if you have a logfile where the thing you search for sits inside a line with more text after it, be careful with the '$'.

Example:
Put something like this in 'test string':
This is a test to see if [email protected] can be blocked.

In 'regular expression':
^.*domain\.com$ does NOT find what we want, because in the test string the line does not end after 'domain.com'.
^.*domain\.com does find it (we don't care about the end of the line)
^.*domain\.com.*$ does find it (any further characters plus the end of the line are matched)

The last one can be useful if you want to meet multiple conditions in one line.
Example: ^.*domain\.com.*blocked\.$ means the line must contain 'domain.com' and end with 'blocked.'
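The effect of '$' on a longer line can be checked with a few lines of Python. The address in the sample sentence is a made-up placeholder, and Python's re module is assumed to behave like the regex101 page for these patterns.

```python
import re

# Sample line with text after the address (address is a placeholder).
line = "This is a test to see if user@domain.com can be blocked."

def found(pattern: str) -> bool:
    """Return True if the pattern matches somewhere in the line."""
    return re.search(pattern, line) is not None

print(found(r"^.*domain\.com$"))             # False: line continues after 'domain.com'
print(found(r"^.*domain\.com"))              # True: end of line is not checked
print(found(r"^.*domain\.com.*$"))           # True: '.*' absorbs the rest of the line
print(found(r"^.*domain\.com.*blocked\.$"))  # True: contains 'domain.com', ends with 'blocked.'
```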

Hopefully this gives you a start with the regex101 test page.
If I made it more confusing than it was, you can PM me; I'll give you my contact details and try to explain in Dutch.
 