My First Awk Script!!!!

vincenzobar

Well, I did it! It took forever to learn since I am not a programmer at heart, but I did it!

In vi or your favorite text editor, paste this:

NEW CODE: the last version only showed clean email that passed through (oops). This one shows both identified and clean mail, gives a count of each, and then totals them up to show how much email you received, plus vm-pop3 sessions and imapd sessions!

Code:
##########################################
# mail_log_filter.awk
# Created by: Vincenzo S Barranca, Underwater Design, LLC
# 8/24/2004
# insert GNU stuff here
##########################################
BEGIN { printf("spam \t Rate\n") }   # write the header

/clean/      { email++ }     # count lines containing "clean"
/vm-pop3/    { sessions++ }  # count lines from vm-pop3 sessions
/identified/ { spam++ }      # count lines containing "identified"
/imapd/      { imapd++ }     # count lines containing (take a guess) yep, "imapd"

# When column 6 is "clean" or "identified", print columns 6 and 8
($6 == "clean") || ($6 == "identified") {
    field = 6
    while (field <= 8) {            # visits field 6, then field 8
        printf("%s\t", $field)      # print the field as a string followed by a TAB
        field += 2                  # skip ahead two columns
    }
    print ""                        # end the output line
}

# END runs after the whole log has been read. Here we take the counters
# built up by the rules above and print them to the screen or a file.
END {
    totalclean = email                   # total clean email
    printf("Number of clean mail = %3d \n", totalclean)
    totalspam = spam                     # total email identified as spam
    printf("Number of emails considered spam = %3d \n", totalspam)
    totalmail = totalclean + totalspam   # all email received
    printf("Total email recieved = %3d \n", totalmail)
    totalsessions = sessions / 2         # two log lines (open and close) per session
    printf("number of sessions = %5d \n", totalsessions)
    totalimapd = imapd / 2               # same halving for imapd
    printf("total imapd connections = %5d \n", totalimapd)
}
Make it executable (not strictly needed when you run it through gawk -f, but handy):
Code:
 chmod +x filename.awk
And run it:
Code:
gawk -f filename.awk maillog > results

What this does is go through the mail log, find all mail that was marked clean or identified, and print each one to a file with its score. At the bottom it prints the total counts. I also added how many vm-pop3 sessions I had, which isn't an exact science yet: it is just the number of vm-pop3 lines divided by 2, for the open and close of each session, and the same goes for imapd.
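For anyone who wants to double-check the totals outside awk, here is a rough Python sketch of the same counting logic. It assumes whitespace-separated log lines with the status word in column 6 and the score in column 8; the function name `summarize` and the two sample lines are made up for illustration and are not part of the original script.

```python
def summarize(lines):
    """Mimic the awk script: count matching lines and collect (status, score) rows."""
    counts = {"clean": 0, "identified": 0, "vm-pop3": 0, "imapd": 0}
    rows = []
    for line in lines:
        # Plain substring counts, like the awk /pattern/ rules.
        for key in counts:
            if key in line:
                counts[key] += 1
        fields = line.split()
        # awk's $6 and $8 are fields[5] and fields[7] here (Python counts from 0).
        if len(fields) >= 8 and fields[5] in ("clean", "identified"):
            rows.append((fields[5], fields[7]))
    return {
        "rows": rows,
        "clean": counts["clean"],
        "spam": counts["identified"],
        "total": counts["clean"] + counts["identified"],
        # One open line plus one close line per session, hence the halving.
        "sessions": counts["vm-pop3"] // 2,
        "imapd": counts["imapd"] // 2,
    }


if __name__ == "__main__":
    sample = [  # made-up lines in the shape the script expects
        "Aug 22 05:01:45 server spamd[15901]: clean message (4.1/5.0) for mail:8 in 0.4 seconds, 1624 bytes.",
        "Aug 22 05:02:10 server spamd[15902]: identified spam (6.6/5.0) for mail:8 in 0.5 seconds, 2048 bytes.",
    ]
    print(summarize(sample))
```

Like the awk version, the substring counts will also pick up any other line that happens to contain "clean" or "identified", so treat the totals as estimates.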

I have no idea what imapd is, but I wanted to see how many times it hit, so I added it.

My results were:
Code:
spam     Rate
clean   (4.1/5.0)
clean   (0.7/5.0)
identified      (6.6/5.0)
clean   (1.1/5.0)
clean   (0.8/5.0)
clean   (4.3/5.0)
clean   (1.6/5.0)
clean   (0.6/5.0)
identified      (6.6/5.0)
identified      (8.7/5.0)
clean   (0.3/5.0)
clean   (3.7/5.0)
clean   (0.5/5.0)
clean   (2.4/5.0)
clean   (2.0/5.0)
identified      (6.6/5.0)
etc ... (trimmed to save space, you get the idea)

Number of clean mail =  61
Number of emails considered spam =  30
Total email recieved =  91
number of sessions =  4349
total imapd connections =   487

It's yours for the taking as a starter file to tweak to your specs! This is going to make life so much easier when I want to page through my logs. I'm going to do a few for various logs!

Later!
 
First of all, IMAP is a protocol that leaves mail on the server, so you can access it from anywhere. Unlike POP, it lets you have multiple directories, etc.

Your script here gives this output on a shared hosting server:
Code:
spam     Rate
Cleaned:        1
Cleaned:        1
Cleaned:        1
Cleaned:        1
Cleaned:        1
Cleaned:        1
Cleaned:        1
Cleaned:        1
Number of spam mail =   8
number of email sessions =  5230
total imapd connections =   361
Not entirely correct, but I have to note that I'm running MailScanner, ClamAV and SpamAssassin.
 
Thanks for the info on imapd!


BTW, I am revising this as we speak! I just realized it only showed the mail that went through and not the mail marked "identified". I have that in now, and all I have left is to add the domain so you can easily add it to the blacklist!
This stuff is cool! So much time will be freed up!
I forgot to mention I am just running SpamAssassin and the new Exim!

Well, then you have to refine it!

Field 6 is where "clean" appears and field 8 is where my rates are. Fields are separated by spaces, so:

Code:
Aug 22 05:01:44 server spamd[15901]: info: setuid to mail succeeded
Aug 22 05:01:44 server spamd[15901]: processing message <[email protected]> for mail:8.
Aug 22 05:01:45 server spamd[15901]: clean message (4.1/5.0) for mail:8 in 0.4 seconds, 1624 bytes.
I took this line:
1   2  3        4      5             6     7       8
Aug 22 05:01:45 server spamd[15901]: clean message (4.1/5.0)

9   10     11 12  13       14   15
for mail:8 in 0.4 seconds, 1624 bytes.
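The field numbering above can be checked with a few lines of Python. This is only a sketch: `numbered_fields` and `parse_rate` are hypothetical helper names, and the rate pattern assumes the "(score/threshold)" shape seen in the sample line.

```python
import re


def numbered_fields(line):
    """Split on whitespace and number the fields from 1, the way awk's $1, $2, ... do."""
    return dict(enumerate(line.split(), start=1))


def parse_rate(rate):
    """Turn a rate such as "(4.1/5.0)" into (score, threshold) floats, or None if it looks different."""
    match = re.fullmatch(r"\((-?\d+(?:\.\d+)?)/(-?\d+(?:\.\d+)?)\)", rate)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


if __name__ == "__main__":
    line = ("Aug 22 05:01:45 server spamd[15901]: clean message (4.1/5.0) "
            "for mail:8 in 0.4 seconds, 1624 bytes.")
    fields = numbered_fields(line)
    print(fields[6], fields[8])    # the two columns the awk script prints
    print(parse_rate(fields[8]))   # the same rate as a pair of numbers
```

Having the rate as numbers makes it easy to sort or filter by score later, which the plain text column does not allow.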


This checks column 6 for the word clean and returns columns 6 and 8 for every occurrence:
Code:
 /clean/  {field = 6
                      while(field <= 8 )
                      {

If you have a specific need, post a line from the log that you want results from, plus a description, and I will try to write a script for it!

Practice makes perfect!
 
"I forgot to mention I am just running SpamAssassin and the new Exim!"
Perhaps add a variable or something to set the way it's supposed to be parsed ;)

Some lines from my maillog file:
Aug 24 19:20:47 horus vm-pop3d[22540]: User 'info' of 'domain.ext' logged in from 255.255.255.255, nmsgs=124
Aug 24 19:20:48 horus vm-pop3d[22540]: Session ended for user: '[email protected]' from 255.255.255.255, nmsgs=124, ndel=0
Aug 24 19:20:49 horus vm-pop3d[22541]: User 'info' of 'domain.ext' logged in from 255.255.255.255, nmsgs=0
Aug 24 19:20:49 horus vm-pop3d[22541]: Session ended for user: '[email protected]' from 255.255.255.255, nmsgs=0, ndel=0
Aug 24 19:20:49 horus vm-pop3d[22542]: User 'support' of 'domain.ext' logged in from 255.255.255.255, nmsgs=0
Aug 24 19:20:49 horus vm-pop3d[22542]: Session ended for user: '[email protected]' from 255.255.255.255, nmsgs=0, ndel=0
Aug 24 19:21:52 horus vm-pop3d[22543]: User 'info' of 'domain.ext' logged in from 255.255.255.255, nmsgs=10
Aug 24 19:21:54 horus vm-pop3d[22543]: Session ended for user: '[email protected]' from 255.255.255.255, nmsgs=10, ndel=0
Aug 24 19:21:54 horus vm-pop3d[22574]: User 'crew' of 'domain.ext' logged in from 255.255.255.255, nmsgs=7
Aug 24 19:31:07 horus vm-pop3d[23010]: User 'totaldom' logged in from 255.255.255.255, nmsgs=1
Aug 24 19:31:07 horus vm-pop3d[23010]: bytes: user totaldom 7396 bytes
Aug 24 19:31:07 horus vm-pop3d[23010]: Session ended for user: 'totaldom' from 255.255.255.255, nmsgs=1, ndel=1
Aug 24 19:30:37 horus imapd[23232]: imap service init from 255.255.255.255
Aug 24 19:30:37 horus imapd[23232]: Login user=sebsoft host=domain_ISP.nl [255.255.255.255]
Aug 24 19:30:44 horus imapd[23234]: imap service init from 255.255.255.255
Aug 24 19:30:44 horus imapd[23234]: Login [email protected] host=domain_ISP.nl [255.255.255.255]
Aug 24 19:30:45 horus imapd[23234]: Command stream end of file, while reading line [email protected] host=domain_ISP.nl [255.255.255.255]
Aug 24 19:30:45 horus imapd[23235]: imap service init from 255.255.255.255
Aug 24 19:30:45 horus imapd[23235]: Login [email protected] host=domain_ISP.nl [255.255.255.255]
Aug 24 19:30:45 horus imapd[23235]: Command stream end of file, while reading line [email protected] host=domain_ISP.nl [255.255.255.255]
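Judging by the lines above, one POP3 session can log three vm-pop3d lines (login, bytes, session ended), so halving the line count is only a rough estimate. A sketch of an alternative is to count one marker line per session. The marker strings are taken from these sample lines, and the assumption that each session or connection writes exactly one of them is unverified; `count_sessions` is a made-up name.

```python
def count_sessions(lines):
    """Count POP3 and IMAP sessions from marker lines instead of halving a line count."""
    pop3 = 0
    imap = 0
    for line in lines:
        if "vm-pop3d[" in line and "Session ended" in line:
            pop3 += 1   # one closing line per finished POP3 session (assumed)
        elif "imapd[" in line and "imap service init" in line:
            imap += 1   # one init line per IMAP connection (assumed)
    return pop3, imap


if __name__ == "__main__":
    sample = [
        "Aug 24 19:31:07 horus vm-pop3d[23010]: User 'totaldom' logged in from 255.255.255.255, nmsgs=1",
        "Aug 24 19:31:07 horus vm-pop3d[23010]: bytes: user totaldom 7396 bytes",
        "Aug 24 19:31:07 horus vm-pop3d[23010]: Session ended for user: 'totaldom' from 255.255.255.255, nmsgs=1, ndel=1",
        "Aug 24 19:30:37 horus imapd[23232]: imap service init from 255.255.255.255",
    ]
    print(count_sessions(sample))
```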


Aug 24 18:14:32 horus MailScanner[15746]: New Batch: Scanning 1 messages, 787 bytes
Aug 24 18:14:33 horus MailScanner[15746]: Spam Checks: Starting
Aug 24 18:14:33 horus MailScanner[15746]: Virus and Content Scanning: Starting
Aug 24 18:14:33 horus MailScanner[15746]: Uninfected: Delivered 1 messages
Aug 24 18:15:29 horus MailScanner[16561]: New Batch: Scanning 1 messages, 1438 bytes
Aug 24 18:15:29 horus MailScanner[16561]: Spam Checks: Starting
Aug 24 18:15:29 horus MailScanner[16561]: Virus and Content Scanning: Starting
Aug 24 18:15:30 horus MailScanner[11616]: New Batch: Scanning 1 messages, 1457 bytes
Aug 24 18:15:30 horus MailScanner[11616]: Spam Checks: Starting
Aug 24 18:15:30 horus MailScanner[16561]: Uninfected: Delivered 1 messages
Aug 24 18:15:30 horus MailScanner[11616]: Virus and Content Scanning: Starting
Aug 24 18:15:31 horus MailScanner[11616]: Uninfected: Delivered 1 messages
Aug 24 18:33:12 horus MailScanner[16561]: New Batch: Scanning 1 messages, 1550 bytes
Aug 24 18:33:12 horus MailScanner[16561]: Spam Checks: Starting
Aug 24 18:33:13 horus MailScanner[16561]: RBL checks: 1BzeER-00058D-C2 found in spamcop.net
Aug 24 18:33:13 horus MailScanner[16561]: Message 1BzeER-00058D-C2 from 66.137.14.10 ([email protected]) to domain.ext is spam, spamcop.net, SpamAssassin (score=23.266, vereist 5, FORGED_IMS_HTML 4.30, FORGED_IMS_TAGS 4.30, FORGED_MUA_IMS 1.10, HTML_50_60 0.18, HTML_IMAGE_ONLY_02 2.24, HTML_MESSAGE 0.00, HTML_MIME_NO_HTML_TAG 1.72, MIME_HTML_NO_CHARSET 0.72, MIME_HTML_ONLY 0.10, MIME_HTML_ONLY_MULTI 1.10, RCVD_IN_BL_SPAMCOP_NET 4.00, RCVD_IN_DSBL 1.10, RCVD_IN_NJABL 0.10, RCVD_IN_NJABL_PROXY 1.10, RCVD_IN_SORBS 0.10, RCVD_IN_SORBS_MISC 1.10)
Aug 24 18:33:13 horus MailScanner[16561]: Spam Checks: Found 1 spam messages
Aug 24 18:33:13 horus MailScanner[16561]: Spam Actions: message 1BzeER-00058D-C2 actions are attachment,deliver
Aug 24 18:33:13 horus MailScanner[16561]: Virus and Content Scanning: Starting
Aug 24 18:33:14 horus MailScanner[16561]: Uninfected: Delivered 1 messages
Aug 24 18:38:04 horus MailScanner[11616]: New Batch: Scanning 1 messages, 1936 bytes
Aug 24 18:38:04 horus MailScanner[11616]: Spam Checks: Starting
Aug 24 18:38:04 horus MailScanner[11616]: Virus and Content Scanning: Starting
Aug 24 18:38:05 horus MailScanner[11616]: Uninfected: Delivered 1 messages
Aug 24 18:40:40 horus MailScanner[16561]: New Batch: Scanning 2 messages, 3089 bytes
Aug 24 18:40:40 horus MailScanner[16561]: Spam Checks: Starting
Aug 24 18:40:40 horus MailScanner[16561]: Virus and Content Scanning: Starting
Aug 24 18:40:41 horus MailScanner[16561]: Uninfected: Delivered 2 messages
Aug 24 18:40:50 horus MailScanner[11616]: New Batch: Scanning 1 messages, 1819 bytes
Aug 24 18:40:50 horus MailScanner[11616]: Spam Checks: Starting
Aug 24 18:40:50 horus MailScanner[11616]: RBL checks: 1BzeLj-0005EI-Ce found in spamcop.net
Aug 24 18:40:52 horus MailScanner[11616]: Message 1BzeLj-0005EI-Ce from 69.148.173.209 ([email protected]) to domain.be is spam, spamcop.net, SpamAssassin (score=9.222, vereist 5, ALL_NATURAL 1.32, LOSE_POUNDS 2.80, RCVD_IN_BL_SPAMCOP_NET 4.00, RCVD_IN_DSBL 1.10)
Aug 24 18:40:52 horus MailScanner[11616]: Spam Checks: Found 1 spam messages
Aug 24 18:40:52 horus MailScanner[11616]: Spam Actions: message 1BzeLj-0005EI-Ce actions are attachment,deliver
Aug 24 18:40:52 horus MailScanner[11616]: Virus and Content Scanning: Starting
Aug 24 18:40:52 horus MailScanner[11616]: Uninfected: Delivered 1 messages

Aug 24 17:59:40 horus MailScanner[4369]: New Batch: Scanning 1 messages, 35327 bytes
Aug 24 17:59:40 horus MailScanner[4369]: Spam Checks: Starting
Aug 24 17:59:41 horus MailScanner[4369]: Virus and Content Scanning: Starting
Aug 24 17:59:42 horus MailScanner[4369]: Content Checks: Detected HTML-specific exploits in 1Bzdhw-0004Tv-LQ
Aug 24 17:59:42 horus MailScanner[4369]: Content Checks: Found 1 problems
Aug 24 17:59:42 horus MailScanner[4369]: Content Checks: Detected and will convert HTML message to plain text in 1Bzdhw-0004Tv-LQ
Aug 24 17:59:42 horus MailScanner[4369]: Uninfected: Delivered 1 messages
Aug 24 18:01:01 horus update.virus.scanners: Found clamav installed
Aug 24 18:01:01 horus update.virus.scanners: Running autoupdate for clamav
Aug 24 18:01:01 horus ClamAV-autoupdate[18191]: ClamAV did not need updating


This also happens once every x minutes:
Aug 24 17:41:39 horus MailScanner[496]: MailScanner child dying of old age
Aug 24 17:41:39 horus MailScanner[16561]: MailScanner E-Mail Virus Scanner version 4.23-11 starting...
Aug 24 17:41:39 horus MailScanner[16561]: Enabling SpamAssassin auto-whitelist functionality...
Aug 24 17:41:40 horus MailScanner[16561]: Using locktype = posix
Aug 24 17:41:40 horus MailScanner[16561]: Creating hardcoded struct_flock subroutine for linux (Linux-type)

After some 'cat maillog | grep -v vm-pop3d | grep -v imap | grep -v Spam':
Aug 22 10:18:03 horus MailScanner[21785]: New Batch: Scanning 1 messages, 41658 bytes
Aug 22 10:18:05 horus MailScanner[21785]: Virus and Content Scanning: Starting
Aug 22 10:18:05 horus MailScanner[21785]: /var/spool/MailScanner/incoming/21785/./1BynY8-0005oW-VN/sample01.zip: Worm.SomeFool.P FOUND
Aug 22 10:18:05 horus MailScanner[21785]: Virus Scanning: ClamAV found 1 infections
Aug 22 10:18:05 horus MailScanner[21785]: Virus Scanning: Found 1 viruses
Aug 22 10:18:05 horus MailScanner[21785]: Saved infected "sample01.zip" to /var/spool/MailScanner/quarantine/20040822/1BynY8-0005oW-VN
Aug 22 10:18:50 horus MailScanner[21785]: New Batch: Scanning 1 messages, 41547 bytes
Aug 22 10:18:51 horus MailScanner[21785]: Virus and Content Scanning: Starting
Aug 22 10:18:52 horus MailScanner[21785]: /var/spool/MailScanner/incoming/21785/./1BynYr-0005of-JN/my_details_jos.zip: Worm.SomeFool.P FOUND
Aug 22 10:18:52 horus MailScanner[21785]: Virus Scanning: ClamAV found 1 infections
Aug 22 10:18:52 horus MailScanner[21785]: Virus Scanning: Found 1 viruses
I hope this doesn't screw up the layout of this site too much.
I did change all the IP addresses and domain names... ;)
Also note that horus is the server name.
Hope you can use it ;)
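One way to "set the way it's supposed to be parsed" is to keep one pattern per log flavour and pick it by name. The sketch below is a guess built only from the sample lines in this thread: the two regular expressions, the flavour names and `parse_line` are all assumptions, and real spamd or MailScanner logs may differ between versions and languages.

```python
import re

# One pattern per log flavour; pick one with the `flavour` argument.
# Both patterns are guesses based only on the sample lines in this thread.
PATTERNS = {
    "spamd": re.compile(
        r"spamd\[\d+\]: (clean message|identified spam) \((-?[\d.]+)/(-?[\d.]+)\)"
    ),
    "mailscanner": re.compile(
        r"MailScanner\[\d+\]: Message (\S+) from (\S+) .*? is spam.*?score=(-?[\d.]+)"
    ),
}


def parse_line(line, flavour):
    """Return a small dict describing a scored message, or None if the line does not match."""
    match = PATTERNS[flavour].search(line)
    if match is None:
        return None
    if flavour == "spamd":
        verdict, score, _threshold = match.groups()
        return {"spam": verdict == "identified spam", "score": float(score)}
    message_id, sender_ip, score = match.groups()
    return {"spam": True, "score": float(score), "id": message_id, "ip": sender_ip}


if __name__ == "__main__":
    # sender@example.com is a placeholder; the real address is masked in the thread.
    line = ("Aug 24 18:40:52 horus MailScanner[11616]: Message 1BzeLj-0005EI-Ce "
            "from 69.148.173.209 (sender@example.com) to domain.be is spam, "
            "spamcop.net, SpamAssassin (score=9.222, vereist 5, ALL_NATURAL 1.32)")
    print(parse_line(line, "mailscanner"))
```

Note that the MailScanner samples only log a score for messages flagged as spam, so this flavour cannot report clean scores the way the spamd one does.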
 
Icheb -

I noticed yours is a huge pain in the ass because it isn't too easy to pick out standardized calls for spam and non-spam. But I think I found the solution!

My roommate was home for a change. He is a C++/Perl programmer and helped me write a Perl script that reads both the maillog file and the /exim/mainlog file and looks up the ids of the spam mails in the Exim mainlog. When you run it, it prints all the domains of the spams into a nice neat file for you to look through and copy and paste into your blacklist!

This works flawlessly for Exim and SpamAssassin. Hopefully soon I will have one for MailScanner!

Code:
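#!/usr/local/bin/perl -w
# Created by Underwater Design, LLC 8/24/04
use strict;
use warnings;

if ( scalar(@ARGV) != 2 ) {
        print "USAGE: $0 <maillog> <mainlog>\n";
        exit 1;
}

open( my $maillog, '<', $ARGV[0] ) or die "Cannot open $ARGV[0]: $!\n";
open( my $mainlog, '<', $ARGV[1] ) or die "Cannot open $ARGV[1]: $!\n";

# Pass 1: for every "identified spam" line in the maillog, remember the
# <...@...> id found on the line just before it
my $curr = "";
my $prev = "";
my %ids  = ();
while (<$maillog>) {
        $prev = $curr;
        $curr = $_;

        if ( $curr =~ /identified\s+spam/ ) {
                my ($id) = $prev =~ /<(\S+@\S+)>/;
                $ids{$id} = 1 if defined $id;   # skip lines with no id
        }
}
close($maillog);

# Pass 2: print the sender domain of every mainlog line whose id= is known
while (<$mainlog>) {
        if ( my ($id) = /id=(\S+)/ ) {
                if ( exists( $ids{$id} ) ) {
                        if ( my ($domain) = /from <\S+@(\S+)>/ ) {
                                print "$domain\n";
                        }
                }
        }
}
close($mainlog);

##-- END OF SCRIPT --##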

This does not do a count like my awk script, but I am working on that, as well as on a version that works with MailScanner.

In /var/log (don't forget to make the script +x like the awk script):
Code:
./pull_blacklist.pl /var/log/maillog /var/log/exim/mainlog > temp
Make sure you give it the paths to the two log files, then redirect (>) the output to whatever file you want.

And I got this:
Code:
ior.clmwer.com
ohio.edu
hotmail.com
hotmail.com
ebay.com
aol.com
hotmail.com
boeingstore.com
topica.email-publisher.com
optonline.net
ineedhits.com
yahoo.com
yahoo.com
complimentsofyou.com
creationherbal.com
yahoo.com
breezybrookfarms.com
aol.com
cs.com
cs.com
juno.com
worldonline.fr
etc......


Pick and choose the ones you want to blacklist with ease!
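Since the domain list repeats itself, a count per domain can help with the picking. Here is a small Python sketch of that idea, not the count version mentioned above: `count_domains` is a hypothetical name, and it simply expects one domain per line, as in the output shown.

```python
from collections import Counter


def count_domains(lines):
    """Count how often each domain appears, most frequent first."""
    domains = (line.strip().lower() for line in lines)
    return Counter(d for d in domains if d).most_common()


if __name__ == "__main__":
    # A few domains copied from the output above.
    sample = ["ohio.edu", "hotmail.com", "hotmail.com", "ebay.com", "aol.com", "hotmail.com"]
    for domain, hits in count_domains(sample):
        print(f"{hits:4d}  {domain}")
```

To run it on the real output, open the file the Perl script wrote (temp in the example above) and pass the file object to `count_domains`.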
 