HOWTO train or retrain your DSPAM

kopeah

Hi all,

This HOWTO is geared toward those who already have DSPAM installed. If you have not tried DSPAM yet, an installation HOWTO is available here.

Let's start with the initial training of your DSPAM. You can either let DSPAM learn automatically from all incoming email, which will probably take several days before it becomes an effective spam filter, or force it to learn by feeding it corpus files. Keep in mind that even if you force-feed DSPAM with corpus files, it will not be 100% effective. A small number of spam emails will probably still get through; that is when you have to retrain your DSPAM.

### CORPUS TRAINING ###

1. First, download spam and ham corpus files from http://spamassassin.apache.org/publiccorpus/
Two of each should be enough.

2. Extract one of each, and make sure the directories they create are named ham and spam. If an archive uses another name such as hard_ham or spam_2, rename the directory.

3. Run dspam_train. Since I'm using amavis+DSPAM, I will train it as the amavis user.

Code:
/opt/dspam/bin/dspam_train amavis /full/path/to/dir/spam /full/path/to/dir/ham

4. Delete your spam and ham directories, then repeat steps 2 and 3 for the remaining corpus files.
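
If you don't want to repeat steps 2 to 4 by hand, they can be wrapped in a small shell function. This is only a rough sketch: train_pair, DSPAM_TRAIN and DSPAM_USER are names I made up for illustration, and it assumes each archive unpacks into a single top-level directory and that your tar detects the compression on its own (GNU tar does).

```shell
#!/bin/sh
# Sketch: train DSPAM from one spam archive and one ham archive.
# Defaults match the paths used in this HOWTO; override them if yours differ.
DSPAM_TRAIN="${DSPAM_TRAIN:-/opt/dspam/bin/dspam_train}"
DSPAM_USER="${DSPAM_USER:-amavis}"

train_pair() {
    spam_archive="$1"
    ham_archive="$2"
    workdir=$(mktemp -d) || return 1
    mkdir "$workdir/spam_x" "$workdir/ham_x"

    # Extract each archive into its own scratch directory, then rename
    # whatever single directory it created (spam_2, hard_ham, ...) to
    # spam / ham, as step 2 requires.
    if tar -xf "$spam_archive" -C "$workdir/spam_x" &&
       tar -xf "$ham_archive" -C "$workdir/ham_x" &&
       mv "$workdir"/spam_x/* "$workdir/spam" &&
       mv "$workdir"/ham_x/* "$workdir/ham"
    then
        # Step 3: same argument order as the command shown above.
        "$DSPAM_TRAIN" "$DSPAM_USER" "$workdir/spam" "$workdir/ham"
        status=$?
    else
        status=1
    fi

    # Step 4: clean up before the next pair.
    rm -rf "$workdir"
    return "$status"
}
```

Call it once per pair of archives you downloaded, for example train_pair /full/path/to/spam-archive /full/path/to/ham-archive.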

NOTE:
You can use the same corpus files to train your SpamAssassin Bayesian database with 'sa-learn'. If you have both DSPAM and SpamAssassin running, you don't need to run sa-learn for SpamAssassin; let DSPAM handle that, since it's faster and needs fewer resources than SpamAssassin.
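
If you do want to feed the same corpus to SpamAssassin, something along these lines should work. It is a minimal sketch: sa_train and SA_LEARN are hypothetical names of mine, and it assumes the spam and ham directories from step 2 sit side by side under one parent directory.

```shell
#!/bin/sh
# Sketch: train SpamAssassin's Bayes database from the same corpus.
# SA_LEARN can be overridden if sa-learn is not in your PATH.
SA_LEARN="${SA_LEARN:-sa-learn}"

sa_train() {
    corpus_dir="$1"   # parent directory holding spam/ and ham/
    # --spam and --ham tell sa-learn how to classify the messages it reads.
    "$SA_LEARN" --spam "$corpus_dir/spam" &&
    "$SA_LEARN" --ham "$corpus_dir/ham"
}
```

Usage would be sa_train /full/path/to/dir, run as the user that owns the Bayes database.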

### RETRAIN ###

Every once in a while, DSPAM will mark a good email as spam or vice versa. This is normal, especially during the training period. To retrain your DSPAM, do the following:

1. Download the retrain script from here.

2. Put it in an appropriate location. For the sake of this howto, we'll put it in the DSPAM directory /opt/dspam/bin. Don't forget to chmod it to 755:

Code:
chmod 755 /opt/dspam/bin/dspam_retrain.sh

3. Create two folders using IMAP and name them "ham" and "spam". When you see a legitimate email marked as spam, move it to the "ham" folder. Do the same with untagged spam emails: move them to the "spam" folder.

4. Run dspam_retrain.sh once for each folder. We'll start with the spam folder:

Code:
/opt/dspam/bin/dspam_retrain.sh spam /home/username/imap/domain.com/username/mail/spam

Now the ham folder:

Code:
/opt/dspam/bin/dspam_retrain.sh ham /home/username/imap/domain.com/username/mail/ham
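
The two commands above can also be rolled into one helper so both folders get processed in a single call. Again just a sketch: retrain_all and RETRAIN are names I picked for illustration, and the only thing assumed about dspam_retrain.sh is the calling convention shown above (class first, then the folder path).

```shell
#!/bin/sh
# Sketch: run the retrain script for both IMAP folders of one mailbox.
RETRAIN="${RETRAIN:-/opt/dspam/bin/dspam_retrain.sh}"

retrain_all() {
    maildir="$1"   # e.g. /home/username/imap/domain.com/username/mail
    for class in spam ham; do
        folder="$maildir/$class"
        if [ -e "$folder" ]; then
            # Same argument order as the manual commands: class, then path.
            "$RETRAIN" "$class" "$folder" || return 1
        else
            echo "skipping $class: $folder not found" >&2
        fi
    done
}
```

For the mailbox used in this howto, that would be retrain_all /home/username/imap/domain.com/username/mail.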

That should do it .. :)


Cheers,
 