SpamAssassin setup

Spam filtering with SpamAssassin

Overview

When an e-mail message arrives at the NMRFAM SMTP server, the server passes it to a Mail Delivery Agent (MDA) program for delivery to the user’s mailbox. The MDA can do a number of things during delivery, including piping the message through a spam filtering program, delivering to sub-folders, and forwarding to a different address.

At NMRFAM we use the maildrop MDA and SpamAssassin for spam filtering. SpamAssassin runs a number of tests on the message, calculates its “spam score”, and adds that score to the headers of the message. You can then use maildrop’s filtering capabilities to do something with the message based on its spam score.

Neither maildrop nor SpamAssassin is enabled by default.

Enable maildrop

To enable maildrop, create a .mailfilter file in your home directory. The file must not be readable or writable by anyone except yourself (mode 600).

cd ~
echo 'DEFAULT="./Maildir"' > .mailfilter
chmod 600 .mailfilter

This simply tells maildrop to deliver everything to your Inbox.
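
Since the file has to be mode 600, it can be worth confirming the permissions after creating or editing it. The following is a minimal sketch, assuming GNU stat as found on most Linux systems; check_mailfilter_mode is a hypothetical helper name, not part of maildrop.

```shell
# Hypothetical helper (not part of maildrop): report whether a file is mode 600.
# Assumes GNU stat; on BSD/macOS the equivalent call is `stat -f %Lp`.
check_mailfilter_mode() {
    file="$1"
    mode=$(stat -c %a "$file") || return 2
    if [ "$mode" = "600" ]; then
        echo "ok: $file is mode 600"
    else
        echo "fix: $file is mode $mode, run: chmod 600 $file"
        return 1
    fi
}
```

After defining the function in your shell, run check_mailfilter_mode ~/.mailfilter to check your own file.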

Enable SpamAssassin

To enable SpamAssassin, edit your .mailfilter file to look like this (you can copy-paste it):

#
# ~/.mailfilter file for maildrop-2.x
# must be mode 600 (rw-------)
#
import SENDER
SENDMAIL=/usr/sbin/sendmail
DEFAULT="./Maildir"
xfilter "/usr/bin/spamc -f ${SENDER}"

(The xfilter line tells maildrop to pipe all messages through SpamAssassin.)

SpamAssassin’s configuration file is .spamassassin/user_prefs in your home directory. It (and the entire .spamassassin directory) will be created automatically when SpamAssassin runs for the first time.
Wait for new e-mail to come in, or send yourself a test message to make that happen.

Tune up SpamAssassin

By default, messages with a spam score of 5 or higher are marked as spam by SpamAssassin (it adds an X-Spam-Flag: YES header to the message). The threshold is controlled by the required_score setting in .spamassassin/user_prefs. The basic rule: the higher the required_score, the lower the chance of a legitimate message being marked as spam (a “false positive”), but the more spam gets through unmarked.

Other frequently used settings:

whitelist_from somebody@somewhere: use if legitimate mail from somebody@somewhere gets misidentified as spam. You can have as many whitelist_from lines as you need.

use_bayes 1: enable the Bayesian filter (see below).

bayes_ignore_from somebody@somewhere: like whitelist_from above, only specific to the Bayesian filter.
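
Put together, a user_prefs file using these settings might look like the sketch below. The value 6.0 and the address somebody@somewhere are placeholders. The file is written to a scratch path (the PREFS variable is an assumption of this example) so that an existing ~/.spamassassin/user_prefs is not overwritten.

```shell
# Scratch location for this example; copy the lines into
# ~/.spamassassin/user_prefs by hand once they look right.
PREFS="${PREFS:-./user_prefs.example}"

cat > "$PREFS" <<'EOF'
# Mark as spam at 6.0 instead of the default 5 (illustrative value)
required_score 6.0
# Mail from this sender should not be marked as spam (placeholder address)
whitelist_from somebody@somewhere
# Turn on the Bayesian filter
use_bayes 1
# Same idea as whitelist_from, but only for the Bayesian filter
bayes_ignore_from somebody@somewhere
EOF
```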

Tune up the Bayesian filter

The Bayesian filter works by running statistical analysis on the body of the message. Before it can be used, it must be trained on a representative sample of spam messages.

Collect a few hundred spam messages in a folder in your mail account (e.g. train). Then log on to herens (the mail server) and run the training program:

ssh herens
cd ~/Maildir/.train
sa-learn --spam --showdots cur

Then delete the messages from train.

Note the leading dot in the folder name: .train. This is how mail folders are stored in the Unix filesystem: Inbox is ~/Maildir, train is ~/Maildir/.train.

You can re-train the Bayesian filter as often as you like. You can also train it on a set of non-spam messages, e.g. on your Inbox:
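
Because each message in such a folder is stored as a separate file, you can check the size of a training sample by counting files before you run sa-learn. This is a minimal sketch; count_messages is a hypothetical helper name.

```shell
# Hypothetical helper: count the message files under a directory,
# e.g. the cur subdirectory that sa-learn is pointed at.
count_messages() {
    find "$1" -type f | wc -l | tr -d ' '
}
```

For the folder used above, the call would be count_messages ~/Maildir/.train/cur.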

ssh herens
sa-learn --ham --showdots ~/Maildir/cur

Spam filtering, with complete examples

Maildrop can search for regular expressions in the message and do various things based on the result. The examples below show maildrop rules for delivering messages marked by SpamAssassin to different mail folders.

Example 1

Deliver messages marked as spam to the spam folder, and deliver everything else to the Inbox.
Complete ~/.mailfilter file:

#
# ~/.mailfilter file for maildrop-2.x
# must be mode 600 (rw-------)
#
import SENDER
SENDMAIL=/usr/sbin/sendmail
DEFAULT="./Maildir"
xfilter "/usr/bin/spamc -f ${SENDER}"
if( /^X-Spam-Flag: YES/ )
{
to "Maildir/.spam"
}

Note: many mail client programs can be configured to use SpamAssassin’s scores. The above example is equivalent to configuring Thunderbird (in its junk mail control settings dialog) to trust SpamAssassin headers and automatically move messages marked as junk to the spam folder. The maildrop approach is more efficient because the mail is filtered as it is delivered, and all of it runs locally on the mail server. With Thunderbird’s filter, the message is first delivered to the Inbox and moved to spam later, when Thunderbird runs its junk mail controls. Those controls run on your computer, so to analyse the message Thunderbird has to fetch it from the server over the network.

Example 2

Delete messages with a score of 7 or more, deliver messages with a score from 5 up to 7 to the spam folder, and deliver everything else to the Inbox. (A score of 7 is high enough that a legitimate message is very unlikely to reach it, so it is reasonably safe to assume those messages are spam and delete them unread.) Complete ~/.mailfilter file:

#
# ~/.mailfilter file for maildrop-2.x
# must be mode 600 (rw-------)
#
import SENDER
SENDMAIL=/usr/sbin/sendmail
DEFAULT="./Maildir"
xfilter "/usr/bin/spamc -f ${SENDER}"
if( /^X-Spam-Level: \*{7,}/ )
{
to "/dev/null"
}
if( /^X-Spam-Level: \*{5,7}/ )
{
to "Maildir/.spam"
}

Note: /dev/null is the Unix bit bucket. Delivering to it simply deletes the message; it never even gets to your mailbox.
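
The order of the two rules matters. The unanchored pattern \*{5,7} also matches a header with eight or more asterisks, so it only means "5 up to 7" because the first rule has already disposed of anything scoring 7 or more. The sketch below mimics the two tests with grep -E on sample header lines, so you can see the effect without sending mail. classify_level is a hypothetical helper, and grep's extended regular expressions are assumed to behave like maildrop's patterns for these simple cases.

```shell
# Hypothetical helper mirroring the two rules in Example 2, in the same order.
# Takes one header line as its argument and prints the resulting action.
classify_level() {
    header="$1"
    if printf '%s\n' "$header" | grep -Eq '^X-Spam-Level: \*{7,}'; then
        echo "delete"            # first rule: to "/dev/null"
    elif printf '%s\n' "$header" | grep -Eq '^X-Spam-Level: \*{5,7}'; then
        echo "spam folder"       # second rule: to "Maildir/.spam"
    else
        echo "inbox"             # no rule matched: delivered to DEFAULT
    fi
}

classify_level 'X-Spam-Level: ********'   # 8 asterisks -> delete
classify_level 'X-Spam-Level: ******'     # 6 asterisks -> spam folder
classify_level 'X-Spam-Level: **'         # 2 asterisks -> inbox
```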

SpamAssassin headers

  • X-Spam-Flag: YES if the message scores required_score or more
  • X-Spam-Level: spam score as asterisks (*), rounded down (e.g. 5 asterisks for spam score 5.5)
  • X-Spam-Status: contains detailed report including required score and the tests run on the message.
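
To see how these headers relate, you can pull the numeric score out of X-Spam-Status and turn it back into the asterisk string. This is a sketch under assumptions: it expects the score= field as written by SpamAssassin 3.x (older releases used hits=), the sample header and its test name are made up, and both function names are hypothetical.

```shell
# Extract the numeric score from an X-Spam-Status header line.
# Assumes the "score=" field of SpamAssassin 3.x.
spam_score() {
    printf '%s\n' "$1" |
        sed -n 's/^X-Spam-Status: [A-Za-z]*, score=\([-0-9.]*\).*/\1/p'
}

# Build the asterisk string: one asterisk per whole point, none if below 1.
spam_level() {
    whole=${1%%.*}                 # drop the fractional part: 5.5 -> 5
    stars=""
    i=0
    while [ "$i" -lt "${whole:-0}" ]; do
        stars="$stars*"
        i=$((i + 1))
    done
    printf '%s\n' "$stars"
}

# Made-up sample header; EXAMPLE_TEST is not a real SpamAssassin test name.
hdr='X-Spam-Status: Yes, score=5.5 required=5.0 tests=EXAMPLE_TEST'
spam_level "$(spam_score "$hdr")"    # prints *****
```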

See also

maildrop

SpamAssassin