At the end of each training run, bogofilter saves its updated database in the file .bogofilter/wordlist.db.

Over the course of time, spam message content will change. Periodic training runs with new spam and valid message sets are necessary to keep bogofilter's internal database current.
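
A periodic retraining run could be scripted along the following lines. This is a minimal sketch, not part of the product: the helper name train_dir and the directory names in the usage comment are hypothetical, and each file is assumed to hold exactly one mail message. It relies only on bogofilter reading a message on standard input and on the -s (register as spam) and -n (register as non-spam) options.

```shell
#!/bin/sh
# train_dir is a hypothetical helper, not part of bogofilter.
# It registers every regular file in a directory as spam (-s)
# or as non-spam (-n), one message per file.
train_dir() {
    flag=$1    # -s for spam, -n for non-spam
    dir=$2     # directory holding one message per file
    for msg in "$dir"/*; do
        [ -f "$msg" ] || continue        # skip non-files and empty directories
        bogofilter "$flag" < "$msg"
        status=$?
        # Exit status 3 is bogofilter's I/O or other error code.
        [ "$status" -ne 3 ] || return 1
    done
    return 0
}

# Usage (directory names are assumptions):
#   train_dir -s "$HOME/mail/new-spam"
#   train_dir -n "$HOME/mail/new-valid"
```

Running such a script from cron or by hand at regular intervals keeps the word list in step with current mail.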

Filtering with Bogofilter

Once the bogofilter database has been primed, the bogofilter command can be used to filter new messages. When a mail message is filtered against a trained database, bogofilter returns an exit status of 0 for spam, 1 for non-spam, 2 for unsure, and 3 for I/O or other errors. Here is an example:

$ bogofilter < new-messages
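
A script can act on the exit status directly. The sketch below assumes a POSIX shell; the helper name bogo_label is hypothetical and only maps the return values listed above to readable labels. Note that because 0 means spam, a shell test such as "if bogofilter < file" is true for spam, not for success in the usual sense.

```shell
#!/bin/sh
# bogo_label is a hypothetical helper, not part of bogofilter.
# It maps a bogofilter exit status to a readable label.
bogo_label() {
    case "$1" in
        0) echo "spam" ;;
        1) echo "non-spam" ;;
        2) echo "unsure" ;;
        *) echo "error" ;;    # 3 is I/O or other error; anything else is unexpected
    esac
}

# Usage (assumes bogofilter is installed and reads the message on stdin):
#   bogofilter < new-messages
#   bogo_label $?
```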

Many options that control how bogofilter operates can be set on the command line (see bogofilter(1) for details). Additional parameters can be set in the file /usr/internet/etc/bogofilter.cf; the file /usr/internet/etc/bogofilter.cf.example contains samples of all of the parameters. Status and logging messages can also be customized.

Filter Integration with Other Tools

The following sections describe how bogofilter can be integrated with other e-mail tools.

Using Bogofilter with procmail

The following procmail rule will take mail on stdin and save it to file spam if bogofilter thinks it is spam:

:0HB:
* ? bogofilter
spam

This similar rule will also register the tokens in the mail according to the bogofilter classification:

:0HB:
* ? bogofilter -u
spam

If bogofilter fails (returning 3), the message is treated as non-spam.

The following recipe does three things:

Spam-bins anything that bogofilter rates as spam

Registers the words in messages rated as spam as such

Registers the words in messages rated as non-spam as such

With this in place, it will normally only be necessary for the user to intervene (with -Ns or -Sn) when bogofilter miscategorizes something.

# filter mail through bogofilter, tagging it as spam and
# updating the wordlist
:0fw
| bogofilter -u -e -p

# if bogofilter failed, return the mail to the queue; the MTA will
# retry to deliver it later
# 75 is the value for EX_TEMPFAIL in /usr/include/sysexits.h
:0e
{ EXITCODE=75 HOST }

# file the mail to spam-bogofilter if it's spam.

