Image: "Email Newsletter Marketing" by Muhammad Ribkhan, licensed under the Pixabay License.

SpamAssassin SA-Update Tool

Are you curious about SpamAssassin’s sa-update tool and what it does? As with many other server programs, SpamAssassin ships with additional tools that run inside cron jobs and are used by administrators. Knowing what these tools do and how they work can help you better understand your server and fix issues down the line.

The sa-update tool pulls new configuration files and rules from channels. SpamAssassin uses these files to classify emails as spam, in addition to its Naive Bayes filtering. Among these files are definitions of free email providers, regex checks on message subjects and bodies, and more.

These files are stored in the /var/lib/spamassassin/<MAJOR>.<MINOR><PATCH> directory, for example /var/lib/spamassassin/3.004002 for version 3.4.2. Do not edit these files directly, because the next update will overwrite them. Make changes only in your /etc/spamassassin/*.cf files.

SA-Update Channels

An sa-update channel is a remote source from which sa-update gets new configuration files. If your company’s server security policy doesn’t allow this, you should disable the SpamAssassin cron job and either run a private channel or manually review the channel data and apply it to your server.

The --channel option specifies which channel to download new rules from. The default channel for many installations is updates.spamassassin.org. If you run lots of SpamAssassin servers and want an easy way to update rules for all of them, you can run your own channel and distribute the changes among your servers.

The way sa-update and channels interact is a bit strange and relies partially on DNS queries. A channel can either serve the configuration files themselves or point to a list of mirrors that would have the configuration files.

First, a TXT DNS request is made to the channel with the major, minor, and patch version numbers in reverse order as subdomains. SpamAssassin 3.4.2, for example, queries 2.4.3.updates.spamassassin.org. The response is the latest version number of the configuration files.

$ dig +short txt 2.4.3.updates.spamassassin.org
"1884121"
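
If you want to build that query name yourself, the reversal can be sketched in a few lines of shell. The function name channel_query_name is made up for this post and is not part of SpamAssassin. It assumes a plain three-part version such as 3.4.2.

```shell
# Hypothetical helper: build the DNS name queried for a given
# SpamAssassin version and channel.
channel_query_name() {
  version=$1   # e.g. 3.4.2
  channel=$2   # e.g. updates.spamassassin.org
  # Split the version on dots and emit patch.minor.major.
  reversed=$(printf '%s\n' "$version" | awk -F. '{ printf "%s.%s.%s", $3, $2, $1 }')
  printf '%s.%s\n' "$reversed" "$channel"
}
```

You could then pass the result to dig, for example: dig +short txt "$(channel_query_name 3.4.2 updates.spamassassin.org)".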

You can check which version you have installed by looking for the line # UPDATE version <version> in the file named after the channel, e.g. updates_spamassassin_org.cf, inside of /var/lib/spamassassin/<MAJOR>.<MINOR><PATCH>. sa-update does this comparison automatically though, so don’t worry about it.
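
If you are curious anyway, here is a small sketch that pulls the installed version out of a channel file. The helper name local_update_version is my own, and it assumes the marker line looks exactly like the one described above.

```shell
# Hypothetical helper: print the update version recorded in a channel .cf file.
local_update_version() {
  cf_file=$1
  # Print only the number from the first "# UPDATE version <n>" line.
  sed -n 's/^# UPDATE version \([0-9][0-9]*\).*$/\1/p' "$cf_file" | head -n 1
}
```

On a 3.4.2 install you would call it as local_update_version /var/lib/spamassassin/3.004002/updates_spamassassin_org.cf and compare the number with the TXT record.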

Now the list of mirrors has to be resolved with another TXT DNS query. The DNS response will be a URL to a MIRRORED.BY file. The file lists one mirror per line in the format of http(s)://<mirror> weight=<weight>. These mirrors are used to download the new configuration files.

$ dig +short txt mirrors.updates.spamassassin.org
"http://spamassassin.apache.org/updates/MIRRORED.BY"
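
To get a feel for the MIRRORED.BY format, the sketch below reads such a file on standard input and prints each mirror with its weight. list_mirrors is a name made up for this example. Skipping comment lines that start with # and defaulting a missing weight to 1 are both assumptions on my part, not documented behavior.

```shell
# Hypothetical helper: read MIRRORED.BY content on stdin and
# print "<weight> <url>" for each mirror line.
list_mirrors() {
  awk '
    /^[[:space:]]*#/ || NF == 0 { next }   # assumed: skip comments and blank lines
    {
      weight = 1                            # assumed default when weight= is absent
      for (i = 2; i <= NF; i++)
        if ($i ~ /^weight=[0-9]+$/) weight = substr($i, 8)
      print weight, $1
    }'
}
```

For example, curl -s http://spamassassin.apache.org/updates/MIRRORED.BY | list_mirrors would list the mirrors for the default channel.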

The program now tries a mirror to download the new configuration files. If a mirror fails, sa-update moves on to the next one. The files that are downloaded are <version>.tar.gz, <version>.tar.gz.sha512, <version>.tar.gz.sha256, and <version>.tar.gz.asc.

Once the archive has been verified against its checksums and signature, it is extracted into a directory named after the channel, e.g. updates_spamassassin_org, inside of /var/lib/spamassassin/<MAJOR>.<MINOR><PATCH>.
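
The checksum part of that verification can be reproduced by hand. The sketch below is not how sa-update does it internally. It is a rough equivalent that assumes the .sha256 file starts with the hex digest and that sha256sum from coreutils is installed. verify_sha256 is a made-up name, and the GPG signature check is left out.

```shell
# Hypothetical helper: compare an archive against its .sha256 companion file.
verify_sha256() {
  archive=$1
  # Assumed layout: the digest is the first field on the first line.
  expected=$(awk 'NR == 1 { print tolower($1) }' "$archive.sha256")
  actual=$(sha256sum "$archive" | awk '{ print $1 }')
  # Succeed only if a digest was found and it matches.
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

Calling verify_sha256 1884121.tar.gz returns success only when the digest in 1884121.tar.gz.sha256 matches the archive.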

These new files aren’t used until you restart SpamAssassin. The cron job, on the other hand, restarts the service automatically.
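
If you run sa-update by hand or from your own script, its exit status tells you whether a restart is needed. According to the sa-update documentation, 0 means an update was installed and 1 means nothing new was available. The sketch below treats everything else as an error, which is a simplification. sa_update_action is a name made up for this post.

```shell
# Hypothetical helper: map an sa-update exit status to the next step.
sa_update_action() {
  case "$1" in
    0) echo "restart" ;;     # new rules were installed
    1) echo "no-update" ;;   # rules are already current
    *) echo "error" ;;       # simplification: any other status is a failure
  esac
}
```

A script might run sa-update, pass $? to sa_update_action, and restart the SpamAssassin service only when the answer is restart. The service name varies between distributions, so check yours before scripting the restart.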

SpamAssassin Cron Job

The sa-update tool generally isn’t called manually by administrators. Instead, the tool lives in a daily cron job for SpamAssassin. This cron job is disabled by default, but enabling it is recommended so that your server gets the latest rules from SpamAssassin.

To enable the cron job, set the environment variable CRON to any value other than 0 inside of the SpamAssassin environment file. This environment file is loaded by systemd and by the cron.daily cron job. The file is /etc/default/spamassassin on Debian systems.
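
The check the cron job performs can be approximated like this. It is a sketch of the idea rather than a copy of the Debian script, and cron_enabled is a made-up name. It sources the environment file in a subshell so that none of the variables leak into your session.

```shell
# Hypothetical helper: succeed if the environment file enables the cron job.
cron_enabled() {
  env_file=$1
  (
    CRON=0                                # assumed default: disabled
    [ -r "$env_file" ] && . "$env_file"   # load the settings if the file exists
    [ "$CRON" != "0" ]
  )
}
```

With CRON=1 in /etc/default/spamassassin, cron_enabled /etc/default/spamassassin succeeds. With CRON=0, or with no file at all, it fails.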

SA-Update Without Channels

You can avoid using channels by providing sa-update with a .tar.gz archive to be installed. This option works well if you have made lots of modifications to the rules and want to apply them to multiple servers, if your server security policy doesn’t allow for remote configuration updates, or if your SpamAssassin servers don’t have HTTP or DNS access.

Instead of calling sa-update with the --channel <channel> option, you would use --install <file>. The archive has the same format as those downloaded from channels; it is simply read from a local file instead.
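
As far as I can tell from the sa-update documentation, an offline install still looks for the checksum and signature files next to the archive, named <file>.sha256, <file>.sha512, and <file>.asc. The sketch below checks that those companions are present before you copy a bundle to another server. offline_bundle_ready is a made-up name, and requiring only one of the two checksum files is my assumption.

```shell
# Hypothetical helper: check that an offline update bundle is complete.
offline_bundle_ready() {
  archive=$1
  [ -f "$archive" ] || return 1
  # Assumed: one checksum file next to the archive is enough.
  [ -f "$archive.sha256" ] || [ -f "$archive.sha512" ] || return 1
  # The detached GPG signature must be present as well.
  [ -f "$archive.asc" ]
}
```

A typical use would be offline_bundle_ready 1884121.tar.gz && sa-update --install 1884121.tar.gz.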
