How does MailWasher Server work?
MailWasher Server integrates with existing mail servers to process all incoming messages before they reach the user’s mailbox. It uses a multi-layered approach to identify spam, combining algorithmic checks, connection filtering and content filtering to provide a robust solution to an organization’s spam problem. This approach substantially reduces the number of unwanted emails that are passed on to the email accounts serviced. MailWasher Server works in conjunction with the mail server, scanning all incoming email to see whether each message matches a known unwanted email or has the characteristics of one.
Combining content identification and sender identification, MailWasher Server blocks a very high proportion of unwanted email while maintaining an extremely low false positive rate.
In addition, MailWasher Server uses a centrally controlled database of known unwanted e-mail messages. If the incoming message matches a known unwanted e-mail message, it is blocked and quarantined. For messages not found in the database, heuristic checking is performed to see whether the message has the hallmarks of common bulk email. MailWasher Server can also be configured to use a Real-time Blackhole List service such as MAPS RBL or Spamhaus to filter mail from known mass mailers who use open proxies.
The MailWasher Server system is a combination of software products and managed services for running MailWasher Server.
The system consists of three software components:
- Mail Processing Daemon (MPD), a daemon that provides the core junk mail filtering for MailWasher Server.
- MailWasher Server Web Interface (MWI), a daemon with a built-in HTTP server that provides the web interface used by administrators to configure and maintain the system, and by end users to access their quarantined email.
- The mail conduit for your mail server, which checks each incoming message against the Mail Processing Daemon, and then accepts, blocks, or bounces the message.
The following diagram shows how MailWasher Server is integrated and works.
What filtering mechanisms does it use?
Message Signatures:
FirstAlert! Global Spam Database. As part of its multi-layered approach, MailWasher Server uses the FirstAlert! global spam database, a 24/7 operation in which a global network of users reports unsolicited email that is then verified and categorized by our dedicated FirstAlert! team. Users can submit unsolicited email to FirstAlert!, and the database is updated in real time.
Content Analysis:
Statistical Content Analysis Filtering. Statistical content filtering based on Bayesian analysis, lexical analysis and trait analysis can be applied to incoming email to accurately identify and remove spam while keeping false positives to a minimum. It can be set to one of five levels of sensitivity: Most Conservative, Conservative, Moderate, Aggressive, or Most Aggressive. The filter determines the overall probability that a message is spam by learning what an organization (and each individual) identifies as spam. It also checks for traits in the header of each message to weed out messages that are most likely spam.
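As a rough illustration of how a statistical filter can turn per-word evidence into an overall spam probability, here is a minimal naive Bayes sketch in Python. It is not MailWasher Server's implementation: the function names and the threshold attached to each sensitivity level are assumptions made up for this example.

```python
import math

# Hypothetical thresholds for the five sensitivity levels. The values that
# MailWasher Server actually uses are not given in this document.
SENSITIVITY_THRESHOLDS = {
    "Most Conservative": 0.99,
    "Conservative": 0.95,
    "Moderate": 0.90,
    "Aggressive": 0.80,
    "Most Aggressive": 0.70,
}


def spam_probability(token_probs):
    """Combine per-token spam probabilities with naive Bayes, in log space."""
    eps = 1e-6
    log_spam = 0.0
    log_ham = 0.0
    for p in token_probs:
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0 and 1
        log_spam += math.log(p)
        log_ham += math.log(1.0 - p)
    # P(spam) = prod(p) / (prod(p) + prod(1 - p)), rewritten to avoid underflow.
    diff = max(min(log_ham - log_spam, 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(diff))


def is_spam(token_probs, sensitivity="Moderate"):
    """Flag the message when its probability reaches the level's threshold."""
    return spam_probability(token_probs) >= SENSITIVITY_THRESHOLDS[sensitivity]
```

With these assumed thresholds, a message whose tokens score `[0.9, 0.9]` is flagged at Moderate but not at Most Conservative, which is the kind of trade-off the sensitivity setting controls.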
Connection Filtering:
Real-time Blackhole List Servers. MailWasher Server checks the IP address of each connecting SMTP server against Real-time Blackhole Lists (RBLs) so that only non-blacklisted servers are allowed to send messages to the server.
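IP-based blackhole lists are conventionally queried over DNS: the octets of the connecting address are reversed and prepended to the list's zone, and an answer means the address is listed. The sketch below shows that general convention in Python, not MailWasher Server's own code. The zone `dnsbl.example.org` is a placeholder and the function names are hypothetical.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a blackhole-list zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """Return True if the zone answers with an address record for this IP.

    This performs a live DNS lookup. Treating any answer as "listed" is a
    simplification: real lists encode meaning in the returned 127.0.0.x value.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, `dnsbl_query_name("192.0.2.10", "dnsbl.example.org")` produces `10.2.0.192.dnsbl.example.org`.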
Blacklists and Whitelists:
IT administrators and users can set and control blacklists and whitelists through the MailWasher online web control panel. Email from legitimate senders whose addresses have been added to the whitelist will automatically bypass the antispam filters.
Which mailservers are supported?
- Microsoft Exchange Server 2000 and 2003. Exchange 2007 is not supported yet.
- Sendmail (this includes Scalix)
- Qmail
How does MailWasher Server compare to SpamAssassin?
There is a fundamental difference in the way that MailWasher and SpamAssassin classify mail. SpamAssassin attaches a ‘score’ to an incoming mail, and the receiving site can then decide how draconian its filtering is by adjusting the threshold applied to that score. MailWasher, on the other hand, is underpinned by the FirstAlert! service at the core of its functionality. If an email is marked as spam, then you can be sure that it is, as it has been checked by at least two people before entering the database.
The advantage of the MailWasher approach is that the number of false positives approaches zero, and that new spam messages are picked up almost immediately. The SpamAssassin approach is completely automated, so it requires no human intervention, but it ultimately relies on the inventiveness of the rule writers to catch new approaches to spam mails.
MailWasher is also geared towards performance: it is written in a compiled, machine-friendly language rather than an interpreted one, and it is placed such that it processes the incoming mail feed once, traps spam, and only hands ham on for local delivery. SpamAssassin favors flexibility, which does have a performance hit associated with it.
Comparison of features
Junk mail filtering
RBL support
Both SpamAssassin and MailWasher Server support IP4R and RHSBL RBLs. SpamAssassin also supports URIBLs, which are planned for MailWasher Server but not yet implemented.
Sender address blacklists and whitelists
Both MailWasher Server and SpamAssassin offer manual email address whitelists and blacklists. SpamAssassin can also provide automatic address whitelisting and blacklisting.
Hash sharing systems
Firetrust's FirstAlert!, which is one of the most important filtering features MailWasher Server offers, is a hash sharing system, similar in concept to the Distributed Checksum Clearinghouse, Pyzor, and the commercial (but free for personal use) Vipul's Razor, all of which are supported by SpamAssassin (though crucially, at this time, they are only supported poorly or not at all on Windows).
There are pros and cons to both FirstAlert! and each of the other services, including cost, accuracy, effectiveness, speed, and platform support. The FirstAlert! section of the IT Specs discusses FirstAlert!'s features and benefits and how it differs from the other systems.
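To make the idea of a hash sharing system concrete, the following Python sketch reduces a message body to a signature and checks it against a shared set of known-spam signatures. It is only an illustration under simple assumptions: the normalization steps and the use of SHA-256 are choices made for this example, and FirstAlert!, DCC, Pyzor and Razor each use their own signature schemes, which are not described here.

```python
import hashlib
import re


def message_signature(body):
    """Hash a whitespace- and case-normalized message body (illustrative scheme)."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def is_known_spam(body, known_signatures):
    """Check a message against a set of signatures shared by other reporters."""
    return message_signature(body) in known_signatures
```

An exact hash like this is defeated by trivial changes to the message, which is why production systems tend to rely on fuzzier signatures.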
Bayesian filtering
SpamAssassin and MailWasher Server both offer trainable Bayesian-statistics filtering. The implementations are broadly comparable, but again, each has its own pros and cons. As this feature is now the most important part of many antispam packages, this section discusses the differences between MailWasher Server and other implementations in general, including SpamAssassin, SpamBayes, and others.
SpamAssassin, along with most of the unix-based open-source mail filters, offers more flexible training options, as the programs can be called from scripts to feed in messages of one kind or another; but without an integrated quarantine facility there is no support for automatic training on message rescue, so the mail server administrator would need to do the work to integrate such a facility. In other words, SpamAssassin can be integrated in more ways, but the administrator needs to do the integration work.
One notable difference is that the algorithms in MailWasher Server were tuned to work effectively with a relatively small amount of training and with a single training set shared amongst the installation's users, because this was the expected use in companies and other such sites that were the target for MailWasher Server. In contrast, many other packages are designed to achieve the best scores possible after considerable user training, i.e. using relatively large corpora. This is great if you have such corpora, but most non-technical users don't maintain their own, and companies want to get effective filtering after the absolute minimum possible amount of time and effort from their employees or customers - and need to know that they will still see a very low false positive rate.
But perhaps the most important difference is the approach taken by the MailWasher Server developers in detecting common non-textual junk mail traits. Many open-source packages simply tokenize the raw message text, without any special handling of (for example) any HTML parts in the message, reasoning that the HTML data will itself have certain keywords that occur more in junk mail than legitimate mail. While some have found that this gives better overall filtering accuracy than applying such special cases, the MailWasher Server developers found that it typically generated too many false positives from newsletter mailings and other such HTML-based (usually corporate) communications, until such time as the user had trained the product on those specific emails - which was something the developers wished to avoid wherever possible, for the reasons discussed above. So, MailWasher Server uses a powerful front-end parser to decode and extract the message text; but, crucially, it also examines both the message structure and the HTML structure to look for special traits that the developers have identified as being common in junk mail and uncommon - in some cases virtually never occurring - in legitimate mail.
SpamAssassin primarily uses regular-expression rulesets, discussed below, to achieve this goal. Using specific code to detect these traits means that MailWasher Server can quickly and very accurately identify a number of characteristics, many of which would be very hard to express using more general-purpose mechanisms such as regular expressions. Providing the trait detection mechanism ensures that even with the front-end parser, the structure of the message still provides a lot of useful information to the statistical content filter.
Other rules, and custom rulesets
SpamAssassin offers many more filtering features through its ruleset design. Rulesets, some of which are included in the distribution, some of which are maintained by members of the SpamAssassin user community, and some of which are created or tweaked by server administrators as they see fit, offer considerable filtering power. In SpamAssassin these rules are given fixed (but adjustable) weights, whose values are determined by regular experiments performed by the SpamAssassin community (and may be tweaked by the sysadmin). The weightings of all matching rules - including all the filtering features discussed in the sections above - are combined to give an overall score for the message; administrators or users can then filter all messages above a certain threshold.
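The weighted-rule scoring described above can be sketched in a few lines of Python. This is a simplified model, not SpamAssassin's code or rule syntax: the rule names, patterns and weights are invented for the example, and the threshold of 5.0 is just an illustrative value.

```python
import re

# Hypothetical rules as (name, pattern, weight), for illustration only.
RULES = [
    ("SUBJECT_ALL_CAPS", re.compile(r"^Subject: [^a-z\n]*[A-Z][^a-z\n]*$", re.M), 1.5),
    ("MENTIONS_FREE_MONEY", re.compile(r"free money", re.I), 2.5),
    ("MANY_EXCLAMATIONS", re.compile(r"!{3,}"), 1.25),
]


def score_message(raw_message):
    """Return the summed weight of all matching rules and the names that hit."""
    hits = [(name, weight) for name, pattern, weight in RULES
            if pattern.search(raw_message)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]


def is_spam(raw_message, threshold=5.0):
    """A message is treated as spam when its total score reaches the threshold."""
    total, _ = score_message(raw_message)
    return total >= threshold
```

Lowering the threshold makes the filter more aggressive, and raising it makes the filter more conservative, without changing any individual rule.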
MailWasher Server has no directly comparable feature. Instead the MailWasher Server developers chose to use the trait system (see above) so that the weightings would be automatically adjusted as users correct the system's mistakes. They chose to use coded traits rather than regular expressions (as used by the majority of SpamAssassin rules) as there are a number of traits that would be either very difficult or very inefficient to implement using regular expressions; they found that some of these were such good indicators that MailWasher Server can frequently reliably catch a substantial amount of spam based on the traits alone (the developers mark those traits that they have found basically never appear in legitimate mail as "safe", and the product will, out-of-the-box, give those traits a weighting that will make them effective straight away if statistical filtering is turned on).
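To illustrate why a coded trait can be easier to write than a regular expression, here is a Python sketch of one hypothetical structural trait: a link whose visible text shows one host while its `href` points to another. This trait is invented for the example and is not taken from MailWasher Server's traits library; it uses only Python's standard `html.parser` and `urllib.parse` modules.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkMismatchTrait(HTMLParser):
    """Hypothetical trait: link text names one host, href points to another."""

    def __init__(self):
        super().__init__()
        self._href_host = None  # host of the <a> currently open, if any
        self._text = []         # visible text collected inside that <a>
        self.mismatches = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self._href_host = urlparse(href).hostname
            self._text = []

    def handle_data(self, data):
        if self._href_host is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href_host is not None:
            shown = "".join(self._text).strip()
            # Only compare when the visible text itself looks like a URL.
            shown_host = urlparse(shown).hostname if "://" in shown else None
            if shown_host and shown_host != self._href_host:
                self.mismatches += 1
            self._href_host = None


def has_link_mismatch(html):
    """Return True if any link displays a different host than it targets."""
    parser = LinkMismatchTrait()
    parser.feed(html)
    parser.close()
    return parser.mismatches > 0
```

Because the check works on the parsed structure, it is unaffected by attribute order, quoting style or markup nested inside the link, all of which make an equivalent regular expression fragile.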
That said, the two approaches are not mutually exclusive: it is hoped that the community will continue to contribute to the traits library (and to the ongoing testing to ensure that only those traits that are actually effective measures of junkness are provided), but adding support for customizable regular expressions is also planned, as this will make it substantially easier for server administrators to extend the product to deal with repeated problematic junk mail they are seeing.
Other features
Installation
Since MailWasher Server is implemented in native binary code, there are no interpreters or libraries that the systems administrator must install, no paths that need to be configured, and no scripts that need to be written. The daemons run automatically as system-wide services, natively on each platform. A GUI installer for Windows, and an executable installer for Linux and Unix, make setup as simple as possible.
Once the initial setup of the MailWasher Server daemons has been completed, configuring the product is done completely through the web-based interface. This provides a user-friendly way to control the product (and helps to minimize the size of the platform-specific installers that perform the steps above). Integration of MailWasher Server into the mail server's mail processing, however, is done in the manner usual for that MTA (see the mail conduits section in the IT Specs for more - for Windows, this is as simple as clicking an item in the MailWasher Server Start menu group).
SpamAssassin itself is written in Perl, which is generally completely portable - though some SpamAssassin modules use other languages or lower-level OS interfaces and so are not. In practice, nearly all Linux and Unix servers already have Perl installed, but Windows servers very rarely do; the Windows Perl port provided by ActiveState is however free and simple to install.
Getting SpamAssassin itself to run is however only the first part of the story; while nearly all Unix SMTP servers allow administrators to glue in whatever software administrators want (SpamAssassin supports numerous different integration options), for Exchange a COM object must be written to integrate other software. Christopher Lewis has created such an Exchange integration tool for SpamAssassin, though it is definitely not as tightly integrated as MailWasher Server's conduit, using temporary files, creating a new SpamAssassin process to examine each message, and re-parsing the processed message to work out what decision SpamAssassin made.
We feel that the simpler installation and tighter integration offered by MailWasher Server is a very important feature, as it substantially reduces the barriers to using the product for less experienced administrators.
Integrated quarantine management
One of the most important features that MailWasher Server provides that many other open-source mail filtering projects do not is a fully integrated, configurable message quarantine system. The IT Specs guide's section on the quarantine features explains the benefits of this in detail.
Strong Windows support
MailWasher Server has had full support for Windows since the first version of the product, and it is the strong intention of the developers that any and all features in the product will always be available and fully functional on all platforms.
As above, SpamAssassin does run on Windows, but with a reduced feature set (Razor, Pyzor and DCC are either unsupported or unreliable on Windows) and a lower-performance architecture (the Exchange integration script doesn't support 'daemon mode', so each mail is written to a temporary file and a new SpamAssassin process is started - both expensive operations on Windows).
Built-in statistics tracking
SpamAssassin offers no comparable analysis built in. However, packages are available for administrators to do this.
Online help/documentation
MailWasher Server comes with online help for all functionality built into the product.
SpamAssassin doesn't really have a user interface as such and so no online help, but it does have an excellent community-maintained site discussing the features and answering frequently asked questions.
Features Overview
| Feature | MailWasher Server | SpamAssassin |
|---|---|---|
| Collaborative tests | Yes, all manually verified | Yes, with no verification |
| Bayesian module | Yes | Yes |
| RBL | Global and Personal | Global only* |
| WBL | Global and Personal | Global only* |
| DNSBL | Yes | Yes |
| Quarantining | To separate area | To predefined user |
| Administrative Interface | Web based | Textual config files |
| Statistics reporting | Yes | Not directly |
| Implementation | C++ applications | Perl scripts/C daemons |
* Personal tests can be added on a per user basis only after global filtering
Supported platforms
| Platform | MailWasher Server | SpamAssassin |
|---|---|---|
| Exchange | Yes | Partially |
| Sendmail | Yes | Yes |
| qmail | Yes | Yes |
| procmail | No | Yes |
| imail | No | Yes |
| Exim | No | Yes |
How they work
MailWasher Server
Incoming mail is assigned a pass/fail using a number of tests. Failed mail is quarantined at this point, and a web based interface is provided to users to view/unquarantine them. Processing of the mail occurs immediately on delivery to the external facing SMTP port, so multiple domains are filtered transparently.
The following tests are used:
- Global Whitelist
- Personal Whitelist
- Global Blacklist
- Personal Blacklist
- FirstAlert!
- Statistical ( Bayesian ) analysis
- Realtime Blackhole lists
- Address Blacklists
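A pass/fail decision built from the tests listed above might be structured as in the Python sketch below. The order of evaluation and the short-circuiting are assumptions made for illustration, since this document does not specify them. The `Message` and `Lists` types and the `classify` function are hypothetical, and the content tests (FirstAlert!, statistical analysis, blackhole lists) are represented only as pluggable callables.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str
    recipient: str
    body: str = ""


@dataclass
class Lists:
    global_whitelist: set = field(default_factory=set)
    global_blacklist: set = field(default_factory=set)
    # Personal lists map a recipient address to that user's set of senders.
    personal_whitelist: dict = field(default_factory=dict)
    personal_blacklist: dict = field(default_factory=dict)


def classify(msg, lists, content_tests=()):
    """Return "pass" or "fail"; whitelists are assumed to win over everything."""
    if msg.sender in lists.global_whitelist:
        return "pass"
    if msg.sender in lists.personal_whitelist.get(msg.recipient, set()):
        return "pass"
    if msg.sender in lists.global_blacklist:
        return "fail"
    if msg.sender in lists.personal_blacklist.get(msg.recipient, set()):
        return "fail"
    # Remaining tests are callables that return True when the message
    # looks like junk mail.
    for test in content_tests:
        if test(msg):
            return "fail"
    return "pass"
```

In this sketch a whitelisted sender passes even if a later test would have failed the message, matching the statement elsewhere in this FAQ that whitelisted senders bypass the antispam filters.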
SpamAssassin
SpamAssassin attaches a ‘score’ to an email, using configurable tests. The higher the score, the more likely the email is to be spam. The mail administrator may then set their own threshold for marking an email as spam. The filters are usually installed both at the outward-facing SMTP port and in the local delivery agent, which will result in decreased performance.
Scores an email using the following tests:
- Header info
- Body phrases using custom rulesets
- Bayesian filtering
- Auto black/whitelisting by weighting the score of an email using the score history of the email sender
- Manual whitelisting, which subtracts 100 from the score
- Collaborative tests, which mark an email as spam by sending a checksum to an online database. This can then be looked up from that point onwards, and the result used to modify the score. Uses Razor, Pyzor, and DCC.
- RBLs
- DNS blocklists
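The automatic black/whitelisting item in the list above weights a message's score by the sender's score history. One simple way to model that is to pull each new score part of the way toward the sender's historical mean, as in this Python sketch. The class and function names are hypothetical, and the factor of 0.5 is an illustrative value rather than a statement about SpamAssassin's configuration.

```python
class SenderHistory:
    """Running total and count of the scores seen so far for one sender."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def mean(self):
        return self.total / self.count if self.count else None

    def record(self, score):
        self.total += score
        self.count += 1


def adjusted_score(score, history, factor=0.5):
    """Move the score toward the sender's mean, then record the raw score."""
    mean = history.mean()
    adjusted = score if mean is None else score + (mean - score) * factor
    history.record(score)
    return adjusted
```

A sender with a long record of low scores therefore gets some benefit of the doubt for one unusually spammy-looking message, while a sender with a poor record is penalized even for a clean one.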
Depending on which side of the user-defined score threshold the incoming email finishes, the mail is either delivered to the recipient or delivered to a ‘quarantine’ user. Summaries of these quarantined messages are delivered to the original recipient, who has the option to recover them.
Summary
When it comes to filtering features, SpamAssassin is a good choice in the open-source antispam world on the Linux/Unix platforms it originated on. However, MailWasher Server offers much better Windows support, simple installation, good integration, and a number of well-polished built-in features.
For those system administrators who want to integrate their antispam system in a custom way, or for whom filtering flexibility is an overwhelming priority, SpamAssassin remains a good choice - but for those who simply want a good, user-friendly turnkey system, MailWasher Server is an exciting new option.
As such, Firetrust does not view SpamAssassin and MailWasher Server as direct competitors: instead, we feel that MailWasher Server offers the open-source world a chance to attack a new segment of the antispam market. We hope that the open-source community will continue adding to MailWasher Server's filtering feature line-up and believe the integration it provides will allow it to achieve great success in these new market areas, and look forward to much cross-pollination of ideas between projects.
I can't find my problem here, what do I do now?
We have a forum available for general support. You can find it at http://www.firetrust.org/phpBB2.