How do I understand and correct false positives (legitimate mail misclassified as spam) in Mirapoint's Signature Edition Rapid Antispam?
Signature Edition Rapid Antispam generates the spam score for a message based on the probability that a message is spam, not necessarily on the contents of the message itself. Overall, this system is very accurate, but it can occasionally produce patterns of false positives that are unintuitive to some customers.
Reading Message Headers
Before you begin, check the message headers to make sure that the message was actually scored as spam by Signature Edition Rapid Antispam. The following is a subset of the headers from a spam message:
Received: from smtp01.example.net (EHLO smtp01.example.net) ([188.8.131.52])
    by razorgate.example.com (MOS 3.10.8-GA FastPath queued)
    with ESMTP id ALT96930;
    Mon, 11 Jan 2010 15:26:36 -0800 (PST)
Received: from vine.example.edu (vine.example.edu [184.108.40.206])
    by smtp01.example.net (Postfix) with ESMTP id 4EE63E00257
    for <email@example.com> Tue, 12 Jan 2010 00:26:07 +0100 (CET)
Subject: We are happy to impress our clients.
X-Junkmail-Status: score=300/50, host=razorgate.example.com
The X-Junkmail-SD-Raw header shows the internal details of the scoring, while the X-Junkmail-Status header shows the final score and identifies the host that generated the score. The possible scores generated by Signature Edition Rapid Antispam are:
If the classification in the raw header and the final spam score do not match, other factors may have contributed to the spam score, such as rescanning by Premium or Principal Edition Antispam or adjustment by modspamscore filters. If the Mirapoint spam headers are removed by other systems or mail clients, it can be very difficult to track down the source of incorrect mail handling. (MOS 4.2 and later use the more intuitive header name X-Junkmail-Signature-Raw instead of X-Junkmail-SD-Raw, but that is only a cosmetic difference.)
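As a first check, it can help to confirm that the X-Junkmail-Status header is still present and to extract its fields. The following is a minimal Python sketch, not a Mirapoint tool. It assumes the header always follows the layout of the sample above (score=N/M, host=NAME) and that the number after the slash is the spam threshold, which this article does not state. The function names are hypothetical.

```python
import re
from email import message_from_string

# Assumed layout, copied from the sample header in this article:
#   X-Junkmail-Status: score=300/50, host=razorgate.example.com
STATUS_RE = re.compile(r"score=(\d+)/(\d+),\s*host=(\S+)")


def parse_junkmail_status(value):
    """Return (score, assumed_threshold, host), or None if the layout differs."""
    match = STATUS_RE.search(value)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), match.group(3)


def find_junkmail_status(raw_message):
    """Parse a raw message and return its status tuple.

    Returns None when the header is missing, for example because another
    system or mail client stripped the Mirapoint spam headers.
    """
    message = message_from_string(raw_message)
    value = message.get("X-Junkmail-Status")
    if value is None:
        return None
    return parse_junkmail_status(value)
```

A result of None from find_junkmail_status suggests that the header was removed or never added, in which case the score has to be traced by other means.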
If you have a filter that discards mail with high spam scores, you may be able to retrieve enough information from your SMTP logs to track the source of the problem, though it is more difficult. Some of the common problems are:
Bad IP Addresses
The overwhelming majority of false positives occur because legitimate mail is sent from an IP address that is also sending spam. Typical sources are a virus-infected Windows PC, a network router that connects several systems (one of which is an infected PC sending spam), or a mail host with poor security. In such cases, the first step is to identify and disinfect the offending system or systems.
You can look up the reputation of an IP address at the MiraCare Messaging Center; this looks up IP reputations in the Reputation Hurdle database, which is largely shared by the antispam system. (There are also many third-party websites that let you look up the reputation of an IP address, and possibly even show samples of spam mail sent from that IP, but they vary over time in their usefulness; you can search on the web for terms like "RBL Lookup" or "email blacklist".) Once you are sure that an IP address is no longer sending spam you can use the "Report a Mistakenly Blocked IP Address" link on that page to request that the reputation of the IP address be cleared.
In some cases a remote user may inherit a dynamically-allocated IP address with a bad reputation even though they are not sending spam themselves. If you are sure that is the case, you can use the "Report a Mistakenly Blocked IP Address" form to request that a false positive be cleared; as of 2009 a bad reputation stays in the database for three days after the last spam is seen from that IP address.
Even if the effective IP address from which a message was injected (the "ip" field in the raw spam header) has a good reputation, a message may still be scored as spam based on other IP addresses recorded in the Received headers.
Scoring the same message multiple times increases the probability of triggering the recurrent pattern detection that is at the heart of Signature Edition Antispam. This can happen when a message is scored on an inbound mail router (IMR), a mailstore, and an outbound mail router after being forwarded, or when legitimate bulk mail such as newsletters or list traffic originates at your site. The most important step you can take to prevent these problems is to set up trust relationships among your systems so that they trust spam scores originating on the other hosts.
Mail Flow Patterns
The IP address from which an email message originates is an important component of the spam-probability profile of a message. The effective IP address of a message is calculated by stepping back through the Received headers to the closest host that is not on the list of relay hosts. (If the message originated at a host on the relay list, or at a host using a reserved local IP address, an effective IP address of 0.0.0.0 will be shown in the headers.)
The algorithms assume that all of your internal hosts that channel mail from the outside world to your Mirapoint appliance are on the relay list. However, such hosts are also trusted to relay mail (send mail from one external address to another external address), so it is important to understand the implications of adding hosts to your relay list.
The effective IP address of a message is recorded in the X-Junkmail-SD-Raw header added by Signature Edition Antispam.
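The stepping-back rule described above can be illustrated with a short Python sketch. This is an approximation for troubleshooting, not the appliance's actual algorithm. It assumes that each Received header carries the sending host's IPv4 address in square brackets (as in the sample headers in this article), that the relay list is given as a set of IP address strings, and that "reserved local" corresponds to what Python's ipaddress module treats as private. The function name effective_ip is hypothetical.

```python
import ipaddress
import re
from email import message_from_string

# Matches a bracketed IPv4 address such as "[184.108.40.206]".
BRACKETED_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def effective_ip(raw_message, relay_hosts):
    """Approximate the effective IP address of a message.

    Received headers appear newest first, so iterating from the top
    steps back from the appliance toward the origin of the message.
    """
    message = message_from_string(raw_message)
    for received in message.get_all("Received", []):
        match = BRACKETED_IP.search(received)
        if match is None:
            continue  # no usable address recorded in this hop
        ip = match.group(1)
        if ip in relay_hosts:
            continue  # trusted relay: keep stepping back
        try:
            address = ipaddress.ip_address(ip)
        except ValueError:
            continue  # not a valid IPv4 address
        if address.is_private:
            return "0.0.0.0"  # reserved local address
        return ip
    # Every hop was a relay host, or no address was found.
    return "0.0.0.0"
```

Comparing the result for a problem message with the "ip" field of the raw spam header can show whether a host is missing from the relay list.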
Some less-common causes of false positives:
Legitimate Bulk Mail
In a few cases, Mirapoint customers have found that outbound bulk mail (for example, a university newsletter) is being scored as spam. If you are spam-scoring outbound mail, you may need to replace the default rpdengine UCE ruleset on your outbound mail routers with one that is optimized for outbound mail, such as rpdoutbound; please contact Mirapoint Support for assistance if that does not resolve the problem. The repeated patterns in automated form letters can also trigger spam detection algorithms. These cases can likewise be mitigated by switching to an engine optimized for outbound mail, which relies on global spam detection patterns and omits the locally contributed component that can overweight local bulk mail.
If some of your users have been tricked into giving their email passwords to spammers, you may find that your Mirapoint appliance is being used to send outgoing spam. In addition to causing performance problems, and causing many third parties to refuse to accept mail from your network, this may also occasionally cause your system to be seen as a source of spam. If you suspect this to be the case, check your outbound mail queues for mail that is neither to nor from local users and contact Mirapoint Support if you need further assistance.
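The queue check described above amounts to a simple test on each message's envelope addresses. The Python sketch below shows that test only; how you export sender and recipient addresses from your queues depends on your system and is not covered here. The function name and the local_domains parameter are hypothetical, and local_domains is assumed to be a set of lowercase domain names.

```python
def is_neither_to_nor_from_local(sender, recipients, local_domains):
    """Return True if no envelope address belongs to a local domain."""

    def is_local(address):
        # Take the part after the last "@" and compare case-insensitively.
        domain = address.rpartition("@")[2].lower()
        return domain in local_domains

    if is_local(sender):
        return False
    return not any(is_local(recipient) for recipient in recipients)
```

Messages for which this returns True are candidates for closer inspection, not proof of abuse.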
Content Shared With Spam
In very rare cases, legitimate messages are scored as spam because they share content with spam, for example, a return address that has been widely forged by spammers, or a URL pointing to legitimate content on a website with compromised security that also hosts malicious content.
False positives that cannot be attributed to a known common cause should be reported to Mirapoint following the steps outlined in the Knowledge Base article Submitting Incorrectly Flagged Spam on Mirapoint Appliances. For urgent, recurring false positives on peer-to-peer mail (mail sent from one person to another, as opposed to list or automated mail), you may want to open a support case to escalate the handling and resolution.