I was asked today whether some email is failing to get through the spam filters on our mail server. In general, only spam is filtered out, but there are some uncommon cases where wanted email is rejected. The more email is filtered, the more opportunities there are for mistakes. Once in a while email gets classified as spam when it is not. This is called a false positive. There will always be some false positives because mail servers are configured by humans and email is created and sent by humans. Humans make mistakes. It’s only a question of when and at what level false positives are acceptable.
An example would be two people who want to have an email conversation about certain kinds of pharmaceuticals. Some of their emails might seem to disappear. The thing to realize is that the filtering that loses that kind of email is the same filtering that saves us from deleting hundreds of spam emails daily, which adds up to many thousands per year. Most people would say that the inconvenience is worth it.
We are now doing greylisting. When a sender has not been seen before, our server sends a deferral response (a 4xx-series SMTP reply, which signals a temporary failure, not an outright refusal). It’s accompanied by a message, “Please try again in 1 minute.” Properly configured mail servers will try again because they understand that a deferral response is not a refusal. Spammer mail servers seldom try again because they need to send as much email as possible before they get blacklisted.
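The greylisting decision described above can be sketched in a few lines of Python. This is only an illustration, not our server's actual configuration: the class name `Greylist`, the choice to key on client IP, sender and recipient, the reply code 451 and the in-memory dictionary are all assumptions made for the example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Greylist:
    # Minimum wait before a retry is accepted (assumed: 60 s, matching
    # the "try again in 1 minute" message).
    min_delay: float = 60.0
    # Maps (client_ip, sender, recipient) -> time of the first attempt.
    first_seen: dict = field(default_factory=dict)

    def check(self, client_ip, sender, recipient, now=None):
        """Return an SMTP-style (code, text) reply for one delivery attempt."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        # Record the first attempt; later attempts reuse the stored time.
        seen = self.first_seen.setdefault(key, now)
        if now - seen < self.min_delay:
            # 4xx means "temporary failure": a well-behaved sender retries.
            return 451, "Please try again in 1 minute."
        return 250, "OK"
```

A first attempt is deferred, and the same sender retrying after the delay is accepted. A production version would also expire old entries and keep the table on disk.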
False positives can happen when the sending server is misconfigured. Some servers fail to differentiate a deferral from a refusal. They don’t try again, so the email is never delivered. Not to mince words, the person or people in charge of the server do not understand what they are doing.
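For comparison, here is roughly what a sending server is supposed to do with the reply it gets. This is a minimal sketch. The function name `next_action` and the five-day give-up time are assumptions for illustration, and real mail software has its own retry schedule.

```python
def next_action(reply_code, message_age_seconds, give_up_after=5 * 24 * 3600):
    """Decide what a sending server should do with an SMTP reply code.

    give_up_after is an assumed limit (five days) on how long to keep retrying.
    """
    if 200 <= reply_code < 300:
        return "delivered"
    if 400 <= reply_code < 500:
        # Deferral: keep the message queued and try again later.
        if message_age_seconds < give_up_after:
            return "retry"
        return "bounce"
    # 5xx (or anything unexpected): permanent failure, notify the sender.
    return "bounce"
```

A server that returns "bounce" for a 451 reply is making exactly the mistake described above.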
Greylisting also helps us spot email being sent to web-harvested addresses. When we see the same email going to an architectural firm in Dubai, a home tutoring service in Des Moines, a car repair shop in San Diego and a rare antiques shop in London, it’s a good bet that it’s spam.
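One simple way to picture that check is to group messages by their content and count how many unrelated recipient domains each one is addressed to. The sketch below is an assumption about how such a check could work, not a description of our filter. The function name `suspicious_bodies`, the exact-match hashing and the threshold of three domains are all made up for the example, and real spam usually varies enough that exact matching would miss much of it.

```python
import hashlib
from collections import defaultdict


def suspicious_bodies(deliveries, threshold=3):
    """Return digests of message bodies sent to many unrelated domains.

    deliveries: iterable of (body, recipient_address) pairs.
    threshold:  assumed number of distinct recipient domains that
                makes a message look like a harvested-address mailing.
    """
    domains_by_body = defaultdict(set)
    for body, recipient in deliveries:
        # Normalise lightly so trivial differences do not hide duplicates.
        normalised = body.strip().lower().encode("utf-8")
        digest = hashlib.sha256(normalised).hexdigest()
        domain = recipient.rsplit("@", 1)[-1].lower()
        domains_by_body[digest].add(domain)
    return {d for d, doms in domains_by_body.items() if len(doms) >= threshold}
```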
We are blocking thousands of spam emails per day using these two techniques.
We have also started doing sender verification. This is done by connecting to the mail server for the sending address and testing whether a bounce message to that address would be accepted, without actually sending one. If the sender address is fake, it’s reasonable not to take mail from it. This is controversial because it puts load on other servers which are innocent of spamming, and spammers can exploit it to attack those servers. However, Gmail, Yahoo, AOL, Hotmail and others do this to our servers. Fair is fair. But again, false positives can occur due to badly configured servers. The email standards documents say that bounces must be accepted when sent to live addresses. Some servers refuse bounces because users complain that they are getting returned email they didn’t send. There are solutions to that problem, but refusing bounces isn’t one of them. Some mail server administrators simply do not know that it is possible to tell the difference between verification and a real bounce.
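The verification step can be sketched with Python’s standard `smtplib`. This is a rough illustration under several assumptions: the function name `verify_sender` is made up, the caller is assumed to have already looked up the sender domain’s mail exchanger (`mx_host`), and the `connect` parameter exists only so the function can be tried against a stand-in instead of a live server. The probe uses the empty sender address that bounce messages use, asks whether the recipient would be accepted, and then quits without sending any message data.

```python
import smtplib


def verify_sender(sender, mx_host, connect=smtplib.SMTP, timeout=30):
    """Probe whether mx_host would accept a bounce addressed to sender.

    Returns "accept", "defer" or "reject". No message is ever sent.
    mx_host is assumed to be the sender domain's mail exchanger.
    """
    smtp = connect(mx_host, 25, timeout=timeout)
    try:
        smtp.helo()
        # Bounces are sent from the null sender: MAIL FROM:<>
        code, _ = smtp.mail("")
        if 200 <= code < 300:
            # Would the server take mail for the claimed sender address?
            code, _ = smtp.rcpt(sender)
        if 200 <= code < 300:
            return "accept"
        if 400 <= code < 500:
            return "defer"  # temporary failure: not proof the address is fake
        return "reject"
    finally:
        try:
            smtp.quit()  # end the session before the DATA stage
        except smtplib.SMTPException:
            pass
```

A server that refuses all mail from the null sender shows up here as "reject" even for a real address, which is the false positive described above.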
Many people expect that email should be perfectly reliable and run their business partly based on the assumption that it is. Unfortunately, that assumption is unrealistic. Without filtering, email is essentially unusable because of the volume of spam. The entire system is imperfect because it’s run by an unpredictable collection of imperfect humans. It is flawed. To say it more plainly, it’s a mess. Nobody in his right mind would design email to work the way it does on the Internet as it is today.
On balance, false positives are relatively rare. For the vast majority of users they will never be a problem. All the same, it’s a good idea to remember that they are possible.