The Latest Spam Getting Through Your Filtering – and What to Do About It
Written by Lee Clemmer on September 8, 2009
Despite the generally excellent performance of most modern, well-tuned anti-spam engines, some spam is going to get through. We may be lulled into a false sense of superiority when our anti-spam tools and techniques bear fruit for a while and the results are better than excellent: no spam in our inboxes for an entire day, a week, or longer. Then it returns. We’ve all seen it happen: some strangely formatted message that you or I can tell at a glance is garbage, a bizarre attempt to sneak past the heuristics that has, surprisingly, succeeded.
Lately it has been some rather clever nonsense. I’ve been getting spam emails with a peculiar twist. Many of them contain sentences that at first glance appear meaningful and not at all spam-like. On closer look, the sentences are strange and not quite sensible. They were consistently getting through the spam filtering. What was strangest to me was the complete lack of marketing content or any attempt to sell. Each message did contain a link, but the link never pointed to the same web destination twice, nor to an obviously undesirable site. This may have been one of the reasons this set of spam got by: to the filters, it looked no different from a sentence or two sent by a friend describing a link they thought I would be interested in.
The content appears to be randomly generated by some sort of sentence constructor, which picks nouns, verbs, and adjectives and strings them together so that they seem to form a coherent sentence but do not. The sentences are not riddled with sales pitches or attempts to excite your interest; they are just random, and often eerie in their close-but-not-quite structure. Here’s an example to show what I mean.
Part of him was shocked, but most. of him wasn’t even surprised. seen that right away.
There were maybe fifty in all, most. no bigger than plump raisins. No.
This is just one of the most recent ones. Often they have better punctuation; notice that this one has a few periods without spaces following and is missing a few capital letters. What we don’t see are the crazy mixed-case words, the intentionally misspelled sexual content, or any obvious attempt to excite us or lead us into clicking the attached link, which is apparently unrelated to the text.
Now here’s the thing I found problematic. I can’t see how an anti-spam scanner could usefully parse this content in most cases: it is random enough that it shares little with other spam of the same “type”, and yet the same words could easily be valid if you wrote to me, “Part of him was shocked, but most of him wasn’t even surprised.” Does it make sense to try to include this in our heuristic anti-spam scanners? I think not. We have to combat this by other means.
An old standby would have been to block inbound messages from this sender or IP address, but unfortunately this one came from Hotmail, and I just can’t see blocking all email from Hotmail senders, as much as I might want to some days. The first thing to do, though, was to examine the headers and the log files to be sure that the mail did in fact come from where it claimed, a Hotmail address, and not from some other source. I still see significant forging of email headers.
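A rough way to automate that header check is sketched below in Python, using only the standard library. It compares the domain in the From: address with the host names recorded in the newest Received: header, which is the one your own mail server adds. The sample message, its host names, and the function names are made up for illustration; real Received: lines vary a great deal between mail servers, so treat this as a starting point rather than a reliable test.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Loose pattern for dotted host names (it also matches dotted IP addresses).
HOST_RE = re.compile(r"[a-z0-9-]+(?:\.[a-z0-9-]+)+")

def sender_domain(msg):
    """Domain part of the From: address, lowercased ('' if absent)."""
    _, addr = parseaddr(msg.get("From", ""))
    return addr.rpartition("@")[2].lower()

def received_chain(msg):
    """Received: headers, newest hop first, with folded whitespace collapsed."""
    return [" ".join(h.split()) for h in msg.get_all("Received", [])]

def newest_hop_matches_sender(msg):
    """True if the newest hop names a host inside the From: domain."""
    chain = received_chain(msg)
    domain = sender_domain(msg)
    if not chain or not domain:
        return False
    # Look only at the "from ..." half, not at our own "by ..." host.
    from_part = chain[0].lower().split(" by ", 1)[0]
    return any(h == domain or h.endswith("." + domain)
               for h in HOST_RE.findall(from_part))

# Hypothetical message; every host name and address here is invented.
RAW = """\
Received: from bay0-omc1-s1.bay0.hotmail.com (bay0-omc1-s1.bay0.hotmail.com [192.0.2.10])
    by mx.example.org with ESMTP; Tue, 8 Sep 2009 10:00:00 -0400
Received: from unknown (HELO localhost) (198.51.100.7)
    by bay0-omc1-s1.bay0.hotmail.com; Tue, 8 Sep 2009 09:59:58 -0400
From: "Someone" <someone@hotmail.com>
To: me@example.org
Subject: hello

Part of him was shocked.
"""

msg = message_from_string(RAW)
print(sender_domain(msg), newest_hop_matches_sender(msg))
```

Only the Received: line written by your own server can be trusted; anything below it may have been forged by the sender, and even within the top line the name the remote host announced for itself is just a claim. The server's log files remain the better evidence.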
The next check I made was whether the link embedded in the email actually pointed to the Web site it claimed to, rather than displaying one URL while linking to another. In this particular case, the link was to a Google Reader URL, and it did lead to some objectionable content. Although I can’t very well block every message that might have a Google Reader link in it, you might be able to. It depends on your email use policy and Internet access policy; perhaps your business and your employees simply have no use for Google Reader at work. Looking further, I found several more spam messages that had gotten through, with completely different text content, completely random and almost literary, with no obvious mention of sexual content, all sent from major web-based email services.
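For HTML mail, the "does the link go where it says" comparison can be sketched in a few lines of Python with the standard library's html.parser. The code collects each anchor's href and visible text, and flags the ones whose text looks like a URL but names a different host. The class and function names and the sample HTML are my own illustration, not part of any particular filter.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collects (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the anchor currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def host_of(url):
    """Lowercased host name of a URL; tolerates a missing scheme."""
    if "://" not in url:
        url = "http://" + url
    return (urlparse(url).hostname or "").lower()

def mismatched_links(html):
    """Links whose visible text is a URL on a different host than the href."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    suspicious = []
    for href, text in parser.links:
        looks_like_url = text.lower().startswith(("http://", "https://", "www."))
        if looks_like_url and host_of(text) != host_of(href):
            suspicious.append((href, text))
    return suspicious

# Invented sample: the first link shows one site but points to another.
SAMPLE = (
    '<p><a href="http://tracker.example.net/x">http://www.example.com/news</a> '
    '<a href="http://www.example.com/a">http://www.example.com/a</a> '
    '<a href="http://example.org/">click here</a></p>'
)
print(mismatched_links(SAMPLE))
```

Links whose visible text is ordinary words, like "click here", are deliberately ignored, since there is nothing to compare against. A mismatch is a reason to look closer, not proof of spam: legitimate newsletters often route links through tracking hosts.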
The common thread was the inclusion of a link that pointed to Google Reader. That’s what we’d need to filter as objectionable content. Other spam emails carried links to other sites, but there were enough of these (three) in a short time to see that this was the mechanism being used. The near-random, context-free nature of the Google Reader links makes blocking them by URL difficult: the ones posted by users simply have long numerical strings as identifiers, which are pretty much random as well. Rather than blocking any and all links to Google Reader content, it might be possible to selectively block ranges of users, although I can’t yet see how to do that efficiently.
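If your policy does allow blocking Google Reader links outright, the blunt version of that filter might look like the Python sketch below. It pulls URLs out of the message body and flags any whose host and path match a rule. The rule itself (host google.com, path starting with /reader/) and the sample URL are my assumptions about how these links look; adjust them to what you actually see in your own mail.

```python
import re
from urllib.parse import urlparse

# Rough URL matcher for plain-text bodies; stops at whitespace and quotes.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)

# (host, path prefix) pairs to flag. Assumed values, not a vetted blocklist.
BLOCKED = [("google.com", "/reader/")]

def blocked_links(body, rules=BLOCKED):
    """Return the URLs in body that match any (host, path prefix) rule."""
    hits = []
    for url in URL_RE.findall(body):
        parts = urlparse(url)
        host = (parts.hostname or "").lower()
        for rule_host, rule_prefix in rules:
            on_host = host == rule_host or host.endswith("." + rule_host)
            if on_host and parts.path.startswith(rule_prefix):
                hits.append(url)
                break
    return hits

# Invented body text with a made-up numeric identifier.
BODY = (
    "Part of him was shocked. "
    "http://www.google.com/reader/shared/01234567890123456789 "
    "and http://www.google.com/search?q=raisins"
)
print(blocked_links(BODY))
```

This flags every Reader link regardless of who posted it. Blocking only certain users would mean matching on the numeric identifier in the path, and since those identifiers look random, that amounts to keeping a list of known-bad ones rather than blocking ranges.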
Posted in email management, email security, security


