Spam filters work by measuring the contents of a message against thousands of predetermined rules, such as how many times a keyword appears or whether a certain phrase occurs alongside another. Each time a message trips a rule, it is given a numerical score.
The scores are then totaled into a Spam Score and, depending on the threshold set, the filter decides whether the message should be delivered to the inbox or the spam box.
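To make the scoring idea concrete, here is a minimal sketch in Python. The rules, the score values and the threshold of 3.0 are all made up for illustration; a real filter uses thousands of rules, and nothing here is taken from any particular product.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    name: str
    score: float                      # points added when the rule is tripped
    matches: Callable[[str], bool]    # returns True if the message trips the rule


def count_word(word: str, text: str) -> int:
    """Count whole-word, case-insensitive occurrences of a word."""
    return len(re.findall(rf"\b{re.escape(word)}\b", text, re.IGNORECASE))


def has_phrase(phrase: str, text: str) -> bool:
    """True if the phrase appears anywhere in the text, ignoring case."""
    return re.search(rf"\b{re.escape(phrase)}\b", text, re.IGNORECASE) is not None


# Hypothetical rules: one counts a repeated keyword, one checks two phrases together.
RULES: List[Rule] = [
    Rule("keyword_repeated", 1.5, lambda t: count_word("free", t) >= 3),
    Rule("phrase_pair", 2.0,
         lambda t: has_phrase("wire transfer", t) and has_phrase("urgent", t)),
]


def spam_score(text: str, rules: List[Rule] = RULES) -> float:
    """Total the scores of every rule the message trips."""
    return sum(rule.score for rule in rules if rule.matches(text))


def classify(text: str, threshold: float = 3.0) -> str:
    """Send the message to the spam box once its score reaches the threshold."""
    return "spam" if spam_score(text) >= threshold else "inbox"
```

With these example rules, a message that repeats "free" three times and pairs "urgent" with "wire transfer" scores 3.5 and lands in the spam box, while a message that mentions "free" once scores 0 and reaches the inbox. Raising or lowering the threshold is what makes a filter more lenient or more aggressive.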
In a recent study, Microsoft did something rather interesting with some of the more common keywords that spam filters use to identify messages as potential threats. They assigned each keyword to a more general category and totaled the instances in each category.
While the results are interesting, they probably won’t surprise many people.
- Pharmacy – Non-sexual: 28%
- Non-pharmacy Product Ads: 17.2%
- 419 Scams (Nigerian Prince and other wire transfer scams): 13.2%
- Financial: 8.9%
- Gambling: 6.1%
- Dating/Sexually Explicit Material: 4.8%
- Phishing: 4.8%
- Pharmacy – Sexual: 3.8%
- Malware: 3.4%
- Image Only: 3.1%
- Get Rich Quick: 2.5%
- Fraudulent Diplomas: 2%
- Stocks: 1.3%
- Software: 1%
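The labeling-and-totaling approach that produced the list above can be sketched in a few lines of Python. The keyword-to-category mapping below is purely hypothetical, since the study's actual keyword list is not given here; only the method (map each keyword hit to a category, count, convert to percentages) reflects the description.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical mapping from keyword to general category (not from the study).
KEYWORD_CATEGORY: Dict[str, str] = {
    "pills": "Pharmacy",
    "prescription": "Pharmacy",
    "casino": "Gambling",
    "jackpot": "Gambling",
    "verify your account": "Phishing",
}


def category_shares(keyword_hits: Iterable[str]) -> Dict[str, float]:
    """Return each category's share of all categorized keyword hits, in percent."""
    counts: Counter = Counter()
    for keyword in keyword_hits:
        category = KEYWORD_CATEGORY.get(keyword)
        if category is not None:      # keywords with no category are ignored
            counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: round(100 * n / total, 1) for category, n in counts.items()}
```

Because each share is rounded separately, the percentages in a breakdown like this can add up to slightly more or less than 100.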
Explaining a Few Things
As always, the largest share of the spam, nearly half, comes from product advertisements (pharmaceutical and non-pharmaceutical) because most people don’t fall for the get-rich-quick schemes and the 419 scams anymore.
The Image Only category may have drawn a few second looks as well. It doesn’t appear in most spam studies because it is a relatively new tactic that spammers have been using to get past spam filters.
Remember, spam filters are usually set to look for keywords and phrases, so text embedded in a graphical image won’t be flagged as likely spam. Yet while this may seem like a smart ploy by the spammers, this type of spam is actually down from 8.7% a year ago.
The category that showed the most significant increase was phishing. Its average for the year, 4.8%, includes a January figure of just under 3% and a June figure of just over 7%.
The Blocking of Spam Messages
The same report also looked at how many spam messages were blocked during the same period (July 2010 to June 2011).
The numbers are down, from 90 billion at the start of the survey period to just under 30 billion at the end, a dramatic decrease attributed primarily to the takedowns of the Cutwail and Rustock botnets in August 2010 and March 2011, respectively. However, it should be noted that while the number of blocked messages fell by over 20 billion immediately after the Cutwail takedown, the Rustock closure was actually followed by an increase of roughly one billion messages from March to April. It wasn’t until May 2011 that a slight decrease in spam levels was seen.
Based on these numbers, there doesn’t seem to be much to say that hasn’t been said already. Spam levels are down, pharmaceuticals are the number one subject of spam, and people will still try the money transfer and lottery winner scams.
But there are some other areas that people often ignore.
With an increase in phishing scams over the year, you can bet that the cyber criminals are looking towards more sophisticated ways of depriving people of their money.
The other number to keep an eye on is where the malware category stands.
According to this report, messages containing malware accounted for a little more than 3% of the spam blocked. In fact, this was one of the most volatile categories, moving from around 2% to over 4% before settling at 3.4%.
This category is so important because a significant increase here could mean that a new botnet is being built.
It will be interesting to see where the malware numbers stand six months from now. If history does repeat itself, it is only a matter of time before a new botnet is built to replace those that have been eliminated. If so, expect it to be even harder to eliminate than its predecessors.
The game is certainly changing in favor of the good guys; however, complacency can make a fool of anyone who thinks that the war against spam is one based on a few numbers.