http://itmanagement.earthweb.com/columns/executive_tech/article.php/3741331/Why-Cant-Google-Catch-Easy-Spam.htm
Why Can't Google Catch Easy Spam?
By Mike Elgan
April 16, 2008
Google's Gmail spam filter is legendary, and considered one of the best in the business. And it's true, for the most part. I currently use Gmail's spam filtering both for Gmail and for POP3 mail, which I "launder" through Gmail in order to take advantage of the great spam filtering. One of the amazing features of Gmail spam filtering is its low percentage of "false positives," meaning legitimate e-mail that the system identifies as spam. Another important fact is that Gmail's spam filter gets better over time. Its ability to catch spam keeps going up, and false positives go down. I currently have 4776 messages in my Spam folder (can anyone top that?), and get about 10 or so spam messages in my inbox every day. As a ratio, you really can't beat that.
Gmail's spam filtering really is great, and I shouldn't complain. But, hey, it's my job. I'm perplexed about Gmail's poor handling of two specific kinds of easy-to-spot spam: 1) Nigerian 419 e-mail, and 2) foreign-language spam. I get Nigerian scam e-mails in my inbox every day. They somehow evade the Gmail spam filters. Sometimes I identify them using the "Report Spam" button, and then I get an identical one later in the day.
These e-mails seem to me to be trivially easy to identify. They tend to use oddly out-of-style language, and unusual formality combined with grammatical errors. Just flag any message sent from Nigeria or surrounding countries that contains any five of the following words or phrases: "pray," "faith," "proposal," "introduce myself," "widow," "fund," "transfer," "expenses," "bank," "urgently," "the late Mr."
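As a rough illustration, the rule above could be sketched in a few lines of Python. This is only a toy version of the heuristic as I've described it, not anything Gmail actually does. The function names, the country-code set, and the idea that the sender's country is already known (say, from IP geolocation) are all assumptions made for the example.

```python
import re

# Words and phrases from the list above, lowercased for matching.
SCAM_PHRASES = [
    "pray", "faith", "proposal", "introduce myself", "widow",
    "fund", "transfer", "expenses", "bank", "urgently", "the late mr.",
]

# Hypothetical stand-in for "Nigeria or surrounding countries":
# Nigeria plus its four land neighbors, as ISO 3166-1 alpha-2 codes.
SUSPECT_COUNTRIES = {"NG", "BJ", "NE", "TD", "CM"}


def count_scam_phrases(body: str) -> int:
    """Count how many distinct listed phrases appear in the message body."""
    text = body.lower()
    hits = 0
    for phrase in SCAM_PHRASES:
        # Anchor only at the start of a word, so "fund" also matches
        # "funds" but "bank" does not match "embankment".
        if re.search(r"\b" + re.escape(phrase), text):
            hits += 1
    return hits


def looks_like_419(body: str, origin_country: str, threshold: int = 5) -> bool:
    """Flag mail from a suspect country that contains `threshold` or more phrases.

    `origin_country` is assumed to be supplied by the caller; working out
    where a message really came from is outside this sketch.
    """
    return (
        origin_country in SUSPECT_COUNTRIES
        and count_scam_phrases(body) >= threshold
    )


sample = (
    "I wish to introduce myself. I am the widow of the late Mr. Abacha. "
    "I pray you will help transfer the fund urgently from the bank."
)
print(count_scam_phrases(sample), looks_like_419(sample, "NG"))
```

A real filter would weigh these signals against many others rather than applying a hard cutoff, which is exactly where the false-positive worry comes in.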
Given the fantastic job Gmail does of catching other kinds of spam, how hard can it be to stop e-mail with these obvious clues?
Even more confusing is why foreign-language spam isn't flagged. I've been using Gmail for two years, and I have flagged as spam every single e-mail I've received that's entirely in Mandarin or Russian.
Google: I don't speak Chinese! When I get e-mail that's entirely Chinese characters, it's spam, OK? What could be easier than that?
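To show how simple the check could be, here is a minimal Python sketch that measures what share of a message's letters belong to a given script and compares that against a per-user block list. The Unicode ranges are the standard CJK Unified Ideographs and Cyrillic blocks, but the function names, the `blocked_scripts` setting, and the 90 percent threshold are assumptions made for illustration, not a description of any real Gmail feature.

```python
# Unicode code point ranges for the scripts in question.
# "han" covers CJK Unified Ideographs and Extension A (characters that
# are also used in Japanese); "cyrillic" covers the basic Cyrillic block.
SCRIPT_RANGES = {
    "han": [(0x4E00, 0x9FFF), (0x3400, 0x4DBF)],
    "cyrillic": [(0x0400, 0x04FF)],
}


def script_fraction(text: str, script: str) -> float:
    """Return the fraction of letters in `text` that belong to `script`.

    Digits, punctuation, and whitespace are ignored, so a message made of
    Chinese characters plus a URL is still mostly "han".
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    ranges = SCRIPT_RANGES[script]
    in_script = sum(
        1 for ch in letters if any(lo <= ord(ch) <= hi for lo, hi in ranges)
    )
    return in_script / len(letters)


def is_unwanted_script(text: str, blocked_scripts: set[str],
                       threshold: float = 0.9) -> bool:
    """True if the message is almost entirely in a script the user blocked.

    `blocked_scripts` is a hypothetical per-user setting, e.g. {"han"}.
    """
    return any(
        script_fraction(text, script) >= threshold
        for script in blocked_scripts
    )


print(is_unwanted_script("你好，这是一个测试。", {"han", "cyrillic"}))
print(is_unwanted_script("Hello, how are you?", {"han", "cyrillic"}))
```

Because the block list belongs to the individual user, someone who does correspond in Chinese or Russian would simply leave it empty.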
I understand that Google has to be careful. It can't summarily dismiss notes about widows in Nigeria, or Chinese- or Russian-language e-mail. Some percentage of users actually get legitimate e-mail that falls into these categories. But Gmail should be capable of individual, user-added criteria, such as aggressive Nigerian-scam filtering, and maybe an option that tells Gmail to consider all mail in a given language as spam.
What about you? Does apparently easy-to-spot spam make it into your Gmail inbox? Tell me about it: mike.elgan@gmail.com