SpamOrHam.org attempting to improve spam detection even further

May 28th, 2006

Want to feel good about yourself today?  How about doing a mitzvah and helping to improve the future of spam detection?  Oh come on, it’ll take 5 minutes out of your day, and you can answer as many or as few as you want.
 
SpamOrHam.org
 
John Graham-Cumming launched SpamOrHam.org in an attempt to build a repository of information that could be distributed to those involved in creating anti-spam filters.  (“Ham,” by the way, is legitimate email, i.e. the opposite of “spam.”)  In Graham-Cumming’s own words:
The basic idea is to get humans (that means you) to read a small number of messages (some are ham; some are spam) and decide what they are. I’m doing this because there are currently two usable corpuses of spam and ham: the SpamAssassin Public Corpus (which was hand sorted) and the TREC 2005 Public Corpus (which was machine sorted) … Once I’ve got enough human decisions (I’d love to get 10 per message; that means almost 1,000,000 human classifications) I’ll make all the data public.

In other words, if you visit the site, you can vote on individual messages to say whether you think each one is spam or legitimate. This voting will be very helpful to spam researchers, because an accurate “corpus” of spam and ham allows them to automatically test new anti-spam techniques. Graham-Cumming continues:

I’ll highlight any emails where people disagree with the current classification published by Gordon Cormack … I expect it’ll throw up some interesting data… for example, just how good are humans at sorting spam? Since we’ll be able to look at where the corpus and the humans disagree we’ll be able to spot machine errors and human errors.
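
For the curious, here is a rough Python sketch of what "look at where the corpus and the humans disagree" could mean in practice. It is only my guess at the idea, not how SpamOrHam.org actually works: the function names, the message IDs, and the simple majority-vote rule are all assumptions made up for illustration.

```python
from collections import Counter


def majority_label(votes):
    """Return the most common label among the votes, or None if there
    are no votes or the top two labels are tied."""
    counts = Counter(votes).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: the humans did not reach a verdict
    return counts[0][0]


def find_disagreements(corpus_labels, human_votes):
    """Return the IDs of messages whose human majority verdict differs
    from the label already recorded in the corpus."""
    disagreements = []
    for msg_id, corpus_label in corpus_labels.items():
        human_label = majority_label(human_votes.get(msg_id, []))
        if human_label is not None and human_label != corpus_label:
            disagreements.append(msg_id)
    return disagreements


# Hypothetical data: three messages with a handful of votes each.
corpus_labels = {"msg1": "spam", "msg2": "ham", "msg3": "spam"}
human_votes = {
    "msg1": ["spam", "spam", "ham"],
    "msg2": ["spam", "spam", "spam"],
    "msg3": ["ham", "spam"],
}

print(find_disagreements(corpus_labels, human_votes))
```

A flagged message only says that the humans and the corpus disagree. Someone still has to look at it to decide whether the machine or the people got it wrong.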

So why does that interest me enough to post about it?  I mean it’s not like I usually address these types of topics. 

Put simply, I find it interesting when people find good uses for the internet that aren’t about marketing, commerce, or their ilk, and that rely on either a unique proposition that interests people or the altruistic side of human nature.  Let’s see: you don’t have a commercial product, you have nothing that will provide immediate gratification to your visitor, yet you need a million people or so to take a look at something and give you their opinion... and it’d help if it happened sooner rather than later.  Seriously, where else could that happen other than the web?  So yeah, do me a favor and go to his site and vote on a few emails.  Maybe we’ll all get a little less spam in the future.

Sigh.  It makes me nostalgic for the “good old days” of the internet.

Entry Filed under: Exchange Server, Internet, Internet Business Tools, 3rd Party Software, Consulting
