Scamdex Data used in Research – if only they’d asked!

So a routine search turned up a little Research Paper from the University of Nebraska in Omaha.

Trends in Phishing Attacks: Suggestions for Future Research (2011) | Ryan M. Schuetzler | University of Nebraska at Omaha, rschuetzler@unomaha.edu

While I’m flattered by being used as a creditable source, I am upset that they:

  1. Used the Scamdex Email Archive without permission.
  2. Did not contact Scamdex to get permission.
  3. Used ‘Screen Scraping’ tools to (in their words)

    …To obtain a corpus of phishing emails, we scraped 2709 emails from Scamdex.com (“Email Scam, Internet Fraud, IdentityTheft & Phishing Resource,” n.d.). This corpus contained emails over a 3-year period from November 2006 to June 2009.These emails were submitted to Scamdex by recipients of phishing attacks..

  4. Did not credit Scamdex in their references.

The legality of screen-scraping, a term used for software tools that extensively mine or extract information or complete contents of a website, is debatable – Generally speaking, if commercial use is made of the result then it gets a bit tricky, but for research purposes a lot more latitude is generally given. The Electronic Frontier Foundation has a good one-pager on Fair Use.

If asked, Scamdex would have been completely happy to collaborate. We do ask (nicely) that …

“Any derived content from the Scamdex.com website must clearly show attribution to Scamdex.com as the source and must include a link to the original information”. –http://www.scamdex.com/About-Scamdex.php#use

Scamdex is happy to be used as a research tool, but in future – ask first, then make sure it is credited – is that too much to ask for?

Leave a Comment

Your email address will not be published. Required fields are marked *