Strategies for Spam

An intriguing forthcoming paper on spam (no full-text available yet) from one Guido Schryen: an attempt to get some hard data into the debate. Here are the key findings:

  1. Web placements attract more than two-thirds (70%) of all honeypot spam emails, followed by newsgroup placements (28.6%) and newsletter subscriptions (1.4%)
  2. The proportions of spam relating to the email addresses’ top-level domain can be statistically assumed to be uniformly distributed
  3. More than 43% of addresses on the web have been abused, whereas about 27% was the case for addresses on newsgroups and only about 4% was the case for addresses used for a newsletter subscription
  4. Regarding the development of email addresses’ attractiveness for spammers over time, the service “web sites” features a negative linear relationship, whereas the service “Usenet” shows a negative exponential relationship.
  5. Only 1.54% of the spam emails showed an interrelation between the topic of the spam email and that of the location where the recipient’s address was placed, so that spammers are assumed to send their emails in a “context insensitive” manner.
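As a rough illustration of what finding 2 is claiming, here is a minimal sketch of a chi-square goodness-of-fit check against a uniform distribution. The abstract does not say which test the paper actually uses, so this is an assumption. The per-domain counts and the `chi_square_uniform` helper are hypothetical and are not data from the paper.

```python
def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against a uniform expectation."""
    total = sum(counts.values())
    expected = total / len(counts)  # uniform: every category expects the same count
    return sum((obs - expected) ** 2 / expected for obs in counts.values())


# Hypothetical spam counts per top-level domain (NOT figures from the paper).
spam_per_tld = {".com": 260, ".org": 245, ".net": 255, ".de": 240}

statistic = chi_square_uniform(spam_per_tld)

# Standard critical value for 3 degrees of freedom at the 5% level.
CRITICAL_5PCT_DF3 = 7.815

print(f"chi-square = {statistic:.2f}")
if statistic < CRITICAL_5PCT_DF3:
    print("no evidence against a uniform distribution")
else:
    print("distribution differs from uniform")
```

With these made-up counts the statistic falls well below the critical value, which is the kind of result that would let you "statistically assume" uniformity across top-level domains.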

Surprised by how low the last one is, although I shouldn’t be…


4 thoughts on “Strategies for Spam”

  1. Actually, Project Honeypot did some similar research: http://ceas.cc/papers-2005/163.pdf

    and there was also earlier data from a US research center, which unfortunately I can’t find now. It was some very well-done research, although the population sizes were quite small.

    BTW, I’d love to read this paper, if you have access 😉

    ‘Only 1.54% of the spam emails showed an interrelation between the topic of the spam email and that of the location where the recipient’s address was placed, so that spammers are assumed to send their emails in a “context insensitive” manner.’

    I expect this means that it doesn’t matter *where* on the page the addresses were taken from, they received the same spam. My theory is that spam is generally seeded from a Google search nowadays, which explains this.

  2. Actually it’s worse than that – what he means there is the relationship between the topic of the page and the spam. I.e. if the address is harvested from a page about music, only about 1.5% of the spam is music-related. (He’s using “location” to mean the website or newsgroup or whatever.)

    Will see what I can do re: paper 🙂

  3. Well what I want to know is when is the peoria that was supposed to be SPIT coming along. I have heard that some Skypers are subject to it. SPIT – Spam over Internet Telephony.

    I wonder also whether ENUM and DNS hosted number registries will be an attractive target to such SPITting activities.

    Thoughts Dr. Lex?

  4. I thought I’d also mention that, with all of this talk of elections and stuff, I have given your site the mnemonic lex-referendum.

    Don’t you just love the way that Sinn Fein ‘as ag caint as gaeilge’ in the new Northern Ireland assembly. Seems that they have a better sense of identity than those down here!
