GenuineVC: David Beisel's Perspective on Digital Change

May 25, 2005

We all know about search engine spam. Wikipedia defines it, under the coined term “spamdexing,” as:

“the practice of deliberately and dishonestly modifying HTML pages to increase the chance of them being placed close to the beginning of search engine results, or to influence the category to which the page is assigned in a dishonest manner. Many designers of web pages try to get a good ranking in search engines and design their pages accordingly. Spamdexing refers exclusively to practices that are dishonest and mislead search and indexing programs to give a page a ranking it does not deserve.”

To combat the numerous techniques used to spam search sites (keyword stuffing, invisible text, cloaking, etc.), the engines have deployed a variety of algorithms to determine ranking relevancy. (In a post earlier this month I talked about the fine line between search engine spam and content, arguing that different parties would disagree as to what is spam and what is actual content.) Thus far, the major search engines have done a fairly good, though not perfect, job of fighting this problem, but it’s obviously a continuous battle.

Now enter RSS and the Incremental Web. The position of a search result is no longer a function of its relevancy alone; it is also a function of its timeliness. Consequently, search engine spammers have a new trick to play with.

For example, I’ve noticed that as soon as I ping Technorati with a new blog post, a search for many of the keywords in that entry returns my blog as the first result. As the day progresses, the blog entry moves down the list of results. I believe that spammers will come to recognize this “opportunity” to have their pages placed instantly at the top of the results list, and will exploit it.
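
To make the mechanism concrete, here is a minimal sketch of a ranking that blends relevancy with timeliness. It is not how Technorati or Feedster actually rank results; I don’t know their algorithms. The function names, the six-hour half-life, and the relevance numbers are all invented for illustration. The point is only that once freshness is a multiplier, a weak but brand-new entry can outrank a strong older one.

```python
def freshness_score(relevance, age_hours, half_life_hours=6.0):
    """Hypothetical score: relevance discounted by exponential age decay.

    The half-life is an assumed parameter, not a known value from any
    real feed search engine.
    """
    return relevance * 0.5 ** (age_hours / half_life_hours)


def rank(posts, now_hours):
    """Order the posts already published at `now_hours`, best score first."""
    visible = [name for name, p in posts.items() if p["published"] <= now_hours]
    return sorted(
        visible,
        key=lambda name: freshness_score(
            posts[name]["relevance"], now_hours - posts[name]["published"]
        ),
        reverse=True,
    )


# Illustrative data only: publication times are hours on an arbitrary clock.
POSTS = {
    "established_article": {"relevance": 0.9, "published": 0.0},
    "fresh_ping": {"relevance": 0.3, "published": 24.0},
    "later_ping": {"relevance": 0.3, "published": 30.0},
}

print(rank(POSTS, 24.0))  # the just-pinged entry leads despite lower relevance
print(rank(POSTS, 30.0))  # a newer ping pushes the earlier one down
```

Under these assumptions, a spammer who simply pings again every few hours always has an entry in the “just published” slot, which is exactly the opening I’m worried about.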

Eventually the Feedsters and Technoratis of the world will determine algorithmic techniques to combat this problem, but I predict that there could be a bumpy transition period as spammers realize the power that’s here.

UPDATE: Since writing this entry, I’ve come across two great posts (here and here) from the hyku blog about spam moving to RSS and another with PubSub’s thoughts on the issue.

  • Scott Rafer

    Feedster, PubSub, and Technorati are all already actively excluding ping-server driven spam from our systems. From my latest blog post at

    … blogspot spam is back. Somehow their captchas have been defeated on what appears to be an automated basis. We got hit with 10,000 blogspot gambling spam feeds on Friday, published by some lovely person who wanted to exploit the Belmont Stakes.

About Me

  • I am a cofounder and Partner at NextView Ventures, a dedicated seed-stage venture capital firm making investments in internet-enabled startups.


