We all know about search engine spam. Wikipedia defines it, under the coined term “spamdexing,” as:
“the practice of deliberately and dishonestly modifying HTML pages to increase the chance of them being placed close to the beginning of search engine results, or to influence the category to which the page is assigned in a dishonest manner. Many designers of web pages try to get a good ranking in search engines and design their pages accordingly. Spamdexing refers exclusively to practices that are dishonest and mislead search and indexing programs to give a page a ranking it does not deserve.”
To combat the numerous techniques used to spam search sites (keyword stuffing, invisible text, cloaking, etc.), the engines have deployed a variety of algorithms to determine ranking relevancy. (In a post earlier this month I talked about the fine line between search engine spam and content, arguing that different parties would disagree as to what is spam and what is actual content.) Thus far, the major search engines have done a good, though not perfect, job of combating this problem, but it’s obviously a continuous battle.
Now enter RSS and the Incremental Web. The position of a search result is no longer solely a function of its relevancy; it is also a function of its timeliness. Consequently, search engine spammers have a new trick to play with.
For example, I’ve noticed that as soon as I ping Technorati with a new blog post, a search for many of the keywords in that entry returns my blog as the first result. As the day progresses, the blog entry moves down the list of results. I believe that spammers will begin to recognize this “opportunity” to have their pages placed instantly at the top of the results list, and will exploit it.
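To make the worry concrete, here is a toy sketch in Python. I don't know how Technorati or Feedster actually rank results, so everything here is an assumption made for illustration: the `Post` class, the `relevance` scores, the six-hour half-life, and the idea that the final score is relevance multiplied by an exponentially decaying freshness factor.

```python
from dataclasses import dataclass


@dataclass
class Post:
    title: str
    relevance: float  # hypothetical relevance score between 0 and 1
    age_hours: float  # hours since the post was pinged


def score(post: Post, half_life_hours: float = 6.0) -> float:
    """Assumed model: relevance scaled by a freshness factor that halves
    every `half_life_hours`."""
    freshness = 0.5 ** (post.age_hours / half_life_hours)
    return post.relevance * freshness


def rank(posts):
    """Return posts ordered from highest to lowest score."""
    return sorted(posts, key=score, reverse=True)


posts = [
    Post("Thoughtful essay", relevance=0.9, age_hours=24.0),
    Post("Keyword-stuffed spam", relevance=0.2, age_hours=0.1),
]
ranked_titles = [p.title for p in rank(posts)]
```

Under these made-up numbers the freshly pinged spam post outranks the day-old essay, even though its relevance score is far lower. Any ranking that rewards timeliness this heavily hands a spammer an easy lever: just keep pinging.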
Eventually the Feedsters and Technoratis of the world will develop algorithmic techniques to combat this problem, but I predict a bumpy transition period as spammers realize the power that’s here.