Dot info (.info) domains appear to be the new home of search engine spammers, since Google does not appear to apply its aging delay before listing and ranking them. They are listed relatively quickly after the first crawl by the search engines and are ranking well for some competitive terms. The sleaze monsters among search engine sp*mmers are using software to automate four separate areas: content gathering, article creation, article distribution and blog posting. Some may be using all four techniques in concert in an effort to blanket hundreds of sites with article content in order to slap up Google AdSense or Yahoo Publisher Network (YPN) ads.
Authors have worried over a “duplicate content penalty” when their articles are distributed for use by other web sites. It’s extremely unlikely that this type of use will lead to penalties for the author’s web site, which is linked from the resource boxes of those original articles. Interestingly, the likely application of duplicate content penalties comes when articles are used in exactly the same way by clueless purchasers of “pre-loaded” sites, with precisely duplicated site structure, precisely the same articles AND RSS feeds that won’t vary. Those who use these sites are the ones who will suffer the duplicate content penalty, because they are running mirrored sites, and mirrored sites have been filtered for years. Lazy buyers of “pre-loaded” article sites will be the only ones to receive penalties from the search engines.
Sleazy article theft software takes already written, copyrighted articles by other authors, re-orders paragraphs, swaps out interchangeable verbs, rearranges sentences and spits out a fairly readable, and sometimes passable, article which may not be recognizable to the original authors. These stolen, regurgitated articles are then submitted to article banks and distribution sites by splog creators, sometimes using automated submission software or hosted services, so that they are used across the web to create inbound links leading to the search engine sp*m sites.
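For authors who want to check whether a suspicious article is a reshuffled copy of their own, here is a minimal sketch of one common near-duplicate check: compare overlapping three-word sequences ("shingles") and score the overlap with Jaccard similarity. This is an illustration only, not a description of how any search engine filters duplicates. The function names and the choice of three-word shingles are assumptions made for the example.

```python
import re

def shingles(text, k=3):
    """Return the set of k-word sequences in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similarity(original, suspect, k=3):
    """Score from 0.0 (no shared shingles) to 1.0 (identical shingle sets)."""
    return jaccard(shingles(original, k), shingles(suspect, k))
```

Re-ordering sentences or paragraphs leaves most shingles intact, so a rearranged copy still scores well above an unrelated article. Each swapped verb only breaks the few shingles that contain it, so heavy rewording lowers the score gradually rather than hiding the copy outright.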
Many of these .info domain owners are using sleazy sp*m blog software to create what have become known as “splogs.” The software uses multiple blogging platforms to create automatically updated blogs, with posts made regularly at randomized intervals so that the owners appear to be active bloggers. The automation built into the software creates keyword-focused posts from RSS feeds of keyword-phrase news searches and then “pings” the blog search engines with each new automated post. Depending on the sophistication of the splog owner, you’ll often see footer links leading to other splogs they operate on separate topics.
Virtually all of the .info domains I’ve seen ranking in top results for competitive phrases are entirely AdSense or YPN sites, including splogs, full of autogenerated RSS news feeds and title tags and H1 tags generated on the fly from the search phrase used to find the site. Even the copyright information in the footer of some of these sites is generated on the fly to match search queries. While this technique is also being used by some search engine sp*mming .com sites (more than one year old, and so past the aging delay), it can currently be seen on more .info domains.
If Google is truly ranking sites based on clickstream data, imagine the abuses these dynamic spam sites, full of nothing but RSS feeds or stolen, regurgitated content, could spawn! Soon they would rule the results pages because they reflect EXACTLY the search terms used by the searchers, which leads to higher click-through ratios, which in turn generate higher rankings. I see a serious hole for abuse here and hope that the PhDs at Google work out a filter for the technique fast.
This exact match landing page idea is used widely in pay-per-click campaigns, as most savvy SEM specialists highly recommend landing pages that exactly match the phrase the user clicked on because they lead to higher conversion ratios. Perhaps a programmer who spends his days creating PPC landing page scripts is spending his nights creating .info domains with dynamic page titles and metadata for competitive search phrases to rule the organic results?
Of course, whois ownership information is masked by many recent .info domain owners, since those domains were purchased specifically for se-sp*mming sites. When you look up the whois information on high-ranking .info domains to check creation (purchase) dates, you’ll see a preponderance of October through December 2005 creation dates for those well-ranked splogs, with a smattering of January 2006 dates. This must be about the time that spammer forums started noticing and discussing the lack of aging delay for .info domains.
Whois information for dot com (.com) sites ranking well for competitive searches shows that ALL are over a year old and most are 3 to 5 years past their creation date.
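If you want to repeat this comparison yourself, a rough sketch of the bookkeeping follows. It assumes you have already looked up each creation date by hand from whois. The domains and dates in the usage note are hypothetical placeholders, and the 365-day threshold is only my reading of the roughly one-year aging delay described above, not a published Google figure.

```python
from datetime import date

# Assumption: the aging delay is treated as roughly one year.
AGING_DELAY_DAYS = 365

def domain_age_days(created, as_of):
    """Number of days between the whois creation date and the day of the check."""
    return (as_of - created).days

def summarize(records, as_of):
    """records: iterable of (domain, creation_date) pairs collected by hand."""
    rows = []
    for domain, created in records:
        age = domain_age_days(created, as_of)
        rows.append({
            "domain": domain,
            "tld": domain.rsplit(".", 1)[-1].lower(),
            "age_days": age,
            "past_delay": age >= AGING_DELAY_DAYS,
        })
    return rows
```

For example, `summarize([("example-splog.info", date(2005, 11, 15)), ("example-store.com", date(2002, 3, 1))], date(2006, 2, 1))` marks the .info entry as still inside the assumed delay and the .com entry as well past it.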
All of this suggests that algorithmic aging filters are clearly applied to .com domains while .info domains apparently go unfiltered. My thought is that Google is using this lack of aging delay and lack of filtering as a honeypot for search engine sp*m, gathering the bad boys all in one otherwise rarely used TLD so that it can then do wide sweeps, tracing their tactics to further filter (forgive me for using the term) Black Hat techniques.