If you're a publisher, one of the most frustrating experiences is to discover that your content is being scraped by a third party that does not have permission to use your content.

Even more frustrating: when that scraper's website is able to outrank yours for searches related to your own content.

For obvious reasons then, Google has engaged in a considerable effort to thwart scrapers. And now it's turning to the public for additional assistance.

Last week, Google's Matt Cutts put out an RFS (request for scrapers) on his Twitter account:

Scrapers getting you down? Tell us about blog scrapers you see: http://goo.gl/S2hIh We need datapoints for testing.

As Matt McGee of Search Engine Land notes, Google's Panda 2.2 update, which was released earlier this year, was designed to address scraper sites. But that, not surprisingly, didn't end the war.

Despite Google's best efforts, there are still scraper sites that rank well, and sometimes they even rank higher than the site that originally published the content.

So what should Google do?

On one hand, it would be curious if Google's web spam team wasn't looking at scrapers specifically. But on the other, it's easy to argue that scrapers aren't the core problem -- they're just a symptom.

In many cases, scrapers aren't simply scraping content and hoping that their sites rank well. In the absence of a means to build PageRank legitimately, they also employ black and gray hat techniques that Google has struggled to deal with.

In other words, scraping alone is frustrating, but it shouldn't be infuriating. What's really infuriating is that scrapers are often able to take advantage of greater flaws in Google's ranking algorithms so that scraped content ranks meaningfully at all.

From this perspective, for Google to win the war against scrapers, it must win the war against search engine scamsters. And that won't be easy.

Published 30 August, 2011 by Patricio Robles

Patricio Robles is a tech reporter at Econsultancy. Follow him on Twitter.

Comments (3)


Nick Stamoulis

Seeing those scrapers succeed is usually the reason site owners decide to go black hat in the first place. "It works for them and they get away with it, so why should I bother to stay white hat?" I can understand that frustration because it constantly feels like you're fighting an uphill battle.



Matthew Read

I recently published a piece of content online and the next day 6 other sites had the same article up! What annoys me is that they then shove it full of Google Ads and get paid!

The Panda update did hit a lot of people but so many avoided the drop with simple changes and it will be interesting to see if Google release a 3rd Panda update to combat them further.



Max Webster-Dowsing, SEO Consultant at RBS Insurance

Anyone who engages in SEO is never strictly white-hat; after all, we are after links which we have to create by our own means. All that will happen is that people who scrape content will just spin it, i.e. replace words and phrases in the article with synonyms, which in turn makes it unique content in the eyes of the search engines. People with half a brain do not use directly scraped content if they want good rankings; scraped content will only rank well if the site is an aged site with some PageRank and unique content, so it is mixed up a bit. Unfortunately, there will always be holes in Google's algorithm.

