A website I run is undergoing a makeover and is down for the day, and I wanted to show somebody the old version. As such I aimed for the Google cache, which is useful in this sort of situation.

I noticed that the cache had updated in the early hours of the morning, and as such I couldn’t see our old site. Bugger.

It seems that Google is caching news sites with increasing frequency. Yet some newspaper websites don't like Google caching at all...

I thought I’d check this out. First, I aimed for the BBC’s cache, to see how recently that had been updated. Turns out that Google had cached it within the hour. I then looked at the Guardian… same result. It too had been cached within the hour. Pretty quick!

But oddly, a few other mainstream publishers don’t have cached versions of their sites in Google. The Telegraph doesn’t allow it. The Sun says ‘No to caching!’. And The Times isn’t cache-friendly either. 

None of those publishers are protecting a subscription access model (unlike, say, the New York Times, which prevents caching for more obvious reasons), so it is a bit of a head scratcher. 

Why stop caching?

You can add a robots meta tag with a ‘noarchive’ value to the <HEAD> section of your page to prevent Google from caching it. But why a publisher would do this to its homepage is a mystery to me. 
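For anyone who wants to check a page for themselves, here is a minimal Python sketch that looks for the ‘noarchive’ directive in a page's robots meta tags. The class name `RobotsMetaParser`, the function `blocks_caching` and the sample markup are all made up for illustration, and it assumes the directive sits in a `<meta name="robots">` or `<meta name="googlebot">` tag rather than being sent some other way.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags.

    Hypothetical helper written for this example only.
    """

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so fall back to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() in ("robots", "googlebot"):
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def blocks_caching(html):
    """Return True if the markup carries a 'noarchive' robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noarchive" in parser.directives


# Made-up sample markup, not taken from any real publisher's site.
sample_page = (
    '<html><head><meta name="robots" content="index, follow, noarchive">'
    "</head><body></body></html>"
)
print(blocks_caching(sample_page))
```

To test a live page you would fetch its HTML first and pass the text to `blocks_caching`.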

One possible reason was suggested to me via Twitter, after I asked the SEO ninjas to answer the question. iCrossing said:

“News sites often do it to stop cached versions of a page being available when they remove or change the story.”

I guess that makes sense but then again, it doesn’t. Why would you prevent users from seeing older versions of your pages? What is there to be gained from such an approach? 

People only really aim for the cache when the live website in some way fails, say after a major spike in traffic (such as The Digg Effect). It’s almost always a second-best option, an alternative. So why stop people from seeing it, should they need to? 

Time for a rethink…

If, as iCrossing suggests, newspaper publishers are worried that Google is caching historic versions of their fast-moving homepages, then they might want to think again.

At a guess, we reckon that Google updates its cache whenever it indexes a website. Given that news sites attract Newsbot (Google’s news crawler, which visits regularly), wouldn’t it be the case that their caches would update multiple times per day?

The answer is that yes, this does indeed happen, whether it’s through Newsbot or by some other method.

If we look again at the newspapers that allow caching – the BBC and the Guardian – we can see that Google is generating multiple caches per day.

Word to the wise: both sites have had their caches updated at least three times in the past 90 minutes.

So with that in mind, we can pour cold water on the validity of the theory that some newspaper publishers prevent Google from caching based on fears of it displaying old news. 

Am I missing something? Maybe there is a business reason behind this no-caching policy? Are there advertising issues?

If anybody knows then please leave a comment below. Otherwise, let’s assume that weird ideological issues are the force behind this policy.

[Image by Dru Broomfield on Flickr. Various rights reserved.]

Published 24 March, 2009 by Chris Lake

Chris Lake is CEO at EmpiricalProof, and former Director of Content at Econsultancy. Follow him on Twitter, Google+ or connect via Linkedin.

Comments (9)

Jessica Healy

Publishers often have to remove content quickly and completely due to lawsuits that are lodged against them for incorrect or defamatory coverage of a story. If it remains in the cache, even for a short time, they risk being accused of not removing the content as soon as they realised it was incorrect.

over 9 years ago



You don't want to have to deal with cached versions of articles when removed for legal reasons, as you can still be held responsible for them being available to view. So it's easier and less problematic just to stop caching.

over 9 years ago

Chris Lake, CEO at Empirical Proof

Good points, about defamation. I guess the whole thing here is about a publisher removing offensive / incorrect / legally dubious material as soon as it can to placate people and minimise the risks of escalation. 

Makes sense, though it's not foolproof, and certainly doesn't apply for anything that appears in print (which can be archived forever). Also, publishers cannot prevent screenshots, or articles being printed, or extracts being used in other articles, etc etc. But that does seem to be a plausible explanation.

Does this also mean that those publishers that do not allow caching do not allow full text RSS feeds?

over 9 years ago



I can't think of any valid reason for not allowing Google to index the pages. I don't think that offensive content would be a good reason.

over 9 years ago


Anthony Sharot

As an SEO consultant, I've reviewed a number of leading news websites over the past 12 months on behalf of London agencies and have advised several of them to disable Google caching, as at the time it was often up to five days behind, not less than one, as now reported.

I've been back to my old SEO reports (complete with cache screenshots from when they were written) and can confirm that Google has sped up the caching for many leading news sites over the past few months.

As such, while there is clearly a concern about liability to defamation and the like, the reason that SEO consultants like myself were until recently telling clients to disable caching was indeed due to its latency and not legality.

The situation has clearly changed, showing, if nothing else, how quickly the Google SEO sand can shift under your feet and why constant review of recommendations, even when known to be true today, is required, as the rules of the game could change by the morning!

over 9 years ago


Anthony Sharot

Agreed. So, the question is, will some of these news sites now reinstate caching based upon fresh SEO advice?

Also, will the lawyers now be happy with Google's speedier approach or are they still going to kick up a fuss?


over 9 years ago

Alec Kinnear, Creative Director at Foliovision

There is also the whole issue that a lot of content sites don't want their content anywhere else but on their own site.

There was the whole Belgian newspaper brouhaha last year.

But they do want to be listed in Google.

So the set up is:

  • index
  • follow
  • no cache
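As a rough illustration, that set-up could be written out as a robots meta tag along these lines. This is only a sketch: the function `robots_meta_tag` and its parameters are hypothetical, and it assumes that "no cache" here means the ‘noarchive’ value mentioned in the article.

```python
from html import escape


def robots_meta_tag(allow_index=True, allow_follow=True, allow_cache=False):
    """Build a robots meta tag string (hypothetical helper for illustration).

    Assumes 'no cache' maps to the 'noarchive' robots directive.
    """
    directives = [
        "index" if allow_index else "noindex",
        "follow" if allow_follow else "nofollow",
    ]
    if not allow_cache:
        directives.append("noarchive")
    content = ", ".join(directives)
    return f'<meta name="robots" content="{escape(content)}">'


# Default arguments reproduce the index / follow / no cache combination.
print(robots_meta_tag())
```

The resulting tag would go in the page's <HEAD> section.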

A bit annoying really. I think if publishers want to be listed in Google they should have to take their lumps and be cached. If they want to keep their content to themselves, then they shouldn't be listed in Google for that content.

over 9 years ago




It's not due to defamation. If a story was defamatory, then it would have been defamatory/libellous/slanderous in the first place, and would have resulted in legal action at that time.

The reason, probably, is because of media law and 'contempt of court'. This is quite a complicated act whereby, basically, any ongoing court cases, tribunals etc, can be seriously biased by previous news stories - especially if a defendant has previous convictions.

Newspapers are not supposed to publish anything that could pose a 'substantial risk of serious prejudice' to any ongoing cases involving a jury - who are supposed to judge a case on its own merits, but can be seriously biased by previous reports on the web.

This is quite a commonly debated topic. Obviously the web poses a problem to the act, which is in trouble and could even be scrapped.

Of course, there may be some SEO based reasons but this seems like the most obvious reason.

over 9 years ago


Bobbi Brown

I am involved in a news site and made the decision to enable no cache. Two reasons: one, the reasons stated above - the defamation stuff (it's a trade magazine site and the content is extremely sensitive for various reasons). Secondly, we do some minor cloaking, once again because of the sensitivity of the content: there are certain things we don't want showing up in search results. The page actually has more content than we let Google see, so seeing the cached version would be bad for the user.

Yes, I am aware no cache looks a little spammy, but it was a considered decision and I do believe it was the right one.

over 9 years ago
