In a blog post that attracted lots of attention, Bray pointed to an article he wrote and published on his blog in 2006, as well as a blog post another person published in 2008, that could not be found via Google search. Both carefully crafted exact-match queries and searches using the site: prefix failed to locate the pages in question.
Bray was able to locate these pages using two other search engines, Bing and DuckDuckGo.
How to explain this intriguing phenomenon? Bray has a theory:
Obviously, indexing the whole Web is crushingly expensive, and getting more so every day. Things like 10+-year-old music reviews that are never updated, no longer accept comments, are lightly if at all linked-to outside their own site, and rarely if ever visited…well, let’s face it, Google’s not going to be selling many ads next to search results that turn them up. So from a business point of view, it’s hard to make a case for Google indexing everything, no matter how old and how obscure.
Bray’s post went viral and sparked a vigorous discussion, and comments from others suggest that Google’s memory loss might not be an isolated case. For instance, on Hacker News, one commenter wrote:
I’ve noticed this many times too, particularly recently, and I call it “Google Alzheimer’s” — what was once a very powerful search engine that could give you thousands (yes, I’ve tried exhausting its result pages many times, and used to have much success finding the perfect site many dozens of pages deep in the results) of pages containing nothing but the exact words and phrase you search for has seemingly degraded into an approximation of a search engine that has knowledge of only very superficial information, will try to rewrite your queries and omit words (including the very word that makes all the difference — I didn’t put it in the search query for nothing!), and in general is becoming increasingly useless for finding the sort of detailed, specific information that search engines were once ideal for.
Another person observed:
I think the biggest irony is that the web allows for more adoption of long-tail movements than ever before, and Google has gotten significantly worse at turning these up. I assume this has something to do with the fact that information from the long tail is substantially less searched for than stuff within the normal bounds.
This is a nightmare if you have any hobbies that share a common phrase with a vastly more popular hobby…
Why businesses should care and what they can do about it
Despite his personal pain, Bray recognizes that Google is focused on “giving you great answers to the questions that matter to you right now” and acknowledges that it often does a very good job at that. But even so, it’s worth considering that Google’s apparent memory loss could also be of concern to businesses that have invested in content that they expect to be discoverable through the world’s largest search engine.
Despite the growing popularity of Google alternatives like DuckDuckGo, most companies still focus their SEO efforts on Google, and the search giant’s memory loss could affect them in a number of ways.
Most obviously, the prospect that Google is intentionally allowing content to drop out of its index over time means that companies can’t assume their older content will remain in the index, even if it’s high quality.
While Google has never guaranteed that content will stay in its index simply because it was once added, the possibility that it is dropping content more frequently than many expect is problematic on a number of fronts.
First, many companies, on the advice of their SEOs, have invested in producing content for long-tail (read: low volume) keywords. The thinking behind this is that such content, even if it doesn’t produce significant, consistent returns, will be “out there” and discoverable well into the future and that over time, it will deliver a positive return.
But such content, even if it’s high quality and of potential value to a very targeted base of users, would seem to be most vulnerable to Google memory loss, especially if it’s not updated or linked to frequently from newer pages.
Second, many companies don’t consider content to be a depreciating asset. To the contrary, many believe that content, particularly so-called evergreen content, can pay dividends long into the future. If Google does have Alzheimer’s, determining the value of a piece of content, and calculating how much to invest in the creation of a piece of content, could become a more complex exercise.
So how should companies respond?
While there’s no reason to panic, Tim Bray’s post does suggest that businesses would be wise to pay better attention to their content and, to the extent content is seen as valuable, what happens to it long after it’s published.
At a minimum, companies should be using Google Search Console to better understand the status of their content in Google’s index. There are also third-party tools, and companies with development resources can build their own index checkers.
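As a rough illustration of what a home-grown index checker might look like, here is a minimal Python sketch. It assumes the site publishes a standard sitemaps.org XML sitemap, and it leaves the actual index lookup to a caller-supplied function. The names parse_sitemap, find_missing and is_indexed are hypothetical and chosen only for this example; nothing here queries Google directly.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Callable, List, Optional, Tuple

# Namespace used by the standard sitemaps.org sitemap format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

Page = Tuple[str, Optional[date]]


def parse_sitemap(xml_text: str) -> List[Page]:
    """Extract (url, lastmod) pairs from a sitemap; lastmod is None if absent."""
    root = ET.fromstring(xml_text)
    pages: List[Page] = []
    for node in root.findall("sm:url", SITEMAP_NS):
        loc = (node.findtext("sm:loc", default="", namespaces=SITEMAP_NS) or "").strip()
        if not loc:
            continue
        raw = (node.findtext("sm:lastmod", default="", namespaces=SITEMAP_NS) or "").strip()
        # lastmod may be a plain date or a full timestamp; keep only YYYY-MM-DD.
        lastmod = date.fromisoformat(raw[:10]) if raw else None
        pages.append((loc, lastmod))
    return pages


def find_missing(pages: List[Page], is_indexed: Callable[[str], bool]) -> List[Page]:
    """Return pages the checker reports as not indexed, oldest first.

    is_indexed is an assumption of this sketch: any function that takes a
    URL and returns True if the page is currently in the search index.
    """
    missing = [(url, lastmod) for url, lastmod in pages if not is_indexed(url)]
    # Dated pages sort oldest first; pages without a lastmod go to the end.
    missing.sort(key=lambda page: (page[1] is None, page[1] or date.min))
    return missing
```

In practice, is_indexed might wrap a call to Search Console's URL Inspection API or to a third-party tool. Run on a schedule, a report like this would show whether older, rarely updated pages are the ones quietly dropping out.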
Beyond this, the possibility that Google has implemented a form of memory loss should remind companies that executing a content strategy is a fluid, ongoing process. Publishing content is one part of that process, but the lifecycle of each piece of content needs to be managed over time if that content is to remain valuable long-term.
Perhaps proving that: since Bray’s post went viral, the two pages he initially couldn’t find in Google are now back in the index.