As you may have heard, Google is no longer passing on keyword information for logged-in users of google.com.

This change has not been well received, particularly in the SEO industry. It will mean website owners have less information about where some of their organic search traffic comes from.

But is this general air of doom and gloom warranted? How much data will actually be lost? We’ve been trying to estimate the impact of this change…

Privacy concerns are cited as reasons for the change, a motive questioned by many in the digital marketing industry since the keyword data is still available for paid search results.

What percentage of users and searches will be affected?

There are no official figures from Google to say what percentage of traffic will be affected, but we can look at its userbase to get an idea:

Taking into account the other major countries Google operates in and its market share in each, we estimate that at the very most around 5% of the search engine’s total userbase could be Google+ users.

This is also supported by the John Battelle and Vic Gundotra discussion at the Web 2.0 Summit, where they used a figure of 4% of Google’s userbase.

Enough estimates, how about some actual figures?

The only guru worth the name, Avinash Kaushik, has tweeted a custom report for Google Analytics, ready for you to start measuring the impact of this change. His custom report is available here (click when you are logged into your GA account).

If you don’t want to use that report, or are using another analytics package, then filtering for keywords with “(not provided)” from the 20th October 2011 (when the change looks to have been rolled out) to date will give you some figures.
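If your analytics package lets you export keywords with their visit counts, the share of “(not provided)” can be worked out with a few lines of Python. This is only a minimal sketch: the function name, the (keyword, visits) row layout and the sample numbers are all assumptions made up for illustration, not the format of any particular export.

```python
def not_provided_share(rows):
    """Return the fraction of visits whose keyword is '(not provided)'.

    `rows` is an iterable of (keyword, visits) pairs. This layout is an
    assumption; adapt it to whatever your analytics export looks like.
    """
    rows = list(rows)
    total = sum(visits for _, visits in rows)
    if total == 0:
        return 0.0  # no organic visits in the period, so nothing is hidden
    hidden = sum(
        visits
        for keyword, visits in rows
        if keyword.strip().lower() == "(not provided)"
    )
    return hidden / total


# Made-up sample data, for illustration only.
sample = [
    ("(not provided)", 4),
    ("blue widgets", 600),
    ("widget shop", 396),
]

print(f"{not_provided_share(sample):.1%}")  # prints 0.4%
```

Restrict the export to Google organic visits from the 20th October 2011 onwards before running it, otherwise the percentage will be diluted by earlier traffic.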

Looking through the analytics packages of Guava’s UK clients, across a variety of sectors, website sizes and locations, it looks as if the impact so far is negligible at most. The largest impact was just 0.4% of keywords.

Bearing in mind that analytics data is never 100% accurate anyway, this looks like it can be safely ignored, for now at least.

However, the HTTPS change currently applies only to google.com, so websites with little US traffic will not see much of an effect. The demographics of a website’s userbase will also affect this figure: for example, a tech blog about Google can expect many more of its users to be logged into their accounts, and may see a larger impact.

(On Econsultancy, just 0.68% of searches came from logged-in Google users between 20/10 and 26/10.)

Indeed, comparing our results with those of fellow Google Analytics Certified Partners in the US, some are reporting that the average percentage of searches currently coming from logged-in users is about 1.5%, as @JustinCutroni, Director of Digital Intelligence at Cardinal Path, tweeted soon after the change.

When asked via email about the effect of the HTTPS change on his US clients, Justin stated:

We’ve been looking at the volume of (not provided) over the last week and have seen it fluctuate between 0.5% and 1.5% of Google organic visits. We know this change is going to have an impact but we’re still waiting for the rollout to complete.

We’re cautiously optimistic that most sites will see no more than 3% of Google organic keywords bucketed as (not provided).

Taking these numbers into account, considering that 4% to 5% of Google users are on Google+, and assuming that those users are logged in for about 50% of the time, we can estimate that the percentage of searches impacted could range from 0% (where a website gets no traffic from G+ users) up to 2.5% when HTTPS for logged-in users is fully rolled out.

These figures will increase over time, depending on how successful Google is at increasing market share for its Google+ social product.
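The arithmetic behind that ceiling is simple enough to sketch in Python, which also makes it easy to plug in your own assumptions. The function and parameter names are hypothetical, and the 50% logged-in rate is the same rough assumption used in the estimate above, not a measured figure.

```python
def estimated_hidden_share(google_plus_share, logged_in_fraction):
    """Estimate the share of searches that lose keyword data.

    google_plus_share:  fraction of Google's userbase on Google+
                        (4% to 5% in the estimates above)
    logged_in_fraction: fraction of the time those users are logged in
                        (assumed to be about 50%)
    """
    return google_plus_share * logged_in_fraction


low = estimated_hidden_share(0.04, 0.5)
high = estimated_hidden_share(0.05, 0.5)

print(f"{low:.1%} to {high:.1%}")  # prints 2.0% to 2.5%
```

Because the estimate is a straight multiplication, doubling the assumed Google+ share doubles the ceiling, which is why the figure depends so heavily on how Google+ grows.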

Compare this with the data loss expected from users who do not enable JavaScript (and are therefore invisible to tracking tags), which is 3% in the US and 1.4% in the EU, not to mention the loss of accuracy caused by cookie deletion and similar issues.

Why is Google doing this?

  • The push towards HTTPS is not new, and Google has been working on it since June 2010.
  • Google says it is for user privacy, which is a valid reason, if somewhat undermined by the fact that the data is still available for users of paid search.
  • It’s also difficult to see how searches at present can seriously be a privacy breach, since they are not currently tied to a personal profile. Could this change be a precursor to a change in the way search results are presented for a logged-in user (i.e. within a Google+ profile)?
  • The change directly affects other advertising networks and analytics systems because, even with secured connections, there was no requirement for Google to strip the referring keyphrase data.
  • It also stops browsers and other search engines (such as Bing) from closely analysing the ‘clickstream’ data of users on Google, something Google publicly accused Bing of doing in February 2011.

In conclusion, while the change does represent a disturbing direction for Google Analytics, at the moment it looks as if the impact will not be critical to your digital marketing strategy.