Frustrated by (not provided) keyword data and other Google Analytics niggles?

This post looks to help you resolve these issues to get more from your organic search data. 

There are many differences between agency-side and client-side, not least the number of sites worked on! However, there are also some shared resources that add value to both sides of the coin.

Data sources are an example of this and when we’re all fighting the same fight, it’s essential to have consistent pots of data to analyse and report on.

One such data source is, of course, Google Analytics. As a free tool, it is an essential resource for agencies and brands alike. I’m a daily user of GA as part of client reporting and helping our search and technical teams understand how sites are performing, both positively and negatively. 

Google has made changes to GA, especially recently. Some of these changes, particularly around UX and usability, have really benefitted the user.

On the flip side, Google has given with one hand and taken away with the other, by making it more and more difficult to gain full access to data, especially in organic.  

With this post I aim to go a little deeper by giving specific, actionable advice on how to further prove the value of organic for your business or clients. 

(not provided)

The most infamous of GA’s restrictions is undoubtedly (not provided) keyword data, where Google restricts keyword data due to ‘security’ and protection of the searcher’s privacy.

If you are signed in to Google when you search, the keyword you search for is not passed on to the site owner. This began as a small percentage but grew steadily until Google announced in late 2013 that all signed-in keyword data would be hidden indefinitely. In reality, from what we see every day across our client portfolio, this equates to around 80-90% of organic keyword data being hidden.

This is just one of GA’s niggling inconsistencies, and one I hope to resolve for you in this post, amongst others. So let’s begin with unlocking (not provided) data…

(not provided) – the bane of the SEO’s life! There are two main issues with (not provided) that I encounter on a daily basis – being unable to specify brand and non-brand traffic volumes, and the split of non-brand traffic between keywords. Let’s look at both of these issues and how we can resolve them… 

Brand and non-brand

(not provided) makes our lives difficult, but it does leave us a sliver of data to work with. With 80-90% of keyword data hidden, this sliver is usually around 10-20%, so let’s use 15% as a working figure. The word of the day here is extrapolate. We need to take this 15% of available keyword data and apply its structure to the hidden remainder. Here is a step-by-step guide on how to do this, with an additional bonus download to help you.

The methodology:

If we can calculate the brand vs non-brand split of the available data, this gives us a % split we can then extrapolate.  Firstly, we need to define “what is brand?” To do this, we must set up an advanced filter that removes all traffic coming in via a brand-related term.

Let’s say our brand is GA Tips and our site is GAtips.com – we need to tell GA what our branded keywords are. This is usually made up of variations of the brand name, for example GA tips, GAtips etc, and variations of the URL, for example GAtips.com, www.GAtips.com etc.

Start off by navigating to this page in GA: acquisition > keywords > organic, and ensure the primary dimension is set to ‘keyword’. Then click the ‘advanced’ button next to the search box below the trend.

https://assets.econsultancy.com/images/resized/0005/4752/advanced-search-ga-blog-half.png

This presents some options, which need to be set as follows:

https://assets.econsultancy.com/images/resized/0005/4751/advanced-filter-ga-blog-full.png

In the text field, enter your brand variations in the following format:

GAtips|GA tips|www.gatips.com|gatips.com etc etc

You can be as accurate as you like or need here. Some brands may also need to include product brand names. For example, Apple might include the term ‘iphone’ in this as a branded term. You can also dive into popular misspellings to be super-accurate.
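To sanity-check a brand pattern before pasting it into GA, here is a minimal Python sketch. It assumes the advanced filter treats the text as a regular expression (which the pipe-separated format suggests) and that matching ignores case; both are assumptions on my part, and Python’s re module is only a stand-in for GA’s own matching. The keywords are made up. One thing worth knowing: an unescaped dot in a regular expression matches any character, so the sketch escapes the dots in the URL variations.

```python
import re

# Hypothetical brand variations for the fictional GA Tips brand used in this post.
BRAND_VARIATIONS = ["GAtips", "GA tips", "www.gatips.com", "gatips.com"]


def build_brand_pattern(variations):
    """Join variations with | and escape dots so '.' only matches a literal dot."""
    return "|".join(v.replace(".", r"\.") for v in variations)


def is_branded(keyword, pattern):
    """True if any brand variation appears anywhere in the keyword (case ignored)."""
    return re.search(pattern, keyword, flags=re.IGNORECASE) is not None


pattern = build_brand_pattern(BRAND_VARIATIONS)

# Made-up keywords standing in for rows of the organic keyword report.
keywords = ["ga tips blog", "analytics hacks", "www.gatips.com", "(not provided)"]

# Keep only the rows that the brand pattern does NOT match.
non_brand = [k for k in keywords if not is_branded(k, pattern)]

print(pattern)
print(non_brand)
```

Note that (not provided) survives the exclusion, which is exactly why it has to be subtracted separately in the calculation that follows.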

Once you’ve inputted all your brand variations, click Apply, then click the ‘Shortcut’ button at the top of the page:

https://assets.econsultancy.com/images/resized/0005/4753/save-to-shortcuts-ga-blog-full.png

You can then name your filter and save it to your shortcuts menu in the left nav, so you can easily access it in future.

What this achieves is removing all branded keywords from the organic keyword report, leaving us with just non-branded and (not provided).

If we take total organic traffic and subtract (not provided), this gives us our available data set. Then, subtract (not provided) from the number supplied by our non-brand filter. We are then left with two figures: total provided data and non-branded provided data.

The difference between the two figures is the branded provided total. From this point it is simple to calculate the brand % and the non-brand % by dividing the branded and non-branded provided totals by total provided. Finally, apply the two percentages to our total organic traffic number to get the estimated brand and non-brand volumes.

All of this is summarised and made easy for you in this downloadable Excel equation: all you’ll need is the download and three numbers: total organic traffic, (not provided) and non-brand organic (supplied by your new advanced filter). The equation will do the rest for you.
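If you would rather not use a spreadsheet, the same arithmetic can be sketched in a few lines of Python. This is my own illustration of the steps described above, not a copy of the download; the function name and the example figures are made up.

```python
def brand_split(total_organic, not_provided, non_brand_filtered):
    """Extrapolate brand vs non-brand volumes from the provided keyword data.

    total_organic      -- all organic sessions for the date range
    not_provided       -- sessions with the (not provided) keyword
    non_brand_filtered -- sessions left after the brand-excluding filter
                          (this figure still includes (not provided))
    """
    total_provided = total_organic - not_provided
    if total_provided <= 0:
        raise ValueError("No provided keyword data to extrapolate from")

    non_brand_provided = non_brand_filtered - not_provided
    brand_provided = total_provided - non_brand_provided

    # Percentage split within the provided data.
    brand_share = brand_provided / total_provided
    non_brand_share = non_brand_provided / total_provided

    # Apply the split to all organic traffic.
    return {
        "brand_share": brand_share,
        "non_brand_share": non_brand_share,
        "brand_estimate": brand_share * total_organic,
        "non_brand_estimate": non_brand_share * total_organic,
    }


# Made-up example: 10,000 organic sessions, 8,500 of them (not provided),
# and 9,100 remaining once the brand filter has been applied.
result = brand_split(10000, 8500, 9100)
print(result)
```

With these illustrative numbers, 1,500 sessions have a visible keyword, 600 of which are non-brand, so the split works out at 60% brand and 40% non-brand.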

Tracking this over time allows us to monitor increases and decreases in brand and non-brand, which can in turn indicate changes in brand awareness and/or non-brand rankings.

I also mentioned another issue with (not provided):

Restricted ability to track traffic from non-brand keywords

The key here is landing pages. Spend time looking at which pages rank for which keywords. Tools like SEMrush and Searchmetrics are great for this. Understanding this allows us to make informed decisions about increases and decreases for particular landing pages.

For our GAtips brand, let’s say we have identified that the page gatips.com/really-cool-analytics-hacks ranks for the search terms ‘GA hacks’ and ‘analytics hacks’.

We don’t know that those terms are sending traffic to that page in GA (although we can in Webmaster Tools) because of (not provided), but what we can do is react to changes in traffic to the page by checking positions for the terms we know to be ranking that page in Google. If we know our ranking for ‘GA hacks’ has dropped, then we can assign a landing page traffic drop to that.

I briefly mentioned Webmaster Tools and this is a really key data source for keyword data, although it can be a little vague and somewhat inaccurate. Use it in conjunction with your rankings>pages analysis to really understand how keyword traffic is behaving.

More Google Analytics work-arounds

Now we’ve tackled the big issue, here are a few quickfire solutions for things that you may have come across in your day-to-day use of Google Analytics:

Why does GA show one sessions figure for a particular date range, but a different one when I add in a comparison to a previous date range (this also applies to individual landing page numbers)?

What Google doesn’t tell you is that, in most cases, GA data is a sample. This is especially true if you have a very large amount of traffic.

The ultimate solution is to go Premium (very expensive!) but, in the real world, the practical answer is to be consistent with your comparisons and the numbers you use.

Personally, I take the numbers from the date range alone, with no comparison, and then add the comparison to get the numbers for the previous date range.  It’s a bit manual, but it works and is predictable.

UA code can also skew things, so make sure you have the most up-to-date code on your site.

A lot of my landing page traffic is being assigned to (not set) – why is this and how can I fix it?

If Google is unable to make a link between a session and a page, it is recorded as a session but the page is assigned as (not set). This can be caused by several things, including the user not completing the page load: the server request is made but the page is not loaded.

The solution is to identify which pages are experiencing the (not set) assignments. This should be clear, as (not set) tends to target particular page types, e.g. those with events/conversions on them or pages within a certain category on your site, so you should see a pattern.

The total organic traffic for a date range will not be affected by (not set), but if you calculate (not set) as a percentage of total, then you can extrapolate this across all landing pages to give a truer view of sessions per page.
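As a rough sketch of that extrapolation, the Python below spreads (not set) sessions across the known landing pages in proportion to their existing traffic. The proportional spread is a simplifying assumption on my part: if (not set) clearly clusters around one page type on your site, you may prefer to apply it to those pages only. The page paths and numbers are made up.

```python
NOT_SET = "(not set)"


def redistribute_not_set(sessions_by_page):
    """Scale known landing pages up so they absorb the (not set) sessions.

    Returns a new dict without the (not set) row; the overall total is unchanged.
    """
    not_set = sessions_by_page.get(NOT_SET, 0)
    known = {page: s for page, s in sessions_by_page.items() if page != NOT_SET}
    known_total = sum(known.values())
    if known_total == 0:
        return {}

    total = known_total + not_set
    # (not set) as a share of the total, as described in the text.
    not_set_share = not_set / total
    print(f"(not set) share of sessions: {not_set_share:.1%}")

    # Scaling by total / known_total is the same as dividing by (1 - share).
    scale = total / known_total
    return {page: s * scale for page, s in known.items()}


# Made-up landing page report.
report = {"/": 600, "/really-cool-analytics-hacks": 200, NOT_SET: 200}
adjusted = redistribute_not_set(report)
print(adjusted)
```

Here (not set) is 20% of sessions, so each known page is scaled up by a factor of 1.25.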

My rankings are flat vs prior week/month/year but my landing page traffic is dropping, why is this and how can I mitigate it?

This is not exclusively a GA issue, but GA provides the landing page traffic data, so it’s certainly a relevant issue to discuss. If rankings are flat, the drop is most likely market driven.

Use Google Trends to look at search demand for the terms in question over the appropriate timeframe and in the appropriate location. Export this as a CSV. Google assigns a number out of 100 per week within the date range, so each week is relative to those around it.

Also export your weekly traffic numbers for the landing page in question as a CSV and plot your Google Trends weekly numbers alongside. You should see a pattern between changes in demand and traffic behaviour.
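If you want a number to go with the chart, one option (my suggestion rather than part of the method above) is to calculate the correlation between the two weekly series. The sketch below assumes you have already copied the weekly values out of both CSVs into two lists of equal length; the figures shown are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need two series of the same length (at least 2 points)")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("A flat series has no correlation to measure")

    return covariance / sqrt(var_x * var_y)


# Made-up weekly values: Google Trends index (0-100) and landing page sessions.
trends_index = [100, 92, 85, 70, 64, 60]
page_sessions = [1200, 1130, 1010, 850, 790, 760]

r = pearson(trends_index, page_sessions)
print(f"Correlation between demand and traffic: {r:.2f}")
```

A value close to 1 suggests traffic is moving with search demand, which supports the market-driven explanation. A value near 0 suggests the drop has another cause worth investigating.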

Also check Google Keyword Planner for search volumes for the same month in prior year and the month before it. You may see a similar drop, which indicates a seasonality change.

There is a lot of talk in the industry around organic search traffic being wrongly labelled as Direct by GA, which skews true organic numbers. How do I deal with this?

Famously, Groupon undertook an experiment to understand the real impact of this, as it is understood to affect ALL Google Analytics profiles. It’s caused primarily by browsers failing to correctly report where traffic is coming from, with IE being the biggest culprit.

As a result, some organic traffic is ‘dumped’ into the vague Direct channel.

To understand the true impact, Groupon actually de-indexed itself from Google results for six hours to see how much the Direct channel reduced by. With no organic visits, this gave a true view of how much was actually Direct.

It uncovered the fact that Direct channel traffic into long URLs, i.e. those beyond sub-folder level, dropped by around 50%. The end result is that organic is bigger than GA actually reports.

Moz also conducted a similar experiment. Its results were slightly different overall, but they did share some consistencies with the Groupon investigation, in that Direct traffic to long URLs again dropped by around 50%.

Do some analysis of your Direct traffic to understand how much of it is into long URLs. Take that traffic, divide it by 2 (50%) and, according to the evidence we have from Moz and Groupon, that’s how much of your Direct traffic is organic.
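A quick way to run that estimate is sketched below in Python. It assumes you have exported Direct sessions by landing page path, and it treats any path with two or more segments as a ‘long URL’. That threshold is my own interpretation of ‘beyond sub-folder level’, so adjust it to suit your site structure. The 50% share comes from the experiments described above; the paths and session counts are made up.

```python
def is_long_url(path, min_segments=2):
    """Treat a path as 'long' if it has at least min_segments non-empty parts.

    The default of 2 is an assumption, e.g. /blog/post counts, /blog does not.
    """
    segments = [part for part in path.split("/") if part]
    return len(segments) >= min_segments


def estimate_hidden_organic(direct_sessions_by_path, organic_share=0.5):
    """Estimate how many Direct sessions are really organic."""
    long_url_sessions = sum(
        sessions
        for path, sessions in direct_sessions_by_path.items()
        if is_long_url(path)
    )
    return long_url_sessions * organic_share


# Made-up Direct channel sessions by landing page path.
direct = {
    "/": 4000,
    "/blog": 1000,
    "/blog/really-cool-analytics-hacks": 600,
    "/guides/ga/filters": 400,
}

hidden_organic = estimate_hidden_organic(direct)
print(f"Estimated organic sessions hiding in Direct: {hidden_organic:.0f}")
```

In this example, 1,000 Direct sessions land on long URLs, so around 500 of them would be reassigned to organic.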

There’s always more!

Whilst I know there are many more niggles and foibles with GA, the issues discussed here are ones that I have encountered in my day-to-day GA usage. I have had to formulate work-arounds for them to ensure consistency for our clients, and to ensure my own sanity remains!