In these presentations, marketers need statistics about platforms, technologies and countries.

And having interesting statistics helps marketers tell their audience that they are in touch with what’s going on in the outside world.

Statistics have other uses, too

Statistics are also useful for spotting trends and getting new ideas. Econsultancy publishes statistics frequently and they are some of our most popular posts.

And for subscribers, we pull these into an Internet Statistics Compendium, so that you can get a ton of them all in one place.

But…

But statistics can also cause problems for marketers. There is so much information to wade through on the internet that it can be hard to find the one piece of data which backs up an otherwise flawless story.

And when deadlines loom, it’s quite easy to just cut and paste the best-looking chart into your presentation and hope for the best.

This approach, however, has led to a lack of rigour in marketing presentations and the spread of misinformation.

And these bad statistics can easily come back to haunt you when things aren’t working as well as you predicted.

So to help out, we’ve come up with three simple steps you can take to avoid bad statistics. And to illustrate why, there’s a short case study about a widely reported marketing statistic that looked rather suspicious…

So, how can marketers avoid bad statistics?

1. Look for the primary source

Before dropping a chart into your presentation, always have a look at the source of the data.

That is, where did the data come from? Is it a reputable source? Or a survey by a company you have never heard of before?

And if you can, look for the original publication to see how the publisher obtained the data and whether it is still relevant to your purpose.

You may find that the data comes from a US-only survey, for example, or that it was gathered five years ago.

Facebook users example

Recently I needed to know the Facebook reach in Asia-Pacific, and found three sources which claimed that Facebook had between 3.6m and 3.8m users in Singapore.

Internet World Stats (3.6m)

Hashmeta (3.8m)

We Are Social (3.63m, when added up)

With a little investigation, I could see that they all used the same source: Facebook-reported data.

As Facebook makes its audience data available to advertisers, it was surprisingly easy to check.

Facebook (3.5 – 4m)

And, yes, the primary source verified what the others had reported – tick.
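For anyone who likes to see the check written down, here is a minimal Python sketch of it. It simply tests whether each secondary figure falls inside the range the primary source reports. The figures are the ones quoted above; the function name `within_primary_range` is made up for illustration.

```python
# Figures quoted above, in millions of Facebook users in Singapore.
secondary_sources = {
    "Internet World Stats": 3.6,
    "Hashmeta": 3.8,
    "We Are Social": 3.63,
}

# Range reported by Facebook's own advertiser data (the primary source).
PRIMARY_LOW, PRIMARY_HIGH = 3.5, 4.0


def within_primary_range(value, low=PRIMARY_LOW, high=PRIMARY_HIGH):
    """Return True if a reported figure falls inside the primary source's range."""
    return low <= value <= high


# One True/False result per secondary source.
checks = {name: within_primary_range(v) for name, v in secondary_sources.items()}

for name, ok in checks.items():
    print(f"{name}: {'consistent with primary source' if ok else 'check again'}")
```

A check like this only shows that the secondary sources agree with the primary one. It says nothing about whether the primary figure itself is believable.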

2. Use your own intuition

Once you know the primary source and are comfortable with the methodology, it’s prudent to check the figure against what you yourself know about the area.

That is, if you have some knowledge of a market or a platform, it can be worthwhile to see whether the data passes the sniff test before passing it on.

Often, if it seems too good to be true, you will find that the publishers used an obscure source or misrepresented the question in some way.

Back to Singapore and Facebook

Now I had no issue with Facebook as a data source. But I had a rough idea of the entire population of Singapore (just over 5m), and it seemed far-fetched that 3.6m people, or around 70% of the population, had a Facebook account.

And, as Facebook only reports on people aged 13 and over, the share of eligible people would be even higher, more like 8 out of 10. This seemed very unlikely and required further investigation.
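The sniff test here is just a division, and it can be sketched in a few lines of Python. The population and user figures come from the text. The share of the population old enough for an account is my own assumption for illustration (90%), not an official number, so treat the second result as indicative only.

```python
# Figures from the text: "just over 5m" people, 3.6m reported Facebook users.
population = 5_000_000
reported_users = 3_600_000

# ASSUMPTION (illustrative only, not an official figure): roughly 90% of
# the population is old enough to hold a Facebook account.
ASSUMED_ELIGIBLE_SHARE = 0.90


def penetration(users, base):
    """Share of a base population that the user count represents."""
    return users / base


overall = penetration(reported_users, population)
eligible = penetration(reported_users, population * ASSUMED_ELIGIBLE_SHARE)

print(f"Share of whole population: {overall:.0%}")
print(f"Share of assumed eligible population: {eligible:.0%}")
```

If either share looks implausible against what you know about the market, that is the cue to go looking for a second source.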

3. Find additional sources

If you are suspicious about a data point or you are using the statistic for something important, like a business plan, then you should try to find a second source.

And nowadays this is relatively easy. A short search on Google should turn up other reports which have a similar statistic. And if it’s close, then job done.

But you may find that everyone is using the same figure, in which case you may have to dig a bit deeper in order to verify what you are going to include in your presentation.

Singapore sleuthing

So, searching again, I found that most sources were using Facebook’s published data. But I knew that Singapore meticulously keeps and publishes stats about its population.

Instead of looking at the whole population though, I decided to look at one segment, 20 to 29 year olds, for which both Facebook and Singapore publish figures.

Facebook said that it had 1.2m people between the ages of 20 and 29, inclusive. But on the Singapore government site there are only 535,000 people in that age bracket.

Facebook, Singapore reach for age 20-29 from Power Editor (1.2m)

Singapore government statistics, people aged 20-29 (535,000)

What about non-residents?

Now, the remainder could be non-residents. But Singapore keeps a close watch on these numbers as well.

Unfortunately Singapore does not break out foreigners into age brackets, but it does say that there are 1.63m non-residents in Singapore.

To make up the difference between the Facebook-reported figure and the official statistics for people aged 20 to 29, though, there would need to be around 700,000 foreigners between the ages of 20 and 29 in Singapore on Facebook (1.2m minus 535,000 is 665,000, or roughly 700,000).

So for the Facebook population to be as reported:

  • 43% of all foreigners in Singapore must be between 20 and 29 (unlikely)
  • 100% of them must be on Facebook (unlikely)
  • 100% of all Singapore citizens between 20 and 29 must be on Facebook (unlikely)
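The arithmetic behind those three conditions can be sketched in Python as follows. All three input figures are the ones quoted in this section; the variable names are my own. Note that the sketch assumes, as the argument does, that every resident aged 20 to 29 is on Facebook, so the gap has to be filled entirely by non-residents.

```python
# Figures quoted in the text.
facebook_20_29 = 1_200_000   # Facebook's reported reach for ages 20-29
official_20_29 = 535_000     # Singapore government count for ages 20-29
non_residents = 1_630_000    # all non-residents, not broken out by age

# Users the official count cannot account for, even if every
# resident aged 20-29 has a Facebook account.
gap = facebook_20_29 - official_20_29

# Share of ALL non-residents who would need to be aged 20-29 and on Facebook.
share_exact = gap / non_residents

# The same share using the gap rounded up to 700,000, as in the text.
share_rounded = 700_000 / non_residents

print(f"Unexplained users: {gap:,}")
print(f"Share of non-residents needed (exact gap): {share_exact:.0%}")
print(f"Share of non-residents needed (rounded gap): {share_rounded:.0%}")
```

Whether you use the exact gap or the rounded one, the required share sits at around four in ten of all non-residents, which is the part that is hard to believe.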

This was starting to sound suspicious. And note that all it took was:

  • A look at the source of the statistic
  • A quick thought about whether it was realistic
  • Research on a secondary source

A more realistic figure

Instead, if we used:

  • A generous estimate that a third (33%) of the 1.63m non-residents are between 20 and 29 (538,000)
  • Added them to the Singaporeans aged 20 to 29 (around 530,000)
  • And then estimated, also generously, that 90% of the total (1,068,000) are on Facebook…

…then we would still come up with only around 961,000 people between the ages of 20 and 29 with Facebook accounts, roughly a fifth less than reported.
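Here is the same back-of-the-envelope estimate as a small Python function, so the assumptions can be changed and the sum re-run. The inputs are the rounded figures used in this section; the function name and parameter names are hypothetical, chosen only for this sketch.

```python
def generous_estimate(non_residents, share_aged_20_29, locals_20_29, facebook_rate):
    """Upper-end estimate of Facebook users aged 20-29.

    non_residents     -- total non-resident population
    share_aged_20_29  -- assumed share of non-residents in the age bracket
    locals_20_29      -- local population in the age bracket
    facebook_rate     -- assumed share of the bracket with a Facebook account
    """
    non_residents_20_29 = non_residents * share_aged_20_29
    return (non_residents_20_29 + locals_20_29) * facebook_rate


reported = 1_200_000  # Facebook's reported reach for ages 20-29

estimate = generous_estimate(
    non_residents=1_630_000,
    share_aged_20_29=0.33,   # the generous "one third" assumption
    locals_20_29=530_000,
    facebook_rate=0.90,      # the generous "90% on Facebook" assumption
)

# How far the estimate falls short of the reported figure, as a share of it.
shortfall = (reported - estimate) / reported

print(f"Estimate: {estimate:,.0f}")
print(f"Shortfall versus reported: {shortfall:.0%}")
```

With these rounded inputs the shortfall comes out at around a fifth of the reported figure. Even pushing both assumptions higher, it takes some very unlikely numbers to reach 1.2m.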

What to make of this

Of course, it’s easy to dismiss this as a small error which doesn’t matter in the larger scheme of things.

I mean, figures are often inflated and people should know they are just estimates, right?

But as we all know, these numbers are then included in other estimates. So then this error is repeated and compounded and, in the end, we may very well be using some bad statistics as justification for our strategy.

And if we do that, then we shouldn’t be surprised if what we propose doesn’t work out as well as we projected!

So why are the numbers inflated?

I’m not entirely sure. It could be a sample error, a ‘creative’ way of counting or even fraudulent users.

But what’s important is that by all logical reasoning the reported and repeated statistic is not correct. And the more it is repeated, the more likely it is to be believed.

And if this number isn’t correct, what does that say about the rest of the statistics? Seems to me that they probably all need revisiting in order to avoid knock-on bad analysis.

I mean, imagine if the next estimate was a 22% undercount of Facebook users. There would be headlines that Singapore youths are leaving Facebook in droves!

So…

So, do have a closer look at the statistics you are using and make sure that you:

  • Know the source
  • Think about whether it makes sense
  • And find a back-up source in case it doesn’t look right

You will then be passing on good, reliable figures and your predictions are more likely to be true.

And who knows, with just a bit of work you may even come up with a different solution which is against the trend, and as a result much more effective!