The boom in the B2B big data market (from a sub-$100m industry in 2009 to $130bn today) mirrors an enterprise-led scramble to invest in data mining, reminiscent of the California gold rush, accompanied by a similar media buzz.

Although much is still written about the near-magical potential of data analytics for business, this fervour is now giving way to a more serious debate on where the real business value can actually be found. It’s clear that data prospectors are diverging into two camps: the ‘haves’ and the ‘have-not-yets’.

A recent KPMG survey found that only 40% of executives have a high level of trust in the consumer insights their analytics produce, and most said their C-suite didn’t fully support their current data analytics strategy. Some 58% of businesses told Teradata that the impact of big data analytics on revenues was “3% or smaller”. The real bonanza appears confined to banking, supply chains, and technical performance optimisation – understandably, some businesses feel left behind.

Guidance on using data analytics is aimed at companies with a massive pre-existing data hoard who wish to extract value from it – the equivalents of the gold rush’s “49ers” who arrived in California early in 1849 to stake a claim on a good piece of prospecting land. Those struggling tend to be consumer-facing brands or marketers attempting to understand their customers’ behaviour by panning vigorously in a shallow stream of aggregated sales data.

The first question these argonauts need to ask themselves is whether there’s really any difference between the ‘data analytics’ they are doing today and good old-fashioned business intelligence. The ubiquity of big data has led to a subtle shift in language, whereby any information is now ‘data’ and analysis often simply means ‘looking’.

Can human decision-makers find new actionable insights just by looking at data? Credible examples and detailed case studies are conspicuous by their absence, despite analytics vendors’ repeated promises of golden nuggets of actionable insight at the end of the analytics journey. (It should be noted that merchants made far more money in the gold rush than miners – its first millionaire, Samuel Brannan, sold prospecting tools and supplies, and was also the first to publicise the gold strike, running up and down the streets of San Francisco yelling: ‘Gold! Gold from the American River!’)

Attempting to squeeze insights out of small data has proven hazardous, to the point that it can lead one astray. Insight is defined as ‘the understanding of a specific cause and effect in a specific context.’

Data cannot generate insight – insight is the conclusion people draw from evidence. Humans draw these conclusions, but in a deeply flawed way, simply because we’ve evolved to detect patterns everywhere.

We see faces in house fronts, mythical beasts in constellations and apparitions on toast. We inevitably draw statistical inferences that are invalid, and cannot pick out the ‘random’ scatterplot from an identity parade of graphs by sight alone. People know that statistical correlation doesn’t imply causation, yet they constantly behave as if it does. All these traits work against us when viewing data.

What about machine learning, then? Can dumb machines, invulnerable to the cognitive biases that afflict humans, uncover causal relationships from data which our puny minds are too feeble to compute? As an industry, we’ve typically found the answer is yes – but while predictive models can reliably forecast and simulate events, their complexity prevents easy interpretation.

For most business uses of prediction, this doesn’t matter. For marketing purposes, however, the forecast-and-simulate approach has typically left unmet the fundamental need brands have to understand their customers and how to connect with them. In fact, there are platforms available in the market that apply machine learning to a particularly large pile of ‘pay dirt’ that any brand can access, but which until recently had been considered impossible to mine.

These platforms make use of the natural-language text churned out by millions of people across social media, blogs, and discussion forums and turn it into ‘datapoints’ which can be clustered, filtered, and modelled, just like quantitative data.

This is a great leap forward in the big data era – the use of deep learning techniques such as neural nets to successfully transform ‘unstructured’ data (natural language, images, sound and video) into usable form. It’s now possible to harvest online conversations and model them using standard data science techniques, telling marketers exactly what position a brand occupies in consumers’ minds, what type of occasion and feelings they associate with a product, and giving a true understanding of how a target audience sees the brand.
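As an illustration only – not any particular vendor’s pipeline – the ‘text into datapoints’ step can be sketched in a few lines with off-the-shelf tools. This assumes scikit-learn, and the posts below are invented examples:

```python
# A minimal sketch: turn free text into numeric "datapoints" and cluster them.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

posts = [
    "love the new trainers, so comfortable for running",
    "these running shoes gave me blisters after one mile",
    "best coffee I've had all year, rich and smooth",
    "the espresso was bitter and the service was slow",
]

# TF-IDF turns each post into a weighted word-frequency vector...
vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)

# ...which standard techniques such as k-means can then cluster,
# filter, and model just like any other quantitative data.
labels = KMeans(n_clusters=2, random_state=0, n_init=10).fit_predict(vectors)
print(labels)
```

Real platforms use far richer representations (deep-learning embeddings rather than word counts), but the principle is the same: once language is a vector, the whole data science toolkit applies.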

You can identify the distinct communities discussing a subject on social media, the drivers of their collective attention, and who influences the influencers. Finally, once the conversational data has been processed in this way, algorithms can be used to identify the online trends and triggers which track real-world events.
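The community-finding step can also be sketched with standard graph tools. This is a hedged toy example assuming networkx; the accounts and reply edges are invented:

```python
# Nodes are accounts; an edge means one account replied to or mentioned another.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

g = nx.Graph([
    ("ana", "ben"), ("ben", "cat"), ("ana", "cat"),  # one tight conversation group
    ("dev", "eli"), ("eli", "fay"), ("dev", "fay"),  # a second group
    ("cat", "dev"),                                  # a weak bridge between them
])

# Modularity-based clustering finds the distinct communities in the conversation.
communities = greedy_modularity_communities(g)
print([sorted(c) for c in communities])

# Degree centrality is one crude proxy for "who influences the influencers".
central = max(g.degree, key=lambda pair: pair[1])[0]
```

Production systems work on graphs with millions of nodes and use more sophisticated influence measures, but the shape of the analysis is the same.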

These models are now used by marketers to uncover emerging consumer trends years before they show up in market research, test new product concepts, and determine the best copy for marketing materials.

Most importantly, mining web data this way gives consumer brands the chance to reap the same rewards from advances in big data analytics as the banks and web companies, without sacrificing understanding on the altar of automation. For the data-starved marketer, it’s the mother lode.
