The future of paid voice search: How voice could be monetised

Earlier this week, I was honoured to speak at the excellent Google Firestarters event series dedicated to performance marketing.

The theme of the night was ‘Twenty years on: The present and future of search’, based on the landmark occasion of Google’s 20th birthday, and each of the four speakers – myself, Alistair Dent of iCrossing, Paul Byrne of MediaCom, and Josh White of Jellyfish – presented their thoughts on where search is headed over the next two decades and what that means for performance marketing.

If you’ve read some of my previous writing for this blog, you’ll know that voice search, and the future of voice in particular, is a bit of a hobby-horse of mine.

Although I tend to be sceptical towards the hype that typically surrounds voice search within the marketing community, especially cries of “50% of search queries will be voice by 2020!”, I do believe that voice search has huge potential to be revolutionary – if executed in the right way.

The possibilities for voice search advertising are also intriguing, and we might already be seeing some hints as to how voice will be commercialised.

I’ve also been an avid follower of the rise of visual search over the past two years, and think that it holds some very exciting possibilities, particularly for ecommerce, as technology like augmented reality develops.

It’s impossible to say for certain what the next two decades have in store; I don’t think anyone writing in 1998 could have accurately predicted our current internet landscape, complete with filter bubbles, smartphones, voice assistants, and a whole host of other innovations that vastly affect how we connect to and search the internet.

Still, we can give it our best shot. Here are some of my thoughts on where search could be headed over the next 10-20 years, and how it might be monetised. First up: the future of paid voice search.

The future of voice search and voice connectivity

Since the introduction of Apple’s iconic voice assistant, Siri, with the launch of the iPhone 4S in 2011, voice devices have come a long way.

While we initially predicted that voice assistants would be mostly confined to our smartphones, that changed with the launch of the Amazon Echo in 2014, which in turn sparked off a smart speaker “arms race” amongst the major tech companies.

Smart speakers have achieved widespread popularity: according to voicebot.ai, an estimated 47.3 million people in the United States own a smart speaker, which works out to roughly one in five adults. In the UK, uptake has been slightly slower, but a recent report from Ofcom found that one in every eight households now has a smart speaker.

But it’s not the devices themselves so much as what they represent that’s important. Voice assistants are becoming embedded in everything: from your microwave – which Amazon now wants you to talk to – to your car. And smart speakers are designed to be connected to other components of your smart home, such as smart lighting, locks, and appliances, which are increasingly looking like the future of housing. All of these would be voice-controlled.

This means that in the future, we could find ourselves surrounded by voice-activated devices that are all connected to the internet. Will we be able to search with all of them? In 20 years’ time, maybe we will – but only if we see a significant shift in the way that voice search operates.

How voice search could work – really work

I previously wrote in my piece on the future of voice search that the reason we’re not seeing truly mainstream adoption of voice search (contrary to what the hype would have you believe, the number of actual internet searches conducted via voice is fairly small) is that voice can’t truly be used to explore the web.

It lacks an onward journey: either a voice query will produce a single answer that doesn’t give users the opportunity to further browse any of the results returned for their search, or it takes the user to a regular page of results on their smartphone or PC – no different to if they’d input their query using text, and certainly no good in a situation where their hands are occupied, which is one of the use cases voice is supposed to be perfect for.

I imagined a future, inspired in part by a discussion between search experts on how voice devices could affect the future of search, in which voice assistants will offer to read out content from the webpages that rank for their search query, much like a podcast or audiobook.

And in fact, this future is already being developed. Schema.org markup is often cited as key to optimising for voice search, and a type of schema was recently introduced, called “Speakable schema”, which is designed to allow website owners to mark up sections of their website that are particularly suitable for being read aloud by a voice assistant.
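To make the idea concrete, here is a minimal sketch of what Speakable markup might look like when generated as JSON-LD. It assumes the pending schema.org `speakable` property and `SpeakableSpecification` type with CSS selectors; the helper name `speakable_article`, the example URL, and the selector names are all hypothetical, and the exact properties a given voice assistant requires may differ.

```python
import json


def speakable_article(headline: str, url: str, css_selectors: list[str]) -> dict:
    """Build JSON-LD that flags parts of an article as suitable for reading aloud.

    Hypothetical helper: only `speakable`, `SpeakableSpecification` and
    `cssSelector` come from schema.org; everything else is illustrative.
    """
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            # Selectors pointing at the page sections a voice assistant may read out.
            "cssSelector": css_selectors,
        },
    }


# Example values are placeholders, not taken from a real site.
markup = speakable_article(
    headline="What is performance marketing?",
    url="https://example.com/performance-marketing",
    css_selectors=[".article-headline", ".article-summary"],
)

# The resulting JSON would sit inside a <script type="application/ld+json"> tag.
print(json.dumps(markup, indent=2))
```

In practice, the marked-up sections would probably be kept short, such as a headline and a summary, since they are meant to be listened to rather than skimmed.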

Speakable schema is still officially pending, meaning that it’s awaiting further feedback, but that hasn’t stopped Google from announcing a beta programme with select news publishers that enables news results to be read aloud in response to a voice query.

Per the example given in Google’s announcement post, users could ask their Google Home device, “Hey Google – what’s the news on NASA?” and receive a short audio summary of the top headline, followed by an invitation to listen to another article on the subject. The Google Assistant will also send links to two relevant answers to the user’s phone, allowing them to read the full content at their leisure.

It’s not yet clear how several parts of this process work, such as how Google decides which news articles “rank” top for this kind of voice query (does it use regular SEO ranking factors, or some other criteria?). Nonetheless, it’s a big deal for voice search, because it’s a real step towards what a “true” search experience with voice could look – or sound – like.

It also opens up some possibilities for how voice advertising, if it comes to pass, might work.

The possibilities for paid voice search

Thus far, major players like Amazon and Google have held back from introducing any kind of paid search advertising for voice, though rumours abound that Amazon is exploring ad options for Amazon Echo devices.

One of the reasons that companies have been cautious about wading into this area is that voice advertising has the potential to be a lot more intrusive and irritating than visual or text-based ads. There’s no option to skim or scroll past a voice ad. In early 2017, a number of Google Home devices surprised their owners by delivering what appeared to be an advertisement for the new live-action Beauty and the Beast – though Google stridently denied that the short plug was intended as an ad – giving rise to instant backlash.

"I always wanted an ad-supported clock" – said no one ever.

— Jess Soft-G Gif 2020 (@mibi) March 16, 2017

well that's one way to ruin something cool

— Justin Pierce (@meJustinPierce) March 17, 2017

In all its forms so far, the voice “SERP” has had significantly fewer results than a regular search results page, because no-one wants to sit through ten search results being read aloud. This means that while sponsored voice search results would enjoy a lot more prominence, there is a real risk of damaging user trust.

How could voice advertising get around these obstacles? One possible solution would be to have sponsored results appear after organic results. The voice assistant would still read them out, ensuring visibility – or audibility – but searchers wouldn’t feel forced to sit through sponsored search results in order to hear the organic results.

However, we know from Google’s Beauty and the Beast fiasco that users will likely still resent being obliged to listen to an ad, even if it comes after the content they want. This is why I believe that consent is the key to making users feel in control when it comes to voice advertising.

A “Cost Per Consent” model?

Imagine this scenario: a person says to their Google Home device, “Hey, Google: what is performance marketing?”

“According to Wikipedia,” the Assistant replies, “Performance-based advertising, also known as pay for performance advertising, is a form of advertising in which the purchaser pays only when there are measurable results. I also have a relevant sponsored result for you. Would you like to hear it?”

If the person responds with “Yes” – if they consent to hearing the ad – then the Assistant reads out the sponsored result, and accompanies that by sending a link to the sponsored product or service to the person’s smartphone, in the same manner as Google’s news beta.

This would be the equivalent of a click in PPC (Pay-Per-Click) or CPC (Cost Per Click) advertising. Perhaps we could even see a “Cost Per Consent” model emerge, in which the advertising brand or publisher is charged for every searcher who consents to hearing a sponsored result.
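As a rough illustration of how such a model could be billed, the sketch below counts only clear affirmative replies as consent and charges the advertiser a flat fee for each one. Everything in it is an assumption made for the sake of the example: the `VoiceAdOffer` type, the list of accepted phrases, and the 0.50 fee are hypothetical and do not describe any existing ad product.

```python
from dataclasses import dataclass


@dataclass
class VoiceAdOffer:
    """A sponsored result the assistant may offer to read out (hypothetical)."""
    advertiser: str
    cost_per_consent: float  # assumed flat fee charged for each accepted offer


def is_consent(reply: str) -> bool:
    """Treat only a clear affirmative as consent; anything else is a refusal."""
    return reply.strip().lower() in {"yes", "yes please", "sure", "ok"}


def bill_for_replies(offer: VoiceAdOffer, replies: list[str]) -> tuple[int, float]:
    """Return the number of consents and the total charge to the advertiser."""
    consents = sum(1 for reply in replies if is_consent(reply))
    return consents, consents * offer.cost_per_consent


# Four searchers are offered the sponsored result; two agree to hear it.
offer = VoiceAdOffer(advertiser="Example Brand", cost_per_consent=0.50)
consents, charge = bill_for_replies(offer, ["Yes", "no", "not now", "sure"])
print(f"{consents} consents, charge {charge:.2f}")
```

The design choice worth noting is that silence or an ambiguous reply is never billed: the advertiser pays only when the searcher has actively opted in, which is what separates this from an impression-based model.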

You might reason that few people would opt in to advertising in this scenario, but few people click on advertising of any kind to begin with – and many of those who do click, do so unintentionally. We know that forcing advertising on consumers only breeds resentment, so why begin with a model which is already proven to irritate people?

Other ways of monetising voice search

Some argue that the nature of voice search means it will never be viable to monetise it through search advertising, and that companies like Google and Amazon will stick to more reliable methods of money-making, like ecommerce.

It’s true that Amazon already benefits plenty from purchases being made through Echo devices, but in my view, there will always be a limit to how well ecommerce works with a voice-only interface. For repeat purchases of known products, it makes sense, but in all other scenarios, a visual component is badly needed.

With that said, voice devices like the Amazon Echo and the Google Home are now available with screens (the Echo Show and the Home Hub, respectively). I made the argument in my Google Performance Firestarters presentation that we are unlikely to relinquish some kind of visual display for browsing the web, be that a smart speaker with a screen, a pair of smart glasses, or a smartphone linked up to wearable technology. Therefore, in most scenarios, it will still be possible to visually browse products.

There are also Alexa Skills and Google Assistant Actions – the voice equivalent of apps – which already give businesses the opportunity to have a brand presence on voice devices. Earlier this year, Amazon introduced the option for developers to monetise Alexa skills, making things like games and premium audio content available for a fee – the equivalent of in-app purchases, or paid-for apps. Harry McCracken of Fast Company wrote of the development,

“The new features don’t just give skill builders the opportunity to charge for the sort of offerings they’re already building; they could incentivize the creation of ambitious new skills that nobody would bother to tackle without profit in mind.”

Monetisation of voice is still very much a developing area, and at the moment, all things are possible. Cynic that I am, I don’t believe that companies like Google and Amazon will pass up the opportunity to explicitly monetise voice search if it does take off, but we will see. My advice to the audience at Google Firestarters when it comes to both voice and visual search was to keep an eye on developments, experiment with the technology, and if budget permits, invest in ad options as they become available.

In the second part of this article, I’ll examine the rise of visual search, how it could be tied to the increasing adoption of augmented reality, and what opportunities the technology presents for performance marketers.

Read the second part of this article:

The future of visual search: smart glasses, AR and retail opportunity

Comments

Peter Austin of Fresh Relevance (martech) 16 November 2018

“Hey, Google: what is performance marketing?”

“Available soon from Fresh Relevance,” the Assistant replies, “Performance Marketing is a comprehensive term that refers to online marketing and advertising programs in which advertisers and marketing companies are paid when a specific action is completed; such as a sale, lead or click.”

Fixed it for you. Shorter and no need for the “I also have a relevant sponsored result for you. Would you like to hear it?” which nobody is going to say yes to.

That being said, few people are going to use voice to browse the internet. It’s fine for simple tasks such as choosing music, but the bandwidth is very poor for anything complicated.

- Rebecca Sentance 16 November 2018
  
  @Peter Austin However, I don’t think consumers would take kindly to the sole result that they hear being a sponsored result 🙂 Which is why I proposed the “consent” model and putting sponsored results after organic. If you know that the only answer you get is an ad that someone has paid for, why would anyone use voice search?
  