2013: The year that voice took off
2013 was the pivotal year for voice tech, with big improvements in word accuracy driven largely by deep neural networks. Andrew Ng later summed up what this progress means for users in his famous tweet:
As speech-recognition accuracy goes from 95% to 99%, we’ll go from barely using it to using all the time! https://t.co/TfjqJLDTPJ
— Andrew Ng (@AndrewYNg) December 16, 2016
Whilst this sounds encouraging on the accuracy front, the voice recognition/activation journey has been a long and slow one. To demonstrate the point, at SXSW we were shown an early attempt at voice technology: IBM’s Shoebox, from 1961.
In the modern, digital era, things have moved on apace from the launch of Siri (to decidedly mixed reviews) in 2011.
A conversational technology, for all generations
Following the pivotal changes of 2013, 2014 is now perceived as ‘Year One’ for voice activation. Already this technology is growing faster than smartphones did at the same stage in their development, using the launch of the iPhone in 2007 as the comparable starting point. Not only is take-up faster, but voice is also being adopted more quickly by older generations.
Not all older people are necessarily comfortable with the technology (see the great video below of an Italian grandmother learning to use Google Home), but voice is easy to adopt because it is the first technology that does not need to be learnt. Intuitive by nature, unlike computers and phones, this technology will adapt to us, not the other way around.
With all connective technology, even the keyboard, there has always been a conversation, involving a ‘back and forth’ and ‘process and progress’. The process, for the consumer at least, is getting simpler. And although it is likely that the ‘screens of tomorrow will increasingly be speakers’, one thing holding voice back is the different approaches being taken in the East and West. In order to really prosper, the approach needs to be a concerted global enterprise.
The ‘voice’ challenge for marketers
With the rise of voice technology comes the rise of the screenless/voice marketer and developer. These roles are becoming essential as voice becomes the first filter and the main point of contact between brand and consumer. I like the following perspective on the challenges these roles will face (from The Next Web):
Interfaces are gradually becoming invisible as we move toward a world of Zero UI. Screens will start to go away, and interactions will primarily happen via voice, gestures, glances, or even by thought. Branding means a lot more than just visuals these days.
Admittedly, some brands have recognised the importance of different modes of human interaction. For example, Snapchat, Tinder and Spotify addressed the importance of gestures in 2015. Work on the ‘impact of the mind’ is still fairly nascent, but one example is thought-controlled drone flight. Here’s an example courtesy of Emotiv Epoc.
Understanding humanity and individual cultures is essential to understanding effective communication — as Peter Drucker (may have) said, ‘Culture Eats Strategy For Breakfast.’ Understanding culture (the ideas, customs, and social behaviour of a particular people or society) is the essential starting point for an effective communications strategy.
The changes that voice may bring
Some of the more obvious effects of voice activation may include the complementing of human customer service with voice activation; the ‘long-tail’ impact on search (as voice searches are generally longer than text searches); and the rise of selling through voice search.
Some less obvious effects may include the monetisation of ‘conversation data’ and the rise of audio ads around questions (pre-roll, post-roll). The future of voice apps is more open to question, as only 3% of apps are used actively in a given week.
The impact on search
In terms of success metrics, search is currently concerned with the accuracy of the answers provided to the questions asked (e.g. ‘who’, ‘when’ and ‘why’) and with metrics such as purchase, adoption and retention. Increasingly, we will be more preoccupied with new notions of ‘emotional success’, such as engagement and happiness.
Amongst all of this, it is important to remember that, whilst voice can shrink the space between brand and consumer, designing artificial commercial conversations that adequately replicate human interactions is highly challenging. When it works well, from the consumer’s perspective, voice is about efficiency and ease: essentially, ‘being met where you are’.
Getting the technology right for the user
Getting this right means successfully ‘evolving the mediation in a conversation’, successfully translating ‘user intentions’, and moving through three key stages of connection: from ‘the foundations of conversation’, to ‘changing the conversation’, to ‘better conversations’. As voice becomes more familiar, it becomes more relevant and fulfils user journeys more effectively.
In this example involving an Amazon Echo in South Park, Cartman seems to have had a fulfilling user journey, and there are some interesting things on his (and other people’s) shopping lists (please note: NSFW).
A new perspective on marketing
Finally, there is also a need for a change in marketing perspective: from ‘how your brand looks’ (old) to ‘how your brand looks at the world’ (new). Brands need to have a personality and perspective, like people, to fully connect in a voice-activated world.