I was lucky enough to attend Voice Conference 2018 (#voiceconf2018), held in Amsterdam at the beginning of October. It was a thoroughly informative and insightful event, with talks from speakers representing some of the tech world’s biggest players. I was particularly interested in the practical challenges these industry leaders have faced with voice interfaces, and in their solutions, so that I could draw parallels with the challenges WeAreBrain discovered during a recent voice assistant project for a client.
Voice interfaces are becoming the next ‘big thing’ in the tech world, and some speakers referred to them as the ‘4th revolution in the digital landscape’. This certainly piqued everyone’s interest and got us all excited to hear from the pioneers of voice technology. I found myself in the right place at the right time to learn about the most recent use-cases from some of the most brilliant minds in the industry.
The main topics of the day were virtual assistants, relevant use-cases, multi-modality and choosing the right modality for each case. Tech giants like Google, Amazon, and Microsoft have been heavily subsidising their software and hardware for this modality in recent years, grabbing headlines in the process. Meanwhile, Apple has been quietly growing the intelligence capabilities of Siri, which already ‘lives’ in our pockets. With the leaders already starting to run away with the tech, where does that leave the smaller players in the game? I was going to find out.
Two of the standout talks were from KLM and Rabobank, who insisted that companies should get into the voice game as soon as possible. Making a digital ‘land grab’ for this new technology is imperative if you, as a business owner, want room to experiment before millions of users begin adopting it.
In essence, these talks argued that designing effective communication for text-based chatbots is a great basis for speech-based chatbots. While in many cases you will not be able to completely reuse the content and interactions of a visual, text-based chatbot, the core flow and some of the technologies can be shared between conversational modalities. For example, you can reuse the same dialogue design tools, such as flowcharts, and usability testing methods, such as Wizard of Oz, with the help of text-to-speech for voice interactions.
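To make the idea of a shared core flow a little more concrete, here is a minimal Python sketch. It is not taken from any of the talks: the flow, the step names and the wording are all made up for illustration, and the keyword matching stands in for whatever language understanding a real assistant would use. The point is only that the flow and its routing logic live in one place, while each modality gets its own way of presenting a step.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One step of a dialogue flow, independent of modality."""
    prompt: str
    # Maps an option label to the id of the next step.
    options: dict[str, str] = field(default_factory=dict)


# Hypothetical flow for illustration only; ids and wording are assumptions.
FLOW = {
    "start": Step(
        "What would you like to do?",
        {"check in": "check_in", "flight status": "status"},
    ),
    "check_in": Step("Please give me your booking code."),
    "status": Step("Which flight number are you asking about?"),
}


def render_text(step: Step) -> str:
    """Text chat: options can be shown visually, e.g. as quick-reply buttons."""
    if not step.options:
        return step.prompt
    buttons = " ".join(f"[{label}]" for label in step.options)
    return f"{step.prompt} {buttons}"


def render_voice(step: Step) -> str:
    """Voice: nothing is visible, so the options are folded into the sentence."""
    if not step.options:
        return step.prompt
    spoken = " or ".join(step.options)
    return f"{step.prompt} You can say {spoken}."


def next_step(current: str, user_input: str) -> str:
    """Shared routing logic: naive keyword match on the option labels."""
    for label, target in FLOW[current].options.items():
        if label in user_input.lower():
            return target
    return current  # no match: stay on this step and ask again
```

Both renderers read the same `FLOW`, so a change to the conversation design only has to be made once. What differs per modality is the presentation, which matches the point that content and interactions often cannot be reused as they are.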
Know your conversation starters
Seb Reeve, Director of Strategic Solutions at Nuance, spoke about the main business-relevant use-cases. He reminded us that one of the biggest drivers of conversational interactions was, and still is, customer service. The infamous IVR (Interactive Voice Response) technology was an intermediate step towards more natural speech-based interactions with virtual customer service assistants. Open navigation supported by natural language processing will help solve the problem of customers choosing the wrong path through the IVR ‘tree’. Another use-case is identification by voice. Voice biometrics is not new, but it is gaining acceptance among users and in legislation. A third example was controlling a TV by voice, especially for complex queries.
The complexity of implementing voiced conversational interfaces
While voice is posited as a more accessible technology because users don’t need to type, it is not universally accessible either. It was refreshing to hear from many speakers how labour-intensive it is to get speech-based interactions right. The work ranges from understanding human behaviour and making hard choices about reducing interactions to crafting every word and the way a normal conversation flows. As AI expert Oren Etzioni said, “machine learning is 99% human work. Everybody hits the ‘oh shit’ moment”. He gave the example of a customer service assistant that took years of hard work to create, yet phone calls to human agents took a long time to decrease because consumers needed quite some time to discover the virtual assistant and start trusting it.
From this, you can see that it is not magic but hard work and sweat that get these things working smoothly. Besides the challenge of getting conversations right, there are integration problems with the legacy systems needed to offer personalised experiences, legislation issues, technical stability and, in the case of the Netherlands, limited support for the Dutch language and local conversational traditions. Only Google supports the Dutch language and is trained to understand modern communication in the Netherlands. Microsoft and Amazon have not released their plans yet but promise they are working on it. So if you want to support Amazon Alexa or Microsoft Cortana, you will be targeting an even smaller pool of early adopters who feel comfortable speaking English to their devices. As all the companies presenting their cases recommended, this small coverage should not hold you back: you want to have an updated and matured version of your service ready when these platforms hit the mass market in Dutch. Google Assistant has been available on some Android smartphones since July 2018. In the Netherlands, it will be the obvious platform to start with.
As Ruben Klerks from KLM put it, “With voice, we are learning how clients will use it, and at the same time clients are learning how companies will deliver their service via voice.” Time will tell how well voice will be integrated into everyday technology. These are exciting times to be living in!
We at WeAreBrain are investing in learning new technologies on different levels: business relevance, as reflected in our articles, as well as user experience and technological implementation. We are experienced in supporting brands with their pilots in collaborative, agile ways.
If your business is interested in giving your brand a voice, contact us today.