It’s jaw-dropping how quickly voice assistant technology has been adopted, both by gadgets and by gadget-lovers in the home. The technology is now seen as the natural way to control the smart home, thanks to how cheaply voice can be added to your setup. To understand how we got here, though, you have to look to the past – and to a rather familiar name.
Siri was the first to catapult voice technology into the mainstream when it launched with the iPhone 4S back in 2011. At the time, having a voice assistant felt frankly futuristic – it arrived riding an uneasy wave of Terminator and GLaDOS references – but the reality was somewhat different.
Yes, Siri was perfectly functional, but the scope of what could actually be done with it was rather limited – not just because it was beta technology, but because it was restricted to the phone.
There was another problem: it was a victim of its own popularity. In a piece that outlines the troubled history of Siri, Cult of Mac points out that: “Bugs and other problems reportedly began almost from the time Apple acquired the voice system back in 2010. Part of the problem was Siri’s instant popularity.
“The backend servers weren’t prepared for the demand coming from millions of iPhone users. The company has struggled ever since to make Siri’s code more efficient.”
For all its niggles, Siri paved the way for the voice assistants of today, and instantly proved that controlling things with your voice was a feature consumers wanted.
Of course, Siri is still very much around, and is now used in myriad products, from iPads to Macs to the Apple HomePod. But while Siri got everyone used to speaking to a synthetic voice – and there are rumblings that it's about to get a whole lot better – right now it's Google and, in particular, Amazon that are making most of the running in this space.
Hearing the Echo
Google Assistant, to begin with, was much like Siri – using natural language processing it would interpret a query posed by the user, draw on the vast trove of data that is Google search to find the answer, and surface it.
But just this year – and this is where it gets really interesting – Google revealed an extension of Assistant called Google Duplex. This technology is designed not just to answer questions and make lists, but to become a bona fide assistant. It will phone restaurants on your behalf, mimicking conversational language, and do this fully autonomously.
Watching this in action – see the video below – can be chill-inducing, and it’s one of those ‘Oh, computers can actually do that?’ moments that has many privacy and ethical implications. Google has played down how this tech could be used going forwards, but it’s certain that we'll see it being more widely adopted in the months and years to come.
And that leads us to Alexa. Where Apple and Siri planted the acorn of voice assistant tech, Alexa is the oak tree. Amazon has used its might to make sure Alexa is a piece of technology that's available to all, no matter what their budget – and it's a strategy that's worked. But perhaps the real masterstroke was how it offered this voice tech to millions: through the humble speaker.
Reports vary, but it’s thought that between 40 and 50 million people now have access to a smart speaker – and when we say smart, we really mean Alexa. Amazon’s range of products means you can have fully-fledged speakers with Alexa built in, or, with the Amazon Echo Dot, you can make any speaker you already own smart.
The voice OS has also made its way to Amazon tablets and, through third parties, is in everything from fridges to robots.
The smart home is a complex web of disparate products, but Alexa simplifies your setup by bringing them all together. The smartest thing Amazon did with Alexa was to take it out of the Amazon ecosystem, and open it up to as many partners as possible.
Being prevalent doesn't always mean a technology's future is secure, though, and there's a lot to do yet before voice becomes the natural OS of choice.
The future of voice
Natural is the key word here. While Alexa and its bedfellows manage to serve you up the right information most of the time, improvements are being made that will make voice assistants even smarter.
How they handle our voices, for a start, is getting more work. The sound of a voice changes from country to country, but within those countries there are also regional dialects and accents – and this is something that's being worked on.
Speechmatics, a British-based company, has spent 12 years looking into speech recognition and accents, and has come up with a language pack that should help to address the issue.
“We realized that we [needed] to come up with what we like to call ‘one model to rule them all’ – an accent-agnostic language pack that is just as accurate at transcribing [an] Australian accent as it is with Scottish,” Speechmatics CEO Benedikt von Thüngen told VentureBeat.
The result is thousands of hours of speech data that could form the backbone for future voice assistants.
There's also the issue of catering for those who have disabilities that mean voice just isn't a viable option. A recent Alexa mod solved this issue, and while it's by no means official it proves that Alexa and other assistants could use visual data as well as text and voice. Google Assistant can also use visual cues via the Google Lens app.
The true future of voice assistants, however, could lie in the end of our addictive relationship with our smartphones. What do we actually use our phone for? Information, games, communication, and perhaps actually phoning people once in a while. Voice assistants want to do that and more.
If you use your smartphone as an alarm, then the Echo Spot wants to take that reason away and add a little bit of smarts to your wake-up call, offering a particular song or radio station to wake up to. Couple this with Routines and your lights can be switched on, the coffee machine fired up and the weather read to you, all just by you saying something as simple as "Good morning".
This isn’t the future – it’s now. Alexa and other smart assistants already do this, but the processes need to be simplified still further, and more things need to be connected, before we can feel comfortable enough to truly leave our phones in our pockets.
The future isn't us looking down and scrolling, but doing two things that are far more instinctive: speaking, and listening.
TechRadar's Next Up series is brought to you in association with Honor