Hear me now: the future of voice and how it's the new OS

It’s jaw-dropping how quickly voice assistant technology has been adopted, both by gadgets and by gadget-lovers in the home. The technology is now seen as the natural way to control the smart home, thanks to how cheaply voice can be added to your setup. To understand how we got here, though, you have to look to the past – and to a rather familiar name.

Siri was the first to catapult voice technology into the mainstream, when it launched with the iPhone 4S back in 2011. At the time, having a voice assistant felt frankly futuristic – because of this it came riding an uneasy wave of Terminator / GLaDOS references – but the reality was somewhat different. 

Yes, Siri was perfectly functional, but the scope of what could actually be done with it was rather limited – not just because it was beta technology, but because it was restricted to the phone. 

There was another problem: it was a victim of its own popularity. In a piece that outlines the troubled history of Siri, Cult of Mac points out that: “Bugs and other problems reportedly began almost from the time Apple acquired the voice system back in 2010. Part of the problem was Siri’s instant popularity. 

“The backend servers weren’t prepared for the demand coming from millions of iPhone users. The company has struggled ever since to make Siri’s code more efficient.”

Apple's Phil Schiller announcing Siri on stage at the iPhone 4S launch event in October 2011

For all its niggles, Siri paved the way for the voice assistants of today, and instantly proved that controlling things with your voice was a feature consumers wanted.

Of course, Siri is still very much around, and is now used in myriad products, from iPads to Macs to the Apple HomePod. But while Siri got everyone used to speaking to a synthetic voice – and there are rumblings that it's about to get a whole lot better – right now it's Google and, in particular, Amazon that are making most of the running in this space.

Hearing the Echo

Google Assistant, to begin with, was much like Siri: using natural language processing, it would interpret a query posed by the user, use the big data trove that is Google search to find the answer, and surface it.

But just this year – and this is where it gets really interesting – Google revealed an extension of Assistant called Google Duplex. This technology is designed not just to answer questions and make lists, but to become a bona fide assistant. It will phone restaurants on your behalf, mimicking conversational language, and do this fully autonomously.

Watching this in action – see the video below – can be chill-inducing, and it’s one of those ‘Oh, computers can actually do that?’ moments that has many privacy and ethical implications. Google has played down how this tech could be used going forwards, but it’s certain that we'll see it being more widely adopted in the months and years to come.

And that leads us to Alexa. Where Apple and Siri planted the acorn for voice assistant tech, Alexa is the oak tree. Amazon has used its might to make sure Alexa is a piece of technology that's available to all, no matter what their budget. And it’s a strategy that's worked. But perhaps what matters most is how it offered this voice tech to millions: through the humble speaker.

Reports vary, but it’s thought that between 40 and 50 million people now have access to a smart speaker – and when we say smart, we really mean Alexa. Amazon’s range of products means you can have fully-fledged speakers with Alexa built in, or, with the Amazon Echo Dot, you can make any speaker you already own smart.

The voice OS has also made its way to Amazon tablets and, through third parties, is in everything from fridges to robots.

Amazon's Echo was the original smart home speaker (Image credit: Amazon)

The smart home is a complex web of disparate products, but Alexa simplifies your setup by bringing them all together. The smartest thing Amazon did with Alexa was to take it out of the Amazon ecosystem, and open it up to as many partners as possible.

Being prevalent doesn't always mean a technology's future is secure, though, and there's a lot to do yet before voice becomes the natural OS of choice.

The future of voice

Natural is the key word here. While Alexa and its bedfellows manage to serve you up the right information most of the time, improvements are being made that will make voice assistants even smarter.

Voices, for a start, are getting more work. Accents change from country to country, and within those countries there are also regional dialects, something that's now being worked on.

Speechmatics, a British-based company, has spent 12 years looking into speech recognition and accents, and has come up with a language pack that should help to address the issue.

“We realized that we [needed] to come up with what we like to call ‘one model to rule them all’ – an accent-agnostic language pack that is just as accurate at transcribing [an] Australian accent as it is with Scottish,” Speechmatics CEO Benedikt von Thüngen told VentureBeat.

The result is thousands of hours of speech data that could form the backbone for future voice assistants.

There's also the issue of catering for those who have disabilities that mean voice just isn't a viable option. A recent Alexa mod solved this issue, and while it's by no means official it proves that Alexa and other assistants could use visual data as well as text and voice. Google Assistant can also use visual cues via the Google Lens app.

The true future of voice assistants, however, could lie in the end of our addictive relationship with our smartphones. What do we actually use our phone for? Information, games, communication, and perhaps actually phoning people once in a while. Voice assistants want to do that and more. 

If you use your smartphone as an alarm, then the Echo Spot wants to take that reason away and add a little bit of smarts to your wake-up call, offering a particular song or radio station to wake up to. Couple this with Routines and your lights can be switched on, the coffee machine fired up and the weather read to you, all just by you saying something as simple as "Good morning".
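
For the curious, the idea behind a routine like this can be sketched in a few lines of Python. This is only an illustration of the concept, not Amazon's implementation: the function names, the ROUTINES table and the returned strings are all made up for the example, and a real setup would call each device's own service instead.

```python
from typing import Callable, Dict, List

# Hypothetical device actions (assumptions for illustration only);
# a real smart home would call each device's own API here.
def lights_on() -> str:
    return "lights: on"

def start_coffee() -> str:
    return "coffee machine: brewing"

def read_weather() -> str:
    return "weather: forecast read aloud"

# A routine ties one spoken phrase to an ordered list of actions.
ROUTINES: Dict[str, List[Callable[[], str]]] = {
    "good morning": [lights_on, start_coffee, read_weather],
}

def run_routine(phrase: str) -> List[str]:
    """Run every action tied to the phrase; unknown phrases do nothing."""
    key = phrase.strip().lower()
    return [action() for action in ROUTINES.get(key, [])]

print(run_routine("Good morning"))
```

The point of the sketch is that a single phrase fans out into several actions, which is why one "Good morning" can stand in for a handful of taps on a phone.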

This isn’t the future; it's now. Alexa and other smart assistants already do this, but the processes need to be simplified still further, and more things need to be connected, before we can feel comfortable enough to truly leave our phones in our pockets.

The future isn't us looking down and scrolling, but doing two things that are far more instinctive: speaking, and listening.

TechRadar's Next Up series is brought to you in association with Honor

Marc Chacksfield

Marc Chacksfield is the Editor In Chief, Shortlist.com at DC Thomson. He started out life as a movie writer for numerous (now defunct) magazines and soon found himself online - editing a gaggle of gadget sites, including TechRadar, Digital Camera World and Tom's Guide UK. At Shortlist you'll find him mostly writing about movies and tech, so no change there then.