Why voice recognition is no longer just a gimmick

Voice recognition is becoming the norm

"I take it as a good omen that wireless should have reached its present perfection at a time when the empire has been linked in closer union, for it offers us immense possibilities to make that union closer still." The world has come a very long way since King George V uttered those words into a microphone in the first ever Christmas Speech broadcast by the BBC in 1932.

Today, 82 years later, King George would no doubt stare in disbelief as I ask my smartphone to find me a recording of his famous words on YouTube. Yet, in the opinion of this humble 'commoner', voice recognition and dictation systems have finally come of age.

In relative terms – at least since the birth of the digital age – voice dictation applications are nothing particularly new. The ill-fated Belgian company Lernout and Hauspie was developing speech recognition systems back in 1987, and bought Dictaphone and Dragon Systems at the start of the millennium to add weight to its product base.

Although the company no longer exists, Microsoft has continued to use some of L & H's speech interface tech.

A maturing technology

However, many of the early applications were flaky and unreliable, requiring voice databases to be linked to algorithms and 'training' based on a few hundred or so users enrolled as part of research and development programmes.

Today, with the advent of cloud and big data, there is an almost infinite amount of voice data available from 'real' users, linked to servers that define and process languages and complicated words without any training of either algorithm or user. This sea change is evidenced by the sophistication of Apple's Siri and Nuance's Dragon Dictate mobile app, which even learn your own vocabulary as they go along.

Now, anyone (without a really heavy accent) can access free, consumer-based voice recognition and dictation tools reliably and without any training. Moreover, the dataset for using voice translation has grown exponentially.

Without question, this paradigm shift in technology has turned voice-based systems from quirky techno-gimmicks to genuine business tools.

Why? Because the pace of development has accelerated sharply, helped not only by big data but also by the demand for 'hyper-tasking' tools that can keep up with consumers' appetite for immediacy and for working on the move. Dictating is certainly faster than typing on a mobile screen and, for those who can't touch-type, often quicker than using a desktop keyboard.

Text-to-speech advancements

There has also been a coming-of-age for text-to-speech applications. Once a niche tool confined to the visually impaired and accessibility markets, text-to-speech has found a broader consumer audience: the revolution in mobile devices and, in particular, in-car systems has generated demand for software that can read text aloud without sounding like a foreign language.

In fact, text-to-speech has an unexpected benefit when it comes to proof-reading. A journalist friend of mine told me that, no matter whether he's writing for broadcast or print, he always reads anything he has typed out loud.

Not only does it give him a sense of the writing in general, but it's the best way of picking up spelling mistakes that would otherwise be missed with silent reading, which uses a different part of the brain. The same applies to text-to-speech technology. After all, where would we be without such literary classics as "The DaVinci Cod" or Gabriel Garcia Marquez's "One Hundred Ears of Solitude"?

Ten years ago, voice dictation software was only really the domain of secretaries, lawyers, medics and the occasional savvy executive. Nowadays it's almost taken for granted and the result could be a dramatic increase in productivity and a safer life on the move.

Finally, if you're one of those sceptics, perhaps badly traumatised by the effort of using earlier incarnations of voice dictation, why don't you give some of these new systems a go? Why not try Google Now, Apple Siri or the Nuance Dragon Dictate app?

  • Dr Peter Chadha is Managing Director of Dr Pete Inc and Steegle.com. He is an IT consultant providing strategic IT reviews and implementation to global enterprise. He takes a pragmatic approach to business solutions, but is a technology evangelist.