It is becoming increasingly commonplace to use voice to control our homes and computers. Only a decade ago, asking Alexa or Google Home to control the lights or order a product from the internet might have seemed unusual, but it’s now the norm.
One of the companies that have invested the most in this type of technology is Google, enabling their Google Home devices and Android systems to accept verbal commands, and even recognise the speaker.
Today we’re going to look at how they’ve linked this technology into their Google Docs platform and consider whether this is a viable alternative to other commercial voice-to-text solutions.
The software is included in Google Docs, and so is inherently free. All you need to use it is a Google account and either a mobile device or computer with a microphone. It is another source of data for Google to harvest, but that’s part of the price of ‘free’ these days.
Google Docs accepts voice input through a feature called ‘Voice Typing’, which can be found on the Tools menu of a Google Doc or within Google Slides. It can also be activated in either location using the hotkey Ctrl-Shift-S.
When you activate it for the first time, you are asked to grant docs.google.com access to the microphone.
Once you’ve accepted that, a small box appears with a microphone logo on it that you can click to activate voice input. This is then replaced with a red microphone symbol alongside the document to indicate that the system is in listening mode.
Before you activate voice typing, you can pick a language from a menu on the control panel, and as this is Google, there are plenty of choices. You can also click on a question mark to get some help on how to use the system.
As these systems go, this is a very compact solution.
Google will process what you say to the best of its ability, and if the system is unsure about a word, it will underline it in grey. Clicking on one of these ‘suspect’ words brings up some alternatives suggested by the system.
You can also quickly move around a document and fix issues manually, or place the cursor and give Voice Typing another stab at it.
If you need to talk to another person while working, you can ask Google to stop listening, and then resume afterwards.
To get the full capability of this solution, though, there is a long list of commands to memorise. These can copy, paste, move around the document, insert tables, and perform a myriad of other functions.
You can also insert punctuation, format the document, and even insert hyperlinks.
Getting the most from it assumes that you can remember the commands, or have the help open to jog your memory.
Conveniently, saying ‘Voice commands list’ quickly brings up the list.
Where many voice-to-text solutions only cover a small number of languages, Google’s covers a significant number. The current definitive list is:
Afrikaans, Amharic, Arabic, Arabic (Algeria), Arabic (Bahrain), Arabic (Egypt), Arabic (Israel), Arabic (Jordan), Arabic (Kuwait), Arabic (Lebanon), Arabic (Morocco), Arabic (Oman), Arabic (Palestine), Arabic (Qatar), Arabic (Saudi Arabia), Arabic (Tunisia), Arabic (United Arab Emirates), Armenian, Azerbaijani, Bahasa Indonesia, Basque, Bengali (Bangladesh), Bengali (India), Bulgarian, Catalan, Chinese (Simplified), Chinese (Traditional), Chinese (Hong Kong), Croatian, Czech, Danish, Dutch, English (Australia), English (Canada), English (Ghana), English (India), English (Ireland), English (Kenya), English (New Zealand), English (Nigeria), English (Philippines), English (South Africa), English (Tanzania), English (UK), English (US), Farsi, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Icelandic, Italian, Italian (Italy), Italian (Switzerland), Japanese, Javanese, Kannada, Khmer, Korean, Laotian, Latvian, Lithuanian, Malayalam, Malaysian, Marathi, Nepali, Norwegian, Polish, Portuguese (Brazil), Portuguese (Portugal), Romanian, Russian, Slovak, Slovenian, Serbian, Sinhala, Spanish, Spanish (Argentina), Spanish (Bolivia), Spanish (Chile), Spanish (Colombia), Spanish (Costa Rica), Spanish (Ecuador), Spanish (El Salvador), Spanish (Spain), Spanish (US), Spanish (Guatemala), Spanish (Honduras), Spanish (Latin America), Spanish (Mexico), Spanish (Nicaragua), Spanish (Panama), Spanish (Paraguay), Spanish (Peru), Spanish (Puerto Rico), Spanish (Uruguay), Spanish (Venezuela), Sundanese, Swahili (Kenya), Swahili (Tanzania), Swedish, Tamil (India), Tamil (Malaysia), Tamil (Singapore), Tamil (Sri Lanka), Thai, Turkish, Ukrainian, Urdu (India), Urdu (Pakistan), Vietnamese and Zulu.
That’s 119 languages and regional variants, including 15 Arabic forms, 21 Spanish variations, 13 English dialects and even four flavours of Tamil.
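For anyone who wants to check that tally, the short Python sketch below counts the entries in the list above and groups them by base language. The grouping rule (everything before the bracketed region counts as one family, so plain ‘Arabic’ and ‘Arabic (Egypt)’ fall together) is our own assumption rather than anything Google publishes, and the names used in the code are purely illustrative.

```python
from collections import Counter

# Entries copied from the language list above. Grouping by the text before
# " (" is our own assumption about what counts as one language family.
SUPPORTED = """
Afrikaans, Amharic, Arabic, Arabic (Algeria), Arabic (Bahrain), Arabic (Egypt),
Arabic (Israel), Arabic (Jordan), Arabic (Kuwait), Arabic (Lebanon),
Arabic (Morocco), Arabic (Oman), Arabic (Palestine), Arabic (Qatar),
Arabic (Saudi Arabia), Arabic (Tunisia), Arabic (United Arab Emirates),
Armenian, Azerbaijani, Bahasa Indonesia, Basque, Bengali (Bangladesh),
Bengali (India), Bulgarian, Catalan, Chinese (Simplified),
Chinese (Traditional), Chinese (Hong Kong), Croatian, Czech, Danish, Dutch,
English (Australia), English (Canada), English (Ghana), English (India),
English (Ireland), English (Kenya), English (New Zealand), English (Nigeria),
English (Philippines), English (South Africa), English (Tanzania),
English (UK), English (US), Farsi, Filipino, Finnish, French, Galician,
Georgian, German, Greek, Gujarati, Hebrew, Hindi, Hungarian, Icelandic,
Italian, Italian (Italy), Italian (Switzerland), Japanese, Javanese, Kannada,
Khmer, Korean, Laotian, Latvian, Lithuanian, Malayalam, Malaysian, Marathi,
Nepali, Norwegian, Polish, Portuguese (Brazil), Portuguese (Portugal),
Romanian, Russian, Slovak, Slovenian, Serbian, Sinhala, Spanish,
Spanish (Argentina), Spanish (Bolivia), Spanish (Chile), Spanish (Colombia),
Spanish (Costa Rica), Spanish (Ecuador), Spanish (El Salvador),
Spanish (Spain), Spanish (US), Spanish (Guatemala), Spanish (Honduras),
Spanish (Latin America), Spanish (Mexico), Spanish (Nicaragua),
Spanish (Panama), Spanish (Paraguay), Spanish (Peru), Spanish (Puerto Rico),
Spanish (Uruguay), Spanish (Venezuela), Sundanese, Swahili (Kenya),
Swahili (Tanzania), Swedish, Tamil (India), Tamil (Malaysia),
Tamil (Singapore), Tamil (Sri Lanka), Thai, Turkish, Ukrainian, Urdu (India),
Urdu (Pakistan), Vietnamese, Zulu
"""


def base_language(entry: str) -> str:
    """Drop a trailing '(Region)' qualifier: 'Arabic (Egypt)' -> 'Arabic'."""
    return entry.split(" (")[0].strip()


# Split on commas; strip() also removes the line breaks used for layout.
entries = [e.strip() for e in SUPPORTED.split(",") if e.strip()]
counts = Counter(base_language(e) for e in entries)

print("total entries:", len(entries))
print("distinct base languages:", len(counts))
for name in ("Arabic", "Spanish", "English", "Tamil"):
    print(name, counts[name])
```

Counting this way includes the unqualified ‘Arabic’ and ‘Spanish’ entries in their families; a stricter rule that only counts regional variants would give figures one lower for each.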
Some of the languages included, like Zulu and Icelandic, are rarely supported by dictation software because of their relatively small number of speakers.
Language coverage is probably Google Voice Typing’s biggest strength.
If this solution has a weakness, it’s that it can’t easily process recordings.
It isn’t impossible to make it do this, but it requires patching the audio system of the computer so that it takes output destined for the speakers and directs it as if it were coming from the microphone. Doing this doesn’t enable you to differentiate between different people on the recordings, however, and it might interfere with the AI that Google uses to improve accuracy by learning how you speak.
If you wish to transcribe podcasts or recorded interviews, we’d recommend you use something else, as this tool isn’t built for that purpose.
It is hard to judge the accuracy of a voice processing system when you can’t send it the same recordings that other products have converted. And anyone who uses Alexa or Google Home on a regular basis will know that occasionally they won’t understand us, mostly because of extraneous sounds or inconsistent speaking.
That said, in the active testing we did, this tool generally got most of the words correct, or the correct word was quickly available on the suspected words menu.
Getting the best results requires some control over the speed, volume and tone of your speech, something that undoubtedly comes with practice. Crucially, being able to remember all the special commands can also reduce the amount of post-recording editing that is required.
Depending on your expectations, the accuracy here is acceptable. There is a consistency to its interpretations that it maintained during our tests. How well it works for you, we can’t predict. But as it is free, it won’t cost anything other than your time to determine that.
As this is Google, the security model is the same one that controls access to all Google accounts. That ranges from simple password protection up to a more reasonable two-factor authentication (TFA) methodology.
Given the number of identity thieves around, those using Google without TFA are running a significant risk of having their accounts compromised.
Even this security option has its limits, but it is better than merely a password.
For those who aren’t sufficiently paranoid, we strongly recommend you go along to https://myactivity.google.com/myactivity
There you’ll see what Google collects on you daily, and that might include recordings of your voice commands.
This might be a longer review if this software offered more functionality, but it doesn’t.
As voice-to-text solutions go this one isn’t complicated, but it has enough functionality to be genuinely useful.
Other solutions are built to handle the transcribing of conversations between multiple people, whereas this was designed to handle a single person who is speaking in a controlled and precise fashion.
What using it assumes is that you are happy to use Google and Google Docs, even if that isn’t the ultimate destination of the text you input.
It’s no chore to copy and paste dictation from Google Docs into another application, and you will have a cloud copy to reference should you end up needing one.
Some users understandably have issues with feeding Google’s insatiable appetite for user data, and this mechanism is yet another source of data for it to snack on.
If you feel like that, then you won’t use Google Voice Typing, or anything else by Google.
For those willing to accept how much Google might know about them, the voice dictation solution in Google Docs is capable enough for general use, especially if you only need this functionality occasionally.