Female owners of smart speakers are more likely than male owners to report that their device fails to understand their commands, according to a recent YouGov study of 1,000 British smart speaker owners.
YouGov found that "two thirds of female owners (67%) say that their device fails to respond to a voice command at least “sometimes”, compared to a small majority of male owners who say the same (54%)".
"By contrast, 46% of men say their device “rarely” or “never” fails to work, compared to only 32% of women."
The researchers also found that women tend to speak more politely to their smart speakers, with "45% saying they “always” or “often” say ‘please’ and ‘thank you’, compared to only 30% of male owners".
This discrepancy between male and female users could be a result of bias at the point of training AI assistants like Alexa or Siri; if programmers train the AI to respond to mainly male voices, it may have trouble recognizing female voices in the future.
Not everyone believes this to be the case, however. In its reporting of the study, the Evening Standard cites a blog post by Delip Rao, founder and CEO of R7 Speech Sciences, who believes that the discrepancy is down to technological issues rather than gender bias.
He explains that female voices have more variance in pitch than male voices, which can make it difficult for smart speakers to distinguish their commands.
Rao posits that this issue could be resolved by training voice assistants to distinguish whether a male or female voice is speaking. As he points out, though, this comes with its own set of issues: it doesn't account for non-binary or transgender users, who could be misgendered by the voice assistant.
Perhaps the easiest way to combat this discrepancy would be for developers to "build models that directly learn from raw waveforms", and to provide their AI assistants with more varied examples of people's speaking voices to learn from.