Microsoft Cortana: reaching new levels of voice recognition

"You go up to the elevators and they open before you get to the door. At some point, without human interaction it learned to understand the behaviour of people around elevators."

Lee is optimistic about the next decade based on what he sees in the lab. "Being able to understand what you see in videos is advancing, having a deep understanding of human discourse is making great progress.

"In ten years we will knock down some of the core AI problems in understanding speech and human discourse," he predicts. That's because Microsoft Research isn't afraid to take unusual approaches.

For example, Cortana's speech recognition comes from a Microsoft Research project that started in 2009 using deep neural networks, a project Lee admits he would never have approved if he'd been running the lab then.

"I surely would have killed that project; I would have said it was completely ridiculous and I would have been backed by all the top researchers, but a year and a half later it completely transformed the field."

Multi-lingual learning

In fact, those neural networks are now developing an unusual ability that Lee calls transfer learning. "We take a neural net and we train it on a large amount of English, and as far as we can tell, the more data we train on, the better the performance.

"Now take that neural net and train it on Chinese data and what you get is a neural net that performs very well on Chinese but also gets better on English. And then if you teach it French, the training on French is much faster; everything is much faster, including English recognition."

What happens is that the system starts to understand not just one language or two, but concepts of language. If you visualise the way these neural nets store information as a 3D space with connections between different words, Lee says that the connection between the words 'man' and 'woman' resembles the connections between 'boy' and 'girl' or 'king' and 'queen'.
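
To make that picture concrete, here is a minimal sketch in Python. It is not Microsoft's system: the six vectors are hand-made for illustration, the three axes (roughly gender, age and royalty) are an assumption, and the names TOY_VECTORS, offset, cosine and analogy are hypothetical. Real networks learn far more dimensions from data, and the pattern is only approximate there.

```python
import math

# Hand-made 3D vectors, for illustration only (an assumption, not learned data).
# Axes are roughly: (gender, age, royalty).
TOY_VECTORS = {
    "man":   (1.0, 1.0, 0.0),
    "woman": (-1.0, 1.0, 0.0),
    "boy":   (1.0, -1.0, 0.0),
    "girl":  (-1.0, -1.0, 0.0),
    "king":  (1.0, 1.0, 1.0),
    "queen": (-1.0, 1.0, 1.0),
}


def offset(a, b):
    """Return the 'connection' from word a to word b: vec(b) - vec(a)."""
    va, vb = TOY_VECTORS[a], TOY_VECTORS[b]
    return tuple(y - x for x, y in zip(va, vb))


def cosine(u, v):
    """Cosine similarity between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm


def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' by nearest neighbour to vec(c) + (vec(b) - vec(a))."""
    target = tuple(t + o for t, o in zip(TOY_VECTORS[c], offset(a, b)))
    candidates = [w for w in TOY_VECTORS if w not in (a, b, c)]
    return min(candidates, key=lambda w: math.dist(TOY_VECTORS[w], target))


if __name__ == "__main__":
    # The man->woman connection points the same way as boy->girl and king->queen.
    print(cosine(offset("man", "woman"), offset("boy", "girl")))
    print(cosine(offset("man", "woman"), offset("king", "queen")))
    print(analogy("man", "woman", "king"))
```

In this toy space the three connections are identical by construction, so the analogy lookup lands exactly on 'queen'. In a trained network the offsets only roughly line up, which is why a nearest-neighbour search is used rather than an exact match.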

"These things are emerging on their own out of the machine learning system. At the lowest layer these systems are, in an unsupervised way, discovering some internal structure of speech. We're getting at something very deep about human discourse and intelligence but also in a way that has tremendous commercial value."

Basic research that might not end up in a product for several years – rather than pure product development – is important for Microsoft Research for several reasons.

"In order to attract the best people you have to promise people they will be able to be independently influential in the research community. That means open publication policies, encouraging people to partner and to appear at conferences. It can't just be about making money for Microsoft.

More than commercial benefit

"So there's lots of thought being given to health, energy, social justice and other areas going on in our research labs. It's interesting for me that it's not a stretch for me to face Steve Ballmer or now Satya Nadella and justify expenditure on those research activities, because what we learn from them informs the Azure cloud service, what we learn informs things like Cortana.

"What we learn tracking zebras ends up having commercial value. One of the great things about basic research is you operate at the foundation where things tend to have multiple purposes, both social and commercial.

"You end up having the ability to bring great smart people into your company but you also get the glow and satisfaction of bringing great research into the world."

Lee is keen to let more people know what it is that Microsoft Research does. When he was running DARPA, the Defense Advanced Research Projects Agency (which funded things like the internet and the research projects that became Siri and Google's self-driving car), he offered a public prize for finding giant red weather balloons hidden by a research team.

He's hired the organiser of the balloon search at Microsoft and promises, slightly mysteriously, that he has some fun ideas.

"Generally you will see over the next few months Microsoft Research activities being much more public. We'll be doing things in an attempt to reach many more people and engage with them directly." Join in the fun and who knows, you might be helping build a future version of Cortana.


Mary started her career at Future Publishing, saw the AOL meltdown first hand the first time around when she ran the AOL UK computing channel, and she's been a freelance tech writer for over a decade. She's used every version of Windows and Office released, and every smartphone too, but she's still looking for the perfect tablet. Yes, she really does have USB earrings.