How Microsoft's Clutter can tame your email inbox and save you time

There's also a concept called 'label noise'. This covers the cases "when your actual behaviour differs from your ideal behaviour – when the action that you should take or maybe the action you intended to take isn't what you do," Winn explained. "You intend to reply but you don't; or you reply to another message from the same person instead, because it's the most recent mail from them. We explicitly model those behaviours."

The idea is to make Clutter accurate and sensitive to the subtle nuances of the way we handle email – and to do that without a mass of complicated machine learning code for each special case, which would make the system hard to maintain.

Clutter uses the information from Exchange and the Office graph to build a probabilistic model that predicts what you'll do when you get a new email. Unlike older systems that keep the whole model in memory – which slows things down – Clutter uses Microsoft's Infer.NET compiler, which turns the model description into inference code. The result runs fast enough to handle the petabytes of information in Exchange, and adding the idea of 'label noise' to explain unexpected user behaviour takes only a few lines of code.
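To make the 'label noise' idea concrete, here is a minimal sketch in plain Python. It is not Infer.NET code and not Clutter's actual model: the function name `posterior_wanted`, the parameters `miss_rate` and `stray_rate`, and their default values are all assumptions made up for illustration. The sketch treats "you wanted this mail" as a hidden fact and your observed action as a noisy signal of it, then applies Bayes' rule.

```python
def posterior_wanted(prior: float, acted: bool,
                     miss_rate: float = 0.3,
                     stray_rate: float = 0.05) -> float:
    """Probability that a mail was wanted, given what the user actually did.

    prior      -- model's belief that the mail is wanted before seeing the action
    acted      -- True if the user replied to or otherwise acted on the mail
    miss_rate  -- assumed chance of NOT acting on a wanted mail (label noise)
    stray_rate -- assumed chance of acting on an unwanted mail (label noise)
    """
    # Likelihood of the observed action under each hypothesis.
    like_wanted = (1.0 - miss_rate) if acted else miss_rate
    like_unwanted = stray_rate if acted else (1.0 - stray_rate)

    evidence = prior * like_wanted + (1.0 - prior) * like_unwanted
    if evidence == 0.0:
        # The observation is impossible under the model; keep the prior.
        return prior
    return prior * like_wanted / evidence


if __name__ == "__main__":
    # An ignored mail with a 50/50 prior, with and without label noise.
    print(posterior_wanted(0.5, acted=False))
    print(posterior_wanted(0.5, acted=False, miss_rate=0.0, stray_rate=0.0))
```

With these assumed rates, an ignored mail that started at 50/50 drops to a probability of 0.24 of being wanted rather than straight to zero, which is the point of modelling the noise: one missed reply counts as evidence, not as proof.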

This also made it easier for the MSR team to work with the Exchange group. "In the early days we would talk the Exchange team through our program and what assumptions we were making and they could easily see what we were doing. And they'd say 'that's not right! We know users do this in Outlook, not that' and we could quickly go back and modify our model of what a user does," Winn told us.

No interference

This approach is one of the reasons that Clutter became a feature when other ideas the researchers had come up with in their four years of working with the Exchange team didn't get anywhere. "We've been exploring a number of different ways that machine learning could work in the inbox," Winn said. "Only with Clutter did we feel we'd got something that can really add value, and not be in some way creepy or have the negativity you can sometimes get when you start applying machine learning to personal email."

We may all complain about email, but people quickly get unhappy if their mail system 'interferes' with their messages – and gets it wrong. So far Clutter has been well received – and Winn hopes to extend the approach beyond email.

"The opportunity is very broad. We're looking at other applications in Microsoft products. Some I can't talk about, but there are some already in the Azure ML service and we're actively working with both Exchange on future work with Clutter and with other product teams on using probabilistic predictions in other products."

One possibility is working not just with the structured information in the email header but also with the unstructured information in the message itself. Winn calls unstructured text "the last uncomputable data – it isn't easy to compute with, so it tends to just sit there."

The latest version of Infer.NET is better at working with unstructured text, like the content of email or Office documents. That means a future version of Clutter might be able to understand what a mail message or attachment is about when deciding whether you'll be interested in it – which should make its predictions much more accurate.

Contributor

Mary started her career at Future Publishing, saw the AOL meltdown first hand the first time around when she ran the AOL UK computing channel, and she's been a freelance tech writer for over a decade. She's used every version of Windows and Office released, and every smartphone too, but she's still looking for the perfect tablet. Yes, she really does have USB earrings.