MIT let an AI loose on the web - what could possibly go wrong?

It knows how to Google

In almost every list of existential threats to humanity, artificial intelligence appears near the top. Many top scientists are concerned about humankind inadvertently unleashing a machine intelligence that decides people are getting in the way of its pre-programmed goals. 

Generally, it's thought that an effective way to defend against such a situation is to keep AIs within a 'sandbox' - limiting their access to external knowledge and telling them, in essence, only what they need to know to perform their tasks. Now, however, MIT researchers have designed an artificial intelligence with the ability to search the web for information that it doesn't have.

To be fair, it's been done in the spirit of helping AIs to learn more effectively. The system was created to automatically classify text data, combing through it and looking for patterns that correspond to different categories provided by humans.

Do whatever it takes

"Traditionally, in natural-language processing, you are given an article and you need to do whatever it takes to extract [information] correctly from this article," said Regina Barzilay, the senior author of a paper describing the research. 

"That's very different from what you or I would do. When you're reading an article that you can't understand, you're going to go on the web and find one that you can understand."

So Barzilay and her colleagues taught the AI to do just that. When it's less confident about its predictions, it generates a web search query designed to pull up texts that are likely to contain the data it's trying to extract. It then analyses those texts and compares the results with its original analysis.
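
For the curious, that loop can be sketched in a few lines of Python. This is only an illustration of the idea as described here, not the researchers' code: the function names, the 0.8 confidence threshold, the trick of building a query from the article's first few words, and the toy extractor and search stand-ins are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, Iterable, Tuple

# (extracted value, confidence between 0 and 1) -- a hypothetical representation
Field = Tuple[str, float]


def extract_with_fallback(
    article: str,
    extract: Callable[[str], Dict[str, Field]],
    search: Callable[[str], Iterable[str]],
    threshold: float = 0.8,   # assumed cut-off, not taken from the paper
    query_words: int = 8,     # assumed query length
) -> Dict[str, Field]:
    """Extract fields from an article, consulting other texts when unsure."""
    best = dict(extract(article))
    # Fields whose confidence falls below the threshold need outside help.
    weak = [name for name, (_, conf) in best.items() if conf < threshold]
    if not weak:
        return best
    # Hypothetical query: simply reuse the opening words of the article.
    query = " ".join(article.split()[:query_words])
    for doc in search(query):
        candidate = extract(doc)
        for name in weak:
            # Keep whichever reading the extractor is more confident about.
            if name in candidate and candidate[name][1] > best[name][1]:
                best[name] = candidate[name]
    return best


def toy_extract(text: str) -> Dict[str, Field]:
    """Stand-in extractor: looks for a phrase such as '3 people killed'."""
    match = re.search(r"(\d+) (?:people )?killed", text)
    if match:
        return {"killed": (match.group(1), 0.9)}
    return {"killed": ("unknown", 0.1)}


def toy_search(query: str) -> Iterable[str]:
    """Stand-in for a web search: returns one canned document."""
    return ["Police confirmed 3 people killed in the incident."]


result = extract_with_fallback(
    "Several victims reported after shooting downtown", toy_extract, toy_search
)
print(result)
```

In this toy run the original article gives the extractor nothing to go on, so the sketch falls back to the canned "search result" and takes the more confident answer from there. A real system would need an actual extractor, a real search backend, and a smarter way of deciding which source to trust.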

Mass shootings and food poisoning

To test it out, the researchers gave the algorithm two tasks. One was to collect information about instances of food contamination from about 300 documents. The other involved a database of mass shootings, from which it was asked to extract the name of the shooter, the location of the shooting, the number of people wounded, and the number of people killed. In both cases, the algorithm was able to improve on the performance of its predecessors by about ten percent.

The full details of the algorithm were published in a paper on the arXiv preprint server.