Can machines imitate how humans think?

AI is no longer the preserve of academia and science fiction

Depth of knowledge

In reality though, chatterbots are still too simple for us to be fooled for long. They don't have the depth of day-to-day knowledge that makes conversations interesting in the real world, and any conversation lasting more than a few minutes reveals that paucity of information.

A more recent development was the episode of the quiz show Jeopardy! in which two human contestants played against IBM's Watson computer (although given that Watson has 2,880 processor cores and 16 terabytes of RAM, perhaps 'computer' is too simple a term).

Watson was not only programmed with a natural language interface that could parse and understand the quiz questions, but also had a four-terabyte database of structured and unstructured information from encyclopedias (including Wikipedia), dictionaries, thesauruses, and other ontologies. In essence, Watson has 'knowledge' about many things and the software to query that knowledge, to form hypotheses about that knowledge, and to apply those hypotheses to the questions posed.

Parsing natural language

Although Watson may be seen as the current best contender for passing the Turing Test, there are still issues with the natural language interface - perhaps the hardest part of the software to write.

One of the questions asked (or rather, given the nature of the quiz show, one of the answers for which the contestants had to provide the question) was: "Its largest airport was named for a World War II hero; its second largest, for a World War II battle". Watson failed to parse the sentence properly, especially the part after the semicolon, and replied with "What is Toronto?" when the answer should have been "What is Chicago?".

Natural language can be parsed in two ways: either through a strict semantic analysis and continually improving that analysis, or through a looser analysis and then using statistical methods to improve the result.

The second, statistical approach is the one used by Google Translate: by drawing on a huge corpus of texts and their translations by human linguists, the quality of on-demand translations can be improved greatly. Google Translate relies on the accumulated work of the crowd to translate text rather than on strict language algorithms, but it's still a useful application of AI.
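As a rough illustration of the statistical idea, the sketch below picks, for each source phrase, the translation that human translators chose most often. The tiny corpus, the phrase pairs and the function names are all invented for this example; a real system works with vastly larger corpora and far more sophisticated models.

```python
from collections import Counter, defaultdict

# Hypothetical parallel corpus: (source phrase, human translation) pairs.
PARALLEL_CORPUS = [
    ("bonjour", "hello"),
    ("bonjour", "good morning"),
    ("bonjour", "hello"),
    ("merci", "thank you"),
    ("merci", "thanks"),
    ("merci", "thank you"),
]


def build_phrase_table(corpus):
    """Count how often each translation was chosen for each source phrase."""
    table = defaultdict(Counter)
    for source, target in corpus:
        table[source][target] += 1
    return table


def translate(phrase, table):
    """Return the most frequently seen translation, or the phrase itself if unseen."""
    if phrase not in table:
        return phrase
    return table[phrase].most_common(1)[0][0]


phrase_table = build_phrase_table(PARALLEL_CORPUS)
print(translate("bonjour", phrase_table))
print(translate("merci", phrase_table))
```

Nothing in this sketch 'understands' French or English: it simply counts what people have written before, which is why more data tends to mean better results.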

One of the problems with the Turing Test is that it invites programs with ever more complex conversation routines that are just designed to fool a human counterpart. Although researchers are investigating how to create a 'strong AI' that we can converse with, more research is being done on 'specific AI' or 'domain-bound AI' - artificial intelligence limited to a specific field of study.

Real advances are being made here, to the extent that we no longer think of these solutions as AI. Indeed, there's more of a move to view AI research as research into problems whose solutions we don't yet know how to write.

An example of such specificity in AI research is face detection in images. Yes, it's been solved now, but it was only in 2001 that Paul Viola and Michael Jones published their paper on how to approach the problem. A decade later, we have point-and-shoot cameras that can find a face in the field of view, then focus and expose for it.

Fifteen years ago or earlier, the face detection problem would have been considered AI, and now we have cameras that can do the work in real time. AI is a concept that shifts its own goalposts.

Neural networks

Many specific-AI systems use a neural network as the basis of the software. This is a software model of a small network of neurons, each of which is itself emulated in software and known as a perceptron. Just like our neurons, perceptrons receive stimuli in the form of input signals, and fire off a single signal as output, provided the weighted sum (the mix) of the input signals is greater than some threshold value.
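A single perceptron is simple enough to sketch in a few lines. In the example below, the 'mix' of input signals is taken to be a weighted sum, and the weights and threshold are illustrative values chosen so that the perceptron behaves like a logical AND; the names are made up for this sketch.

```python
def perceptron_fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs exceeds the threshold, else 0."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total > threshold else 0


# Illustrative values: with these settings the perceptron acts as a logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_THRESHOLD = 1.5

for a in (0, 1):
    for b in (0, 1):
        print(a, b, perceptron_fires([a, b], AND_WEIGHTS, AND_THRESHOLD))
```

Only when both inputs are 1 does the sum (2.0) exceed the threshold (1.5), so that is the only case in which the perceptron fires.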

Neural networks need to be trained. In other words, they aren't written as fully functional for a problem space - they have to be taught. The programmer has to feed many examples of features in the problem space to the neural network, observe the outputs, compare them with the desired outputs, then tweak the configuration of the perceptrons to make the output closer to the expected results.
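The feed, observe, compare and tweak cycle can be sketched for a single perceptron using the classic perceptron learning rule. This is a minimal illustration rather than how any production system is trained: the logical OR training set, the learning rate and the number of passes are assumptions chosen for the example, and the firing threshold is folded into a 'bias' term.

```python
def predict(inputs, weights, bias):
    """Fire (1) if the weighted sum plus bias is positive, else stay silent (0)."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def train(examples, epochs=20, learning_rate=0.1):
    """Perceptron learning rule: nudge the weights by the error on each example."""
    n_inputs = len(examples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for inputs, desired in examples:
            # Compare the observed output with the desired output...
            error = desired - predict(inputs, weights, bias)
            # ...then tweak the configuration to move the output closer to it.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical training set: teach the perceptron a logical OR.
OR_EXAMPLES = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train(OR_EXAMPLES)

for inputs, desired in OR_EXAMPLES:
    print(inputs, "wanted", desired, "got", predict(inputs, weights, bias))
```

Here the tweaking is done by the code rather than by hand, but the principle is the same: the perceptron is never told the rule, only shown examples until its outputs match.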

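The face detection 'algorithm' is trained in much the same way, although strictly speaking it is not a neural network: Viola and Jones used a boosting technique (AdaBoost) to build a cascade of simple classifiers based on rectangular image features. They used a database of thousands of faces from the internet to tune the detector to recognise faces accurately (according to their paper, they used 4,916 images of faces and 9,500 images of non-faces that they sliced and diced in various ways). The training took weeks.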

Another specific AI-like problem is OCR, or Optical Character Recognition. Again, the main engine is a neural network, and this time it's trained with lots of examples of printed characters in different fonts and at resolutions from high to low. The success of OCR is such that virtually all scanners come with it as part of their software, and the main reason for investing in a commercial OCR package is slightly better recognition and the ability to fully replicate the formatting of the scanned document.

The kind of intelligence exhibited by these types of programs is statistical in nature. We can see the success of these applications as embodiments of AI, despite the fact that they would never be able to participate in a Turing Test. Nevertheless, even as recently as a few years ago, such programs were inconceivable.

Such statistical intelligence isn't limited to image recognition or to translation engines. In 2006, Netflix launched a prize for improvements to its movie recommendation algorithm. The winning entry, submitted in 2009 by a team called BellKor's Pragmatic Chaos, uses a slew of factors to help provide recommendations, including temporal factors like movie popularity (charted over time), and user biases and preferences as evinced by their changing tastes. In essence, it passes a lot of statistical data through various models, to such an extent that the resulting system wasn't designed, but evolved.
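To give a flavour of what 'user biases' means in a statistical recommender, here is a minimal sketch of a baseline predictor: it estimates a rating as the overall average, plus an offset for the user and an offset for the movie. It is not the prize-winning algorithm, which combined many models and temporal effects; the ratings, the names and the simple averaging used to estimate the offsets are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical ratings: (user, movie, stars out of 5).
RATINGS = [
    ("ann", "Alien", 5), ("ann", "Brazil", 4),
    ("bob", "Alien", 3), ("bob", "Brazil", 2),
    ("cat", "Alien", 4),
]


def mean(values):
    return sum(values) / len(values)


def fit_baseline(ratings):
    """Estimate the global mean, then per-user and per-movie offsets from it."""
    global_mean = mean([stars for _, _, stars in ratings])

    by_user = defaultdict(list)
    for user, _, stars in ratings:
        by_user[user].append(stars - global_mean)
    user_bias = {user: mean(devs) for user, devs in by_user.items()}

    by_movie = defaultdict(list)
    for user, movie, stars in ratings:
        # Remove the user's own bias before attributing the rest to the movie.
        by_movie[movie].append(stars - global_mean - user_bias[user])
    movie_bias = {movie: mean(devs) for movie, devs in by_movie.items()}

    return global_mean, user_bias, movie_bias


def predict(user, movie, model):
    """Predicted rating = global mean + user bias + movie bias (0 if unseen)."""
    global_mean, user_bias, movie_bias = model
    return global_mean + user_bias.get(user, 0.0) + movie_bias.get(movie, 0.0)


model = fit_baseline(RATINGS)
print(round(predict("cat", "Brazil", model), 2))
```

Even this crude model captures something useful: a generous rater's four stars and a harsh rater's four stars don't mean the same thing, and the prediction for a film the user hasn't seen is adjusted accordingly.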

As you've seen, we can view AI through two very different lenses. The first is what we've grown up with: a computer system and software that can emulate a human being to the extent that it can fool the judges in a Turing Test. It's known as strong AI and is the subject of many science fiction movies, the most famous example being HAL 9000 from 2001: A Space Odyssey.

The second is perhaps more interesting, because it affects us in our daily lives: specific AI that solves single problems, and that would have been the subject of SF novels just a few years ago. What new specific AI awaits us in the next five years?