IBM is betting that the best computational approach for answering a human question may be to break down the task into multiple parts and run them all in parallel.
IBM will use this strategy next month, when its custom-built computer, nicknamed Watson, competes in an episode of the Jeopardy game show against two previous champions.
While IBM has thus far been silent about Watson's configuration, Watson lead manager David Ferrucci recently shared a few insights with the IDG News Service about how the system was built to take on this formidable task.
For IBM, the Jeopardy challenge represents the next stage in mimicking human intelligence in computer form. In 1997, IBM's Deep Blue computer won a chess match against grandmaster Garry Kasparov. Jeopardy will be an even tougher job, Ferrucci said.
"In chess, there is nothing tacit, nothing contextual," Ferrucci said. In contrast, the questions in a Jeopardy match assume an understanding of how people communicate, including the many references and allusions they use. "It's a huge challenge," he said.
"Natural language processing is so difficult because of the many different ways the same information can be expressed," Ferrucci said.
Watson's approach is to divide and conquer. "You have to look at the data from so many different perspectives and combine the [results], because you can never rely on there being only one way to express that content."
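The divide-and-conquer idea can be illustrated with a small sketch. This is not Watson's code or architecture, which IBM has not disclosed; it is a toy example that assumes a tiny hand-written fact table and three hypothetical scoring functions (keyword_overlap, number_match and name_in_clue_penalty). Each function looks at the clue from one perspective, all of them run in parallel, and their scores are added together to pick a candidate.

```python
import re
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy knowledge base, invented for this illustration only.
FACTS = {
    "bolt": "a bolt is a measurement of cloth equal to 40 yards",
    "yard": "a yard is a unit of length equal to 3 feet",
    "ell": "an ell is an old measurement of cloth about 45 inches",
}


def words(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_overlap(clue):
    """Perspective 1: share of clue words that appear in each fact."""
    clue_words = set(words(clue))
    return {
        name: len(clue_words & set(words(fact))) / len(clue_words)
        for name, fact in FACTS.items()
    }


def number_match(clue):
    """Perspective 2: reward facts that contain a number from the clue."""
    numbers = {w for w in words(clue) if w.isdigit()}
    return {
        name: 1.0 if numbers & set(words(fact)) else 0.0
        for name, fact in FACTS.items()
    }


def name_in_clue_penalty(clue):
    """Perspective 3: the answer is rarely named in the clue itself."""
    clue_words = words(clue)
    return {
        name: -1.0 if any(w.startswith(name) for w in clue_words) else 0.0
        for name in FACTS
    }


PERSPECTIVES = [keyword_overlap, number_match, name_in_clue_penalty]


def best_candidate(clue):
    """Run every perspective in parallel, then combine the scores by summing."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda scorer: scorer(clue), PERSPECTIVES))
    totals = defaultdict(float)
    for scores in results:
        for name, score in scores.items():
            totals[name] += score
    return max(totals, key=totals.get)


if __name__ == "__main__":
    print(best_candidate("This measurement of cloth is equal to 40 yards."))
```

No single scorer is trusted on its own here; the final choice comes from the combined evidence, which is the point Ferrucci makes about never relying on one way of expressing the content. Summing the scores is only the simplest possible way to combine them.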
In the game show, contestants compete to correctly answer a series of questions. In a grammatical twist, the questions are phrased as answers and the contestants must provide their answers in the form of questions.
To make the contest even more difficult, the questions are often phrased in elliptical ways, forcing the contestants to think about what is really being asked. One typical question: "This measurement of cloth is equal to 40 yards." (Answer: "What is a bolt?")
Only when host Alex Trebek finishes reading the question (in the form of an answer) are the contestants allowed to indicate that they know the answer, by hitting a buzzer. The first contestant hitting the buzzer gets the first chance to answer the question.
Typically, it takes about three seconds for the host to finish reading the question, Ferrucci said. It is in this compact time frame that Watson must determine a plausible answer.
At first glance, the challenge might seem easy. After all, Internet search engines do these sorts of searches millions of times a day. But it is not so easy, Ferrucci said.
"There is a misconception that [the computer] is just looking the answer up somewhere. I wish it were that easy," he said. Google and other Internet search services return only the documents that may provide the answers, not the answers themselves. And databases hold material that can be accessed only through precisely worded queries.