Speaking artificial intelligence, a machine that converses with us, has been elusive. We want such machines, but no good ones exist. They were on the drawing board in 1956, when the field of A.I. was founded. Have we given up? Today you could be forgiven for thinking that every technology worth inventing has already been invented: the Apple iTunes store offers more than a million choices, and a typical computer superstore stocks all manner of hardware and software.
We shouldn't give up. My research since the early 1980s makes one thing clear: we can now build speaking machines that really work. Why don't we? We've missed a gear, but it will kick in soon. I will show how we can build new language-based machines that interact with us naturally, in ways we can only dream about with today's technology.
The industry's approach to A.I. has been wrong, but the will (and the financial capital) is available. When we flip the switch to the new model, modern development timelines will take over and the revolution will accelerate, as it did with Apple and the first personal computers.
What success looks like
A model of success for how we should interact with computers is the HAL 9000 computer from 2001: A Space Odyssey. HAL plays chess very well, which demonstrates his “intelligence,” and, better still, he speaks eloquently. His obvious flaw is ethical: he kills humans to achieve his goals. We needn't worry about that danger; I'll explain why in a later post.
Why hasn't this been done sooner?
The history of artificial intelligence holds the key to progress. We are remarkably good at exploiting ideas, and it won't take long to improve radically with the next generation of language-understanding machines. History has supplied no shortage of good ideas, but some poor choices were made along the way. You will see how simply dusting off the right ones will put us back on the right path.
The problems of artificial intelligence today show up as roadblocks to language mastery by machines. The established technology giants are threatened by this disruption because language-based A.I. will force big changes on their steady-state businesses, demanding new investment in conversational capability on every device.
On a personal note, I can get away with poking fun at the technology giants, because I’m one of them, a long-term technologist from the IT industry. I’m a scientist: a mathematician, computer scientist and cognitive scientist. Back in the 1980s, I was a computer engineer at IBM, repairing mainframe computers. I love programming and debugging, and I love the awesome power that the Internet gives us. I really appreciate the amazing improvements in programming in the last few decades, thanks to Microsoft, IBM and Oracle in particular. And I love learning more about how our brains work so we can emulate them for our benefit. Even if you don’t share my passion, join us; every perspective is helpful.
We are entering the natural language understanding (NLU) era for speech and text. When a machine understands us (in any language), we can interact with it using our everyday conversational speech, but we probably won’t be chatting with it. It’s the first step to “talking with HAL,” and an important step with us firmly in control.
Insights into how a brain-inspired system works should be thought-provoking. If my model is right, the computational model of the brain used since the 1930s will morph into a pattern-based model, one that will help not only machines to speak but also neuroscientists to improve treatment of brain injuries.
A.I.: The industry’s fine mess
Let’s look at how we got into this fine mess with artificial intelligence.
Today’s machines don’t understand us. That’s a big deal, because the effects hold back almost everything we do with computers. There are many applications we cannot have because A.I. isn’t working accurately: (a) translation between languages, (b) speech recognition and generation, (c) conversation with machines, (d) accurate dictation, (e) automatic reading of documents for accurate summary and questioning and (f) accurate, interactive Internet search that returns the right answers regardless of source language.
And who hasn't had to deal with a major corporation's interactive voice response (IVR) system, a “productivity” tool for the organization at the customer's expense? Imagine future callers efficiently getting what they want instead of today’s dialogue of “press 5 for technical support.” And if you want to complain, press 812546432#!
Areas where we can improve
Translation software is my favorite example of a failed design: both Google and Microsoft Bing provide text translations between languages that, in many cases, mean nothing at all in the target language. These systems represent the switch to statistical analysis pioneered by IBM and others, and after decades of work they still render some simple sentences as gibberish in the target language. It’s free, and still overpriced.
Accurate translation applications will also remove language barriers. Medical observations published one day in, say, Tokyo in Japanese could be accessible immediately in Russian and other languages for a patient’s treatment.
Dictation software, like the Nuance Dragon products, will type meaningless sentences that merely sound like the sentence you intended. Speech synthesizers will repeatedly mispronounce words while reading a book aloud. Siri on an iPhone, like other “intelligent assistants,” will mindlessly feed us answers we didn't ask for.
Search engines, including the dominant Google search, provide a list of responses so that you, the intelligent one, can weed out the dumb ones and investigate only the right ones.
IBM Watson, the machine that beat the best human players on the game show Jeopardy!, will give answers it doesn't understand, drawn from content provided to it. Like other big data solutions, it can extract information from data, but what is such a response really worth compared with genuine understanding?
You will see that all of these systems will improve markedly with the introduction of NLU systems.
The mess in A.I. is illustrated by the fact that language-understanding systems still don't work, despite that objective being set in 1956. There has been little apparent progress. Moore’s Law, the observation that processors keep getting faster and cheaper, doesn't help without a working program; speed alone isn't enough. We need to adopt a new approach to get the results we all want.
Today’s article introduces the Speaking A.I. series, which explores speaking artificial intelligence. Good science helps engineers build useful tools, but bad science inhibits progress, and history is littered with examples. Aristotle, perhaps the greatest scientist in history, popularized the geocentric model, with the Earth at the center of the universe. He also cemented the notion that the heart was the seat of intelligence, while the brain merely cooled the blood. That’s archaic nonsense, condemned to history, perhaps like the computational view of the brain.
Next time, I'll look at what Alan Turing would do to fix A.I.
This article is published as part of the IDG Contributor Network. Want to Join?