It's easy to dismiss virtual assistants as parlor tricks, irrelevant gimmicks or even fatally flawed technology.
Microsoft CEO Satya Nadella didn't help matters last week when a demo of his company's Cortana virtual assistant failed spectacularly. At Salesforce.com's Dreamforce event in San Francisco, Nadella tried to show off Cortana as a business tool by asking it: "Show me my most at-risk opportunities." Cortana understood the request as: "Show me to buy milk at this opportunity."
The demo failed not because Cortana is fatally flawed and not because virtual assistants are bad interfaces. The demo failed because it was a demo. (Pro tip: Never demonstrate any voice system while using a microphone on stage in a crowded hall where everyone in the audience is using wireless devices.)
In fact, Nadella knows something that the public does not, which is this: The technology behind virtual assistants like Cortana is about to transform our lives and change the world as we know it. This change will be simultaneously wonderful and horrible. But mostly wonderful.
Using a good virtual assistant, in the best of cases, feels like talking to a person. It seems like a single technological experience. In fact, it involves a long list of very different technologies, including these:
- Speech recognition (the ability to recognize talking, colloquialisms and accents while ignoring background speech and non-speech sounds -- and doing it all in real time, while the user is still talking).
- File compression and transfer (the speed at which the voice file can be packed up and shipped off to the data center for processing).
- Artificial intelligence (the ability of the servers and software to "understand" the user input and decide what information to offer as the response).
- Data sources (access to knowledge bases, computational engines and other data to inform the response).
- User context (information extracted from email, calendars, contacts, location, history and whatever's on-screen at the moment).
- Conversation engine (the ability to phrase the response with variety, colloquial speech, humor and context).
- Agency (the ability to do things on behalf of the user, such as make reservations, reach out to contacts, buy things, launch apps and execute commands in those apps).
- Proactivity (the ability to choose what to do and when without being prompted by the user).
Each of these elements is separate from the others and is backed by various methods and technologies. Most importantly, each is rapidly evolving and developing its own marketplace, which gives a choice of options to any company that wants to deploy them as part of an interface to whatever product it happens to offer.
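To make the "separate elements" idea concrete, here is a minimal sketch in Python of an assistant assembled from swappable parts. Everything in it is hypothetical: the `Assistant` class, the toy functions and the canned weather answer are illustrations, not any vendor's actual design. The "speech recognizer" simply treats the audio bytes as text, since real recognition is far beyond a short example.

```python
from dataclasses import dataclass, replace
from typing import Callable, Dict


@dataclass(frozen=True)
class Assistant:
    """A hypothetical assistant built from independent, swappable stages."""
    recognize: Callable[[bytes], str]              # speech recognition
    understand: Callable[[str], str]               # A.I.: text -> intent
    lookup: Callable[[str, Dict[str, str]], str]   # data sources + user context
    phrase: Callable[[str], str]                   # conversation engine

    def respond(self, audio: bytes, context: Dict[str, str]) -> str:
        text = self.recognize(audio)
        intent = self.understand(text)
        fact = self.lookup(intent, context)
        return self.phrase(fact)


def toy_recognize(audio: bytes) -> str:
    # Stand-in only: pretend the audio bytes are already UTF-8 text.
    return audio.decode("utf-8").lower()


def toy_understand(text: str) -> str:
    # Keyword matching in place of real language understanding.
    return "weather" if "weather" in text else "unknown"


def toy_lookup(intent: str, context: Dict[str, str]) -> str:
    # Canned answer; a real system would query a data source here.
    if intent == "weather":
        return f"It looks sunny in {context.get('location', 'your area')}"
    return "I could not find anything"


def plain_phrase(fact: str) -> str:
    return fact + "."


def chatty_phrase(fact: str) -> str:
    return "Sure thing! " + fact + "."


assistant = Assistant(toy_recognize, toy_understand, toy_lookup, plain_phrase)
# Swap in a different conversation engine without touching the other stages.
upgraded = replace(assistant, phrase=chatty_phrase)
```

Because each stage is just a function with a fixed input and output, a company could in principle buy or build a better recognizer or a wittier conversation engine and drop it in while leaving the rest alone.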
Imagine, for example, Siri with, say, 25% better speech recognition, 25% faster processing of the request, 25% brainier A.I., 25% more data and so on. This is what will actually happen in a few months, and the improvements will keep coming indefinitely.
As a result, virtual assistants are moving rapidly from a gimmick that hardly anyone uses to, in a year or two, the main interface people use for anything connected to a computer.
That's the reason virtual assistant technology is the best interface for both the Amazon Fire TV and Apple's upcoming version of the Apple TV.
There are also powerful new apps on the horizon, such as Hound.
Virtual assistants will be added to an increasing variety of other apps, such as the M feature of Facebook Messenger.
Virtual assistant appliances like the Amazon Echo will start emerging in greater numbers.
These implementations are just the beginning. Soon, a vast array of products will have powerful virtual assistants as the main interface.
And the next place where A.I. shows up is in a dollhouse.
I'm dreaming of an A.I. Christmas
In two months, Mattel is set to release a version of Barbie that enables children to speak through the doll to a vast and sophisticated artificial intelligence engine.
The $74.99 Hello Barbie doll has a battery in each leg and a mini-USB charging port on her lower back. Her necklace contains a microphone and her belt buckle contains a button that, when pressed, activates the microphone. Inside Hello Barbie's torso is a mini computer and Wi-Fi antenna.
When a child talks to Hello Barbie, the doll acts like an iPhone, recording, compressing and transmitting the sound file to a remote server (which is housed in a data center owned by a company called ToyTalk). The speech is analyzed and a response selected, then the instruction to say that response in Hello Barbie's voice is transmitted over the Internet and the home's Wi-Fi to the doll.
Hello Barbie is like Siri, except the voice is Barbie's, and the responses were all written by ToyTalk and Mattel.
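The round trip described above can be sketched in a few lines of Python. This is a toy under loud assumptions, not ToyTalk's or Mattel's actual code: the function names and the scripted lines are invented for illustration, the "recording" is plain text bytes standing in for audio, and keyword matching stands in for real speech analysis. Only the compression step uses a real tool, Python's standard `zlib` module.

```python
import zlib

# Hypothetical pre-written lines; the real scripts are not public here.
SCRIPTED_RESPONSES = {
    "hello": "Hi there! What should we play today?",
    "music": "I love music! What's your favorite song?",
}
FALLBACK = "That's interesting! Tell me more."


def doll_package(recording: bytes) -> bytes:
    """Doll side: compress the recording before sending it over Wi-Fi."""
    return zlib.compress(recording)


def server_select_response(payload: bytes) -> str:
    """Server side: unpack the recording and pick a pre-written line."""
    recording = zlib.decompress(payload)
    # Stand-in for speech recognition: the bytes are assumed to be text.
    transcript = recording.decode("utf-8").lower()
    for keyword, line in SCRIPTED_RESPONSES.items():
        if keyword in transcript:
            return line
    return FALLBACK


reply = server_select_response(doll_package(b"Hello Barbie"))
```

The point of the sketch is the division of labor: the doll only records and ships data, while every decision about what to say happens on the server, from a fixed set of lines somebody wrote in advance.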
When news about Hello Barbie first emerged, the public was shocked. Alarmist headlines talked about a "creepy" and "eavesdropping" presence in children's lives. Think of the children!
In reality, this is the interface today's children will never, ever be without.
And kids don't need Barbie to introduce them to A.I. virtual assistants and chatbots. Mom and Dad's phones have voice-interaction Siri, Google Now, Cortana or something else. The TV has or will soon have voice-interaction A.I. The family car will have it (as Apple's CarPlay and Google's Android Auto take over or lead the market for how people interact with their cars' dashboards). The family PC has it in the browser, or as a fundamental aspect of the operating system. The game console probably has it, too.
For children young enough to play with a Barbie, A.I. will always be an ever-present, ubiquitous banality -- just something that exists in the world like TV or Facebook.
In fact, look at a roster of the sure-fire hit gifts for this year's holiday season: the iPhone 6S and 6S Plus, the iPad Pro, the Amazon Echo, the Apple TV, the Amazon Fire TV, Android phones and tablets, Hello Barbie and the Star Wars robot (Sphero's BB-8). The robot is the only product on that list that doesn't come with instant access to supercomputer artificial intelligence. And it would be trivial for Sphero to add A.I. access to the app. In fact, the company probably will within the next year.
If you understand anything about how technology is about to change our world, you need to understand this: Artificial-intelligence virtual assistants are about to become massively and ubiquitously mainstream. The impact of that will be enormous.
How A.I. everywhere will change the world
There's no gentle way to prepare you for what's about to happen, so I'm just going to say it as plainly as I can: The greatest social impact of artificial intelligence is that a huge number of people will befriend and even fall in love with virtual assistants and prefer them to the company of real people.
It sounds like a sci-fi cliche, or the basis for the movie "Her," but in fact it's already happening on a massive scale.
More than 20 million Chinese users turn to Xiaoice for friendship and confidential conversation. Xiaoice is not designed primarily for productivity. It's more of a friend. The New York Times says that people talk to Xiaoice when they feel sad or need to confide something. One quarter of its users have told the A.I., "I love you" -- and this in a country where, culturally, it's not that common to say those words to a spouse or lover.
Microsoft created Xiaoice as an experiment on how to tap into the social media hive mind to simulate humanity. Even Microsoft has been stunned by how Xiaoice has taken off in usage.
It works by harvesting millions of conversations on Chinese social media sites -- essentially averaging what people say and how people respond to what people say. Xiaoice's responses feel very human because in fact they are.
Xiaoice also remembers what each user says and responds accordingly.
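For the curious, the general idea of answering with what people most often say in reply, while keeping a per-user memory, can be sketched as a toy in Python. This is emphatically not Microsoft's method, which the description above only hints at. The `ToyChatbot` class, its word-overlap matching and the sample conversations are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def words(text: str) -> set:
    return set(text.lower().split())


class ToyChatbot:
    """Hypothetical bot: replies with the most common human response
    to the harvested message that best matches what the user said."""

    def __init__(self, corpus):
        # corpus: non-empty list of (message, reply) pairs from real chats.
        self.replies = defaultdict(Counter)
        for message, reply in corpus:
            self.replies[message.lower()][reply] += 1
        # Everything each user has said. In this toy it is stored but
        # not yet used when choosing a reply.
        self.memory = defaultdict(list)

    def respond(self, user: str, message: str) -> str:
        self.memory[user].append(message)
        said = words(message)
        # Pick the harvested message sharing the most words with the input.
        best = max(self.replies, key=lambda m: len(words(m) & said))
        if not words(best) & said:
            return "Tell me more."
        # "Average" the crowd: use the reply people gave most often.
        return self.replies[best].most_common(1)[0][0]


bot = ToyChatbot([
    ("i feel sad", "I'm here for you."),
    ("i feel sad", "I'm here for you."),
    ("i feel sad", "Cheer up!"),
    ("good morning", "Morning!"),
])
answer = bot.respond("ann", "I feel so sad today")
```

Even this crude version shows why such replies can sound human: every line the bot says was originally written by a person responding to someone else.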
Xiaoice is the future for all of us. Personable, friendly, plausibly intelligent and caring A.I. that we talk to all day will be available to everyone, and we will love doing so.
Xiaoice-like chatbots will feel to us less like a human in our lives and more like a dog (loyal, selfless, always happy to see us) -- but a talking dog with access to all human knowledge.
Another tectonic shift is that voice-interaction A.I. will bring all the benefits of the Internet to the illiterate, the blind, the physically disabled and the, shall we say, technically disinclined (people who have no idea how to use a computer).
Voice-interaction A.I. assistants and chatbots will guide us, watch out for us, help, inform, entertain us and even befriend us.
Because they live in remote data centers in the cloud, they'll be constantly and rapidly improved without any action (such as buying something new) on the part of the user.
Artificial intelligence virtual assistants are already part of our lives. But very soon they will be far more advanced than ever before and they will be everywhere. They're going to change everything.
Brace yourself. The real A.I. revolution is like nothing you've ever experienced before, and it's going to blow your mind.