Target's voice-recognition effort: How natural does natural language have to be?

Target's voice-recognition trial misunderstands the allure of Amazon's Alexa and Apple's Siri, and also how shoppers think and communicate.


Target has started toying with a voice-recognition device positioned to compete against Amazon's Alexa, according to The Chicago Tribune. The question is how far Target can push natural-language comprehension. That focus, however, misunderstands the allure of Alexa and Apple's Siri, as well as how shoppers think and communicate.

The consumer attraction to Alexa and Siri involves proximity. In other words, in the normal lives of these consumers, these always-listening devices are right next to them. Indeed, Siri will always be as close as a user's phone, which is going to be really close an awful lot of the time. By the way, the ability of these devices to snap to life when you say their magic phrase means that they are constantly listening to you. No, that's not creepy at all. Can't think of any potential for massive privacy invasion there.

Let's bring this back to how consumers speak. For decades, the overwhelming challenge for voice recognition was the first step: having the software able to figure out what words the shopper was saying. Today, most of these devices have done quite well in mastering that skill, and some are even starting to differentiate one person's voice from another. It's far from foolproof, but it's a nice touch.

Now, however, comes the much harder part, which is understanding intent and meaning. Think that's easy? Try some Google searches and see how well it deals with complicated questions.

Siri — and Alexa is at roughly the same level — can't even master some obvious logical connections. For example, I just told Siri that I want to buy some apples. In context, most people would interpret that to be the fruit. Not Siri. It referred me to the Apple Store. I then clarified and said "I want to buy apple the fruit." I swear it then recommended the Apple Watch. It wasn't until I said "I want to buy some apple that is the fruit," that it gave me the info I sought.

Next up: clothing. I told Siri simply that "I want to buy a tie," and it understood the words correctly, typing out my question as I spoke. It showed locations of Bow Tie Cinemas. Trying to be helpful, I said, "I want to buy a tie to wear." Again, it showed me movie listings. Four attempts later, it finally showed me ties when I said "Clothing tie," and even that phrase took three tries. The first two times, it "helpfully" corrected "clothing tie" to "closing time" and asked me for the name of the business.

You get the idea. And those are relatively easy requests. If Target wants to get into retail voice recognition, it will need to deal with sentences such as "Limiting yourself to stores within 30 minutes of me, find me a dress in XX size and XX color and XX style." When a shopper can ask Siri that kind of question while driving home from work, we will be getting somewhere.

All of that was about a mobile-based voice-recognition item. Target is working with a Chicago company called AddStructure and is toying with a device that would sell its products and partner offerings. The Tribune story said that Target "is already running a six- to nine-month pilot and [AddStructure] plans to launch pilots with L’Oreal, Under Armour and online antiques marketplace 1stdibs in October."

That raises a rather huge problem: how to get those devices into immediate proximity with users. At one level, Target would get into its customers' homes the same way that Amazon does: by making it an ultra-easy way to shop. Here's the problem. At home, shoppers have easy ways to search online, through laptops/desktops and tablets/smartphones. Alexa has had limited success by controlling lights and entertainment and tons of other things, where shopping is a side benefit.

How could Target compete with that? A much better question would be "Why would Target want to compete with that?"

Voice recognition is valuable when the shopper is in an environment where other searches are much more difficult. That gives mobile devices a huge advantage, especially when integrated with a car. Stuck in traffic? Get some shopping done.

Hard to see this particular investment paying off. But I have to run now to a meeting. I'll just throw on my coat and movie theater and grab an Apple Watch to eat on the way.

This article is published as part of the IDG Contributor Network.
