The limitless potential of mobile gesture communications — and how it will trip up users

Gesturing in the air near a mobile device is going to become the preferred mode of interaction. Long term, ease of use will soar, but before we get there, expect a lot of user errors.


In the world of corporate IT, technology advances often have a "one step forward, two steps back" feeling. Try to boost security by insisting on more robust passwords that have to be changed regularly, and users will write them down more.

This stutter-step nature of progress is going to be a key factor in the next generation of mobile interfaces, as gesturing in the air near the device becomes the preferred mode of interaction. Such gesturing has the potential to allow for an order-of-magnitude improvement in ease of use, with more diverse commands available, but before we get there, it has the much greater near-term potential for generating massive user errors.

Let's be clear, though: Short-term pain should never be used as the reason for avoiding long-term benefits. Few true advances have a clean linear path to better productivity; a learning curve — and user resistance to the new and different and initially uncomfortable — is inevitable. Still, enterprise IT must be realistic about those hurdles and prepare for them.

One interesting approach to this is how Apple has rolled out 3D Touch, its effort to allow a user to communicate something different by pressing on a touchscreen pixel hard instead of lightly. You'll note that Apple has, thus far, not used 3D Touch to do anything significant, and certainly nothing irreversible. A soft touch can open an app, for example, where a hard touch will mark all apps for potential deletion. But a user mistakenly pressing hard won't, for example, permanently delete data. By limiting the associated 3D Touch actions to fairly innocuous commands, Apple is protecting itself from destruction fueled by user error. After all, one user's light touch is another user's hard press.

This brings us back to gesture detection. Microsoft just won a U.S. patent for gesture detection for its tablets, and Apple has long expressed interest in touchless user interactions. (By the way, can the industry get together on what gesture detection is called? Apple and Microsoft have opted for "touchless," but Accenture has been using "touchless" to refer to its automated AI efforts that have nothing to do with interface. Come on, guys, let's agree on a name.)

There are many versions of gesture-detection, but the potential goes beyond what these patents initially cover. And it even goes beyond what Minority Report depicted. In essence, it's the ability to use all of the air surrounding your device to gesture. Envision a mobile device that can understand sign language as easily as speech. This lets users work with a true 3D environment, not just the flat, small touchscreen of most mobile devices, allowing for a massive number of commands. A gesture done two inches from the device could mean something completely different from the exact same gesture performed six inches from the device.
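To make the distance idea concrete, here is a minimal Python sketch of how the same gesture could map to different commands depending on how far from the device it is performed. The band boundaries, gesture names and command names are all hypothetical, chosen only for illustration; no vendor has published a scheme like this.

```python
from typing import Optional

# Hypothetical distance bands, in inches from the device: (low, high, label).
BANDS = [(0.0, 4.0, "near"), (4.0, 10.0, "far")]

# Hypothetical command table keyed by (gesture, band).
COMMANDS = {
    ("swipe_left", "near"): "previous_page",
    ("swipe_left", "far"): "previous_app",
}


def interpret(gesture: str, distance_in: float) -> Optional[str]:
    """Return the command for a gesture at a given distance, or None."""
    for low, high, band in BANDS:
        if low <= distance_in < high:
            return COMMANDS.get((gesture, band))
    # Outside every band: ignore the movement rather than guess.
    return None
```

Here, interpret("swipe_left", 2.0) and interpret("swipe_left", 6.0) return different commands for the identical motion. That is exactly the expressive power described above, and also exactly where a user who misjudges the distance will trigger the wrong command.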

For that matter, why limit the human pointer-equivalent to hands and arms? What could a tilt of the head indicate? A twist of the torso? The theoretical potential is for a vast new language solely expressed via gesturing and movement.

On the plus side, this could give programmers and engineers very different environments in which to take and share input. But the downside, and it's a very non-trivial downside, is that coders are likely to very quickly go far beyond users' comfort zones.

This problem will build on itself. As more errors materialize, user dissatisfaction will soar, and user comfort will plummet. In short, pushing gesture-based communications too quickly will become self-defeating.

Let's drill into what Microsoft and Apple have talked about. Apple's TrueDepth camera, which it is currently using for its Face ID effort, has vast potential to deliver the exact kind of proximity awareness that gesture communications will require. The Patently Apple piece referenced earlier does a good job of explaining Redmond's approach: "Microsoft's granted patent covers visually detecting touchless input. A tracking system including a depth camera and/or other source is used to receive one or more depth maps imaging a scene including one or more human subjects. Pixels in the one or more depth maps are analyzed to identify non-static pixels having a shallowest depth. The position of the non-static pixel(s) is then mapped to a cursor position. In this way, the position of a pointed finger can be used to control the position of a cursor on a display device. Touchless input may also be received and interpreted to control cursor operations and multitouch gestures."
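The steps in that patent summary can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not Microsoft's implementation: depth maps are plain lists of per-pixel distances, a pixel counts as "non-static" when its depth changed between two frames by more than a hypothetical threshold, and the function name and parameters are invented for this example.

```python
from typing import List, Optional, Tuple

DepthMap = List[List[float]]  # per-pixel distance from the camera; smaller is closer


def find_cursor(prev: DepthMap, curr: DepthMap,
                screen_w: int, screen_h: int,
                motion_threshold: float = 0.02) -> Optional[Tuple[int, int]]:
    """Map the closest moving pixel in the depth map to a screen position."""
    rows, cols = len(curr), len(curr[0])
    best = None  # (depth, row, col) of the shallowest non-static pixel so far
    for r in range(rows):
        for c in range(cols):
            # Assumption: "non-static" means the depth changed between frames.
            if abs(curr[r][c] - prev[r][c]) <= motion_threshold:
                continue
            if best is None or curr[r][c] < best[0]:
                best = (curr[r][c], r, c)
    if best is None:
        return None  # nothing moved, so leave the cursor alone
    _, r, c = best
    # Scale depth-map coordinates to display coordinates.
    x = round(c * (screen_w - 1) / max(cols - 1, 1))
    y = round(r * (screen_h - 1) / max(rows - 1, 1))
    return (x, y)
```

Even this toy version hints at the error problem: anything that moves closer to the camera than the user's fingertip, such as a sleeve or a coffee cup, would take over the cursor.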

Although that is a serious leap beyond today's mobile interactions, it is really only replacing cursor clicking with pointing. In short, it's merely replicating existing functionality, albeit with a different communication method. These patent applications, which themselves tend to be futuristic efforts at mapping the near-term possibilities of the technology, haven't even begun to envision the true potential of spatial gesture communications. Since 2016, BMW has allowed for a few gesture commands in some of its higher-end models, but the feature is rudimentary. At this stage, it's more of a cooler way to communicate than a better way. Still, it's a nice start.

Now, about those users. As the canvas on which users paint with gestures grows and flows in the air, the chance of communicating something unintended soars. Errors are bad for all of the obvious reasons, but they're also bad because errors will discourage use of the new interface and will sour users on the overall experience. Engineers and coders are always going to be attracted to the next shiny object (which is a good thing, unless you're sad that punch cards are gone), but that can mean that they'll push the technology faster than users are ready for.

Charles Mauro, a user interface specialist who has testified as an expert witness for Apple, Nike and Dyson, said the probability of seeing big user error rates is high, especially in the beginning. "The basic problem with these types of gesture interfaces is that they dramatically increase input and data correction errors compared to more feedback-rich interfaces," he said.

The key cause for that will be the lack of any tactile or other feedback that a user's command was properly recognized. "When you remove the feedback modules, you impact performance and you impact accuracy and your ability to detect an error," Mauro said.

Mauro also cautioned that, when compared with traditional mobile touchscreen interfaces, gesture-based interfaces can't be used for a very long time because people's arms get tired. "You get arm fatigue when you're holding your arm up rather than simple touch," Mauro said.

Robert Ferguson is a product marketing director dealing with gesture recognition at Texas Instruments. Ferguson said that his team's testing has shown high accuracy, but they have limited their gestures to broad strokes.

"For users accessing gesture recognition, the precision required is not high, meaning if the gesture recognized is a swipe motion, our findings have been as long as the swipe is a single contiguous swipe in the correct direction, it works nearly all the time," Ferguson said. "We believe there are multiple places within Industrial market where contactless gesture recognition will be valuable, including the ability for the sensor to be placed behind plexiglass, glass, wood, drywall and still work well. One case is in an operating room where you have multiple images from a camera displayed on a monitor and you would like to zoom in or zoom out, and physical contact with the monitor may not be safe. In some cases, voice commands are affected by accents or other audible variances such as pitch or intonation, which could be misunderstood."
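Ferguson's "single contiguous swipe in the correct direction" rule is easy to picture in code. The Python sketch below is an assumption about what such a check might look like, not Texas Instruments' algorithm: it treats a swipe as a series of hypothetical (x, y) position samples, requires a minimum travel distance, and rejects any motion that reverses along its main axis.

```python
from typing import List, Optional, Tuple


def classify_swipe(points: List[Tuple[float, float]],
                   min_travel: float = 50.0) -> Optional[str]:
    """Return 'left', 'right', 'up' or 'down', or None if not a clean swipe."""
    if len(points) < 2:
        return None
    dx = points[-1][0] - points[0][0]
    dy = points[-1][1] - points[0][1]
    horizontal = abs(dx) >= abs(dy)
    travel = dx if horizontal else dy
    # Too short to count as a deliberate swipe (threshold is hypothetical).
    if abs(travel) < min_travel:
        return None
    axis = 0 if horizontal else 1
    sign = 1 if travel > 0 else -1
    # "Contiguous": no step may move backward along the dominant axis.
    for a, b in zip(points, points[1:]):
        if (b[axis] - a[axis]) * sign < 0:
            return None
    if horizontal:
        return "right" if sign > 0 else "left"
    return "down" if sign > 0 else "up"
```

A check this forgiving only works because the gesture vocabulary is tiny. With four broad strokes there is plenty of room between commands; a richer vocabulary would need tighter tolerances, and tighter tolerances are where the error rates Mauro warns about would come from.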

Gesture communications is absolutely going to be the interface that corporate developers will be exploring within the next two years. Taking it slow, allowing users' comfort level and accuracy to rise gradually, is the cautious and wise approach.
