Dragon NaturallySpeaking 12 Premium review: Accurate voice recognition

If you prefer to dictate your documents or use voice commands, this is the place to go.

By Lamont Wood
November 28, 2012 06:00 AM ET

Computerworld - You decide what you want to say. You say it. The words appear on the screen.

Forget the frustrating months it took you to learn typing. In fact, you can forget that writing involves any particular effort. Today's powerful, multi-core computers, combined with the latest speech recognition software and a good microphone, can produce results that are, frankly, startling.

The technology has gotten so good, in fact, that the weak link in the system appears to be the user's ability to dictate. While this may sound like a trivial point, dictation turns out to be a distinct skill that involves factors that are not intuitive. But once the skill is mastered, keyboarding seems painfully primitive.

Dragon NaturallySpeaking corrects a dictated sentence from Shakespeare's Hamlet: The word "town" is changed to "tongue." In this case the correct alternative is second on the list and can be designated by saying "Choose two."

While newer mobile speech recognition apps such as Siri and Google Now have grabbed most of the headlines, one of the longest-running and best-known speech recognition software packages is Dragon NaturallySpeaking from Nuance.

There are a variety of versions available. For this review, I tried out Dragon NaturallySpeaking 12 Premium for Windows PCs, available for $199.99. Other versions include a Home Edition for $99.99, which does not integrate with spreadsheets or support off-line dictation and has no playback facility; a Professional Edition with enterprise-level administrative, customization, and multi-user features for $599.99; and a similar Legal Edition with a law office vocabulary, also for $599.99. There is a version for the Mac called Dragon Dictate ($199.99), along with specialized Mac products for legal and medical workers.

Dragon also has several apps for mobile devices, including Dragon Dictation for iOS devices and Dragon Go, an audio search app for iOS and Android.

A bit of background: I'm not new to speech recognition. In fact, I've been using PC-based speech recognition on and off for nearly two decades to alleviate the stresses of keyboarding. At first, speech recognition packages were more like frustrating toys with maddening limitations, but they have steadily improved over time.

The crossover point was probably NaturallySpeaking version 8 in 2004, when the utility of speech recognition finally outweighed its limitations. But limitations remained: speech recognition was still more reliable with long words than with short ones (making it popular with doctors); misinterpreted words were often rendered as commands with random and startling results (Bill Gates himself was the victim of this at a live demo in 2006); the software's demand on the hardware was nontrivial (so that switching between documents could be painfully slow); and the software could get confused to the point that it stopped listening.

But with version 12, these factors have faded into the background (although they haven't entirely disappeared). For example, you can dictate effectively at about half the speed of an auctioneer -- should you prove able to do so. Assuming that you stay focused while dictating, the error rate is now trivial (see sidebar).

An important part of that new reliability is the noise-canceling headset microphone supplied with the software, which does not react to background noise. It made things a lot easier for me -- with my previous microphones, I had to turn them off every time I stopped speaking to keep them from picking up other sounds. The Home and Premium versions come with a two-speaker analog headset, while the Professional and Legal versions come with a one-speaker USB headset.

The software

Version 12 is outwardly not very different from previous versions, with the same interface and basic command scheme. The vendor claims that accuracy out-of-the-box is 20% better than that of version 11, and in my testing, that did seem to be the case. New features include an interactive tutorial, Bluetooth support, and enhanced support for Gmail and Hotmail.

Dragon installs from a CD; during the installation, it asks a number of questions about your age, gender and accent. (It also tests the microphone, and in my case was not happy until I had tried several ports.) It then listens to your voice during a short training session, taking about five minutes. (With early versions the training took easily 45 minutes.) You have the option to let it examine your document folders and outgoing email folders to look for commonly used words.

When invoked, Dragon puts a thin control bar across the top of the screen. You click an icon in this control bar to turn on the microphone. When you start to talk, text appears at the cursor. If you talk quickly, the text may fall as much as a sentence behind, but I found it invariably caught up fairly quickly. Punctuation marks must be pronounced.

If word X is misrecognized, you can adjust the software by saying "Correct X." Word X will then be selected and Dragon will present a list of possible corrections. If none of them match, you can spell the desired word. Thereafter, Dragon is more likely to recognize the word correctly. (With version 12, I found that one correction was always enough.)

On the other hand, if you simply decide you want to change word X, you say "Select X." Dragon assumes you want to change it as an editorial decision (rather than because there was a mistake), and will not alter its later recognition based on your change. You can also select arbitrary phrases, whole sentences or paragraphs in order to delete, move or reformat them by saying things like "select next three words," "select previous paragraph" or "select current line."
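
For readers who think in code, the difference between "Correct X" and "Select X" can be sketched as a toy model. This is not how Dragon works internally and it uses no Nuance API: the ToyRecognizer class, its weights and the made-up "tung" sound key are all assumptions invented for illustration. The only point it makes is that a correction feeds back into later recognition, while a selection only edits the text.

```python
class ToyRecognizer:
    """Hypothetical toy model; not Dragon's actual design or API."""

    def __init__(self, candidates):
        # candidates maps a "heard" sound to {word: weight}.
        # Copy the inner dicts so the caller's data is not mutated.
        self.candidates = {h: dict(ws) for h, ws in candidates.items()}

    def alternatives(self, heard):
        # Candidate words, most likely first (ties broken alphabetically).
        ws = self.candidates[heard]
        return sorted(ws, key=lambda w: (-ws[w], w))

    def recognize(self, heard):
        # The word that would appear on screen.
        return self.alternatives(heard)[0]

    def correct(self, heard, intended):
        # "Correct X": a recognition error, so the model is adjusted
        # and the intended word wins the next time this sound is heard.
        ws = self.candidates[heard]
        ws[intended] = max(ws.values()) + 1.0

    def select_and_replace(self, text, old, new):
        # "Select X": an editorial change. Only the text is edited;
        # the recognition weights are left alone.
        return text.replace(old, new, 1)


if __name__ == "__main__":
    rec = ToyRecognizer({"tung": {"town": 0.5, "tongue": 0.4, "ton": 0.1}})
    print(rec.recognize("tung"))        # first guess
    print(rec.alternatives("tung")[1])  # what "Choose two" would pick
    rec.correct("tung", "tongue")
    print(rec.recognize("tung"))        # guess after the correction
```

In this sketch, calling select_and_replace any number of times leaves recognize unchanged, which mirrors the behavior described above: editorial changes do not retrain the software.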

