We're all getting more comfortable talking to devices these days, whether it means talking to Cortana, Google Now or Siri to check the weather forecast, asking Amazon Alexa which room your keys are in or telling Xbox to pause the video you're watching. But there's a voice dictation and control application that's been available for many years that is considerably more advanced.
Nuance's latest Dragon voice recognition for Windows now comes in several packages. Dragon 13 Home ($100) is for simple personal use; Dragon 13 Premium ($200) adds email, to-dos and other document-related features; Dragon Professional Individual ($300) is for business users who need features such as transcription; and Dragon Professional Group adds IT admin options for deployment and tracking. For this review, I worked with Dragon Professional Individual.
(There is also a version available for the Mac, which was reviewed in a previous article.)
If you're not familiar with Dragon, it is an application that lets you use your voice both for dictation and control; for example, you can tell Windows to open Word and then dictate your document. It works directly with familiar applications such as Word, Excel, Outlook, WordPerfect and Notepad, and popular browsers such as Chrome, Firefox and Internet Explorer; you can also control some popular websites like Bing and Gmail using spoken shortcuts.
When you start dictating in applications that are not directly supported, a Dictation Box pops up automatically to recognize your text and let you transfer it into the application.
Command and control
Getting started with Dragon Professional is much less work than in older versions of the software. Once upon a time, you needed to read an entire chapter from a book into voice recognition software to get it to understand anything you were saying. Those days are gone. Setup and initial training took me less than 20 minutes, after which the software recognized my voice reasonably well.
You do need to pick both your region and accent; each region has its own set of accents. For the UK, that includes Australian, Indian and Southeast Asian as well as a "standard" British accent, whereas the U.S. and Canadian regions include not only "standard" English but also southern U.S. English, British English, Pakistani, Spanish and teen (because children's voices need a different speech model).
Cleverly, the text you read to set Dragon up is made up of tips about using the software, such as keeping a consistent distance away from the microphone, speaking at the same volume and keeping your natural tone of voice. (Nuance's acoustic models for voice recognition are based on recordings of people speaking normally rather than in the artificial tone of voice some people adopt when speaking to a computer. They also use samples of users' voices; if you don't want to upload your own speech and recognition data to Dragon anonymously, you can opt out during setup.)
Once installed, Dragon puts a floating window that it calls the DragonBar at the top of the screen to indicate that the voice recognition software is running.
Most of the time, the bar collapses to an icon that shows only whether the microphone is on and what it's listening for; hover your cursor over it to show the full controls. You can use your voice to open menus and choose commands on the DragonBar to change options in Dragon. You can also turn the microphone off with your voice, or put it to sleep (but of course, once the mic is off you can't turn it back on with a voice command). The DragonBar will also show tips -- for example, it will issue a warning if the application you're using doesn't allow dictation.
Once the DragonBar is up, you can start using commands like "Start menu," "Open Microsoft Excel," "Post to Twitter" or "Scroll down" to control your computer, or start dictating text within an application.
Whether you're dictating or controlling your computer, you can use a voice command at any point to ask Dragon what you can say. You can get a list of commands for navigation, formatting and punctuation as well as correction; making the most of the software is mostly a question of getting into the habit of using those rather than switching back to keyboard or mouse.
Accuracy can depend on application
One of the major drawbacks with Dragon is that not all software lets you dictate into it automatically.
You can open a new Word or Notepad document, start talking and have your words appear directly in your document. But if you prefer to work in an app like OneNote, then you have to dictate into the Dictation Box, which is a floating window that automatically appears when you talk at any application Dragon can't insert text into directly. What you say is recognized and shows up in the Dictation Box, but it's much less convenient than dictating straight into an application like Word or Outlook, because once you've finished speaking you need to remember to move what you've said into your application, using the Transfer button in the dialog.
In testing, that worked well with some apps -- I was able to dictate tweets even into Windows apps like Tweetium, although I couldn't control the app to post a tweet with a voice command.
But far too often, the same process didn't work with OneNote. Clicking the Transfer button in the Dictation Box dialog with the mouse correctly transferred the text into my OneNote document every time. But saying "Click Transfer" to do the same thing -- without going back to using mouse and keyboard to control the PC -- would often lose the text I had dictated. On one occasion I found the text in a different OneNote window that was open in the background, but other times it vanished completely. Having a voice command not only fail, but fail and delete dictated text, is less than impressive.
As mentioned before, Dragon works with most common browsers (but not Edge); you'll be prompted to install the Dragon extensions for Chrome, Firefox or Internet Explorer the first time you open the browser after installing Dragon. (I was surprised when Dragon repeatedly mis-recognized Bing as "being.")
While you can open a browser and navigate the interface with voice commands, you can also tell Dragon directly to search the Web for specific keywords. You can also use spoken searches for news, maps, photos, video or even specific sites such as eBay, MSN, YouTube, Facebook, Twitter and Wikipedia. That opens a dialog box where you can check that it recognized the key words correctly (to avoid potentially embarrassing results), but again I found that I sometimes had to manually click using the mouse rather than say "Select" in the dialog box to get the search going.
You can also control Web apps like WordPress or Facebook Messenger -- although I had variable success with these. Outlook.com was particularly difficult to drive with voice commands; I could dictate an email message, including the subject, and select the recipient from the address book, but no matter how many times I said "New" on the Outlook home screen I couldn't actually create a new email with voice commands. I could sometimes delete email messages, but other times -- as with trying to create a new email -- Dragon would show numbers overlaid on the Web page corresponding to possible commands, but no matter how many times I spoke the number corresponding to the Delete command, I couldn't get Dragon to actually send the command.
Controlling the Outlook desktop app was considerably more successful; I was able to reply to messages and even accept meeting requests using voice commands, although I could not switch to different folders. I was also able to navigate around Windows, including opening the Start menu and choosing applications to launch, although oddly the Start menu sometimes remained open even after the application launched.
Controlling Excel or Word with voice commands worked well when using the Ribbon (I could easily insert smart art or a chart -- in fact, I occasionally did it by accident), and there are handy voice shortcuts to insert the total of a group of numbers into a table or file a message in a folder. Confusingly, though, you need to use a completely different voice command to trigger the File menu ("open File tab" rather than "open Layout") using speech in the Office applications.
Dragon lets you move seamlessly between controlling an application and dictating documents when you work in an application like Word.
While dictating text, I found a few short words would occasionally get left out, and from time to time a word would be recognized correctly, then inserted twice. Quite often, Dragon would tell me that it needed me to repeat a phrase and then would immediately insert it correctly anyway (which was another way I ended up with duplicate words).
Some very similar-sounding words were recognized incorrectly, like "sync" and "sink" or "dot" and "dock" (which Dragon initially recognized as "dork"). More annoyingly, I would sometimes lose the final "s" on a word, getting "suggest" when I had said "suggests." On the other hand, if Dragon mis-recognized, say, "accept" as "except," then the correct word would almost always be listed as an alternate when I told it to correct the mistake.
When you notice a word or phrase that's been recognized wrong, you can say "Undo that" or "Delete that." If you say "Correct that" Dragon opens a Correction menu that shows a numbered list of alternatives; you can say the number to choose the one you want, or say "Spell that" if you don't see the correct word on the list.
If you need to correct something you didn't just enter, you can say "Select" and then the word or phrase that's wrong; if it's a word that appears in your document more than once, Dragon shows numbers in the text so you can correct other instances.
As with the rest of Dragon, you can control the Correction menu with voice commands, including adding new words to Dragon's vocabulary.
It's also easy to do some simple formatting as you dictate, by selecting the words you want to format (by speaking the "Select" command). You can create a numbered or bulleted list, put words into bold or italics or underline them, change the capitalization of words or put a phrase into quotes.
Almost real time
Generally, I found that the recognition quality was good. I was able to dictate large portions of this review into Microsoft Word reasonably quickly and without being slowed down much by recognition errors; there were only three or four instances of words that were so badly wrong that I later had problems working out what I might have originally said. (If you're stumped, the Correction menu has an option for playing back what you dictated, although that doesn't save as much information when you're using Web apps as when you dictate into a desktop app.)
I didn't need to pause frequently when speaking, although you will probably find that it takes some time for you to be completely comfortable composing out loud rather than on a keyboard.
Eventually, I found that I could dictate most of a sentence without a break on my Intel Core i5 laptop and Dragon would catch up with me soon after I got to the end of the sentence and stopped talking, while I was thinking about what to say next. This is close enough to real time that most users should be able to talk in phrases and sentences rather than a word at a time, and still keep an eye on how accurate the recognition is.
You do need to minimize background noise though. If there is music playing or people talking elsewhere in the room, or if a pet is making noise, you're likely to get far more errors. And if you accidentally leave the microphone on while you're having a conversation, what you get is a particularly abstract form of poetry.
The most disconcerting thing is likely to be getting used to talking to your computer (and hearing your own voice) instead of typing on a keyboard. The times when spoken corrections went wrong occasionally left me in a loop where the commands I used to try and correct the mistake were recognized as words instead. It was sometimes easier to drop back to the keyboard briefly just to fix the problem -- but I ran into this far less often than I did in earlier generations of the software.