Talk Back to Your Mac
Speech recognition shows promise for hands-free input, but it could use some more polish.
Rebecca Freed, PC World
Comments or questions? Drop a line to The Mac Skeptic.
Most of us interact with our computers primarily with a keyboard and mouse, particularly when composing text, but there are lots of reasons you might not want to use your hands while you're computing. You might want to give your hands and wrists a break, or use them for another activity while using your computer. I wish I could knit and browse the Web at the same time, for example.
Speech recognition has been around a long time, and it's one of the main ways to operate a computer without using a keyboard. But we tend to think of it as a tool for people who can't use their hands, as well as one that is difficult to use. I wanted to find out whether voice recognition has become a reasonable alternative to keyboarding for occasional use--and check out the state of the software on the Mac.
Mac OS X comes with basic speech recognition built in, but it's limited to navigation and control commands for the system. You can't use it for text dictation; for that you need a voice recognition program with a large vocabulary database. I tried two such programs: IBM's $125 ViaVoice for Macintosh and MacSpeech's $149 IListen.
Each product ships with a headset microphone, which improves accuracy greatly over using a computer's internal microphone. Both products also have a learning phase in which you read aloud one or several fairly extensive passages of text, so the program can register your pronunciation of specific words in its vocabulary. Both also can analyze samples of your text documents, to learn words and phrases that you typically use.
Common Complications
The training phase helps to increase the recognition rate of individual words, but reading a prepared script at an even rate is quite a different experience from the stops and starts of composing a text. When you're writing, you pause to think, and often rephrase. Because of years of practice, this process doesn't feel strange with either paper and pencil or keyboard; but it does feel strange when spoken aloud. If you've learned to dictate to a voice recorder, this process is probably easier for you. In fact, IListen can transcribe sound files from digital voice recorders, which could be handy if you do your best thinking while away from the computer.
Another complication is that just like hands and wrists, voices get tired. While training the software, my voice became fatigued because of the steady nature of the reading--though you can pause the training modules at any time.
Also, although typing punctuation as we write is second nature to most of us, it feels unnatural to say "comma" to indicate a pause, or "period, paragraph" to signal the start of a new thought. Unfortunately, speech recognition software doesn't understand cadence or intonation, the verbal signals that punctuation represents. So you can't just pause or add emphasis with your voice.
And in return for the use of your hands while you compute, when you use speech recognition software you can't eat or drink while you're composing--unless you push the microphone away or pause the software. I definitely noticed this when I stopped what I was doing to take a sip of coffee.
One more troubling thing that happened with both applications: If I forgot to put the microphone to sleep when I was through using voice recognition, it picked up ambient noise in my office and put random words into my documents. Having words appear as if typed by an invisible hand is quite disconcerting--ghost in the machine indeed. The lesson I learned is to always unplug the microphone, or say "Go to sleep."
ViaVoice for Macintosh
This IBM program, which is distributed by ScanSoft, hasn't been updated for about three years. In fact, it supports only OS X 10.1.x and 10.2.x, so I could not test it adequately on my OS X 10.3.x system. IBM and ScanSoft expect to release an upgrade that supports 10.3 (Panther) early next year. ScanSoft could not tell me anything about its plans for compatibility with the upcoming version 10.4 (Tiger). When the ViaVoice upgrade is available, I will evaluate it more completely.
I found ViaVoice quite easy to install and to train to recognize my pronunciations, but it wouldn't accept editing commands. While most words I spoke ended up in the document correctly, I got tripped up when trying to go back and correct while dictating.
There was a slight delay from the time I spoke a word until it appeared in the document (while the program searched through its vocabulary), and I would watch the text file and wait for the words to appear. This interrupted my train of thought and made what I wrote rather choppy and artificial sounding--exactly the opposite of the uninterrupted flow of ideas and fluid language I was trying for.

ViaVoice recognized most of the words I spoke, but it was impossible to execute commands, such as those used for text editing.
I also started to edit while I was writing, as I do when typing, and then got bogged down completely. The major bug I encountered under OS X 10.3 was that I couldn't get ViaVoice to switch from dictation mode (transcribing what I said) to command mode (executing commands that I spoke). As a result, I ended up with about five sentences of text, followed by a screen full of "Scratch that." "Delete that." The words "scratch" and "delete" should have let ViaVoice know I wanted a command executed.
After this experience, I faced away from the screen and just spoke, to avoid stopping while the program recognized my speech and to short-circuit that impulse to edit while dictating.
I'm not ready to give up on the program, considering that the voice recognition engine in ViaVoice is quite good. The distributor, ScanSoft, has been actively upgrading its Windows product, Dragon NaturallySpeaking, so I'll be interested to see the developments with ViaVoice.
MacSpeech IListen

IListen's voice training mode shows how you should speak punctuation marks while dictating.
IListen is an up-to-date product; I looked at version 1.6, which was released late this year. It was the more usable program overall, but it didn't recognize my pronunciation as readily as ViaVoice did. I spent more time training it, and I repeated myself frequently during the training. There is an extensive microphone calibration routine in IListen as well, which helped recognition accuracy.
I found IListen's interface more intuitive, though both programs have similar small consoles that float on your desktop. IListen's console has an animated photo of a dog (the developer's golden retriever) which indicates whether the application is listening to you. There's also a small dialog box that shows the words that the program registered. These tell you at a glance whether your dictation or command is successful.
In actual dictation, IListen was reasonably accurate; and its editing commands worked for me about half the time. IListen lets you edit by opening a correction window, which is designed both to correct your dictated text and to train IListen to better recognize your pronunciation. I found that the suggestions in the correction window tended not to include the word I wanted, so I needed to spell out the correct word, then either click the Done button with the mouse or say "commit corrections." It would have been more intuitive to speak the word "done."
The commands in IListen sound like natural English or are actual phrases you might use, but you can't say just anything. For example, you can say "scratch that" or "forget that" to delete the last phrase transcribed, but "backspace" or "stop" won't work. You have to take the time to memorize its commands--you can spend a lot of time trying to guess the right phrases by trial and error.
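To make that concrete, here is a minimal sketch in Python of how a fixed command vocabulary might behave. It is not how IListen or ViaVoice is actually implemented; the command table, the Dictation class, and its method names are all invented for illustration. The only assumption is that an utterance counts as a command when it exactly matches a known phrase, and that anything else is transcribed as text.

```python
# Hypothetical sketch: none of these names come from IListen or ViaVoice.
COMMANDS = {
    "scratch that": "delete_last_phrase",
    "forget that": "delete_last_phrase",
    "go to sleep": "sleep",
}


class Dictation:
    def __init__(self):
        self.phrases = []      # transcribed phrases, in order
        self.listening = True  # False once the microphone is asleep

    def hear(self, utterance):
        """Handle one recognized utterance; return what was done with it."""
        if not self.listening:
            return "ignored"
        action = COMMANDS.get(utterance.strip().lower())
        if action == "delete_last_phrase":
            if self.phrases:
                self.phrases.pop()
            return "command"
        if action == "sleep":
            self.listening = False
            return "command"
        # Anything that is not an exact command phrase is treated as dictation.
        self.phrases.append(utterance.strip())
        return "text"

    def text(self):
        return " ".join(self.phrases)


doc = Dictation()
for spoken in ["Dear Rebecca", "thanks for the ride", "scratch that", "backspace"]:
    doc.hear(spoken)
print(doc.text())  # prints: Dear Rebecca backspace
```

In this sketch, "scratch that" removes the last phrase, while "backspace" is not in the table and simply lands in the document as a word, which is the kind of result you get when you guess at a command.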

The "What Can I Say" window shows commands you can speak in IListen.
The "What Can I Say" window lets you browse for the command you need, but then you're using a mouse or keyboard to find the correct voice command. It's awfully easy to reach for the mouse when learning how to use these programs.
IListen unexpectedly shut down a couple of times while I was switching tasks, which is rather annoying, but I didn't lose the text I'd dictated.
The Verdict
In a way, adjusting to using a microphone feels like using a mouse for the first time, or learning to type. It's an imperfect input method, but practice makes it easier. The editing tools in both of these speech recognition programs, however, still fall well short. I would need a compelling reason--such as a hand injury or a professional need to dictate while doing something else--to spend more than $100 to purchase either one.
The jury is still out on ViaVoice, but I probably will keep using my review copy of IListen because I expect the program will work better the more I practice.
