>> However, my DX4-100 not only has complete multimedia,
>> 3-D CAD modeling and publishing capabilities, it also
>> speaks and understands my voice commands.
>I've been talking to my Mac, and it has been talking back for years. I was
>using a program in 1987 that would read ascii text back to me. (Great for
>proofreading lists of numbers. I would follow along on the hardcopy while my
>Mac read off the list.)
>Voice navigation, only recently available on the PC, has been around almost
>seven years on the Mac.
I'd like to add that the technologies collectively known as PlainTalk,
which were introduced with the Mac AV models, were revolutionary in terms
of both speech recognition and voice synthesis. An article about PlainTalk
follows.
The Plain Truth About PlainTalk
by John San Filippo
Back when the original Star Trek television series first aired, it seemed
unbelievable - even ridiculous - to most people that Captain Kirk could
issue voice commands to his computer and get responses back in a normal
human voice. But what seemed like vivid Hollywood imagination less than
thirty years ago is now a reality on many desktop computers.
When Apple rolled out the Quadra AV models late last year, they included
almost too many new features to keep track of. There were built-in
interfaces for both video-in and video-out. There was input and output for
stereo quality sound. There was the GeoPort. And the power boosts from the
DSPs. Well, you get the picture. But the one feature that you seem to hear
the least about is the one that may ultimately have the biggest impact on
the future of computing, and that's the speech technologies collectively
known as PlainTalk.
PlainTalk consists of two basic components: text-to-speech processing,
where the computer converts typed text to audible words, and voice
recognition, where you speak to the computer and it understands you.
Text-to-speech processing is nothing new. Mac users have enjoyed MacinTalk
and the talking moose for years. But PlainTalk pushes this technology to a
whole new level by providing human-sounding voices that speak with normal
intonation.
This sounds simple enough, but isn't. As stated in a handout from Apple's
training department, "plain language English is converted into phonemic
representations for the individual words. The resulting sequence of
phonemes is converted into audible sounds by mapping of the individual
phonemes into a series of waveforms, which are sent to the sound hardware
to be played." Simple, eh? And that's just the short version.
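The pipeline the handout describes, plain text converted to phonemes and phonemes mapped to waveforms, can be sketched roughly in Python. The tiny phoneme dictionary and the sine-wave "waveform templates" below are invented stand-ins for illustration, nothing like Apple's actual lexicon or synthesis data:

```python
import math

# Hypothetical pronunciation dictionary; real systems use large lexicons
# plus letter-to-sound rules for words not in the dictionary.
PHONEME_DICT = {
    "plain": ["P", "L", "EY", "N"],
    "talk":  ["T", "AO", "K"],
}

# Hypothetical per-phoneme frequencies, standing in for stored waveform templates.
PHONEME_FREQ = {"P": 120.0, "L": 180.0, "EY": 250.0, "N": 200.0,
                "T": 140.0, "AO": 230.0, "K": 110.0}

def text_to_phonemes(text):
    """Convert plain-language text into a phonemic representation."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(PHONEME_DICT.get(word, []))
    return phonemes

def phonemes_to_waveform(phonemes, rate=8000, dur=0.1):
    """Map each phoneme to a short sine-wave segment and concatenate them,
    producing the sample stream that would go to the sound hardware."""
    samples = []
    for ph in phonemes:
        freq = PHONEME_FREQ[ph]
        n = int(rate * dur)
        samples.extend(math.sin(2 * math.pi * freq * i / rate) for i in range(n))
    return samples
```

The real synthesizer also has to handle stress, intonation, and smooth transitions between phonemes, which is where the "short version" in the handout leaves off.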
Voice recognition has also been around for a while. Until recently, however,
these systems have had prohibitive if not crippling limitations. One such
limitation is speaker dependence. In other words, speech recognition
software traditionally had to be customized for each user. This was
accomplished by having the user spend hours "training" the software.
PlainTalk, on the other hand, is speaker independent. This means that when I
use your 840AV, voice recognition will work just as well for me as it does
for you. To make sure they had all their bases covered, Apple had over 500
speakers from around the country speak over 200,000 utterances, the result
of which was a 40 gigabyte database of voices to work with.
The older voice recognition systems also required you to pause briefly
between each word and generally performed poorly with even a moderate
amount of background noise. With PlainTalk, you can talk to your computer
just like you'd talk to anyone else. If you need to use the Chooser, you
simply say "Open the Chooser" in a normal voice. (Or you can say "Open
Chooser" or "Open menu item Chooser." Your syntax isn't limited.) PlainTalk
also adapts to changing background noises within a few utterances. Simply
put, PlainTalk was designed for real-world computing.
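That flexible syntax, where "Open the Chooser," "Open Chooser," and "Open menu item Chooser" all trigger the same action, can be illustrated with a small normalization step. The filler-word list and command table here are invented for the sketch, not PlainTalk's actual grammar:

```python
# Words that can be dropped without changing the command's meaning (assumed).
FILLER = {"the", "menu", "item"}

# Hypothetical table mapping normalized commands to actions.
COMMANDS = {("open", "chooser"): "open_chooser"}

def parse_command(utterance):
    """Strip filler words, then look up the remaining words as a command."""
    words = tuple(w for w in utterance.lower().split() if w not in FILLER)
    return COMMANDS.get(words)
```

All three phrasings normalize to the same key, so the user doesn't have to memorize one rigid wording per command.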
How does PlainTalk do all this? You had to ask, didn't you? Well, here
goes. From the same Apple handout: "Spoken text is received by the computer
and converted into a waveform that describes the waveform of the signal.
This signal is processed and sent to a recognition search engine that
searches through speech and language models to determine the best fit for
the received waveform. This information is then converted to the textual
information that has been previously identified, and sent to whatever
application is running." Happy now?
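The recognition search the handout outlines, reducing the incoming signal to features and matching them against stored speech models for the best fit, might be sketched like this. The feature extraction and the one-template-per-word "models" are toy inventions, far simpler than any real recognizer:

```python
def features(waveform):
    """Crude feature extraction: average absolute amplitude plus the
    number of zero crossings (a rough stand-in for spectral analysis)."""
    avg = sum(abs(s) for s in waveform) / len(waveform)
    crossings = sum(1 for a, b in zip(waveform, waveform[1:]) if a * b < 0)
    return (avg, crossings)

# Hypothetical word models: precomputed feature templates per word.
MODELS = {
    "open":  (0.6, 40),
    "close": (0.3, 90),
}

def recognize(waveform):
    """Search the models for the best fit to the input's features and
    return the corresponding word, as text, for the running application."""
    f = features(waveform)
    def distance(template):
        return abs(f[0] - template[0]) + abs(f[1] - template[1]) / 100.0
    return min(MODELS, key=lambda w: distance(MODELS[w]))
```

A real engine searches combined speech and language models over continuous utterances rather than matching isolated word templates, but the shape of the search, best fit over stored models, is the same.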
What does it all mean? So what if you can talk to your Mac and it can talk
back? Who needs a mouthy computer anyway? The immediate impact of voice
recognition is obvious. Instead of trudging through a hierarchy of menus to
get to the desired command, you can tell the computer to do whatever it is
you want done. But think about the impact that these technologies can have
on the blind, as well as other physically challenged individuals. Think
about implementing computers in a mobile environment where your hands
simply aren't free to operate a computer, such as emergency medical
treatment. And more in the mainstream, think about dictating a letter to
your 660AV and having it pop out of your laser printer within a few
minutes.
PlainTalk and voice technologies may seem peculiar now, but in the years
to come, they're certain to become an integral part of your everyday
computing.