Using Speech-to-Text to Boost Productivity
Andrew Levine, C.T. (French-English)
• Dictating words into a computer with a microphone instead of (or in addition to) using a keyboard
• To be useful to translators, it must work in different text entry environments (e.g. your favorite CAT tool)
• Must be fast and accurate in order for it to be worth the time and money invested
• “Speaker-dependent” S2T is needed for accuracy
• Speech recognition computer software was first researched in the 1950s. An IBM product that could recognize 16 words, including all ten digits, was exhibited at the 1962 and 1964 World’s Fairs.
• In 1983, Popular Science promised readers that the “listening typewriter” was on its way. Voice controls at this time were starting to be used for directory assistance, as they are today.
• By the end of the decade, people with disabilities that kept them from using keyboards could access the digital world.
• 124 words entered accurately in 102 seconds, including time for 2 corrections.
• Including time for review, this implies 1500 words of these claims would need only 40 minutes for complete translation.
• Speed gains will be limited by pauses for research, breaks, etc.
• Some other parts (abbreviations in parentheses) would be added in the quality checks
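The arithmetic behind the 40-minute estimate can be checked with a short sketch. It uses only the figures quoted above (124 words, 102 seconds, 1500 words, 40 minutes). Treating everything left over after raw dictation as review time is an assumption made here for illustration.

```python
# Figures from the demonstration: 124 words dictated in 102 seconds,
# including the time spent on 2 corrections.
words_dictated = 124
seconds_taken = 102

words_per_minute = words_dictated / seconds_taken * 60

# Raw dictation time for a 1500-word text at the same pace.
target_words = 1500
raw_minutes = target_words / words_per_minute

# Assumption: whatever remains of the quoted 40 minutes is review time.
total_minutes = 40
review_minutes = total_minutes - raw_minutes

print(f"Dictation speed: {words_per_minute:.0f} words per minute")
print(f"Raw input for {target_words} words: {raw_minutes:.1f} minutes")
print(f"Left for review within {total_minutes} minutes: {review_minutes:.1f} minutes")
```

At this pace the raw dictation takes a little over 20 minutes, so the 40-minute figure leaves roughly as much time for review as for input.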
• Not including quality checks, just raw input

~10,000-word sample texts             1 hour of typing    45 minutes of Dragon, 15 typing
Telecom patents                       1800-2000 words     2500-2800 words
Nuclear power plant specifications    1000-1400 words     1400-1800 words
• Total translation in one hour including quality checks

~10,000-word sample texts             Typing              Dragon + Typing
Telecom patents                       1200-1500 words     1600-2000 words
Nuclear power plant specifications    700-900 words       1000-1200 words
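To put the hourly figures above on a common scale, the sketch below computes the relative gain of dictation over typing for each text type. Comparing the midpoints of the quoted ranges is an assumption made here for illustration, and the names `raw_input`, `with_quality_checks`, `midpoint` and `gain_percent` are invented for this sketch.

```python
# Words per hour from the two comparisons, as (low, high) ranges.
raw_input = {
    "Telecom patents": {"typing": (1800, 2000), "dragon": (2500, 2800)},
    "Nuclear power plant specifications": {"typing": (1000, 1400), "dragon": (1400, 1800)},
}
with_quality_checks = {
    "Telecom patents": {"typing": (1200, 1500), "dragon": (1600, 2000)},
    "Nuclear power plant specifications": {"typing": (700, 900), "dragon": (1000, 1200)},
}


def midpoint(value_range):
    """Middle of a (low, high) range; a simplifying assumption."""
    low, high = value_range
    return (low + high) / 2


def gain_percent(row):
    """Relative gain of Dragon (plus typing) over typing alone, in percent."""
    return (midpoint(row["dragon"]) / midpoint(row["typing"]) - 1) * 100


for label, table in (("Raw input", raw_input), ("With quality checks", with_quality_checks)):
    for text_type, row in table.items():
        print(f"{label}, {text_type}: +{gain_percent(row):.1f}%")
```

On these midpoints, every row shows a gain of roughly a third or a little more.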
Today’s Boston Globe: October 29, 2011
• Payne vs. Paine vs. pain
  • “Roger Payne”: Dragon knows that when a common English given name is entered, there is a higher probability that the next word will be a surname
  • “Thomas Paine”: In this case, the name of a famous historical figure is already stored in the dictionary
  • “chronic pain”: If it doesn’t already know this exact phrase is a common one, it will pick the ordinary noun because an adjective is preceding it
• mayor vs. Mayer vs. may or
  • “mayor”: This may be a title preceding a name, or a common noun that goes where a common noun should
  • “Mayer”: Same rules for surnames
  • “may or…”: Dragon knows where in a sentence a verb is the most likely candidate. A slight pause between the two words “may” and “or” helps clarify.
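The idea of choosing between sound-alikes from the preceding word can be illustrated with a toy lookup. This is not how Dragon is implemented. The scores, the `CONTEXT_SCORES` table and the `choose_spelling` function are all invented for this sketch.

```python
# Candidate spellings for one pronunciation. The common noun comes first,
# so it wins when no context score favors another spelling.
CANDIDATES = ["pain", "Payne", "Paine"]

# Hypothetical scores for (previous word, spelling) pairs.
# The numbers are made up purely for illustration.
CONTEXT_SCORES = {
    ("roger", "Payne"): 0.8,   # a given name makes a surname likely
    ("thomas", "Paine"): 0.9,  # a famous full name stored as a phrase
    ("chronic", "pain"): 0.9,  # an adjective makes the ordinary noun likely
}


def choose_spelling(previous_word, candidates=CANDIDATES):
    """Pick the spelling with the highest score after previous_word."""
    key = previous_word.lower()
    # max() returns the first candidate on ties, i.e. the common noun.
    return max(candidates, key=lambda spelling: CONTEXT_SCORES.get((key, spelling), 0.0))


for previous in ("Roger", "Thomas", "chronic", "some"):
    print(previous, choose_spelling(previous))
```

A real recognizer weighs far more context than one preceding word, but the principle of scoring each candidate spelling against its surroundings is the same.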
Microphone icon: Tells you whether the microphone is listening.
Dialogue box: Indicates a (possibly) relevant status message.
Profile: Lets you select your personal voice profile, language, and device.
Tools: Options and features you can turn on or off.
1. “cap”: Capitalize the first letter of the next word.
2. “numeral one”: Enter the digit “1”, not the word “one.”
3. “comma”: Enter a comma (punctuation).
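As a rough illustration of how spoken commands differ from dictated words, the sketch below turns a list of already-recognized tokens into text using only the three commands listed above. The `render` function and the token format are hypothetical and are not part of Dragon.

```python
# Spoken commands from the list above, mapped to what they insert.
NUMERALS = {"numeral one": "1"}
PUNCTUATION = {"comma": ","}


def render(tokens):
    """Turn recognized tokens into text, applying "cap", numerals and punctuation."""
    pieces = []
    capitalize_next = False
    for token in tokens:
        if token == "cap":
            # Affects the next word only.
            capitalize_next = True
            continue
        if token in PUNCTUATION:
            # Punctuation attaches to the previous word without a space.
            if pieces:
                pieces[-1] += PUNCTUATION[token]
            else:
                pieces.append(PUNCTUATION[token])
            continue
        word = NUMERALS.get(token, token)
        if capitalize_next:
            word = word[:1].upper() + word[1:]
            capitalize_next = False
        pieces.append(word)
    return " ".join(pieces)


print(render(["in", "cap", "boston", "comma", "numeral one", "translator"]))
# prints: in Boston, 1 translator
```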
What to expect when creating a new profile
Dragon NaturallySpeaking (for PC)
• English
• French
• German
• Italian
• Spanish
• Dutch
Dragon Dictate (for Mac)
• English
• French
• German
• Italian
• Choosing the right accent
• Sample text provided by the software
• Option to scan e-mails and target translation folder
• Corrections, corrections, corrections!
• Faster for most people (once it’s been trained)
• Less physical stress on hands from typing
• Each version has improved on accuracy of previous one
• Helps your earnings, not your clients’
• Cost of software: $199
• Minutes or hours needed to learn voice
• Time occasionally lost to odd bugs
• Best rule: If you can think faster than you type, Dragon has the best chance of saving you time
• Long, grammatically correct sentences: Entered quickly and with enough context for accurate matches
• In a field with technical jargon that you often reuse
  • Words like “polyurethane” or “radiography”, etc.
• If you often experience carpal-tunnel syndrome or simply aren’t used to typing in comfort for more than a few minutes at a time
• Best rule: If you are used to translating methodically (you tend to type one or two words, then pause)
• If you fancy yourself an expert typist (more than 100 WPM)
• If the source text contains more sentence fragments (bullet points, cells in an Excel table, etc.) than complete sentences
• If you are not confident in your ability to speak in a fluid sequence of words
• Dragon CAN handle many accented forms of English. It learns your voice. But if you pronounce particular words in your own unique way, you might get frustrated with the results. Like “echo system”…
• Asian languages, especially Chinese text entry
• Sharing S2T vocabularies on multi-translator projects
• Leveraging translation memory “subsegments” to help the S2T engine choose between different sound-alikes.
Source text includes:       Target text should be:
masculin, mâle…             male
courrier, messagerie…       mail
clou, ongle…                nail
Mel                         Mel
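The subsegment idea can be sketched as a lookup from source-language terms to the sound-alike that should be entered in the target text. This describes a wished-for feature, not an existing one. The `SUBSEGMENTS` table and `pick_sound_alike` function are hypothetical, and the matching is deliberately naive (whole words only, no handling of elisions such as “l’ongle”).

```python
# English candidates that sound alike when dictated.
SOUND_ALIKES = ["male", "mail", "nail", "Mel"]

# Hypothetical translation-memory subsegments: French source term -> English target term.
SUBSEGMENTS = {
    "masculin": "male",
    "mâle": "male",
    "courrier": "mail",
    "messagerie": "mail",
    "clou": "nail",
    "ongle": "nail",
    "Mel": "Mel",
}


def pick_sound_alike(source_segment, candidates=SOUND_ALIKES):
    """Return the candidate supported by a subsegment found in the source, else None."""
    # Naive tokenization: split on whitespace, strip punctuation, ignore case.
    source_words = {word.strip(".,;:!?").lower() for word in source_segment.split()}
    for source_term, target_term in SUBSEGMENTS.items():
        if source_term.lower() in source_words and target_term in candidates:
            return target_term
    return None


print(pick_sound_alike("Le courrier est arrivé"))
print(pick_sound_alike("un clou rouillé"))
```

When no subsegment matches, the function returns `None`, which would leave the choice to the speech engine's own context model.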
• Fast answers to simple questions: Official Twitter account for Nuance, @DragonTweets
• More detailed questions: www.knowbrainer.com
Slides available at: http://acknof.wordpress.com
@andrewlevine
[email protected]