Simultaneous Machine Interpretation –
Utopia?
Alex Waibel and the InterACT Team
Carnegie Mellon University
Karlsruhe Institute of Technology
Mobile Technologies, LLC
The Language Challenge
• Dilemma:
– Living in the Global Village
• Globalization, Global Markets
• Increased Exchange and Communication
• European/International Integration
– Cultural Diversity:
• Beauty, Identity, Language, Culture, Customs
• Pride and Individualism
• Language Ability
– Challenge:
• Providing Access to Global Markets and Opportunities
“Everyone Speaks English”… ???
The Magnitude of the Problem
• Today Almost all Translation is Done by Human Effort (>99%)
• ~500,000 translators worldwide, ~150,000 in Europe
• ~$31 billion market
• European Union: 1.3 B€ Spent on Translation/Interpretation
– 506 Language Directions to Translate
– Current Effort Insufficient to Keep up with Needs of 27 Member States
• Worldwide 6000 Languages
– 36,000,000 language directions !!!
• Actual translation work is currently only about 10% of translatable text.
• Translation needs are growing 25%-35% per year
• … And that’s just for Text…
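The direction counts on this slide follow from counting ordered (source, target) language pairs: n languages give n(n-1) directions. A minimal sketch of that arithmetic, assuming 23 official EU languages (an assumption; the slide states only the resulting 506) and a hypothetical helper name:

```python
def language_directions(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)

print(language_directions(23))    # 506
print(language_directions(6000))  # 35994000, i.e. roughly 36 million
```

The quadratic growth is the point: doubling the number of languages roughly quadruples the number of directions to cover.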
Interpretation of Speech
• Conferences
– Estimated 300,000 Conferences per Year in Europe
– Few Professional Interpreters Compared to Needs
– 1% or Less are Interpreted
• Internet
– On YouTube, 13 Hours of New Video are Uploaded Every Minute
• Television
– Satellite, Cable: Virtually Unlimited Channels
• Lectures
– Government, Universities, Corporations
• Meetings
• Telephone Conversations
• Travel Dialogs/Encounters
Technology
To Build a Speech Translator for a New Language
– 6 Component-Engines: Automatic Speech Recognition, Machine Translation, and Text-to-Speech Synthesis, One of Each per Translation Direction
– Each is in Principle Language Independent, but Requires Language Dependent Models
– Models are Automatically Trained but Require Large Corpora
– Certain Language Dependent Peculiarities Exist
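The chaining of the three engine types can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy engines are hypothetical stand-ins, not the actual components of any system named in these slides. Real engines would load language-dependent models trained on large corpora.

```python
def toy_asr(audio: bytes) -> str:
    """Hypothetical recognizer: pretends the 'audio' is UTF-8 text."""
    return audio.decode("utf-8")

def toy_mt(text: str) -> str:
    """Hypothetical translator: word-for-word lookup in a tiny lexicon."""
    lexicon = {"guten": "good", "morgen": "morning"}
    return " ".join(lexicon.get(word, word) for word in text.lower().split())

def toy_tts(text: str) -> bytes:
    """Hypothetical synthesizer: returns bytes instead of a waveform."""
    return text.encode("utf-8")

def speech_to_speech(audio, asr, mt, tts):
    """One translation direction: recognize, translate, synthesize."""
    return tts(mt(asr(audio)))

print(speech_to_speech("Guten Morgen".encode("utf-8"), toy_asr, toy_mt, toy_tts))
# b'good morning'
```

The pipeline function itself is language independent; only the engines passed into it change per language, which mirrors the split between engines and models described above.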
Statistical Translation Approach
• Translation and Speech Systems Learn Automatically
• Statistics Trained over Lots of Data
• Uses Parallel or Speech Data
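To make "learning automatically from parallel data" concrete, here is a minimal sketch of one classic technique, IBM Model 1 word-translation estimation with EM (without a NULL word). The slides do not say which statistical model the systems use, so this choice, the three-sentence corpus, and all names are assumptions for illustration only.

```python
from collections import defaultdict

def train_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] from sentence pairs via EM."""
    # Start uniform over every word pair that co-occurs in some sentence pair.
    t = {(w, s): 1.0 for src, tgt in pairs for w in tgt for s in src}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in pairs:
            for w in tgt:
                norm = sum(t[(w, s)] for s in src)
                for s in src:
                    frac = t[(w, s)] / norm   # expected alignment count
                    count[(w, s)] += frac
                    total[s] += frac
        for (w, s), c in count.items():       # re-normalize per source word
            t[(w, s)] = c / total[s]
    return t

def best_translation(t, source_word):
    """Most probable target word for a source word under the table t."""
    candidates = {w: p for (w, s), p in t.items() if s == source_word}
    return max(candidates, key=candidates.get)

corpus = [
    ("das haus".split(), "the house".split()),
    ("das buch".split(), "the book".split()),
    ("ein buch".split(), "a book".split()),
]
table = train_model1(corpus)
print(best_translation(table, "haus"))  # house
```

No dictionary is given to the program; the word correspondences emerge only from which words co-occur across the sentence pairs.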
Speech Translation
Progression of Technologies:
– Domain Limited, Clear Speaking Style (late 80’s-91)
• Janus (first European & US speech-to-speech system)
• ATT, NEC, ATR
– Domain Limited, Spontaneous (‘91-’00)
• Janus II/III (work on 20 languages), Verbmobil, Nespole, Enthusiast, C-STAR, ATR, ETRI, NLPR,…
– Mobile Consecutive Interpretation
• Transtac, Babylon, Phraselator, Jibbigo, U-STAR
– Domain Unlimited Simultaneous Interpretation
• Parliamentary Speeches (TC-STAR)
• Broadcast News (GALE)
Mobile Consecutive Interpretation
Humanitarian Needs
• How it is Done Now:
– Human Interpreters
– Charts, Dictionaries
• Limitations/Problems:
– Limited Supply!!
– Fidelity/Trust/Security
– Number of Languages
Jibbigo:
– Real-Time Translation
– 15++ languages
– 40,000 Words
– All on the Phone
– No Server Necessary!
Jibbigo Systems
• iTunes & Android App Stores:
– English, Spanish, French, German, Japanese, Chinese, Korean, Filipino, Iraqi, Thai, Pashto, Dari
– Other Languages
• Cost:
– Free Jibbigo Online Translator
– Off-Line: Freedom from Network
• Outside of App Store:
– Enterprise Versions for Special Applications
Cobra Gold’11
Unlimited Domain Simultaneous
Speech Translation Technologies
Domain Unlimited
Domain Unlimited Translators for:
– TV/Radio Broadcast Translation
– Translation of Lectures and Speeches
– Parliamentary Speeches (UN, EU,..)
– Telephone Conversations
– Meeting Translation
University Lectures
EU-BRIDGE –
Meeting of the Future
Arabic
Spanish English
Seeing Personal Translations
Technology: Heads-up Display Goggles
Hearing Personal Translations
Targeted Audio
– Array of Ultra-Sound Speakers
– Targeted Beam of Audio
– Can only be Heard in Narrow Area
– Multiple Arrays Could
Internet Delivery
Students bring their own Devices
Transcription/Translation Output is Delivered via Web Page
Interpretation Done on Server
User Can Select Languages
[Figure: Components (ASR, MT), Services, Events (Lecture 1, Lecture 2, Lecture 3)]
Service Infrastructure
Adaptation, Learning
New Improved Technologies
Speech-Services for Users and Developers
Lecture Interpretation Service
Launch at KIT: Summer 2012, Support for 4 Courses
• Translation of PowerPoint Slides
• Presentation by Subtitles
Search for Content
• Transcripts useful to Search for Content
– Slides and Lectures in the Cloud
– Multi-Lingual Search and Retrieval in
New Challenges
Simultaneous Translation of Lectures
• Continuous Monologue
– Broadcast News, Speeches, Lectures
• Speaking-Style
– Fast, spontaneous, fragmentary, and no punctuation!!
– Noise, Coughing, Singing (!)
• Vocabulary
– Much larger, Special Vocabularies
• Speed, Realtime
• Service-Infrastructure
– Many parallel lectures;
The German Lecture Translator
• MT in German Lectures is particularly hard. Why?
• Peculiarities of German:
– Word Order:
Ich möchte mich zu der Konferenz über Maschinelle Übersetzung anmelden
I want to register for the conference on Machine Translation
– Compounds:
Worterkennungsfehlerrate
Word Recognition Error Rate
– Inflections and Agreement:
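The compound example above hints at why German needs extra handling: a long compound is rarely in the vocabulary as a whole, but its parts usually are. A minimal sketch of dictionary-based compound splitting, assuming a tiny hand-made lexicon and only the linking element "s" (both are illustrative assumptions, not the method used in the lecture translator):

```python
def split_compound(word, lexicon, linkers=("", "s")):
    """Return one split of `word` into lexicon entries, or None if none exists."""
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):      # prefer the longest first part
        head, rest = word[:end], word[end:]
        if head not in lexicon:
            continue
        for link in linkers:
            if not rest.startswith(link):
                continue
            remainder = rest[len(link):]
            if link and not remainder:       # a linker must join two parts
                continue
            tail = split_compound(remainder, lexicon, linkers)
            if tail is not None:
                return [head] + tail
    return None

LEXICON = {"wort", "erkennung", "fehler", "rate"}
print(split_compound("Worterkennungsfehlerrate", LEXICON))
# ['wort', 'erkennung', 'fehler', 'rate']
```

A real system would also have to choose among competing splits and handle other linking elements; this sketch simply returns the first split it finds.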
Words, Words, Words…
• Technical Terms normally not in ‘normal’ vocabularies
– Cepstral-Koeffizienten Cepstral Coefficients
– Wälzlagerungen Roller Bearings
– Unterraum Subspace
• Technical Terms with special Meanings
– Klausur Final Exam (not Retreat)
– Vorzeichen Sign (not Omen)
• Formulas:
Words, Words, Words….
• Foreign Words in German Language
– Computer Science, English Expressions
– Political Speeches, Latin Proverbs
• Accent
– “Würfelkalkül” (Asfour)
• Foreign Words in German Language
– “Cloud”, “iPhone”, “iPad”, “Laser”
• Inflections & Declensions of these Words
– Web-ge-casted, down-ge-loaded
• Formation of Compounds:
– Cloudbasierter Webcastzugriff
Solution
• Use PowerPoint Slides and Publications
• Search Internet for Similar Topics
• Incorporate User Corrections
• Adapt Vocabulary to Lecture
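One way to read "use slides" and "adapt vocabulary to lecture" together is to harvest terms from the lecturer's slide text that the recognizer does not yet know. A minimal sketch under that assumption; the function name, the tokenization rule, and the sample data are hypothetical, and a real system would also need pronunciations and language-model updates for the new words:

```python
import re

def new_words_from_slides(slide_text, vocabulary):
    """List words on the slides that are missing from a lowercase vocabulary."""
    # Runs of letters, optionally joined by hyphens (keeps umlauts intact).
    tokens = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", slide_text)
    seen = set()
    new_words = []
    for token in tokens:
        key = token.lower()
        if key not in vocabulary and key not in seen:
            seen.add(key)
            new_words.append(token)
    return new_words

base_vocabulary = {"und", "in", "der"}
slides = "Cepstral-Koeffizienten und Unterraum in der Cloud"
print(new_words_from_slides(slides, base_vocabulary))
# ['Cepstral-Koeffizienten', 'Unterraum', 'Cloud']
```

The returned list is the candidate set of lecture-specific terms to add before the lecture starts.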
The Long Tail of Language
• Languages:
– Only a Few Languages are Currently Addressed (<50)
– Development of Technology Takes Long & Is Expensive
• Ongoing Research to
Discussion
• Is Interpretation by Machine Possible?
– Yes, Performance will Continue to Improve and be Made Available over the Internet
• Are we Replacing Human Interpreters?
– No! Machine Translation Quality Remains Worse; it Lacks Human Judgement and Intuition
– But: Human vs. Machine is Usually not the Choice we have!
What about the Rest of us?
The Common Reality is: Poor English or No Communication!
A Social Challenge!
• Are we Hindering Human Language Learning?
– No! Technology Enables and Empowers Human Interaction and thus Motivates and Supports more Language Learning