VoiceXML Overview. James A. Larson, Intel Corporation. (c) 2007 Larson Technical Services

VoiceXML Overview

James A. Larson

Intel Corporation

(2)

Outline

•  Motivation for VoiceXML

•  W3C Speech Interface Framework Languages

•  Dialog—VoiceXML 2.0

•  Speech Synthesis—SSML

•  Grammars—SRGS

•  Semantic Interpretation—SI

•  VoiceXML 2.1

(3)

VoiceXML in the Marketplace

•  VoiceXML 2.0 is now ratified as a Recommendation (i.e., an official standard) by the W3C

•  Hundreds of millions of VoiceXML calls are answered every day

VoiceXML is the standard for building speech-enabled applications

(4)

Motivation for Speech Applications

•  Users access Web sites from any telephone, anywhere, any time.

•  Speaking and listening are the

(5)

Strength of VoiceXML Applications

•  Traditional system-directed dialogs for novice users

•  Mixed initiative dialogs for experienced users

•  Novice users smoothly become

(6)

Limitations of VoiceXML Applications

•  No special analysis of speech input

–  Not suitable for training speech skills such as reading, ESL, singing, etc.

•  VUI conversational bandwidth is lower than GUI conversational bandwidth

–  Using a VUI is like drinking from Lake Superior with a straw

(7)

Exercise 1

•  Name or describe a speech application you could use at work.

•  Name or describe a speech application you or a family member could use at home.

(8)

XML

•  XML = eXtensible Markup Language

•  Elements are surrounded by tags

    <prompt>Welcome to the voice system</prompt>

•  Elements may be nested

    <prompt>
      Welcome to Ajax Travel <break/> we have the cheapest fares
    </prompt>

•  Elements may have attributes

    <choice next="#boat">

    <grammar type="application/srgs+xml" version="1.0" root="by_boat" src="boat.grxml"/>

•  Because "<", ">", and "&" have special meanings, use

    "&lt;" in place of "<"
    "&gt;" in place of ">"
    "&amp;" in place of "&"
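A small sketch of the escaping rule in practice. The company name and wording are made up for illustration; only the use of the entities is the point.

```xml
<?xml version="1.0"?>
<!-- Hypothetical prompt text; only the escaping matters here. -->
<prompt>
  Welcome to Smith &amp; Jones Travel.
  <!-- "<" and ">" inside text are written as entities -->
  Fares &lt; 100 dollars are on sale; fares &gt; 500 dollars are not.
</prompt>
```

A parser turns the entities back into the literal characters before the text reaches the speech synthesizer.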

(10)

[Figure: system architecture diagram. Labels: Database Server (DB); Web Server; Documents (HTML Scripts, VoiceXML Scripts, Grammars, Audio Files, Multimedia Files); Web Browser; Speech Server/Gateway with Voice Browser (Capture Voice, ASR, DTMF, Replay Audio, TTS)]

(11)

W3C Speech Interface Framework

[Figure: framework components. Labels: VoiceXML 2.0, Speech Synthesis, Grammar, Semantic Interpretation, Call Control, Other]

(12)

Status of W3C Speech Interface Languages

[Figure: chart placing each language on the W3C standardization track. Languages: VoiceXML 2.0, Grammar (SRGS), Synthesis (SSML), Call Control (CCXML), Semantic Interpretation (SISR), VoiceXML 2.1, V3. Stages: Requirements, Working Draft, Last Call Working Draft, Candidate Recommendation, Proposed Recommendation, Recommendation]

(14)

Example of VoiceXML 2.0 Fragment

<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <field name="account">
      <prompt>
        Which account <break/>
        <emphasis> savings </emphasis> or <emphasis> checking </emphasis>
      </prompt>
      <grammar type="application/srgs+xml" root="account_type" mode="voice">
        <rule id="account_type">
          <one-of>
            <item> savings </item>
            <item> checking </item>
            <item> CD </item>
            <item> certificate of deposit <tag>out = "CD";</tag> </item>
          </one-of>
        </rule>
      </grammar>
    </field>
    …
  </form>
  …
</vxml>

Languages used in this fragment:

•  Dialog Language (VoiceXML 2.0)
•  Speech Synthesis Markup Language (SSML)
•  Speech Recognition Grammar Specification (SRGS)
•  Semantic Interpretation (SI)

(17)

Example of VoiceXML 2.0 Fragment

<?xml version="1.0"?>
<vxml version="2.0">
  <form>
    <field name="account">
      <prompt>
        Which account <break/>
        <emphasis> savings </emphasis> or <emphasis> checking </emphasis>
      </prompt>
      <grammar type="application/srgs+xml" root="account_type" mode="voice">
        <rule id="account_type">
          <one-of>
            <item> savings </item>
            <item> checking </item>
            <item> CD </item>
            <item> certificate of deposit <tag>out.account = "CD";</tag> </item>
          </one-of>
        </rule>
      </grammar>
    </field>
    …
  </form>
  …
</vxml>

Languages used in this fragment:

•  Dialog Language (VoiceXML 2.0)
•  Speech Synthesis Markup Language (SSML)
•  Speech Recognition Grammar Specification (SRGS)
•  Semantic Interpretation (SI)

(18)

VoiceXML 2.0 features

•  Menus, forms, sub-dialogs
–  <menu>, <form>, <subdialog>

•  Inputs
–  Speech recognition <grammar>
–  Recording <record>
–  Keypad <grammar mode="dtmf">

•  Output
–  Audio files <audio>
–  Text-to-speech <prompt>

•  Variables
–  <var> <script> <assign>

•  Events
–  <nomatch>, <noinput>, <help>, <catch>, <throw>

•  Transition and submission
–  <goto>, <submit>

•  Telephony
–  Connection control: <transfer>, <disconnect>
–  Telephony information

•  Platform
–  Objects

•  Performance
–  Fetch
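To make the keypad input item concrete, here is a minimal sketch of a field that accepts a single key press through an inline DTMF grammar. The field name, prompt wording, and rule name are assumptions chosen for illustration.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <!-- Hypothetical field: the caller presses 1 or 2 on the keypad -->
    <field name="account">
      <prompt>For savings, press 1. For checking, press 2.</prompt>
      <!-- mode="dtmf" tells the platform to match key presses, not speech -->
      <grammar type="application/srgs+xml" version="1.0"
               root="account_key" mode="dtmf">
        <rule id="account_key">
          <one-of>
            <item> 1 </item>
            <item> 2 </item>
          </one-of>
        </rule>
      </grammar>
    </field>
  </form>
</vxml>
```

A field may carry both a voice grammar and a DTMF grammar, so callers can answer either way.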

(19)

Typical Form Fill-In

<form>
  <block>
    <prompt>Welcome to the electronic payment system.</prompt>
  </block>
  <field name="card_number">
    <prompt>Please enter your credit card number.</prompt>
    <grammar src="http://www.ajax.com/credit_card_number.grxml"/>
  </field>
  <field name="date">
    <prompt>Please enter your expiration date.</prompt>
    <grammar src="http://www.ajax.com/credit_card_date.grxml"/>
  </field>
</form>
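A form usually does something with the values it collects. The sketch below adds a <filled> element that submits both fields once they have values. The submit URL is a made-up placeholder, not part of the original slides.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="payment">
    <field name="card_number">
      <prompt>Please enter your credit card number.</prompt>
      <grammar src="http://www.ajax.com/credit_card_number.grxml"/>
    </field>
    <field name="date">
      <prompt>Please enter your expiration date.</prompt>
      <grammar src="http://www.ajax.com/credit_card_date.grxml"/>
    </field>
    <!-- Runs once both named fields hold values -->
    <filled mode="all" namelist="card_number date">
      <!-- Hypothetical server URL, assumed for this example -->
      <submit next="http://www.ajax.com/pay" method="post"
              namelist="card_number date"/>
    </filled>
  </form>
</vxml>
```

Submitting from <filled> keeps the dialog logic on the client and sends the server only the completed values.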

(20)

Exercise 2

Capture "birth date"

<form>
  <block>
    <prompt> _____________________ </prompt>
  </block>
  <field name="month">
    <prompt> _______________________________ </prompt>
    <grammar src="http://www.ajax.com/month.grxml"/>
  </field>
  <field name="day">
    <prompt> ______________________________ </prompt>
    <grammar src="http://www.ajax.com/day.grxml"/>
  </field>
  <field name="year">
    <prompt> ______________________________ </prompt>
    <grammar src="http://www.ajax.com/year.grxml"/>
  </field>
</form>

(21)

Event Handlers

•  Deal with exceptional or error conditions

•  Control mechanism for dialog turn retries
–  <catch event="noinput"> … </catch>
–  <catch event="nomatch"> … </catch>
–  <catch event="help"> … </catch>

•  Shorthand notation available
–  <noinput> … </noinput>, etc.

•  Scoped according to where they occur
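The scoping and retry points above can be sketched as follows: a document-level handler applies everywhere, while handlers inside the field apply only to that field, and the count attribute selects a different response on repeated failures. The prompt wording is illustrative.

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Document scope: applies to every dialog unless overridden -->
  <catch event="help">
    <prompt>Sorry, no help is available.</prompt>
  </catch>
  <form>
    <field name="month">
      <prompt>What month?</prompt>
      <grammar src="http://www.ajax.com/month.grxml"/>
      <!-- Field scope: first nomatch, then replay the field prompt -->
      <nomatch count="1">
        <prompt>I did not understand.</prompt>
        <reprompt/>
      </nomatch>
      <!-- Field scope: second and later nomatch events -->
      <nomatch count="2">
        <prompt>Please say a month, for example January.</prompt>
      </nomatch>
    </field>
  </form>
</vxml>
```

Escalating the wording on the second failure gives the caller more guidance without lengthening the first retry.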

(22)

Adding Event Handlers

<form>
  <block>
    <prompt> When were you born? </prompt>
  </block>
  <field name="month">
    <catch event="noinput"> ….. </catch>
    <catch event="nomatch"> ….. </catch>
    <prompt> What month? </prompt>
    <grammar src="http://www.ajax.com/month.grxml"/>
  </field>
  …..
</form>

(25)

Default Event Handlers

<catch event="help">
  <prompt> Sorry, no help is available. </prompt>
</catch>
<catch event="nomatch">
  <prompt> I did not understand, please try again </prompt>
</catch>
<catch event="noinput">
  <prompt> I did not hear anything, please speak again </prompt>
</catch>

(26)

Exercise 3

Write event handlers for the month field

<catch event="help">
  <prompt> ____________________ </prompt>
</catch>
<catch event="nomatch">
  <prompt> __________________________ </prompt>
</catch>
<catch event="noinput">
  <prompt> ___________________________________ </prompt>
</catch>

(28)

Speech Synthesis ML

Processing stages: Structure Analysis, Text Normalization, Text-to-Phoneme Conversion, Prosody Analysis, Waveform Production

Structure Analysis
–  Markup support: p, s
–  Non-markup behavior: infer structure by automated text analysis

(29)

Before and after Structure Analysis

•  Before structure analysis

–  Dr. Smith lives at 214 Elm Dr. He weighs 214 lb. He plays bass guitar. He also likes to fish; last week he caught a 19 lb. bass.

•  After structure analysis

<p>
  <s> Dr. Smith lives at 214 Elm Dr. </s>
  <s> He weighs 214 lb. </s>
  <s> He plays bass guitar. </s>
  <s> He also likes to fish; last week he caught a 19 lb. bass. </s>
</p>

(30)

Speech Synthesis ML

Processing stages: Structure Analysis, Text Normalization, Text-to-Phoneme Conversion, Prosody Analysis, Waveform Production

Structure Analysis
–  Markup support: p, s
–  Non-markup behavior: infer structure by automated text analysis

Text Normalization
–  Markup support: say-as for dates, times, etc.; sub for aliasing
–  Non-markup behavior: automatically identify and convert constructs

(31)

After Text Normalization

<p>
  <s> <sub alias="doctor">Dr.</sub> Smith lives at 214 Elm <sub alias="drive">Dr.</sub> </s>
  <s> He weighs 214 <sub alias="pounds">lb.</sub> </s>
  <s> He plays bass guitar. </s>
  <s> He also likes to fish; last week he caught a 19 <sub alias="pound">lb.</sub> bass. </s>
</p>

(32)

Speech Synthesis ML

Processing stages: Structure Analysis, Text Normalization, Text-to-Phoneme Conversion, Prosody Analysis, Waveform Production

Structure Analysis
–  Markup support: p, s
–  Non-markup behavior: infer structure by automated text analysis

Text Normalization
–  Markup support: say-as for dates, times, etc.; sub for aliasing
–  Non-markup behavior: automatically identify and convert constructs

Text-to-Phoneme Conversion
–  Markup support: phoneme, say-as
–  Non-markup behavior: look up in pronunciation dictionary

(33)

After text-to-phoneme conversion

<p>
  <s>
    <sub alias="doctor">Dr.</sub> Smith lives at
    <say-as interpret-as="address">214</say-as> Elm <sub alias="drive">Dr.</sub>
  </s>
  <s>
    He weighs <say-as interpret-as="number">214</say-as> <sub alias="pounds">lb.</sub>
  </s>
  <s>
    He plays <phoneme alphabet="ipa" ph="beɪs">bass</phoneme> guitar.
  </s>
  <s>
    He also likes to fish; last week he caught a
    <say-as interpret-as="number">19</say-as> <sub alias="pound">lb.</sub>
    <phoneme alphabet="ipa" ph="bæs">bass</phoneme>.
  </s>
</p>

(34)

Speech Synthesis ML

Processing stages: Structure Analysis, Text Normalization, Text-to-Phoneme Conversion, Prosody Analysis, Waveform Production

Structure Analysis
–  Markup support: p, s
–  Non-markup behavior: infer structure by automated text analysis

Text Normalization
–  Markup support: say-as for dates, times, etc.; sub for aliasing
–  Non-markup behavior: automatically identify and convert constructs

Text-to-Phoneme Conversion
–  Markup support: phoneme, say-as
–  Non-markup behavior: look up in pronunciation dictionary

Prosody Analysis
–  Markup support: emphasis, break, prosody
–  Non-markup behavior: automatically generate prosody through analysis of document structure and sentence syntax

(35)

Prosody Analysis (initial text)

<prompt>
  Environmental control menu. Do you want to adjust the lighting or temperature?
</prompt>

(36)

Prosody Analysis

<prompt>
  Environmental control menu
  <break/>
  <emphasis level="reduced"> do you want to adjust the </emphasis>
  <emphasis level="strong"> lighting </emphasis>
  <break/>
  or
  <emphasis level="strong"> temperature? </emphasis>
</prompt>
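The prosody element is listed as markup support for this stage but is not shown in the slides. A minimal sketch, with attribute values picked purely for illustration:

```xml
<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form>
    <block>
      <prompt>
        Environmental control menu.
        <!-- An explicit half-second pause -->
        <break time="500ms"/>
        <!-- rate and volume values are illustrative choices -->
        <prosody rate="slow" volume="loud">
          Do you want to adjust the lighting or temperature?
        </prosody>
      </prompt>
    </block>
  </form>
</vxml>
```

Where emphasis marks individual words, prosody adjusts rate, volume, or pitch over a whole span of text.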

(37)

Speech Synthesis ML

Processing stages: Structure Analysis, Text Normalization, Text-to-Phoneme Conversion, Prosody Analysis, Waveform Production

Structure Analysis
–  Markup support: paragraph, sentence
–  Non-markup behavior: infer structure by automated text analysis

Text Normalization
–  Markup support: say-as for dates, times, etc.; sub for aliasing

Text-to-Phoneme Conversion
–  Markup support: phoneme, say-as
–  Non-markup behavior: look up in pronunciation dictionary

Prosody Analysis
–  Markup support: emphasis, break, prosody
–  Non-markup behavior: automatically generate prosody through analysis of document structure and

Waveform Production
–  Markup support: voice, audio*

*audio icons,

(38)

Wave Form Production

<prompt>
  <audio src="http://www.example.com/adjust.wav">
    <desc>
      Environmental control menu. Do you want to adjust the lighting or temperature
    </desc>
  </audio>
</prompt>

(39)

Exercise 4 (insert SSML commands)

<prompt>
  Welcome to Ajax Bank do you want to withdraw or deposit funds?
</prompt>

(41)

Grammars

•  Describe what the user may say at a point in the dialog

•  Enable the speech recognition engine to work faster and more accurately

(42)

Example Grammar

<grammar type="application/srgs+xml" root="zero_to_ten" mode="voice">
  <rule id="zero_to_ten">
    <one-of>
      <item> zero </item>
      <item> <ruleref uri="#single_digit"/> </item>
      <item> ten </item>
    </one-of>
  </rule>
  <rule id="single_digit">
    <one-of>
      <item> one </item> <item> two </item> <item> three </item>
      <item> four </item> <item> five </item> <item> six </item>
      <item> seven </item> <item> eight </item> <item> nine </item>
    </one-of>
  </rule>
</grammar>

•  XML form of grammars

•  root = "zero_to_ten": the grammar processor should start with this rule

•  mode = "voice": this is a grammar used by the speech recognizer. (There may also be grammars for DTMF recognizers.)

•  <rule id = "single_digit">: rule describing single digits

•  <one-of> describes alternatives

•  <ruleref> references another rule

Exercise 5:

Write a grammar that recognizes the numbers zero to nineteen

(49)

More Grammar Elements

•  Repeat and optional

<rule id="goodness" scope="public">
  <item repeat="0-3"> very </item>
  good
</rule>

•  Sequence

<rule id="twenty_thru_twentynine">
  twenty <ruleref uri="#single_digit"/>
</rule>

•  Garbage

<rule name = "James_Lewis">
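The Garbage rule is cut off in this copy of the slides. The sketch below is not a reconstruction of it; it only shows how the SRGS special rule GARBAGE is typically referenced, using a rule name and phrase assumed for illustration.

```xml
<?xml version="1.0"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="james_lewis">
  <!-- Hypothetical rule: find a name inside a longer utterance -->
  <rule id="james_lewis" scope="public">
    <!-- GARBAGE matches any speech up to the next token -->
    <ruleref special="GARBAGE"/>
    James Lewis
    <!-- ...and any speech after the name -->
    <ruleref special="GARBAGE"/>
  </rule>
</grammar>
```

With this rule an utterance such as "I would like to speak to James Lewis please" can match, because the words around the name are absorbed by GARBAGE.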

(50)

Reusing existing grammars

<grammar

type = "application/srgs+xml"

root = "size"
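The slide's example is truncated here. As a hedged sketch of what reuse can look like, the grammar below references a rule defined in another grammar file. The file URL, the rule names, and the carrier phrase are all assumptions.

```xml
<?xml version="1.0"?>
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="order">
  <rule id="order" scope="public">
    I would like a
    <!-- Hypothetical shared grammar file; "#size" selects a rule in it -->
    <ruleref uri="http://www.example.com/pizza.grxml#size"/>
    pizza
  </rule>
</grammar>
```

The referenced rule must be declared with scope="public" in its own file for an external reference to reach it.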

(52)

Semantic Interpretation

•  Semantic Interpretation defines how to extract and modify the results returned by the speech recognition engine

•  Semantic interpretation instructions are contained in the <tag> element

•  Two kinds of syntax for <tag> contents:

–  Semantic Literals (literal values)

–  Semantic Scripts (ECMAScript)
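Which of the two syntaxes applies is declared on the grammar itself through the tag-format attribute. A minimal sketch using script syntax follows; the drink phrases are illustrative.

```xml
<?xml version="1.0"?>
<!-- tag-format="semantics/1.0" selects script syntax;
     "semantics/1.0-literals" would select literal syntax instead -->
<grammar version="1.0" xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" mode="voice" root="drink"
         tag-format="semantics/1.0">
  <rule id="drink" scope="public">
    <one-of>
      <!-- Each tag assigns the rule's result to the variable "out" -->
      <item> coca cola <tag> out = "coke"; </tag> </item>
      <item> lemonade <tag> out = "lemonade"; </tag> </item>
    </one-of>
  </rule>
</grammar>
```

With literal syntax the tag contents would simply be the value itself, for example <tag>coke</tag>, with no assignment statement.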

(53)

Semantic Interpretation

•  Semantic Literals example:

<rule id="drink">
  <one-of>
    <item> coca cola <tag> coke </tag> </item>
    <item> cola <tag> coke </tag> </item>
    <item> black fizzy stuff <tag> coke </tag> </item>
    <item> coke </item>
  </one-of>
</rule>

•  Default assignment: the last item has no <tag>, so its result is the spoken text "coke"

(55)

Semantic Interpretation

[Figure: processing flow. The ASR uses a grammar with semantic interpretation scripts and passes the recognized text to the Semantic Interpretation Processor, which passes an ECMAScript object to the VoiceXML Interpreter]

•  No semantic interpretation scripts: the text "fourteen" is passed on unchanged as "fourteen"

•  With semantic interpretation: the grammar item

<item> fourteen <tag>out.quantity="14";</tag> </item>

converts the text "fourteen" into the ECMAScript object { quantity: "14" }

(59)

Semantic Interpretation

•  Semantic Scripts employ ECMAScript

•  Advantages:

•  Richer structure (objects)

(60)

Semantic Interpretation

•  Example grammar rule with Script Syntax:

<rule id="action">
  <one-of>
    <item> small <tag> out.size = "small"; </tag> </item>
    <item> medium <tag> out.size = "medium"; </tag> </item>
    <item> large <tag> out.size = "large"; </tag> </item>
  </one-of>
  <one-of>
    <item> green <tag> out.color = "green"; </tag> </item>
    <item> blue <tag> out.color = "blue"; </tag> </item>
    <item> white <tag> out.color = "white"; </tag> </item>
  </one-of>
</rule>

•  Utterance: "Large white"

•  ECMAScript structure:

action: {
  size: "large",
  color: "white"
}

(61)

Semantic Interpretation

•  Example grammar rule with Script Syntax:

<rule id="calculator">
  What is
  <ruleref uri="#digit"/> <tag> out.total = rules.digit; </tag>
  <item repeat="1-">
    plus
    <ruleref uri="#digit"/>
    <tag> out.total = out.total + rules.digit; </tag>
  </item>
</rule>

•  Utterance: "What is 1 + 2 + 3?"

•  ECMAScript structure:

calculator: {
  total: 6
}

(62)

Exercise 6

Fill in the contents of <tag>

•  Grammar rule:

<rule id="transfer">
  from
  <one-of>
    <item> savings <tag>________________________</tag> </item>
    <item> checking <tag>________________________</tag> </item>
  </one-of>
  to
  <one-of>
    <item> savings <tag>________________________</tag> </item>
    <item> checking <tag>________________________</tag> </item>
  </one-of>
</rule>

•  Utterance: "From savings to checking"

•  ECMAScript structure:

transfer: {
  source_account: "savings",
  target_account: "checking"
}

(64)

VoiceXML 2.1

•  VoiceXML's success and popularity resulted in many implementations early in the standardization process

•  Additional, innovative features were conceived after VoiceXML 2.0 content was agreed

•  Goals of VoiceXML 2.1:

–  Ensure portability by specifying a set of commonly implemented extensions

–  Backwards-compatible with VoiceXML 2.0

–  Follow a "fast track" to standardization

(65)

VoiceXML 2.1

•  Standardized extensions:

–  Locate barge-in occurrences within prompts

–  Access recognition utterances for analysis

–  Increase performance by reducing server round-trips
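A rough sketch of how these three extensions can appear in a VoiceXML 2.1 document: <mark> for locating barge-in, the recordutterance property for access to the caller's utterance, and <data> for fetching XML without a page transition. The URLs, names, and prompt wording are assumptions for illustration.

```xml
<?xml version="1.0"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- Ask the platform to keep a recording of each recognized utterance -->
  <property name="recordutterance" value="true"/>
  <form>
    <!-- Fetch XML data without leaving this document (hypothetical URL) -->
    <data name="rates" src="http://www.example.com/rates.xml"/>
    <field name="answer">
      <prompt>
        <mark name="intro"/> Welcome to Ajax Bank.
        <mark name="question"/> Do you want to withdraw or deposit funds?
      </prompt>
      <grammar src="http://www.example.com/withdraw_deposit.grxml"/>
      <filled>
        <!-- Last mark reached before the caller started speaking -->
        <log>Last mark: <value expr="application.lastresult$.markname"/></log>
        <!-- Reference to the recorded utterance, available for later analysis -->
        <var name="utterance" expr="application.lastresult$.recording"/>
      </filled>
    </field>
  </form>
</vxml>
```

Because <data> exposes the fetched XML to scripts in the same document, the application avoids a server round-trip that would otherwise generate a new VoiceXML page.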

(66)

Summary

•  W3C Speech Interface Framework

– Dialog—VoiceXML

– Grammar—SRGS

– Synthesis—SSML

– Semantic Interpretation—SI

– Call Control—CCXML

•  Can work together or separately

(67)

Industry Organizations

•  World Wide Web Consortium

–  http://www.w3.org

•  W3C Voice Browser Working Group

–  http://www.w3.org/voice/

•  W3C Multi-Modal Working Group

–  http://www.w3.org/2002/mmi/

•  VoiceXML Forum

–  http://www.voicexml.org

•  SALT Forum:

–  http://www.saltforum.org

•  Speech Technology Magazine

(68)

Books

•  James A. Larson, VoiceXML: An Introduction to Developing Speech Applications, 2002, Upper Saddle River, NJ: Prentice Hall.

•  Eve Astrid Andersson et al., Early Adopter VoiceXML, 2001, Birmingham, UK: Wrox.

•  Bruce Balentine & David P. Morgan, How to Build a Speech Recognition Application: A Style Guide for Telephony Dialogues, 1999, San Ramon, CA: Enterprise Integration Group.

•  Rick Beasley et al., Voice Application Development with VoiceXML, 2002, Indianapolis: Sams.

•  Bob Edgar, The VoiceXML Handbook, 2001, New York: CMP.

•  Susan Weinschenk & Dean T. Barker, Designing Effective Speech Interfaces, 2000, New York: John Wiley & Sons.

•  Chetan Sharma & Jeff Kunins, VoiceXML: Strategies and Techniques for Effective Voice Application Development with VoiceXML 2.0, 2002, New York: John Wiley.

•  Michael H. Cohen, James P. Giangola, & Jennifer Balogh, Voice User

(69)

Other Resources

•  The VoiceXML Guide

(70)

Tutorials and Articles

•  VoiceXML Forum
–  http://www.voicexmlforum.org/

•  VoiceXML Review
–  http://www.voicexmlreview.org/

•  World of VoiceXML
–  http://www.kenrehor.com/voicexml/

(71)

Online Voice SDKs

Name                              URL
BeVocal Cafe                      http://cafe.bevocal.com
Tellme Studio                     http://studio.tellme.com
VoiceGenie Developer Workshop     http://developer.voicegenie.com

(72)

Questions?

(73)
(74)

Answer to Exercise 2

<form>
  <block>
    <prompt> When were you born? </prompt>
  </block>
  <field name="month">
    <prompt> What month? </prompt>
    <grammar src="http://www.ajax.com/month.grxml"/>
  </field>
  <field name="day">
    <prompt> What day of the month? </prompt>
    <grammar src="http://www.ajax.com/day.grxml"/>
  </field>
  <field name="year">
    <prompt> What year? </prompt>
    <grammar src="http://www.ajax.com/year.grxml"/>
  </field>
</form>

(75)

Answer to Exercise 3

Write event handlers for the month field

<catch event="help">
  <prompt> In what month were you born? </prompt>
</catch>
<catch event="nomatch">
  <prompt> Which month, for example, January, February, or March? </prompt>
</catch>
<catch event="noinput">
  <prompt> Say the name of the month you were born in </prompt>
</catch>

(76)

Answer to Exercise 4

<prompt>
  Welcome to Ajax Bank
  <break/>
  <emphasis level="reduced"> do you want to </emphasis>
  <emphasis level="strong"> withdraw </emphasis>
  <break/>
  or
  <emphasis level="strong"> deposit </emphasis>
  funds?
</prompt>

(77)

Answer to Exercise 5

Write a grammar for zero to nineteen

<grammar type="application/srgs+xml" root="zero_to_19" mode="voice">
  <rule id="zero_to_19">
    <one-of>
      <item> zero </item>
      <item> <ruleref uri="#single_digit"/> </item>
      <item> ten </item> <item> eleven </item> <item> twelve </item>
      <item> thirteen </item> <item> fourteen </item> <item> fifteen </item>
      <item> sixteen </item> <item> seventeen </item> <item> eighteen </item>
      <item> nineteen </item>
    </one-of>
  </rule>
  <rule id="single_digit">
    <one-of>
      <item> one </item> <item> two </item> <item> three </item>
      <item> four </item> <item> five </item> <item> six </item>
      <item> seven </item> <item> eight </item> <item> nine </item>
    </one-of>
  </rule>
</grammar>

(78)

Answer to Exercise 6

•  Utterance: "From savings to checking"

•  Grammar rule:

<rule id="transfer">
  from
  <one-of>
    <item> savings <tag> out.source_account = "savings"; </tag> </item>
    <item> checking <tag> out.source_account = "checking"; </tag> </item>
  </one-of>
  to
  <one-of>
    <item> savings <tag> out.target_account = "savings"; </tag> </item>
    <item> checking <tag> out.target_account = "checking"; </tag> </item>
  </one-of>
</rule>

•  ECMAScript structure:

transfer: {
  source_account: "savings",
  target_account: "checking"
}
