RealSpeak Telecom Software Development Kit

(1)

(2)

1. Grant of Rights

In consideration of a possible commercial relationship, ScanSoft hereby grants to you, the LICENSEE, who accepts, a non-exclusive right to internally evaluate and test the software program(???theSoftware???).

2. Ownership of Software

ScanSoft retains title, interests and ownership of the Software recorded on the original disk(s) and all subsequent copies of the Software and Documentation, regardless of the form or media in or on which the original and other copies may exist. ScanSoft reserves all rights not expressly granted to LICENSEE.

3. Copy Restrictions

This Software and the accompanying documentation are copyrighted. Unauthorized copying of the Software, including Software that has been merged or included with other software, or of the documentation is expressly forbidden. LICENSEE may be held legally responsible for any intellectual property infringement that is caused or encouraged by his failure to abide by the terms of this agreement. LICENSEE is allowed to make two (2) copies of the Software solely for backup purposes, provided that the copyright notice is included on the backup copy. 4. Use Restrictions

LICENSEE agrees not to use the Software for any other purpose than internally evaluating the Software. LICENSEE may physically transfer the Software from one computer to another, provided that the Software is used on only one computer at a time. LICENSEE may not modify, adapt, translate, reverse engineer, decompile, disassemble or create derivative works based on the Software.LICENSEE may not modify, adapt, translate or create derivative works based on the documentation provided by ScanSoft. The Software may not be transferred to anyone without the prior written consent of ScanSoft. In no event may LICENSEE transfer, assign, lease, sell or otherwise dispose of the Software and Documentation on a temporary or permanent basis except as expressly provided herein.

5. Warranty

THE SOFTWARE IS PROVIDED ???AS IS???WITHOUT WARRANTY OF ANY KIND,EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. ScanSoft shall have no liability to LICENSEE or any third party for any claim, loss or damage of any kind, including but not limited to lost profits, punitive, incidental, consequential or special damages, arising out of or in connection with the use or performance of the Software and accompanying documentation.

6. Termination

This agreement is effective until terminated. ScanSoft reserves the right to terminate this agreement automatically if any provision of this agreement is violated. LICENSEE may terminate this agreement by returning the Software and the accompanying documentation to ScanSoft, along with a written warranty stating that all copies have been returned.

Copyright

(C) ScanSoft, Inc RealSpeak Telecom Software Development Kit

User???sGuideforNetherlands Dutch V4.0 ?? - January 2005 - December 2005

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording or any information retrieval system, without the written permission of ScanSoft.

(3)

(4)

December 2005

Upda

t

e

???SSML

Pr

e

pr

oc

e

s

or

???

c

ha

pt

e

r

a

nd

???Us

i

ng

Cont

r

ol

Se

que

nc

e

s

???

s

e

c

t

i

on

of

Chapter I

(5)

Using Control Sequences ...10

Quick Reference of the Control Sequences...11

Speech Rates in Words per Minute for Netherlands Dutch Voices...14

Entering phonetic input...15

How to proceed ...15

Lexical stress and sentence accents in phonetic input...17

The Netherlands Dutch L&H+ and UNIPA Phonetic Alphabets ...19

Using a User Dictionary...21

Creating an input text file...21

Using the Microsoft SAPI5 Lexicon...22

User Lexicons ...23

Application Lexicons ...23

The Netherlands Dutch SAPI5 Phoneme List ...24

Notes on the Netherlands Dutch Text-To-Speech System ...27

Cardinal Numbers ...27

Decimal Numbers ...27

Ordinal Numbers...28

Telephone Numbers ...28

Bank Account Numbers & Visa Numbers ...28

Dates ...30

Time Indications ...31

Currencies ...32

Alphanumeric strings ...32

Abbreviations and Acronyms...33

E-MAIL PREPROCESSOR... 35

Introduction ...35

E-Mail Header Processing ...36

Header Field Extraction ...36

Header Field Reading ...38

(6)

Introduction ...54

APPENDIX ... 56

(7)

Chapter I

(8)

Chapter I

Netherlands Dutch Text-To-Speech System

User???

s

Gui

def o r

N e t h e r l a n d s D u t c h

(9)

Netherlands Dutch

Text-To-Speech System

Introduction

This section provides operational instructions for the ScanSoft Text-To-Speech system for Netherlands Dutch. It reviews the functionality of the system, and describes how the user can customize the pronunciation of input texts. This part also describes issues that are particular to the Netherlands Dutch Text-To-Speech system. It introduces the Netherlands Dutch phonetic alphabet and discusses some language-specific features of the Netherlands Dutch Text-To-Speech system.

Preparing a text for Text-To-Speech

In general, there are four ways to intervene in the pronunciation of text:

??? By using control sequences

??? By entering phonetic input

??? By using a user dictionary or a user ruleset

??? Byusing oneofthesupported API???s

These mechanisms are described in the Programmer???sGuide.

In this part, however, the specifications for Netherlands Dutch are fully described.

Native Character Set

The native character set of the Belgian Dutch TTS system is Windows-1252; it has the printable characters in the ASCII range

(10)

1-Using Control Sequences

For a description of the various supported markup languages

(independent from the language), refer to theProgrammer's Guide.

Remark:<ESC> representsthe escape character???\x1B???

(decimal 27) that generates the ASCII character 27 (Hex 1B).

Below, you find a quick reference table for the RealSpeak native control sequences for Netherlands Dutch. The language-specific

supportfortheSSML markup languageisdescribed in the???SSML Preprocessor??? chapter.

(11)

Netherlands Dutch

Sequence Description Range Default Delimiter

Volume (x : 0 .. 100) 0 = silence_{10 = low} 100 = high 80 No <ESC> \vol=x\ For example:

<ESC>\vol=10\ Ik kan stil spreken, <ESC>\vol=90\ maar ook zeer luid. Speech Rate (x : 1 .. 100) 10 = slow 100 = fast 50 No <ESC> \rate=x\ For example:

Ik kan <ESC>\rate=70\ snel spreken, <ESC>\rate=20\ maar ook zeer traag.

Words per minute (xxx: 1..1000)

Voice-specific (see

subsequent table) Voice-specific No <ESC>

\rate_wpm

=xxx\ _{For example:}

Ik kan <ESC>\rate_wpm=350\ snel spreken, <ESC>\rate_wpm=110\ maar ook zeer traag. Read mode; some

read modes are not supported in e-mail mode x = 0..3: 0 = character-by-character 1 = word-by-word (not supported in e-mail mode) 2 = sentence-by-sentence 3 = line-by-line (not supported in e-mail mode) 2 Yes <ESC>Mx For example: <ESC>M0 Demonstrator

(12)

Part-of-Speech N = noun J = adjective V = verb

No For example:

<ESC>@N kantelen (the noun 'kantelen') vs.

<ESC>@V kantelen (the verb 'kantelen')

Dat is een <ESC>@J voornaam man. (adjective) vs.

Zijn <ESC>@N voornaam is Jan. (noun) <ESC>@x

Comment:

You can also use the control sequence <ESC>@@, which allows you to get the alternative pronunciation of a homographwithout

specifying the part-of-speech. This is especially useful for words that have different pronunciations based on meaning rather than on part-of-speech.

For example:

voorkomen (prevent) vs.

<ESC>@@ voorkomen (appear)

appel

(apple)

vs.

<ESC>@@ appel

(appeal)

Long Pause 1 ..65535 msec No <ESC>

\Pause=xx

x\ For example:_{Je kan zelf de duur <ESC>\pause=1500\ van de pauze bepalen.}

Sentence Accent No

<ESC>"

For example:

<ESC>"Hans komt morgen. (niet Peter) Hans komt <ESC>"morgen. (niet vandaag) Note:

Manually inserted sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested sentence accent, and thus not realize it.

Continuation No

<ESC>C

For example:

Prinses Astridlaan 15. Kamer 66.

Prinses Astridlaan 15. <ESC>C Kamer 66.

In the first of the above examples, the text-to-speech system will detect an end-of-sentence after15. and hence pause before pronouncing the second part of the input. In order to make the

(13)

For example:

Dit is het eerste deel van deze zin <ESC>E en dit is het tweede. In the above example, the sequence <ESC>E forces the system to read the entire input as two separate sentences.

Phonetic Input (L&H+ phonetic alphabet) No <ESC>/+ For Example:

<ESC>/+???h^&ys<ESC>/+

Preprocessing

Mode text = standard textmode email = e-mail mode

Yes <ESC>%x

For example:

<ESC>%text U hebt een boodschap van Karen Donkers. <ESC>%email From: " Karen Donkers "

<[email protected]> Guide text

normalization; not all modes are supported in e-mail mode s = string: address=address mode (not supported in e-mail mode) normal=standard mode spell=spell mode normal No <ESC> \tn=s\ For example:

<ESC>\tn=address\Prof. Leen Janssens, Vrijdagsmarkt 236a, 1000 AB Amsterdam<ESC>\tn=normal\

Prof. Leen Janssens, Vrijdagsmarkt 236a, 1000 AB Amsterdam

Reset to Default Yes

<ESC>F

For example:

<ESC>\vol=90\ Dit is het maximum volume. <ESC>F Nu hoort u het standaardvolume.

<ESC>\rate=10\ Dit is het traagste spreektempo. <ESC>F Nu hoort u het standaard spreektempo.

(14)

<ESC>\au

dio=???s???\ Insert an audiofile; not supported in e-mail mode s = string: theURIof a document with an appropriate MIME type Yes

Speech Rates in Words per Minute for Netherlands Dutch Voices

Words per minute

Voice Range Default

Laura Min = 78 Max = 392

157

(15)

Entering phonetic input

How to proceed

To switch from orthographic to phonetic mode, insert <ESC>/+ to use the L&H+ phonetic alphabet. The phonetic input mode remains active until the command is explicitly reset by entering <ESC>/+ again.

The phonetic input string is composed of symbols of the L&H+ phonetic alphabet (see phonetic table below). Examples are given below in the phonetic table.

In addition to the phonetic symbols, it is advised to use the following characters in the phonetic input string:

(16)

Special characters L&H +

Symbol Meaning As in:

' (ASCII 39, Hex 27)

primary word stress <ESC>/+ 'be.d$.l$n <ESC>/+

(theverb ???bedelen???)

vs.

<ESC>/+ b$.'de.l$n <ESC>/+ (the verb 'bedelen')

'2 secondary word

stress <ESC>/+ 'h^&ys.'2ka.m$r<ESC>/+ (huiskamer)

" (ASCII 34, Hex 22)

sentence accent <ESC>/+$r_zE&in_"tVe_"klEm .to.n$n_?In_ de.z$_'zIn

<ESC>/+

(Er zijn TWEE KLEMTONEN in deze zin)

. syllable boundary <ESC>/+ 'lE.t$r.Grep <ESC>/+

(lettergreep) # silence (pause) <ESC>/+

?Ik_'zE&i#???du_dIt_nit

<ESC>/+

(Ik zei:???Doeditniet.???)

Note that the use of punctuation marks remains useful within phonetic input to assure a correct intonation. Each punctuation mark needs to be preceded by an asterisk.

For example:

<ESC>/+ ja *,

'zE&i_'GInK.$n_"GIs.t$.r$n_'vrux_'nar_"h^&ys *. <ESC>/+ e

(Ja, zij ging gisteren vroeg naar huis.)

Punctuation Marks

L&H+ Symbol Meaning

- Word delimiter *. End of declarative *, Comma *! End of exclamation *? End of question *; Semicolon *: Colon

(17)

In phonetic input strings, lexical stress and sentence accents can be

manuallyindicated bytheuser,byusing asinglequote(???)ordouble quote(???)respectively.

Note that manually inserted lexical stress or sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested stress/accent and thus not realize it.

1. The Text-To-Speech system will automatically convert all lexical stress marks into sentence accents in case no manually added sentence accents are found in the phonetic input string.

For Example:

<ESC>/+ 'mE&in_'brur_'het_'jo.hAn*. <ESC>/+

(Mijn broer heet Johan.)

(All lexical stress marks are converted into sentence accents.)

2. If phonetic input contains at least one manually added

sentence accent, no additional sentence accents are

assigned by the Text-To-Speech system. Therefore, only those words marked with " will get a sentence accent. As a

consequence, a message containing only one manual sentence accent will have an almost flat intonation on all the other words.

For Example:

<ESC>/+ 'mE&in_'brur_'het_"jo.hAn*. <ESC>/+

(18)

3. Phonetic input can also be combined with orthographic input. If no sentence accents are found in the input text (indicated by <ESC>" in orthographic input, or by " in phonetic input), the Text-To-Speech system will automatically assign sentence accents. In the orthographic part of the input, the Text-To-Speech system will realize these sentence accents on the basis of part-of-speech and syntactic information. In the phonetic part of the input, all lexical stress marks (if any) will be converted into sentence accents. If there are no lexical stress marks, no sentence accent will be realized for the phonetic part of the input (see point 1 above).

If the user has manually specified one or more sentence accents, no additional sentence accents will be realized (see point 2 above).

Example:

Als het mooi weer is, vertrekken we naar <ESC>/+ nu.'jOrk <ESC>/+

(No sentence accents are found; the Text-To-Speech system will automatically assign sentence accents.)

Als het mooi weer is, vertrekken we naar <ESC>/+ nu."jOrk <ESC>/+

(A sentence accent is specified in the phonetic part of the input text. No additional sentence accents will be realized.)

<ESC>"Als het mooi weer is, vertrekken we naar <ESC>/+ nu.'jOrk <ESC>/+

(A sentence accent is specified in the orthographic part of the input text. No additional sentence accents will be realized.)

<ESC>"Als het mooi weer is, vertrekken we naar <ESC>/+ nu."jOrk <ESC>/+

(Two sentence accents were specified; no additional sentence accents will be realized.)

(19)

L&H+

Symbol TranscriptionL&H+ UNIPASymbol TranscriptionUNIPA As in:

A ???kAn A ???kAn kan a ???strat a ???strat straat E ???bEn E ???bEn ben e ???kel e ???kel keel I ???kInt I ???kInt kind i ???nit i ???nit niet O ???Op O ???Op op o ???ros o ???ros roos u ???buk u ???buk boek $ ???d$ $ ???d$ de y ???yr y ???yr uur ^ ???n^t ^ ???n^t nut e+ ???de+r e= ???de=r deur E&i ???rE&is E+i ???rE+is reis ^&y ???h^&ys ^+y ???h^+ys huis A&u ???kA&u A+u ???kA+u kou

Vowels in French loan words

E: pri.'mE:r E: pri.'mE:r primair A%~ A%~.gA.Z$.'mE

nt A%~ A%~.gA.Z$.'mEnt engagement E%~ lE%~.Z$.'ri E%~ lE%~.Z$.'ri lingerie O%~ bO%~.'bO%~ O%~ bO%~.'bO%~ bonbon E+%~ pAr.'fE+%~ E=%~ pAr.'fE=%~ parfum E+ 'E+.vr$ E= 'E=.vr$ oeuvre

(20)

Consonants L&H+

Symbol TranscriptionL&H+ UNIPASymbol TranscriptionUNIPA As in:

p 'p^nt p 'p^nt punt

b 'bErx b 'bErx berg

t 'tal t 'tal taal

d 'dAx d 'dAx dag

k 'kel k 'kel keel

f 'fA&ut f 'fA+ut fout v 'le.v$n v 'le.v$n leven s 'sA&us s 'sA+us saus z 'la.z$n z 'la.z$n lazen S 'Si.na S 'Si.na China Z Zur.'nal Z Zur.'nal journaal

x 'lAx x 'lAx lach

G 're.G$n G 're.G$n regen g 'zAg.duk g 'zAg.duk zakdoek

V 'VAxt V 'VAxt wacht

w 'niw w 'niw nieuw

l 'lIxt l 'lIxt licht

r 'rar r 'rar raar

j 'jar j 'jar jaar

h 'hAnt h 'hAnt hand

m 'mAt m 'mAt mat

n 'nojt n 'nojt nooit

nK 'rInK nK 'rInK ring

n~ 'n~o n~ 'n~o njo

? (glottal

(21)

???

NOTES

Note that the L&H+ alphabet is not SSML compliant. For SSML, use the UNIPA alphabet.

Using a User Dictionary

Creating a user dictionary can be done in two ways:

??? Creating an input text file and generate a binary file by using the UDE (User Dictionary Editor)

??? Creating the actual dictionary directly by using UDE Creating an input text file

For general instructions on the creation of an input text file, refer to

Part IV: Text-To-Speech System Reference of the User's Guide and Programmer's Reference.

Following is an example of what the input file for a dictionary might look like: [Header] Language = DUN [SubHeader] Content = EDCT_CONTENT_BROAD_NARROWS Representation = EDCT_REPR_SZZ_STRING

(22)

Content=EDCT_CONTENT_ORTHOGRAPHIC Representation=EDCT_REPR_WSZ_STRING

[Data]

Info Informatie

IT "Informatie Technologie" DLL "Dynamic Link Library"

Afr Afrika TEL telefoonnummer Dhr de heer Enz Enzovoort

???

NOTE

For the actual creation of a user dictionary by UDE, refer to on-line help of UDE (User Dictionary Editor)

Using the Microsoft SAPI5 Lexicon

Microsoft SAPI5 provides lexicons so that users and applications can specify pronunciation and part-of-speech information for particular words. As such, all SAPI compliant Text-To-Speech engines should use these lexicons to guarantee uniformity of pronunciation and part of speech information.

There are two types of lexicons in SAPI: user lexicons and application lexicons.

(23)

Each user who logs in to a computer will have a User Lexicon. Initially, this lexicon is empty; words can be added either programmatically, or by using an engine's add/remove words UI component (for example, the sample application Dictation Pad provides an Add/Remove Words dialog).

Application Lexicons

Applications can create and ship their own lexicons of specialized words. These lexicons are fixed and cannot be edited.

Detailed information on how to use the MS SAPI5 lexicons can be

found in themanual???MicrosoftSpeech SDK V5.1???,chapter ???ISpLexicon Interface???.

(24)

To add entries to the lexicon, the user should use a set of language specific phonemes. The language specific phoneme list for Netherlands Dutch is given below.

SAPI5 Symbols SAPI5

Symbol As in: SAPI5 Transcription SAPIPhone ID

AA kan S1 K AA N 0251 A straat S1 S T RR A T 0061 EH ben S1 B EH N 025B E keel S1 K E L 0065 IH kind S1 K IH N T 026A I niet S1 N I T 0069 AO op S1 AO P 0254 O roos S1 RR O S 006F U boek S1 B U K 0075 AX de S1 D AX 0259 Y uur S1 Y RR 0079 AH nut S1 N AH T 028C EU deur S1 D EU RR 00F8 EH + I reis S1 RR EH + I S 025B 0361 0069 AH + Y huis S1 H AH + Y S 028C 0361 0079 AA + U kou S1 K AA + U 0251 0361 0075 EH lng primair P RR I . S1 M EH lng RR 025B 02D0 AA nas croissant K RR W AA . S1 S AA nas 0251

0303 EH nas lingerie L EH nas . ZH AX . S1 RR I 025B 0303 AO nas bonbon B AO nas . S1 B AO nas 0254 0303 OE nas parfum P AA RR - S1 F OE nas 0153 0303

(25)

P punt S1 P AH N T 0070 B berg S1 B EH RR X 0062 T taal S1 T A L 0074 D dag S1 D AA X 0064 K keel S1 K E L 006B F fout S1 F AA + U T 0066 V leven S1 L E . V AX N 0076 S saus S1 S AA + U S 0073 Z lazen S1 L A . Z AX N 007A SH China S1 SH I . N A 0283 ZH journaal ZH U RR . S1 N A L 0292 X lach S1 L AA X 0078 GH regen S1 RR E . GH AX N 0263 G zakdoek S1 Z AA G . D U K 0067 VA wacht S1 VA AA X T 028B W nieuw S1 N I W 0077 L licht S1 L IH X T 006C RR raar S1 RR A RR 0072 J jaar S1 J A RR 006A H hand S1 H AA N T 0068 M mat S1 M AA T 006D N nooit S1 N O J T 006E NG ring S1 RR IH NG 014B GT geaarzel d AX L TGH AX . S1 GT A RR . Z 0294

(26)

SAPI5 Symbols

SAPI5 Symbol Meaning SAPI Phone ID

S1 primary stress 02C8

S2 secondary stress 02CC

. syllable break 002E

_& word boundary 0002

_s silence 0004

_. sentence terminator (period) 2198 _, sentence terminator (comma) 0003

(27)

Notes on the Netherlands Dutch Text-To-Speech

System

The Netherlands Dutch Text-To-Speech system is designed in order to pronounce correctly any input written according to the rules of Netherlands Dutch orthography. The following cases, however, require special attention.

Cardinal Numbers

Cardinal numbers up to 15 digits are pronounced as full numbers. If periods are used to separate groups of digits, numerals up to 15 digits are correctly pronounced.

For example: 12500 12500000 12.500 12.500.000

Decimal Numbers

Decimal numbers may consist of up to 15 digits before and up to 15 digits after the comma.

The digits before the comma are pronounced as full numbers. The digit string after the comma is spelled, except if it is a 2-digit string. A 2-digit string is pronounced as a full number.

Periods may be used to separate groups of digits in the string before the comma

(28)

Ordinal Numbers

Cardinal numbers up to 15 digits, followed by???de???or???ste???are

pronounced as ordinal numbers.

For example:

1ste 14de

Telephone Numbers

In order to ensure a correct pronunciation of telephone numbers, it is recommended to use parentheses, a slash (/) or a hyphen (-) to separate country code and/or area code from the remainder of the telephone number. Also, you may use periods or spaces to separate groups of digits. Telephone numbers written in this format will always be pronounced in groups of two or three digits, with a pause at the place of the slash, the hyphen, the period or the space.

For example: 030-2886527 (030)2886527 0031-30-288.65.27 (+32)(52)2266777 05780-12401

Telephone numbers may also be preceded by the abbreviation "tel.".

For example:

tel. (057)2216077

Bank Account Numbers & Visa Numbers

Bank account numbers of following formats are pronounced correctly:

Netherlands bank account numbers: ddd-ddddddd-dd Dutch bank account numbers: dd.dd.dd.ddd or dd dd dd ddd

(29)

390-0120123-78 02 05 02 009 12 55 55 985

To have other bank account numbers and social security numbers correctly pronounced, use dots between groups of digits. The number will be pronounced in groups of 2 or 3 digits.

For example:

12.34.56.78

To have a bank account number pronounced digit by digit, switch to spell mode (<ESC>\tn=spell\).

(30)

Dates

The Netherlands Dutch Text-To-Speech system reads dates in the following formats:

with slashes:

Day(1 or 2 digits)

/

Month(1 or 2 digits)

/

Year(2 or 4 digits) with periods:

Day(1 or 2 digits).Month(1 or 2 digits).Year(2 or 4 digits) with hyphens:

Day(1 or 2 digits)-Month(1 or 2 digits)-Year(2 or 4 digits)

For example:

01/01/2001 7.5.94 31-12-201

If the year is indicated by 2 digits, these digits may be preceded by a quote.

For example: 01/02/???99

The digit string that indicates the month is replaced by the name of that month.

For example:

21/06/00 is pronounced: '21 juni 2000 '

Dates can also be specified in written format:

For example:

1 januari 2005

(31)

Time Indications

Time indications will be correctly pronounced when written in one of the following formats: (u / h / u. / h. / uur)

For example: 15 uur 7 h 10:05u. 10.15h 10u45 10h50 10:25 PM

(32)

Currencies

The Netherlands Dutch Text-To-Speech system handles the most common currency indications. Amounts up to 15 digits will be correctly pronounced

The Netherlands Dutch Text-To-Speech system correctly handles the currency symbols $, ??,???, DMand ??. The currency indications may precede or follow the numeral.

For example:

100 fr. EUR 200.000 $ 150

?? 70

Decimal currency amounts up to 15 digits before the comma will be correctly pronounced. The digit string after the comma is spelled, except if it is a 2-digit string. A 2-digit string is pronounced as a full number.

For example:

USD 950633,50

Alphanumeric strings

An alphanumeric string is a combination of letters and digits (without spaces). The alphabetic part of the alphanumeric string will be spelled; the numeric part will be read as a full number.

For example:

KX4956 125DC20

(33)

Abbreviations and Acronyms

The Netherlands Dutch Text-To-Speech system contains a dictionary with the most common abbreviations such as:

For example:

bv. bijvo orbe eld

dr. dokte r

d.m.v . door midd el v an m.a.w . met a nder e wo orde n

Abbreviations are case-insensitive: uppercase and lowercase are both accepted. If the abbreviation requires a punctuation mark, make sure the punctuation mark is written. If not, the abbreviation will not be found in the dictionary and will consequently be mispronounced. Abbreviations that are NOT in the dictionary:

??? will be spelled if they consist of consonants only

??? will be spoken as full words if the abbreviation contains one or more vowels

Alphabetical strings (with or without periods) consisting of 2 or more uppercase consonants are considered to be acronyms and are spelled: e.g. TV, BTW, B.G.J.G.

The Netherlands Dutch Text-To-Speech system also knows the most common acronyms such as:

For example:

NASA (is pronounced as a single word) EEG (is spelled)

(34)

Chapter II

E-Mail Preprocessor

User???

s

Gui

def o r N e t h e r l a n d s

D u t c h

(35)

E-Mail Preprocessor

Introduction

The ScanSoft e-mail preprocessor (EMPP) is developed to analyze a specific type of text, namely e-mail messages. E-mail messages differ from any average type of text in both their structure and contents. An e-mail message consists of two clearly distinguishable parts: the header and the body. A substantial part of the header contains routing and administrative information, which is irrelevant to the user. Both the header and the body contain all kinds of e-mail specific text features, e.g. e-mail addresses, emoticons such as smileys, etc. Furthermore, informal writing is often combined with a lack of grammatical conventions. Spelling rules are frequently violated, punctuation is often omitted, etc.

Although the standard ScanSoft Text-To-Speech system can handle special text items (abbreviations, numbers, dates, etc.), it is not capable of correctly handling all e-mail specific text features. These text features are therefore dealt with by the e-mail preprocessor. The EMPP transforms e-mail specific information into a format that complies with the rules of the standard ScanSoft Text-To-Speech system. The EMPP is a plug-in preprocessing module of the ScanSoft Text-To-Speech system. It replaces the preprocessor of the standard Text-To-Speech system.

In the following sections you will find a description of the functioning of the ScanSoft e-mail preprocessor as well as an overview of its features.

(36)

smaller text units, called text-to-speech messages, which are synthesized by the Text-To-Speech system. Text normalization is applied to e-mail specific text features such as e-mail addresses, proper names, emoticons, URLs (Universal Resource Locators), etc. For the text normalization of an e-mail message, the ScanSoft EMPP applies linguistic rules and performs dictionary look-up, in order to yield an adequate phonetic transcription. The EMPP also supports the ScanSoft user dictionary mechanism, which allows the user to customize the output of the e-mail processing.

E-Mail Header Processing

Header Field Extraction

An e-mail message consists of two clearly distinguishable parts: the header and the body. The EMPP detects the header and extracts the relevant header fields. Information that is of no interest to the user (such as routing information) is not retained.

The EMPP extracts the following header fields:

From Field Containsthesender???snameand/oraddress

Date Field Contains the date and time of sending

Subject Field Optionally contains the subject of the e-mail The extraction of the header fields is based on the detection of specific keywords in the e-mail header. The supported keywords for the extraction of the header fields are listed below:

From Field From: Author: Sender: De: Von: Date Field Date:

Enviado: Gesendet: Subject Field: Subject:

Subj: Asunto: Betreff:

(37)

header holds information that is irrelevant to the user. After extraction of date, sender and subject, the processed header merely mentions the Date field, the From field and the Subject field.:

Original header:

From [email protected] Fri Jul 04 14:22:16 2000 Path: news.be.innet.net!INbe.net!

From: Rik <[email protected]> Newsgroups: nl.beurs

Subject: Re: langlopende opties Date: 30 Jun 2000 21:59:01 GMT Organization: EuroNet Internet

Message-ID: <[email protected]>

References: <01bc8563$3b9f8da0$12020059@ALEXANDER> NNTP-Posting-Host: p302.asd.euronet.nl

Mime-Version: 1.0

Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit

X-Mailer: Mozilla 3.01Gold (Win95; I) To: Joost <[email protected]> Lines: 81

Xref: news.be.innet.net nl.beurs:5049

Extracted header fields:

From: Rik <[email protected]> Subject: Re: langlopende opties Date: 30 Jun 2000 21:59:01 GMT

(38)

Header Field Reading

After the header fields have been extracted, they are processed by the EMPP. The header field keywords (see above) are replaced by an introductory message. The remainder of the header fields is processed by the EMPP in order to allow the Text-To-Speech system to

intelligently read the fields. From Field

TheFromfield keyword is replaced by the introductory message

???Bericht van:???.

For example:

From: Peter Devries

is pronounced:

Bericht van: Peter Devries

The remainder of theFromfield is further processed by the EMPP. The EMPP supportsFromfields that either consist of

a) a proper name

b) a proper name and an address c) an address

a) - b) In case theFromfield contains a proper name, this name and only this name is sent to the Text-To-Speech system. This means that if both a name and an address are found in theFromfield, the address will not be read by the Text-To-Speech system.

(39)

From: Fred Smekens

From: "Anne-Marie Klaassen" <Klaassen>@ALPHA.ee.upenn.edu> From: Jan Mast at IepXchgPO

From: [email protected] (Tom Peters)

are pronounced:

Bericht van : Fred Smekens

Bericht van : Anne-Marie Klaassen Bericht van : Jan Mast

Bericht van : Tom Peters

c) In case theFromfield contains only an address, the EMPP extracts the name out of the address and expands the domain that is contained in the address. In other words, the e-mail address is not read literally. For example: Author: [email protected] Author: sni.be!daalman Author: [email protected] are pronounced:

Bericht van: erik b at ScanSoft dot com Bericht van: daalman at Siemens Nixdorf, Belgi??

(40)

TheDatefield keyword is replaced by the introductory message

??????Datum:???

TheDatefield contains the date and time of sending. The EMPP supports multiple date and time formats, which are transformed into a uniform format that complies with the rules for date and time indications of the ScanSoft Text-To-Speech system. The EMPP only pronounces the date.

The EMPP supports dates in the following formats:

For example:

Date: 01/18/2041

Date: Tue, 7 Dec 1995 7:00 Date: 13 Mar 2000 8:45

are pronounced:

Datum: 18 januari 2041

Datum: dinsdag, 7 december 1995 Datum: 13 maart 2000

Subject Field

TheSubjectfield keyword is replaced by the introductory message

???Betreft:???

TheSubjectfield can contain all kinds of data, but may also be empty. The EMPP searches for keywords that are typical for the subject field (e.g. RE, FYI, FW).

For example:

Subject: RE: Kom je langs ?

Subject: FW: Ik heb het echt nodig !!!! Subject: FYI : Resultaten - april 2000.

are pronounced:

Betreft: Antwoord op: Kom je langs ?

Betreft: Doorgestuurd : Ik heb het echt nodig !!!!

Betreft: Ter informatie : Resultaten ???april 2000

(41)

E-Mail body processing

Message Extraction

The e-mail preprocessor splits the body of the e-mail message into text-to-speech messages. This is done on the basis of a number of criteria, such as punctuation, capitalization, layout, intelligent abbreviation handling, etc.

The following examples illustrate some criteria for splitting the e-mail text into text-to-speech messages:

??? Using sentence final punctuation and capital letters

Als u hierover informatie wenst, neem dan contact op met An Smekens. U kunt haar bereiken op het volgende adres :

[email protected]. Bedankt !

??? Using layout

Programma van deze vergadering :

1) Nieuwe collega???s

2) Reorganisatie commerci??le afdeling 3) Planning december

??? Using intelligent abbreviation handling

Voor meer informatie kan u zich wenden tot dr. A.C. Stienstra.

(42)

Text Normalization

An e-mail message typically contains e-mail specific text features, such as e-mail addresses, URLs, file names, emoticons, etc. The EMPP transforms these e-mail specific features into a format that complies with the rules of the standard text normalization of the ScanSoft Text-To-Speech system.

The following are examples of e-mail specific text normalization:

??? Support for multiple e-mail address formats

[email protected]@ucunix.san.uc .edu

Anna Blom at IepXchgPO

???Kris.Heemskerk???

<[email protected]>

??? Support for URLs (Universal Resource Locators)

http://www.scansoft.com

http://wpnet.com/timl/mrc/mrc.html gopher://tecno.com/11

??? Support for file names

ldb001.tse sysinfo.exe lipedu.xls ??? Processing of emoticons :-) is pronounced ha ha :-x smak ;-) knipoog-smiley

??? Processing of overuse of punctuation

OPGELET!!!!!!!!:VIRUS OP HET NET!!!!!!!! Ik heb nog een probleem ! #%&#@$<! Kan jij me helpen ?

becomes:

OPGELET! VIRUS OP HET NET!

(43)

Q. Mijn e-mail adres is veranderd. Hoe kan ik ervoor zorgen dat ik nog steeds de nieuwsbrief van Eco ontvang ?

A. Geen probleem ! Stuur gewoon een mail naar het volgende adres

:[email protected] en vermeld zowel je oude als je nieuwe adres

becomes:

Vraag :

Mijn e-mail adres is veranderd.

Hoe kan ik ervoor zorgen dat ik nog steeds de nieuwsbrief van Eco ontvang ?

Antwoord :

Geen probleem ! Stuur gewoon een mail naar het volgende adres :[email protected] en vermeld zowel je oude als je nieuwe adres

??? Processing of inserted mail

Paul> Wel, ik kocht net een nieuwe CD speler en

Paul>ik ben toch wat teleur gesteld. De klank is

Paul>prima, maar met de volumeregeling is er

Paul>iets mis. Ik begrijp er niets van! Sarah> Breng hem terug. Je kan hem vast Sarah> wel ruilen.

becomes:

Paul:

Wel, ik kocht net een nieuwe CD speler en ik ben toch wat teleur gesteld . De klank

(44)

The ScanSoft E-mail Preprocessor for Netherlands Dutch has a special function for the detection of accentuated characters???even when the user has dropped the accent, the system will pronounce the word correctly.

For example:

prive

English words

Since e-mail is an international medium, Netherlands Dutch e-mail messages will inevitably contain many English words, that might refer to Internet, electronic mail or soft- and hardware. The typical e-mail jargon is handled by the exceptions dictionary of the e-mail

preprocessor. This dictionary is a lexicon for e-mail terminology and provides the Text-To-Speech system with an adequate Netherlands Dutch transcription or translation for a number of English words.

For example:

attac hmen t /+ E .'tE t&S. m$nt

banne r /+ 'b E.n$ r

forwa rded /+ 'f Or.w $r.d $t

frame /+ 'f rem

(45)

Customizing the E-Mail Preprocessor

The e-mail preprocessor supports the standard ScanSoft Text-To-Speech SDK user dictionary mechanism, which allows the user to customize the output of the e-mail preprocessor. The user dictionary is consulted both during the header processing and the body

processing.

For more information on how to build and use user dictionaries, see

Appendix C: User Dictionariesof theProgrammer???s Guide.

Customizing the E-Mail Header

The user dictionary is consulted during the header processing while reading theFromfield and theSubjectfield.

From Field

TheFromfield either consists of a) a proper name

b) a proper name and an address c) an address

a) In case theFromfield contains only a proper name, the name is passed to the user dictionary. If the lookup is successful, the proper name is substituted by the replacement string. If not, the name is further processed by the header reading module.

For example:

If the user dictionary contains the following line:

(46)

b) In case theFromfield contains a proper name and an address, the EMPP first passes the address to the user dictionary. If the lookup is successful, both the proper name and the address are substituted by the replacement string. If not, the EMPP passes the proper name to the user dictionary. If this lookup is successful, the name and the address are substituted by the replacement string. If not, the name is further processed by the header reading module. The address will not be read by the Text-To-Speech system.

For example:

If the user dictionary contains the following lines:

[email protected] Peter, mijn collega

Franck Mijn beste vriend

Schrijver /+'sxrE&i.v$r

the followingFromfields:

From: [email protected] (P. Brink) Author: [email protected] (Franck) From: "Hans De Schrijver"

<[email protected]>

become:

Bericht van : Peter, mijn collega Bericht van : Mijn beste vriend Bericht van : Hans De

(47)

c) In case theFromfield contains only an address, the complete address is looked up in the user dictionary. If the lookup is successful, a proper name is added to theFromfield. If not, only the domain part is sent to the user dictionary. The EMPP first calls the dictionary for the complete domain part. If the lookup is successful, the

complete domain part is substituted by the replacement string. Otherwise, the EMPP cuts the leftmost sublevel domain and repeats the lookup and matching procedure for the remainder of the domain part. If the lookup is successful, the remainder of the domain part is substituted by the replacement string. This procedure is called repeatedly until the top level domain is encountered. If none of the lookups is successful, the address is further processed by the header reading module.

For example:

If the e-mail user dictionary contains the following lines:

[email protected] Mijn beste vriend

afb.com A F B

the followingFromfield:

Sender: [email protected] From: [email protected]

become:

Bericht van : mijn beste vriend Bericht van : kim corneels at A F B

???

NOTE

(48)

Each word in theSubjectfield is sent to the user dictionary. If the lookup is successful, the replacement string is sent directly to the Text-To-Speech system. If not, theSubjectfield is further processed by the header reading module.

For example:

If the user dictionary contains the following lines:

ECA /+ e.se.a IDT I D T

the followingSubjectfields:

Subject: ECA vergadering Subject: meer winst voor IDT

are pronounced:

Betreft : <ESC>/+ e.se.a<ESC>/+ vergadering Betreft : meer winst voor I D T

(49)

Customizing the E-Mail Body

When a user dictionary has been loaded, the EMPP will call the dictionary for every word of the e-mail body. If the word is found in the user dictionary, it is substituted by the replacement string. If not, the body is further processed by the e-mail body processing module.

For example:

If the user dictionary contains the following line:

Alexandra A L

the word " Alexandra" in the following sentence:

Het voorstel van Alexandra werd enthousiast onthaald.

is replaced by the corresponding string found in the e-mail user dictionary:

Het voorstel van A L werd enthousiast onthaald.

(50)

Chapter III

SSML Preprocessor

User???

s

Gui

def o r N e t h e r l a n d s

D u t c h

(51)

SSML Preprocessor

Introduction

SSML (Speech Synthesizer Markup Language) is part of a set of markup specifications by the W3C for voice browsers.

General information regarding the RealSpeak SSML processor can be found in theSSML Supportappendix of theProgrammer???s Guide. The RealSpeak Telecom SDK provides a built-in preprocessor that supports a large portion of the SSML 1.0 September 2004

Recommendation (REC). Moreover RealSpeak extends SSML with a number of Scansoft specific elements/attributes.

The setsupported byScansoftiscalled ???ScanSoftSSML???(4SML).

The section below describes language-specific SSML support included in the RealSpeak Telecom V4.0???Netherlands Dutch language version.

Netherlands Dutch specific SSML markup

XML encoding types for Netherlands Dutch

The encoding is specified in the XML text declaration

("<?xml??? ?>") by the encoding declaration which is of the form encoding="<EncodingName>".

E.g. <?xml version="1.0" encoding="UTF-8"?> RealSpeak Telecom V4.0???Netherlands Dutch supports:

??? ???Windows-1252??? and ???ISO-8859-1??? to ???ISO-8859-9???

??? TheUnicodeencoding ???UTF-8???,???UTF-16??? and ???UCS-4???

(52)

4SML Specifics for Netherlands Dutch

Forreasonsofcompatibilitywith the???standard???Netherlands Dutch system, the parallel text control sequence (<esc> sequence) is listed where applicable. As such, a similar TTS behavior can be created???or combined???with non-SSML driven text input.

4SML Tags Comment Corresponding control sequence High-level and document structure tags

xml:lang Supported ???nl-NL??? for Netherlands Dutch. Attribute of speak, paragraph, sentence and voice.

Text normalization tags

<say-as

interpret-as=???xxx???> Not supported,except for the types listed below. <say-as

interpret-as=???address???> Supported <esc>\ tn=address\

<say-as

interpret-as=???spell???> Supported <esc>\ tn=spell\

Pronunciation tags

<phoneme

alphabet=???unipa???> _SeSupported_e_s_e_c_t_i_{on ???}_t_he

Netherlands Dutch L&H+ and UNIPA

phonetic alphabets???

for an overview of the alphabet.

(53)

Chapter IV

(54)

Custom G2P Dictionaries

Introduction

ScanSoft's RealSpeak system now offers support for custom G2P dictionaries. A custom G2P dictionary module is an add-on module specifically designed to improve the quality of pronunciation for specific kinds of words.

The Netherlands Dutch system is currently not designed to support the use of a custom G2P dictionary module.

(55)

Appendices

(56)

Appendices

Appendix A: Netherlands Dutch voice names

The RealSpeak Telecom Text-To-Speech system now supports selecting the voice and language via a string as well as a define (please see the definition for the functionTtsInitialize(Ex)()in the

Programmers Guideand also theBackwards Compatibility Guidefor details). The name strings for the currently supported Netherlands Dutch voices are listed in the table below.

Netherlands Dutch Voice Name Strings

Voice Name String

Laura ???Laura???

Claire ???Claire???

The string to use to set the language to Netherlands Dutch is