User's Guide for French
V4.0
1. Grant of Rights
In consideration of a possible commercial relationship, ScanSoft hereby grants to you, the LICENSEE, who accepts, a non-exclusive right to internally evaluate and test the software program ("the Software").
2. Ownership of Software
ScanSoft retains title, interests and ownership of the Software recorded on the original disk(s) and all subsequent copies of the Software and Documentation, regardless of the form or media in or on which the original and other copies may exist. ScanSoft reserves all rights not expressly granted to LICENSEE.
3. Copy Restrictions
This Software and the accompanying documentation are copyrighted. Unauthorized copying of the Software, including Software that has been merged or included with other software, or of the documentation is expressly forbidden. LICENSEE may be held legally responsible for any intellectual property infringement that is caused or encouraged by his failure to abide by the terms of this agreement. LICENSEE is allowed to make two (2) copies of the Software solely for backup purposes, provided that the copyright notice is included on the backup copy.
4. Use Restrictions
LICENSEE agrees not to use the Software for any other purpose than internally evaluating the Software. LICENSEE may physically transfer the Software from one computer to another, provided that the Software is used on only one computer at a time. LICENSEE may not modify, adapt, translate, reverse engineer, decompile, disassemble or create derivative works based on the Software. LICENSEE may not modify, adapt, translate or create derivative works based on the documentation provided by ScanSoft. The Software may not be transferred to anyone without the prior written consent of ScanSoft. In no event may LICENSEE transfer, assign, lease, sell or otherwise dispose of the Software and Documentation on a temporary or permanent basis except as expressly provided herein.
5. Warranty
THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. ScanSoft shall have no liability to LICENSEE or any third party for any claim, loss or damage of any kind, including but not limited to lost profits, punitive, incidental, consequential or special damages, arising out of or in connection with the use or performance of the Software and accompanying documentation.
6. Termination
This agreement is effective until terminated. ScanSoft reserves the right to terminate this agreement automatically if any provision of this agreement is violated. LICENSEE may terminate this agreement by returning the Software and the accompanying documentation to ScanSoft, along with a written warranty stating that all copies have been returned.
Trademarks
MS-DOS®, WINDOWS®, MICROSOFT® VISUAL C++, BORLAND C++ and Sound Blaster are registered trademarks of their respective owners. ScanSoft is a registered trademark. All rights reserved.
October 2004
Document creation
V4.0
December 2005
Update of the "SSML Preprocessor" chapter and the "Using Control Sequences" section of Chapter I
Native Character Set
Using Control Sequences
Quick Reference of the RealSpeak native Control Sequences for French
Entering phonetic input
How to proceed
Lexical stress and sentence accents in phonetic input
The French L&H+ and UNIPA Phonetic Alphabets
Using a User Dictionary
Using the Microsoft SAPI5 Lexicon
User Lexicons
Application Lexicons
The French SAPI5 Phoneme List
Notes on the French Text-To-Speech System
Cardinal Numbers
Numbers
Ordinal Numbers
Roman Numbers
Fractions
Telephone Numbers
Bank Account Numbers & Visa Numbers
Dates
Time Indications
Currencies
Alphanumeric Strings
Mathematical formulas
Abbreviations
Acronyms and Initialisms
E-MAIL PREPROCESSOR
Introduction
E-Mail Header Processing
Header Field Extraction
Header Field Reading
From Field
Date Field
Subject Field
E-Mail body processing
Message Extraction
Text Normalization
Language specific accents
English words
Customizing the E-Mail Preprocessor
4SML Specifics for French
CUSTOM G2P DICTIONARIES
Introduction
APPENDICES
Chapter I
French Text-To-Speech System
Introduction
This section provides operational instructions for the RealSpeak Telecom Text-To-Speech system for French. It reviews the functionality of the system, and describes the way in which the user can customize the pronunciation of input texts. This part also describes issues that are particular to the French Text-To-Speech system. It introduces the French phonetic alphabet and it discusses some language-specific features of the French Text-To-Speech system.
Preparing a text for Text-To-Speech
In general, there are four ways to intervene in the pronunciation of text:
• By using control sequences
• By entering phonetic input
• By using a user dictionary or a user ruleset
• By using one of the supported APIs
These mechanisms are described in the Programmer's Guide. In this part, however, the specifications for French are fully described.
Native Character Set
The native character set of the French TTS system is Windows-1252; it has the printable characters in the ASCII range 1-127 as a subset. Note that TTS input encoded in another supported character set is converted to the native character set for that language before it is processed internally. Consequently, input must be representable in the native character set even if it is encoded in another character set supported by the API.
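The representability requirement above can be checked before text is sent to the engine. The following is a minimal sketch in Python, not part of the RealSpeak API; the function name is hypothetical, and it assumes that Python's "cp1252" codec corresponds to the Windows-1252 character set named above.

```python
def representable_in_native_charset(text: str, encoding: str = "cp1252") -> bool:
    """Return True if every character of `text` exists in the given character set.

    "cp1252" is Python's codec name for Windows-1252 (assumption: it matches
    the native character set of the French TTS system).
    """
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    print(representable_in_native_charset("Modifier le volume ne pose pas de problèmes."))
    print(representable_in_native_charset("Привет"))  # Cyrillic is outside Windows-1252
```

Input that fails this check would have to be transliterated or filtered by the application before synthesis.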
Using Control Sequences
For a description of the various supported markup languages
Remark: <ESC> represents the escape character "\x1B" (decimal 27) that generates the ASCII character 27 (Hex 1B).
Below, you find a quick reference table for the RealSpeak native control sequences supported for French. The language-specific support for the SSML markup language is described in the "SSML Preprocessor" chapter.
Quick Reference of the RealSpeak native Control Sequences for French

Sequence: <ESC>\vol=x\
Description: Volume
Range: x = 0..100 (0 = silence, 10 = low, 100 = high)
Default: 80
Delimiter: No
For example:
Modifier le volume ne pose pas de problèmes. <ESC>\vol=90\ Le plus haut volume <ESC>\vol=10\ s'oppose au volume le plus bas.

Sequence: <ESC>\rate=x\
Description: Speech Rate
Range: x = 1..100 (10 = slow, 100 = fast)
Default: 50
Delimiter: No
For example:
Toujours selon votre choix, la vitesse d'élocution <ESC>\rate=10\ sera plus lente <ESC>\rate=90\ ou plus rapide.

Sequence: <ESC>\rate_wpm=xxx\
Description: Words per minute
Range: xxx = 1..1000; voice-specific (see subsequent table)
Default: Voice-specific
Delimiter: No
For example:
Toujours selon votre choix, la vitesse d'élocution <ESC>\rate_wpm=110\ sera plus lente <ESC>\rate_wpm=350\ ou plus rapide.

Sequence: <ESC>Mx
Description: Read mode; some read modes are not supported in e-mail mode
Range: x = 0..3
0 = character-by-character
1 = word-by-word (not supported in e-mail mode)
2 = sentence-by-sentence
3 = line-by-line (not supported in e-mail mode)
Default: 2
Delimiter: Yes
For example:
<ESC>M0 Test
(The word "Test" will be spelled.)
<ESC>M1 Ceci est un test.
(This sentence will be read word by word.)
<ESC>M2 Ceci est un test.
(This sentence will be read as one sentence.)

Sequence: <ESC>Wx
Description: Wait Period
Range: 0 = no wait period, 1 = 200 milliseconds wait period, 9 = 1800 milliseconds
For example:
<ESC>M2 <ESC>W2 Cet énoncé est suivi d'une pause courte... <ESC>W9 Et maintenant d'une pause longue. Est-ce que vous entendez la différence?

Sequence: <ESC>\Pause=xxx\
Description: Long Pause
Range: 1..65535 msec
Delimiter: No
For example:
On peut déterminer la longueur <ESC>\pause=5000\ des pauses.

Sequence: <ESC>"
Description: Sentence Accent
Delimiter: No
For example:
Mets le livre <ESC>"sur la table.
Mets le <ESC>"livre sur la table.
Note:
Manually inserted sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested sentence accent, and thus not realize it.

Sequence: <ESC>C
Description: Continuation
Delimiter: No
For example:
Jean pose une question à Marie. Mais Marie ne lui répond pas.
Jean pose une question à Marie. <ESC>C Mais Marie ne lui répond pas.
In the first of the above examples, the text-to-speech system will detect an end-of-sentence after Marie and will read the input as two separate sentences. In the second example, a continuation sequence is inserted in order to make the system pronounce the entire input as one sentence.

Sequence: <ESC>E
Description: End-of-Message
Delimiter: Yes
For example:
Vous entendez d'abord la première ligne <ESC>E et puis la deuxième.
In the above example, the sequence <ESC>E forces the system to pronounce the two halves of the input separately.

Sequence: <ESC>/+
Description: Phonetic Input (L&H+ phonetic alphabet)
Delimiter: No
For example:
<ESC>/+'sE+R<ESC>/+

Sequence: <ESC>%x
Description: Preprocessing
For example:
<ESC>%text L'expéditeur du présent message est Robert Guinot.
<ESC>%email From: "Robert Guinot" <[email protected]>

Sequence: <ESC>\tn=x\
Description: Guide text normalization; limited support in e-mail mode
Range:
address = address mode (not supported in e-mail mode)
normal = standard mode
spell = spell mode
The text normalization types corresponding with the SSML <say-as> types are also supported in standard text mode (not in e-mail mode), see the "SSML Preprocessor" chapter for more details.
Default: Normal
Delimiter: No
For example:
<ESC>\tn=address\ MM. Dufour et Ganty, 45 psg. Bottin, 75005 Paris <ESC>\tn=normal\
<ESC>\tn=address\ Rés. Les boutons verts, 14, r. st.- Jean, 54000 Nancy <ESC>\tn=normal\
<ESC>\tn=address\ 125-127 pl. Voltaire <ESC>\tn=normal\
<ESC>\tn=address\ RN 25, Lille <ESC>\tn=normal\

Sequence: <ESC>F
Description: Reset to Default
Delimiter: Yes
For example:
<ESC>\vol=10\ Maintenant le volume est très bas, <ESC>F et remis à la valeur normale.
<ESC>\rate=10\ Ici, la vitesse est réduite au niveau minimal, <ESC>F pour ensuite retrouver son rythme normal.

Sequence: <ESC>@c
Description: Declare the part-of-speech
Range: c = a character with possible values: N = noun, J = adjective, A = adverb, V = verb, R = past participle
Delimiter: No

Sequence: <ESC>\domain=s\
Description: Enable the extension (only if a custom g2p has been loaded)
Range: s = string: the name of the extension
Delimiter: Yes

Sequence: <ESC>\domain\
Description: Disable the last extension
Delimiter: Yes

Sequence: <ESC>\voice=s\
Description: Set the voice (if more than one voice is available)
Range: s = string: the name of the voice
Delimiter: Yes

Sequence: <ESC>\mrk=n\
Description: Insert a bookmark
Range: n = 0..2147483647
Delimiter: No

Sequence: <ESC>\p\
Description: Insert a paragraph boundary
Range: -
Delimiter: Yes

Sequence: <ESC>\audio="s"\
Description: Insert an audio file; not supported in e-mail mode
Range: s = string: the URI of a document with an appropriate MIME type
Delimiter: Yes

Speech Rates in Words per Minute for French Voices

Voice       Range (words per minute)     Default (words per minute)
Virginie    Min = 83, Max = 416          166
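
An application that generates TTS input may prefer to build control sequences with small helper functions instead of concatenating strings by hand. The sketch below is illustrative only: the helper names are hypothetical and not part of any RealSpeak API, and the only facts taken from the quick reference are the escape character (\x1B), the sequence spellings and the documented value ranges.

```python
ESC = "\x1b"  # the <ESC> character, decimal 27


def _ranged(name: str, value: int, low: int, high: int) -> str:
    """Build <ESC>\\name=value\\ after checking the documented range."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be in {low}..{high}, got {value}")
    return f"{ESC}\\{name}={value}\\"


def volume(x: int) -> str:
    return _ranged("vol", x, 0, 100)


def rate(x: int) -> str:
    return _ranged("rate", x, 1, 100)


def pause(msec: int) -> str:
    return _ranged("pause", msec, 1, 65535)


def wait(x: int) -> str:
    """Build <ESC>Wx; each step is 200 ms according to the table (0..9)."""
    if not 0 <= x <= 9:
        raise ValueError(f"wait period must be in 0..9, got {x}")
    return f"{ESC}W{x}"


if __name__ == "__main__":
    text = "On peut déterminer la longueur " + pause(5000) + " des pauses."
    print(repr(text))
```

Rejecting out-of-range values in the application keeps malformed sequences from reaching the engine, whose behaviour for such input is not described in this guide.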
Entering phonetic input
How to proceed
To switch from orthographic to phonetic mode, insert <ESC>/+ to use the L&H+ phonetic alphabet. The phonetic input mode remains active until the command is explicitly reset by entering <ESC>/+ again.
The phonetic input string is composed of symbols of the L&H+ phonetic alphabet. The symbols and example transcriptions are given in the phonetic table below.
In addition to the phonetic symbols, it is advised to use the following characters in the phonetic input string:

Special characters L&H+

Symbol: ' (ASCII 39, Hex 27)
Meaning: primary word stress
As in:
<ESC>/+ pRe.zi.'dA%~ <ESC>/+ (noun 'président') vs.
<ESC>/+ pRe.'zi.d$ <ESC>/+ (verb form 'président')

Symbol: " (ASCII 34, Hex 22)
Meaning: sentence accent
As in:
<ESC>/+ sEt_'fRA.z$_kO%~.tjE%~_"de+_zak."sA%~*.<ESC>/+
(Cette phrase contient deux accents.)

Symbol: .
Meaning: syllable boundary
As in:
<ESC>/+ si.'lA.b$ <ESC>/+ (syllabe)
<ESC>/+ 'sI.l$.b$l <ESC>/+ (syllable)

Symbol: #
Meaning: silence (pause)
As in:
<ESC>/+ Ze_"di_#_mE_"nO%~ <ESC>/+

Note that the use of punctuation marks remains useful within phonetic input to assure a correct intonation. Each punctuation mark needs to be preceded by an asterisk.
For example:
<ESC>/+ bjE%~"syR*,_Ze.tE_"la*.<ESC>/+ (Bien sûr, j'étais là.)
<ESC>/+ ty_E_"fOR*.<ESC>/+ (Tu es fort.)

Punctuation Marks L&H+

Symbol    Meaning
_         Word delimiter
*.        End of declarative
*,        Comma
*!        End of exclamation
*?        End of question
*;        Semicolon
*:        Colon
Lexical stress and sentence accents in phonetic input
In phonetic input strings, lexical stress and sentence accents can be manually indicated by the user, by using a single quote (') or double quote (") respectively.
Note that manually inserted lexical stress or sentence accents may have no effect in RealSpeak. The RealSpeak synthesis module may indeed have reasons to override the requested stress/accent.
• The Text-To-Speech system will automatically convert all lexical stress marks into sentence accents in case no manually added sentence accents are found in the phonetic input string.
Example:
<ESC>/+sO%~_'pER_sa.'pEl_gi.'jom*.<ESC>/+
is handled as:
<ESC>/+sO%~_"pER_sa."pEl_gi."jom*.<ESC>/+ (Son père s'appelle Guillaume.)
• If phonetic input contains at least one manually added sentence accent, no additional sentence accents are assigned by the text-to-speech system. Therefore, only those words marked with " will get a sentence accent. As a consequence, a message containing only one manual sentence accent will have an almost flat intonation on all the other words.
Example:
<ESC>/+sO%~_'pER_sa.'pEl_gi."jom*.<ESC>/+
(Only one sentence accent will be realized.)
• Phonetic input can also be combined with orthographic input.
If no sentence accents are found in the input text (indicated by <ESC>" in orthographic input, or by " in phonetic input), the Text-To-Speech system will automatically assign sentence accents. In the orthographic part of the input, the Text-To-Speech system will realize these sentence accents on the basis of part-of-speech and syntactic information. In the phonetic part of the input, all lexical stress marks (if any) will be converted into sentence accents. If there are no lexical stress marks, no sentence accent will be realized for the phonetic part of the input (see point 1 above).
If the user has manually specified one or more sentence accents, no additional sentence accents will be realized (see point 2 above).
For example:
S'il pleut demain, nous partirons pour <ESC>/+pa.'Ri<ESC>/+.
(No sentence accents are found; the Text-To-Speech system will automatically assign sentence accents.)
S'il pleut demain, nous partirons pour <ESC>/+pa."Ri<ESC>/+.
(A sentence accent is specified in the phonetic part of the input text. No additional sentence accents will be realized.)
Si elles s'en <ESC>"vont, ils <ESC>/+'pARt<ESC>/+ pour Paris.
(A sentence accent is specified in the orthographic part of the input text. No additional sentence accents will be realized.)
Si elles s'en <ESC>"vont, ils <ESC>/+"pARt<ESC>/+ pour Paris.
(Two sentence accents were specified; no additional sentence accents will be realized.)
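
The first two rules above can be restated as a small function for a purely phonetic input string. This is only a sketch of the documented behaviour, written in Python with a hypothetical function name; it is not the engine's implementation, and the synthesis module may still override the resulting accents as noted above.

```python
def assign_sentence_accents(phonetic: str) -> str:
    """Apply the documented rules to an L&H+ string (without the <ESC>/+ wrappers).

    - If at least one sentence accent (") is present, nothing is added.
    - Otherwise every lexical stress mark (') becomes a sentence accent (").
    """
    if '"' in phonetic:
        return phonetic
    return phonetic.replace("'", '"')


if __name__ == "__main__":
    print(assign_sentence_accents("sO%~_'pER_sa.'pEl_gi.'jom*."))
    print(assign_sentence_accents("sO%~_'pER_sa.'pEl_gi.\"jom*."))
```

The first call reproduces the pair of transcriptions given for "Son père s'appelle Guillaume."; the second leaves the input unchanged because one sentence accent is already present.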
The French L&H+ and UNIPA Phonetic Alphabets

Vowels

L&H+ Symbol   L&H+ Transcription   UNIPA Symbol   UNIPA Transcription   As in
i             mi.'nyt              i              mi.'nyt               minute
e             e.'te                e              e.'te                 été
E             'tRE                 E              'tRE                  très
a             'ba                  a              'ba                   bas
A             'pAt                 A              'pAt                  pâte
O             'mORt                O              'mORt                 morte
o             'bo                  o              'bo                   beau
u             'nu                  u              'nu                   nous
y             'fy                  y              'fy                   fût
e+            'de+                 e=             'de=                  deux
E+            'sE+R                E=             'sE=R                 soeur
$             'l$                  $              'l$                   le
E%~           'bE%~                E%~            'bE%~                 bain
A%~           'blA%~               A%~            'blA%~                blanc
O%~           'bO%~                O%~            'bO%~                 bon

Consonants

L&H+ Symbol   L&H+ Transcription   UNIPA Symbol   UNIPA Transcription   As in
p             'pa                  p              'pa                   pas
b             'ba                  b              'ba                   bas
t             'ta                  t              'ta                   tas
d             'do                  d              'do                   do
k             'ki                  k              'ki                   qui
g             'gOm                 g              'gOm                  gomme
?             A%~.ti.?aR.'pO%~     ?              A%~.ti.?aR.'pO%~      anti-harpon
f             'fE%~                f              'fE%~                 faim
v             'vOl                 v              'vOl                  vol
s             'sak                 s              'sak                  sac
z             ze.'Ro               z              ze.'Ro                zéro
S             'SaR.m$              S              'SaR.m$               charme
Z             ZaR.'dE%~            Z              ZaR.'dE%~             jardin
m             'mo                  m              'mo                   mot
n             'nu                  n              'nu                   nous
n~            a.'n~o               n~             a.'n~o                agneau
nK            smO.'kinKg           nK             smO.'kinKg            smoking
l             'la                  l              'la                   la
R             'RO%~                R              'RO%~                 rond
j             bRi.'je              j              bRi.'je               briller
w             'wi                  w              'wi                   oui
h\            'lh\i                h\             'lh\i                 lui
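
Judging from the two tables above, the L&H+ and UNIPA symbols for French differ only for the vowels of "deux" and "soeur" (e+ / E+ versus e= / E=). Under that assumption, which is drawn from the tables alone and not from a formal specification, an existing L&H+ transcription could be rewritten for SSML use as sketched below; the function name is hypothetical.

```python
import re


def lhplus_to_unipa(transcription: str) -> str:
    """Rewrite e+ and E+ as e= and E=; all other symbols are left untouched.

    Pass only the transcription itself, not the <ESC>/+ switch sequences,
    which also contain a '+' character.
    """
    return re.sub(r"([eE])\+", r"\1=", transcription)


if __name__ == "__main__":
    print(lhplus_to_unipa("'sE+R"))
    print(lhplus_to_unipa("'de+"))
```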
NOTE
• Note that the L&H+ alphabet is not SSML compliant. For SSML, use the UNIPA alphabet.
Using a User Dictionary
For information on how to create and use user dictionaries, please refer to the "User Configuration" chapter in the RealSpeak Telecom Programmer's Guide.
Using the Microsoft SAPI5 Lexicon
Microsoft SAPI5 provides lexicons so that users and applications can specify pronunciation and part-of-speech information for particular words. As such, all SAPI compliant Text-To-Speech engines should use these lexicons to guarantee uniformity of pronunciation and part of speech information.
There are two types of lexicons in SAPI: user lexicons and application lexicons.
User Lexicons
Each user who logs in to a computer will have a User Lexicon. Initially, this lexicon is empty; words can be added either programmatically, or by using an engine's add/remove words UI component (for example, the sample application Dictation Pad provides an Add/Remove Words dialog).
Application Lexicons
Applications can create and ship their own lexicons of specialized words. These lexicons are fixed and cannot be edited.
Detailed information on how to use the MS SAPI5 lexicons can be found in the manual "Microsoft Speech SDK V5.1", chapter "ISpLexicon Interface".
The French SAPI5 Phoneme List
To add entries to the lexicon, the user should use a set of language specific phonemes. The language specific phoneme list for French is given below.
SAPI5 Symbols

SAPI5 Symbol   SAPI Phone ID   Example        SAPI5 Transcription
A              11              patte          P A T
AA             10              pâte           P AA T
AX             13              justement      ZH UY S T AX M A ~
EH             16              seize          S EH Z
EU             30              deux           D EU
EY             17              ses            S EY
IY             22              si             S IY
OE             29              neuf           N OE F
OH             12              comme          K OH M
OW             31              gros           G R OW
UY             21              du             D UY
UW             37              doux           D UW
P              32              pont           P OW ~
B              14              bon            B OW ~
M              25              mont           M OW ~
F              18              femme          F A M
V              38              vent           V A ~
T              36              temps          T A ~
D              15              dans           D A ~
N              26              nom            N OW ~
S              34              sans           S A ~
Z              41              zone           Z OW N
L              24              long           L OW ~
SH             35              champ          SH A ~
ZH             42              gens           ZH A ~
NJ             28              oignon         OW NJ OW ~
NG             27              camping        K A M P IY NG
Y              40              ion, pierre    Y OW ~, P Y EH R
W              39              coin           K W EY ~
K              23              quand          K A ~
G              19              grand, gant    G R A ~, G A ~
R              33              rond           R OW ~
HY             20              juin           ZH HY EY ~
A ~            11 9            vent           V A ~
EY ~           17 9            vin            V EY ~
OE ~           29 9            brun           B R OE ~
OW ~           31 9            bon            B OW ~

SAPI5 Symbols

SAPI5 Symbol           Meaning               As in                                            SAPI Phone ID
- (hyphen)             syllable boundary     zh a rx - 1 d e~                                 1
! (exclamation mark)   sentence terminator   1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~ !    2
&                      word boundary         1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~      3
, (comma)              sentence terminator   1 l ax & 1 v e~ , 1 eh & 1 b o~ .                4
. (period)             sentence terminator   1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~ .    5
? (question mark)      sentence terminator   1 l ax & 1 v e~ & 1 eh & 1 t rx eh & 1 b o~ ?    6
_ (underscore)         silence               1 l ax & 1 v e~ & _ 1 eh & 1 t rx eh & 3 b o~    7
1                      primary stress        zh a rx - 1 d e~                                 8
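
When entries are added programmatically, a space-separated SAPI5 transcription may need to be turned into the numeric phone IDs listed in the first table. The following Python sketch shows one way to do that lookup. The function and table names are hypothetical, and treating "~" as a separate symbol with ID 9 is an assumption inferred from the nasal vowel rows (for example "A ~" = 11 9); the structural symbols of the second table are not covered.

```python
# Phone IDs copied from the French SAPI5 phoneme table above.
PHONE_IDS = {
    "A": 11, "AA": 10, "AX": 13, "EH": 16, "EU": 30, "EY": 17, "IY": 22,
    "OE": 29, "OH": 12, "OW": 31, "UY": 21, "UW": 37, "P": 32, "B": 14,
    "M": 25, "F": 18, "V": 38, "T": 36, "D": 15, "N": 26, "S": 34, "Z": 41,
    "L": 24, "SH": 35, "ZH": 42, "NJ": 28, "NG": 27, "Y": 40, "W": 39,
    "K": 23, "G": 19, "R": 33, "HY": 20,
    "~": 9,  # assumption: nasalization marker, inferred from "A ~" = 11 9
}


def to_phone_ids(transcription: str) -> list[int]:
    """Convert a space-separated SAPI5 transcription into a list of phone IDs."""
    ids = []
    for symbol in transcription.split():
        if symbol not in PHONE_IDS:
            raise KeyError(f"unknown SAPI5 symbol: {symbol!r}")
        ids.append(PHONE_IDS[symbol])
    return ids


if __name__ == "__main__":
    print(to_phone_ids("P A T"))       # patte
    print(to_phone_ids("ZH HY EY ~"))  # juin
```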
Notes on the French Text-To-Speech System
The French Text-To-Speech system has been designed to allow a correct pronunciation of any input written according to the rules of French orthography. The following cases, however, require special attention.
Cardinal Numbers
Cardinal numbers up to 15 digits are pronounced as full numbers. Commas may be used to separate groups of digits. Digit strings consisting of more than 15 digits are pronounced digit by digit. A number starting with a zero is automatically spelled.
For example:
20598500
610.456.789
235 566 887 123
256.789.411.789.215
NOTE
• Numerals that are normally pronounced as full numbers can also be read digit by digit by using the control sequence <ESC>\spell=on\ in front of the numeral to set the spell mode.
Numbers
Decimal numbers may consist of up to 15 digits before or after the decimal point. Commas may be used to separate groups of digits in the digit string before the decimal point. The decimals following the comma are pronounced as full numbers.
55,35 55,255
Ordinal Numbers
Cardinal numbers followed by the correct ordinal suffix will be pronounced as ordinals.
For example:
1er
2e
10e
la 61e fois
Roman Numbers
The French Text-To-Speech system supports the use of Roman numbers up to 39, when consisting of combinations of X, V and I. The Roman numbers may either be used separately or in combination with proper names.
Roman numbers up to 30, followed by e, E, e. or E. are read as ordinal numbers.
For example:
Louis XIV
XIX
Ve
Ie
Fractions
Digit strings consisting of maximally 15 digits, followed by a slash, followed by a maximum of 15 digits and an ordinal suffix, are pronounced as fractions.
Digit strings 1,2,3,4 followed by a slash, followed by 1,2,3,4 are pronounced as fractions.
For example:
1/2 un demi
2/3 Deux tiers
125/425e cent vingt-cinq quatre cent vingt-cinquièmes
Telephone Numbers
In order to ensure a correct pronunciation of telephone numbers, it is recommended to use slashes or parentheses to separate the area code from the remainder of the telephone number. Also, use periods, hyphens or a space to separate groups of digits. Telephone numbers written in this format will be pronounced in groups of two digits, with a pause at the place of the period, hyphen or space.
For example:
04.42.27.86.53
83-50-33-33
83 54 21 73
03/88.41.73.00
(05)42 21 89 53
33 03-22-70-59-99
33.(0)3.22.71.49.90
If telephone numbers are preceded by an abbreviation, the number doesn't need to be separated by spaces, periods or hyphens.
For example:
Tél. (03)22718919
tél. 22713990
The following formats for Belgian telephone numbers are supported:
Tél. (03)8200222
Tél. (057) 82.05.22
Tél. 02/460.33.97
04 1234567
02-652.89.75
The following formats for Canadian telephone numbers are supported:
(514) 895-7868
418.644.5950
1 123.123.1234
789 0456
Telephone numbers written in this way will be spelled.
Bank Account Numbers & Visa Numbers
In order to have bank account numbers correctly pronounced, use hyphens between groups of digits. The number will be pronounced in groups of two or three digits.
For example:
810-1254887-87
To have a bank account number pronounced digit by digit, switch to spell mode (<ESC>\spell=on\).
For a correct pronunciation of visa numbers, use hyphens between each group of digits. Each group of 4 digits is spelled.
For example:
1234-5678-9112-3456
Dates
The French Text-To-Speech system reads dates written as structured groups of digits in the following numeric formats:
with slashes: Day(1 or 2 digits)/Month(1 or 2 digits)/Year(2 or 4 digits)
with hyphens: Day(1 or 2 digits)-Month(1 or 2 digits)-Year(2 or 4 digits)
with periods: Day(1 or 2 digits).Month(1 or 2 digits).Year(2 or 4 digits)
For example:
11/12/98
11-12-1998
2.6.92
11/05/00
You can also use the written format.
For example:
le 1er déc. 97
le 31 janv. 96
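
An application can check whether a string follows one of the numeric date layouts before relying on the date reading. The regular expression below is a sketch of those layouts only, written in Python with hypothetical names. It assumes the same separator is used twice and does not validate the calendar values (for example, it would accept day 42); how the engine treats such input is not described here.

```python
import re

# Day (1-2 digits), separator, month (1-2 digits), same separator, year (2 or 4 digits).
NUMERIC_DATE = re.compile(r"(\d{1,2})([/.-])(\d{1,2})\2(\d{4}|\d{2})")


def looks_like_numeric_date(text: str) -> bool:
    """Return True if `text` matches one of the documented numeric date layouts."""
    return NUMERIC_DATE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("11/12/98", "11-12-1998", "2.6.92", "11/05/00", "11/12-98"):
        print(sample, looks_like_numeric_date(sample))
```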
Time Indications
Time indications will be correctly pronounced when written in one of the following formats:
22h30       vingt-deux heures trente
22 H 30
22.30 h
22.30H
22:30h
22:30 H
22h30'      vingt-deux heures trente minutes
22 H 30'
12h00       midi
00h00       minuit
2 h         deux heures
14 H        quatorze heures
22'         vingt-deux minutes
6240'       six mille deux cent quarante minutes
Currencies
The French Text-To-Speech system correctly handles the currency symbols FF, FRS, FR, FB, $, £, € and ¥, provided that they follow the numeral.
For example:
15 FF
14 FB
50 $
16 €
16 EUR
EUR 16
Currencies up to 15 digits (with or without periods) will be correctly pronounced.
For example:
20.579.500 FF
Decimal digits in combination with currency indications are also supported. Decimal currency amounts up to 15 digits before and 15 digits after the comma will be correctly pronounced.
For example:
1.999,50 FB 1.999,50 $
Also the most common currency abbreviations from around the world are supported. These abbreviations can follow or precede the amount and are expanded.
For example:
3 USD: trois dollars américains
Other currencies are written in full words and have to follow the numeral.
For example:
1500 lires
Alphanumeric Strings
Alphanumeric strings consist of a combination of letters and digits. The alphabetic part of the alphanumeric string will always be spelled; the numeric part will be read as a full number.
For example:
AB1956 is pronounced as "A B 1956"
Mathematical formulas
The French Text-To-Speech system supports a range of mathematical formulas.
Numbers, cardinals or decimals can be negative (i.e. preceded by the minus symbol).
Supported operators:
+        plus
-        moins
* x X    fois
/        divisé par
%        modulo
^        puissance (but: ^2: au carré, ^3: au cube)
Supported separators:
( [      parenthèse, crochet ouvert
) ]      parenthèse, crochet fermé
For example:
4 + 2 = 6 quatre plus deux égale six
4-2*5=-6 quatre moins deux fois cinq égale moins six
(4 + 2) + 2 = 8 parenthèse ouverte quatre plus deux parenthèse fermée plus deux égale huit
[4-2]+3-5=0 crochet ouvert quatre moins deux crochet fermé plus trois moins cinq égale zéro
15/(10-5)+1=4 quinze divisé par parenthèse ouverte dix moins cinq parenthèse fermée plus un égale quatre
(4 + 2) + (4 - 2) = 8 parenthèse ouverte quatre plus deux parenthèse fermée plus parenthèse ouverte quatre moins deux parenthèse fermée égale huit
[4+2][4-2]=12 crochet ouvert quatre plus deux crochet fermé fois crochet ouvert quatre moins deux crochet fermé égale douze
5*(2+3)-2+[1/1]-20=4 cinq fois parenthèse ouverte deux plus trois parenthèse fermée moins deux plus crochet ouvert un divisé par un crochet fermé moins vingt égale quatre
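
The operator and separator readings listed above can be illustrated with a small token-by-token expansion. The Python sketch below is not the engine's implementation: the names are hypothetical, numbers are left as digits instead of being spelled out, and the implicit multiplication between adjacent brackets (as in [4+2][4-2]) is not handled.

```python
import re

# Readings taken from the operator and separator lists and the examples above.
WORDS = {
    "+": "plus", "-": "moins", "*": "fois", "x": "fois", "X": "fois",
    "/": "divisé par", "%": "modulo", "=": "égale",
    "(": "parenthèse ouverte", ")": "parenthèse fermée",
    "[": "crochet ouvert", "]": "crochet fermé",
}
POWERS = {"2": "au carré", "3": "au cube"}
TOKEN = re.compile(r"\d+(?:,\d+)?|[-+*xX/%^=()\[\]]")


def expand_operators(formula: str) -> str:
    """Replace operators and brackets by their French reading; keep numbers as digits."""
    tokens = TOKEN.findall(formula)
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "^":
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt in POWERS:          # ^2 and ^3 have their own reading
                out.append(POWERS[nxt])
                i += 2
                continue
            out.append("puissance")
        else:
            out.append(WORDS.get(tok, tok))
        i += 1
    return " ".join(out)


if __name__ == "__main__":
    print(expand_operators("4-2*5=-6"))
    print(expand_operators("(4 + 2) + 2 = 8"))
```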
Abbreviations
The French RealSpeak system contains a dictionary with the most common abbreviations, such as:
e.a.    entre autres
svp.    s'il vous plaît
Mme     Madame
O.K.    oké
ex.     exemple
Words consisting only of consonants are spelled, e.g. PDG, HLM. Some abbreviations are ambiguous, however, and are pronounced depending on the context in which they appear. For example, the abbreviation "MM" is pronounced "millimètres" when preceded by a digit, but "messieurs" in other cases.
For example:
3 MM 3 millimètres
MM Deprez et Dupont Messieurs Deprez et Dupont
Acronyms and Initialisms
The French Text-To-Speech system contains a standard dictionary with acronyms and initialisms such as: RAM, NASA, AMEX, ETC.
Acronyms are abbreviations formed by combining the first letter(s) of a group of words. They are pronounced as words.
For example: NATO, UNESCO
Initialisms are abbreviations formed by combining the first letter of each part of a group of words. Initialisms are spelled.
For example: API, FBI
Chapter II
E-Mail Preprocessor
Introduction
The ScanSoft e-mail preprocessor (EMPP) has been developed to analyze a specific type of text: e-mail messages. E-mail messages differ from any average type of text in both structure and contents. An e-mail message consists of two clearly distinguished parts: the header and the body. A substantial part of the header contains routing and administrative information, which is irrelevant to the user. Both the header and the body contain all kinds of e-mail specific text features, e.g. e-mail addresses, emoticons such as smileys, etc. Furthermore, informal writing is often combined with a lack of grammatical conventions. Spelling rules are frequently violated, punctuation is often omitted, etc.
Although the standard ScanSoft Text-To-Speech system can handle special text items (abbreviations, numbers, dates, etc.), it is not capable of correctly handling all e-mail specific text features. These text features are therefore dealt with by the e-mail preprocessor. The EMPP transforms e-mail specific information into a format that complies with the rules of the standard ScanSoft Text-To-Speech system. The EMPP is a plug-in preprocessing module of the ScanSoft Text-To-Speech system. It replaces the preprocessor of the standard Text-To-Speech system.
In the following sections you will find a description of the functioning of the ScanSoft e-mail preprocessor as well as an overview of its features.
The e-mail preprocessor has two main tasks: processing of the e-mail header and processing of the body of the e-mail message.
The input to the EMPP consists of one or more e-mail messages. In order to process the e-mail header, the EMPP extracts relevant header fields and then provides an intelligent header field reading.
During the processing of the e-mail body, the text is divided into smaller text units, called text-to-speech messages, which are synthesized by the Text-To-Speech system. Text normalization is applied to e-mail specific text features such as e-mail addresses, proper names, emoticons, URLs (Uniform Resource Locators), etc. For the text normalization of an e-mail message, the ScanSoft EMPP applies linguistic rules and performs dictionary look-up, in order to yield an adequate phonetic transcription. The EMPP also supports the ScanSoft user dictionary mechanism, which allows the user to customize the output of the e-mail processing.
E-Mail Header Processing
Header Field Extraction
An e-mail message consists of two clearly distinguished parts: the header and the body. The EMPP detects the header and extracts the relevant header fields. Information that is of no interest to the user (such as routing information) is not retained.
The EMPP extracts the following header fields:
From Field       Contains the sender's name and/or address
Date Field       Contains the date and time of sending
Subject Field    Optionally contains the subject of the e-mail
The extraction of the header fields is based on the detection of specific keywords in the e-mail header. The supported keywords for the extraction of the header fields are listed below:
From Field       From:  Author:  Sender:  De:  Von:
Date Field       Date:  Enviado:  Gesendet:
Subject Field    Subject:  Subj:  Asunto:  Betreff:
The following is an example of header field extraction. The original header holds information that is irrelevant to the user. After extraction of date, sender and subject, the processed header merely mentions the Date field, the From field and the Subject field:
Original header (English version):
From [email protected] Tue Oct 22 15:52:02 1996
Path: chaos.kulnet.kuleuven.ac.be!Belgium.EU.net!EU.net!www.nntp.primenet.com!nntp.primenet.com!nntp.uio.no!newsfeed.easynet.co.uk!easynet-uk!news.easynet.fr!easynet-fr!rain.fr!francenet.fr!usenet
From: Jean-Marc Bruneau <[email protected]>
Newsgroups: fr.comp.lang.java
Subject: Un jeu en JAVA sur cd-rom
Date: Thu, 17 Oct 1996 14:36:38 +0200
Organization: Acapella
Lines: 5
Message-ID: <[email protected]>
Reply-To: [email protected]
NNTP-Posting-Host: pppa233.francenet.fr
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
X-Mailer: Mozilla 2.01I [fr] (Macintosh; I; PPC)
Extracted header fields:
From: Jean-Marc Bruneau <[email protected]>
Date: Thu, 17 Oct 1996 14:36:38 +0200
Subject: Un jeu en JAVA sur cd-rom
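
Keyword-based extraction of this kind can be sketched in a few lines of Python. The code below is an illustration of the idea, not the EMPP implementation: the names are hypothetical, the keyword lists are the ones given above, and matching at the start of a line, case-sensitively, and keeping only the first occurrence are assumptions. The address in the usage example is a placeholder.

```python
# Keywords per header field, as listed in this section.
KEYWORDS = {
    "From": ("From:", "Author:", "Sender:", "De:", "Von:"),
    "Date": ("Date:", "Enviado:", "Gesendet:"),
    "Subject": ("Subject:", "Subj:", "Asunto:", "Betreff:"),
}


def extract_header_fields(header_lines):
    """Return a dict with the From, Date and Subject values found in the header."""
    fields = {}
    for line in header_lines:
        for field, keywords in KEYWORDS.items():
            if field in fields:
                continue  # keep the first occurrence only (assumption)
            for keyword in keywords:
                if line.startswith(keyword):
                    fields[field] = line[len(keyword):].strip()
                    break
    return fields


if __name__ == "__main__":
    header = [
        "From someone Tue Oct 22 15:52:02 1996",   # no colon: not a From: field
        "From: Jean-Marc Bruneau <jm@example.org>",
        "Newsgroups: fr.comp.lang.java",
        "Subject: Un jeu en JAVA sur cd-rom",
        "Date: Thu, 17 Oct 1996 14:36:38 +0200",
        "Organization: Acapella",
    ]
    print(extract_header_fields(header))
```

Lines such as Path:, Newsgroups: or Organization: match no keyword and are simply dropped, which mirrors the way routing information is discarded.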
Header Field Reading
After the header fields have been extracted, they are processed by the EMPP. The header field keywords (see above) are replaced by an introductory message. The remainder of the header fields is processed by the EMPP in order to allow the Text-To-Speech system to intelligently read the fields.
From Field
The From field keyword is replaced by the introductory message "Message envoyé par:".
For example:
Author: Sandrine Rosier
is pronounced:
Message envoyé par: Sandrine Rosier
The remainder of the From field is further processed by the EMPP. The EMPP supports From fields that either consist of
a) a proper name
b) a proper name and an address
c) an address
a) - b) In case the From field contains a proper name, this name and only this name is sent to the Text-To-Speech system. This means that if both a name and an address are found in the From field, the address will not be read by the Text-To-Speech system.
For example:
From: Patrick Parigeaud
From: "Robert Griffon" <[email protected]>
Author: Bernard Garnier at IepXchgPO
Author: Hélène Dubois/LHS/IEP/CA
are pronounced:
Message envoyé par: Patrick Parigeaud
Message envoyé par: Robert Griffon
Message envoyé par: Bernard Garnier
Message envoyé par: Hélène Dubois
Message envoyé par: François Simonnet
c) In case the From field contains only an address, the EMPP extracts the name out of the address and expands the domain that is
contained in the address. In other words, the e-mail address is not read literally.
For example:
From: [email protected]
Author: videotron.ca!labelle
Author: [email protected]
From: cdnsport.ca!Philippe_Lacour at Internet
are pronounced:
Message envoyé par: Stéphane Chabert at L'Académie de Toulouse
Message envoyé par: labelle at Vidéotron Canada
Message envoyé par: c 16 89 at Télé 7 Jours
Message envoyé par: Philippe Lacour at Sport Canadien
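The three From field cases can be sketched roughly as follows. This is only an illustration of the behavior described above, not the EMPP implementation: the function names, the regular expressions, the `DOMAIN_NAMES` table and the `example.org` address are all assumptions made for the example. The two table entries are taken from the examples above.

```python
import re

# Hypothetical domain expansion table; the real EMPP lexicon is not documented here.
DOMAIN_NAMES = {
    "videotron.ca": "Vidéotron Canada",
    "cdnsport.ca": "Sport Canadien",
}

def extract_display_name(value):
    """Cases a) and b): return the proper name, or None for an address-only field."""
    value = value.strip()
    m = re.match(r'^"?([^"<]+?)"?\s*<[^>]*>$', value)   # "Name" <address>
    if m:
        return m.group(1).strip()
    m = re.match(r'^\S+@\S+\s*\(([^)]+)\)$', value)      # address (Name)
    if m:
        return m.group(1).strip()
    if "@" in value or "!" in value:                     # address only
        return None
    value = re.split(r"\s+at\s+", value)[0]              # Name at Gateway
    return value.split("/")[0].strip()                   # Name/ORG/UNIT

def read_address(value):
    """Case c): derive a readable name from the address and expand the domain."""
    address = re.split(r"\s+at\s+", value.strip())[0]    # drop a trailing " at Internet"
    if "!" in address:                                   # bang path: domain!user
        domain, local = address.split("!", 1)
    else:
        local, _, domain = address.partition("@")
    name = re.sub(r"[._]+", " ", local).strip()
    return f"{name} at {DOMAIN_NAMES.get(domain.lower(), domain)}"

def read_from_field(value):
    """Prefix the introductory message and pick the name or the expanded address."""
    spoken = extract_display_name(value) or read_address(value)
    return f"Message envoyé par: {spoken}"
```

For instance, `read_from_field("Author: ...")` is not handled here; the sketch expects the field value only, with the keyword already stripped.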
Date Field
The Date field keyword is replaced by the introductory message "Date:".
The Date field contains the date and time of sending. The EMPP supports multiple date and time formats, which are transformed into a uniform format that complies with the rules for date and time indications of the ScanSoft Text-To-Speech system. The EMPP only pronounces the date.
For example:
Date: Thu, 7 Dec 1995 13:45:46 EDT
Date: 13 Mar 2003 11:45 AM
are pronounced:
Date: jeudi, 7 décembre 1995
Date: 13 mars 2003
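A minimal sketch of this date normalization, assuming only the two input layouts shown above. The regular expression, the lookup tables and the function name `spoken_date` are illustrative assumptions; the formats actually accepted by the EMPP are not listed in this guide.

```python
import re

DAYS = {"Mon": "lundi", "Tue": "mardi", "Wed": "mercredi", "Thu": "jeudi",
        "Fri": "vendredi", "Sat": "samedi", "Sun": "dimanche"}
MONTHS = {"Jan": "janvier", "Feb": "février", "Mar": "mars", "Apr": "avril",
          "May": "mai", "Jun": "juin", "Jul": "juillet", "Aug": "août",
          "Sep": "septembre", "Oct": "octobre", "Nov": "novembre", "Dec": "décembre"}

# Optional weekday, then day, month abbreviation and year; the time is ignored.
DATE_RE = re.compile(
    r"^(?:(?P<wd>[A-Za-z]{3}),\s*)?(?P<d>\d{1,2})\s+(?P<mon>[A-Za-z]{3})\s+(?P<y>\d{4})"
)

def spoken_date(value):
    """Return the French date text, or None if the layout is not recognized."""
    m = DATE_RE.match(value.strip())
    if not m:
        return None
    month = MONTHS.get(m.group("mon").title())
    if month is None:
        return None
    text = f"{int(m.group('d'))} {month} {m.group('y')}"
    weekday = m.group("wd")
    if weekday and weekday.title() in DAYS:
        text = f"{DAYS[weekday.title()]}, {text}"
    return text
```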
Subject Field
The Subject field keyword is replaced by the introductory message "Sujet:".
The Subject field can contain all kinds of data, but may also be empty. The EMPP searches for keywords that are typical for the subject field (e.g. RE, FYI, FW).
For example:
Subject: Re: Perdu des disquettes !
Subject: FYI: service rapide
Subj: reunion annulee (fwd)
are pronounced:
Sujet: réponse: Perdu des disquettes!
Sujet: à but informatif: service rapide
Sujet: réunion annulée (message redirigé)
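The keyword expansion could look roughly like the sketch below. It covers only the keywords whose expansions appear in the examples above (Re, FYI and a trailing "(fwd)"); the table and the function name `expand_subject` are assumptions, and accent restoration is left out.

```python
import re

# Expansions taken from the examples above; other keywords (e.g. FW) are not guessed.
PREFIXES = {"re": "réponse", "fyi": "à but informatif"}

def expand_subject(subject):
    """Expand a leading keyword and a '(fwd)' marker in a Subject value."""
    m = re.match(r"^\s*([A-Za-z]+):\s*(.*)$", subject)
    if m and m.group(1).lower() in PREFIXES:
        subject = f"{PREFIXES[m.group(1).lower()]}: {m.group(2)}"
    return re.sub(r"\(fwd\)", "(message redirigé)", subject, flags=re.IGNORECASE)
```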
E-Mail body processing
Message Extraction
The e-mail preprocessor splits the body of the e-mail message into text-to-speech messages. This is done on the basis of a number of criteria, such as punctuation, capitalization, layout, intelligent abbreviation handling, etc.
The following examples illustrate some criteria for splitting the e-mail text into text-to-speech messages:
• Using sentence-final punctuation and capital letters
Veuillez renvoyer cette information le plus vite possible. Mon adresse est [email protected]. Merci bien !
• Using layout
Cette semaine:
1) Gagnez un voyage a Moscou
2) La nouvelle creme anti-rides
3) Les starlettes de Cannes.
• Using intelligent abbreviation handling
Text Normalization
An e-mail message typically contains e-mail specific text features, such as e-mail addresses, URLs, file names, emoticons, etc. The EMPP transforms these e-mail specific features into a format that complies with the rules of the standard text normalization of the ScanSoft Text-To-Speech system.
The following are examples of e-mail specific text normalization:
• Support for multiple e-mail address formats
Lu Chong/LHS/IEP/BE
[email protected]
Sally Smith at IepXchgPO
• Support for URLs (Uniform Resource Locators)
http://offre.qc.ca
http://abg.grenet.fr/abg/jobs.html
gopher://gopher.upenn.edu/11/lists
• Support for file names
ldb001.tse
sysinfo.exe
lipedu.xls
• Processing of emoticons
:-x bisou
:-O aïe
• Processing of overuse of punctuation
Fais attention!!!!!!!!: INTERNET VIRUS!!!!!!!! Jamais de compliments! #%&#@$
becomes:
Fais attention!!!. INTERNET VIRUS!!! Jamais de compliments.?
• Normalization of layout lines (e.g. part of an e-mail signature); not active when in spell mode.
These sequences of identical characters are not pronounced:
o 10 or more identical digits
o a word consisting of 5 or more identical US-ASCII encoded letters of the modern Latin alphabet
o a sequence of 3 or more identical US-ASCII characters that are not letters, digits, sentence-final punctuation marks (.?!) or white space; e.g. '&', '#', '%', '*', '-'
For example:
oooooooooooooooooooooooooooooo ---
will be removed.
• Processing of Question/Answer (FAQ)
Q. I have an e-mail address change. How can I ensure that I will continue to receive my EcoLink e-mail newsletter?
A. Easy. Just send an e-mail to [email protected]. Be sure to include both your old address and your new address.
becomes:
Question:
I have an e-mail address change. How can I ensure that I will continue to receive my EcoLink e-mail newsletter?
Answer:
Easy. Just send an e-mail to [email protected]. Be sure to include both your old address and your new address.
• Processing of inserted mail
Roland> Si vous jouez avec des émissions
Roland> diffusées, vous perdez une
Roland> partie de l'image, ou vous la
Roland> déformez en écrasant l'image.
Roland> Aucune solution n'existera jamais à ce
Roland> problème.
Cecile> Pourtant ce n'était pas mon expérience.
becomes:
Roland:
Si vous jouez avec des émissions diffusées, vous perdez une partie de l'image, ou vous la déformez en écrasant l'image. Aucune solution n'existera jamais à ce problème.
Cécile:
Pourtant ce n'était pas mon expérience.
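Several of the body normalization rules listed above lend themselves to a short sketch. The functions below are illustrative assumptions, not the EMPP code: they cover the collapsing of repeated sentence-final punctuation, the three layout-line rules, the Q./A. markers and the merging of quoted lines. They do not reproduce every detail of the examples (for instance, the replacement of "#%&#@$" is not modeled).

```python
import re

def collapse_punctuation(text):
    """Reduce runs of four or more identical sentence-final marks to three."""
    return re.sub(r"([.?!])\1{3,}", r"\1\1\1", text)

# The three layout-line rules, in the order they are listed above.
LAYOUT_RUN = re.compile(
    r"([0-9])\1{9,}"                 # 10 or more identical digits
    r"|\b([A-Za-z])\2{4,}\b"         # a word of 5 or more identical ASCII letters
    # 3 or more identical ASCII symbols, excluding . ? !
    r"|((?![.?!])[\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e])\3{2,}"
)

def strip_layout_runs(text):
    """Remove sequences that should not be pronounced."""
    return LAYOUT_RUN.sub("", text)

def expand_faq_markers(line):
    """Turn a leading 'Q.' or 'A.' into a spoken label on its own line."""
    line = re.sub(r"^Q\.\s*", "Question:\n", line)
    return re.sub(r"^A\.\s*", "Answer:\n", line)

QUOTE = re.compile(r"^(\w+)>\s?(.*)$")

def merge_quoted_mail(lines):
    """Group consecutive 'Name> text' lines into one block per speaker."""
    blocks = []  # list of [speaker, text]
    for line in lines:
        m = QUOTE.match(line)
        if not m:
            continue
        speaker, text = m.group(1), m.group(2).strip()
        if blocks and blocks[-1][0] == speaker:
            blocks[-1][1] += " " + text
        else:
            blocks.append([speaker, text])
    return [f"{speaker}:\n{text}" for speaker, text in blocks]
```

Keeping each rule in its own function makes it easy to switch one off, for example skipping `strip_layout_runs` when the text is to be spelled.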
Language specific accents
The ScanSoft E-mail Preprocessor for French has a special function for the detection of the specific French characters é, è, ê and cédille (ç).
For example:
The following text without accents (é, è and ê) and cédille:
Il n'avait pas remarque que ce probleme se presentait un peu partout. Ca montre qu'il avait la tete ailleurs la semaine passee.
becomes:
Il n'avait pas remarqué que ce problème se présentait un peu partout. ça montre qu'il avait la tête ailleurs la semaine passée.
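As a rough illustration of accent restoration, the sketch below uses a tiny hand-made lookup table. The table `ACCENTED` and the function `restore_accents` are assumptions for this example only; a plain lookup cannot tell "remarque" (noun) from "remarqué" (participle), so a real system would need more context than is shown here.

```python
import re

# Hypothetical mini-lexicon covering part of the example sentence above.
ACCENTED = {
    "remarque": "remarqué",
    "probleme": "problème",
    "presentait": "présentait",
    "tete": "tête",
    "passee": "passée",
}

def restore_accents(text):
    """Replace known unaccented word forms, keeping an initial capital."""
    def fix(match):
        word = match.group(0)
        replacement = ACCENTED.get(word.lower())
        if replacement is None:
            return word
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"\w+", fix, text)
```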
English words
Since e-mail is an international medium, French e-mail messages will inevitably contain many English words that may refer to the Internet, electronic mail, or software and hardware. The typical e-mail jargon is handled by the exceptions dictionary of the e-mail preprocessor. This dictionary is a lexicon for e-mail terminology and provides the Text-To-Speech system with an adequate French transcription or translation for a number of English words.
For example:
Provider /+prO.vaj.'dE+R
hypertext /+ ?i.pER.'tEkst
Customizing the E-Mail Preprocessor
The e-mail preprocessor supports the standard ScanSoft Text-To-Speech SDK user dictionary mechanism, which allows the user to customize the output of the e-mail preprocessor. The user dictionary is consulted both during the header processing and the body processing.
For more information on how to build and use user dictionaries, see the "User Configuration" chapter of the Programmer's Guide.
Customizing the E-Mail Header
The user dictionary is consulted during the header processing while reading the From field and the Subject field.
From Field
The From field consists of either
a) a proper name
b) a proper name and an address
c) an address
a) In case the From field contains a proper name only, the name is passed on to the user dictionary. If the lookup is successful, the proper name is substituted by the replacement string. If not, the name is further processed by the header reading module.
For example:
If the user dictionary contains the following line:
John /+'dZOn
the following From field:
From: John Leblanc
becomes:
Message envoyé par: */+ 'dZOn */+ Leblanc
b) In case the From field contains a proper name and an address, the EMPP first passes the address to the user dictionary. If the lookup is successful, both the proper name and the address are substituted by the replacement string. If not, the EMPP passes the proper name to the user dictionary. If this lookup is successful, the name and the address are substituted by the replacement string. If not, the name is further processed by the header reading module. The address will not be read by the Text-To-Speech system.
For example:
If the user dictionary contains the following lines:
[email protected] Pierre, mon ami sportif
Berthelots le chef
Mortier /+ mOR.'tiR
the following From fields:
From: [email protected] (P. Dupont)
From: [email protected] (Berthelot)
From: "Alex Mortier" <[email protected]>
become:
Message envoyé par: Pierre, mon ami sportif
Message envoyé par: le chef
c) In case the From field contains only an address, the complete address is looked up in the user dictionary. If the lookup is successful, a proper name is added to the From field. If not, only the domain part is sent to the user dictionary. The EMPP first calls the dictionary for the complete domain part. If the lookup is successful, the complete domain part is substituted by the replacement string. Otherwise, the EMPP cuts the leftmost sublevel domain and repeats the lookup and matching procedure for the remainder of the domain part. If the lookup is successful, the remainder of the domain part is substituted by the replacement string. This procedure is called repeatedly until the top level domain is encountered. If none of the lookups is successful, the address is further processed by the header reading module.
For example:
If the e-mail user dictionary contains the following lines:
[email protected] petit Jacques
duripex.com Duripex
the following From fields:
From: [email protected]
From: [email protected]
become:
Message envoyé par: petit Jacques
Message envoyé par: d lecompte at Duripex
NOTE
To allow a correct processing of the From field, the replacement string in the e-mail user dictionary should not contain an address or a domain.
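The address-only lookup described in case c) can be sketched as a loop that strips the leftmost sublevel domain on each pass. This is an illustration under assumptions: the user dictionary is modeled as a plain Python dict, the function name `lookup_address` is made up, the addresses are placeholders on `example.org`, and the sketch reads "until the top level domain is encountered" as meaning that the bare top level domain itself is not looked up.

```python
def lookup_address(address, user_dict):
    """Return the replacement for an address, or None if no lookup succeeds."""
    # 1. Try the complete address first.
    if address in user_dict:
        return user_dict[address]
    # 2. Try the complete domain, then drop the leftmost label and retry.
    local, _, domain = address.partition("@")
    labels = domain.split(".")
    while len(labels) > 1:              # stop once only the top level domain is left
        candidate = ".".join(labels)
        if candidate in user_dict:
            return f"{local} at {user_dict[candidate]}"
        labels = labels[1:]
    # 3. Nothing matched: the header reading module would take over from here.
    return None
```

With a dictionary holding a full address and the entry `duripex.com`, an address on `mail.duripex.com` falls through to the domain entry, while an unknown domain returns `None`.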
Subject Field
Each word in the Subject field is sent to the user dictionary. If the lookup is successful, the replacement string is sent directly to the Text-To-Speech system. If not, the Subject field is further processed by the header reading module.
For example:
If the user dictionary contains the following lines:
DITA d.i.t.a.
Massachusetts /+ ma.sa.tSu.'sEts
the following Subject fields:
Subject: rapport du projet DITA
Subject: fait pas trop mauvais dans le Massachusetts
are pronounced:
Sujet: rapport du projet d.i.t.a.
Sujet: fait pas trop mauvais dans le */+ ma.sa.tSu.'sEts */+
Customizing the E-Mail Body
When a user dictionary has been loaded, the EMPP will call the dictionary for every word of the e-mail body. If the word is found in the user dictionary, it is substituted by the replacement string. If not, the body is further processed by the e-mail body processing module.
For example:
If the user dictionary contains the following line:
strcpy /+ stRinK.kO.'pi
the word "strcpy" in the following sentence:
Ajoute un strcpy à ton programme, ca pourrait aider.
is replaced by the corresponding string found in the e-mail user dictionary:
Chapter III
SSML Preprocessor
SSML Preprocessor
Introduction
SSML (Speech Synthesis Markup Language) is part of a set of markup specifications by the W3C for voice browsers.
General information regarding the RealSpeak SSML processor can be found in the SSML Support chapter of the Programmer's Guide. The RealSpeak Telecom SDK provides a built-in preprocessor that supports a large portion of the SSML 1.0 September 2004 Recommendation (REC). Moreover, RealSpeak extends SSML with a number of ScanSoft-specific elements and attributes.
The set supported by ScanSoft is called "ScanSoft SSML" (4SML). The section below describes the language-specific SSML support included in the RealSpeak Telecom V4.0 French language version.
French specific SSML markup
XML encoding types for French
The encoding is specified in the XML text declaration ("<?xml ... ?>") by the encoding declaration, which is of the form encoding="<EncodingName>".
E.g. <?xml version="1.0" encoding="UTF-8"?>
RealSpeak Telecom V4.0 French supports:
• "Windows-1252" and "ISO-8859-1" (ISO Latin1)
• The Unicode encodings "UTF-8", "UTF-16" and "UCS-4" (note that the alias "ISO-10646-UCS-4" is not supported)
• Any coded character set supported by the ICU component, as long as the input text only contains characters that can be transcoded to the native coded character set, being "Windows-1252". For more information about the character sets supported by ICU, see the ICU website http://www-306.ibm.com/software/globalization/icu and http://www.iana.org/assignments/character-sets.
NOTE
Encoding names are parsed case-insensitively; hyphens and underscores are ignored.
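The name matching rule in the note can be illustrated with a short sketch. The set below holds only the encodings named explicitly in the list above; encodings reached through ICU are not modeled, and both function names are assumptions.

```python
# Encodings named explicitly above, stored in normalized form.
LISTED_ENCODINGS = {"windows1252", "iso88591", "utf8", "utf16", "ucs4"}

def normalize_encoding_name(name):
    """Lower-case the name and drop hyphens and underscores."""
    return name.lower().replace("-", "").replace("_", "")

def is_listed_encoding(name):
    """True if the name matches one of the explicitly listed encodings."""
    return normalize_encoding_name(name) in LISTED_ENCODINGS
```

Under this rule "utf_8", "UTF8" and "Utf-8" all name the same encoding, while the unsupported alias "ISO-10646-UCS-4" normalizes to a string that is not in the set.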
4SML Specifics for French
For reasons of compatibility with the "standard" French system, the parallel text control sequence (<esc> sequence) is listed where applicable. As such, a similar TTS behavior can be created, or combined, with non-SSML driven text input.
4SML Tags | Comment | Corresponding control sequence
High-level and document structure tags
xml:lang | Supported. "fr-FR" for French. Attribute of speak, paragraph, sentence and voice. |
Text normalization tags
<say-as interpret-as="xxx"> | Supported; limited support in e-mail mode. In e-mail mode the only supported interpret-as value is "spell". |
<say-as interpret-as="number" format="cardinal"> | Supported | <esc>\tn=number_cardinal\
<say-as interpret-as="number" format="digits"> | Supported | <esc>\tn=number_digits\
<say-as interpret-as="number" format="decimal"> | Supported | <esc>\tn=number_decimal\
<say-as interpret-as="number"> | Supported | <esc>\tn=number\
<say-as interpret-as="number" format="ordinal"> | Supported | <esc>\tn=number_ordinal\
<say-as interpret-as="number" format="telephone"> | Supported | <esc>\tn=number_telephone\
<say-as interpret-as="number" format="telephone" | Supported | <esc>\
<say-as interpret-as="ordinal"> | Supported | <esc>\tn=ordinal\
<say-as interpret-as="acronym"> | Supported | <esc>\tn=acronym\
<say-as interpret-as="acronym" detail="strict"> | Supported | <esc>\tn=acronym_strict\
<say-as interpret-as="measure"> | Supported | <esc>\tn=measure\
<say-as interpret-as="letters"> | Supported | <esc>\tn=letters\
<say-as interpret-as="letters" detail="strict"> | Supported | <esc>\tn=letters_strict\
<say-as interpret-as="words"> | Supported | <esc>\tn=words\
<say-as interpret-as="date"> | Supported | <esc>\tn=date\
<say-as interpret-as="date" format="mdy"> | Supported | <esc>\tn=date_mdy\
<say-as interpret-as="date" format="dmy"> | Supported | <esc>\tn=date_dmy\
<say-as interpret-as="date" format="ymd"> | Supported | <esc>\tn=date_ymd\
<say-as interpret-as="date" format="ym"> | Supported | <esc>\tn=date_ym\
<say-as interpret-as="date" format="my"> | Supported | <esc>\tn=date_my\
<say-as interpret-as="date" format="dm"> | Supported | <esc>\tn=date_dm\
<say-as interpret-as="date" format="md"> | Supported | <esc>\tn=date_md\
<say-as interpret-as="date" format="y"> | Supported | <esc>\tn=date_y\
<say-as interpret-as="date" format="m"> | Supported | <esc>\tn=date_m\
<say-as interpret-as="time"> | Supported | <esc>\tn=time\
<say-as interpret-as="time" format="h"> | Supported | <esc>\tn=time_h\
<say-as interpret-as="time" format="hm"> | Supported | <esc>\tn=time_hm\
<say-as interpret-as="time" format="hms"> | Supported | <esc>\tn=time_hms\
<say-as interpret-as="duration" format="hms"> | Supported | <esc>\tn=duration_hms\
<say-as interpret-as="duration" format="hm"> | Supported | <esc>\tn=duration_hm\
<say-as interpret-as="duration" format="ms"> | Supported | <esc>\tn=duration_ms\
<say-as interpret-as="duration" format="h"> | Supported | <esc>\tn=duration_h\
<say-as interpret-as="duration" format="m"> | Supported | <esc>\tn=duration_m\
<say-as interpret-as="duration" format="s"> | Supported | <esc>\tn=duration_s\
<say-as interpret-as="duration"> | Supported | <esc>\tn=duration\
<say-as interpret-as="currency"> | Supported | <esc>\tn=currency\
<say-as interpret-as="telephone"> | Supported | <esc>\tn=telephone\
<say-as interpret-as="telephone" detail="punctuation"> | Supported | <esc>\tn=telephone_punctuation\
<say-as interpret-as="address"> | Supported | <esc>\tn=address\
<say-as interpret-as="spell"> | Supported | <esc>\tn=spell\
<say-as interpret-as="net" format="email"> | Supported | <esc>\tn=net_email\
<say-as interpret-as="net" format="uri"> | Supported | <esc>\tn=net_uri\
<say-as interpret-as="net"> | Supported | <esc>\tn=net\
Pronunciation tags
<phoneme alphabet="unipa"> | Supported. See section "The French L&H+ and UNIPA phonetic alphabets" for an overview of the alphabet. |
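To show how the tags in the table fit into a document, the sketch below assembles a small SSML string and checks that it is well-formed XML. It uses only the Python standard library and does not call RealSpeak; the helper names `say_as` and `speak`, the sample sentence and the sample values are assumptions, and the input syntax that each interpret-as/format pair expects is not specified in this section.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"  # namespace from the SSML 1.0 Recommendation

def say_as(text, interpret_as, fmt=None):
    """Wrap text in a <say-as> element with the given interpret-as and format."""
    attrs = f"interpret-as={quoteattr(interpret_as)}"
    if fmt is not None:
        attrs += f" format={quoteattr(fmt)}"
    return f"<say-as {attrs}>{escape(text)}</say-as>"

def speak(body, lang="fr-FR"):
    """Wrap a body in the XML declaration and a <speak> root element."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"{body}</speak>"
    )

document = speak(
    "Le total est de " + say_as("1250", "number", "cardinal")
    + " euros, code " + say_as("4071", "number", "digits") + "."
)

# Parsing the bytes confirms the markup is well-formed before it is sent anywhere.
root = ET.fromstring(document.encode("utf-8"))
```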
Chapter IV
Custom G2P Dictionaries
Custom G2P Dictionaries
Introduction
ScanSoft's RealSpeak system now offers support for custom G2P dictionaries. A custom G2P dictionary module is an add-on module specifically designed to improve the quality of pronunciation for specific kinds of words.
The French system is currently not designed to support the use of a custom G2P dictionary module.
Appendices
Appendices
Appendix A: French voice and language strings
The RealSpeak Telecom Text-To-Speech system now supports selecting the voice and language via a string as well as a define (please see the definition of the function TtsInitialize(Ex)() in the Programmer's Guide and also the Backwards Compatibility Guide for details). The name strings for the currently supported French voices are listed in the table below.
French Voice Name Strings
Voice | Name String
Virginie | "Virginie"