(1)

VECSYS LIMSI

ARCHITECTURE

Samir Bennacef

Vecsys

(2)

Centralized Architecture

[Figure: block diagram of the centralized architecture. Modules: Telephone Interface, Speech Recognizer, Semantic Analyzer, Dialog Manager, Information Retrieval, Sentence Generator, Speech Synthesis. Knowledge sources: Acoustic models, Language models, Caseframe Grammar, Task Model, Database, Generation Grammar, Unit Dictionary. Data-flow labels: text, semantic frame, queries, results, sentence.]

(3)

Telephone Interface

phone program

– Input: commands and speech

– Output: events and speech

• Recording and playback, DTMF detection and generation, pickup, hangup, and call transfer

• Hardware echo cancellation

• Barge-in based on adaptive speech detection

• NMS QX2000 hardware

(4)

Speech Recognizer

• Cepstral features computation: sig2mfcc

– Input: speech recorded by the phone program

– Output: a 13-component cepstral vector every 10 ms, on an 8 kHz bandwidth

• Speech recognizer: nsearch

– Inputs: commands and cepstral coefficients

– Output: recognized text
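
The "one vector every 10 ms" output rate can be illustrated with a little framing arithmetic. This is a minimal sketch, not the sig2mfcc implementation: it assumes 8000 samples per second, a 240-sample analysis window, and an 80-sample shift. Those values are a guess at what the `-r8000`, `-w240`, and `-s80` options in the launch script mean, since the slides do not document the flags. The function name `frame_starts` is hypothetical, and no cepstral computation is performed.

```python
# Assumed values (not documented in the slides): 8000 samples/s,
# 240-sample window, 80-sample shift between consecutive windows.
SAMPLE_RATE = 8000
WINDOW = 240
SHIFT = 80


def frame_starts(num_samples, window=WINDOW, shift=SHIFT):
    """Return the start index of every full analysis window in the signal."""
    if num_samples < window:
        return []
    return list(range(0, num_samples - window + 1, shift))


if __name__ == "__main__":
    starts = frame_starts(SAMPLE_RATE)  # one second of speech
    shift_ms = 1000 * SHIFT / SAMPLE_RATE
    print(f"{len(starts)} frames, one every {shift_ms} ms")
```

Under these assumptions an 80-sample shift at 8000 samples per second corresponds to the 10 ms frame period quoted on the slide.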

(5)

Semantic Analyzer

• Lexical normalization and labelling: sentprocess

– Input: recognized sentence

– Output: labelled sentence

• Caseframe analysis: cases

– Input: labelled sentence

– Output: semantic frame
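
The two stages can be pictured with a toy example in Python. This is only a sketch under strong assumptions: the `label(word)` notation and the slot names are copied from the run-time trace later in the deck, but the lexicon, the slot table, the input word list, and the function names are all hypothetical. The real rule files (`rules.txt`, `caseframe.txt`) are not shown in the slides and are certainly richer than this.

```python
# Hypothetical lexicon: surface word -> label (notation mimics the trace).
LEXICON = {
    "Paris": "$place",
    "Lille": "$place",
    "pour": "*to",
    "demain": "demain",
    "matin": "*matin",
}

# Hypothetical slot table: label -> (slot name, which value to store).
SLOTS = {
    "$place": ("place", "word"),
    "*matin": ("departure-period", "label"),
    "demain": ("departure-date", "label"),
}


def label_sentence(words):
    """Stage 1 (labelling): attach a label to every word the lexicon knows."""
    return [(LEXICON[w], w) for w in words if w in LEXICON]


def format_labelled(labelled):
    """Render labelled tokens in the label(word) notation used by the trace."""
    return " ".join(f"{label}({word})" for label, word in labelled)


def build_frame(labelled):
    """Stage 2 (caseframe analysis, heavily simplified): fill slots from labels."""
    frame = []
    for label, word in labelled:
        if label in SLOTS:
            slot, source = SLOTS[label]
            frame.append((slot, word if source == "word" else label))
    return frame


if __name__ == "__main__":
    words = ["Paris", "Lille", "pour", "demain", "matin"]  # hypothetical input
    labelled = label_sentence(words)
    print(format_labelled(labelled))
    for slot, value in build_frame(labelled):
        print(f"{slot}: {value}.")
```

Labels without an entry in the slot table (here `*to`) are simply dropped, which is one plausible reading of why the trace's frame has no slot for it.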

(6)

Dialog Manager

Dialog

– Input: semantic frame resulting from cases

– Output: semantic frame to be converted into natural language

• Contextual understanding

• Database query generation

• Semantic frame generation

(7)

Natural Language Generator

Genere

– Input: semantic frame resulting from dialog

– Output: natural language sentence

(8)

Information Retrieval Interface

Dbserver

– Input: SQL query

– Output: database result

• Query parsing and translating

• Retrieves information from the target database

• Provides the result table
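
As an illustration of the query-parsing step, the Python sketch below splits a request of the shape shown in the run-time trace (`SELECT ... WHERE a=b AND c ~= d`) into a column list and a list of conditions. It is an assumption-laden toy: the accepted grammar is inferred from that single trace line, the `~=` operator is kept as an opaque token because the slides do not define it, and `parse_request` is a hypothetical name. The translation into the target database's own query format is not attempted.

```python
import re


def parse_request(request):
    """Split 'SELECT cols WHERE cond AND cond ...' into (columns, conditions).

    Each condition is returned as a (field, operator, value) tuple.
    """
    match = re.fullmatch(r"SELECT (.+?) WHERE (.+)", request.strip())
    if match is None:
        raise ValueError(f"unsupported request: {request!r}")
    columns = [col.strip() for col in match.group(1).split(",")]
    conditions = []
    for clause in match.group(2).split(" AND "):
        cond = re.fullmatch(r"(\w+)\s*(~=|=)\s*(\S+)", clause.strip())
        if cond is None:
            raise ValueError(f"unsupported condition: {clause!r}")
        conditions.append(cond.groups())
    return columns, conditions


if __name__ == "__main__":
    # Request copied from the trace in the "How the system works" slides.
    request = (
        "SELECT from, deph, to, arrh, chg, day, stopa, stopah, stopd, "
        "stopdh, stopdur, type WHERE from=Paris AND to=Lille "
        "AND day=17/5/101 AND arrh ~= 1000"
    )
    columns, conditions = parse_request(request)
    print(columns)
    print(conditions)
```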

(9)

Speech Synthesis System

Syn

– Input: sentence resulting from genere

– Output: speech signal, which is played by the telephone interface

• Uses a unit dictionary

• Selects the best sequence of units using a dynamic programming algorithm
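
The slides do not say which units or costs the synthesizer uses, so the following Python sketch only illustrates the general idea of choosing a best unit sequence by dynamic programming. It assumes one list of candidate units per position, a per-unit target cost, and a cost for joining two consecutive units; the function name, the cost functions, and the example data are all hypothetical.

```python
def select_units(candidates, target_cost, join_cost):
    """Return (total_cost, units) for the cheapest path through the candidates.

    candidates: list of lists, one list of candidate units per position.
    target_cost(pos, unit): cost of using `unit` at position `pos`.
    join_cost(prev_unit, unit): cost of concatenating two consecutive units.
    """
    if not candidates or any(not options for options in candidates):
        raise ValueError("every position needs at least one candidate")
    # best holds, for each candidate at the current position,
    # the cheapest (cost, path) ending in that candidate.
    best = [(target_cost(0, unit), [unit]) for unit in candidates[0]]
    for pos in range(1, len(candidates)):
        new_best = []
        for unit in candidates[pos]:
            cost, path = min(
                ((prev_cost + join_cost(prev_path[-1], unit), prev_path)
                 for prev_cost, prev_path in best),
                key=lambda item: item[0],
            )
            new_best.append((cost + target_cost(pos, unit), path + [unit]))
        best = new_best
    return min(best, key=lambda item: item[0])


if __name__ == "__main__":
    # Hypothetical units: (name, pitch in Hz). Joining is cheaper when pitches match.
    candidates = [
        [("a", 100), ("a", 120)],
        [("b", 118), ("b", 90)],
        [("c", 119)],
    ]
    cost, units = select_units(
        candidates,
        target_cost=lambda pos, unit: 0,
        join_cost=lambda prev, unit: abs(prev[1] - unit[1]),
    )
    print(cost, units)
```

Because only the cheapest path into each candidate is kept at every position, the work grows with the number of positions times the square of the candidates per position, rather than with the number of complete sequences.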

(10)

C-shell script

# --- Phone interface --- #
rsh $remote $bin/phone.exe -h$dialhost $dialport -t70 -x8192 -n2 \
    -l2 -g -f$cfg/cta.cfg -a$data &

# --- Speech recognizer loading --- #

# ---- SigToCep ---- #
set CEP = \
    $bin/sig2mfcc -w240 -s80 -l20 -n12 -r8000 -b0:3500 -c -en0 -0 \
    --$fifo/tosentrec.fifo$i $fifo/tosig2mfcc.fifo$i -:

# ---- Speech Recognizer ---- #
set RECO = \
    $bin/nsearch -@$phones -d$fifo/tosentrec.fifo$i -t \
    -p${plist}:$stbl -s0:160:0:f -l$voc -z3 -w4:25 -n1 -q63,12:8:3 \
    -zb$tg -zw30 -zr -xg$gsl -zy$clst -sw50 -sh25000 \
    -cmr${cepmean}:0.996 -en4.5 -- $hmm -xf

$bin/recocheck -r $fifo/torecord.fifo$i -c$CEP -d$RECO \
    -t$fifo/fromdial.fifo$i -v < \

(11)

# --- Semantic Analyzer and Dialogue loading --- #
$bin/sentprocess -k -t -d -c -v2 $dial/rules.txt < \
    $fifo/tocases.fifo$i | \
$bin/cases -k -o -m -v $dial/caseframe.txt | \
$bin/dialogue -i -v1 $dial/task.txt $dial/dial.arg \
    -tr$fifo/pushtotalk.fifo$i -fp$fifo/fromplay.fifo$i \
    -fn$fifo/todial.fifo$i -rf$tmp/reco.tmp$i \
    -e$fifo/fromdial.fifo$i -fg$fifo/fromgenere.fifo$i \
    -tt$fifo/todb.fifo$i -ft$fifo/fromdb.fifo$i | \
$bin/genere $dial/genere.txt -f$fifo/fromgenere.fifo$i -v

(12)

-l"$logcmd" -s$fifo/torecord.fifo$i -db$fifo/fromplay.fifo$i \
    -dt$fifo/todial.fifo$i -df$fifo/fromdial.fifo$i -dp$dialpid \
    -kf$fifo/fromdbconn.fifo$i -kt$fifo/todb.fifo$i \
    -kw$synt/sig/waitdb.sig -kl$synt/sig/wait.sig -v -r \
    < $fifo/fromphone.fifo$i > $fifo/tophone.fifo$i &

# --- Database Loading --- #
$bin/dbserver -t$fifo/todbtarg.fifo$i -f$fifo/fromdbtarg.fifo$i \
    -c$db/table1.txt -s$db/table2.txt -p$db/table3.txt \
    -d$fifo/fromdbconn.fifo${i}:120 -m10 -a -v2 \
    < $fifo/todb.fifo$i > $fifo/fromdb.fifo$i &

# --- Synthesis loading --- #
$bin/syn -s${sig}:2 -l$wd -w4:2:0 -o$fifo/toplay.fifo$i -c \
    $synt/wdlist.lst &

(13)

How the system works

server.csh: telephone interface loading
server.csh: speech recognizer loading
server.csh: dialog loading
server.csh: dispatcher loading
server.csh: dbserver loading
server.csh: synthesis loading
telephone: pickup
telephone: line number=[0]
telephone: play
telephone: get dtmf [*]
dialogue:
frame: { concept: (acte formalite-ouverture). }
genere: Quel voyage souhaitez-vous effectuer ? [What trip would you like to make?]
telephone: play
telephone: end of play
telephone: recording

(14)

Lille -> $place

matin -> *matin

$place(Paris) $place(Lille) *to(pour) demain(demain) *matin(matin)

cases: <defaut>
{
place: Paris.
place: Lille.
departure-period: *matin.
departure-date: demain.
}

dialogue: request=[SELECT from, deph, to, arrh, chg, day, stopa, stopah, stopd, stopdh, stopdur, type WHERE from=Paris AND to=Lille AND day=17/5/101 AND arrh ~= 1000]

dbserver: target query=[00043 00000001 ? 12 FRPAR FRLIL 17 MAY 1000]

dbserver: result=[1 ( from deph to arrh chg day stopa

(15)

nb-trains: (value 1).
concept2: (acte confirmation) (value hour).
from-place: (value Paris-Gare-du-Nord).
to-place: (value Lille-Flandres).
departure-wday: (value jeudi).
departure-day: (value 17/5/101).
departure-period: (value *matin).
stop: (value 0).
sched: (dep 0858) (arr 0959).
}

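genere: Le matin, jeudi dix-sept mai vous avez un train de Paris-Nord à Lille-Flandres à huit heures cinquante-huit arrivant à neuf heures cinquante-neuf. Cet horaire vous convient-il ? [In the morning, Thursday 17 May, you have a train from Paris-Nord to Lille-Flandres at 8:58, arriving at 9:59. Does this schedule suit you?]

nsearch: <s> oui </s> [yes]

cases: <defaut> { mode: *affirmatif.}

dialogue: { concept: (acte relance) (value retour). }

genere: Souhaitez-vous le retour ? [Would you like the return trip?]

nsearch: recognized string: <s> non merci </s> [no, thank you]

genere: Vous avez donc un aller Paris-Nord Lille-Flandres le jeudi dix-sept mai départ huit heures cinquante-huit, arrivée neuf heures cinquante-neuf. Souhaitez-vous un autre trajet ? [So you have a one-way trip from Paris-Nord to Lille-Flandres on Thursday 17 May, departing at 8:58 and arriving at 9:59. Would you like another trip?]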

(16)

Distributed Architecture

[Figure: block diagram of the distributed architecture. Labels: Client Application 1, Client Application 2, ..., Client Application m; Application Programming Interface (Data exchange Protocols, Service Name-Address Resolution); Network (TCP/UDP); Host 1 (Master) and Host n (Slave), each shown with a Vnetd Daemon and services (Recognizer, Dialogue, Speech synthesis, Other Services…); Net Audio server.]

(17)

Services

• 1. Audio

• 2. Speech recognition

• 3. Dialog (understanding, dialog and generation)

• 4. Information retrieval

• 5. Speech synthesis

(18)

Galaxy Communicator

• Similarities between GC and Oasis

– A distributed client/server architecture

– A central manager: the hub in GC and the application manager in Oasis

– A set of services listening for client connections and requests

(19)

Compliant

• Include the GC server functions in all services:

– perform initialization

– include a dispatch function

– invoke the hub using the GalIO_Comm family of functions

(20)

Tests and Evaluation

• The speech recognizer only

• The dialog connected to the database

• The dialog with the recognizer

• The whole system
