Pattern Recognition and Human Language Technology Group
(Grupo de Reconocimiento de Formas y Tecnologías de la Percepción)
Institut Tecnològic d'Informàtica
Departament de Sistemes Informàtics i Computació
Universitat Politècnica de València
http://prhlt.iti.es/
December 2005
BRIEF HISTORY
1981 Universitat de València: RFIA group
1986 Universitat Politècnica de València: RFIA group, DSIC department
• European projects: ROARS, SPIN, EuTrans-I
• Spanish projects: ALBAYZIN, TRACOM
1997 The PRHLT group branches off from the RFIA group and joins ITI
• European projects: EuTrans-II, TT2
• Spanish projects:
– Pattern Recognition and Computer Vision: TAR, ATRAM, TYRIG
– Machine Translation: TAVAL, SISHITRA, TEFATE,
MAIN RESEARCH AREAS
• Pattern Recognition
• Handwritten Character Recognition
• Biometrics
• Computer Vision
• Language Translation
• Speech Processing
• Machine Learning
• Dialogue Systems
PEOPLE
• Academic degree
– 12 Ph.D.
– 19 Ph.D. students
• Academic position
– 14 Professors and assistants (DSIC-UPV, DC-UPV, DI-UCLM)
– 8 Research contracts
RECENT EUROPEAN PROJECTS
• Projects
– EUTRANS: Example based langUage TRANslation Systems.
ESPRIT Open Long Term Research. ACCION 30268. 2 Phases. 1996-2000. ITI, Zeres, Fondazione Ugo Bordoni, Aachen University.
– TT2: TransType2-Computer-Assisted Translation.
IST Programme. IST-2001-32091. 2002-2005. ATOS, ITI, Aachen University, Celer, RALI, Xerox Co., Gamma.
• “Acciones Integradas” (Integrated Actions)
– Spain-Germany. 2000-2002: ITI, Aachen University
– Spain-Portugal. 2002-2004: ITI, INESC ID/IST Lisbon.
RECENT SPANISH PROJECTS (CICYT)
• EXTRA: Extensiones del sistema de traducción de texto y habla en dominios restringidos aprendible con ejemplos. 1997-1999. UJI, DSIC-UPV.
• BASURDE: Desarrollo de un sistema de diálogo para habla espontánea en un dominio semánticamente restringido. 1998-2001. UPC, DSIC-UPV, LSI-UPC, EHU-UPV, UZ, UJI.
• TAVAL: Traductor automático bidireccional entre castellano y valenciano. 2000-2001. ITI.
• SISHITRA: Sistemas híbridos para la traducción valenciano-castellano a partir de voz y texto. 2001-2003. ITI, LSI-UA.
• TAR: Técnicas avanzadas en reconocimiento de formas y sus aplicaciones en procesos industriales y comerciales. 2001-2003. ITI, LSI-UA, LSI-UJI.
• DIHANA: Sistema de Diálogo para el Acceso a la Información mediante habla espontánea en diferentes entornos. CICYT. 2003-2005. DSIC-UPV, EHU-UPV, UZ.
• ATRAM: Aplicación de Técnicas de Reconocimiento de Formas para el Análisis Morfológico del Pie y Fabricación del Calzado. CICYT. 2001-2004. IBV, DSIC-UPV.
• ITEFTE: Inferencia de traductores de estados finitos para la traducción automática y la ayuda a la traducción en tareas específicas. 2003-2005. LSI-UA, DSIC-UPV.
RECENT PROJECTS WITH COMPANIES
• AMETRA, ADUR Software Productions S. Co.
• Pick-by-Voice, RUMBO Sistemas S.L.
• A classification system by voice, Teismaderas S.A.
• Biometrics, Advanced Software Technologies S.A.
• Opinion poll by phone, ODEC.
PUBLICATIONS
• Proceedings of conferences: International Conference on Pattern Recognition, International Conference on Acoustics, Speech and Signal Processing, Conference on Computational Linguistics, ...
• Journals: IEEE Transactions on Pattern Analysis and Machine Intelligence, Pattern Recognition, Machine Learning Journal, Computational Linguistics, IEEE Transactions on Acoustics, Speech and Signal Processing, Computer Speech and Language, IEEE Transactions on Speech and Audio Processing, IEEE Transactions on Systems, Man and Cybernetics, ...
LINKS
• European groups
– Lehrstuhl für Informatik VI. RWTH Aachen - University of Technology. Germany. (H. Ney)
– Equipe Universitaire de Recherche en Informatique de Saint Etienne (EURISE). Université de Saint Etienne-Jean Monnet. France. (C. de la Higuera)
– Institute for Systems and Computer Engineering. Spoken Language Systems Lab (L2F). Lisboa. Portugal. (I. Trancoso)
• Spanish groups
– Grupo de Reconocimiento Automático del Habla. Universidad del País Vasco
– Grupo de Aprendizaje Computacional, Reconocimiento Automático y Traducción del Habla. Universitat Jaume I.
– Grupo de Reconocimiento de Formas e Inteligencia Artificial. Universidad de Alicante
– Grup de Teoria del Senyal. Universitat Politècnica de Catalunya
METHODOLOGIES
• MODELS
– Hidden Markov models
– Stochastic finite-state transducers
– Statistical alignment models and phrase-based models
– (Local) Feature vectors and (weighted) distances
• TRAINING
– Statistical estimation (E-M algorithms)
– Grammatical inference techniques
– Clustering
• SEARCH
– Viterbi algorithm (+ N-best + Word graphs)
– Stack-decoding algorithm (+ N-best)
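As an illustration of the search step listed above, the following is a minimal sketch of 1-best Viterbi decoding over a hidden Markov model. It is not the group's implementation: the function name `viterbi`, the helper `to_log`, the two states `s1`/`s2`, the symbols `x`/`y`/`z`, and all probabilities are made-up assumptions for a toy example, and the N-best and word-graph extensions are not covered.

```python
import math


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # delta[s]: best log-probability of any state path ending in s so far.
    delta = {s: log_start[s] + log_emit[s][observations[0]] for s in states}
    backpointers = []
    for obs in observations[1:]:
        new_delta, pointers = {}, {}
        for s in states:
            # Best predecessor of s at this time step.
            prev = max(states, key=lambda p: delta[p] + log_trans[p][s])
            new_delta[s] = delta[prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        delta = new_delta
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path


# Hypothetical toy model (all numbers are illustrative assumptions).
states = ["s1", "s2"]
log_start = to_log({"s1": 0.6, "s2": 0.4})
log_trans = to_log({"s1": {"s1": 0.7, "s2": 0.3},
                    "s2": {"s1": 0.4, "s2": 0.6}})
log_emit = to_log({"s1": {"x": 0.5, "y": 0.4, "z": 0.1},
                   "s2": {"x": 0.1, "y": 0.3, "z": 0.6}})

best_logp, best_path = viterbi(["x", "y", "z"], states,
                               log_start, log_trans, log_emit)
print(best_path, math.exp(best_logp))
```

For this toy model the best path is s1, s1, s2. Working in log-probabilities avoids numerical underflow on long sequences, which is the usual reason to prefer sums of logs over products of probabilities.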
BASIC PROTOTYPES
• ATROS: Speech recognition, speech translation and handwritten character recognition.
• PBSMT: Machine translation.
• TT2: Computer-assisted translation.
• LFC: Face recognition, speaker recognition, computer vision.
MACHINE TRANSLATION SYSTEMS
• SISHITRA: A knowledge-based Spanish-to-Catalan translator http://prhltdemos.iti.es/~taval/ (Access)
• Statistical translators
http://dcomgp05.gnd.upv.es/WebTrans.debug/trad
– TEFATE: Spanish-to-Catalan (Unrestricted task)
– AMETRA-METEO: Spanish-to-Basque (Meteorological News)
– AMETRA-DFB: Spanish-to-Basque (Administrative task)
COMPUTER-ASSISTED TRANSLATION
• Translation of printer manuals
– English-Spanish
– English-German
– English-French
• Weather reports
– Spanish-Basque
• Administrative proceedings
– Spanish-Basque
– Spanish-Catalan
SPEECH-TO-SPEECH TRANSLATION (Access)
• Spanish-to-English tourist task
http://prhltdemos.iti.upv.es/demo/spanish_demo.html
• Italian-to-English tourist task
http://prhltdemos.iti.upv.es/demo/italian_demo.html
• Catalan-to-English tourist task
http://prhltdemos.iti.upv.es/demo/valcat_demo.html
• Spanish-to-Basque tourist task
http://prhltdemos.iti.upv.es/demo/ametra_demo.html
SPEECH RECOGNITION
• A speech understanding prototype
http://prhlt.iti.es/demos/demo_speechunderstand/index.htm
• Automatic voice-driven telephone exchange
http://prhlt.iti.es/demos/demo_exchange/index.htm
• Speech dialogue with an information system
http://physionet.cps.unizar.es/~eduardo/investigacion/voz/tic98-0423.html
HANDWRITTEN CHARACTER RECOGNITION
• Off-line handwritten character recognition
– An example
http://prhlt.iti.es/demos/demo_htr/index.htm
BIOMETRICS
• Automatic face recognition
– Preprocessing
– Training & search
OTHER CLASSIFICATION TASKS
• Plain text classification
• Handwritten text classification
• Chromosome classification
• Prostate ultrasonography pattern analysis
• Breast cancer detection in digitized mammograms