
CEDAT 85. Innovating services and technologies for speech content management


Academic year: 2021


(1)

Innovating services and technologies for speech content management

(2)

Company profile

25 years experience in the market of transcription/reporting services;

Cedat 85 Group:

Cedat 85 srl

Subtitle Voice (57%)

Real Time Reporting srl (55%)

70+ employees (group)

2011 turnover (Group): 5+ million Euros

ITALY: Roma, Padova, Genova, Palermo, San Vito dei Normanni (BR)

(3)

Main customers

(4)

Cedat 85 – main partners

LUSI Lab, University of Naples “Federico II”

LUSI Lab (Language Understanding and Speech Interfaces) is part of the Physics Department, University of Naples “Federico II”.

LUSI Lab and Cedat 85 cooperate to:

• Integrate research and industrial applications to transfer to the markets

• Create prototypes for pre-competitive solutions based on Cedat 85 ASR technology

EML – European Media Laboratory GmbH

EML and Cedat 85 cooperate to:

• Research in the field of ASR technologies

• Develop new languages

(5)

Technologies & services

Speech reporting

- High quality speech reporting service: parliamentary reporting (political debates at top State institutions)

- Real time high quality speech reporting service (congresses, meetings, lectures)

Speech-to-Text (S2T) enabling technologies

- speaker independent spontaneous speech (without human intermediation) with large vocabularies (300k+ words)

- speaker dependent (with human intermediation)

Professional services

- domain customization for speaker independent automatic speech recognition system

(6)

Products & services

Multimedia transcription

- audio/video - text time alignment for indexing, searching & mining media contents

Mobile application

- sms/email dictation; voice/text chat

Subtitling

- real time subtitling with full text review

- standard subtitling

Enhanced reporting work-flow system

- web-based system for real time full management and monitoring of reporting work-flow (high quality audio recording, automated audio/text dispatching, text delivery, audio/text archive)


(7)

SPEECH to TEXT (S2T)

(8)

Speech-to-Text enabling technologies

Speech Communication Technologies

Subtitling

Speech-to-Text (S2T) Applications

Speech Analytics for Telephone Conversations

Media Transcription

SMS/Email dictation

Building High Performance Speech-to-Text Systems

(9)

Speech-to-Text enabling technologies

Technology:

- Functionality: Speech-to-Text automatically transforms human conversations into text and makes unstructured speech content available for analysis & search

- Challenges: Speaker independence, very large vocabulary, robust speech recognition

Application areas:

- Speech Analytics:

http://en.wikipedia.org/wiki/Speech Analytics : Speech Analytics is a term used to describe automatic methods of analyzing speech to extract useful information about the speech content or the speakers. Although it often includes elements of automatic speech recognition, …, it may also include analysis of one or more of the following: the topic(s) being discussed, the identities of the speaker(s), the genders of the speakers, the emotional character of the speech, the amount & locations of speech vs. non-speech (e.g. background noise or silence)

• Scenarios: Telephony conversations - human & agent, human & self-service

• Result: index & search; classify & filter; classify & route; statistical analysis

- Media Transcription:

• Objective: automatically produce high quality, verbatim transcription of conversations

• Scenarios: Political debates, Meetings & Lectures, Broadcast News

(10)

Speech Analytics for Telephone Conversations

Speech Search

Objective: Turn spoken search requests into text, classify, push search result

Benefits: convenience, server side S2T & integration to search engines

Speech Messaging

Objective: S2T, (maybe) classify content and send to receiver (e.g. SMS)

Benefits: get insight to messages and judge priority

Automated Quality / Activity Monitoring for Contact Center & Helpdesks

Objective: Monitor and “mine” ideally 100% of calls

Benefits: Agent performance, Business Intelligence, Cross-sell/up-sell, …

Customer Inquiries

Objective: Interview customer before / during / after call

✴ Note: Free form answers are orthogonal to automated, self-service scenarios

Benefits: measure & improve customer satisfaction, categorize problem during waiting loop, route customer to proper agent,

(11)

Media Transcription

Objective:

• Transcribe audio streams in media data to text and provide time alignments (word, phrases, paragraphs, …)

Benefit:

• Index, search & mine media content (video, audio)

• Video subtitling

Scenario:

• Transcription of Broadcast News and audio/video archives
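To make the time-alignment idea concrete, here is a minimal Python sketch of how a word-level aligned transcript could be indexed and searched. It is only an illustration: the `AlignedWord` record, its fields and the helper names are hypothetical and are not taken from any Cedat 85 product or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedWord:
    """One recognized word with its position in the audio (hypothetical record)."""
    text: str
    start: float  # seconds from the beginning of the media file
    end: float    # seconds from the beginning of the media file


def build_index(words):
    """Map each lower-cased word to the list of start times where it occurs."""
    index = {}
    for word in words:
        index.setdefault(word.text.lower(), []).append(word.start)
    return index


def find_phrase(words, phrase):
    """Return (start, end) time spans where the phrase occurs word for word."""
    target = [token.lower() for token in phrase.split()]
    hits = []
    if not target:
        return hits
    for i in range(len(words) - len(target) + 1):
        window = words[i:i + len(target)]
        if [w.text.lower() for w in window] == target:
            hits.append((window[0].start, window[-1].end))
    return hits


transcript = [
    AlignedWord("good", 0.0, 0.4),
    AlignedWord("morning", 0.4, 0.9),
    AlignedWord("good", 2.0, 2.3),
]
index = build_index(transcript)
spans = find_phrase(transcript, "Good morning")
```

The same time spans could in principle drive subtitle cues, since each hit already carries a start and an end time.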

(12)

Automated Quality & Activity Analysis

Recording & quality monitoring platform: record & store customer calls

Speech-to-Text: transform human conversations into text

Automatic call analysis

• Text representation for indexing for retrieval & analysis

• Call Categorization: assign call category & route appropriately

• order, availability, cancelation, complaint, …

• Call Filter: spot few “bad calls” and improve evaluation process

• “agent followed start/end procedure?”

• “agent solved caller problem?”, …

• Automatic Form Filling: Evaluation Sheet

Benefits:

• increased productivity and “quality of service” > gain insight to call
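As a rough illustration of call categorization on top of a transcript, the Python sketch below assigns one of the categories named on the slide by counting keyword matches. The keyword lists and the `categorize_call` function are assumptions made up for this example; a production system would more likely use a trained classifier.

```python
# Hypothetical keyword lists; the category names follow the slide.
CATEGORY_KEYWORDS = {
    "order": {"order", "buy", "purchase"},
    "availability": {"available", "availability", "stock"},
    "cancelation": {"cancel", "cancelation", "cancellation"},
    "complaint": {"complaint", "complain", "unacceptable"},
}


def categorize_call(transcript: str) -> str:
    """Return the category with the most keyword matches, or 'uncategorized'."""
    tokens = [token.strip(".,!?;:").lower() for token in transcript.split()]
    scores = {
        category: sum(1 for token in tokens if token in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    # Ties resolve to the category listed first in CATEGORY_KEYWORDS.
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"


category = categorize_call("I want to cancel my order, please cancel it.")
```

The returned label could then be used to route the call or to pre-fill a field of an evaluation sheet.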

(13)

Customer Inquiries – Before/during/after Call

Objective:

• Interview customer before/during/after call + statistical analysis

Benefit:

• Measure and improve customer satisfaction

Scenario:

• Transcription of Broadcast News and audio/video archives

Telephone & Network

(14)

VoiceNote - Speech Messaging

Objective:

Voice messages are automatically transcribed and sent as email/SMS or posted on social networks

Alternatively: categorize content (e.g. meeting) and send

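BUONGIORNO LA CHIAMO PER AVERE CONFERMA L' APPUNTAMENTO DI DOMANI MI PUÒ RICHIAMARE AL TRE QUATTRO SETTE SEI DUE SEI ZERO OTTO QUATTRO DUE GRAZIE

(English: GOOD MORNING I AM CALLING TO CONFIRM TOMORROW'S APPOINTMENT CAN YOU CALL ME BACK ON THREE FOUR SEVEN SIX TWO SIX ZERO EIGHT FOUR TWO THANK YOU)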


Available in: Italian, German

(15)

VoiceChat

IM Application (Instant Messaging) based on client/server architecture allowing double interaction: VOICE and TEXT

Realtime chat with Android smartphones, just exchanging short phrases between two users

(16)

VoiceChat

Visually impaired people (with reading and typing difficulties) can interact through voice and listen to the messages through speech synthesis

Hearing-impaired people can receive on their phone the transcription of the communication

(17)

Building High Performance Speech-to-Text Systems

AUDIO

Transcription Engine

• ASR accuracy (WER, SER)

• Real-time capable

Acoustic Model

• Speaker coverage: dialect, speaking style, gender, age, non-native

• Channel coverage: mobile phone, VoIP, microphones (far field, close talk)

• Environmental noise: office, car, train, public places

Language Model

• Vocabulary

• Word sequences

Application & Language independent

Language data - How to speak

Application data - What is spoken

RECOGNIZED TEXT

Needs: Technology + Data, Data, Data

There is no Data like more Data (Bob Mercer, 1989)
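The accuracy figures named on this slide can be illustrated with a short Python sketch. WER is computed here in the usual way, as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. SER is read as sentence error rate, which is an assumption, since the slide does not expand the abbreviation. Function names are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def sentence_error_rate(references, hypotheses) -> float:
    """Fraction of sentences with at least one word error (assumed meaning of SER)."""
    pairs = list(zip(references, hypotheses))
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    wrong = sum(1 for ref, hyp in pairs if ref.split() != hyp.split())
    return wrong / len(pairs)


wer = word_error_rate("a b c d", "a x c")
ser = sentence_error_rate(["a b", "c"], ["a b", "d"])
```

In the first call one word is substituted and one is deleted against four reference words, so the WER is 0.5. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.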

(18)

Contact Details

Enrico Giannotti General Manager Cedat 85 srl Tel: +39 06 45230.800 x 816 Mob: +39 348.1302376 Email: [email protected]

via di Monte D'oro, 16 - 00186  Roma tel.: +39 06 68134164

Numero verde: 800.85.00.85 Email: [email protected]
