New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks

Academic year: 2021

(1)

University of Plymouth

United Kingdom

{L.Sun; E.Ifeachor}@plymouth.ac.uk

Dr. Lingfen Sun

(2)

Outline

• Background
  - Speech quality for VoIP networks
  - Current status
  - Aims of the project
• Main Contributions
  - Novel non-intrusive voice quality prediction models
  - Novel perceptual-based speech quality optimization mechanism (e.g. jitter buffer optimization)

(3)

Background – Speech Quality for VoIP Networks

• VoIP speech quality: end-user perceived quality, expressed as a Mean Opinion Score (MOS), is an important metric.
• Affected by IP network impairments and other impairments.
• Voice quality measurement: subjective (MOS) or objective

[Figure: end-to-end perceived speech quality (MOS) over an SCN / gateway / IP network / gateway / SCN path, showing intrusive measurement (reference speech vs. degraded speech) and non-intrusive measurement. SCN: Switched Communication Networks (PSTN, ISDN, GSM, ...)]

(4)

Current Status and Problems

• Lack of an efficient non-intrusive speech quality measurement method
  - E-model (a complicated computational model)
  - Models/parameters are derived from subjective tests, which are time-consuming and expensive. Only limited models exist
• Lack of perceptual optimization control methods
  - Existing methods rely only on individual network parameters for buffer optimization and QoS control purposes

(5)

Aims of the Project

[Figure: VoIP system. Sender: voice source, encoder, packetizer. IP network. Receiver: de-packetizer, jitter buffer, decoder, voice receiver. Non-intrusive measurement gives the end-to-end perceived voice quality (MOS).]

• To develop novel and efficient methods/models for non-intrusive quality prediction,
• To apply the models for perceptual-based optimization control

(6)

Novel Non-intrusive Voice Quality Prediction

[Figure: a VoIP network (packet loss, delay, codec, ...) turns reference speech into degraded speech. Intrusive method: PESQ gives MOS(PESQ), which is combined with delay in the E-model to give the measured MOSc. Non-intrusive method: a new model (regression or ANN models) gives the predicted MOSc.]

• Intrusive quality measurement (e.g. PESQ) is used as the basis for predicting voice quality non-intrusively, which avoids subjective tests.

(7)

New Structure to Obtain MOSc

[Figure: PESQ compares reference speech and degraded speech to give MOS (PESQ), which is converted MOS → R → Ie. A delay model maps end-to-end delay to Id. The E-model combines Ie and Id to give MOSc.]

• PESQ can only predict one-way listening speech quality (expressed as MOS).
• By a new combined PESQ/E-model structure, a conversational
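The MOS → R → Ie conversion in this structure can be sketched as follows. The R-to-MOS mapping is the standard ITU-T G.107 E-model formula; the inversion by bisection and the simplified relation R = 93.2 − Id − Ie (with Id = 0 for listening-only PESQ scores) are assumptions made for illustration and are not stated on the slide.

```python
def r_to_mos(r):
    """ITU-T G.107 mapping from the E-model rating factor R to MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def mos_to_r(mos, lo=6.5, hi=100.0, tol=1e-6):
    """Invert r_to_mos by bisection (the mapping is increasing on [6.5, 100])."""
    mos = min(max(mos, r_to_mos(lo)), r_to_mos(hi))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def mos_to_ie(mos, r0=93.2):
    """Assumed simplification: Ie = R0 - R, with no delay impairment (Id = 0)."""
    return r0 - mos_to_r(mos)


if __name__ == "__main__":
    for mos in (4.0, 3.5, 3.0):
        print(f"MOS = {mos:.1f}  R = {mos_to_r(mos):.1f}  Ie = {mos_to_ie(mos):.1f}")
```

Bisection is used here only because it avoids solving the cubic in closed form; any monotone root finder would serve equally well.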

(8)

Regression based Models (1)

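• Nonlinear regression models are derived for Ie based on PESQ/PESQ-LQ
• Ie is further combined with Id to obtain MOSc

[Figure (a): derivation of the Ie model. Reference speech from a speech database passes through an encoder, loss model and decoder to give degraded speech; PESQ/PESQ-LQ compares the two to give MOS (PESQ), converted MOS → R → Ie (measured Ie); a nonlinear regression model (Ie model) gives the predicted Ie.]

[Figure: prediction of MOSc. Codec and packet loss feed the Ie model (Ie); delay (d) feeds the Id model (Id); the E-model combines Ie and Id into MOSc.]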

(9)

Regression based Models (2)

• Ie can be modelled by a logarithm fitting function with the form of

$$I_e = a \ln(1 + b\rho) + c$$

• Parameters for different codecs (PESQ)

Parameters   AMR(H)   AMR(L)   G.729   G.723.1   iLBC
a            16.68    30.86    21.14   20.06     12.59
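A curve of the form Ie = a ln(1 + bρ) + c can be fitted without special libraries, because for a fixed b the model is linear in a and c. The sketch below scans b over a grid and solves a and c by ordinary least squares. The function name, the grid, and the synthetic data points (generated from hypothetical parameters a = 20, b = 0.2, c = 15) are illustrative assumptions; this is not the authors' fitting procedure or data.

```python
import math


def fit_log_model(loss_pct, ie_values, b_grid):
    """Fit Ie = a*ln(1 + b*rho) + c: scan b, solve a and c by linear least squares."""
    n = len(loss_pct)
    mean_y = sum(ie_values) / n
    best = None
    for b in b_grid:
        x = [math.log(1.0 + b * rho) for rho in loss_pct]
        mean_x = sum(x) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0.0:
            continue  # degenerate grid point, no slope can be estimated
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ie_values))
        a = sxy / sxx
        c = mean_y - a * mean_x
        sse = sum((a * xi + c - yi) ** 2 for xi, yi in zip(x, ie_values))
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best


if __name__ == "__main__":
    # Synthetic, noise-free points from hypothetical parameters (not measured data).
    loss = [0, 1, 2, 4, 6, 8, 10, 15, 20]
    ie = [20.0 * math.log(1.0 + 0.2 * rho) + 15.0 for rho in loss]
    grid = [i / 1000 for i in range(1, 1001)]
    a_hat, b_hat, c_hat, sse = fit_log_model(loss, ie, grid)
    print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, c = {c_hat:.3f}, SSE = {sse:.3g}")
```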

(10)

Regression Models for AMR (12.2 kb/s)

e.g. for AMR (12.2 kb/s),

$$I_e = 16.68 \ln(1 + 0.3011\rho) + 14.96$$

The goodness of fit is: SSE = 2.83 and R² = 0.998

[Figure: MOS vs. packet loss and delay]
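As a quick numerical illustration, the AMR (12.2 kb/s) regression Ie = 16.68 ln(1 + 0.3011ρ) + 14.96 can be evaluated directly. The sketch assumes ρ is the packet loss rate in percent; the slide does not state the unit, and the function name is hypothetical.

```python
import math


def ie_amr_12k2(loss_pct):
    """Ie for AMR (12.2 kb/s) from the fitted curve; loss_pct assumed to be in percent."""
    return 16.68 * math.log(1.0 + 0.3011 * loss_pct) + 14.96


if __name__ == "__main__":
    for loss in (0, 1, 5, 10):
        print(f"loss = {loss:2d}%  Ie = {ie_amr_12k2(loss):.2f}")
```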

(11)

Perceptual-based Buffer Optimization

• Motivation:
  - Existing buffer algorithms are based only on individual network parameters (e.g. delay or loss)
  - They target only minimum average delay or minimum late arrival loss, not maximum MOS.
  - There is a need to design a buffer algorithm that achieves optimum perceived speech quality.
• Contribution
  - A perceptual-based optimization jitter buffer algorithm
    o Uses regression-based models for buffer optimization
    o Uses a minimum impairment criterion instead of the traditional maximum MOS score
    o Uses a Weibull delay distribution based on trace analysis

(12)

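Impairment Function Im

• Define: impairment function Im

$$I_m = f(d, \rho) = I_d + I_e$$

$$I_d = 0.024\,d + 0.11\,(d - 177.3)\,H(d - 177.3)$$

$$I_e = a \ln(1 + b\rho) + c$$

where a and b (together with c) are codec related parameters, and

$$H(x) = \begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x \ge 0 \end{cases}$$

Here d is the playout delay, and the loss ρ accounts for both the network loss ρn and the buffer loss ρb. The buffer loss follows from the late-arrival probability P(X > d) of the network delay X, modelled by a Weibull distribution:

$$P(X > d) = e^{-((d - \mu)/\alpha)^{\gamma}}$$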
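A minimal sketch of the impairment function, Im = Id + Ie with Id = 0.024d + 0.11(d − 177.3)H(d − 177.3) and Ie = a ln(1 + bρ) + c, is given below. The default codec parameters are the AMR (12.2 kb/s) values from the earlier slide; treating d in milliseconds and ρ in percent, as well as all function names, are assumptions for illustration.

```python
import math


def heaviside(x):
    """Step function H(x): 0 for x < 0, 1 for x >= 0."""
    return 1.0 if x >= 0 else 0.0


def delay_impairment(d_ms):
    """Id as a function of delay d (assumed to be in milliseconds)."""
    return 0.024 * d_ms + 0.11 * (d_ms - 177.3) * heaviside(d_ms - 177.3)


def loss_impairment(loss_pct, a=16.68, b=0.3011, c=14.96):
    """Ie from the logarithmic regression; defaults are the AMR (12.2 kb/s) fit."""
    return a * math.log(1.0 + b * loss_pct) + c


def impairment(d_ms, loss_pct, a=16.68, b=0.3011, c=14.96):
    """Im = Id + Ie."""
    return delay_impairment(d_ms) + loss_impairment(loss_pct, a, b, c)


if __name__ == "__main__":
    for d in (50, 150, 250):
        print(f"d = {d} ms, loss = 2%  Im = {impairment(d, 2.0):.2f}")
```

The delay term grows slowly below 177.3 ms and much faster above it, which is why a larger buffer is not automatically better even when it removes all late-arrival loss.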

(13)

Minimum Impairment Criterion

• Define: minimum impairment criterion
  Given: network delay dn, network loss ρn and codec type
  Estimate: an optimized playout delay dopt
  Such that: the minimum Im is reached.

[Figure: candidate playout delays d1, d2, d3, d4 and the minimum Im]
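The criterion can be illustrated with a simple grid search over candidate playout delays. The sketch below is not the authors' implementation. It assumes a three-parameter Weibull tail for the late-arrival fraction, combines network and buffer loss as ρ = ρn + (1 − ρn)ρb (as fractions, an assumed rule that is not recoverable from the slides), and uses the AMR (12.2 kb/s) parameters with made-up Weibull values in the usage example.

```python
import math


def weibull_late_fraction(d, mu, alpha, gamma):
    """P(X > d) for a three-parameter Weibull delay model (location mu, scale alpha, shape gamma)."""
    if d <= mu:
        return 1.0
    return math.exp(-(((d - mu) / alpha) ** gamma))


def impairment(d, net_loss, mu, alpha, gamma, a=16.68, b=0.3011, c=14.96):
    """Im = Id + Ie at playout delay d (ms); net_loss is a fraction in [0, 1]."""
    buf_loss = weibull_late_fraction(d, mu, alpha, gamma)
    # Assumed combination rule: a packet is lost in the network or arrives too late.
    total_loss_pct = 100.0 * (net_loss + (1.0 - net_loss) * buf_loss)
    i_d = 0.024 * d + (0.11 * (d - 177.3) if d >= 177.3 else 0.0)
    i_e = a * math.log(1.0 + b * total_loss_pct) + c
    return i_d + i_e


def optimal_playout_delay(net_loss, mu, alpha, gamma, d_max=400, step=1):
    """Return the candidate delay (ms) with the smallest Im on a regular grid."""
    candidates = range(int(math.ceil(mu)), d_max + 1, step)
    return min(candidates, key=lambda d: impairment(d, net_loss, mu, alpha, gamma))


if __name__ == "__main__":
    # Hypothetical trace statistics, chosen only to exercise the search.
    d_opt = optimal_playout_delay(net_loss=0.01, mu=50.0, alpha=20.0, gamma=1.5)
    print("d_opt =", d_opt, "ms")
```

A grid search is used because Im is cheap to evaluate and the delay range is small; a golden-section or similar one-dimensional search would also work if Im is unimodal over the range.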

(14)

Perceptual-based Optimization Buffer Algorithm

For every packet i received, calculate network delay ni
If mode == SPIKE then
    if ni <= tail*old_d then mode = NORMAL
elseif ni > head*di then
    mode = SPIKE; old_d = di
else
    - update delay records for the past W packets
endif

At the beginning of a talkspurt
If mode == SPIKE then
    di = ni
else
    - obtain (µ, α, γ) for Weibull distribution for the past W packets
    - search playout d which meets minimum Im criterion
endif
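The mode switching in this algorithm can be sketched as a small state machine. The class below is a skeleton only: the Weibull fit and the minimum-Im search are delegated to a user-supplied `choose_delay` callable, and the class name, the `head`, `tail` and window values, and the first-packet bootstrap are hypothetical choices that do not come from the slides.

```python
from collections import deque


class PerceptualPlayoutBuffer:
    """Spike/normal mode tracking with a pluggable playout-delay selection step."""

    def __init__(self, choose_delay, window=200, head=4.0, tail=2.0):
        self.choose_delay = choose_delay      # maps recent delays -> playout delay
        self.history = deque(maxlen=window)   # delay records for the past W packets
        self.head = head                      # spike entry threshold (assumed value)
        self.tail = tail                      # spike exit threshold (assumed value)
        self.mode = "NORMAL"
        self.playout_delay = None             # d_i
        self.old_delay = None                 # old_d, saved when a spike starts
        self.last_delay = None                # n_i of the most recent packet

    def on_packet(self, n_i):
        """Update the mode and delay records for one received packet."""
        self.last_delay = n_i
        if self.playout_delay is None:
            self.playout_delay = n_i          # bootstrap on the first packet
        if self.mode == "SPIKE":
            if n_i <= self.tail * self.old_delay:
                self.mode = "NORMAL"
        elif n_i > self.head * self.playout_delay:
            self.mode = "SPIKE"
            self.old_delay = self.playout_delay
        else:
            self.history.append(n_i)

    def on_talkspurt_start(self):
        """Pick the playout delay to use for the next talkspurt."""
        if self.mode == "SPIKE":
            self.playout_delay = self.last_delay
        elif self.history:
            self.playout_delay = self.choose_delay(list(self.history))
        return self.playout_delay


if __name__ == "__main__":
    # max() stands in for the Weibull fit plus minimum-Im search.
    buf = PerceptualPlayoutBuffer(choose_delay=max)
    for delay in (50.0, 55.0, 52.0):
        buf.on_packet(delay)
    print(buf.mode, buf.on_talkspurt_start())
```

Passing the selection step as a callable keeps the spike logic independent of how the delay distribution is modelled, so a perceptual search can replace the placeholder without touching the state machine.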

(15)

Performance Analysis and Comparison (1)

Trace   Delay (ms)   Jitter (ms)   Loss (%)
1       153          16.2          1.1
2       46           0.8           0.3
3       186          19.5          14.3
4       16           0.7           4.4
5       150          0.2           0.2

• Selected five traces from UoP to CU (USA), DUT (Germany), BUPT (China), and NC (China).

(16)

Performance Analysis and Comparison (2)

• The “p-optimum” algorithm achieves the optimum voice quality for all traces.
• The “adaptive” algorithm achieves sub-optimum quality with low complexity.

[Figure: Performance comparison for buffer algorithms. MOS (0.5 to 4) vs. traces 1 to 5 for exp-avg, fast-exp, min-delay, spk-delay, adaptive and p-optimum.]

(17)

Conclusions and Future Work

• Conclusions
  - Developed a new methodology and regression models to predict voice quality non-intrusively.
  - Demonstrated the application of the new non-intrusive voice quality prediction models to perceptual-based optimization of playout buffer algorithms.
• Future Work
  - To consider buffer adaptation during a talkspurt in order to achieve the best trade-off between delay, loss and end-to-end jitter.
  - To extend the work to improve the performance of multimedia

(18)

Contact Details

• http://www.tech.plymouth.ac.uk/spmc
• Dr. Lingfen Sun, L.Sun@plymouth.ac.uk
• Prof Emmanuel Ifeachor, E.Ifeachor@plymouth.ac.uk
• Any questions?
