New Models for Perceived Voice Quality Prediction and their Applications in Playout Buffer Optimization for VoIP Networks

(1)


University of Plymouth

United Kingdom

{L.Sun; E.Ifeachor}@plymouth.ac.uk

Dr. Lingfen Sun

(2)

Outline

Background

Speech quality for VoIP networks

Current status

Aims of the project

Main Contributions

Novel non-intrusive voice quality prediction models

Novel perceptual-based speech quality optimization (e.g. jitter buffer optimization) mechanism

(3)

Background – Speech Quality for VoIP Networks

VoIP speech quality: end-user perceived quality (MOS) is an important metric.

Affected by IP network impairments and other impairments.

Voice quality measurement: subjective (MOS) or objective

[Figure: SCN → Gateway → IP Network → Gateway → SCN. Labels: end-to-end perceived speech quality (MOS); intrusive measurement (reference speech, degraded speech); non-intrusive measurement. SCN: Switched Comm. Networks (PSTN, ISDN, GSM …)]

(4)

Current Status and Problems

Lack of an efficient non-intrusive speech quality measurement method

E-model (a complicated computational model)

Based on subjective tests to derive models/parameters, which is time-consuming and expensive. Only limited models exist.

Lack of perceptual optimization control methods

Existing methods are only based on individual network parameters for buffer optimization and QoS control purposes

(5)

Aims of the Project

[Figure: Voice source, Encoder, Packetizer, Sender → IP Network → Receiver, De-packetizer, Jitter buffer, Decoder, Voice receiver. Labels: non-intrusive measurement; end-to-end perceived voice quality (MOS)]

To develop novel and efficient methods/models for non-intrusive quality prediction,

To apply the models for perceptual-based optimization control (

(6)

Novel Non-intrusive Voice Quality Prediction

[Figure: Intrusive method: reference speech and degraded speech from the VoIP network are compared by PESQ to give MOS(PESQ), which is combined with delay in the E-model to give the measured MOS_c. Non-intrusive method: network parameters (packet loss, delay, codec …) are fed to the new model (regression or ANN models) to give the predicted MOS_c.]

Intrusive quality measurement (e.g. PESQ) is used as the basis for predicting voice quality non-intrusively, which avoids subjective tests.

(7)

New Structure to Obtain MOS_c

[Figure: Reference speech and degraded speech → PESQ → MOS (PESQ) → (MOS → R → I_e) → I_e; end-to-end delay → Delay model → I_d; I_e and I_d → E-model → MOS_c]

PESQ can only predict one-way listening speech quality (expressed as MOS).

By a new combined PESQ/E-model structure, a conversational

(8)

Regression based Models (1)

Nonlinear regression models are derived for I_e based on PESQ/PESQ-LQ.

Further combine I_e with I_d to obtain MOS_c.

[Figure: (a) Speech database → Encoder → Loss model → Decoder gives the degraded speech, which is compared with the reference speech by PESQ/PESQ-LQ to give MOS (PESQ); MOS → R → I_e gives the measured I_e. Codec and packet loss → nonlinear regression model (I_e model) → predicted I_e. Delay (d) → I_d model → I_d. I_e and I_d → E-model → MOS_c]
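The MOS → R → I_e conversion step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard ITU-T G.107 mapping from R to MOS, inverts it by bisection (an illustrative choice), and assumes the simplified relation I_e = 93.2 − R when no delay impairment is present. All function names are hypothetical.

```python
def r_to_mos(r):
    """Map an E-model rating factor R to MOS (ITU-T G.107 formula)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def mos_to_r(mos, tol=1e-6):
    """Invert r_to_mos by bisection on [6.5, 100], where it is increasing."""
    lo, hi = 6.5, 100.0
    # Clamp the input to the range of MOS values reachable on [lo, hi].
    mos = min(max(mos, r_to_mos(lo)), r_to_mos(hi))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r_to_mos(mid) < mos:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


R_DEFAULT = 93.2  # assumed default rating with no impairments


def mos_to_ie(mos_pesq):
    """Measured I_e from a PESQ MOS, assuming no delay impairment."""
    return R_DEFAULT - mos_to_r(mos_pesq)


if __name__ == "__main__":
    for mos in (4.0, 3.5, 3.0):
        print(f"MOS={mos:.1f} -> R={mos_to_r(mos):.1f}, I_e={mos_to_ie(mos):.1f}")
```

Bisection is used here only because the G.107 mapping has no convenient closed-form inverse; a fitted polynomial would serve equally well.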

(9)

Regression based Models (2)

I_e can be modelled by a logarithmic fitting function of the form

$I_e = a \ln(1 + b\rho) + c$

where ρ is the packet loss rate.

Parameters for different codecs (PESQ)

Parameters   AMR(H)   AMR(L)   G.729   G.723.1   iLBC
a            16.68    30.86    21.14   20.06     12.59

(10)

Regression Models for AMR (12.2 kb/s)

e.g. for AMR (12.2 kb/s),

$I_e = 16.68 \ln(1 + 0.3011\rho) + 14.96$

The goodness of fit is: SSE = 2.83 and R² = 0.998

[Figure: MOS vs. packet loss and delay]
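As a quick numerical check of the AMR (12.2 kb/s) regression model, the sketch below evaluates I_e for a few packet loss rates. It assumes ρ is expressed in percent; the function and constant names are illustrative only.

```python
import math

# Fitted parameters for AMR (12.2 kb/s), as given on the slide.
AMR_122 = {"a": 16.68, "b": 0.3011, "c": 14.96}


def ie_log_model(loss_pct, a, b, c):
    """Equipment impairment I_e = a*ln(1 + b*rho) + c, rho assumed in percent."""
    if loss_pct < 0:
        raise ValueError("packet loss must be non-negative")
    return a * math.log(1.0 + b * loss_pct) + c


if __name__ == "__main__":
    for loss in (0, 1, 3, 5, 10):
        print(f"loss={loss:>2}%  I_e={ie_log_model(loss, **AMR_122):.2f}")
```

With no packet loss the model reduces to the constant c, which is the codec's own impairment.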

(11)

Perceptual-based Buffer Optimization

Motivation:

Existing buffer algorithms are only based on individual network parameters (e.g. delay or loss), targeting only minimum average delay or minimum late arrival loss, not maximum MOS.

There is a need to design a buffer algorithm that achieves optimum perceived speech quality.

Contribution

A perceptual-based optimization jitter buffer algorithm

o Use regression based models for buffer optimization

o Use a minimum impairment criterion instead of the traditional maximum MOS score

o A Weibull delay distribution based on trace analysis

(12)

Impairment Function I_m

Define: impairment function I_m

$I_m = f(d, \rho) = I_d + I_e = 0.024\,d + 0.11\,(d - 177.3)\,H(d - 177.3) + a \ln(1 + b\rho) + c$

where $H(x) = 0$ if $x < 0$, $H(x) = 1$ if $x \ge 0$, and a, b and c are codec parameters.

The loss ρ combines the network loss ρ_n and the buffer loss ρ_b at playout delay d, with the buffer loss given by the tail of the Weibull delay distribution:

$\rho = 100\,[\rho_n + (1 - \rho_n)\,\rho_b]$

$\rho_b = P(X > d) = e^{-((d - \mu)/\alpha)^{\gamma}}$
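The impairment function can be evaluated numerically along the following lines. This is a sketch under stated assumptions rather than the authors' code: it assumes I_d = 0.024d + 0.11(d − 177.3)H(d − 177.3) with d in ms, a three-parameter Weibull delay distribution (location µ, scale α, shape γ) whose tail gives the buffer loss, network and buffer loss supplied as fractions, and ρ in percent inside I_e. All names are hypothetical.

```python
import math


def delay_impairment(d_ms):
    """I_d = 0.024*d + 0.11*(d - 177.3)*H(d - 177.3), with d in ms."""
    step = 1.0 if d_ms - 177.3 >= 0 else 0.0
    return 0.024 * d_ms + 0.11 * (d_ms - 177.3) * step


def weibull_tail(d_ms, mu, alpha, gamma):
    """P(X > d) for a Weibull delay with location mu, scale alpha, shape gamma."""
    if d_ms <= mu:
        return 1.0
    return math.exp(-(((d_ms - mu) / alpha) ** gamma))


def total_loss_pct(rho_n, rho_b):
    """Combine network and buffer loss (both fractions) into a percentage."""
    return 100.0 * (rho_n + (1.0 - rho_n) * rho_b)


def impairment(d_ms, rho_n, weibull, codec):
    """I_m = I_d + I_e for playout delay d_ms."""
    mu, alpha, gamma = weibull
    a, b, c = codec
    rho = total_loss_pct(rho_n, weibull_tail(d_ms, mu, alpha, gamma))
    return delay_impairment(d_ms) + a * math.log(1.0 + b * rho) + c


if __name__ == "__main__":
    amr_122 = (16.68, 0.3011, 14.96)  # (a, b, c) for AMR 12.2 kb/s
    print(impairment(120.0, 0.01, (40.0, 20.0, 1.0), amr_122))
```

The Weibull parameters in the usage line are made-up values for illustration, not fitted to any trace.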

(13)

Minimum Impairment Criterion

Define: minimum impairment criterion

Given: network delay d_n, network loss ρ_n and codec type

Estimate: an optimized playout delay d_opt

Such that: the minimum I_m is reached.

[Figure: candidate playout delays d₁, d₂, d₃, d₄ and the minimum I_m]
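One simple way to realise the minimum impairment criterion is a grid search over candidate playout delays. The sketch below is an assumption about how such a search could look, not the method used by the authors: the 1 ms grid, the search range, the Weibull parameters and all names are hypothetical, and the impairment model repeats the assumed I_d, Weibull loss and logarithmic I_e forms.

```python
import math


def search_playout(impairment, d_min, d_max, step=1.0):
    """Return (d_opt, I_m) minimizing impairment(d) on a uniform grid."""
    best_d, best_i = d_min, impairment(d_min)
    d = d_min + step
    while d <= d_max:
        i_m = impairment(d)
        if i_m < best_i:
            best_d, best_i = d, i_m
        d += step
    return best_d, best_i


def make_impairment(rho_n, mu, alpha, gamma, a, b, c):
    """Build I_m(d) from the assumed delay, Weibull loss and codec models."""
    def i_m(d):
        tail = 1.0 if d <= mu else math.exp(-(((d - mu) / alpha) ** gamma))
        rho = 100.0 * (rho_n + (1.0 - rho_n) * tail)  # total loss in percent
        i_d = 0.024 * d + (0.11 * (d - 177.3) if d >= 177.3 else 0.0)
        return i_d + a * math.log(1.0 + b * rho) + c
    return i_m


if __name__ == "__main__":
    # Illustrative values: 1% network loss, Weibull (mu=40, alpha=20, gamma=1),
    # AMR 12.2 kb/s codec parameters.
    i_m = make_impairment(0.01, 40.0, 20.0, 1.0, 16.68, 0.3011, 14.96)
    d_opt, i_min = search_playout(i_m, 40.0, 400.0)
    print(f"d_opt={d_opt:.0f} ms, I_m={i_min:.2f}")
```

A small playout delay gives high late-arrival loss and a large one gives high delay impairment, so I_m has an interior minimum that the search locates.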

(14)

Perceptual-based Optimization Buffer Algorithm

For every packet i received, calculate network delay n_i

If mode == SPIKE then
    if n_i ≤ tail*old_d then mode = NORMAL
elseif n_i > head*d_i then
    mode = SPIKE; old_d = d_i
else
    -update delay records for the past W packets
endif

At the beginning of a talkspurt

If mode == SPIKE then
    d_i = n_i
else
    -obtain (µ, α, γ) for Weibull distribution for the past W packets
    -search playout d which meets minimum I_m criterion
endif
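The control flow of the algorithm can be sketched in Python as below. This is a minimal illustration under assumptions: the values head = 4, tail = 2 and W = 50 are placeholders not given on the slide, and the Weibull fitting and minimum I_m search are delegated to a caller-supplied function (here the hypothetical `search_playout`), since the slide does not specify how they are carried out.

```python
from collections import deque

NORMAL, SPIKE = "NORMAL", "SPIKE"


class PlayoutBuffer:
    """Spike-aware playout delay controller (illustrative sketch)."""

    def __init__(self, search_playout, window=50, head=4.0, tail=2.0):
        self.search_playout = search_playout  # maps delay history -> playout delay
        self.head = head                      # spike entry threshold (assumed)
        self.tail = tail                      # spike exit threshold (assumed)
        self.history = deque(maxlen=window)   # delays of the past W packets
        self.mode = NORMAL
        self.old_d = 0.0                      # playout delay before the spike
        self.d = 0.0                          # current playout delay

    def on_packet(self, n_i, talkspurt_start=False):
        """Process network delay n_i; return the playout delay in use."""
        if self.mode == SPIKE:
            if n_i <= self.tail * self.old_d:
                self.mode = NORMAL
        elif self.d > 0 and n_i > self.head * self.d:
            self.mode = SPIKE
            self.old_d = self.d
        else:
            self.history.append(n_i)

        # The playout delay is only adjusted at the start of a talkspurt.
        if talkspurt_start:
            if self.mode == SPIKE:
                self.d = n_i
            elif self.history:
                self.d = self.search_playout(list(self.history))
        return self.d


if __name__ == "__main__":
    # Stand-in for "fit Weibull, then search for minimum I_m".
    buf = PlayoutBuffer(search_playout=lambda h: max(h) + 10.0)
    for delay, start in [(50, True), (55, False), (300, False), (280, True)]:
        print(delay, buf.on_packet(delay, talkspurt_start=start), buf.mode)
```

The guard `self.d > 0` is an added assumption so that the first packet, received before any playout delay has been set, cannot trigger spike mode.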

(15)

Performance Analysis and Comparison (1)

Trace   Delay (ms)   Jitter (ms)   Loss (%)
1       153          16.2          1.1
2       46           0.8           0.3
3       186          19.5          14.3
4       16           0.7           4.4
5       150          0.2           0.2

Five traces were selected, from UoP to CU (USA), DUT (Germany), BUPT (China), and NC (China).

(16)

Performance Analysis and Comparison (2)

The “p-optimum” algorithm achieves the optimum voice quality for all traces.

The “adaptive” algorithm achieves sub-optimum quality with low complexity.

[Figure: Performance comparison for buffer algorithms. MOS (0.5 to 4) for traces 1 to 5, for the exp-avg, fast-exp, min-delay, spk-delay, adaptive and p-optimum algorithms]

(17)

Conclusions and Future Work

Conclusions

Developed a new methodology and regression models to predict voice quality non-intrusively.

Demonstrated the application of the new non-intrusive voice quality prediction models to perceptual-based optimization of playout buffer algorithms.
