QoS Mapping of VoIP Communication using Self-Organizing Neural Network


Masao MASUGI

NTT Network Service System Laboratories, NTT Corporation

3-9-11 Midori-cho, Musashino-shi, Tokyo 180-8585, Japan

E-mail: [email protected]

Abstract: This paper proposes a QoS mapping method for VoIP communications in real network environments. To take account of the combined effects of several QoS-related parameters, we used a self-organizing neural network, which can map high-dimensional data into simple geometric relationships on a low-dimensional display. To train the self-organizing neural network, we measured several QoS-related parameters, such as PSQM+, end-to-end delay, and packet loss rate, for three VoIP systems. Evaluation results confirmed that our method can effectively evaluate the total QoS level composed of several QoS-related factors.

Keywords: VoIP, QoS mapping, PSQM, Self-organization

I. Introduction

Voice over IP (VoIP) enables voice data integration over IP networks, reducing the network transmission cost for IP protocol users. Due to the shared nature of current network structures, however, it is difficult to guarantee the quality level of voice speech. The quality of service (QoS) of VoIP depends on several factors such as network delay, packet loss rate, and the kinds of codec, so that it is naturally categorized as a “best effort service”.

In order to evaluate voice speech quality degradation appropriately, there have been various studies and proposals for packet-based voice communications [1]-[4]. For example, PSQM (Perceptual Speech Quality Measurement) and PSQM+, which can generate a perceptual distortion level for each voice frame, have been adopted as ITU-T Recommendation P.861 for assessing voice speech quality [5],[6]. In addition, PESQ (Perceptual Evaluation of Speech Quality) has also been standardized as ITU-T Recommendation P.862 [7],[8]. However, especially for interactive communications, the QoS level of packet-based voice network systems depends on many factors such as end-to-end delay, so estimating the speech quality level alone is not sufficient for end-to-end evaluation.

To account for compound factors in estimating the QoS level of the end-to-end network system, the E-model has also been proposed [9]. The E-model-based method is effective for evaluating the QoS level of VoIP communications, because it can deal with several QoS-related factors including the speech quality level and end-to-end delay. However, this method outputs only the calculated QoS value, and it cannot efficiently display the position of each factor in the result or the correlation of a result with other QoS levels. In addition, when some of the components needed for its calculation cannot be obtained, it does not necessarily provide appropriate outputs, meaning that its applicability is limited in evaluating the QoS level in real network environments. Therefore, a QoS level mapping technique that can show multilateral aspects of the QoS evaluation is required.

Incidentally, a self-organization-based mapping model [10],[11] is an effective tool that can clarify the relative relationships in high-dimensional input data. Based on this method, nonlinear statistical relationships in high-dimensional data can be converted into a two-dimensional space, while preserving the metric and topological relationships of the input data. As a result, this mapping model can be used to evaluate and categorize the relative relationships of high-dimensional input data.

This paper describes a QoS mapping of VoIP communications for objective evaluations in real environments. To take into account the combined effects of several QoS-related factors, we used a self-organizing neural network, which can map high-dimensional data into simple geometric relationships on a low-dimensional display. Section two shows the basic flows of the QoS mapping and evaluations for VoIP communications. Section three presents case studies using several parameters such as PSQM+ and end-to-end delay for three VoIP systems. Section four summarizes the results of this paper and mentions further studies on this subject.

II. QoS evaluation of VoIP communication using self-organizing mapping

A. Flow of self-organization [10],[11]

The self-organizing algorithm can calculate multi-dimensional parameters so that they optimally denote the domain, in which the relationships of primary data are preserved topologically. In this paper, a two-dimensional map is employed to model the QoS level of VoIP communications.

The basic training process of the self-organization model is defined as

$$m_i(t+1) = m_i(t) + h_{c(x),i}\,\bigl(x(t) - m_i(t)\bigr), \qquad (1)$$

where $m_i$ is the weight vector, $x$ is the input vector, $h_{c(x),i}$ is the neighborhood function, $i$ is the node number of the output layer, and $t$ is the regression step index. The concept of this self-organizing map is shown in Fig.1, where the n-dimensional vector $x$ is projected onto the output layer. In this process, the input vector $x$ is compared with all $m_i$, and the subscript $c(x)$ is defined by the Euclidean condition

$$\| x - m_c \| = \min_i \{ \| x - m_i \| \}, \qquad (2)$$

where $m_c$ is the winner that best matches $x$. Here, the initial value of each $m_i$ is set to a random value, and the Gaussian-type neighborhood function is given by

$$h_{c(x),i} = \alpha(t) \exp\left\{ -\frac{\| r_i - r_c \|^2}{2\sigma^2(t)} \right\}, \qquad (3)$$

where $0 < \alpha(t) < 1$ is the learning rate parameter, $r_i \in R^2$ and $r_c \in R^2$ are the vector locations on the grid, and $\sigma(t)$ corresponds to the width of the neighborhood function. Also, assuming that $T$ is the total training number, $\alpha(t)$ and $\sigma(t)$ can be defined as

$$\alpha(t) = \alpha(0)\,(1 - t/T) \qquad (4)$$

$$\sigma(t) = \sigma(T) + \{\sigma(0) - \sigma(T)\}\,(1 - t/T). \qquad (5)$$

The procedure of this training process can be described as follows:

(a) Initialize mi to a random value,

(b) Input x(t), one at a time,

(c) Calculate eq.(2), and find mc,

(d) Calculate eq.(1) using eqs.(3) - (5),

(e) Repeat from (b).

To raise the training efficiency, the algorithm is performed in two phases. In the first phase, a relatively large initial neighborhood radius is used to tune the map approximately. Then, in the second phase, the initial neighborhood radius is set to a small value to fine-tune the map.
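As a minimal sketch of steps (a) to (e) and the two-phase schedule, the following Python code implements eqs.(1) to (5) directly. The function name `train_phase`, the cyclic presentation of inputs, the random placeholder data, and the step counts are assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def train_phase(data, weights, grid, steps, alpha0, sigma0, sigma_end):
    """Run one training phase of eqs. (1)-(5); weights are updated in place."""
    for t in range(steps):
        # Step (b): present one input vector (cyclic order is an assumption).
        x = data[t % len(data)]
        # Step (c), eq. (2): the winner minimizes the Euclidean distance to x.
        win = min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))
        frac = 1.0 - t / steps
        alpha = alpha0 * frac                              # eq. (4)
        sigma = sigma_end + (sigma0 - sigma_end) * frac    # eq. (5)
        for i, m in enumerate(weights):
            d2 = (grid[i][0] - grid[win][0]) ** 2 + (grid[i][1] - grid[win][1]) ** 2
            h = alpha * math.exp(-d2 / (2.0 * sigma ** 2))  # eq. (3)
            # Step (d), eq. (1): move node i toward x by the factor h.
            weights[i] = [mk + h * (xk - mk) for mk, xk in zip(m, x)]
    return weights


random.seed(0)
rows, cols, dim = 8, 8, 3
grid = [(row, col) for row in range(rows) for col in range(cols)]
# Placeholder inputs in [0, 1); these are not measured data.
data = [[random.random() for _ in range(dim)] for _ in range(40)]
# Step (a): random initial weight vectors.
weights = [[random.random() for _ in range(dim)] for _ in grid]

# Phase 1: large initial radius for rough ordering; phase 2: small radius for fine-tuning.
train_phase(data, weights, grid, steps=300, alpha0=0.5, sigma0=4.0, sigma_end=1.0)
train_phase(data, weights, grid, steps=300, alpha0=0.05, sigma0=2.0, sigma_end=1.0)
```

Because every update is a convex combination of the old weight and the input, the weights stay inside the range spanned by the initial weights and the data.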

B. Application to QoS mapping of VoIP communications

The concept of QoS evaluation of VoIP network communications is shown in Fig.2, where the input data for training the self-organizing map are given by the target VoIP communication conditions. In this training, two categories of variables are input into the self-organizing map: one is the QoS-related parameters and the other is the system identifiers that indicate the basic system performance level and properties. As for the QoS-related parameters, we focus on two aspects, the voice speech level and the delay impairment, and select PSQM+ and end-to-end delay as input variables to the self-organizing map. In addition, packet loss is a major source of speech impairment in VoIP communications, and the packet loss rate is a very important parameter that reflects the effect of loss patterns on the voice speech quality. Therefore, to supplement the QoS evaluations, the packet loss rate is also employed as an input variable to the self-organizing map. In this paper, the three QoS-related input parameters to the self-organizing map (PSQM+, end-to-end delay, and packet loss rate) were measured in a real environment and used to evaluate the total QoS level of each condition relative to the others.

On the other hand, as for the system identifiers that correspond to the basic system specifications, we use the PSQM+ for no background load conditions between the originating and terminating VoIP gateways, which can be regarded as the standard PSQM+ for each system. The QoS level of VoIP communications depends on network system factors such as the kind of codec, and inputting the standard PSQM+ helps to map target data when some of the input variables are lost. To obtain the standard PSQM+ as an input variable, we directly connect the originating and terminating gateways via a switch and measure it for no background data conditions.

[Fig. 1: concept of the self-organizing map; labels: input data x₁, x₂, …, x_n, weights m_ij, output layer]

Fig. 2 Concept of QoS evaluation procedure for VoIP communication (GW: gateway). [Figure labels: Tel, VoIP GW, IP network, Δt (delay); measurement of data; data for the training, Input 1 to Input N, each a vector (variable_1, variable_2, …, variable_n); training of the self-organizing map]

By inputting the measured QoS-related parameters and system identifiers to the self-organizing map, we can map the QoS level of VoIP communications on a two-dimensional space.
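The input vector described above can be sketched as a four-component record: PSQM+, end-to-end delay, packet loss rate, and the standard PSQM+ of the system. The following Python example is an illustration under stated assumptions: the field names, the use of `None` for an unmeasured variable, the min-max scaling of each variable, and all numerical values are hypothetical, since the paper does not describe how the variables were scaled.

```python
def build_inputs(records):
    """Return min-max scaled input vectors; None marks an unmeasured variable."""
    keys = ("psqm", "delay_ms", "loss_pct", "std_psqm")
    # Collect the measured values of each variable to find its range.
    measured = {k: [r[k] for r in records if r[k] is not None] for k in keys}
    lo = {k: min(v) for k, v in measured.items()}
    hi = {k: max(v) for k, v in measured.items()}

    def scale(key, value):
        if value is None:
            return None  # keep the gap; it is handled at mapping time
        span = hi[key] - lo[key]
        return 0.0 if span == 0 else (value - lo[key]) / span

    return [[scale(k, r[k]) for k in keys] for r in records]


# Hypothetical values for illustration only, not the paper's measurements.
records = [
    {"psqm": 1.0, "delay_ms": 60.0, "loss_pct": 0.0, "std_psqm": 1.0},
    {"psqm": 3.0, "delay_ms": 80.0, "loss_pct": 6.0, "std_psqm": 1.0},
    {"psqm": 2.0, "delay_ms": 300.0, "loss_pct": None, "std_psqm": 2.0},
]
inputs = build_inputs(records)
```

Scaling matters here because eq.(2) uses a Euclidean distance, so a variable expressed in milliseconds would otherwise dominate one expressed on the PSQM+ scale.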

III. Case study

A. Setup for training data

The test-bed for measuring QoS-related parameters is shown in Fig. 3, where both a LAN (100Base-T) and the Internet are used as transmission networks. In this measurement setup, three types of VoIP gateways (GWs) were used, and we define the connection patterns as follows: GW1-to-GW1 as System 1, GW2-to-GW2 as System 2, and GW3-to-GW3 (PC-to-PC using VoIP communication software) as System 3. As for the transmission network, the LAN was used for Systems 1 and 2, both the LAN and the Internet were used for System 3, and the bandwidth of the access lines between the Internet and the VoIP gateways was 128 kbps.

When measuring the QoS-related parameters of PSQM+ and the end-to-end delay for these systems, voice sample data [12], based on ITU-T Recommendation P.800 [13], was transmitted. After the voice sample data was sent from the originating gateway to the terminating one, the source and transmitted voice signals were compared, and then the values of PSQM+ were calculated by a voice measuring device. In this calculation, the time-averaged results of all PSQM+ values for each frame of 16 ms were calculated across the voice sample files (five files in Japanese, each 4 s long). The end-to-end delays were also measured by comparing the leading parts of the transmitted and received voice signals. The measurement conditions for the three VoIP systems and the voice sample data used are shown in Table 1.
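As a small worked example of the averaging described above, a 4 s file divided into 16 ms frames gives 250 frames per file. The Python sketch below averages per-frame scores within each file and then across files; with files of equal length this equals the mean over all frames pooled together. The function name and the scores are hypothetical.

```python
FRAME_MS = 16    # frame length stated in the text
FILE_MS = 4000   # each voice file is 4 s long

frames_per_file = FILE_MS // FRAME_MS  # 250 frames per file


def time_averaged_psqm(per_file_frame_scores):
    """Average the per-frame scores within each file, then across files."""
    file_means = [sum(scores) / len(scores) for scores in per_file_frame_scores]
    return sum(file_means) / len(file_means)


# Hypothetical per-frame scores for two short files, for illustration only.
example = time_averaged_psqm([[1.0, 2.0], [3.0, 4.0]])
```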

Incidentally, for the measurements in the LAN, two conditions were employed: for one, there was no other background data in the network and for the other, there was. In the case when there was no other background load in the LAN, the packet loss based on the Poisson distribution was generated at the transmission network part by a packet loss generator. Furthermore, we also confirmed that the processing time caused by the switch was less than 1 ms for these VoIP systems.
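The paper does not describe the internal behavior of the packet loss generator. One possible reading of "packet loss based on the Poisson distribution", sketched below in Python, draws the number of lost packets in each window from a Poisson distribution whose mean is the target loss rate times the window size, and then drops that many packets at random positions. The window size, function names, and seed are assumptions for illustration.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def loss_mask(n_packets, loss_rate, window, rng):
    """Return a list of booleans, True where a packet is dropped."""
    mask = []
    for start in range(0, n_packets, window):
        size = min(window, n_packets - start)
        # Number of losses in this window, capped at the window size.
        k = min(size, poisson_sample(loss_rate * size, rng))
        dropped = set(rng.sample(range(size), k))
        mask.extend(i in dropped for i in range(size))
    return mask


rng = random.Random(1)
mask = loss_mask(n_packets=20000, loss_rate=0.03, window=100, rng=rng)
observed = sum(mask) / len(mask)  # should be close to the 3% target
```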

B. Training with measured data

Examples of measured PSQM+ and end-to-end delay for no background load in the LAN are shown in Fig.4, where the measured values of the five voice files were averaged for each system. Here, the packet loss rates were set in the range of 0 to 9%, and both PSQM+ and end-to-end delay were measured for the three systems. Fig.4(a) shows that PSQM+ tended to increase with the packet loss rate for both G.711 and G.729, and the order of voice quality level was Systems 1, 2, and 3. Also, Fig.4(b) shows that the end-to-end delays tended to increase in the order of Systems 1, 2, and 3, and the processing speed using G.711 was almost the same as that for G.729.

Figure 5 shows examples of measured PSQM+ and end-to-end delay when there was background data in the transmission network (both the LAN and the Internet). The packet loss rates were not measured in these cases, meaning that the training was performed without this variable. From Fig.5, we confirmed that the QoS levels for both PSQM+ and end-to-end delay were degraded when there was other background data in the transmission network. Based on the measured data in Figs. 4 and 5, the training was performed to evaluate the QoS level of VoIP communications for the three systems. In this process, a rectangle was used as the map topology type, and the map size was set to 8 × 8. Here, we set the total training number to 3000.
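The paper does not spell out how an input vector with an unmeasured variable is compared with the map. A common approach for self-organizing maps, assumed here, is to evaluate eq.(2) over the available components only. The following Python sketch illustrates that assumption; the function name and the weight values are hypothetical.

```python
import math


def find_winner(x, weights):
    """Index of the node closest to x, using only the components that are present."""

    def masked_distance(m):
        # Components of x that are None (not measured) are skipped.
        return math.sqrt(sum((xk - mk) ** 2 for xk, mk in zip(x, m) if xk is not None))

    return min(range(len(weights)), key=lambda i: masked_distance(weights[i]))


# Hypothetical weight vectors: (PSQM+, end-to-end delay, packet loss rate), all scaled.
weights = [[0.1, 0.1, 0.0], [0.8, 0.9, 0.5], [0.4, 0.5, 0.9]]
x = [0.75, 0.95, None]  # packet loss rate was not measured for this condition
winner = find_winner(x, weights)
```

Under the same assumption, the update of eq.(1) would be applied only to the components that are present in x.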

Table 1 Measurement setup

Item: System setup
Condition:
(a) Voice packet sending period
- System 1: 10 ms
- System 2: 20 ms
- System 3: 20 ms
(b) Type of codec: G.729, G.711

Item: Voice file
Condition:
- Language: Japanese
- Time length: 4 s

Fig. 3 Test-bed for measuring QoS related parameters (GW: gateway, SW: switch, R: router). [Figure labels: originating side, terminating side, GW 1 to GW 3, SW, LAN, R, packet loss generator, voice measuring device, source signal, transmitted signal]

Also, to raise the training efficiency, we performed the training in two phases [11]: in the first phase, with a training number of 1000, the initial value of the learning rate parameter α(0) was set to 0.5, and in the second phase it was set to 0.05. In addition, for the neighborhood radius, the initial value σ(0) and the final value σ(T) were set to 4 and 1 in the first phase, and to 2 and 1 in the second phase, respectively.
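To make these settings concrete, the following Python sketch evaluates eqs.(4) and (5) at the start, middle, and end of each phase. The second-phase training number of 2000 is an assumption obtained by subtracting the first-phase 1000 from the total of 3000; the paper does not state it explicitly.

```python
def alpha(t, T, alpha0):
    """Learning rate of eq. (4)."""
    return alpha0 * (1.0 - t / T)


def sigma(t, T, sigma0, sigma_end):
    """Neighborhood width of eq. (5)."""
    return sigma_end + (sigma0 - sigma_end) * (1.0 - t / T)


phases = [
    {"T": 1000, "alpha0": 0.5, "sigma0": 4.0, "sigma_end": 1.0},
    # T = 2000 is an assumption (total of 3000 minus the first-phase 1000).
    {"T": 2000, "alpha0": 0.05, "sigma0": 2.0, "sigma_end": 1.0},
]

for number, p in enumerate(phases, start=1):
    for t in (0, p["T"] // 2, p["T"]):
        a = alpha(t, p["T"], p["alpha0"])
        s = sigma(t, p["T"], p["sigma0"], p["sigma_end"])
        print(f"phase {number}, t={t}: alpha={a:.3f}, sigma={s:.2f}")
```

In the first phase the neighborhood width falls linearly from 4 to 1, so early updates reach across half of the 8 × 8 map, while the second phase starts from a width of 2 and a learning rate ten times smaller.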

C. Evaluation

Figure 6 shows a visualization result projected onto a two-dimensional domain, in which the axes are set as x and y.

The results for Fig. 6 are as follows:

- The longer the distance from the left-bottom points became, the more PSQM+ tended to increase, meaning that the voice quality level tended to decrease with increases in x and y.

- The left part corresponds to the domain where the end-to-end delay level was relatively low, and the right upper part is equivalent to the domain where end-to-end delay was highest.

- The packet loss rate tended to increase along the y-axis, so we estimate that the voice quality level tended to degrade along the y-axis.

As shown in Fig. 6, the shorter the distance from the bottom-left corner, the higher the total QoS level of VoIP communications tended to be. For lecture-type communications, however, in which interactive conversation is not necessary, the priority of the end-to-end delay factor can sometimes be low, so the mapping position may be allowed to shift to the right side of the two-dimensional space. Our method can deal with multi-dimensional QoS-related parameters and project the results onto a two-dimensional space, so we can effectively evaluate the positioning of the QoS level for each condition composed of several variables.

Fig. 4 Examples of measured PSQM+ and end-to-end delay (LAN, no background data). (a) Relationship between packet loss rate (%) and measured PSQM+. (b) End-to-end delay (ms) of each system (Systems 1 to 3, G.711 and G.729).

Fig. 5 Examples of measured PSQM+ and end-to-end delay (ms) when there was background data (Systems 1 to 3, G.711 and G.729). [L]: via the LAN, [I]: via the Internet.

Fig. 6 Visualization result of the training data (axes: x, y). Markers distinguish Systems 1 to 3 with G.711 or G.729. ( ): numerical value represents packet loss rate. [L]: via the LAN which has background load. [I]: via the Internet.
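The reading of Fig. 6 given above, in which positions nearer the bottom-left corner correspond to a higher total QoS level, can be sketched as a simple ranking by grid distance. The Python example below is an illustration only: the function name, the condition labels, and the coordinates are hypothetical, and the choice of the bottom-left corner as the reference applies to this particular trained map, since the orientation of a self-organizing map is not fixed in advance.

```python
import math


def rank_by_corner_distance(positions, corner=(0, 0)):
    """Sort condition labels by grid distance from the reference corner, closest first."""
    return sorted(positions, key=lambda name: math.dist(positions[name], corner))


# Hypothetical winner-node coordinates (x, y) on an 8 x 8 map, for illustration only.
positions = {
    "condition A": (1, 0),
    "condition B": (6, 5),
    "condition C": (2, 3),
}
ranking = rank_by_corner_distance(positions)
```

For lecture-type communications, a different scoring rule that tolerates positions further to the right could be substituted for the plain Euclidean distance.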

IV. Conclusion

This paper proposed a QoS mapping method of VoIP network communications for real network environments. To totally take account of the effects of several QoS-related factors, we used the self-organizing neural network, which can map high-dimensional data into simple geometric relationships on a low-dimensional display. By employing a self-organizing training scheme, we showed that our method can combine multi-dimensional QoS-related parameters and project the results onto a two-dimensional space, so that we can effectively evaluate the positioning of QoS level for each condition composed of several variables.

In this training, two categories of variables were input into the self-organizing map: the QoS-related parameters and the system identifiers that indicate the system performance level and properties. In terms of the QoS-related input parameters to the self-organizing map, we measured three parameters (PSQM+, end-to-end delay, and packet loss rate) in the test-bed, and used them to evaluate the total QoS level. On the other hand, as system identifiers that correspond to the basic system specifications, we used the PSQM+ for no background load between the originating and terminating VoIP gateways, which can be regarded as a standard PSQM+ for each system. Evaluation results confirmed that our method effectively evaluates the total QoS level composed of several QoS-related factors. We also confirmed that this method can deal with input data that lacks one of its variables, and can help to estimate its unknown QoS level.

Future study will include QoS evaluations of input data using other QoS-related variables, further evaluations of precision and applicability in real environments, and application to other multimedia communications.

Acknowledgements

I would like to thank Mr. Hajime Sugawara, Tsuyoshi Takenaga, and Hiroyuki Oouchi of NTT Network Service Laboratories for their contributions to this work.

References

[1] C. J. Weinstein, “Experience with speech communication in packet networks,” IEEE J. Selected Areas Commun., vol. SAC-1, no. 6, pp. 963-980, Dec. 1983.

[2] S. Voran, “Objective estimation of perceived speech quality - part 1: Development of the measuring normalizing block technique,” IEEE Trans. Speech and Audio Processing, vol. 7, no. 4, pp. 371-382, July 1999.

[3] S. B. Zahir Azami, A. Yongacoglu, L. Orozco-Barbosa, and T. Aboulnasr, “Evaluating the effects of buffer management on voice transmission over packet switching networks,” Proc. ICC 2001 (Helsinki, Finland), pp. 738-742, June 2001.

[4] M. Masuda and K. Ori, “Network performance metrics in estimating the speech quality of VoIP,” Proc. APSITT 2001 (Kathmandu, Nepal and Atami, Japan), pp. 333-337, Nov. 2001.

[5] ITU-T Recommendation P.861, “Objective quality measurement of telephone-band (300-3400 Hz) speech codecs,” Aug. 1996.

[6] ITU-T Contribution COM 12-20-E, “Improvement of the P.861 perceptual speech quality measure,” KPN Research, Netherlands, Dec. 1997.

[7] ITU-T Recommendation P.862, “Perceptual evaluation of speech quality (PESQ), an objective method for end-to-end speech quality assessment of narrow-band telephone networks and speech codecs,” Feb. 2001.

[8] A. Takahashi, “Performance evaluation of objective speech quality measure, ITU-T Recommendation P.862 PESQ,” The 16th Int. Workshop on Communication Quality & Reliability (CQR2002, Okinawa), pp. 39-43, May 2002.

[9] ITU-T Recommendation G.107, “The E-model, a computational model for use in transmission planning,” May 2000.

[10] T. Kohonen, “Self-organization and association memory,” Proc. IEEE, vol. 78, no. 9, pp. 1464-1480, 1990.

[11] T. Kohonen, “Self-organizing maps,” Springer, Berlin, Heidelberg (Second extended edition), chapter 3, 1997.

[12] NTT-AT, “Multi-lingual speech database for telephonometry 1994,” NTT-AT CD-ROM, 1994.

[13] ITU-T Recommendation P.800, “Methods for subjective determination of transmission quality,” Aug. 1996.