4.6 Alternate method for content classification

4.6.5 Comparison with ST grid

In this section the results obtained are compared with the ST grid shown in Fig. 4.2. Figure 4.12 shows the principal co-ordinates analysis, also known as multidimensional scaling, of the twelve content types. The MATLAB™ function cmdscale is used to perform the principal co-ordinates analysis. cmdscale takes a matrix of inter-point distances as input and creates a configuration of points [83]. Ideally, those points lie in two or three dimensions, and the Euclidean distances between them reproduce the original distance matrix. A scatter plot of the points created by cmdscale therefore provides a visual representation of the original distances in a small number of dimensions.
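The following is a minimal sketch of classical multidimensional scaling, the technique that cmdscale implements, written in Python with NumPy rather than MATLAB. It is not the code used in this thesis: the function name classical_mds and the four-point rectangle used as input are hypothetical and serve only to illustrate how a distance matrix is turned into low-dimensional co-ordinates.

```python
import numpy as np


def classical_mds(distances, n_dims=2):
    """Classical (Torgerson) multidimensional scaling of a distance matrix."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j
    # B is symmetric, so eigh applies; it returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip tiny negative eigenvalues caused by rounding before the square root
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Hypothetical input: four points at the corners of a 3 x 4 rectangle
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(dist)

# Distances between the recovered points should reproduce the input matrix
recovered = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
max_error = float(np.abs(recovered - dist).max())
```

Because the input distances here come from points that really are two-dimensional, the recovered configuration reproduces them up to rounding error. The recovered points may still be rotated or reflected relative to the originals, since a distance matrix carries no information about orientation.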

Figure 4.12. Principal co-ordinates analysis

[Figure 4.12: plot of the twelve content types (Akiyo, Suzie, Grandma, Stefan, Football, Rugby, Table Tennis, Coastguard, Tempete, Bridge-close, Carphone, Foreman). Axis labels recovered from the figure: Similarity index, Linkage distance.]

In Fig. 4.12 the distance between video sequences indicates how similar their content characteristics are: the closer two sequences are, the more similar their attributes. Comparing Fig. 4.12 with Fig. 4.2, it can be seen that classifying contents from the MOS scores (obtained through objective video quality evaluation in our case) did group contents with similar attributes together; e.g. Football, Stefan, Table Tennis and Rugby are all high spatial and high temporal. According to the ST grid, however, these contents should be in the bottom right-hand side rather than the top right-hand side. Rotating the grid by 270° moves the high temporal and high spatial contents to the top right-hand side, as in Fig. 4.12, so that Fig. 4.12 becomes a better fit to the ST grid. The cophenetic coefficient was also much lower (~73%), indicating that the traditional method was a better fit. In addition, the accuracy of classifying the contents this way is difficult to quantify, as it is unclear whether repeating the experiments would yield the same results. Nevertheless, a clear pattern is formed as the PER increases, and hence contents with similar ST features behave similarly under packet loss. More work is required to explore the accuracy of this method; this has been left as future work leading from this PhD thesis.
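For reference, the cophenetic coefficient quoted above is commonly defined as the linear correlation between the original pairwise distances and the distances implied by the cluster tree. The expression below is the standard textbook form rather than a formula taken from this thesis, and the symbols are assumptions introduced here: d_ij is the original distance between contents i and j, t_ij is the cophenetic distance (the height in the dendrogram at which i and j are first joined), and the barred quantities are the means of each over all pairs.

```latex
c = \frac{\sum_{i<j} \left(d_{ij} - \bar{d}\right)\left(t_{ij} - \bar{t}\right)}
         {\sqrt{\sum_{i<j} \left(d_{ij} - \bar{d}\right)^{2} \, \sum_{i<j} \left(t_{ij} - \bar{t}\right)^{2}}}
```

Under this definition a value close to 1 means that the tree preserves the original distances well, and the ~73% reported above corresponds to c ≈ 0.73.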

In the rest of the thesis content classification is carried out at the receiver side using the traditional method of ST feature extraction.

4.7 Summary

In this chapter two methods of content classification have been described. The first method classified the contents based on feature extraction, whereas the second method used MOS values. It was concluded that the traditional method of ST feature extraction is more accurate. The second method gave interesting results, as it grouped videos with similar attributes together; however, these results may be difficult to reproduce under a different set of conditions. This is left as an area of future research leading from this thesis. Therefore, the rest of the thesis is based on the first method (the ST feature extraction method) for content type estimation, carried out at the receiver side. Content type is an input to the models presented in Chapters 5 and 6.

Chapter 5

Regression-based Models for Non-intrusive Video Quality Prediction over WLAN and UMTS

5.1 Introduction

Multimedia content and services are growing exponentially across wireless access networks, both WLAN and UMTS. Digital video is everywhere, from handheld devices to personal computers. However, due to the bandwidth constraints of such networks, Quality of Service (QoS) remains a concern. QoS is affected by parameters related to both the encoder (e.g. sender bitrate, frame rate, etc.) and the access network (e.g. block loss, jitter, etc.), as discussed in Chapter 3. The impact of these distortions is highly content dependent. For video applications to be successful over such access networks, QoS is likely to be the major determining factor. In order to meet users' QoS requirements, there is a need to predict, monitor and, if necessary, control video quality. Non-intrusive models [101] provide an effective and practical way to achieve this.

Chapter 2 discussed the different objective (both intrusive and non-intrusive) and subjective video quality measurement methods. Intrusive methods are accurate and efficient; however, they are impractical for real-time monitoring. Hence, non-intrusive methods are preferred to intrusive analysis, as they are more suitable for on-line quality prediction/control.

In this chapter new regression-based models are developed for predicting video quality non-intrusively over WLAN and UMTS wireless access networks. The prediction is made from a combination of parameters associated with the encoder and the access network for different types of content. The models predict quality in terms of the MOS, obtained objectively (from simulation) and, further, from subjective tests.

This chapter is organized as follows. Related work on video quality modelling is introduced in Section 5.2. Section 5.3 outlines the objective test set-up over WLAN and UMTS. Section 5.4 describes the subjective tests over UMTS. Section 5.5 presents the video quality prediction scheme. Section 5.6 presents the procedure for developing the models. The proposed models over WLAN are presented in Section 5.7. Section 5.8 presents the models over UMTS. Model comparison and validation with external databases are presented in Section 5.9. Section 5.10 summarizes the chapter.

5.2 Related Work

The exponential growth of multimedia applications accessed via UMTS networks on mobile devices makes video quality prediction at the user level very desirable. Several studies on video quality prediction can be found in the literature. Existing video quality prediction algorithms consider video content features or the effects of distortions caused by the encoder or network impairments. In addition, they are restricted to IP networks. However, with the growth of video services and applications over wireless access networks, it is important to model losses that occur in the access network. The work in [16] presents full reference video quality prediction models based on video content features for H.264 video. The reduced reference metrics presented in [102],[103] use raw video features to predict video quality, whereas the works presented in [15],[104, 105]-[107] also use raw video features, but the models are reference free. In [108] video quality for mobile applications is evaluated: the authors chose content, codecs, bitrates and bit error patterns and found their impact on quality. They then present a no reference metric in [109], based on spatio-temporal features, to estimate blur for images and video. The work in [110] compares three methods in which spatio-temporal information and the impact of packet loss from the content are used to monitor video quality. In [111] a metric is presented that measures the temporal quality degradation caused by regular and irregular frame loss. In [11] video quality prediction models for H.264 video are presented; the models are based on sender bitrate, frame rate and content type. In [112] a perceptual metric for H.264 encoded panorama style video sequences is presented. In [7] a theoretical framework is presented for MPEG4 video quality prediction; the framework considers sender bitrate in the application layer. In [8] a method is proposed to measure the picture quality of compressed videos based on an estimation of PSNR for H.264 encoded videos. In [9] a model is presented to measure the effect of temporal artefacts on perceived video quality in mobile video broadcasting services; the authors concluded that perceived quality for low spatio-temporal videos is adversely affected by frame rate decimation. In [10] a video quality metric based on quantization errors, frame rate and motion speed is proposed. The metrics presented in [113] measure streaming video quality over Windows Media Player. The work in [114] estimates the quality of H.264 encoded video sequences using a video decoder; the authors use two parameters together (quantization parameter and contrast measure) within the H.264 decoder to give an estimation of subjective video quality.
The prediction models presented in these works are based on application layer parameters only (encoder-based distortion and/or raw content features).
