
OBJECTIVE QUALITY MEASURES FOR COMPRESSED IMAGES

Date of issue: 08/00

Unclassified Report

Grégory HAMON


Contact: [email protected]


Koninklijke Philips Electronics N.V. 2000

All rights are reserved. Reproduction in whole or in part is prohibited without the written consent of the copyright owner.


Unclassified Report: UR 2000/818

Title: OBJECTIVE QUALITY MEASURES FOR COMPRESSED IMAGES.

Author(s): Grégory HAMON

Part of project:

Customer:

Keywords: Image compression, DCT, human visual system, image quality.

Abstract: After a study and implementation of several objective image quality measures, an image compression scheme has been modified to improve the visual quality of the compressed images. The DCT-based compression scheme produces a scalable bit stream, which may be truncated at any point. This enables trivial bit rate control and easy adaptability to transmission channels with varying bandwidth. Previously, the compression scheme optimised the signal-to-noise ratio (SNR) of the compressed images, but this measure does not correspond very well to the visual quality as it is perceived by human observers. By optimising the compression for a perceptually weighted SNR, a better image quality has been obtained.

Conclusions: In this paper, the image compression has been improved and we found an objective quality measure, based on the human visual system, which follows our subjective evaluation for this compression algorithm.


Summary:

After studying and implementing several quality measures for digital images, we modified an image compression algorithm to improve the visual quality of the compressed images. The algorithm is based on the DCT and generates a bit stream that can be truncated at any point: if the bit stream is truncated, the compressed image is reconstructed from the available bits and contains more or less visible errors. Consequently, the goal is to put the most important information at the beginning of the bit stream. This algorithm makes it easy to control the number of bits per pixel used to reconstruct the image and to adapt to transmission channels with different bandwidths. Initially, the compression scheme optimised the signal-to-noise ratio of the compressed images, but this quality measure does not correspond very well to the visual quality as perceived by the human observer. By optimising the compression for another measure, a perceptually weighted signal-to-noise ratio, a better image quality has been obtained.


Philips profile.

World-wide Company Profile.

Philips, one of the world’s biggest electronics companies, was founded in 1891 when Gerard Philips established a company in Eindhoven, the Netherlands, to manufacture incandescent lamps and other electrical products.

The company initially concentrated on making carbon-filament lamps and by the turn of the century was one of the largest producers in Europe. Developments in new lighting technologies fuelled a steady program of expansion, and, in 1914, it established a research laboratory to study physical and chemical phenomena, so as to further stimulate product innovation.

In 1920, Philips began to protect its innovations with patents, in areas including X-ray radiation and radio reception. This marked the beginning of the diversification of its product range. Having introduced a medical X-ray tube in 1918, Philips then became involved in the first experiments in television in 1925. It began producing radios in 1927 and had sold one million by 1932. One year later, it produced its 100 millionth radio valve, and also started production of medical X-ray equipment in the United States. Philips' first electric shaver was launched in 1939, at which time the Company employed 45,000 people worldwide and had sales of 152 million guilders.

Science and technology underwent tremendous development in the 1940s and 1950s, with Philips Research inventing the rotary heads which led to the development of the Philishave electric shaver, and laying down the basis for later ground-breaking work on transistors and integrated circuits. In the 1960s, this resulted in important discoveries such as CCDs (charge coupled devices) and LOCOS (local oxidation of silicon).

Philips also made major contributions to the development of the recording, transmission and reproduction of television pictures, its research work leading to the development of the Plumbicon TV camera tube and improved phosphors for better picture quality. It introduced the Compact Audio Cassette in 1963 and produced its first integrated circuits in 1965.

The flow of exciting new products and ideas continued throughout the 1970s: research in lighting contributed to the new PL and SL energy-saving lamps, but it was in the processing, storage and transmission of images, sound and data that Philips Research made key breakthroughs, resulting in the inventions of the Laser Vision optical disc, the Compact Disc and optical telecommunication systems.

In this period, the field of optical recording was opened up by Philips Research, giving rise to such well-known products as Compact Disc Digital Audio, CD-ROM and -more recently- DVD (the 'Digital Video Disc' or 'Digital Versatile Disc'). Philips Research was, by this time, also heavily involved in medical systems such as magnetic-resonance imaging and ultrasound. In mobile telephony -where the smaller bandwidth and the required error correction call for more economic speech coders than normal telephony- an important Philips Research contribution, the full-rate GSM speech coder, found its way into all GSM base stations and handsets in the nineties. The same holds true for television system research, with emphasis on digital standards and digital processing.

Systems are made up of components and software. Research into components has brought a great deal of success. World-class semiconductor lasers from infrared to red, yellow and green are good examples of this. Also, research into polymer Light-Emitting Diodes (LEDs) and 'plastic electronics' shows great prospects for useful innovative components. New dedicated multi-million-transistor ICs are designed for digital video coding and decoding (according to the MPEG standards), for the reception of Digital Audio Broadcasting (DAB) and for speech recognition, to name a few applications. Programmable processors (like TriMedia), however, make it attractive to realize increasingly more functions in software. Finding the right balance between dedicated and programmable solutions ('co-design') is just one more example of the many activities in which Philips Research is involved today.

Philips Research.

Founded in Eindhoven, The Netherlands, in 1914, Philips Research -a part of Philips Electronics N.V.- has expanded the scale and scope of its activities to become one of the world’s major private research organisations. With laboratories in six different countries (The Netherlands, England, France, Germany, Taiwan and the United States) and staffed by around 3,000 people, the common vision is to create technologies that will lead to products for improving people’s lives. Scientists from a wide range of disciplines, from electrical engineering and physics to chemistry, mathematics, mechanics, information technology and software, work in close proximity, influencing and broadening each other's views.

In close co-operation with the Philips Product Divisions, the Philips Research organisation generates options for new and improved products and processes and produces important patents in many fields.

I carried out my final year project at the Philips Natuurkundig Laboratorium (Nat.Lab.) in Eindhoven. It is the largest research centre, employing over 2000 people working on ICs, electronic systems, multimedia, optics, chemistry and more. My project belongs to the Display Systems & Personal Care sector, within the Video Processing and Visual Perception group. The video processing team tries to provide new features for high quality television receivers by applying recent developments in the field of digital image processing, and works on programmable video architectures in order to employ suitable hardware architectures and software support for a wide range of video applications.


1 Introduction
2 Compression algorithm description.
2.1 The principle of the compression.
2.2 The compression scheme.
3 Different measures to evaluate the quality of a compressed image.
3.1 The Peak Signal-to-Noise Ratio (PSNR).
3.1.1 Definition.
3.1.2 Results.
3.2 The Signal Noise Ratio (SNR).
3.2.1 Definition.
3.2.2 Results.
3.3 The Block Impairment Metric (BIM).
3.3.1 Definition.
3.3.2 Results.
3.4 The visual distortion metric developed in the EBCOT algorithm.
3.4.1 Definitions and explanations.
3.4.2 Results.
4 Compression algorithm improvement.
4.1 Comparison between the Uniform, the Spiral, the Adaptive codec.
4.1.1 Subjective evaluation
4.1.2 Objective evaluation
4.2 Algorithm modification.
4.3 Comparison between the Adaptive, the Bit Plane and the Adaptive Bit Plane codec.
4.3.1 Subjective evaluation
4.3.2 Objective evaluation
4.4 Comparison between the Adaptive Bit Plane and the Energy Bit Plane Codec
4.4.1 Subjective evaluation
4.4.2 Objective evaluation
5 Suggestions for future work.
5.1 Subjective evaluation.
5.2 Objective measures
5.3 Algorithm improvement
6 Conclusion.
7 Appendix
8 References


1 Introduction

Distortion measures, which give a numerical measure of picture quality, play an important role in many fields of image processing and especially in image coding. These distortion measures can be used, for example, to provide an objective criterion in the design of image-compression systems. Objective measures are based on mathematical formulas that calculate the error between an original image and a compressed one. Their disadvantage is well known to anyone who has had to choose a compression scheme for an application: mathematical measures do not always indicate whether an image looks pleasing or whether it is acceptable for use, since the final user in image compression is the human observer. The need for objective measures that correlate well with subjective image quality has led to many proposals. In this report we determine a quality metric which is close to our subjective evaluation for a particular still-image compression algorithm: “Low-complexity scalable DCT image compression” [1] and [2]. After a brief description of this compression algorithm and a study of several objective image quality measures, the image compression scheme is modified to improve the visual quality of the compressed images by optimising it with a perceptual metric.


2 Compression algorithm description.

2.1 The principle of the compression.

Why do we need compression?

The examples in Table 1 below clearly illustrate the need for sufficient storage space, large transmission bandwidth, and long transmission time for image data. At the present state of technology, the only solution is to compress multimedia data (with an encoder) before its storage and transmission, and to decompress it at the receiver (with a decoder) for playback. For example, with a compression ratio of 10:1, the space, bandwidth, and transmission time requirements can be reduced by a factor of 10, with acceptable quality.

| Multimedia data | Size | Bits/pixel | Uncompressed size | Transmission bandwidth | Transmission time (using a 28.8K modem) |
|---|---|---|---|---|---|
| Greyscale image | 512 * 512 | 8 | 256 KBytes | 2.1 Mb/image | 1 min 13 sec |
| Color image | 512 * 512 | 24 | 768 KBytes | 6.29 Mb/image | 3 min 39 sec |
| Full-motion video | 640 * 480 during 1 min (30 frames/sec) | 24 | 1.66 GB | 221 Mb/sec | 5 days 8 hrs |

Table 1 Image and video data types and uncompressed storage space, transmission bandwidth, and transmission time required.
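The figures in Table 1 follow from simple arithmetic. The sketch below reproduces them under two assumptions that the report does not state explicitly: the modem carries exactly 28,800 bits per second with no protocol overhead, and 1 KByte is 1024 bytes. The function names are illustrative only.

```python
MODEM_BPS = 28_800  # assumed raw throughput of a 28.8K modem, in bits per second


def uncompressed_bits(width, height, bits_per_pixel, frames=1):
    """Number of bits needed to store the raw samples."""
    return width * height * bits_per_pixel * frames


def transmission_seconds(total_bits, link_bps=MODEM_BPS):
    """Time needed to send total_bits over a link of link_bps bits/second."""
    return total_bits / link_bps


grey = uncompressed_bits(512, 512, 8)
colour = uncompressed_bits(512, 512, 24)
video_minute = uncompressed_bits(640, 480, 24, frames=30 * 60)

for name, bits in [("greyscale", grey), ("colour", colour), ("video, 1 min", video_minute)]:
    kbytes = bits / 8 / 1024
    print(f"{name}: {kbytes:.0f} KBytes, {transmission_seconds(bits):.0f} s at 28.8K")
```

With a 10:1 compression ratio, each of these sizes and times is simply divided by 10.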

What are the principles behind compression?

A common characteristic of most images is that the neighbouring pixels are correlated and therefore contain redundant information. The foremost task then is to find a less correlated representation of the image. Two fundamental components of compression are redundancy and irrelevancy reduction. Redundancy reduction aims at removing predictability from the signal source (here, the image). Irrelevancy reduction omits parts of the signal that will not be noticed by the signal receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be identified:

- Spatial redundancy or correlation between neighbouring pixel values.

- Spectral redundancy or correlation between different color planes or spectral bands.

- Temporal redundancy or correlation between adjacent frames in a sequence of images.

Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible. Since we will focus only on still image compression, we will not worry about temporal redundancy.

What are the different classes of compression techniques?

Two ways of classifying compression techniques are mentioned here.

- Lossless vs. Lossy compression: In lossless compression schemes, the reconstructed image, after compression, is numerically identical to the original image. However, lossless compression can only achieve a modest amount of compression. An image reconstructed following lossy compression contains degradation relative to the original, often because the compression scheme completely discards irrelevant information. In return, lossy schemes are capable of achieving much higher compression, and at moderate compression ratios no loss is perceived under normal viewing conditions (visually lossless).

- Predictive vs. Transform coding: In predictive coding, information already sent or available is used to predict future values, and the difference is coded. Differential Pulse Code Modulation (DPCM) is one particular example of predictive coding. Transform coding, on the other hand, first transforms the image from its spatial domain representation to a different type of representation using some well-known transform and then codes the transformed values (coefficients). This method provides greater data compression compared to predictive methods, although at the expense of greater computation.

2.2 The compression scheme.

In this section, we briefly explain how the “Low-Complexity Scalable Image Compression” algorithm, described in detail in [1] and [2], works, in order to understand the quality measures (explained in the next section) that we have implemented in it. The main property of this algorithm is that it is scalable. Scalable compression refers to the generation of a bit-stream which represents an efficient compression of the original image. A key advantage of scalable compression is that the target bit-rate or reconstruction resolution need not be known at the time of compression. Another advantage of practical significance is that the image need not be compressed multiple times in order to achieve a target bit-rate, as is common with the existing JPEG compression standard. The compression algorithm is based on transform coding and uses the 8*8 block Discrete Cosine Transform (DCT), defined as:

$$F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} \cos\frac{(2i+1)\,u\,\pi}{16}\; \cos\frac{(2j+1)\,v\,\pi}{16}\; f(i,j)$$

where

$$C(\varepsilon) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } \varepsilon = 0 \\ 1 & \text{otherwise} \end{cases}$$

and $f(i,j)$ (respectively, $F(u,v)$) is the image in the spatial domain (respectively, spatial frequency domain) as you can see in the figure below.

Figure 1 From spatial domain to spatial frequency domain.

In theory, the DCT introduces no loss to the source image samples: it merely transforms them to a domain in which they can be more efficiently encoded. Furthermore, the DCT is an orthonormal transformation, which has two consequences. First, the total energy in the coefficient domain equals that of the original data (in the spatial domain): if we introduce an error during the encoding, in the frequency domain, the energy of this error is conserved in the spatial domain. Secondly, the DCT considerably decreases the correlation between coefficients, so that each coefficient can be treated independently. For a good analysis of orthogonal transforms for image coding, refer to Clarke's book [3].
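The energy-conservation property can be checked numerically. The following is a minimal sketch in plain Python that evaluates the 8*8 DCT defined in this section term by term (it is deliberately unoptimised and is not the codec's implementation); the helper names `dct2` and `energy` are illustrative.

```python
import math
import random

N = 8  # block size used by the codec


def c(eps):
    """Normalisation factor C(eps) of the DCT definition."""
    return 1 / math.sqrt(2) if eps == 0 else 1.0


def dct2(f):
    """Forward 8x8 DCT of block f (list of 8 rows of 8 samples)."""
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (math.cos((2 * i + 1) * u * math.pi / 16)
                          * math.cos((2 * j + 1) * v * math.pi / 16)
                          * f[i][j])
            F[u][v] = c(u) * c(v) / 4 * s
    return F


def energy(block):
    """Sum of squared values of a block."""
    return sum(x * x for row in block for x in row)


random.seed(0)
block = [[random.randint(0, 255) for _ in range(N)] for _ in range(N)]
coeffs = dct2(block)
print(energy(block), energy(coeffs))  # equal up to floating-point rounding

flat = dct2([[255] * N for _ in range(N)])
print(flat[0][0])  # DC of an all-white block, close to 8 * 255 = 2040
```

The all-white block gives the largest possible DC value, 2040, which is consistent with coding the coefficient magnitudes on 11 bits.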

In the present algorithm there is no additional quantization or entropy coding (such as Huffman or arithmetic coding). The DCT coefficients are organised in bit planes. There are 11 bit planes because the DCT coefficients are coded with 11 bits (the maximum coefficient value will be 2048, in accordance with the DCT formula). The first bit plane of a block, as shown in figure 2, is the plane containing the most significant bit of each coefficient of this block. Likewise, the last bit plane consists of the least significant bits of each of the 64 coefficients of this block. Bit-rate or quality scalability is enabled by encoding the DCT coefficients bit plane by bit plane, starting at the most significant plane of each block, the first one. The goal of scalable compression methods is to generate a bit string that can be truncated at any desired point, while always giving the best possible quality for the selected bit-rate. Therefore, since a truncatable bit string is generated, the main goal is to put the most significant information at the beginning of the bit string. Since the DC coefficient (the coefficient in the upper left corner of each DCT block) corresponds to the average luminance of its block, it contains the main information, and the DC coefficients are processed separately from the AC coefficients (all coefficients except the DC coefficient). The DC coefficients from all the blocks are collected and put into the bit string before any AC coefficient data. They are sent first, without any further encoding.

[Figure 2: an 8*8 DCT block drawn as a stack of bit planes, from the 1st bit plane to the 11th bit plane, with the DC coefficient and the 64th coefficient labelled.]

Figure 2 A DCT block with 11 bit planes.
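The bit-plane organisation, and the effect of truncating the bit string, can be illustrated as follows. This is a sketch only: it works on coefficient magnitudes and ignores how the codec of [1] and [2] transmits signs; the function names and the four sample values are hypothetical.

```python
NUM_PLANES = 11  # coefficients are coded with 11 bits


def bit_plane(coeffs, plane):
    """Bits of `plane` (1 = most significant, 11 = least significant)
    for a list of coefficients; only magnitudes are considered."""
    shift = NUM_PLANES - plane
    return [(abs(c) >> shift) & 1 for c in coeffs]


def rebuild(planes):
    """Rebuild magnitudes from the leading planes that were received.
    Planes that were truncated away are treated as zero bits."""
    mags = [0] * len(planes[0])
    for idx, bits in enumerate(planes):
        shift = NUM_PLANES - (idx + 1)
        mags = [m | (b << shift) for m, b in zip(mags, bits)]
    return mags


coeffs = [1300, -517, 64, 3]  # illustrative values, not taken from an image
planes = [bit_plane(coeffs, p) for p in range(1, NUM_PLANES + 1)]

print(rebuild(planes))      # all 11 planes: exact magnitudes
print(rebuild(planes[:3]))  # only the 3 most significant planes
```

Decoding only the first three planes keeps the large coefficients approximately and sets the small ones to zero, which is why the most significant planes are sent first.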

Since one bit plane of a block is put in the bit stream at a time (each block is scanned or processed 11 times), there are different ways to order the data in the bit string in order to reconstruct the image.

First, we can put each bit plane of the processed blocks in the usual scan order, from the upper left corner to the lower right corner of the image: the bit planes of the blocks in the upper left corner are put first in the bit string, and those of the blocks in the lower right corner last. We call this ordering “uniform”, so the associated encoder-decoder will be called the Uniform codec.

Secondly, instead of the uniform ordering, we can transmit first (after the DC coefficients, of course) the processed blocks in the middle of the image and then proceed in a “spiral way” towards the edge. The last processed blocks in the bit stream are then those on the edge of the image. This will be called the Spiral codec.
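The exact spiral path is not specified here, so the sketch below only approximates the idea: it orders the blocks ring by ring around the centre of the block grid, which gives the same centre-first, edge-last behaviour. The function name and the tie-breaking rule inside a ring are assumptions made for illustration.

```python
def centre_out_order(rows, cols):
    """Block positions (row, col) ordered ring by ring from the centre
    of a rows x cols grid of blocks towards its edge."""
    centre_r, centre_c = (rows - 1) / 2, (cols - 1) / 2
    blocks = [(r, c) for r in range(rows) for c in range(cols)]
    # Ring index = Chebyshev distance to the centre; ties broken by position.
    return sorted(
        blocks,
        key=lambda rc: (max(abs(rc[0] - centre_r), abs(rc[1] - centre_c)), rc),
    )


order = centre_out_order(4, 4)
print(order[:4])  # the four central blocks come first
print(order[-1])  # a corner block comes last
```

For a 512*512 image the grid would be 64 by 64 blocks; a 4 by 4 grid is used here only to keep the output readable.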

Thirdly, we can send the processed blocks according to their own energy. The human visual system is less sensitive to degradations in areas with much detail (areas which possess high energy) than in areas with little detail (low-contrast/low-texture areas, which have lower energy). So we transmit first the blocks with low energy.

As a contrast measure, we use the total number of significant coefficients of a block (cf. [2]). During the encoding of a certain bit plane, a significant coefficient is a coefficient whose magnitude had a one in any of the higher bit planes (which have already been encoded). An example is given in figure 3.

[Figure 3: the bits of one coefficient, with labels marking the bit that belongs to the 1st bit plane and the bit that belongs to the 3rd bit plane.]

Figure 3 Example of a coefficient which will become significant during the encoding of the 3rd bit plane (x = 0 or 1).

Consequently, in this case, the scan order is adapted to each individual image. In the implementation, both encoder and decoder re-adjust their scanning orders, according to the block contrast, at the start of each new coefficient bit plane. With this adaptive scanning order, the perceptual image quality of this scheme should be better, as can be seen in section 5-1. This codec will be called the Adaptive codec.
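The adaptive ordering can be sketched as follows: at the start of each bit plane, count the significant coefficients of every block and process the blocks with the fewest first. This is an illustration of the rule stated above, not the implementation of [2]; the function names are hypothetical and the three short "blocks" stand in for the 63 AC coefficients of real blocks.

```python
NUM_PLANES = 11


def significant_count(block, plane):
    """Number of coefficients whose magnitude has a one in any plane
    above `plane` (plane 1 is the most significant)."""
    shift = NUM_PLANES - plane + 1  # discards the current and all lower planes
    return sum(1 for coeff in block if (abs(coeff) >> shift) != 0)


def adaptive_order(blocks, plane):
    """Block indices sorted from lowest to highest contrast measure.
    Both encoder and decoder can compute this from already-coded planes."""
    return sorted(range(len(blocks)),
                  key=lambda b: significant_count(blocks[b], plane))


blocks = [[1300, -600, 5], [3, 2, 1], [700, 10, 0]]  # illustrative values
print(adaptive_order(blocks, plane=1))  # nothing is significant yet
print(adaptive_order(blocks, plane=3))  # low-contrast block 1 goes first
```

Because the measure uses only bit planes that have already been transmitted, no side information about the ordering needs to be sent.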

3 Different measures to evaluate the quality of a compressed image.

The following measures have been implemented in the codec software for the “Low-Complexity Scalable Image Compression”, in the version described in section 2 as the Spiral codec. Most of the measures are given as a function of the bit rate, i.e. the number of bits per pixel from which the compressed image is reconstructed. The higher the bit rate, the closer the reconstructed image is to the original one. As test data, we have used the Lena image (512*512 pixels) in color and in grey level, which is shown in the appendix (figure 36).

3.1 The Peak Signal-to-Noise Ratio (PSNR).

3.1.1 Definition.

* The PSNR is defined as

$$\mathrm{PSNR}\,[\mathrm{dB}] = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right)$$

with the Mean Squared Error defined as

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (X_i - Y_i)^2$$

where
- $N$ is the number of pixels of the image,
- $X_i$ is the $i$-th pixel of the original image and $Y_i$ that of the reconstructed one.

* We can also define the Coefficients PSNR like the PSNR, except that $X_i^j$ (respectively $Y_i^j$) now represents the coefficient of the $i$-th frequency band of the $j$-th block of the original (respectively reconstructed) transformed image in the frequency domain (the DCT domain):

$$\mathrm{Coefficients\ PSNR}\,[\mathrm{dB}] = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right), \qquad \mathrm{MSE} = \frac{1}{N} \sum_{j=1}^{N/64} \sum_{i=1}^{64} \left(X_i^j - Y_i^j\right)^2$$

It gives the same result as the PSNR because of the DCT property, the conservation of the energy. We have defined this measure in order to introduce the following measure:
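The PSNR definition translates directly into code. The sketch below treats an image as a flat list of 8-bit sample values; the function names and the four-sample example are illustrative only. The Coefficients PSNR is obtained by passing DCT coefficients instead of pixels to the same functions.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally long sample lists."""
    n = len(original)
    return sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / n


def psnr_db(original, reconstructed, peak=255):
    """PSNR in dB; infinite when the two inputs are identical."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


original = [52, 55, 61, 66]       # illustrative sample values
reconstructed = [50, 55, 60, 70]
print(mse(original, reconstructed))      # (4 + 0 + 1 + 16) / 4 = 5.25
print(psnr_db(original, reconstructed))  # roughly 40.9 dB
```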

* The Weighted PSNR is defined like the Coefficients PSNR but with a different MSE. We use a perceptually weighted MSE defined as:

$$\mathrm{Weighted\ MSE} = \frac{1}{N} \sum_{j=1}^{N/64} \sum_{i=1}^{64} \alpha_i \left(X_i^j - Y_i^j\right)^2$$

$\alpha_i$ is a weight that gives more importance to the lower spatial frequencies, because the human eye is less sensitive to higher frequencies (frequency sensitivity). We can define it with the help of the luminance quantization matrix and the chrominance quantization matrix.

These quantization matrices are specified by the JPEG standard and are based on human visual perception [4].

These matrices are defined as

- for the luminance:

$$\begin{pmatrix}
16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\
12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\
14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\
14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\
18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\
24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\
49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\
72 & 92 & 95 & 98 & 112 & 100 & 103 & 99
\end{pmatrix}$$

- for the chrominance:

$$\begin{pmatrix}
17 & 18 & 24 & 47 & 99 & 99 & 99 & 99 \\
18 & 21 & 26 & 66 & 99 & 99 & 99 & 99 \\
24 & 26 & 56 & 99 & 99 & 99 & 99 & 99 \\
47 & 66 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99
\end{pmatrix}$$

The quantization values can be set individually for each coefficient, using criteria based on the visibility of the basis functions. If we measure the threshold for visibility of a given basis function (the coefficient amplitude that is just detectable by the human eye), we can divide (quantize) the coefficients by that value (with appropriate rounding to integer values). If we multiply (dequantize) the scaled-down coefficient by that value before reconstructing, we create a condition in which the eye should not be able to detect any difference between quantized and unquantized DCT coefficients. If we are willing to tolerate some visible artifacts in the reconstructed image, we might divide by a value larger than the visibility threshold.

As stated before, there is no quantization during the compression, but we can use these values to give more or less importance to each coefficient by weighting it. We can, in fact, consider $\alpha_i$ as a “detection threshold”.

Without weighting, it is as if $\alpha_i$ and $Q_i$ (the $i$-th element of the quantization matrix, which of course contains 64 coefficients) were equal to 1 for $i \in \{1, \dots, 64\}$, so we choose the $\alpha_i$ such that

$$\sum_{i=1}^{64} \alpha_i = 64.$$

Consequently, we can define $\alpha_i$ in this way:

$$\alpha_i = \frac{c}{Q_i^2},$$

with $c$ a constant (which depends on the quantization matrix) and still $\sum_{i=1}^{64} \alpha_i = 64$. So $c$ is such that

$$c \sum_{i=1}^{64} \frac{1}{Q_i^2} = 64.$$

Consequently, c = 662.2946 for the luminance quantization matrix and c = 2537.8 for the chrominance quantization matrix. The 64 coefficients $\alpha_i$ are shown below.

- for the luminance:

$$\begin{pmatrix}
2.59 & 5.47 & 6.62 & 2.59 & 1.15 & 0.41 & 0.25 & 0.18 \\
4.60 & 4.60 & 3.38 & 1.83 & 0.98 & 0.20 & 0.18 & 0.22 \\
3.38 & 3.92 & 2.59 & 1.15 & 0.41 & 0.20 & 0.14 & 0.21 \\
3.38 & 2.29 & 1.37 & 0.79 & 0.25 & 0.09 & 0.10 & 0.17 \\
2.04 & 1.37 & 0.48 & 0.21 & 0.14 & 0.06 & 0.06 & 0.11 \\
1.15 & 0.54 & 0.22 & 0.16 & 0.10 & 0.06 & 0.05 & 0.08 \\
0.28 & 0.16 & 0.11 & 0.09 & 0.06 & 0.05 & 0.05 & 0.06 \\
0.13 & 0.08 & 0.07 & 0.07 & 0.05 & 0.07 & 0.06 & 0.07
\end{pmatrix}$$

- for the chrominance:

$$\begin{pmatrix}
8.78 & 7.83 & 4.41 & 1.15 & 0.26 & 0.26 & 0.26 & 0.26 \\
7.83 & 5.75 & 3.75 & 0.58 & 0.26 & 0.26 & 0.26 & 0.26 \\
4.41 & 3.75 & 0.81 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 \\
1.15 & 0.58 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 \\
0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 \\
0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 \\
0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 \\
0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26 & 0.26
\end{pmatrix}$$
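The constants c and the weights can be recomputed from the JPEG quantization tables. The sketch below assumes the weights are inversely proportional to the square of the table entries and normalised to sum to 64, as described in this section; the names `perceptual_weights` and `weighted_mse` are illustrative, and blocks are represented as flat lists of 64 coefficients in row-major order.

```python
LUMA_Q = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]
CHROMA_Q = [
    17, 18, 24, 47, 99, 99, 99, 99,
    18, 21, 26, 66, 99, 99, 99, 99,
    24, 26, 56, 99, 99, 99, 99, 99,
    47, 66, 99, 99, 99, 99, 99, 99,
] + [99] * 32


def perceptual_weights(q):
    """Return (c, alpha) with alpha_i = c / Q_i^2 and sum(alpha) = 64."""
    c = 64 / sum(1 / (v * v) for v in q)
    return c, [c / (v * v) for v in q]


def weighted_mse(x_blocks, y_blocks, alpha):
    """Perceptually weighted MSE over lists of 64-coefficient blocks."""
    n = 64 * len(x_blocks)
    total = sum(a * (x - y) ** 2
                for xb, yb in zip(x_blocks, y_blocks)
                for a, x, y in zip(alpha, xb, yb))
    return total / n


c_luma, alpha_luma = perceptual_weights(LUMA_Q)
c_chroma, alpha_chroma = perceptual_weights(CHROMA_Q)
print(round(c_luma, 2), round(c_chroma, 1))
```

With all weights equal to 1, `weighted_mse` reduces to the MSE of the Coefficients PSNR, which is why the weights are normalised to sum to 64.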

3.1.2 Results.

As we will see on the following plots, for the color image the highest bit rate is 13.7 bits per pixel instead of 24 (8 bits for each component: Y, the luminance, and U and V, the two chrominances). This is the lowest bit rate at which we get lossless compression (i.e. when all the bit planes, and therefore the whole generated bit string, are decoded). For the grey level image it is 4.9 instead of 8 bits (8 bits for the luminance only).

The Y, U and V components are derived from the R, G, B (Red, Green, Blue) components which are used to represent a color image on a CRT display (the usual computer screen). The relations between them are:

Y= 0.299R+0.587G+0.114B;

V=R-Y; U=B-Y;
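As a small worked example of these relations, the sketch below converts one RGB pixel to Y, U, V exactly as written above, without any additional scaling or offset of the colour-difference components (whether the codec applies such scaling is not stated here). The function name is illustrative.

```python
def rgb_to_yuv(r, g, b):
    """Return (Y, U, V) with Y = 0.299R + 0.587G + 0.114B, U = B - Y, V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y


print(rgb_to_yuv(255, 255, 255))  # white: full luminance, chrominance close to 0
print(rgb_to_yuv(255, 0, 0))      # pure red: V is large and positive, U negative
```

For any grey pixel (R = G = B) the two chrominance components vanish, which is why the grey level image needs the luminance component only.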

The luminance provides a greyscale version of the image, and the chrominance components provide the extra information that converts the greyscale image to a color image. Below a bit rate of 2, in accordance with figure 4, the quality of the Y component is lower than that of the other ones, so there is more error in the Y component. Consequently, since all the components are encoded in the same way, we can affirm that the luminance component contains more information than the chrominance ones.


Figures 5 and 6 lead to the same remark, because the Coefficient PSNR and the Weighted PSNR have approximately the same behaviour as the PSNR.

Figure 5 Coefficient PSNR of the Y,U and V components.


In figures 7, 8, 9 and 10, there is a small difference between the PSNR and the Coefficients PSNR because the DCT coefficients are rounded during the calculation in the encoder. They are also rounded during the Inverse DCT (IDCT) in the decoder.

In figures 7, 8, 9 and 10, we can also note that the Weighted PSNR is lower than the Coefficients PSNR and the PSNR below a bit rate of 2. Since the quantization matrix gives more importance to the low frequencies (as seen above, the coefficients $\alpha_i$ are bigger for the low frequencies than for the high frequencies), this confirms that the low frequencies contain the main information of the image.

We can also note the different evolution of the PSNR of the luminance component in the grey level image (figure 7) and in the color image (figure 8). This difference is due to the way the components are encoded and decoded in the software. The “floor effects” in figure 8 (for the color image) arise because, while one component (for example Y) is being encoded and decoded over a certain range of bit rates, the other ones (for example U and V) cannot be processed at the same time. Consequently, the PSNR of the components that are not being processed stays constant over that range. Of course we do not see this phenomenon for the grey level image, because there is only one component, the luminance Y.


Figure 8 PSNR of the luminance component, Y.


Figure 10 PSNR of the chrominance component, V.

3.2 The Signal Noise Ratio (SNR).

3.2.1 Definition.

The SNR is defined in almost the same way as the PSNR. The difference is that the PSNR uses the maximum power of any image ($255^2$), while the SNR uses the actual power of the image considered. Consequently,

$$\mathrm{SNR}\,[\mathrm{dB}] = 10 \log_{10}\!\left(\frac{\sigma_S^2}{\mathrm{MSE}}\right)$$

where the image power $\sigma_S^2$ is defined as

$$\sigma_S^2 = \frac{1}{N} \sum_{i=1}^{N} (X_i - m_X)^2,$$

where $X_i$ is the value of the $i$-th pixel of the original image and $m_X$ the average value of the pixels of the image:

$$m_X = \frac{1}{N} \sum_{i=1}^{N} X_i.$$

For the transformed image (DCT), the power, which is equal to the power of the original image because of the DCT property, is defined as:

$$\sigma_C^2 = \sum_{i=1}^{64} \sigma_i^2,$$

with $\sigma_i^2$ the power of the $i$-th frequency band of the coefficients image. Consequently,

$$\sigma_i^2 = \frac{1}{N/64} \sum_{j=1}^{N/64} \left(Y_i^j - m_{Y_i}\right)^2 \quad \text{with} \quad m_{Y_i} = \frac{1}{N/64} \sum_{j=1}^{N/64} Y_i^j,$$

and $Y_i^j$ represents the coefficient of the $i$-th frequency band of the $j$-th block of the DCT image.
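The pixel-domain SNR can be sketched in the same way as the PSNR, replacing the fixed peak power by the power of the original image. The function name and the four-sample example are illustrative only.

```python
import math


def snr_db(original, reconstructed):
    """SNR in dB: power of the original around its mean, divided by the MSE."""
    n = len(original)
    mean = sum(original) / n
    power = sum((x - mean) ** 2 for x in original) / n
    err = sum((x - y) ** 2 for x, y in zip(original, reconstructed)) / n
    if err == 0:
        return math.inf  # identical images
    return 10 * math.log10(power / err)


original = [52, 55, 61, 66]       # illustrative sample values
reconstructed = [50, 55, 60, 70]
print(snr_db(original, reconstructed))  # power 29.25, MSE 5.25, about 7.5 dB
```

For the same error, an image with little variation gets a much lower SNR than PSNR, since its own power is far below the peak power of 255 squared.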

3.2.2 Results.

Since the SNR takes into account the power of the image itself, these curves (figures 11, 12, 13 and 14) show more clearly that the luminance component carries more information than the chrominance ones, because the SNR of the luminance component is higher than the SNR of the chrominance components. They also show, like the PSNR, that the low frequencies contain the main information of the image.


Figure 12 SNR of the chrominance component, U.


Figure 14 SNR of the grey level Lena.

3.3 The Block Impairment Metric (BIM).

3.3.1 Definition.

As is well known, DCT coding schemes cause blocking artifacts in image coding, and PSNR or SNR are ineffective in quantifying this kind of artifact.

The BIM is a quantitative distortion measure for these blocking artifacts.

The algorithm is based on the interpixel difference across each of the horizontal (vertical edge artifacts) and vertical (horizontal edge artifacts) block boundaries. There are also parameters which take into account the luminance masking effects in extremely bright as well as extremely dark areas of the reconstructed image. This distortion measure does not require the original image as a comparative reference. BIM is detailed in [5].


3.3.2 Results.

Figure 15 BIM of the components Y, U and V for Lena.

In accordance with figure 15, the BIM is smaller for the luminance component than for the chrominance ones; consequently, the chrominance blockiness is more prominent than the luminance blockiness. The BIM for the original image is 0.41 dB for the luminance component and 0.17 and 0.14 dB for the chrominance ones.

We can now compare the PSNR with the BIM in order to see whether the BIM measure brings more information than the PSNR (figures 16 and 17).


Figure 16 BIM and PSNR for Lena grey level.


Between 0.3 and 1 bit per pixel, PSNR and (39dB-BIM) evolve in approximately the same way. Consequently, at first sight, the BIM does not bring more information on the quality of the compressed image in this case. According to [5], however, the BIM seems to be a better quality measure than the PSNR for POCS filtering, a postfiltering algorithm that reduces the blocking artifacts of reconstructed video images.

3.4 The visual distortion metric developed in the EBCOT algorithm.

3.4.1 Definitions and explanations.

As described in [6], EBCOT is a new image compression algorithm. The acronym means “Embedded Block Coding with Optimized Truncation”. It is another scalable image compression algorithm, but rather than generating a single scalable bit-stream to represent the entire image, EBCOT partitions each subband into relatively small blocks of samples and generates a separate, highly scalable bit-stream to represent each so-called code-block. The EBCOT algorithm was adopted for inclusion in the evolving JPEG2000 image compression standard. Our goal here is not to describe the algorithm, but to adapt the distortion metric developed in [6] to our compression algorithm.

Consequently, here is the distortion formula for the i-th frequency band:

$$D_i = \sum_k \frac{\left(X_i^k - Y_i^k\right)^2}{\sigma_i + V_i^k}$$

where

- $X_i^k$ (respectively, $Y_i^k$) represents the k-th coefficient of the i-th frequency band of the original image (respectively, the reconstructed image).

- $\sigma_i$ is defined in [6] as the “visibility floor” term, which establishes the visual significance of distortion in the absence of masking. In our case, we take $\sigma_i = 1/\alpha_i$, where $\alpha_i$ is the weight defined in the section 3.1.1.

- $V_i^k$ is called “the visual masking strength operator”, defined as follows.

- The first adaptation.

$$V_i^k = \frac{1}{2^{10}\,\left\|\Phi_i^k\right\|} \sum_{k' \in \Phi_i^k} \left|X_i^{k'}\right|$$

Here, $\Phi_i^k$ denotes a neighbourhood of the coefficient $X_i^k$ and $\left\|\Phi_i^k\right\|$ denotes the size of the neighbourhood.

We divide $V_i^k$ by 1024 in order to scale it relative to $\alpha_i$: the DCT coefficients are coded with 11 bits, so their maximum value is 2048. When we divide $V_i^k$ by 2048, its influence is too small, so we divide by 1024 instead.

We have adapted the measure in this way to take into account the masking inside a frequency band. Indeed, interactions appear between coefficients which belong to the same frequency band. Such interference modifies the detection threshold $\alpha_i$, and it is stronger if the coefficient is high and if its neighbourhood $V_i^k$ has about the same energy.

- The second adaptation.

In this case, we take as the neighbourhood of the current coefficient $X_i^k$ the whole 8*8 DCT block which contains it, except the DC coefficient. This neighbourhood therefore represents the energy of the current block, so it no longer depends on the coefficient i but only on the block k. We use it because, in the blocks with high energy, the distortion will be smaller, since the human eye is less sensitive in this kind of region. In contrast with the first adaptation, it is a masking due to the presence of the other coefficients inside a block, that is, an interaction between different frequency bands. Referring to [7], we can call this contrast masking.

The neighbourhood thus no longer depends on the current frequency band, and it is defined as:

$$V_k = \frac{1}{2^{10} \cdot 63} \sum_{i=2}^{64} \left(X_i^k\right)^2$$

The scaling factor has the same goal as in the first adaptation.
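The second adaptation can be sketched in a few lines of Python. This is only a sketch under assumptions: the blocks are given as lists of 64 DCT coefficients with the DC coefficient first, the function names are hypothetical, and the denominator of the distortion (visibility floor $1/\alpha_i$ plus block energy $V_k$) is one reading of the formula of this section rather than a confirmed form.

```python
def block_energy(block):
    """V_k of one 8x8 DCT block (64 coefficients, DC first): energy of the
    63 AC coefficients, scaled by 2**10 * 63."""
    return sum(x * x for x in block[1:]) / (2 ** 10 * 63)


def band_distortion(orig_blocks, recon_blocks, i, alpha_i):
    """Distortion of frequency band i (index 0 is the DC), summed over blocks.

    Assumed form: squared error divided by (1 / alpha_i + V_k).
    """
    sigma_i = 1.0 / alpha_i
    total = 0.0
    for x_blk, y_blk in zip(orig_blocks, recon_blocks):
        v_k = block_energy(x_blk)      # energy is taken from the original block
        total += (x_blk[i] - y_blk[i]) ** 2 / (sigma_i + v_k)
    return total


busy = [100] + [32] * 63    # high-energy block, V_k = 1.0
flat = [50] + [0] * 63      # low-energy block,  V_k = 0.0
d = band_distortion([busy, flat],
                    [[100] + [30] * 63, [50] + [2] * 63],
                    i=1, alpha_i=1.0)
print(d)
```

Both blocks carry the same coefficient error of 2 in this example, but the error in the high-energy block contributes only half as much to the distortion as the one in the flat block.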

3.4.2 Results.

- For the first adaptation.

The results below illustrate the remarks made above.

In accordance with the figure 18, the error in the low frequencies differs from the weighted MSE: there, the error is weighted by the neighbourhood of the coefficient in the original image. In the high frequencies, the error is the same as the weighted MSE. In this figure, the error is plotted against the DCT coefficients, read from left to right, line by line. Since the low frequencies are concentrated in the upper left corner of a block, this explains the “wave phenomenon” of the curve.

In order to compare this measure m with the weighted SNR, we have computed it in the same way as the SNR, that is, as $10 \log_{10}(p/m)$, where p is the power of the coefficients of the image. In accordance with the figure 19, the difference between this distortion metric and the weighted SNR is higher at low bit rates than at high bit rates. Indeed, the error in the low frequencies, which contain the main information, is reduced first.
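The conversion of a distortion value into an SNR-like figure in dB is a one-liner; the sketch below only restates the expression above, with a hypothetical function name.

```python
import math


def metric_db(power, distortion):
    """10 * log10(p / m): coefficient power p over distortion measure m, in dB."""
    return 10.0 * math.log10(power / distortion)


print(metric_db(100.0, 1.0))   # a power 100 times the distortion is 20 dB
```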

Figure 18 Lena grey level distortion in accordance with the frequency at 0.1bit per pixel.

- For the second adaptation.

We have computed this new measure in the same way as for the first adaptation.

Figure 20 Lena grey level weighted SNR and the second adapted EBCOT distortion metric.

We can remark that the difference between these two measures is bigger at low bit rates than at higher bit rates.

Indeed, at a very low bit rate only the DC coefficients are transmitted and decoded. These coefficients introduce no error themselves, so the MSE of a block is, in this case, equal to its energy, that is, to the neighbourhood. So the blocks with high energy have the highest error at the beginning and, later, all the blocks have about the same error.

4 Compression algorithm improvement.

In order to find the best measure, that is, the one which is closest to our subjective judgement, we are going to compare (objectively and subjectively) different images of Lena processed by the different codecs explained in section 1. After that, we will explain the changes we made to the algorithm and check them against our subjective evaluation and the objective measures.

4.1 Comparison between the Uniform, the Spiral and the Adaptive codec.

4.1.1 Subjective evaluation

As explained before, the Uniform codec processes the DCT blocks in the usual scan order (i.e. left to right and top to bottom of the image). However, at lower bit rates (for example at 0.6 bit per pixel) this gives annoying artifacts (see the figure 37, Lena_UN.pgm, in the appendix), because of a clearly visible quality difference between the blocks “above” and “below” the truncation point. We can also see this in the error image (the difference between the original and the reconstructed image) in the figure 39. The Spiral codec, which sends the DCT blocks starting in the middle of the image and “spiralling” out towards the edges, therefore gives a higher perceived quality than the Uniform codec (see the figure 38, Lena_SP.pgm, and the figure 40 in the appendix).

Since it is closer to the human visual system, the Adaptive codec seems in fact better at lower bit rates (see the figure 41, Lena_AD.pgm, in the appendix). In the error image (figure 42) we can see that the error is distributed over the areas with high energy (the edges and the hair).

4.1.2 Objective evaluation

We can see below that, at 0.6 bit per pixel, all the measures except the SNR are correlated with our judgement. The measures are less well correlated with it when we compare Lena_UN.pgm and Lena_SP.pgm (except for the SNR!): this is logical, because they do not take into account the difference between the two codecs. This can also be seen in the following plots.

Images         SNR (dB)   Weighted SNR (dB)   BIM (dB)   EBCOT1 (dB)   EBCOT2 (dB)
Lena_UN.pgm    19.80      16.69               4.43       17.80         17.95
Lena_SP.pgm    19.97      16.70               4.31       17.78         17.77
Lena_AD.pgm    19.67      17.12               3.84       18.08         19.05

Table 2 Objective results of Lena at 0.6 bit per pixel.

Let us now look at the measures as a function of the bit rate.

In the figure 21, we can see that most of the time the SNR of the Spiral codec is higher than or equal to the SNR of the Adaptive one.

In the figure 22, it is almost the opposite of the figure 21. Since we know that the Adaptive codec gives better results, we can say that, in this case, the weighted SNR is close to the subjective evaluation.

[Plot: SNR (dB) versus bit rate for the Uniform, the Spiral and the Adaptive codec.]

Figure 21 SNR for the Uniform, Spiral and Adaptive codec.

[Plot: weighted SNR (dB) versus bit rate for the Uniform, the Spiral and the Adaptive codec.]

[Plot: BIM (dB) versus bit rate for the Uniform, the Spiral and the Adaptive codec.]

Figure 23 BIM for the Uniform, Spiral and Adaptive codec.

[Plot: EBCOT1 (dB) versus bit rate for the Uniform, the Spiral and the Adaptive codec.]

According to the figure 23, at low bit rates the BIM of the Adaptive codec is lower than the BIM of the other ones. It means that the Adaptive codec produces fewer blocking artifacts. We can check this property against our own judgement, so the BIM is also, in this case, close to our subjective evaluation. In the figure 24, the first adapted EBCOT distortion metric (EBCOT1) behaves almost like the weighted SNR.

In the figure 25, we can see that the second adapted EBCOT distortion metric (EBCOT2) gives better results for the Adaptive codec than the other measures do. As we have seen, this codec is linked with the energy of each block, and the second adapted EBCOT metric also takes into account the energy of the block which contains the considered coefficient. So we can say that it is the measure which best matches our subjective judgement, but only for this particular codec.

We can conclude, first, that the Spiral codec optimises the SNR when compared with the Uniform codec and, secondly, that the Adaptive codec, which is the best one so far, optimises the EBCOT2 metric.

We can now present the changes we made in order to improve the algorithm.

[Plot: EBCOT2 (dB) versus bit rate for the Uniform, the Spiral and the Adaptive codec.]

4.2 Algorithm modification.

We can improve the codecs described in section 1 by making them closer to the human visual system (HVS). For the moment, these codecs give no particular importance to any coefficient except the DC coefficient: all the DC coefficients are sent first. The idea here is to do the same for the other coefficients, the AC coefficients: those which are more important for the HVS, i.e. those which correspond to the low frequencies, are put first in the bit stream. To determine the most important coefficients, we again used the quantization matrix described in the first section, which we can divide into different areas. We have found that the best division is the following (c.f. figure 26). The first region, the most important one, covers the coefficients whose quantization values lie between 1*10 and 2*10, the second one between 2*10 and 4*10, the third one between 4*10 and 8*10, and the fourth one, the least important, between 8*10 and 16*10, as you can see below.

                          99 103 100 112 98 95 92 72 101 120 121 103 87 78 64 49 92 113 104 81 64 55 35 24 77 103 109 68 56 37 22 18 62 80 87 51 29 22 17 14 56 69 57 40 24 16 13 14 55 60 58 26 19 14 12 12 61 51 40 24 16 10 11 16

Figure 26 The different areas of the luminance quantization matrix.

We can see that there is a ratio of two between successive regions. So we multiply the coefficients (except the DC one) which belong to the most important region by 8, by shifting them left by 3 bits; those of the second region are shifted by 2, those of the third one by 1, and those of the last region are not shifted.
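The region rule can be sketched as a small lookup on the quantization value. The sketch assumes that the matrix is the usual JPEG luminance table listed row by row with the DC entry first, and that each region excludes its upper bound (so the first region ends at 19); the names `JPEG_LUMA_Q` and `shift_for` are hypothetical.

```python
# Luminance quantization matrix, row by row, DC entry first (assumed layout).
JPEG_LUMA_Q = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def shift_for(q):
    """Left shift (3, 2, 1 or 0) of an AC coefficient with quantization value q."""
    for upper, shift in ((20, 3), (40, 2), (80, 1), (160, 0)):
        if q < upper:          # assumption: upper bounds are exclusive
            return shift
    raise ValueError("quantization value outside the four regions")


# Shift of each AC position; the DC coefficient (index 0) is handled separately.
ac_shifts = [shift_for(q) for q in JPEG_LUMA_Q[1:]]
coefficient = 5
print(coefficient << shift_for(11))   # 5 * 8 = 40 in the most important region
```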

Consequently, since the maximum shift is 3, we need 3 more bit planes for the AC coefficients. We now have 14 bit planes for the AC coefficients and still 11 for the DC coefficient, since it is processed separately (c.f. figure 27). We thus have more information to transmit, but we know where the useless information is: it consists of the zeros we introduced when shifting a coefficient (c.f. figure 28). For example, if we shift a coefficient by 3 (we multiply it by 8), the useless information is the 3 zeros at the end (those of the last 3 bit planes), so we do not need to transmit them. The same holds for the coefficients which are not shifted: the useless information is the 3 leading zeros (those of the first 3 bit planes).
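The position of the useless zeros follows directly from the shift, as the following Python sketch shows. The constants restate the 11-bit coefficients and the maximum shift of 3 from the text; the function names are hypothetical and bit planes are numbered from 1 (most significant) to 14.

```python
AC_BITS = 11                     # magnitude bits of an unshifted AC coefficient
MAX_SHIFT = 3                    # largest shift, used for the most important region
PLANES = AC_BITS + MAX_SHIFT     # 14 bit planes in total


def plane_bits(magnitude, shift):
    """The 14 bit-plane values of a shifted coefficient, MSB first."""
    return format(magnitude << shift, "0{}b".format(PLANES))


def informative_planes(shift):
    """Bit planes (1 = MSB ... 14 = LSB) that can hold non-zero bits."""
    first = MAX_SHIFT - shift + 1
    return range(first, first + AC_BITS)


print(plane_bits(0b11111111111, 3))   # 11111111111000
print(plane_bits(0b11111111111, 0))   # 00011111111111
print(list(informative_planes(2)))    # planes 2 to 12
```

An encoder built along these lines would skip, for each coefficient, the planes outside `informative_planes(shift)`, since both sides know they contain zeros.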

[Diagram: a DCT block drawn as a stack of bit planes, from the 1st to the 14th bit plane, covering the DC coefficient up to the 64th coefficient.]

Figure 27 A DCT block with 14 bit planes for AC coefficients and 11 for the DC coefficient.

                                          MSB
A coefficient shifted by 3:               x x x x x x x x x x x 0 0 0
A coefficient shifted by 2:               0 x x x x x x x x x x x 0 0
A coefficient shifted by 1:               0 0 x x x x x x x x x x x 0
A no shifted coefficient (shifted by 0):  0 0 0 x x x x x x x x x x x

Figure 28 Shifted coefficients and useless 0 (x = 0 or 1).

We can also calculate the maximum of the AC coefficients and see whether it is worth using 14 bit planes. We can send this information at the beginning of the bit stream; it takes only 4 bits. For most of our test images, 13 bit planes are enough: for Lena, for example, the maximum of the AC coefficients is 636, which fits in 10 bits instead of 11. This will be the Bit Planes codec (the BP codec).
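A minimal sketch of this count, assuming the number of bit planes is the number of magnitude bits of the largest AC coefficient plus the maximum shift of 3 (the function name is hypothetical):

```python
def planes_needed(max_ac, max_shift=3):
    """Bit planes required when the largest AC magnitude is max_ac."""
    return max_ac.bit_length() + max_shift


planes = planes_needed(636)    # maximum AC coefficient quoted for Lena
print(planes)                  # 10 magnitude bits + 3 shift planes = 13
print(planes < 2 ** 4)         # the count fits in the 4-bit header field
```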

Consequently, we can define a new codec which combines the last two, the Adaptive and the Bit Planes codec: the Adaptive Bit Planes codec (the Adaptive BP codec).

We can also calculate the energy of each block in another way than in the Adaptive codec. Instead of counting the total number of significant coefficients of a block during the encoding of a bit plane, we calculate, before the encoding, the real energy $V_k$ of the block k, as defined in the second adapted EBCOT distortion metric (section 3.4.1).

After scaling it, we introduce this new parameter into the method by which we choose the regions of the quantization matrix described above.

Given $MAX_r$, the maximum of the region r, for the i-th coefficient, if

$$\frac{1}{\alpha_i} + V_k \le \frac{\left(MAX_r\right)^2}{c}$$

then this coefficient will be shifted by the shift which defines the region r. We still have

$$\alpha_i = \frac{c}{Q_i^2}$$

(c = 662.2649 for the luminance quantization matrix) and

$$V_k = \frac{1}{2^{10} \cdot 63} \sum_{i=2}^{64} \left(X_i^k\right)^2 .$$

For example, for the first region, $MAX_r = 19$, so, for the i-th coefficient, if

$$\frac{1}{\alpha_i} + V_k \le \frac{19^2}{c}$$

then this coefficient will be shifted by 3.

Consequently, with this method, the blocks with very low energy ($V_k \cong 0$) will be put in the bit stream before those which have a higher energy.
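The selection rule can be sketched as follows. Several points are assumptions: the comparison is taken as "less than or equal", only the first region maximum (19) is given in the text and the other three (39, 79, 159) are inferred from the region bounds, and a coefficient which exceeds the last bound is simply not shifted. The function name is hypothetical.

```python
C = 662.2649   # constant quoted in the text for the luminance quantization matrix

# (MAX_r, shift) per region; only MAX_r = 19 is stated, the others are assumed.
REGION_MAX = ((19, 3), (39, 2), (79, 1), (159, 0))


def energy_shift(q_i, v_k, c=C):
    """Shift of coefficient i in block k from 1/alpha_i + V_k, alpha_i = c / Q_i**2."""
    level = q_i ** 2 / c + v_k
    for max_r, shift in REGION_MAX:
        if level <= max_r ** 2 / c:
            return shift
    return 0       # assumption: beyond the last region, no shift


print(energy_shift(16, 0.0))   # flat block: the quantization matrix alone decides
print(energy_shift(16, 1.0))   # same coefficient in a block with more energy
```

With $V_k = 0$ the rule reduces to the comparison of $Q_i$ with $MAX_r$, so a flat block keeps the shifts of the fixed regions, while a block with more energy sees its coefficients moved to less important regions.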

In the decoder, we can calculate iteratively the total energy of each block during the decoding of a bit plane.

This codec will be called the Energy Bit Plane Codec.

4.3 Comparison between the Adaptive, the Bit Plane and the Adaptive Bit Plane codec.

4.3.1 Subjective evaluation

As we could guess, the best codec is the Adaptive Bit Plane codec, and the Bit Plane codec comes second. So the modification described in the last section is a real visual improvement. You can see the images at 0.6 bit per pixel in the appendix: Lena_BP.pgm (figure 43) and Lena_ABP.pgm (figure 44), processed by the Bit Plane and the Adaptive BP codec respectively. The visual difference at this bit rate is not so obvious, so you can also look at the figures 46, 47 and 48, which show Lena_AD.pgm, Lena_BP.pgm and Lena_ABP.pgm respectively, at 0.5 bit per pixel.

Let us now look at the quality measures.

4.3.2 Objective evaluation

Images         SNR (dB)   Weighted SNR (dB)   BIM (dB)   EBCOT1 (dB)   EBCOT2 (dB)
Lena_AD.pgm    19.67      17.12               3.84       18.18         19.05
Lena_BP.pgm    19.61      18.52               4.09       19.25         19.52
Lena_ABP.pgm   19.51      18.49               4.05       19.24         19.59

Table 3 Results at 0.6 bit per pixel for Lena_AD.pgm, Lena_BP.pgm, and Lena_ABP.pgm .

[Plot: SNR (dB) versus bit rate for the Adaptive, the BP and the Adaptive BP codec.]

Figure 29 SNR for the Adaptive, BP and Adaptive BP codec.

In the figure 29, we can notice that from 0.5 bit per pixel the SNR of the Adaptive Bit Plane codec is lower than the others (see also the table 3 at 0.6 bit per pixel). Since we know that it is in fact the best one for the human eye, we can conclude that, here, the SNR is not correlated with our subjective evaluation.

In the figure 30 and in the table 3, we can also remark that the weighted SNR of the Adaptive BP codec is almost the same as that of the BP codec, and that both codecs have a higher weighted SNR than the Adaptive codec. Since the BP codec gives more importance to the low frequencies, it is logical that the weighted SNR gives a better result. But the fact that this measure gives the same results at lower bit rates for the Adaptive BP and the BP codec makes it not very well correlated with our subjective evaluation.

[Plot: weighted SNR (dB) versus bit rate for the Adaptive, the BP and the Adaptive BP codec.]

Figure 30 Weighted SNR for the Adaptive, BP and Adaptive BP codec.

[Plot: BIM versus bit rate for the Adaptive, the BP and the Adaptive BP codec.]

But it is different for the BIM when we look at the figure 31 and the table 3. Indeed, the BIM indicates that, from 0.6 bit per pixel, there are more blocking artifacts for the best codec than for the Adaptive codec. This may be true in absolute terms, but the artifacts are less visible, so, in this case, the BIM is not very well correlated with our judgement.

[Plot: EBCOT1 (dB) versus bit rate for the Adaptive, the BP and the Adaptive BP codec.]

Figure 32 EBCOT1 for the Adaptive, BP and Adaptive BP codec.

The first adapted EBCOT distortion metric in the figure 32 does not bring much more information than the weighted SNR: the neighbourhood within the frequency band does not have a lot of influence.

But, as we can see in the figure 33 and the table 3, the second adapted EBCOT distortion metric is slightly better.

Consequently, we can conclude that this quality metric is the one which best matches our subjective assessment.

[Plot: EBCOT2 (dB) versus bit rate for the Adaptive, the BP and the Adaptive BP codec.]

Figure 33 EBCOT2 for the Adaptive, BP and Adaptive BP codec.

4.4 Comparison between the Adaptive Bit Plane and the Energy Bit Plane codec

4.4.1 Subjective evaluation

Although the Energy BP codec takes into account the real energy of each block, the Adaptive BP codec seems to be the better one when we look at the images at lower bit rates. You can compare, in the appendix, Lena_ABP.pgm (figure 44) and Lena_EBP.pgm (figure 45), the latter processed by the Energy BP codec, both reconstructed at 0.6 bit per pixel.

4.4.2 Objective evaluation

Images         SNR (dB)   Weighted SNR (dB)   BIM (dB)   EBCOT1 (dB)   EBCOT2 (dB)
Lena_EBP.pgm   18.82      16.67               4.21       17.75         19.31
Lena_ABP.pgm   19.51      18.49               4.05       19.24         19.59

Table 4 Results at 0.6 bit per pixel for Lena_EBP.pgm and Lena_ABP.pgm.

Referring to the table 4, none of the measures brings more information here, because they are all in accordance with our subjective evaluation. Since EBCOT2 seems to be the best measure so far, we can see how it evolves at lower bit rates in the figure 34.

[Plot: EBCOT2 (dB) versus bit rate for the Adaptive BP and the Energy BP codec.]

Figure 34 EBCOT2 for Adaptive BP and Energy BP codec.

This measure confirms that the Adaptive BP codec is the best codec so far.

5 Suggestions for future work.

5.1 Subjective evaluation.

Most of the time we carried out our subjective evaluation on the Lena image, although we also looked at a few other images. To validate the subjective assessment more “seriously”, more evaluations with more images and more people would be welcome.

We can also evaluate these quality measures for video sequences, and not only for still images.

5.2 Objective measures

We can improve the adapted EBCOT distortion metrics by optimising the scaling factor of their respective neighbourhoods, in order to bring these two measures closer to our subjective evaluation.

5.3 Algorithm improvement

In order to improve the compression algorithm, we can reduce the number of bits in the bit stream by using fewer bit planes for the DC and for the AC coefficients.

For the AC coefficients, we explained that, once their maximum is calculated, 10 bits are most of the time enough instead of 11 (that is, 13 bit planes instead of 14), but we can extend this idea to each AC coefficient individually. Let us take an example: for the Lena image, the figure 35 shows the maximum of each of the DCT coefficients.

1789  636  288  279  177  109   68   55
 444  280  228  167   97   70   78   37
 214  206  169  126   89  101   35   32
 125  116  117   82   67   43   48   24
  83   62   52   65   48   41   27   26
  44   41   40   30   27   27   21   31
  24   30   26   29   24   20   14   18
  20   19   21   21   23   15   16   13

Figure 35 The 64 maximums of all the DCT coefficients.

Now, it is easy to see that using 10 bits for every coefficient is most of the time too much.

Consequently, 4*63 bits is the maximum amount of information we have to send at the beginning of the bit stream, and a clever method can do it with fewer bits. The number of bits we add is in fact much smaller than the number of bits we save compared with taking 10 bits for each AC coefficient, so it is worth it.
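The potential saving can be estimated from the maxima of the figure 35. The sketch below counts raw magnitude bits only, assumes that the largest value (1789) is the DC coefficient and lists it first, and uses a hypothetical variable name for the table.

```python
# Maxima of the 64 DCT coefficients of Lena (figure 35), DC value first (assumed).
MAX_ABS = [
    1789, 636, 288, 279, 177, 109, 68, 55,
    444, 280, 228, 167, 97, 70, 78, 37,
    214, 206, 169, 126, 89, 101, 35, 32,
    125, 116, 117, 82, 67, 43, 48, 24,
    83, 62, 52, 65, 48, 41, 27, 26,
    44, 41, 40, 30, 27, 27, 21, 31,
    24, 30, 26, 29, 24, 20, 14, 18,
    20, 19, 21, 21, 23, 15, 16, 13,
]

ac_max = MAX_ABS[1:]                                   # the 63 AC coefficients
fixed_bits = 10 * len(ac_max)                          # 10 bits for every AC coefficient
adaptive_bits = sum(m.bit_length() for m in ac_max)    # just enough bits for each one
header_bits = 4 * len(ac_max)                          # one 4-bit length per AC coefficient

print(fixed_bits, adaptive_bits, header_bits)
```

The header is sent once per image, whereas the difference between `fixed_bits` and `adaptive_bits` is a saving in magnitude bits for every block, which is why the side information is quickly amortised.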

As for the DC coefficients, they are, as already mentioned, put in the bit stream directly without further encoding. But we can send them with the DPCM method, which means that we only send the difference between the DC coefficient of the previous DCT block and the current one (except, of course, for the first one), because although the DC coefficient is large and varied, it is often close to the previous value. In this case, however, the algorithm can no longer be 100 % scalable: it becomes scalable only from the first AC coefficient onwards. It can still be interesting to implement this idea because, most of the time, we will truncate the bit stream after the DC coefficients have been sent, so as not to obtain too bad a reconstructed image.
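A minimal Python sketch of DPCM applied to a sequence of DC coefficients, with hypothetical function names; the first value is sent as it is and every following value as the difference from its predecessor.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous one."""
    diffs = []
    previous = 0                 # the first value is sent unchanged
    for value in dc_values:
        diffs.append(value - previous)
        previous = value
    return diffs


def dpcm_decode(diffs):
    """Rebuild the DC values by accumulating the differences."""
    values = []
    previous = 0
    for diff in diffs:
        previous += diff
        values.append(previous)
    return values


dc = [1000, 1004, 998, 998]
print(dpcm_encode(dc))                       # [1000, 4, -6, 0]
print(dpcm_decode(dpcm_encode(dc)) == dc)    # True
```

The decoder needs all the preceding differences to rebuild a DC value, which is why the DC part of the bit stream can no longer be truncated at an arbitrary point.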

6 Conclusion.

In this paper we have, first, studied several objective image quality measures, such as the PSNR, the SNR, the BIM and several perceptually weighted SNRs which take into account the human visual system (frequency sensitivity, masking), and, secondly, implemented them for the different available image compression schemes related to “Low-complexity Scalable DCT Image Compression”. Among these measures we found one, the perceptually weighted SNR called in this paper the second adapted EBCOT distortion metric (EBCOT2), which follows our subjective evaluation. Finally, by optimising the compression scheme called Adaptive Bit Plane for this particular metric, a better image quality has been obtained.

At the beginning, the compression schemes, such as Uniform and Spiral, optimized the signal-to-noise ratio (SNR) of the compressed images. Next, the Adaptive codec optimized the weighted SNR, the Block Impairment Metric (BIM), and the first and the second adapted EBCOT distortion metrics (EBCOT1 and EBCOT2). Finally, we saw that the second adapted EBCOT distortion metric was the best measure for the best codec, that is, the measure which corresponds best to the visual quality as it is perceived by human observers.

We have to keep in mind, however, that a single quality measure cannot be used for every kind of artifact caused by a compression scheme. The quality metric we found is the best only for this codec. Since this measure takes into account the human visual system, it can be used for other processed images, but we cannot guarantee that it will be the best.

7 Appendix

Figure 37 Lena_UN.pgm at 0.6 bit per pixel.

Figure 39 The error image related with Lena_UN.pgm at 0.6 bit per pixel.

Figure 41 Lena_AD.pgm at 0.6 bit per pixel.

Figure 43 Lena_BP.pgm at 0.6 bit per pixel.

Figure 46 Lena_AD.pgm at 0.5 bit per pixel.

8 References

[1] R.J. van der Vleuten and R.P. Kleihorst, “Low-complexity scalable image compression”, Data Compression Conference (DCC 2000), Snowbird, UT, pp. 23-32, Mar. 28-30, 2000.

[2] R.J. van der Vleuten, R.P. Kleihorst and C. Hentschel, “Low-complexity scalable DCT image compression”, International Conference on Image Processing (ICIP), Vancouver, Canada, Sept. 10-13, 2000.

[3] R.J. Clarke, “Transform coding of images”, Academic Press, 1985.

[4] W.B. Pennebaker and J.L. Mitchell, “JPEG Still Image Data Compression Standard”, Van Nostrand Reinhold, 1993.

[5] H.R. Wu and M. Yuen, “A Generalized Block-edge Impairment Metric for Video Coding”, IEEE Signal Processing Letters, Vol. 4, No. 11, pp. 317-320, Nov. 1997.

[6] D. Taubman, “High Performance Scalable Image Compression with EBCOT”, IEEE Trans. Image Proc., Submitted March 1999; Revised August 1999.

[7] A.B. Watson, “DCT quantization matrices visually optimized for individual images”, Proceedings, Human Vision, Visual Processing, and Digital Display IV, Bellingham, WA, SPIE, pp. 202-216, 1993.


Author(s): Grégory HAMON

Title: OBJECTIVE QUALITY MEASURES FOR COMPRESSED IMAGES

Distribution

Nat.Lab./PI WB-5

PRL Redhill, UK

PRB Briarcliff Manor, USA

LEP Limeil-Brévannes, France

PFL Aachen, Germany

CP&T WAH

Director: Dr.ir. A.A. van Gorkum WB-57

Department Head Dr.ir. G.F.G. Depovere WO-01

Full report

R.A.C. Braspenning Nat.Lab. WO-01

F.J. de Bruijn Nat.Lab. WO-02

M. Gabrani Nat.Lab. WO-01

G. de Haan Nat.Lab. WO-02

G.J. Hekstra Nat.Lab. WO-01

C. Hentschel Nat.Lab. WO-01

I.E.J. Heynderickx Nat.Lab. WY-81

E.G.T. Jaspers Nat.Lab. WO-01

A.A.C. Kalker Nat.Lab. WY-82

R.P. Kleihorst Nat.Lab. WAY-41

E. Langendijk Nat.Lab. WY-81

A. Pelagotti Nat.Lab. WO-02

R. Peset Llopis Nat.Lab. WL-01

A.K. Riemens Nat.Lab. WO-01

E.B. v.d. Tol Nat.Lab. WO-01

R.J. v.d. Vleuten Nat.Lab. WO-01

I. de Weerd Nat.Lab. WY-81

M. Vanderschaar PRB Briarcliff Manor, NY, USA

J. Caviedes LEP Limeil-Brévannes, France

M. Verberne CE-CSI SX 2
