Color in Image and Video Processing: Most Recent Trends and Future Research Directions

Volume 2008, Article ID 581371, 26 pages, doi:10.1155/2008/581371

Review Article

Alain Trémeau,1 Shoji Tominaga,2 and Konstantinos N. Plataniotis3

1 Laboratoire LIGIV, Université Jean Monnet, 42000 Saint Etienne, France

2 Department of Information and Image Sciences, Chiba University, Chiba 263-8522, Japan

3 The Edward S. Rogers Department of ECE, University of Toronto, Toronto, Canada M5S 3G4

Correspondence should be addressed to Alain Trémeau, [email protected]

Received 2 October 2007; Revised 5 March 2008; Accepted 17 April 2008

Recommended by Y.-P. Tan

The motivation of this paper is to provide an overview of the most recent trends and of the future research directions in color image and video processing. Rather than covering all aspects of the domain, this survey covers issues related to the most active research areas in the last two years. It presents the most recent trends as well as the state-of-the-art, with a broad survey of the relevant literature, in the main active research areas in color imaging. It also focuses on the most promising research areas in color imaging science. This survey first gives an overview of the issues, controversies, and problems of color image science, focusing on human color vision, perception, and interpretation, and on acquisition systems, consumer imaging applications, and medical imaging applications. Next, it gives a brief overview of the solutions, recommendations, most recent trends, and future trends of color image science. It covers color spaces, appearance models, color difference metrics, and color saliency; color features, color-based object tracking, scene illuminant estimation and color constancy, quality assessment and fidelity assessment, and color characterization and calibration of a display device; and quantization, filtering and enhancement, segmentation, coding and compression, watermarking, and multispectral color image processing. Lastly, it addresses the research areas which still need addressing and which are the next and future perspectives of color in image and video processing.

Copyright © 2008 Alain Trémeau et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1. BACKGROUND AND MOTIVATION

The perception of color is of paramount importance in many applications, such as digital imaging, multimedia systems, visual communications, computer vision, entertainment, and consumer electronics. In the last fifteen years, color has become a key element of many, if not all, modern image and video processing systems. It is well known that color plays a central role in digital cinematography, modern consumer electronics solutions, digital photography systems such as digital cameras, video displays, video-enabled cellular phones, and printing solutions. In these applications, compression- and transmission-based algorithms as well as color management algorithms provide the foundation for cost-effective, seamless processing of visual information through the processing pipeline. Moreover, color is also crucial to many pattern recognition and multimedia systems, where color-based feature extraction and color segmentation have proven pertinent in detecting and classifying objects in various areas ranging from industrial inspection to geomatics and to biomedical applications.

tool for those who wish to contribute to the development of color image processing solutions and also for those who wish to develop a new generation of color image processing algorithms based on high-level concepts.

A number of special issues, including survey papers that review the state-of-the-art in the area of color image processing, have been published in the past decades. More recently, in 2005, a special issue on color image processing was written to help the signal processing community understand the fundamental differences between color and grayscale imaging [1]. In the same year, a special issue on multidimensional image processing was edited by Lukac et al. [3]. This issue reviewed recent trends in multidimensional image processing, ranging from image acquisition to image and video coding, to color image processing and analysis, and to color image encryption. In 2007, a special issue on color image processing was edited by Lukac et al. [4] to fill the existing gap between researchers and practitioners who work in this area. In 2007, a book on color image processing was published to cover processing and application aspects of digital color imaging [5].

Several books have also been published on the topic. For example, Lukac and Plataniotis edited a book [6] which examines the techniques, algorithms, and solutions for digital color imaging, emphasizing emerging topics such as secure imaging, semantic processing, and digital camera image processing.

Since 2006, we have observed a significant increase in the number of papers devoted to color image processing in the image processing community. We will discuss in this survey which are the main problems examined by these papers and the principal solutions proposed to face these problems. The motivation of this paper is to provide a comprehensive overview of the most recent trends and of the future research directions in color image and video processing. Rather than covering all aspects of the domain, this survey covers issues related to the most active research areas in the last two years. It presents the most recent trends as well as the state-of-the-art, with a broad survey of the relevant literature, in the main active research areas in color imaging. It also focuses on the most promising research areas in color imaging science. Lastly, it addresses the research areas which still need addressing and which are the next and future perspectives of color in image and video processing.

This survey is intended for graduate students, researchers, and practitioners who have a good knowledge of color science and digital imaging and who want to know and understand the most recent advances and research in digital color imaging. This survey is organized as follows: after an introduction about the background and the motivation of this work, Section 2 gives an overview of the issues, controversies, and problems of color image science. This section focuses on human color vision, perception, and interpretation. Section 3 presents the issues, controversies, and problems of color image applications. This section focuses on acquisition systems, consumer imaging applications, and medical imaging applications. Section 4 gives a brief overview of the solutions, recommendations, most recent trends, and future trends of color image science. This section focuses on color space, appearance models, color difference metrics, and color saliency. Section 5 presents the most recent advances and research in color image analysis. Section 5 focuses on color features, color-based object tracking, scene illuminant estimation and color constancy, quality assessment and fidelity assessment, and color characterization and calibration of a display device. Next, Section 6 presents the most recent advances and research in color image processing. Section 6 focuses on quantization, filtering and enhancement, segmentation, coding and compression, watermarking, and lastly on multispectral color image processing. Finally, conclusions and suggestions for future work are drawn in Section 7.

2. COLOR IMAGE SCIENCE AT PRESENT: ISSUES, CONTROVERSIES, PROBLEMS

2.1. Background

The science of color imaging may be defined as the study of color images and the application of scientific methods to their measurement, generation, analysis, and representation. It includes all types of image processing, including optical image production, sensing, digitalization, electronic protection, encoding, processing, and transmission over communications channels. It draws on diverse disciplines from applied mathematics, computing, physics, engineering, and social as well as behavioural sciences, including human-computer interface design, artistic design, photography, media communications, biology, physiology, and cognition.

For several years we have been witnessing the development of worldwide image communication using a large variety of color display and printing technologies. As a result, “cross media” image transfer has become a challenge [7]. Likewise, the requirement of accuracy in color reproduction has pushed the development of new multispectral imaging systems. The effective design of color imaging products relies on a range of disciplines, for it operates at the very heart of the human-computer interface, matching human perception with computer-based image generation.

Until recently, the design of efficient color imaging systems was guided by the criterion that “what the user cannot see does not matter.” This has been, so far, the only guiding principle for image filtering and coding, but in modern applications it is no longer sufficient. For example, it should be possible to reconstruct on a display the image of a painting from a digital archive under different illuminations. From the human vision point of view, the problem is that visual perception is one of the most elusive and changeable of all aspects of human cognition, and depends on a multitude of factors. Successful research and development of color imaging products must therefore combine a broad understanding of psychophysical methods with a significant technical ability in engineering, computer science, applied mathematics, and behavioral science.

2.2. Human color vision

The human color vision system is immensely complicated. For a better understanding of its complexity, a short introduction is given here. The reflected light from an object enters the eye, first passes through the cornea and lens, and creates an inverted image on the retina at the back of the eyeball. The retinal surface contains millions of photoreceptors of two types: rods and cones. The former are sensitive to very low levels of light but cannot see color. Color information is detected at normal (daylight) levels of illumination by the three types of cones, named L, M, and S, corresponding to light-sensitive pigments at long, medium, and short wavelengths, respectively. The visible spectrum ranges from about 380 to 780 nanometers (nm). The situation is complicated by the retinal distribution of the photoreceptors: the cone density is the highest in the foveal region, in a central visual field of approximately 2° diameter, whereas the rods are absent from the fovea but attain maximum density in an annulus of 18° eccentricity, that is, in the peripheral visual field. The information acquired by rods and cones is encoded and transmitted via the optic nerve to the brain as one luminance channel (black-white) and two opponent chrominance channels (red-green and yellow-blue), as proposed by Hering's opponent-process theory of color vision. These visual signals are successively processed in the lateral geniculate nucleus (LGN) and visual cortex (V1), and then propagated to several nearby visual areas in the brain for further extraction of features. Finally, the higher cognitive functions of object recognition and color perception are attained.

At very low illumination levels, when the stimulus has a luminance less than approximately 0.01 cd/m², only the rods are active and give monochromatic vision, known as scotopic vision. When the luminance of the stimulus is greater than approximately 10 cd/m², at normal indoor and daylight levels of illumination in a moderate surround, the cones alone mediate color vision, known as photopic vision. Between 0.01 and 10 cd/m² there is a gradual changeover from scotopic to photopic vision as the retinal illuminance increases, and in this domain of mesopic vision both cones and rods make significant contributions to the visual response.
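The three regimes described above can be summarized as a simple threshold rule. The following is only an illustrative sketch: the function name `vision_regime` and the constant names are assumptions introduced here, and the two limits are the approximate values quoted in the text, not sharp physiological boundaries.

```python
# Approximate limits quoted in the text (cd/m^2); the real transition is gradual.
SCOTOPIC_MAX_CD_M2 = 0.01   # below this, only the rods are active
PHOTOPIC_MIN_CD_M2 = 10.0   # above this, the cones alone mediate vision


def vision_regime(luminance_cd_m2: float) -> str:
    """Classify a stimulus luminance (hypothetical helper, not from the paper)."""
    if luminance_cd_m2 < 0:
        raise ValueError("luminance must be non-negative")
    if luminance_cd_m2 < SCOTOPIC_MAX_CD_M2:
        return "scotopic"
    if luminance_cd_m2 > PHOTOPIC_MIN_CD_M2:
        return "photopic"
    # Between the two limits both rods and cones contribute.
    return "mesopic"
```

A hard threshold is of course a simplification: within the mesopic range the relative contribution of rods and cones changes continuously with retinal illuminance.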

Yet the mesopic condition is commonly encountered in dark-surround or dim-surround conditions for viewing of television, cinema, and conference projection displays, so it is important to have an appropriate model of color appearance. The cinema viewing condition is particularly interesting, because although the screen luminance is definitely photopic, with a standard white luminance of 40–50 cd/m², the observers in the audience are adapted to a dark surround in the peripheral field which is definitely in the mesopic region. Also, the screen fills a larger field of view than is normal for television, so the retinal stimulus extends further into the peripheral field where rods may make a contribution. Additionally, the image on the screen changes continuously and the average luminance level of dark scenes may be well down into the mesopic region. Under such conditions, the rod contribution cannot be ignored. There is no official CIE standard yet available for mesopic photometry, although in Division 1 of the CIE there is a technical committee dedicated to this aspect of human vision: TC1-58 “Visual Performance in the Mesopic Range.”

When dealing with the perception of static and moving images, visual contrast sensitivity plays an important role in the filtering of visual information processed simultaneously in the various visual “channels.” The high-frequency active channels (also known as parvocellular or P channels) enable detail perception; the medium-frequency active channels allow shape recognition, whereas the low-frequency active channels (also known as magnocellular or M channels) are more sensitive to motion. Spatial contrast sensitivity functions (CSFs) are generally used to quantify these responses and are divided into two types: achromatic and chromatic. Achromatic contrast sensitivity is generally higher than chromatic. For achromatic sensitivity, the maximum sensitivity to luminance occurs at spatial frequencies of approximately 5 cycles/degree. The maximum chrominance sensitivity is only about one tenth of the maximum luminance sensitivity. The chrominance sensitivities fall off above 1 cycle/degree, particularly for the blue-yellow opponent channel, thus requiring a much lower spatial bandwidth than luminance. For a nonstatic stimulus, as in all refreshed display devices, the temporal contrast sensitivity function must also be considered. To further complicate matters, the spatial and temporal CSFs are not separable and so must be investigated and reported as a function on the time-space frequency plane.

et al. proposed a model which provided insight into the activity and interactions of the achromatic and chromatic mechanisms involved in the perception of contrasts [9]. However, the proposed model does not offer significant improvement over other models in the high mesopic range or in the mid-to-low mesopic range, because the mathematical model used does not fit these extreme values correctly.

Likewise, there is a need to determine the limits of visibility, for example, the minimum brightness contrast between foreground and background, in different viewing conditions. For example, Ojanpaa et al. investigated the effect of luminance and color contrasts on the speed of reading and visual search as a function of character size. It would be interesting to extend this study to small displays such as mobile devices and to various viewing conditions such as strong ambient light. According to Kuang et al., contrast judgement as well as colorfulness has to be analysed as a function of highlight contrasts and shadow contrasts [10].

2.3. Low-level description and high-level interpretation

In recent years, research efforts have also focused on semantically meaningful automatic image extraction [11]. According to Dasiapoulou et al. [11], these efforts have not bridged the gap between low-level visual features that can be automatically extracted from visual content (e.g., with saliency descriptors), and the high-level concepts capturing the conveyed meaning. Even if conceptual models such as MPEG-7 have been introduced to model high-level concepts, we are still confronted with the problem of extracting the objects of a scene (i.e., the regions of an image) at an intermediate level between the low level and the high level. Perhaps the most promising way to bridge this gap is to focus the research activity on new and improved human visual models. Traditional models are based either on a data-driven description or on a knowledge-based description. Likewise, there is in general a gap between traditional computer vision science and human vision science: the former considers that there is a hierarchy of intermediate levels between signal-domain information and semantic understanding, whereas the latter considers that the relationships between visual features in the human visual system are too complex to be modeled by a hierarchical model. Alternative models have attempted to bridge the gap between low-level descriptions and high-level interpretations by encompassing a structured representation of objects, events, and relations that are directly related to semantic entities. However, there is still plenty of room for new alternative models, additional descriptors, and methodologies for an efficient fusion of descriptors [11].

Image-based models as well as learning-based approaches are techniques that have been widely used in the area of object recognition and scene classification. They consider that humans can recognize objects either from their shapes or from their color and their texture. This information is considered as low-level data because it is extracted by the human vision system during the preattentive stage. Conversely, high-level data (i.e., semantic data) is extracted during the interpretation stage. There is no consensus in human vision science on how to model intermediate stages between the preattentive and interpretation stages, because we do not have a complete knowledge of visual areas and of neural mechanisms. Moreover, the neural pathways are interconnected and the cognitive mechanisms are very complex. Consequently, there is no consensus on one human vision model.

We believe that the future of image understanding will advance through the development of human vision models which better take into account the hierarchy of visual image processing stages from the preattentive stage to the interpretation stage. With such a model, we could bridge the gap between low-level descriptors and high-level interpretation. With a better knowledge of the interpretation stage of the human vision system we could analyze images at the semantic level in a way that matches human perception.

3. COLOR IMAGE APPLICATIONS: ISSUES, CONTROVERSIES, PROBLEMS

When we speak about color image science, it is fundamental to mention first the problems of acquisition and reproduction of color images, but also the problems of expertise in particular disciplinary fields (meteorologists, climatologists, geographers, historians, etc.). To illustrate the problems of acquisition, we discuss demosaicking technologies. Next, to illustrate the problems with the display of color images, we discuss digital cinema. Lastly, to illustrate the problems of particular expertise, we cite medical applications.

3.1. Color acquisition systems

temporal color video demosaicking algorithm based on motion estimation and data fusion in order to reduce color artefacts over the intraframes. In this paper, the authors considered that the temporal dimension of a color mosaic image sequence could reveal new information on the color components missing because of the mosaic subsampling, information which is otherwise unavailable in the spatial domain of individual frames. Each pixel of the current frame is matched to another in a reference frame via motion analysis, such that the CCD sensor samples different color components of the same object position in the two frames. Next, the resulting interframe estimates of missing color components are fused with suitable intraframe estimates to achieve a more robust color restoration. In [23], Lukac and Plataniotis comprehensively surveyed demosaicking, demosaicked image postprocessing, and camera image zooming solutions that utilize data-adaptive and spectral modeling principles to produce camera images with an enhanced visual quality. Demosaicking techniques have also been studied with regard to other image processing tasks, such as compression (e.g., see [24]).
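To make the notion of mosaic subsampling concrete, the sketch below simulates what a single-sensor camera records. It assumes an RGGB Bayer layout, which the text above does not specify, and the function names `bayer_channel` and `mosaic` are hypothetical. It is not the algorithm of any of the cited papers; it only shows which components are missing and must be estimated by a demosaicking method.

```python
def bayer_channel(row: int, col: int) -> int:
    """Index (0=R, 1=G, 2=B) sampled at a pixel of an assumed RGGB mosaic."""
    if row % 2 == 0:
        return 0 if col % 2 == 0 else 1
    return 1 if col % 2 == 0 else 2


def mosaic(rgb_image):
    """Keep one color component per pixel; the two others become None.

    `rgb_image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    out = []
    for r, line in enumerate(rgb_image):
        out_line = []
        for c, pixel in enumerate(line):
            kept = bayer_channel(r, c)
            out_line.append(
                tuple(v if i == kept else None for i, v in enumerate(pixel))
            )
        out.append(out_line)
    return out
```

Every `None` entry is a value that demosaicking has to restore. In the temporal approach described above, motion analysis looks for a reference frame in which the same object position falls on a sensor site with a different color filter, so that some of these entries are actually measured rather than interpolated.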

3.2. Color in consumer imaging applications

Digital color image processing is increasingly becoming a core technology for future products in consumer imaging. Unlike past solutions where consumer imaging was entirely reliant on traditional photography, increasingly diverse color image sources, including (digitized) photographic media, images from digital still or video cameras, synthetically generated images, and hybrids, are fuelling the consumer imaging pipeline. The diversity on the image capturing and generation side is mirrored by an increasing diversity of the media on which color images are reproduced. Besides being printed on photographic paper, consumer pictures are also reproduced on toner- or inkjet-based systems or viewed on digital displays. The variety of image sources and reproduction media, in combination with diverse illumination and viewing conditions, creates challenges in managing the reproduction of color in a consistent and systematic way. The solution of this problem involves not only the mastering of the photomechanical color reproduction principles, but also the understanding of the intrinsic relations between visual image appearance and quantitative image quality measurements. Much is expected from improved standards that describe the interfaces of various capturing and reproduction devices so they can be combined into better and more reliably working systems.

To achieve “what you see is what you get” (WYSIWYG) color reproduction when capturing, processing, storing, and displaying visual data, the color in visual data should be managed so that whenever and however images are displayed their appearance remains perceptually constant. In the photographic, display, and printing industries, color appearance models, color management methods, and standards are already available, notably from the International Color Consortium (ICC, see http://www.color.org/), the International Commission on Illumination (CIE) Divisions 1 “Vision and Color” (see http://www.bio.im.hiroshima-cu.ac.jp/~cie1) and 8 “Image Technology” (see http://www.colour.org/), the International Electrotechnical Commission (IEC) TC100 “Multimedia for today and tomorrow” (see http://tc100.iec.ch/about/structure/tc100 ta2.htm/), and the International Organisation for Standardisation (ISO) such as ISO TC42 “Photography” (see http://www.i3a.org/iso.html/), ISO TC 159 “Visual Display” and ISO TC171 “Document Management” (see http://www.iso.org/iso/). A computer system that enables WYSIWYG color to be achieved is called a color management system. Typical components include the following:

(i) a color appearance model (CAM) capable of predicting color appearance under a wide variety of viewing conditions, for example, the CIECAM02 model recommended by CIE;

(ii) device characterization models for mapping between the color primaries of each imaging device and the color stimulus seen by a human observer, as defined by CIE specifications;

(iii) a device profile format for embodying the translation from a device characterization to a color appearance space proposed by ICC.

need for automatic color transfer toolboxes (e.g., color balance, RGB channel alignment, color grade transfer, color correction). Unfortunately, little attention has been paid to color transfer in video or film. Most color transfer algorithms have been defined for still images from a reference image, or for image sequences from key frames in a video clip [29]. Moreover, the key frames computed for video sequences are arbitrarily selected regardless of the color content of these frames. A common feature of color transfer algorithms is that they operate on the whole image independently of the image's semantic content (however, an observer who watches a football match in a stadium is more sensitive to the color of the pitch than to the color of the stands). Moreover, they do not take into account metadata such as the script of the scenario or the lighting conditions under which the scene was filmed. Nevertheless, such metadata is used by the Digital Cinema System Specification for testing digital projectors and theatre equipment [30].
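The limitation noted above, namely that color transfer usually treats the whole image uniformly, can be illustrated with a deliberately simple global transfer. The sketch below matches the per-channel mean and standard deviation of a target image to those of a reference image. It is a simplified stand-in chosen for illustration, not the method of [29] or of any other cited work; the function name `transfer_color`, the flat list-of-tuples image format, and the choice to work directly on RGB values are all assumptions.

```python
from statistics import mean, pstdev


def transfer_color(target, reference):
    """Match per-channel mean and standard deviation of `target` to `reference`.

    Both arguments are lists of (r, g, b) tuples with float components.
    Every pixel is transformed identically, whatever object it belongs to.
    No clipping to a display range is applied in this sketch.
    """
    result_channels = []
    for ch in range(3):
        t = [p[ch] for p in target]
        r = [p[ch] for p in reference]
        t_mean, r_mean = mean(t), mean(r)
        t_std, r_std = pstdev(t), pstdev(r)
        # A flat target channel has no spread to rescale; just shift it.
        scale = r_std / t_std if t_std > 0 else 1.0
        result_channels.append([(v - t_mean) * scale + r_mean for v in t])
    # Regroup the three channel lists into one (r, g, b) tuple per pixel.
    return list(zip(*result_channels))
```

Because the same shift and scale are applied to every pixel, the pitch and the stands in the football example are corrected identically. A content-aware method would need region- or object-level statistics instead of global ones.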

The problems of color reproduction in graphic arts are in many regards similar to those in consumer imaging, except that much of the image capturing and reproduction is in a controlled and mature industrial environment, making it generally easier to manage the variability. A particularly important color problem in graphic arts is the consistency and predictability of the “digital color proof” with regard to the final print. According to Bochko et al., the design of a system for accurate digital archiving of fine art paintings has attracted increasing interest [31]. Excellent results have been achieved under controlled illumination conditions, but it is expected that approaching this problem using multispectral techniques will result in a color reproduction that is more stable under different illumination conditions. Archiving the current condition of a painting with high accuracy in digital form is important both to preserve it for the future and to restore it. For example, Berns worked on digital restoration of faded paintings and drawings using a paint-mixing model and digital imaging of the artwork with a color-managed camera [32]. Until 2005, Berns also managed a research program entitled “Art Spectral Imaging” which focused on spectral-based color capture, archiving, and reproduction [30].

Another interesting problem in graphic arts is colorization, a computerized process that adds color to a monochrome image or movie. Few methods for motion pictures have been published (e.g., [33]). Various applications, such as comics (manga), cartoon films, and satellite images, have been reported (e.g., [34]). In addition, the technology is used not only to color images but also for image encoding [35]. In recent years, techniques developed in other fields of image processing, such as image matting [36], image inpainting [37], and physical reflection models [38], have been applied to colorization. The scope of colorization is not limited to coloring algorithms but extends to the color-to-gray problem (e.g., [39]), which is an interesting new direction. Improving colorization accuracy for monochrome video remains an essential challenge for the future.

3.3. Color in medical imaging

In general, medical imaging focuses mostly on analysing the content of the images rather than the artefacts linked to the technologies used.

Most of the images, such as X-ray and tomographic images, echographs, or thermographs, are monochrome in nature. In a first application of color image processing, pseudocolorization was used to aid the interpretation of transmitted microscopy (including stereo microscopy, 3D reconstructed image, and fluorescence microscopy) [40]. In the context of biomedical imaging, an area of increasing significance in society, color information has been widely used, amongst other things, to detect skin lesions, glaucoma in eyes [41], and microaneurysms in color fundus images [42], to measure blood-flow velocities in the orbital vessels, and to analyze tissue microarrays (TMAs) or cDNA microarrays [43, 44]. Current approaches are based on colorimetric interpretation, but multispectral approaches can lead to more reliable diagnoses. Multispectral image processing may also become an important core technology for the business units “nondestructive testing” and “aerial photography,” assuming that these groups expand their applications into the domain of digital image processing. The main problem in medical imaging is to model the image formation process (e.g., digital microscopes [45], endoscopes [46], color-doppler echocardiography [47]) and to correlate image interpretation with physics-based models. In medical applications, lighting conditions are usually controlled. However, several medical applications are faced with the problem of noncontrolled illumination, such as in dentistry [48] or in surgery.

Another important problem addressed in medical imaging is the quality of images and displays (e.g., sensitivity, contrast, spatial uniformity, color shifts across the grayscale, angular-related changes of contrast, and angular color shifts) [49–51]. To deal with the problem of image quality, some systems classify images by assigning them to one of a number of quality classes, such as in retinal screening [50]. To classify image structures found within the image, Niemeijer et al. used a clustering approach based on multiscale filterbanks. The proposed method was compared, using different feature sets (e.g., image structure or color histograms) and classifiers, with the ratings of a human observer. The best system, based on a Support Vector Machine, had performance close to optimal, with an area under the ROC curve of 0.9968.

metadata from specimens to characterize them, to abstract their interpretation, to correlate them to clinical data, next to use these metadata for automated and accurate analysis of digitized images.

Lastly, dentistry is faced with complex lighting phenomena (e.g., translucency, opacity, light scattering, gloss effect, etc.) which are difficult to control. Likewise, cosmetic science is faced with the same problems. The main tasks of dentistry and cosmetic science are color correction, gloss correction, and face shape correction.

3.4. Color in other applications

We have discussed in this section several problems of medical applications, but we could also mention the problems of assisting diagnosis in each area of particular expertise (meteorologists, climatologists, geographers, historians, etc.). Likewise, we could mention the problems of image and display quality in web applications, HDTV, graphic arts, and so on, or applications of nondestructive quality control in numerous areas including painting, varnishes, and materials in the car industries, aeronautical packaging, or the control of products in the food industry. Numerous papers have shown that even if most of the problems in color image science are similar across applications, color imaging solutions are closely tied to the kinds of image and to the applications.

4. COLOR IMAGE SCIENCE—THE ROAD AHEAD: SOLUTIONS, RECOMMENDATIONS, AND FUTURE TRENDS

4.1. Color spaces

Rather than using a conventional color space, another solution consists of using an ad hoc color space based on the most characteristic color components of a given set of images. Thus, Benedetto et al. [53] proposed to use the YST color space to watermark images of human faces where Y, S, and T represent, respectively, the brightness component, the color average value of a set of different colors of human faces, and the color component orthogonal to the two others. The YST color space is next used to watermark images that have the same color characteristics as the set of images used. Such a watermarking process is robust to illumination changes as the S component is relatively invariant to illumination changes.

Other solutions have also been proposed for other kinds of processes, such as the following.

(i) For segmentation. The Fisher distance strategy has been proposed in [54] in order to perform figure-ground segmentation. The idea is to maximize the foreground/background class separability using a linear discriminant analysis (LDA) method.

(ii) For feature detection. The diversification principle strategy has been proposed in [55] in order to perform selection and fusion of color components. The idea is to exploit the imperfect correlation between color components or feature detection algorithms through a weighting scheme which yields maximal feature discrimination. Considering that a trade-off exists between color invariant components and their discriminating power, the authors proposed to automatically weight color components to arrive at a proper balance between color invariance under varying viewing conditions (repeatability) and discriminative power (distinctiveness).

(iii) For tracking. The adaptive color space switching strategy has been proposed in [56] in order to perform tracking under varying illumination. The idea is to dynamically select, among all conventional color spaces, the best color space for a given task (e.g., tracking) as a function of the state of the environment.
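The class-separability idea behind item (i) can be illustrated with a minimal sketch. It scores each candidate color component with a one-dimensional Fisher criterion (squared gap between class means divided by the sum of class variances) and keeps the component with the highest score. This is a simplification for illustration only, not the method of [54]: the function names `fisher_score` and `best_component`, the synthetic samples, and the per-component scoring (instead of a full LDA projection) are all assumptions.

```python
import numpy as np

def fisher_score(fg, bg):
    """Fisher criterion for one scalar feature: squared mean gap over summed variances."""
    fg = np.asarray(fg, dtype=float)
    bg = np.asarray(bg, dtype=float)
    gap = (fg.mean() - bg.mean()) ** 2
    spread = fg.var() + bg.var()
    return gap / (spread + 1e-12)  # small constant avoids division by zero

def best_component(fg_pixels, bg_pixels, names):
    """Return the name and score of the component that best separates the two classes.

    fg_pixels, bg_pixels: arrays of shape (n, k), one column per candidate component.
    """
    fg_pixels = np.asarray(fg_pixels, dtype=float)
    bg_pixels = np.asarray(bg_pixels, dtype=float)
    scores = [fisher_score(fg_pixels[:, i], bg_pixels[:, i]) for i in range(len(names))]
    best = int(np.argmax(scores))
    return names[best], scores[best]

# Hypothetical samples: foreground and background differ mainly in the third component.
rng = np.random.default_rng(0)
fg = rng.normal([0.5, 0.5, 0.8], 0.05, size=(200, 3))
bg = rng.normal([0.5, 0.5, 0.2], 0.05, size=(200, 3))
name, score = best_component(fg, bg, ["R", "G", "B"])
```

The same scoring could be run over the components of several color spaces at once by stacking them as extra columns, which is one simple way to compare candidate spaces for a given figure-ground task.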

These solutions could be extended to more image processing tasks than those initially considered provided these solutions are adapted to these tasks. The proper use and understanding of these solutions is necessary for the development of new color image processing algorithms. In our opinion, there is room for the development of other solutions for choosing the best color space for a given image processing task.

Lastly, new techniques have recently appeared to decompose color data into different components, such as a lightness component and a color component; examples are quaternion theory [57, 58] and other mathematical models based on polar representation [59]. For example, Denis et al. [57] used the quaternion representation for edge detection in color images. They constrained the discrete quaternionic Fourier transform to avoid information loss during processing and defined new spatial and frequency operators to filter color images. Shi and Funt [58] used the quaternion representation for segmenting color images. They showed that the quaternion color texture representation can be used to successfully divide an image into regions on the basis of texture.

4.2. Color image appearance (CAM)

The aim of the color appearance model is to model how the human visual system perceives the color of an object or of an image under different points of view, different lighting conditions, and with different backgrounds.

The principal role of a CAM is to achieve successful color reproduction across different media, for example, to transform input images from film scanners and cameras onto displays, film printers, and data projectors while accounting for the human visual system (HVS). To this end, a CAM must be adaptive to viewing conditions, that is, ambient light, surround color, screen type, viewing angle, and distance. The standard CIECAM02 [60] has been successfully tested at various industrial sites for graphic arts applications, but needs to be tested before being used in other viewing conditions (e.g., cinematographic viewing conditions).


Stevens effect, Hunt effect, Bezold-Brücke effect, simultaneous contrast, crispening, color constancy, memory color, discounting-the-illuminant, light, dark, and chromatic adaptation, surround effect, spatial and temporal visions. All these phenomena are caused by the change of viewing parameters, primarily illuminance level, field size, background, surround, viewing distance, spatial, and temporal variations, viewing mode (illuminant, surface, reflecting, self-luminous, or transparent), structure effect, shadow, transparency, neon-effect, saccades effect, stereo depth, and so forth.

Many color appearance models have been developed since 1980. The most recent one is CIECAM02 [60]. Although CIECAM02 does provide satisfactory prediction for a wide range of viewing conditions, many limitations remain. Let us consider four of these limitations: (1) objective determination of viewing parameters; (2) prediction of color appearance under mesopic vision; (3) incorporation of spatial effects for evaluating static images; (4) consideration of the temporal effects of the human visual system for moving images.

The first limitation is due to the fact that in CIECAM02 the viewing conditions need to be defined in terms of illumination (light source and luminance level), luminance factor of background and surround (average, dim, or dark). Many of these parameters are very difficult to define, which leads to confusion in industrial application and deviations in experimentation. The surround condition is highly critical for predicting accurate color appearance, especially when associated with viewing conditions for different media. Typically, we assume that viewing a photograph or a print in a normal office environment is called “bright” or “average” surround, whereas watching TV in a darkly lit living room can be categorized as “dim” surround, and observing projected slides and cinema images in a darkened room is “dark” surround. Users currently have to determine what viewing condition parameter values should be used. Recent work has been carried out by Kwak et al. [61] to make better prediction of changes in color appearance with different viewing parameters.

The second shortcoming addresses the state of visual adaptation at low light levels (mesopic vision). Most models of color appearance assume photopic vision, and completely disregard the contribution from rods at low levels of luminance. There are few color appearance datasets for mesopic vision and the experimental data from conventional vision research are difficult to apply to color appearance modeling because of the different experimental techniques employed (haploscopic matching, flicker photometry, etc.). The only color appearance model so far to include a rod contribution is the Hunt 1994 model but, when this was adapted to produce CIECAM97s and later CIECAM02, the contributions of the rod signal to the achromatic luminance channel were omitted [62]. In a recent study, color appearance under mesopic vision conditions was investigated using a magnitude estimation technique [8, 63]. Larger stimuli covering both foveal and perifoveal regions were used to probe the effect of the rods. It was confirmed that colors looked “brighter” and more colorful for a 10-degree patch than a 2-degree patch, an effect that grew at lower luminance levels. It seemed that perceived brightness was increased by the larger relative contribution of the rods at lower luminance levels and that the increased brightness induced higher colorfulness. It was also found that the colors with green-blue hues were more affected by the rods than other colors, an effect that corresponds to the spectral sensitivity of the rod cell, known as the “Purkinje shift” phenomenon. Analysis of the experimental results led to the development of an improved lightness predictor, which gave superior results to eight other color appearance models in the mesopic region [61].

The third shortcoming is linked to the problem that the luminance of the white point and the luminance range (white-to-dark, e.g., from highlight to shadow) of the scene may have a profound impact on the color appearance. Likewise, the background surrounding the objects in a scene influences the judgement of human evaluators when assessing video quality using segmented content.

For the last shortcoming, an interesting direction to be pursued is the incorporation of spatial and temporal effects of the human visual system into color appearance models. For example, although foveal acuity is far better than peripheral acuity, many studies have shown that the near periphery resembles foveal vision for moving and flickering gratings. This is especially true for sensitivity to small vertical displacements and for detection of coherent movement in peripherally viewed random-dot patterns. Central foveal and peripheral vision are qualitatively similar in spatial-temporal visual performance, and this phenomenon has to be taken into account for color appearance modeling. Several studies have investigated spatial and temporal effects [64–67].

Several studies have shown that the human visual system is more sensitive to low frequencies than to high frequencies. Likewise, several studies have shown that the human visual system is less sensitive to noise in dark and bright regions than in other regions. Lastly, the human visual system is highly insensitive to distortions in regions of high activity (e.g., salient regions) and is more sensitive to distortions near edges (object contours) than in highly textured areas. Unfortunately, these spatial effects are not sufficiently taken into account by the CIECAM97s or CIECAM02 color appearance models. A new technical committee, the TC1-68 “Effect of stimulus size on colour appearance,” was created in 2005 to compare the appearance of small and large uniform stimuli on a neutral background. Even if numerous papers have been published on this topic, in particular in the proceedings of the CIE Expert Symposium on Visual Appearance organized in 2006 [68–71], there is a need for further research on spatial effects.


with these models is that the interactions between individual pixels are mostly ignored. To deal with this problem, spatial appearance models have been developed, such as the iCAM [64], which takes into account both spatial and color properties of the stimuli and viewing conditions. The goal in developing the iCAM was to create a single model applicable to image appearance, image rendering, and image quality specifications and evaluations. This model was built upon previous research in uniform color spaces, the importance of image surround, algorithms for image difference and image quality measurement [72], insights into observers' eye movements while performing various visual imaging tasks, adaptation to natural scenes, and an earlier model of spatial and color vision applied to color appearance problems and high dynamic range (HDR) imaging.

The iCAM model has a sound theoretical background; however, it is based on empirical equations rather than on a standardized color appearance model such as CIECAM02, and some parts are still not fully implemented. It is quite efficient in dealing with still images but it needs to be improved and extended for video appearance [64]. Moreover, the filters implemented are only spatial and cannot contribute to color rendering improvement for mesopic conditions with high contrast ratios and a large viewing field. Consequently, the concept of and the need for image appearance modeling are still under discussion in Division 1 of the CIE, in particular in the TC 1-60 “Contrast Sensitivity Function (CSF) for Detection and Discrimination.” Likewise, how to define and predict the appearance of a complex image is still an open question.

Appreciating the principles of color image appearance and more generally the principles of visual appearance opens the door for improving color image processing algorithms. For example, the development of emotional models related to color perception should contribute to the understanding of color and light effects in images (see CIE Color Reportership R1-32 “Emotional Aspects of Color”). Another example is that the development of measurement scales that relate to the perceived texture should help to analyze textured color images. Likewise, the development of measurement scales that relate to the perceived gloss should help to describe perceived colorimetric effects. Numerous studies have been done on the “science” of appearance in the CIE Technical Committee TC 1-65 “Visual Appearance Measurement.”

4.3. Color difference metrics

Beyond the problem of describing color appearance, there is also the problem of measuring color differences in a color space. The CIEDE2000 color difference formula was standardized by the CIE in 2000 in order to compensate for some errors in the CIELAB and CIE94 formulas [73]. Unfortunately, the CIEDE2000 color difference formula suffers from mathematical discontinuities [74].

In order to develop/test new color spaces with Euclidean color difference formulas, new reliable experimental datasets need to be used (e.g., using visual displays, under illuminating/viewing conditions close to the “reference conditions” suggested for the CAM). This need has recently been expressed by the Technical Committee CIE TC 1-55 “Uniform color space for industrial color difference evaluation” [75]. The aim of this TC is to propose “a Euclidean color space where color differences can be evaluated for reliable experimental data with better accuracy than the one achieved by the CIEDE2000 formula.” (See recent studies of the TC1-63 “Validity of the range of the CIEDE2000” and R1-39 “Alternative Forms of the CIEDE2000 Colour-Difference Equations.”)

The usual color difference formulas, such as the CIEDE2000 formula, have been developed to predict color difference under specific illuminating/viewing conditions close to the “reference conditions.” Conversely, the CIECAM97s and CIECAM02 color appearance models have been developed to predict the change of color appearance under various viewing conditions. The CIECAM97s and CIECAM02 models involve seven attributes: brightness (Q), lightness (J), colorfulness (M), chroma (C), saturation (s), hue composition (H), and hue angle (h).

Lastly, let us note that while the CIE L∗a∗b∗ ΔE metric can be seen as a Euclidean color metric, the S-CIELAB space has the advantage of taking into account the differences in sensitivity of the HVS in the spatial domain, for example between homogeneous and textured areas.
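To make the notion of a Euclidean color metric concrete, the CIE L∗a∗b∗ ΔE difference of 1976 is simply the straight-line distance between two (L∗, a∗, b∗) triples. The sketch below assumes the two colors are already expressed in CIELAB; the function name `delta_e_ab` and the sample values are illustrative only, and the spatial filtering that S-CIELAB adds before this step is not shown.

```python
import math

def delta_e_ab(lab1, lab2):
    """Euclidean distance between two (L*, a*, b*) triples (CIE 1976 color difference)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab1, lab2)))

# Two hypothetical colors with equal lightness that differ in a* by 3 and in b* by 4.
d = delta_e_ab((50.0, 10.0, 10.0), (50.0, 13.0, 14.0))
```

Formulas such as CIE94 and CIEDE2000 depart from this plain distance by weighting the lightness, chroma, and hue contributions differently, which is why they are no longer Euclidean in CIELAB.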

5. COLOR IMAGE PROCESSING

The following subsections focus on the most recent trends in quantization, filtering and enhancement, segmentation, coding and compression, watermarking, and lastly on multispectral color image processing. Several state-of-the-art surveys on various aspects of image processing have been published in the past. Rather than describing the general problems of these topics, we focus on color specificities in advanced topics.

5.1. Color image quantization

The ultimate goal of a quantization method is to build a set of representative colors such that the perceived difference between the original image and the quantized one is as small as possible. The definition of relevant criteria to characterize the perceived image quality is still an open problem. One criterion commonly used by quantization algorithms is the minimization of the distance between each input color and its representative. Such a criterion may be measured by the total squared error, whose minimization reduces the distances within each cluster. A dual approach tries to maximize the distance between clusters. Note that the distance of each color to its representative depends on the color space in which the mean squared error is computed. Several strategies have been developed to quantize a color image; among them, vector quantization (VQ) is the most popular. VQ can also be used as an image coding technique that achieves a high data compression ratio [76].
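The total-squared-error criterion can be sketched with a plain k-means (Lloyd) iteration, which alternates between assigning each color to its nearest representative and moving each representative to the mean of its cluster. This is a generic illustration rather than any specific algorithm from the cited literature; the function name `quantize_kmeans`, the random initialization, and the fixed iteration count are assumptions, and the error is computed in whatever color space the input pixels are given in.

```python
import numpy as np

def quantize_kmeans(pixels, k, iters=20, seed=0):
    """Build a k-color palette that locally minimizes the total squared error.

    pixels: array of shape (n, 3), one row per color sample.
    Returns (palette, labels, total_squared_error).
    """
    pixels = np.asarray(pixels, dtype=float)
    rng = np.random.default_rng(seed)
    # Start from k distinct samples drawn from the data.
    palette = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every pixel to every palette color.
        d2 = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old representative if a cluster is empty
                palette[j] = members.mean(axis=0)
    d2 = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    tse = d2[np.arange(len(pixels)), labels].sum()
    return palette, labels, tse

# Hypothetical data: two tight groups of colors, quantized to two representatives.
pixels = np.array([[0, 0, 0], [2, 2, 2], [250, 250, 250], [252, 252, 252]], dtype=float)
palette, labels, tse = quantize_kmeans(pixels, k=2)
```

Because the error is a squared Euclidean distance, running the same code on pixels converted to a more perceptually uniform space changes which palette is considered best, which is the dependence on the color space noted above. The distance table has n × k entries, so a real image would normally be subsampled or processed in chunks.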


bit depth, even cell phones. Image quantization algorithms are considered much less useful today due to the increasing power of most digital imaging devices and the decreasing cost of memory. The future of color quantization is not in the displays community, since the bit depth of all tri-primary displays is currently at least 24 bits (or higher, e.g., 48 bits). Conversely, the future of color quantization will be guided by the image processing community, since typical color imaging processes such as compression, watermarking, filtering, segmentation, or retrieval rely on quantization.

It has been demonstrated that the quality of a quantized image depends on the image content and on the gray levels of the color palette (LUT); likewise, the quality of a compression or watermarking process based on a quantization process depends on these features [77]. In order to illustrate this aspect, let us consider the problem of color image watermarking. Several papers have proposed a color watermarking scheme based on a quantization process. Among them, Pei and Chen [78] proposed an approach which embeds two watermarks in the same host image: one on the a∗b∗ chromatic plane, with a fragile message, by modulating the indexes of a color palette obtained by color quantization; the other on the L∗ lightness component, with a robust message, using a gray-level palette also obtained by quantization. Chareyron et al. [79] proposed a vector watermarking scheme which embeds one watermark in the xyY color space by modulating the color values of pixels previously selected by color quantization. This scheme is based on the minimization of color changes between the watermarked image and the host image in the L∗a∗b∗ color space.

5.2. Color image filtering and enhancement

The function of a filtering and signal enhancement module is to transform a signal into another more suitable for a given processing task. As such, filters and signal enhancement modules find applications in image processing, computer vision, telecommunications, geophysical signal processing, and biomedicine. However, the most popular filtering application is the process of detecting and removing unwanted noise from a signal of interest, such as color images and video sequences. Noise affects the perceptual quality of the image, decreasing not only the appreciation of the image but also the performance of the task for which the image was intended. Therefore, filtering is an essential part of any image processing system, whether the final product is used for human visual inspection or for automatic analysis.

In the past decade, several color image processing algorithms have been proposed for filtering and noise reduction, targeting, in particular, additive impulsive and Gaussian noise, speckle noise, additive mixture noise, and stripping noise. A comprehensive class of vector filtering operators has been proposed, researched, and developed to effectively smooth noise, enhance signals, detect edges, and segment color images [80]. The proposed framework, which has supplanted previously proposed solutions, appears to report the best performance to date and has led to the introduction of a number of variants inspired by the framework of [81], such as those reported in [82–90].

Most of these solutions are able to outperform classical rank-order techniques. However, they do not produce convincing results for additive noise [89] and fall short of delivering the performance reported in [80]. It should be added at this point that classical color filters are designed to perform a fixed amount of smoothing, so that they are not able to adapt to local image statistics [89]. Conversely, adaptive filters are designed to filter only those pixels that are likely to be noisy while leaving the rest of the pixels unchanged. For example, Jin and Li [88] proposed a “switching” filter which better preserves thin lines, fine details, and image edges. Other filtering techniques, able to suppress impulsive noise and keep image structures by modifying the importance of the central pixel in the filtering process, have also been developed [90]. They provide better detail preservation while the impulses are reduced [90]. A disadvantage of these techniques is that some parameters have to be tuned in order to achieve an appropriate performance. To solve this problem, a new technique based on a fuzzy metric has recently been developed, in which an adaptive parameter is automatically determined at each image location by using local statistics [90]. This new technique is a variant of the filtering technique proposed in [91]. Numerous filtering techniques also use morphological operators, wavelets, or partial differential equations [92, 93].
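As a point of reference for the classical rank-order filters mentioned above, the following minimal sketch implements the well-known vector median filter: within each window, the output is the color vector whose summed Euclidean distance to all other vectors in the window is smallest. It applies a fixed amount of smoothing everywhere and is not one of the adaptive or switching designs of [80], [88], or [90]; the function name, the square window, and the edge-replication padding are assumptions.

```python
import numpy as np

def vector_median_filter(image, radius=1):
    """Vector median filter over a (2*radius+1)^2 window with edge replication.

    image: array of shape (height, width, channels).
    """
    img = np.asarray(image, dtype=float)
    h, w, c = img.shape
    padded = np.pad(img, ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    out = np.empty_like(img)
    size = 2 * radius + 1
    for y in range(h):
        for x in range(w):
            window = padded[y:y + size, x:x + size].reshape(-1, c)
            # Aggregate Euclidean distance from each sample to all others in the window.
            dists = np.linalg.norm(window[:, None, :] - window[None, :, :], axis=2).sum(axis=1)
            out[y, x] = window[dists.argmin()]
    return out

# Hypothetical test image: a flat gray patch corrupted by a single red impulse.
noisy = np.full((5, 5, 3), 100.0)
noisy[2, 2] = [255.0, 0.0, 0.0]
clean = vector_median_filter(noisy)
```

Because the output is always one of the input vectors, the filter never creates new colors, unlike filtering each channel separately. A switching variant would first test whether the central pixel looks like an outlier and replace it only in that case.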

Several research groups worldwide have been working on these problems, although none of the proposed solutions seems to outperform the adaptive designs reported in [80]. Nevertheless, there is room for improvement in existing vector image processing to achieve a tradeoff between detail preservation (e.g., edge sharpness) and noise suppression. The challenge of color image denoising results mainly from two aspects: the diversity of the noise characteristics and the nonstationary statistics of the underlying image structures [87].

The main problem these groups have to face is how to evaluate the effectiveness of a given algorithm. As for other image processing algorithms, the effectiveness of an algorithm is image-dependent and application-dependent. Although there is no universal method for color image filtering and enhancement, the design criteria accompanying the frameworks reported in [80, 81, 86] appear to offer the best guidance to researchers and practitioners.

5.3. Color image segmentation


this reason, considerable care is taken to improve the state-of-the-art in color image segmentation. The latest survey on color image segmentation techniques was published in 2007 by Paulus [94]. This survey discussed the advantages and disadvantages of classical segmentation techniques, such as histogram thresholding, clustering, edge detection, region-based methods, vector region-based methods, fuzzy techniques, as well as physics-based methods. Since then, physics-based methods as well as those based on fuzzy logic concepts appear to offer the most promising results. Methodologies utilizing active contour concepts [95] or hybrid methods combining global information, such as image histograms, with local information, such as regions and edge information [96, 97], appear to deliver efficient results.

Color image segmentation is a rather demanding task, and developed solutions have to deal effectively with image shadows, illumination variations, and highlights. Amongst the most promising lines of work in the area is the computation of image invariants that are robust to photometric effects [54, 98, 99]. Unfortunately, too many color invariant models have been introduced in the open literature, which makes it quite difficult to select the best model and to combine it with local image structures (e.g., color derivatives) in order to produce the best result. In [100], Gevers et al. survey the possible solutions available to the practitioner. In specific applications, shadow, shading, illumination, and highlight edges have to be identified and processed separately from geometrical edges such as corners and T-junctions. To address the issue, local differential structures and color invariants in a multidimensional feature space were used in [100] to detect salient image structures (i.e., edges) on the basis of their physical nature. In [101], the authors proposed a classification of edges into five classes, namely, object edges, reflectance edges, illumination/shadow edges, specular edges, and occlusion edges, to enhance the performance of the segmentation solution utilized.

Shadow segmentation is of particular importance in applications such as video object extraction and tracking. Several research proposals have been developed in an attempt to detect a particular class of shadows in video images, namely, moving cast shadows, based on the shadow’s spectral and geometric properties [102]. The problem is that cast shadow models cannot be effectively used to detect other classes of shadows, such as self-shadows or shadows in diffuse penumbra [102], suggesting that existing shadow segmentation solutions could be further improved using invariant color features.

Presently, the main focus of the color image processing community appears to be the fusion of several low-level image features so that image content can be better described and processed. Several researchers have provided solutions to combine color derivative features and color invariant features, color features and other low-level features (e.g., color and texture [103], color and shape [100]), and low-level features and high-level features (e.g., from graph representation [104]). However, none of the proposed solutions appears to provide the expected performance, leading to solutions that borrow ideas and concepts from sister signal processing communities. For example, in [105] the authors propose the utilization of color masks and MPEG-7 descriptors in order to segment prespecified target objects in video sequences. According to this solution, available a priori information on specified target objects, such as skin color features in head-and-shoulder sequences, is used to automatically segment these objects by focusing on a small part of the image. In the opinion of the authors, the future of color image segmentation solutions will heavily rely on the development and use of intermediate-level features derived using saliency descriptors and on the use of a priori information.

Color segmentation can be used in numerous applications, such as skin detection. Skin detection plays an important role in a wide range of image processing applications ranging from face detection, face tracking, and content-based image retrieval systems to various human computer interaction domains [106–109]. A survey of skin modeling and classification strategies based on color information was published by Kakumanu et al. in 2007 [108].

5.4. Color coding and compression

A number of video coding standards have been developed, namely ITU-T H.261, H.263, ISO/IEC MPEG-1, MPEG-2, MPEG-4, and H.264/AVC, and deployed in multimedia applications such as video conferencing, storage video, video-on-demand, digital television broadcasting, and Internet video streaming [110]. In most of the developed solutions, color has played only a peripheral role. However, in the opinion of the authors, video coding solutions could be further improved by utilizing color and its properties. Most of the traditional video coding techniques are based on the hypothesis that the so-called luminance component, that is, the Y channel in the YCbCr color space representation, provides meaningful textural details which can deliver acceptable performance without resorting to the use of chrominance planes. This fundamental design assumption explains the use of models with separate luminance and chrominance components in most transform-based video coding solutions. In [110], the authors suggested the utilization of the same distribution function for both the luminance and chrominance components, demonstrating the effectiveness of a nonseparable color model both in terms of compression ratio and compressed sequence picture quality.

Unfortunately, most codecs use different chroma subsampling ratios as appropriate to their compression needs. For example, video compression schemes for Web and DVD use a 4 : 2 : 0 color sampling pattern, and the DV standard uses a 4 : 1 : 1 sampling ratio. A common problem when an end user wants to watch a video stream encoded with a specific codec is that if the exact codec is not present and properly installed on the user’s machine, the video will not play (or will not play optimally). Spatial and temporal downsampling may also be used to reduce the raw data rate before the basic encoding process. The most popular of such transforms is the 8×8 discrete cosine transform (DCT).
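The effect of a 4 : 2 : 0 sampling pattern on the raw data rate can be shown with a short sketch: the luminance plane is kept at full resolution, while each chrominance plane is halved both horizontally and vertically. The conversion below uses the full-range BT.601 coefficients found in JFIF/JPEG; the function names, the averaging of each 2×2 block, and the assumption of even image dimensions are choices made for illustration, since real codecs differ in their downsampling filters and chroma sample positions.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Full-range BT.601 conversion (the JFIF convention) for values in [0, 255]."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block; height and width are assumed to be even."""
    h, w = plane.shape
    return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Hypothetical 4x4 mid-gray image: 48 RGB samples before subsampling.
rgb = np.full((4, 4, 3), 128.0)
y, cb, cr = rgb_to_ycbcr(rgb)
cb420, cr420 = subsample_420(cb), subsample_420(cr)
kept = y.size + cb420.size + cr420.size
```

For this pattern the number of stored samples drops from three per pixel to one and a half per pixel. A 4 : 1 : 1 pattern reaches the same total by keeping every chroma row but only one chroma sample for every four pixels along a row.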


very fast decoding/encoding, progressive transmission, low computational complexity, low dynamic memory requirement, and so forth [111]. The recent survey of [112] summarized color image compression techniques based on subband transform coding principles. The discrete cosine transform (DCT), the discrete Fourier transform (DFT), the Karhunen-Loeve transform (KLT), and the wavelet tree decomposition were reviewed. The authors proposed a rate-distortion model to determine the optimal color components and the optimal bit allocation for the compression. It is interesting to note that these authors demonstrated that the YUV, YIQ, and KLT color spaces are not optimal for reducing bit allocation. There has also been great interest in vector quantization (VQ), because VQ provides a high compression ratio and, by increasing the vector length and codebook size, can achieve better performance than any other block coding technique. Lin and Chen extended this technique by developing a spread neural network with penalized fuzzy c-means (PFCM) clustering technology based on interpolative VQ for color image compression [113].

In [114], Dhara and Chanda surveyed color image compression techniques that are based on block truncation coding (BTC). The authors’ recommendations to increase the performance of BTC include a proposal to reduce the interplane redundancy between color components prior to applying pattern fitting (PF) on each of the color planes separately. The work includes recommendations on determining the size of the pattern book, the number of levels in patterns, and the block size based on the entropy of each color plane. The resulting solution offers competitive coding gains at a fraction of the coding/decoding time required by existing solutions such as JPEG. In [115], the authors proposed a color image coding strategy which combines localized spatial correlation and intercolor correlation between color components in order to build a progressive-transmission, cost-effective solution. Their idea is to exploit the correlation between color components instead of decorrelating color components before applying the compression. Inspired by the huge success of set-partitioning sorting algorithms such as SPIHT or SPECK, there has also been extensive research on color image coding using the zerotree structure. For example, Nagaraj et al. proposed a color set partitioned embedded block coder (CSPECK) to handle color still images in the YUV 4 : 2 : 0 format [111]. By treating all color planes as one unit at the coding stage, the CSPECK generates a single mixed bit-stream so that the decoder can reconstruct the color image with the best quality at that bit-rate.

Although it is a known fact that interframe-based coding schemes (such as MPEG), which exploit the redundancy in the temporal domain, outperform intrabased coding schemes (like Motion JPEG or Motion JPEG2000) in terms of compression ratio, intrabased coding schemes have their own set of advantages, such as embeddedness, frame-by-frame editing, arbitrary frame-by-frame extraction, and robustness to bit errors in error-prone channel environments, which the former schemes fail to provide [111]. Nagaraj et al. exploited this observation to extend CSPECK for coding video frames by using an intrabased setting of the video sequences. They called this scheme Motion-SPECK and compared its performance on QCIF and CIF sequences against Motion-JPEG2000. The intended applications of such a video coder would be high-end and emerging video applications such as high-quality digital video recording systems and professional broadcasting systems.

In general, to automatically measure the quality of a compressed video sequence, the PSNR is computed on multimedia videos consisting of CIF and QCIF video sequences compressed at various bit rates and frame rates [111, 116]. However, the PSNR has been found to correlate poorly with subjective quality ratings, particularly at low bit rates and low frame rates. To address this problem, Ong et al. proposed an objective video quality measurement method that correlates better with human perception than the PSNR and the video structural similarity method [116]. On the other hand, Süsstrunk and Winkler reviewed the typical visual artifacts that occur due to high compression ratios and/or transmission errors [117]. They discussed no-reference artifact metrics for blockiness, blurriness, and colorfulness. In our opinion, objective video quality metrics will be useful for weighting the frame rate of coding algorithms with regard to content richness fidelity, distortion invisibility, and so forth. In this area, numerous studies have been carried out, but few of them have focused on color information (see Section 6.5).
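For reference, the PSNR discussed above is derived from the mean squared error between a reference frame and its compressed version. The sketch below assumes 8-bit data with a peak value of 255 and averages the per-frame PSNR over a sequence; the function names and the choice of averaging per-frame values (rather than pooling the error over the whole sequence) are assumptions made for illustration.

```python
import numpy as np

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two arrays of identical shape."""
    ref = np.asarray(reference, dtype=float)
    dis = np.asarray(distorted, dtype=float)
    mse = np.mean((ref - dis) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)

def sequence_psnr(ref_frames, dis_frames, peak=255.0):
    """Mean of the per-frame PSNR values over a sequence of frames."""
    return float(np.mean([psnr(r, d, peak) for r, d in zip(ref_frames, dis_frames)]))

# Hypothetical frame pair: every sample of the decoded frame is off by one level.
ref = np.full((8, 8, 3), 100.0)
dec = ref + 1.0
value = psnr(ref, dec)
```

Since the measure depends only on pixel-wise differences, two frames with the same PSNR can look very different to an observer, which is consistent with the poor correlation with subjective ratings reported above.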

Lastly, it is interesting to note that even if the goals of compression and data hiding methods are by definition contradictory, these methods can be used jointly. While data hiding methods add perceptually irrelevant information in order to embed data, compression methods remove this irrelevancy and redundancy to reduce storage requirements. In the opinion of the authors, the future of color image compression will rely heavily on the development of joint methods combining compression and data hiding. For example, Lin and Chen proposed a color image hiding scheme which first compresses color data by an interpolative VQ scheme (IVQ), then encrypts color IVQ indices, sorts the codebooks of secret color image information, and embeds them into the frequency domain of the cover color image by the Hadamard transform (HT) [113]. In another approach, Chang et al. [118] proposed a reversible hiding scheme which first compresses color data by a block-truncation coding scheme (BTC), then applies a genetic algorithm to reduce the number of binary bitmaps from three to one, and embeds the secret bits using the common bitmap and the three quantization levels of each block. According to Chang et al., unlike VQ, which relies on a codebook, BTC never requires any auxiliary information during the encoding and decoding procedures. In addition, BTC-compressed images usually maintain acceptable visual quality, and the output can be compressed further by using other lossless compression methods.
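To make the BTC representation mentioned above concrete, the following sketch encodes one block of a single color channel as two quantization levels and a binary bitmap. It is a simplified variant in which the two levels are the means of the pixels below and above the block mean (often called absolute-moment BTC), not the scheme of Chang et al.; the function names are hypothetical. Applied independently to the R, G, and B channels, it yields three bitmaps per block, which is the redundancy a common-bitmap approach tries to remove.

```python
def btc_encode_block(block):
    """Encode a flat list of pixel values as (low, high, bitmap)."""
    mean = sum(block) / len(block)
    # 1 marks pixels at or above the block mean, 0 marks the others.
    bitmap = [1 if p >= mean else 0 for p in block]
    high_pixels = [p for p, b in zip(block, bitmap) if b]
    low_pixels = [p for p, b in zip(block, bitmap) if not b]
    # The maximum pixel is always >= the mean, so high_pixels is never empty.
    high = round(sum(high_pixels) / len(high_pixels))
    # A flat block has no pixel below the mean; reuse the high level.
    low = round(sum(low_pixels) / len(low_pixels)) if low_pixels else high
    return low, high, bitmap


def btc_decode_block(low, high, bitmap):
    """Rebuild the block: each pixel becomes one of the two levels."""
    return [high if b else low for b in bitmap]


# Hypothetical 2x2 block of one channel, stored row by row.
low, high, bitmap = btc_encode_block([10, 12, 200, 202])
decoded = btc_decode_block(low, high, bitmap)
```

The decoder needs only the two levels and the bitmap of each block, with no shared codebook, which matches the remark above that BTC requires no auxiliary information.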

5.5. Color image watermarking

in 2007 [5]. In watermarking, we tend to watermark the perceptually significant part of the image to ensure robustness rather than to provide fidelity (except for fragile watermarks and authentication). Therefore, the whole challenge is how to embed more and more significant information imperceptibly while keeping the distortion minimal. On one hand, this relies upon cryptographic techniques, and on the other, upon the integration of HSV models. Most watermarking schemes use either one or two perceptual components, such as color and frequency components. The issue is how to combine the individual components so that a watermark with increased robustness and adequate imperceptibility is obtained [119,120].

Most of the recently proposed watermarking techniques operate on the spatial color image domain. The main advantage of spatial domain watermarking schemes is that their computational cost is lower than that of watermarking solutions operating on the transform image domain. One of the first spatial-domain watermarking schemes, the so-called least significant bit (LSB) scheme, was based on the principle of inserting the watermark in the low-order bits of the image pixels. Unfortunately, LSB techniques are highly sensitive to noise, and their watermarks can be easily removed. Moreover, as LSB solutions applied to color images use color transforms which are not reversible on a fixed-point processor, the watermark can be destroyed and the original image cannot be recovered, even if only the least significant bits are altered [121]. This problem is not specific to LSB techniques; it concerns any color image watermarking algorithm based on nonreversible forward and inverse color transforms on a fixed-point processor. Another problem with LSB-based methods is that most of them are built for raw image data rather than for the compressed image formats that are usually used across the Internet today [118]. To address this problem, Chang et al. proposed a reversible hiding method based on block truncation coding of compressed color images. The reversibility of this scheme is based on the order of the quantization levels of each block and on a property of natural images, namely, that adjacent pixels are usually similar.
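The LSB principle and its fragility under a nonreversible color transform can be illustrated with a short sketch. The code below is an assumption-laden example, not the method of [121]: it uses the full-range JFIF RGB/YCbCr equations with rounding to 8-bit integers as a stand-in for a fixed-point color transform, and all function names are hypothetical.

```python
def embed_lsb(samples, bits):
    """Write one watermark bit into the LSB of each of the first len(bits) samples."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the watermark")
    marked = [(s & ~1) | b for s, b in zip(samples, bits)]
    return marked + list(samples[len(bits):])


def extract_lsb(samples, n_bits):
    """Read the watermark back from the LSBs of the first n_bits samples."""
    return [s & 1 for s in samples[:n_bits]]


def clamp8(x):
    """Round to the nearest integer and clip to the 8-bit range."""
    return max(0, min(255, int(round(x))))


def rgb_to_ycbcr(r, g, b):
    # Full-range JFIF equations (assumed here), rounded to 8-bit integers.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(cb), clamp8(cr)


def ycbcr_to_rgb(y, cb, cr):
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return clamp8(r), clamp8(g), clamp8(b)


# Hypothetical gray pixel (R, G, B) carrying the watermark bits 0, 1, 0.
pixel = [100, 100, 100]
marked = embed_lsb(pixel, [0, 1, 0])
# Forward and inverse transform with integer rounding in between.
restored = list(ycbcr_to_rgb(*rgb_to_ycbcr(*marked)))
bits_before = extract_lsb(marked, 3)
bits_after = extract_lsb(restored, 3)
```

In this example the marked pixel (100, 101, 100) is rounded to (Y, Cb, Cr) = (101, 128, 128) and comes back as (101, 101, 101), so the extracted bits change from 0, 1, 0 to 1, 1, 1 although each sample moved by at most one level.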

In the authors’ opinion, watermarking quality can be improved through the use of appearance models and color saliency maps. As a direction for future research, it will be interesting to examine how to combine the various saliency maps that influence visual attention, namely, the intensity map, contrast map, edginess map, texture map, and location map [119,122,123].

Generally, when a new watermarking method is proposed, some empirical results are provided so that performance claims can be validated. However, at present there is no systematic framework or body of standard metrics and testing techniques that allows for a systematic comparative evaluation of watermarking methods. Even for benchmarking systems such as Stirmark or Checkmark, comparative evaluation of performance is still an open question [122]. From a color image processing perspective, the main weakness of these benchmarking techniques is that they are limited to gray-level images. Thus, in order to compute the fidelity between an original and a watermarked image, color images have to be converted to grayscale images. Moreover, such benchmarks use a black-box approach to compute the performance of a given scheme: they first compute various performance metrics, which they then combine to produce an overall performance score. According to Wilkinson [122], a number of separate performance metrics must be computed to describe the performance of a watermarking scheme more fully. Likewise, Xenos et al. [119] proposed a model based on four quality factors and approximately twenty criteria organized hierarchically in three levels of analysis (i.e., high level, middle level, and low level). According to this recommendation, four major factors are considered as part of the evaluation procedure, namely, high-level properties, such as the image type; color-related information, such as the depth and basic colors; color features, such as the brightness, saturation, and hue; and regional information, such as the contrast, location, size, and color of image patches. In the opinion of the authors, it will be interesting to undertake new investigations towards the development of a new generation of comprehensive benchmarking systems capable of measuring the quality of the watermarking process in terms of color perception.

As with the solutions developed for still color images, the development of quality metrics that can accurately and consistently measure the perceptual differences between original and watermarked video sequences is a key technical challenge. Winkler [124] showed that video quality metrics (VQM) could automatically predict the perceptual quality of video streams for a broad variety of video applications. In the authors’ opinion, these metrics could be refined through the use of high-level color descriptors. Unfortunately, very few works have been reported in the literature on the objective evaluation of the quality of watermarked videos.

5.6. Multispectral color image processing

A multispectral color imaging system is a system which captures and describes color information with a greater number of sensors than an RGB device, resulting in a color representation that uses more than three parameters. The problem with conventional color imaging systems is that they have some limitations, namely, their dependence on the illuminant and on the characteristics of the imaging system. In contrast, multispectral color imaging systems, being based on spectral reflectance, are device and illuminant independent [7,30,31].
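The illuminant dependence of conventional sensor responses can be illustrated with the usual discrete image-formation model, in which each channel response is the sum over wavelength samples of sensor sensitivity times illuminant power times surface reflectance. The sketch below assumes this model; the four-sample spectra, the three channel sensitivities, and the function name `sensor_responses` are hypothetical values chosen only for illustration.

```python
def sensor_responses(reflectance, illuminant, sensitivities):
    """One response per channel, summed over the wavelength samples."""
    return [
        sum(s * e * r for s, e, r in zip(channel, illuminant, reflectance))
        for channel in sensitivities
    ]


# Hypothetical spectra sampled at four wavelengths.
reflectance = [0.2, 0.4, 0.6, 0.8]      # surface property, illuminant independent
flat_light = [1.0, 1.0, 1.0, 1.0]       # equal-energy illuminant
warm_light = [0.5, 0.75, 1.0, 1.25]     # more power at long wavelengths

# Hypothetical three-channel (RGB-like) sensitivities, one row per channel.
rgb_sensitivities = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]

under_flat = sensor_responses(reflectance, flat_light, rgb_sensitivities)
under_warm = sensor_responses(reflectance, warm_light, rgb_sensitivities)
```

The same surface yields different three-channel responses under the two illuminants, whereas the reflectance vector itself does not change; a system with more channels samples the spectrum more finely and can therefore approximate the reflectance more closely.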

During the last few years, the importance of multispectral imagery has sharply increased following the development of new optical devices and the introduction of new applications. Trichromatic RGB color imaging is becoming unsatisfactory not only for many advanced applications but also for the interfacing of input/output devices and for color rendering in imaging systems. Color imaging must become spectrophotometric; multispectral color imaging is therefore the technique of the immediate future.