Open research topics - Classifiers and machine learning techniques for image processing and com

When performing data-hiding in digital images, we have an additional problem: images are expected to be subjected to many operations, ranging from simple transformations, such as translations, to nonlinear transformations, such as blurring, ﬁltering, lossy compression, printing, and rescanning. The hidden messages should survive all attacks that do not degrade the image’s perceived quality [6].

Steganography’s main problem involves designing robust information-hiding techniques. It is crucial to derive approaches that are robust to geometrical attacks as well as nonlinear transformations, and to ﬁnd detail-rich regions in the image that do not lead to artifacts in the hiding process. The hidden messages should not degrade the perceived quality of the work, implying the need for good image-quality metrics.

Hiding techniques often rely on private key sharing, which involves previous communication. It is important to work on algorithms that use asymmetric key schemes.

If multiple messages are inserted in a single object, they should not interfere with each other [6].

We need new powerful Steganalysis techniques that can detect messages without prior knowl- edge of the hiding algorithm (blind detection). The detection of very small messages is also a signiﬁcant problem. Finally, we need adaptive techniques that do not involve complex training stages.

3.10 Conclusions

In this paper, we have presented an overview of the past few years of Steganography and Ste- ganalysis, we have showed some of the most interesting hiding and detection techniques, and we have discussed a series of applications on both topics.

Terrorism has inﬁltrated the public’s perception of this technology for a long period. Public fear created by mainstream press reports, which often featured US intelligence agents claiming that terrorists were using Steganography, created a mystique around data hiding techniques. Legislators in several US states have either considered or passed laws prohibiting the use and dissemination of technology to conceal data [61].

Six years after September 11th, 2001’s tragic incidents, Steganography and Steganalysis have become mature disciplines, and data hiding approaches have outlived their period of hype. Public perception should now move beyond the initial notion that these techniques are suitable only for terrorist-cells’ communications. Steganography and Steganalysis have many legitimate applications, and represent great research opportunities waiting to be addressed.

3.11 Acknowledgments

We thank the support of FAPESP (05/58103-3 and 07/52015-0) and CNPq (301278/2004 and 551007/2007-9). We also thank Dr. Valerie Miller for proof reading this article.

Imagens

No Cap´ıtulo 4, apresentamos uma abordagem para meta-descri¸cão de imagens denominada Randomiza¸cão Progressiva (PR) para nos auxiliar nos problemas de: (1) Deteçcão de mensagens escondidas em imagens digitais; e (2) Categoriza¸cão de imagens.

PR ´e um novo meta-descritor que captura as diferen¸cas entre classes gerais de imagens usando os artefatos estat´ısticos inseridos durante um processo de perturba¸c˜ao sucessiva das imagens analisadas.

Nossos experimentos mostram que esta técnica captura bem a separabilidade de algumas classes de imagens. A observa¸cão mais importante é que classes diferentes de imagens possuem comportamentos distintos quando submetidas a sucessivas perturba¸cões. Por exemplo, um conjunto de imagens que não possui mensagens escondidas apresenta diferentes artefatos mediante sucessivas perturba¸cões comparado com um conjunto de imagens que possui mensagens escondidas.

Testamos a técnica no contexto da análise forense de imagens para deteçcão de mensagens escondidas bem como para a classifica¸cão geral de imagens em categorias como indoors, outdoors, geradas em computador e obras de arte.

O trabalho apresentado no Cap´ıtulo 4 é uma compila¸cão de nosso artigo submetido à Elsevier Computer Vision and Image Understanding (CVIU). Após um estudo que mostrou viabilidade comercial de nossa técnica, conseguimos o depósito de uma patente nacional junto ao INPI e sua versão internacional junto ao PCT.

Finalmente, o trabalho de deteçcão de mensagens nos rendeu a publica¸cão [147] no IEEE Intl. Workshop on Multimedia and Signal Processing (MMSP). A extensão da técnica para o cenário multi-classe (indoors, outdoors, geradas em computador, e obras de arte) resultou o artigo [148] no IEEE Intl. Conference on Computer Vision (ICCV).

Progressive Randomization: Seeing

the Unseen

Abstract

In this paper, we introduce the Progressive Randomization (PR): a new image meta-description approach suitable for different image inference applications such as broad class Image Catego- rization and Steganalysis. The main difference among PR and the state-of-the-art algorithms is that it is based on progressive perturbations on pixel values of images. With such perturbations, PR captures the image class separability allowing us to successfully infer high-level information about images. Even when only a limited number of training examples are available, the method still achieves good separability, and its accuracy increases with the size of the training set. We validate our method using two different inference scenarios and four image databases.

4.1 Introduction

In many real-life applications, we have to make decisions based only on images. In this paper, we introduce a new image meta-description approach based only on information invisible to the naked eye. We apply and validate our new technique on two very distinct problems that use supervised learning: Image Categorization and Digital Steganalysis.

Image Categorization is the body of techniques that distinguish between image classes, point- ing out the global semantic type of an image. Here, we want to distinguish the class of an image (e.g., Indoors from Outdoors), or the type of an object in restricted domains (e.g., vegetables in a supermarket cashier). One possible scenario for a consumer application is to group a photo al- bum, automatically, according to classes. Another situation can be to automate a supermarket cashier. Common techniques in content-based image retrieval use color histograms and tex- ture [60], bag of features [62,68,100], and shape and layout measures [196] to perform queries in massive image databases. With our solution, we can improve these techniques by automatically restraining the search to one or more classes.

Digital Steganalysis is the body of techniques that attempts to distinguish between non-stego or cover objects, those that do not contain a hidden message, and stego-objects, those that contain a hidden message. Steganalysis is the opposite of Steganography: the body of techniques devised to hide the presence of communication. In turn, Steganography is different from Cryptography, that aims to make communication unintelligible for those that do not possess the correct ac- cess rights. Recently, Steganography has received a lot of attention around the world mainly because its possible applications which includes: feature location (identification of subcompo- nents within a data set), captioning, time-stamping, and tamper-proofing (demonstration that original contents have not been altered) [149]. Unfortunately, not all applications are harmless, and there are strong indications that Steganography has been used to spread child pornography pictures on the internet [64,113]. Hence, robust algorithms to detect the very existence of hidden messages in digital contents can help further forensic and police work. Discovering the content of the hidden message is a much more complex problem than Steganalysis, and involves solving the general problem of breaking a cryptographic code [188].

In this paper, we introduce the Progressive Randomization (PR): a new image meta- description approach suitable for diﬀerent image inference applications such as broad class Image Categorization and Steganalysis. This technique captures statistical properties of the images’ LSB channel, information that are invisible to the naked eye. With such perturbations, PR captures the image class separability allowing us to successfully infer high-level information about images.

The PR image meta-description approach has four stages: (1) the randomization process, that progressively perturbates the LSB value of a selected number of pixels; (2) the selection of feature regions, that makes global descriptors work locally; (3) the statistical descriptors analysis, that ﬁnds a set of measurements to describe the image; and (4) the invariance transformation, that allows us to make the descriptor’s behavior image independent.

With enough training examples, PR is able to categorize images as a full self-contained classiﬁcation framework. Even when only a limited number of training examples are available, the method still achieves good separability. The method also provides interesting properties for association with other image descriptors for scene reasoning purposes.

To validate our image meta-description approach for Image Categorization and Steganal- ysis, we have created two validation scenarios: (1) Image Categorization; and (2) Hidden Messages Detection. In the Image Categorization scenario, we have performed four experiments. In the first experiment, we show PR as a complete self-contained multi-class classification procedure. For that, we have used a 40,000-image database with 12,000 outdoors, 10,000 indoors, 13,500 art photographs, and 4,500 computer generated images (CGIs) with two different classification approaches: All Pairs majority voting of the binary classifier Bagging of Linear Discriminant Analysis (All-Pairs-BLDA), and SVMs [16]. In addition, we have tested the PR technique in three other categorization experiments: one to provide another interpretation of the first experiment, one for 3,354 FreeFoto images categorization into nine classes and finally, one for categorizaton of 2,950 image of fruits into 15 classes. In the Hidden Messages Dectection scenario, we have used the 40,000-image database first scenario to detect the very existence of

hidden messages in digital images. We have used the binary classiﬁers: Linear Discriminant Analysis with and without Bagging ensemble, and SVMs [16].

We organize the remainder of this paper as follows. In Section 4.2, we present the Image Categorization and Steganalysis’ state-of-the-art. In Section 4.3, we introduce the Progressive Randomization (PR) image meta-description approach. In Section 4.4, we validate our method for Image Categorization and, in Section 4.5, for Steganalysis, and compare our results with related work in the literature. In Section 4.6, we give a close study to the reasons of why PR works. Finally, we present the conclusions and remarks in Section 4.8.

In document Classifiers and machine learning techniques for image processing and computer vision (Page 88-92)