Image retrieval based system on Image Annotation

Volume 2, Issue 3, 2015

Available online at www.ijiere.com

International Journal of Innovative and Emerging Research in Engineering

e-ISSN: 2394 - 3343 p-ISSN: 2394 - 5494

Ms. Samiksha Dangre¹, Prof. Ms. Pooja Thakre²

¹ PG Student, M.Tech IVth sem, Department of VLSI, NUVA College of Engg., Nagpur, India
² HOD, Department of Electronics, NUVA College of Engg., Nagpur, India

ABSTRACT:

Image retrieval is concerned with retrieving images relevant to users’ queries from a large collection. Relevance is determined by the nature of the application. Here we propose an algorithm that retrieves images from a database based on image annotation. In the first stage, the images are annotated and the results are stored in an XML database. When a query keyword is submitted, it is matched against the annotations stored in the database to find similar images. The images whose annotations match the query word are returned as the output. The COREL dataset is used for the experiments.

Keywords: Image annotation, Image Retrieval, Corel database, xml database

I. INTRODUCTION

The increasing availability of multimedia information, combined with decreasing storage and processing costs, has changed the requirements on information systems drastically. Today, general purpose database systems are incorporating support for multimedia storage and retrieval, as well as features which used to be found in specialized imaging systems or multimedia databases. Increased use of multimedia has important implications for overall information system design regarding storage, processing, retrieval and transmission. Image retrieval is concerned with retrieving images relevant to users’ queries from a large collection. Relevance is determined by the nature of the application. In a fabric image database, relevant images may be those that match a sample in terms of texture and color. In a news photography database, the date, time, and occasion on which the photograph was taken may be just as important as the actual visual content. Image annotation is the process of assigning keywords to digital images depending on the content information. In one sense, it is a mapping from the visual content information to the semantic context information. Assignment of the words to images depends on several criteria.

1) Segmental Approaches: This group of studies considers the image as consisting of semantically meaningful parts and tries to find a probabilistic relation between the parts of the image and the keywords. For this purpose, images are segmented or parts are taken from the image and features are extracted from these parts.

2) Holistic Approaches: This group considers the image as a whole. Features are extracted from the whole image, and a relation is explored directly between the image and the annotation words.

Both approaches have advantages and disadvantages, which depend highly on the application domain. The first approach starts with segmentation, which is problematic in itself. It is based on the assumption that the annotation words of a given image can be found by considering the image regions. However, once the human information processing system is considered, it is clear that one needs to consider the whole image to obtain the concept information. Furthermore, annotating an image may not require segmentation at all. Even if it does, segmentation is an extremely difficult and unsolved problem, which introduces additional error into the annotation problem. The second approach avoids segmentation. However, it may not always be possible to extract meaningful words from the whole image represented by low-level visual features.

Automatic image annotation (AIA) plays an important role in image understanding and retrieval and attracts much research attention. Advances in information and communication technologies allow image archives to be created on a large scale. As a result, the size of image databases is increasing rapidly. Despite improvements in image search technology, whether in stand-alone systems or in Web image search, search by image content still has considerable room for improvement. Image annotation has been an active research topic in recent years due to its potentially high impact on Web image search. To access and retrieve images effectively, a widely adopted solution is to tag images with semantically meaningful keywords, which is called image annotation. There are three types of image annotation approaches: manual, automatic and semi-automatic.

in the training data. For each image, one has access to keywords assigned to the entire image and it is not known which regions of the image correspond to these keywords.

Issues Relevant To Image Annotation

Images are annotated to simplify access to them: metadata is added to the images to allow more effective searches. If the images are described by textual information, then text search techniques can be used to perform image searches [11]. However, the automated generation of metadata for images, called automatic image annotation, still needs improvement. Many researchers have proposed techniques that attempt to bridge the well-known semantic gap. Many of them have identified another problem, namely the dependency on the training dataset to learn the models [12]. Several researchers have surveyed image annotation in response to the growing need for annotated images. Jiayu [13] has classified image annotation approaches into statistical approaches, vector-space related approaches and classification approaches. The idea behind the holistic approach is to estimate the probabilities of image queries, which are then ranked according to these probabilities.

In this paper, we propose an image retrieval system based on automatic image annotation. Our method focuses on two challenges. First, we provide more descriptive signatures for images, which are extracted in real time. Second, our approach extracts prototypes for each category of images. Thus, we propose a new strategy that improves the performance of automatic image annotation by producing more representative prototypes with less computation.

The rest of the paper is organized as follows. Section 2 reviews the existing approaches used in automatic image annotation and image retrieval. The proposed structure is described in Section 3. The experimental results are presented in Section 4, followed by the conclusion in Section 5.

II. EXISTING APPROACHES

A large number of techniques for automatic image annotation have been proposed in the last decade. Most of these treat annotation as translation from image instances to keywords. The translation paradigm is typically based on some model of image and text co-occurrences. In 2011, Rami Albatal, Philippe Mulhem and Yves Chiaramella proposed “A New ROI Grouping Schema for Automatic Image Annotation” [1], in which regions of interest (ROI) are successfully used in automatic image annotation through Bag of Visual Words (BoVW) models. Their results indicate that this method outperforms the CBoVW method (+6.2% MAP) and encourage further analysis of topological visual phrases in order to find interesting patterns for object classes. Chang et al. [18] proposed a solution to the annotation problem using Bayes Point Machines (BPM). In their method, each training image is manually assigned a concept term from the lexicon, and the visual characteristics of the whole image are modelled using a 144-dimensional colour and texture feature vector. BPM is then used to train a classifier for each concept to determine the confidence score of assigning the concept to the image.

III. PROPOSED WORK

Problem formalization: For an image annotation task, we have n images $I = \{I_1, I_2, I_3, \ldots, I_n\}$ and K semantic classes $C = \{c_1, c_2, \ldots, c_K\}$. Each image $I_i$ is abstracted as a data point $x_i$, which is associated with a number of labels $L_i$ represented by a binary vector $y_i \in \{0, 1\}^K$, such that $y_i(k) = 1$ if $x_i$ belongs to the k-th class, and 0 otherwise.
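The binary label representation above can be illustrated with a minimal sketch. The class names and the helper `label_vector` are hypothetical and chosen only for illustration; they do not come from the paper.

```python
def label_vector(image_labels, classes):
    """Return a list y of length K with y[k] = 1 if the image has class k."""
    present = set(image_labels)
    return [1 if c in present else 0 for c in classes]


# Hypothetical semantic classes c_1..c_K (K = 4) and the labels of one image.
classes = ["sky", "tree", "water", "building"]
y = label_vector(["tree", "sky"], classes)
print(y)  # [1, 1, 0, 0]
```

The order of the entries follows the order of the class list, so the same list must be used for every image.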

A training set S consisting of N images in the set $I = \{I_i\}_{i=1}^{N}$ and their associated text documents in the set $T = \{T_i\}_{i=1}^{N}$, such that $S = \{(I_1, T_1), (I_2, T_2), \ldots, (I_N, T_N)\}$, is given. Each image in the dataset is described by a set of visual descriptors, $I_i = \{\delta_{i1}, \delta_{i2}, \ldots, \delta_{iD}\}$, where $\delta_{ij}$ is the feature vector representing the i-th image in the j-th description space. Each text document $T_i$ consists of a set of words, $T_i = \{w_{i1}, w_{i2}, \ldots, w_{iM}\}$, where $w_{im}$ corresponds to the m-th word of the i-th image, and $w_{im} \in W$, where $W = \{w_1, w_2, \ldots, w_L\}$ and L is the number of words in the dataset. Given a test image Q, the problem is to assign a document A, which is obtained from the elements of W, to Q. Each word in W is considered as a class label. Each image in I is associated with a set of words in the vocabulary W. All membership values for the image $I_i$ provided by annotator $A_j$ are represented by $P_{i,j}$, which is a vector constructed as follows:

$$P_{i,j} = [\, p_{1,i,j} \;\; p_{2,i,j} \;\; \ldots \;\; p_{l,i,j} \;\; \ldots \;\; p_{L,i,j} \,] \qquad (1)$$

Feature Extraction: If the input image is partitioned into m segments, we define the color signature of the input image as a discrete distribution $\{(Z_1, P_1), (Z_2, P_2), \ldots, (Z_m, P_m)\}$, where $Z_i$ is the mean vector of the i-th segment and $P_i$ indicates the probability of that segment, which can be estimated by the percentage of pixels associated with the i-th segment. The document of an image $I_i$, that is, the words of the image, is represented as $T_i = \{w_{i1}, \ldots, w_{im}\}$, and the j-th word of image $I_i$ is represented as $w_{ij}$. Each image is described with at least one word and at most M words, $1 \le m \le M$.
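As a rough illustration of the color signature, the sketch below computes the mean color vector and the pixel fraction of each segment. It assumes the segmentation is already available as one segment id per pixel; the paper does not say how segments are obtained, and the function name `color_signature` and the toy data are hypothetical.

```python
from collections import defaultdict


def color_signature(pixels, segment_ids):
    """Return [(Z_i, P_i)] per segment: mean colour vector and pixel fraction."""
    if len(pixels) != len(segment_ids):
        raise ValueError("pixels and segment_ids must have the same length")
    sums = {}
    counts = defaultdict(int)
    for color, seg in zip(pixels, segment_ids):
        if seg not in sums:
            sums[seg] = [0.0] * len(color)
        for channel, value in enumerate(color):
            sums[seg][channel] += value
        counts[seg] += 1
    total = len(pixels)
    signature = []
    for seg in sorted(sums):
        mean = tuple(s / counts[seg] for s in sums[seg])
        signature.append((mean, counts[seg] / total))
    return signature


# Toy image: four RGB pixels, already assigned to two segments (0 and 1).
pixels = [(255, 0, 0), (245, 10, 0), (0, 0, 255), (0, 0, 255)]
segment_ids = [0, 0, 1, 1]
for mean, prob in color_signature(pixels, segment_ids):
    print(mean, prob)
```

Because each pixel belongs to exactly one segment, the estimated probabilities sum to one.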

Concept Modeling: We propose to solve the annotation problem defined above by means of a hierarchical architecture which consists of two layers. In the first layer, called level-0, information from all visual description spaces is processed separately, and candidate annotation words and their membership values are estimated for a given image. In the second layer, called meta-level, the information provided by level-0 is considered and the most probable words are assigned to an unknown image.

Level-0 consists of annotators, which assign a membership value to each word in the vocabulary based on distinct low level visual features and the high level context information provided by the annotation words. In the meta-level, a set of final annotation words is selected by aggregating the results of level-0. In other words, meta-level processes the output of all level-0 annotators.
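The two-layer idea can be sketched as follows. The text does not specify how the meta-level aggregates the level-0 outputs, so averaging the membership vectors is an assumption made here for illustration only. The function name `meta_level`, the vocabulary and the membership values are likewise hypothetical.

```python
def meta_level(memberships, vocabulary, n_words=2):
    """Average the level-0 membership vectors and return the top n_words words.

    Averaging is an assumed aggregation rule; the paper does not fix one.
    """
    n_annotators = len(memberships)
    averaged = [
        sum(p[idx] for p in memberships) / n_annotators
        for idx in range(len(vocabulary))
    ]
    ranked = sorted(range(len(vocabulary)), key=lambda idx: averaged[idx], reverse=True)
    return [vocabulary[idx] for idx in ranked[:n_words]]


vocabulary = ["sky", "tree", "water", "building"]
# Hypothetical membership vectors from two level-0 annotators for one image,
# for example one working on colour features and one on texture features.
p_color = [0.6, 0.3, 0.1, 0.0]
p_texture = [0.2, 0.7, 0.0, 0.1]
print(meta_level([p_color, p_texture], vocabulary))  # ['tree', 'sky']
```

Other aggregation rules, such as a weighted sum or a trained classifier over the level-0 outputs, would fit the same interface.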

The annotation process is described in Algorithm 1.

Algorithm 1

Input:
    An image
    The weight vector w
    Feature vector of the test image t
    Feature matrix of training images F
    Keywords

Algorithm:
    for all rows in F
        Compute similarity array as si = (ti - fi) * wi
    end for
    Eliminate values that occur more than once
    Take a predefined number of values from s
    Compute the local frequency of keywords and rank them
    Transfer keywords according to their local frequency

Output:
    Annotated images with defined keywords
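A minimal Python sketch of Algorithm 1 is given below under stated assumptions. The similarity is read as a weighted sum of absolute feature differences, so smaller values mean closer images. The duplicate-elimination step is left out because the text does not say what exactly is deduplicated. The function name `annotate`, the parameters `k` and `n_words`, and the toy data are all hypothetical.

```python
from collections import Counter


def annotate(t, F, keywords, w, k=3, n_words=2):
    """Transfer keywords from the k closest training images to a test image.

    t        : feature vector of the test image
    F        : feature matrix of training images (one row per image)
    keywords : keyword list of each training image, in the same order as F
    w        : weight vector, one weight per feature
    """
    # Weighted distance between the test image and every training row
    # (absolute differences are an assumption; smaller means more similar).
    s = [sum(wj * abs(tj - fj) for wj, tj, fj in zip(w, t, row)) for row in F]

    # Take a predefined number (k) of the closest training images.
    nearest = sorted(range(len(F)), key=lambda idx: s[idx])[:k]

    # Local frequency of each keyword among those neighbours, ranked;
    # ties are broken alphabetically to keep the result deterministic.
    freq = Counter(word for idx in nearest for word in keywords[idx])
    ranked = sorted(freq, key=lambda word: (-freq[word], word))
    return ranked[:n_words]


F = [[9, 1], [7, 2], [1, 9], [8, 3]]
keywords = [["tree", "sky"], ["tree", "grass"], ["water", "sky"], ["tree"]]
print(annotate(t=[8, 2], F=F, keywords=keywords, w=[1.0, 0.5]))
```

In this toy example the third training image is far from the test image, so its keywords do not take part in the frequency count.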

IV. EXPERIMENTAL RESULTS

average of 10 results is reported. The number of words in the test set is 263, and 260 of them also occur in the training set. Thus, ideally it is possible to annotate with only 260 of the words. The number of annotation words associated with each image varies between one and five. Although this allows flexibility in the number of annotation words, it introduces a bias into the precision, recall and coverage percentage when measuring performance. The dictionary contains 795 words that appear in the dataset. In order to reduce the number of classes, we use the LabelMe MATLAB function LMaddtags to reduce the variability in the object labels used to describe the same object class. From Fig. 1 we can see that, in this dataset, each image has been segmented into several regions by different attributes and each region is associated with an annotation keyword, which makes the annotation process much easier.

Figure 1: Example images from the COREL dataset

In our approach, labels are assigned to a sample image by looking at the labels of its k nearest neighbors together with the distance of the neighbors from the sample image. The algorithm assigns high probability to words that appear in the close neighborhood. Based on image annotation, the output of image retrieval is as follows.

Figure 2: Result of image retrieval conducted on 30 images on the keyword “tree”
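The keyword lookup behind a query such as “tree” could be sketched as below. The layout of the XML annotation database, the element and attribute names, the file names and the function `retrieve` are all assumptions made for illustration; the paper does not give its schema. Matching is exact but case-insensitive, which is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout of the XML annotation database; the element and
# attribute names are assumptions, not the schema used in the paper.
XML_DB = """
<annotations>
  <image file="corel_001.jpg"><keyword>tree</keyword><keyword>sky</keyword></image>
  <image file="corel_002.jpg"><keyword>water</keyword><keyword>boat</keyword></image>
  <image file="corel_003.jpg"><keyword>Tree</keyword><keyword>grass</keyword></image>
</annotations>
"""


def retrieve(xml_text, query):
    """Return the files whose annotation keywords equal the query word."""
    root = ET.fromstring(xml_text.strip())
    query = query.strip().lower()
    matches = []
    for image in root.findall("image"):
        words = {(kw.text or "").strip().lower() for kw in image.findall("keyword")}
        if query in words:
            matches.append(image.get("file"))
    return matches


print(retrieve(XML_DB, "tree"))  # ['corel_001.jpg', 'corel_003.jpg']
```

Since the annotations are written once and queried many times, a real system would probably parse the XML once and keep an index from keyword to image files rather than re-parsing it for every query.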

V. CONCLUSION

This paper presents a novel approach to image retrieval. Preparing the XML database that contains the image annotations plays a very important role in image retrieval. The COREL dataset is used to prepare the database and to perform the experiments. A similarity algorithm is used to compare the query keyword with the image annotations. The results are shown in the sections above. Experimental results on the benchmark dataset show that our proposed algorithm significantly improves the performance of image annotation as well as the image retrieval quality.

REFERENCES

[2] Hua Wang, Heng Huang and Chris Ding, “Review on Statistical Approaches for Automatic Image Annotation,” 2009 International Conference on Electrical Engineering and Informatics, 5-7 August 2009, Selangor, Malaysia.

[3] Hua Wang, Heng Huang and Chris Ding, “Image Annotation Using Bi-Relational Graph of Images and Semantic Labels,” pages 126–139, 2011.

[4] Ran Li, YaFei Zhang, Zining Lu, Jianjiang Lu, Yulong Tian, “Technique of Image Retrieval based on Multi-label Image Annotation,” 2010 Second International Conference on MultiMedia and Information Technology.

[5] Dongjian He, Yu Zheng, Shirui Pan, Jinglei Tang, “Ensemble of Multiple Descriptors for Automatic Image Annotation,” 2010 3rd International Congress on Image and Signal Processing.

[6] G. Tsoumakas, I. Katakis, and I. Vlahavas, “Random k-Labelsets for Multi-Label Classification,” TKDE, 2010.

[7] Tianxia Gong, Shimiao Li, Chew Lim Tan, “A Semantic Similarity Language Model to Improve Automatic Image Annotation,” School of Computing, National University of Singapore, IEEE, 2010.

[8] H. Muller, W. Muller, D. M. Square, S. M. Maillet and T. Pun, “Performance Evaluation in Content Based

[9] J. Vogel and B. Schiele, “Performance Evaluation and Optimization for Content Based Image Retrieval,” Pattern Recognition Letters, vol. 22, Apr. 2001.

[10] Y. Alp Aslandogan and Clement T. Yu, “Techniques and Systems for Image and Video Retrieval,” vol. 11, no. 1, January/February 1999.

[11] A. Hanbury, “A Survey of Methods for Image Annotation,” J. Vis. Lang. Comput., vol. 19, pp. 617-627, Oct. 2008.

[12] J. Liu, B. Wang, M. Li, Z. Li, W. Y. Ma, H. Lu and S. Ma, “Dual Cross-Media Relevance Model for Image Annotation,” in Proceedings of the 15th International Conference on Multimedia 2007, p. 605 – 614.

[13] T. Jiayu, “Automatic Image Annotation and Object Detection,” PhD thesis, University of Southampton, United Kingdom, May 2008.

[14] C. F. Tsai and C. Hung, “Automatically Annotating Images with Keywords: A Review of Image Annotation Systems,” Recent Patents on Computer Science, vol 1, pp 55-68, Jan., 2008.

[15] R. Datta, D. Joshi, J. Li and J. Z. Wang, “Image Retrieval: Ideas, Influences, and Trends of the New Age,” ACM Computing Surveys (CSUR), vol 40, Apr. 2008.

[16] Chung, K.P. and Fung, C.C. Relevance feedback and intelligent technologies in content-based image retrieval system for medical applications. Australian Journal of Intelligent Information Processing Systems, 8 (3). pp. 113-122, 2004.

[17] P. Duygulu, K. Barnard , J. F. G. de Freitas, and D. A. Forsyth, “Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary,” in ECCV ’02: Proceedings of the 7th European Conference on Computer Vision-Part IV, (London, UK), pp. 97–112, Springer-Verlag, 2002.

[18] E. Y. Chang, K. Goh, G. Sychay, and G. Wu, “CBSA: content-based soft annotation for multimodal image retrieval using Bayes point machines,” IEEE Trans. Circuits Syst. Video Techn., 13(1):26–38, 2003.

[19] E. Izquierdo and A. Dorado, “Climbing the semantic ladder: Towards semantic semi-automatic image annotation using MPEG-7 descriptor schemas,” in Proc. IEEE Int’l Workshop on Computer Architecture for Machine Perception, pages 257–262, Italy, Jul 2005.

[20] J. Jeon, V. Lavrenko, and R. Manmatha, “Automatic image annotation and retrieval using cross-media relevance models,” in SIGIR ’03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 119–126, ACM Press, 2003.

[21] V. Lavrenko, R. Manmatha, and J. Jeon, “A model for learning the semantics of pictures,” in Advances in Neural Information Processing Systems 16 (S. Thrun, L. Saul, and B. Schölkopf, eds.), Cambridge, MA: MIT Press, 2004.