IMPLEMENTATION OF RE-RANKING OF IMAGES BASED ON TEXTUAL AND VISUAL CONTEXT

(1)

256 | P a g e

IMPLEMENTATION OF RE-RANKING OF IMAGES BASED ON TEXTUAL AND VISUAL CONTEXT

Ms. Smita R Patil

¹

, Mr. Gopal Prajapati

²

1,2Dept. of Computer, RKDF School of Engineering RGPV Bhopal (India)

ABSTRACT

In this paper, we propose re-ranking of images which will be based on textual and visual context. In early days various web-search engines are available to adopt Image re-ranking as an effective way to improve the results of web-based image search.Two types of image search methods are available in the Internet. They are query keyword based model and content based image retrieval models. Text query are used in the textual image retrieval model. Visual re-ranking has been widely deployed to refine the quality of conventional content-based image retrieval engines.

The query keyword is given as input query, a pool of images are first retrieved based on textual information. It becomes difficult for user to interpret intention only on query keywords which leads to ambiguous and noisy results which are far different from user’s satisfaction. It helps user to ask to select a query image from image pool with minimum effort and images from image pool retrieved by text-based search are re-ranked based on both visual and textual content.

Firstly Adaptive Weight Schema is used for the similarity analysis and categorizing the images and re-ranking it with adaptive weighting schema. Feature weight learning algorithm is applied to estimate feature weights for the images and its category. Secondly query is expanded with keyword and visual information by using the visual content of the query image selected by the user and by using image clustering then, query keywords are expanded and image pool is enlarge that contain relevant images and finally these are expanded by using keyword expansion. All these are done very easily and automatically without any extra work.

Key Features: Image Search, Image Re-Ranking, Keyword Expansion, Adaptive Similarity.

I. INTRODUCTION

Most of the commercial Internet image search engines use only text keywords as queries. Users, used type query keywords in the intent of searching a particular type of image. The search engine returns thousands of images ranked by the keywords extracted from the surrounding text. But the text-based image search suffers from the lot of ambiguity of query keywords because the keywords given by users are generally short. They are not enough to describe the content of images accurately. The search results are noisy and consist of images with quite different semantic meanings. Fig. 1.1 shows the top ranked images from Bing image search using “apple”

as query. They belong to different categories, such as “green apple,” “red apple,” “apple logo,” and “iphone”

(2)

257 | P a g e because of the ambiguity of the word “apple.” The ambiguity issue comes mainly because of two reasons. First, the meaning of query keywords and the users’ expectation may be very different in terms of semantics. For example, the meanings of the word “apple” include apple fruit, apple computer, and apple ipod. Second, the user might not have enough knowledge so that he/she can give actual textual description of the required image.

Fig. 1 Top-ranked images returned from Bing image search using

“apple” as query.

For example, if users do not know “gloomy bear” as the name of a cartoon character (shown in Fig. 2a) and they have to input “bear” as query to search images of “gloomy bear.” Lastly and most importantly, in many cases it is hard for users to describe the visual content of target images using keywords accurately.

Fig. 2. (a) Images of “gloomy bear.”

(b) Google related searches of query “bear.”

So some additional and useful information would be required to solve this ambiguity and capture the users’

actual search intention. One of the way is text-based keyword expansion, which elaborates the textual description of query in more detail. Existing linguistically-related methods find either synonyms or other linguistic-related words from thesaurus, or find words frequently co-occurring with the query keywords. For example, Google image search provides the “Related Searches” feature to suggest likely keyword expansions.

However, even with the same query keywords, the intention of users can be highly diverse and cannot be

(3)

258 | P a g e accurately captured by these expansions. As shown in Fig. 1.2b, “gloomy bear” is not among the keyword expansions suggested by Google “related searches.”

Another way is content-based image retrieval with relevance feedback. Users label multiple positive and negative image examples. A query-specific visual similarity metric is learned from the selected examples and used to rank images. The requirement of more users’ effort makes it unsuitable for web-scale commercial systems like Bing image search and Google image search in which users’ feedback has to be minimized. We do believe that adding visual information to image search is important. However, the interaction has to be as simple as possible. The absolute minimum is One-Click.[13]

1.1 Need: Today’s biggest challenge or issue is that, the similarities among the visual features of images and the semantic meanings of images do not properly correlate with each other in order to capture and interpret the users’ search intention. This system is proposed to relate images in semantic space. The semantic space uses

attributes or reference classes that closely related to the semantic meaning of images as the basis.

However, learning a universal visual semantic space to characterize highly diverse images from the web is difficult and inefficient. So in this re-rank system it improves the efficiency of image search, which needs to search the image very quickly on the basis of user intention. Also it improves the accuracy of the searching.

1.2 Basic concept: Web based image searching can be improved with the concept of re-ranking. Search engine like Google and Bing retrieve pool of images according to the textual query keyword and then by asking user to select query image retrieve final result. But these results are based on visual features of query image which do not correlate with image semantic. Thus to enhance matching efficiency proposed system focus on semantic signatures of query keyword.

II. LITERATURE SURVEY

2.1 Related word done:

1. Image search and visual Expansion:

One of the major challenges of content-based image retrieval is to learn the visual similarities which reflect semantic relevance of image well. Image similarities can be learned from a large training set where the relevance of pairs of images is known.[4][5] So for general image recognition and matching, there have been a number of works on using projections over predefined concepts, attributes or reference classes as image signatures. The classifiers of concept, attributes and reference classes are trained from known classes with labeled examples. But the knowledge learned from the known classes can be transferred to recognize samples of novel classes which have few or even no training samples. Since these concepts, attributes and reference classes are defined with semantic meanings, the projections over them can well capture the semantic meanings of new images even without further training.[8]

2. Keyword expansion:

Query keywords input by user tend to be short and some important keyword may be missed because of user lack of knowledge on the textual description of target images. In our approach, query keyword are expanded to

(4)

259 | P a g e capture users search intention, inferred from the visual content of query images, which are not considered in traditional keyword Expansion approaches. A word is suggested as an expansion of the query if a cluster of images are visually similar to the query image and all contain the same word. The expanded keywords better capture user search intention since the consistency of both visual content and textual description is ensured.

3. Semantic signature:

The visual and textual features of images are then projected into their related semantic spaces to get semantic signature. At the on-line stage, images are re-ranked by comparing their semantic signature obtained from the semantic space of the query keyword. The semantic correlation between concepts is explored and incorporated when computing the similarity of semantic signature. The explored and incorporated when computing the similarity of semantic signatures. Our query-specific semantic signatures effectively reduce the gap between low-level visual features and semantic categories, and also make image matching more consistent with visual perception.

2.2 Existing system:

In existing system, one way is text-based keyword expansion, making the textual description of the query more detailed. Existing linguistically-related methods find either synonyms or other linguistic-related words from thesaurus, or find words frequently co-occurring with the query keywords.For example, Google image search provides the Related Searches” feature to suggest likely keyword expansions. However, even with the same query keywords, the intension of user can be highly diverse and cannot be accurately capture by these expansion. Search by image is optimized to work well for content that is reasonably well described on the web.

For this reason, you will likely get more relevant result for famous landmarks or paintings than you will for more personal images like you toddler’s latest figure painting.

Existing approach:

1. Scale-invariant feature transform.

2. Adaptive algorithm.

3. Histogram of gradient.

III. IMPLEMENTATION

The proposed system is divided into four modules:

1. Image search module 2. Query categorization

3. Retrieval of images from expanded keywords.

4. Visual query expansion

(5)

260 | P a g e Image Search Module:

The image search module uses query keyword provided by user and displays various images to the user. From those displayed results user will have to select one of the images that matches his/her intention. The selected image is then used as training image for further re-ranking process.

Query Categorization:

Here in this module, the adaptive similarity is used to categorize the images. Then, based on the similarity between the query image and other images re-ranking is performed.

Retrieval of images from expanded keywords:

Here the query keywords are expanded to capture the users’ intention, inferred from the visual content of the query image.

Visual Query Expansion:

The images selected from expanded keyword are used as positive examples to learn the visual and textual similarity metrics, which are more robust and more specific to the query, for image re-ranking.

IV. ALGORITHMS

4.1. Algorithm for Weight algorithm:

Input: Initial weight Wi for all query image I in the current intention category C. Similarity matrix Mf[i] for all query image i & features f.

a. Initially set step p=1, set initial weight Wi´=Wi for all images of i.

b. While not converged d c. For each query image i C do

d. Select best feature f & corresponding similarity matrix Mf[i] for current Re-ranking problem under weight Wi^P. then Calculate enable wt is βp.

e. Adjust weight Wi^p+1(j,k) α Wi^p(j,k) . exp{βp[Mf[i,j]- βp[Mf[i,k]};

f. Wi^p+1 make to be distributed.

g. p++;

h. End for i. End while

j. Final output :final optimal similarity measure for current intension category.

Matrix Mf^q[i , j]= βp Mf[i , j]

4.2. Algorithm for picture analysis or get frequently used colors:

Input: Input image I1

Initialized: Create empty dictionary dctColorIncidence <int, int> ();

(6)

261 | P a g e a. for each pixel Pi of I1 do

b. Get pixel value.

c. if(dctColorIncidence.Keys contains(pixelValue)) theIncrement index of that pixel value in dictionary d. else Assign 1 to at index pointed by pixel value in dictionary.

e. end for f. Sort dictionary

g. Output: Return first element of dictionary as mostUsedcolor 4.3. Algorithm for Face Detection:

Input: Input image I1

Initialized: opencv library. Set no of face nf=0;

a. Create HaarCascade classifier object face-cascade. Load train xml file into object.

b. Create HaarCascade classifier object eye-cascade. Load train xml file into object.

c. Get input image I1

d. Convert it ino gray sacle Gi.

e. Detect HaarCascade of Gi f. Detect faces and increment nf g. Output: Return detected faces nf

V. CONCLUSION

In this paper, we propose a re-ranking of images which is done by one-click of user and improve quality of images by acquiring user intention. The weight schema is proposed to combine visual features and to compute visual similarity adaptive to query images. It improves image search with faster speed and high quality on web.

With this project we conclude that it improves user experience and also user gets the exact search as the user wants. We also build a large labeled database from Internet to share to the community. Using the developed technology, we implemented a real-time online image search engine, combining text and Intent Search.

REFERENCE

[1] F. Jing, C. Wang, Y. Yao, K. Deng, L. Zhang, and W. Ma, “Igroup: Web Image Search esults Clustering,” Proc. 14th Ann. ACM Int’l Conf. Multimedia, 2006.

[2] J. Cui, F. Wen, and X. Tang, “Real Time Google and Live Image Search Re-Ranking,” Proc. 16th ACM Int’l Conf. Multimedia, 2008.

[3] J. Cui, F. Wen, and X. Tang, “IntentSearch: Interactive On-Line Image Search Re-Ranking,” Proc. 16th ACM Int’l Conf. Multimedia, 2008.

[4] J. Deng, A.C. Berg, and L. Fei-Fei, “Hierarchical Semantic Indexing for Large Scale Image Retrieval,”

Proc. IEEE Int’l Conf. Computer Vision and Pattern Recognition, 2011

[5] G. Chechik, V. Sharma, U. Shalit, and S. Bengio, “Large Scale Online Learning of Image Similarity through Ranking,” J. Machine Learning Research, vol. 11, pp. 1109-1135, 2010.

(7)

262 | P a g e [6] R. Fergus, P. Perona, and A. Zisserman, “A Visual Category Filter for Google Images,” Proc. European

Conf. Computer Vision, 2004.

[7] G. Park, Y. Baek, and H. Lee, “Majority Based Ranking Approach in Web Image Retrieval,” Proc.

Second Int’l Conf. Image and Video Retrieval, 2003.

[8] N. Rasiwasia, P.J. Moreno, and N. Vasconcelos, “Bridging the Gap: Query by Semantic Example,” IEEE Trans. Multimedia, vol. 9, no. 5, pp. 923-938, Aug. 2007.

[9] W.H. Hsu, L.S. Kennedy, and S.-F. Chang, “Video Search Reranking via Information Bottleneck Principle,” Proc. 14th Ann. ACM Int’l Conf. Multimedia, 2006.

[10] R. Datta, D. Joshi, and J.Z. Wang, “Image Retrieval: Ideas, Influences, and Trends of the New Age,”

ACM Computing Surveys, vol. 40, pp. 1-60, 2007.

[11] A. Torralba, K. Murphy, W. Freeman, and M. Rubin, “Context-Based Vision System for Place and Object Recognition,” Proc. Int’l Conf. Computer Vision, 2003.

[12] N. Dalal and B. Triggs, “Histograms of Oriented Gradients for Human Detection,” Proc. IEEE Int’l Conf.

Computer Vision and Pattern Recognition, 2005.

[13] Yogesh A. Sankpal and Vidya Chitre, “Survey and Analysis for User Intention Refined Internet Image Search”, IJARCE, Vol. 1, No. 3, May 2014.

[14] Y. Jing and S. Baluja, “Pagerank for Product Image Search,” Proc. Int’l Conf. World Wide Web, 2008.

[15] “Bing Image Search,” http://www.bing.com/images, 2012.

[16] N. Ben-Haim, B. Babenko, and S. Belongie, “Improving Web-Based Image Search via Content Based Clustering,” Proc. Int’l Workshop Semantic Learning Applications in Multimedia, 2006.