PERFORMANCE EVALUATION OF CLUSTERING BASED CONTENT BASED IMAGE RETRIEVAL SYSTEMS


K.SRINIVASA RAO and Prof.I.Ramesh babu ijesird, Vol. II (VII) January 2016/ 424


¹K. Srinivasa Rao, ²Prof. I. Ramesh Babu

¹Associate Professor, Dept. of CSE, GIT, GITAM University, Visakhapatnam - 530 045, ²Professor, Dept. of CSE, Acharya Nagarjuna University, Nagarjuna Nagar, Guntur (Dt), AP

Abstract- Retrieving images from available databases has attracted researchers because of the need to retrieve images accurately and reliably. Content based image retrieval (CBIR) has several advantages over conventional metadata based image retrieval. Computer vision applications are of great importance in the field of digital image processing, and CBIR yields good results using three kinds of content, namely color, shape and texture. Color based CBIR is simple and efficient. Although tremendous progress has been made on image retrieval in past years, retrieving images from large data sets with traditional retrieval schemes is still an area of concern in the image processing domain. The proposed work presents CBIR based on three algorithms, namely (a) CBIR based on histogram properties, (b) CBIR based on histogram properties with K-Means clustering, and (c) CBIR based on histogram properties with Pillar K-Means clustering.

Finally, the simulation results show better accuracy and reliability in retrieving images, and the proposed system overcomes the disadvantages of the traditional systems.

KEYWORDS: Content based image retrieval (CBIR), K-Means clustering, Pillar K-Means clustering, histogram properties

1. INTRODUCTION

The content based image retrieval (CBIR) approaches reported in the literature are implemented mostly for image processing and computer vision. Each retrieved image has its own characteristics when it is retrieved from a collection of images of significant size. The main applications of image retrieval are in digital image processing and computer vision, where image processing covers compression, enhancement, transmission and interpretation.

Conventional feature analysis mechanisms suffer from limited accuracy and reliability, whereas content based image retrieval performs fairly well in terms of both accuracy and performance.

Accurate detection of faces has improved in crime and security applications, and automatic face recognition has helped in many applications.

Among the available approaches, CBIR is the most successful in real-time scenarios. CBIR automatically derives features based on three kinds of content, namely color, shape and texture, and the derived features can be either semantic or primitive. CBIR differs from classic information retrieval in that image databases are essentially unstructured, since digitized images consist purely of arrays of pixel intensities with no inherent meaning. One of the key issues with any kind of image processing is the need to extract useful information from the raw data (such as recognizing particular shapes or textures) before any kind of reasoning about the image contents is possible.

2. BACKGROUND

1) A CBIR scheme based on color and gradient direction features was proposed by Jianlin Zhang and V.V. Raghavan in 2010 [1]. The work first divides the input image into blocks, and the characteristics of each block are then extracted based on color and edge direction features. A clustering algorithm is then used to preserve all the extracted color information in the form of a code book. Finally, color code indexes are used to retrieve images.

2) An early account of content based image retrieval (CBIR) was given in 1995 by V.N. Gudivada and V.V. Raghavan [2]. This paper mainly focuses on applications of digital image processing based on image retrieval and discusses their importance in detail. Applications such as biomedical imaging, fingerprinting and scientific experiments are examined, and the authors conclude that CBIR can use the information in these image repositories effectively and efficiently. Finally, the ever increasing importance of multimedia also increases the importance of CBIR and draws the attention of researchers to it.

3) A novel CBIR scheme based on the DTCWT (dual-tree complex wavelet transform) was proposed in 2011 by Feng Chen and Song-nian Yu [3]. The DTCWT makes it easy to detect key points and to compute invariant feature vectors for effective image retrieval. The similarity of images is derived from the Euclidean distance between their features. Finally, the proposed result is compared with AT-SIFT.

4) A large number of image processing applications rely on CBIR to retrieve relevant images based on similarity. Medical imaging is one of the predominant domains that relies on CBIR to retrieve data from the available medical image databases for better clinical analysis. A CBIR scheme for large medical databases was proposed by A. Kak and C. Pavlopoulou in 2002 [4]. That work presents some of the more significant results obtained with ASSERT (Automatic Search and Selection Engine with Retrieval Tools), the content based image retrieval system developed in the authors' laboratory.

3. PROPOSED METHODS

This paper considers three algorithms; the performance of each algorithm is evaluated using a standard image retrieval database.

(A) HISTOGRAM BASED CBIR

In this algorithm, images are retrieved using the properties of their histograms. As stated in [4], the colour histogram features of an image can be expressed by the equations below.

Without loss of generality, the histogram is defined as

$$P(g) = \frac{N(g)}{M}$$

where $P(g)$ is the probability of occurrence of gray level $g$, $N(g)$ is the number of pixels with gray level $g$, and $M$ is the total number of pixels. The following properties are used for CBIR.

(i) Mean

$$M_n = \sum_{g=0}^{L-1} g\,P(g)$$

where $[0, L-1]$ is the range of gray levels of the image ($[0, 255]$).

(ii) Standard Deviation

$$\sigma_g = \sqrt{\sum_{g=0}^{L-1} (g - M_n)^2\,P(g)}$$

(iii) Skewness

$$S_k = \frac{1}{\sigma_g^3} \sum_{g=0}^{L-1} (g - M_n)^3\,P(g)$$

The skew is positive if the tail of the histogram spreads to the right and negative if the tail of the histogram spreads to the left.

(iv) Energy

$$E = \sum_{g=0}^{L-1} [P(g)]^2$$

(v) Entropy

$$E_n = -\sum_{g=0}^{L-1} P(g)\,\log P(g)$$

All these properties constitute the feature vector of an image, and similarity is measured using the Euclidean or chessboard distance. The experimental results obtained using this approach are shown in Section IV.
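The five histogram properties and the two distances can be sketched in Python with NumPy as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names, the base-2 logarithm in the entropy, and the choice of returning zero skewness for a constant image are assumptions made here.

```python
import numpy as np

def histogram_features(image, levels=256):
    """Return [mean, std, skewness, energy, entropy] of a gray-level image.

    Assumes non-negative integer gray levels, normally in [0, levels - 1].
    """
    pixels = np.asarray(image).ravel().astype(np.int64)
    counts = np.bincount(pixels, minlength=levels)   # N(g)
    p = counts / counts.sum()                        # P(g) = N(g) / M
    g = np.arange(counts.size)

    mean = np.sum(g * p)
    std = np.sqrt(np.sum((g - mean) ** 2 * p))
    # Assumption: skewness is reported as 0 when the image is constant (std == 0).
    skew = np.sum((g - mean) ** 3 * p) / std ** 3 if std > 0 else 0.0
    energy = np.sum(p ** 2)
    nonzero = p[p > 0]                               # skip empty bins, since log(0) is undefined
    entropy = -np.sum(nonzero * np.log2(nonzero))    # assumption: logarithm base 2
    return np.array([mean, std, skew, energy, entropy])

def euclidean_distance(a, b):
    """Square root of the sum of squared feature differences."""
    return float(np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2)))

def chessboard_distance(a, b):
    """Largest absolute feature difference (Chebyshev distance)."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))
```

To retrieve images, one would compute `histogram_features` for the query and for every database image, then rank the database images by increasing distance to the query.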

(B) K MEANS CLUSTERING BASED CBIR

K-means [5] is one of the simplest unsupervised learning algorithms. The procedure follows a simple way to partition a given data set into a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. The K-means method is numerical, nondeterministic and iterative.

Step 1: Input the image and perform hierarchical clustering.

Step 2: Consider every point as its own cluster.

Step 3: Find the most similar pair of clusters.

Step 4: Merge those two clusters into one parent cluster.

Step 5: Repeat Steps 3 and 4 until all points are merged into one cluster.

Step 6: Apply K-means clustering to the required image set obtained from hierarchical clustering.

Step 7: Enter the number of clusters (let it be k).

Step 8: Randomly guess k cluster center locations.

Step 9: Each data point finds out which center it is closest to.

Step 10: Thus each center "owns" a set of points.

Step 11: Each center finds the centroid of its own points.

Step 12: Each center now moves to the new centroid.

Step 13: Repeat Steps 9 to 12 until the termination condition is met.

The histogram of the clustered image is quantized to n bins (8 or 16), and the respective properties form a feature vector for retrieval. The experimental results are shown in Section IV.
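The K-means part of the procedure (Steps 7 to 13) can be sketched as below. This is an illustrative NumPy version under stated assumptions, not the code used for the reported results: the function names, the `max_iter` cap, and the rule of stopping when the centers no longer move are choices made here, and the hierarchical pre-clustering of Steps 1 to 6 is not included.

```python
import numpy as np

def nearest_center(points, centers):
    """Steps 9-10: index of the closest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, init_centers=None, max_iter=100, seed=0):
    """Cluster an (n, d) array of points into k groups.

    For an image, `points` could hold one row per pixel, for example the
    gray level (d = 1) or the RGB triple (d = 3).
    """
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    if init_centers is None:
        # Step 8: use k distinct, randomly chosen data points as initial centers.
        centers = points[rng.choice(len(points), size=k, replace=False)]
    else:
        centers = np.asarray(init_centers, dtype=float).copy()

    for _ in range(max_iter):
        labels = nearest_center(points, centers)
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:              # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)   # Steps 11-12
        # Assumed termination rule for Step 13: the centers have stopped moving.
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return centers, nearest_center(points, centers)
```

Because the starting centers are random, different seeds can give different clusterings on harder data, which is the nondeterminism mentioned above.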

(C) PILLAR K MEANS BASED CBIR

Because the initial starting points are generated randomly, it is difficult for the K-means algorithm to reach the global optimum, which can lead to incorrect clustering results. This obstacle has been addressed by specifying a procedure to initialize the cluster centers before proceeding with the standard K-means optimization iterations [6].

Input: Dataset D = {d1, d2, ..., dn} // set of n data points

k // number of desired clusters

Output: A set of k clusters.

Step 1: One centroid is chosen uniformly at random from among the data points.

Step 2: For each data point z, the distance D(z) between z and the nearest chosen centroid is computed.

Step 3: One new data point is chosen at random as a new center, using a weighted probability distribution where a point z is chosen with probability proportional to D(z)².

Step 4: Steps 2 and 3 are repeated until k centers have been chosen.

Step 5: Now that the initial centers have been chosen, one can proceed using standard K-means clustering.
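The seeding in Steps 1 to 4 can be sketched as follows. The rule of picking each new center with probability proportional to D(z)² is the same one used by the widely known k-means++ initialization. The function name `d2_seeding` and the `seed` parameter are illustrative, and the sketch assumes the data set contains at least k distinct points.

```python
import numpy as np

def d2_seeding(points, k, seed=0):
    """Choose k initial centers from an (n, d) array using D(z)^2 weighting."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(points)

    # Step 1: the first center is drawn uniformly at random from the data.
    chosen = [int(rng.integers(n))]

    # Step 4: repeat Steps 2 and 3 until k centers are chosen.
    for _ in range(1, k):
        # Step 2: squared distance from every point to its nearest chosen center.
        diffs = points[:, None, :] - points[chosen][None, :, :]
        d2 = np.min(np.sum(diffs ** 2, axis=2), axis=1)
        # Step 3: draw the next center with probability proportional to D(z)^2.
        # Points already chosen have D(z) = 0 and therefore cannot be drawn again.
        probs = d2 / d2.sum()
        chosen.append(int(rng.choice(n, p=probs)))

    return points[chosen]
```

The returned array can be handed to any standard K-means routine as its initial centers (Step 5), replacing the purely random guess.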

The histogram of the clustered image is quantized to n bins (8 or 16), and the respective properties form a feature vector for retrieval. The experimental results are shown in Section IV.
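A quantized histogram of this kind can be computed as in the short sketch below. It assumes equal-width bins over the gray-level range [0, 255]; the text does not specify the binning rule, so this is one plausible reading, and the function name is hypothetical.

```python
import numpy as np

def quantized_histogram(image, n_bins=8, levels=256):
    """Normalized histogram of `image` with n_bins equal-width bins.

    Assumption: bins are equal-width over [0, levels), e.g. 32 gray
    levels per bin when n_bins = 8 and levels = 256.
    """
    counts, _ = np.histogram(np.asarray(image), bins=n_bins, range=(0, levels))
    return counts / counts.sum()
```

The mean, standard deviation, skewness, energy and entropy can then be evaluated on these n bin probabilities instead of on all 256 gray levels.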

IV RESULTS AND DISCUSSIONS

The proposed algorithms are tested and their performance is evaluated using the Berkeley image database [7], which contains multiple colour natural images for retrieval and segmentation purposes. The results of each individual algorithm are presented below.

Figure 1: Result of the histogram based retrieval system


Figure 2: Results of retrieved images using K-means clustering and histogram properties

Figure 3: Results of retrieved images using Pillar K-means clustering and histogram properties

To evaluate the algorithms, performance metrics such as precision and recall are used [8, 9].

Recall Rate = Number of relevant images retrieved / Total number of relevant images in the database

Precision Rate = Number of relevant images retrieved / Total number of images retrieved
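As a small worked sketch, the two rates can be computed under the standard definitions, in which precision divides the number of relevant retrieved images by the number of images retrieved, and recall divides it by the number of relevant images in the database. The function name and the image identifiers in the usage note are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: identifiers of the images returned by the system.
    relevant:  identifiers of all database images relevant to the query.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)          # relevant images that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

For example, if four images `a, b, c, d` are retrieved and the relevant set is `a, b, e`, two of the retrieved images are relevant, so the precision is 2/4 and the recall is 2/3.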

Figure 4: Performance analysis of the proposed algorithms

V. CONCLUSION

In this paper, an image retrieval system based on Pillar K-means and histogram properties is proposed. The algorithm is tested and evaluated with the Berkeley image retrieval database, and the respective precision and recall rates are recorded. From the results it can be concluded that using the Pillar algorithm along with histogram properties improves the precision rate by 0.05, that is, an increment of 5 percentage points on the overall rate. The proposed algorithm can be further improved by applying it in the time-scale domain or in a transform domain such as wavelets or contourlets [10].

REFERENCES

[1] Jianlin Zhang, Raghavan, V.V., Content-Based Image Retrieval using color and edge direction features, Advanced Computer Control (ICACC), 2010 2nd International Conference (IEEE) 2010

[2] Gudivada, V.N, Raghavan, V.V., Content based image retrieval systems, IEEE computer society, 1995

[3] Feng Chen, Song-nian Yu, Content-based image retrieval by DTCWT feature, IEEE, 2011

[4] Kak, A., Pavlopoulou, C., Content-based image retrieval from large medical databases, IEEE, 2002.

[5] K.A. Abdul Nazeer, M.P. Sebastian, “Improving the Accuracy and Efficiency of the K-means Clustering Algorithm”, Proceedings of the World Congress on Engineering, Vol I, 2009.

[6] Sergyan, S., "Color histogram features based image classification in content-based image retrieval systems," in Applied Machine Intelligence and Informatics, 2008. SAMI 2008. 6th International Symposium on, pp. 221-224, 21-22 Jan. 2008.

[7] https://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/

[8] Mianshu Chen; Ping Fu; Yuan Sun; Hui Zhang, "Image retrieval based on multi-feature similarity score fusion using genetic algorithm," in Computer and Automation Engineering (ICCAE), 2010 The 2nd International Conference on, vol. 2, pp. 45-49, 26-28 Feb. 2010.

[9] Barakbah, A. R., & Kiyoki, Y. (2009). A new approach for image segmentation using Pillar-Kmeans algorithm. World Academy of Science, Engineering and Technology, 59, 23-28.

[10] I. Markov, N. Vassilieva, “Image Retrieval: Color and Texture Combining Based on Query-Image,” ICISP 2008, LNCS 5099, Springer-Verlag Berlin Heidelberg, 2008, pp. 430–438.