A Recognition Algorithm for Wireless Capsule Gastroscopy Images

(1)

2018 International Conference on Information, Electronic and Communication Engineering (IECE 2018) ISBN: 978-1-60595-585-8

A Recognition Algorithm for Wireless Capsule Gastroscopy Images

Xiao-dong JI

1,2,*

, Ting-ting XU

1

and Wen-hua LI

3

1

School of Electronics and Information, Nantong University, Nantong China 226019

2

Nantong Research Institute for Advanced Communication Technologies, Nantong China 226019

3

Wenluo Corporation of Electronic Science & Technology of Jiangsu, Nantong China

*Corresponding author

Keywords: Wireless capsule gastroscopy, Recognition, Features extraction, BPNN.

Abstract. In order to enhance recognition accuracy, a novel recognition algorithm for wireless capsule gastroscopy images(WCGIs) was developed by using a new proposed feature extraction method and back propagation neural network(BPNN). First, an improved color histogram was computed in the Hue, Saturation, Value space. Meanwhile, by using the wavelet transform, the low frequency parts of WCGIs were filtered out and then the characteristic values of the reconstructed WCGIs’ co-occurrence matrix were computed. Next, the feature values of color moments and co-occurrence matrix were normalized as the inputs of the BPNN for training and recognition. Recognition experiments were conducted and the results demonstrated that the accuracy of the developed algorithm is up to 99.09% that is much better than the compared methods.

Introduction

Wireless capsule gastroscopy system allows painless screening for gastric diseases. During the screening process, however, a large number of wireless capsule gastroscopy images (WCGIs) will be generated, and incur much time and energy costs if these WCGIs are read and recognized by physicians. Thus, it is imperative to develop an efficient computer aided analysis algorithm that can help physicians recognize WCGIs with high efficiency and accuracy.

Generally, features of a WCGI consists of shape, color, texture, etc. It is noted in [1] that the features employed for recognition will directly affect the ultimate recognition performance. In [2], the ULBP (Uniform Local Binary Pattern) features are extracted from the discrete wavelet transform sub-band in the RGB (Red, Green, Blue) and HSI (Hue, Saturation, Intensity) spaces, and then the support vector machine is used as the recognizer. The authors of [3] developed an image feature extraction algorithm and an image feature matching technique that are able to discover feature places between two continuous WCGIs. In [4], the Gabor filter based gray-level co-occurrence matrix and the wavelet are adopted to extract texture information of WCGIs. Of note is that most of the existing feature extraction methods are either based on the full image or its low-frequency part, and often ignore the middle and high frequency parts that actually have rich texture information.

It is known that WCGIs have rich color and texture information. Thus, for a WCGI, the lesion and non-lesion regions possess major diversity in color and texture. In the paper, a novel feature extraction method based on color histogram, wavelet transform and co-occurrence matrix is studied. Then, based on the framework of the novel feature extraction method, a recognition algorithm for WCGIs is proposed so as to enhance the recognition accuracy.

Method

(2)

(BPNN) and those in the examining set are used for conducting recognition experiments. The extracted feature values of the WCGIs in the training set are normalized as the inputs of the BPNN and the recognition outcome is achieved accordingly.

The algorithm proposed in the paper consists of the following three steps.

 Step 1. Extracting the color features: 1) transforming the WCGIs from the RGB to HSV spaces, 2) computing the color histogram after quantification 3) choosing the bins and then constructing the color feature vector;

 Step 2. Extracting the texture features: 1) choosing the middle and high frequency bands and then reconstructing images through the wavelet transform, 2) computing the characteristic values of the co-occurrence matrix and constructing the texture feature vector;

 Step 3. Training and then recognition: 1) normalizing the color and texture features of the training images and then training the BPNN, 2) using the trained BPNN to recognize the examining images.

Features Extraction

Extracting Color Features. Here, an improved color histogram method is presented for extracting color features. It is known that HSV color space is a natural color model that can better reflect the physiological perception of human eyes [5]. Thus, in the paper, a non-uniform quantizer proposed in [2] is adopted to quantify the H, S and V components of a WCGI. Then the color histogram is computed according to the output of the non-uniform quantizer. Next, by choosing the suitable bins from the color histogram, the color feature vector is constructed and synthesized into an one-dimensional vector, giving

9h_q 3s_q v_q

  

 , (1) where hq, sq and vq are the H, S and V components of a WCGI after quantifying, and

0,1, 2, 7 ]1

[ ,



 . It is noticed that Eq. 1 presents an one-dimension characteristic histogram with 72

bins that is considered as the 72 dimension characteristics.

Denote by F the ratio of numbers of the pixels with characteristic value  to the number of all the pixels in the image matrix after quantification. It should be noted that the dimension of F is still high, and the 72 characteristic values include lots of 0, leading to redundancy. To that end, an improved color histogram is proposed. Specifically, for a WCGI, if and only if F1 72 holds, the corresponding  will be collected. Therefore, the collecting values will be less than 72. In order to further reduce complexity, only the largest 15 values of F are chosen and then constructed the feature vector.

Extracting Texture Features. It is known that the lesion regions of a WEGI are different from those of non-diseased regions. Therefore, extracting texture features of a WCGI is of vital importance to the design of a practical recognition algorithm. In the paper, the pyramid wavelet decomposition is adopted and the Daubechies function is chosen as the basis function of wavelet transform [6]. Denote by



,



i i i

Q  L_ H_ , i1, 2,3



1, 2,3  1,...,9, (2) a decomposed version of an example image Q, where L denotes the low frequency parts of the horizontal and vertical components of a WCGI, H denotes the middle and high frequency parts, α represents the decomposition level, β stands for the wavelet band, and i represents the color channel.

(3)

 

IDWT

i i

O  H_ , i  1, 2, 3     1, 2, , 9. (3)

In Eq. 3, IDWT{.} denotes the inverse discrete wavelet transformation, β stands for the wavelet band, and i represents the color channel.

Calculate the co-occurrence matrix W_T( , )m n of the R, G and B channels of the reconstructed WCGI, where T∈{R, G, B}, the value of pixel pair (m, n) of the co-occurrence matrix denotes the number of occurring times of the two pixels with distance d and having color levels m and n at a given direction θ. In practice, θ is set to 0°, 45°, 90° or 135o. It is a two order statistical feature of the change of image brightness [7]. Next, normalize the co-occurrence matrix and let w_T



m n,



be the value of

pixel pair (m, n) of the normalized co-occurrence matrix, where T∈{R, G, B} and θ∈{0°, 45°, 90°, 135o}.

In the paper, 4 commonly used features are used, namely, angular second moment, contrast, entropy and correlation [8]. The angular second moment, contrast, entropy and correlation, respectively, represent the homogeneity, inertia, randomness and directional linearity of the co-occurrence matrix, and are defined, respectively, as

 

2

1 1

.

D D

T T

m n

E w m n

 

 





_ _ (4)





2





1 1

,

D D

T T

m n

I m n w m n

 







 

(5)



,



log



,



D D

T T T

m 1 n 1

w m n w m n

  



 

 



(6)

  



1 2

1 1 1 2 . , D D T m n T

m n w m n A             





(7)

where ₁, ₂, ₁ and ₂ are given as the following





1 1 1 , D D T m n

m w m n

 



 



 

, (8)





2

n 1 1

n ,

D D

T m

w m n

 



 



 

, (9)





2





1 1 1 1 , D D T m n

m w m n

  

 

 









, (10)





2





2 2 1 1 , D D T n m

n w m n

 









  

  . (11)

Here, D is the maximum color level of a WCGI.

In the paper, d=1 is assumed. According to the characteristic values computed earlier, the texture feature vector of a WCGI can be constructed by

T T T T T T T T T

Z  E , E , I , I , , , A , A _

(4)

where 0 45 90 135 4

T

, , ,

T

X

X  



 



,





 

2

0 45 90 135

4

T T

, , ,

T

X X

X 

 



  



, X 



E,I ,, A



, T



R G B, ,



and 



0 , 45 ,90 ,135



.

The summation of the eight dimensional texture features of the R, G and B channels that are achieved earlier is considered as the final extracted texture feature, giving

texture R G B

F Z Z Z . (13)

Training and Classification

BPNN is a feed-forward neural network with a tutor, having strong nonlinear mapping ability and a flexible network structure. The main idea behind the proposed algorithm is to use the samples of the known results to train the network, and then adopt the trained network to recognize the images. The BPNN includes the input, hidden and output layers. The hidden layer can have one or multiple layers. Neurons in adjacent layers are connected by weights, but no connection exists between the neurons in each layer.

The training period of a BPNN can be classified as two phases: forward and backward propagation phases. During the forward propagation phase, the feature values of the training samples reach the hidden layer through the nonlinear transformation from the input layer and then to the output layer, leading to the output results. Compare the output results with the expected outputs, if they are not equal, then enter the step of back propagation. During the backward propagation phase, the error signals propagate layer by layer from the output layer to the input layer through the hidden layer and reduce errors by adjusting weights. In principle, the BP algorithm uses the square of the network errors as the objective function, and uses the gradient descent method to calculate the minimum value of the objective function.

Experiment and Analysis

The experiment was conducted by matlab R2016a. The Daubechies function is used as the basis function of the wavelet transform, and the max decomposition level is set to 3. The comprehensively extracted color features and texture features are finally used as the BPNN input feature vectors as given in Eq. 14.



color, texture



F  F F (14)

The image library in this paper is provided by Hangzhou Hitron Technologies Co., Ltd., including 1251 stomach capsule clinical images, the resolution is 480×480 and the image format is bmp. The image library includes 135 diseased and 1116 healthy images. In each experiment, 108 diseased and 1116 healthy images are randomly selected to construct a training set with a total number of 1001, and the remaining 250 images are considered as the BPNN examining set. According to the previous experience, as long as there are suitable hidden layer nodes, neural network with single hidden layer can approximate any continuous function on the bounded domain with arbitrary precision. Therefore, the number of layers of the BPNN is set to 3 in the experiment. The number of hidden nodes is usually given by empirical formula [9]



(5)

the value of ω. It can be observed from Fig. 1 that ω is appropriately set to 15 in the following experiments. The recognition results of the proposed algorithm are compared with the existing methods, as shown in Table 1, where the values of TPR (True Positive Rate) and TNR (True Negative Rate) can be calculated as the following.

the number of images correctly recognized as diseased =

the number of diseased images

TPR

the number of images correctly recognized as healthy the number of healthy images

TNR

Here, “Method 1” means that in each channel of the image in the HSV space, the wavelet transform is used to select the low-frequency bands while the max decomposition level is set to 2 [10]. “Method 2” means that the extracted features will be filtered according to the average influence value, and then recognized by SVM [1]. “Method 3” means that only Fcolor is used as the input of BPNN.

0 2000 4000 6000 8000 10000 12000

245 245.5 246 246.5 247 247.5 248 248.5 249 249.5 250

Experiment Times

C

o

rr

e

c

t

C

la

s

if

ic

a

ti

o

n

N

u

m

b

e

r

The Experimental Results of 10000 Times

[image:5.595.181.413.288.481.2]

=6 =8 =10 =12 =15

Figure 1. Comparison of recognition results of BPNN with different  values.

It can be observed that the proposed method is superior to the existing methods in terms of both practicability and accuracy, and its accuracy of 99.09% can well meet the clinical requirements.

Table 1. Performance comparison between the proposed algorithm and the existing methods.

Method TPR TNR Accuracy

Method 1 85.38% 84.17% 84.30%

Method 2 90.26% 91.37% 91.25%

Method 3 88.14% 89.13% 89.02%

Proposed

algorithm 97.43% 99.29% 99.09%

Summary

[image:5.595.192.405.563.694.2]

(6)

problem arises: how to recognize or classify the types of diseases according to the provided wireless capsule gastroscopy images, which will be studied in our future work.

Acknowledgement

This work is supported in part by the National Natural Science Foundation of China (Grant Nos. 61401238, 61871241) and by the Nantong University-Nantong Joint Research Center for Intelligent Information Technology (Grant No. KFKT2017A03).

References

[1] J. Deng, L. Zhao, Image classification model with multiple feature selection and support vector machine, J. Jilin Univ. (Science Edition). 4 (2016) 862-866.

[2] B. Li, M. Q. Meng, Automatic polyp detection for wireless capsule endoscopy images, Expert Syst. Appl. 12 (2012) 10952-10958.

[3] Z. Cai, C. Hu, W. Lin, W. Yang, A feature point extraction and matching algorithm for Wireless Capsule Endoscope image, in: Proc. of IEEE Int. Conf. Inf. Aut., 2014, pp. 608-613, .

[4] F.K. Nezhadian, S. Rashidi, Palmprint verification based on textural features by using Gabor filters based GLCM and wavelet, in: Proc. of 2nd Conf. Swarm Intell. and Evolutionary Comput., 2017, pp. 112-117.

[5] N. Suciati, D. Herumurti, A.Y. Wijava, Fractal-based texture and HSV color features for fabric image retrieval, in: Proc. of IEEE Int. Conf. Control Syst., Comput. Eng., 2015, pp. 178-182.

[6] D. Sudarvizhi, Feature based image retrieval system using Zernike moments and Daubechies Wavelet Transform, in: Proc. of Int. Conf. Recent Trends in Info. Technol., 2016, pp. 1-6, 2016.

[7] F. Zhu, B. Zhu, P. Li, Z. Wang, L. Wei, Quantitative analysis and identification of liver B-scan ultrasonic image based on BP neural network, in: Proc. of Int. Conf. Optoelectronics and Microelectronics, 2013, pp. 62-66.

[8] R. M. Haralick, Statistical and structural approaches to texture, Proc. IEEE. 5 (1979) 786-804.

[9] D. Weng, R. Chen, Y. Li, D. Zhao, Techniques and applications of electrical equipment image processing based on improved MLP network using BP algorithm, in: Proc. Og Power Electron. Motion Control Conf. 2016, pp. 1102-1105.