Research on the Top-down Visual Attention Method with Unknown Interference Target

2016 International Conference on Electronic Information Technology and Intellectualization (ICEITI 2016) ISBN: 978-1-60595-364-9

Hongchang Ke, Hui Wang and Degang Kong

ABSTRACT

Visual attention preferentially allocates limited processing resources to a few salient regions of an image, which improves the efficiency of related intelligent applications. Based on the differences between the features of the task target and the features of the image background, the key features of the task target can be inferred, and the feature channels of visual attention can be filtered in a top-down manner. A top-down visual attention method for the case of an unknown interference target is proposed. While keeping the target hit rate as high as possible, the method shortens the generation time of the saliency map and improves the computational efficiency of visual attention.

INTRODUCTION

Visual attention preferentially allocates limited processing resources to a few salient regions of an image, which improves the efficiency of image applications such as target search and target tracking. In recent years, visual attention has become an active research topic.

There are a number of representative models in the field of visual attention. Itti et al. proposed a saliency model that extracts the color, intensity, and orientation features of the input image, uses multi-scale center-surround computation to simulate the receptive fields of cells and obtain local saliency, and generates a final saliency map on which winner-take-all (WTA) and inhibition-of-return mechanisms guide the shifts of the focus of attention [1]. Iva et al. combined the static features of the input image (intensity, color, and orientation) with dynamic features, and extracted the salient regions of panoramic image sequences in a spherical coordinate system [2].

Hongchang Ke, Degang Kong. School of Computer Technology and Engineering, Changchun Institute of Technology, No. 2494 Hongqi Street, Changchun City, Jilin Province, 130012, China

Li et al. proposed a hypercomplex Fourier transform method for extracting salient features, which applies low-pass filtering to the amplitude spectrum to highlight the salient signal [3]. Rafael et al. proposed reducing the cost of bottom-up feature extraction with a multi-scale adjustable fovea model in order to shorten processing time [4]. Mahadevan et al. used bottom-up center-surround saliency information, feature selection from feature-based attention, and top-down saliency for target detection to track salient targets [5]. Fahad et al. used data from psychology and other disciplines, such as eye-tracking experiments, to train artificial neural networks that integrate weighted bottom-up, top-down, and salient-motion visual cues for saliency computation [6]. Tsotsos et al. proposed a selective visual attention model in which spatial selection is realized top-down by inhibiting irrelevant visual connections in a pyramid [7]. Harel et al. introduced Markov chains into the calculation of local saliency: the stationary distribution of a Markov chain defined on the feature vectors is used as the saliency of the activation map [8].

This paper proposes a top-down visual attention method for the case of an unknown interference target. From the difference between the feature data of the task target and the mean feature data of the image, the key features that distinguish the task target from the image background are inferred and arranged into a key feature queue. Feature channels are then filtered according to how critical each feature is and whether the task target has been hit, which improves search efficiency.

GENERATING KEY FEATURES OF TASK OBJECT WITH UNKNOWN INTERFERENCE TARGET

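Let the formal context of a specific task be $(U, A, I)$, where the object set is $U = \{x_1, x_2, \ldots, x_n\}$ and $L(U, A, I)$ denotes the set of all concepts of $(U, A, I)$. Let the task target be $x_i$, with $(x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i}) \in L(U, A, I)$ describing the features of task target $x_i$, where $x$ stands for the name of the target concept and $l$, $rg$, $by$, $o_1$, $o_2$, $o_3$, $o_4$ are the intensity, RG color, BY color, and 0°, 45°, 90°, 135° local orientation features. Each target feature is given as a matrix of its range; for the RG feature,

$$rg = \begin{bmatrix} rg_{\min} & rg_{\max} \end{bmatrix} \qquad (1)$$

where $rg_{\max}$ and $rg_{\min}$ are the maximum and minimum values of the RG feature of the target. The other target feature matrices are defined similarly.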

Let the size of the image be $M \times N$ and let $l_{ij}$ be the intensity at point $(i, j)$. The mean intensity is

$$\bar{l} = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} l_{ij}}{MN} \qquad (2)$$

where $1 \le i \le M$ and $1 \le j \le N$.

The mean RG color, BY color, and 0°, 45°, 90°, 135° local orientation features of the image are defined similarly.
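As a minimal sketch of Eq. (2), the mean of any single feature map can be computed as below. The function name `mean_feature` and the nested-list representation of the feature map are assumptions made for illustration; how each feature map (intensity, RG, BY, orientation) is extracted from the image is not covered here.

```python
def mean_feature(feature_map):
    """Mean of an M x N feature map, following Eq. (2).

    feature_map is a list of M rows, each a list of N numbers
    (a hypothetical representation chosen for this sketch).
    """
    m = len(feature_map)
    n = len(feature_map[0]) if m else 0
    if m == 0 or n == 0:
        raise ValueError("feature map must be non-empty")
    if any(len(row) != n for row in feature_map):
        raise ValueError("all rows must have the same length")
    # Double sum over i = 1..M and j = 1..N, divided by M * N.
    total = sum(sum(row) for row in feature_map)
    return total / (m * n)


# Illustrative 2 x 3 intensity map (made-up values).
intensity_map = [
    [10, 20, 30],
    [40, 50, 60],
]
mean_intensity = mean_feature(intensity_map)
```

The same function would apply unchanged to the RG, BY, and orientation maps, since all background means are defined in the same way.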

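The feature information of the background image $y$ is therefore $(y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)$, where $\bar{l}$, $\overline{rg}$, $\overline{by}$, $\bar{o}_1$, $\bar{o}_2$, $\bar{o}_3$, $\bar{o}_4$ are the means of the intensity, RG color, BY color, and 0°, 45°, 90°, 135° local orientation, respectively.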

To obtain the key feature matrix of the task target, the distance between $(x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i})$ and $(y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)$ must be calculated. Because the RG feature of the target, $rg_i = \begin{bmatrix} rg_{\min,i} & rg_{\max,i} \end{bmatrix}$, is a matrix, it is first replaced by its average $\overline{rg}_i$:

$$\overline{rg}_i = \frac{rg_{\min,i} + rg_{\max,i}}{2} \qquad (3)$$

where $rg_{\max,i}$ and $rg_{\min,i}$ are the maximum and minimum of the RG information. The averages of the other target feature matrices are defined similarly.

(x_i,l_i,rg_i,by_i,o₁_i,o₂_i,o₃_i,o₄_i)_{is the deformation feature information of task target}x_i_.

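The distance between $(x_i, \bar{l}_i, \overline{rg}_i, \overline{by}_i, \bar{o}_{1i}, \bar{o}_{2i}, \bar{o}_{3i}, \bar{o}_{4i})$ and $(y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)$ is

$$DIS\_FC\big((x_i, \bar{l}_i, \overline{rg}_i, \overline{by}_i, \bar{o}_{1i}, \bar{o}_{2i}, \bar{o}_{3i}, \bar{o}_{4i}),\ (y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)\big) = \begin{bmatrix} E(\bar{l}_i, \bar{l}) & E(c_i, c) & E(o_i, o) \end{bmatrix} \qquad (4)$$

$$E(c_i, c) = E\big((\overline{rg}_i, \overline{by}_i)^T,\ (\overline{rg}, \overline{by})^T\big) \qquad (5)$$

$$E(o_i, o) = E\big((\bar{o}_{1i}, \bar{o}_{2i}, \bar{o}_{3i}, \bar{o}_{4i})^T,\ (\bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)^T\big) \qquad (6)$$

where $E(\cdot, \cdot)$ denotes the Euclidean distance.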

The feature differences in the recognizable attribute matrix are normalized and sorted to obtain the key feature queue of the task target, $f = (f_1, f_2, f_3)$, where

$$E(f_{1i}, f_1) \ge E(f_{2i}, f_2) \ge E(f_{3i}, f_3) \qquad (7)$$

The primary key feature of task target $x_i$ is $f_1$, the feature with the largest Euclidean distance from the background image features.
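The construction of the key feature queue in Eqs. (3) to (7) can be sketched as follows. This is only an illustration under stated assumptions: the names `midpoint`, `euclidean`, and `key_feature_queue`, the dictionary layout, and the example values are all hypothetical, and the normalization (dividing each distance by the sum of the three distances) is one plausible choice, since the text does not specify how the differences are normalized.

```python
import math

# Grouping of the seven features into the three distance terms of Eq. (4).
FEATURE_GROUPS = {
    "intensity": ["l"],
    "color": ["rg", "by"],
    "orientation": ["o1", "o2", "o3", "o4"],
}


def midpoint(value_range):
    """Average of a [min, max] target feature matrix, as in Eq. (3)."""
    low, high = value_range
    return (low + high) / 2.0


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def key_feature_queue(target_ranges, background_means):
    """Return (queue, normalized) where queue lists the feature groups
    in descending order of distance from the background, as in Eq. (7)."""
    target = {name: midpoint(r) for name, r in target_ranges.items()}
    distances = {
        group: euclidean(
            [target[name] for name in names],
            [background_means[name] for name in names],
        )
        for group, names in FEATURE_GROUPS.items()
    }
    # Assumed normalization: each distance divided by the sum of all three.
    total = sum(distances.values())
    normalized = {
        group: (d / total if total > 0 else 0.0)
        for group, d in distances.items()
    }
    queue = sorted(normalized, key=normalized.get, reverse=True)
    return queue, normalized


# Made-up target ranges and background means, all on a common 0..1 scale.
target_ranges = {
    "l": (0.6, 0.8), "rg": (0.4, 0.8), "by": (0.0, 0.2),
    "o1": (0.2, 0.4), "o2": (0.2, 0.4), "o3": (0.3, 0.5), "o4": (0.2, 0.4),
}
background_means = {
    "l": 0.5, "rg": 0.1, "by": 0.1,
    "o1": 0.3, "o2": 0.3, "o3": 0.4 - 0.1, "o4": 0.3,
}
queue, normalized = key_feature_queue(target_ranges, background_means)
```

With these illustrative values the color group differs most from the background, so it comes first in the queue. The sketch assumes the features are already on comparable scales; otherwise the three distances would not be directly comparable.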

EXTRACTING VISUAL ATTENTION FEATURES OF UNKNOWN INTERFERENCE TARGET

When the interference target is unknown, top-down guidance of visual attention by feature channel filtering proceeds as follows:

(1) First, only the feature channel corresponding to the first feature in the queue (the primary key feature $f_1$ of the task target) is kept among the bottom-up visual attention channels, and a saliency map is generated from it.

(2) Select the n most salient regions and use the SIFT feature descriptor to determine whether the task target is present in these n regions. If it is, go to (6); otherwise, go to (3).

(3) Determine whether all feature channels have already been selected. If so, go to (5); otherwise, select the channel corresponding to the next key feature in the key feature queue of the task target and go to (4).

(4) Calculate the activation map of the selected channel and combine it with the activation maps of the previously calculated feature channels by weighting to generate a new saliency map (the weighting coefficients are given by the normalized differences between the task target and the image background for each feature), then go to (2).

(5) The target is not hit and the algorithm fails.

(6) The target is hit and the task is completed.
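The control flow of steps (1) to (6) can be sketched as a loop over the key feature queue. This is a simplified illustration, not the authors' implementation: all function names are hypothetical, a "region" is reduced to a single cell of the map, the per-channel activation maps (the activity diagrams of step (4)) are supplied by a caller-provided function, and the SIFT-based verification of step (2) is replaced by a caller-provided predicate.

```python
def weighted_saliency(activation_maps, weights):
    """Weighted sum of the activation maps computed so far (step 4)."""
    rows, cols = len(activation_maps[0]), len(activation_maps[0][0])
    return [
        [sum(w * a[r][c] for a, w in zip(activation_maps, weights))
         for c in range(cols)]
        for r in range(rows)
    ]


def top_regions(saliency, n):
    """Positions of the n most salient cells (step 2, simplified)."""
    cells = [(value, (r, c))
             for r, row in enumerate(saliency)
             for c, value in enumerate(row)]
    cells.sort(key=lambda item: item[0], reverse=True)
    return [position for _, position in cells[:n]]


def search_target(queue, weights, compute_activation, contains_target, n=1):
    """Add feature channels one at a time in key-feature order.

    Returns (found, channels_used, regions).
    """
    maps, used_weights = [], []
    for used, channel in enumerate(queue, start=1):   # steps (1) and (3)
        maps.append(compute_activation(channel))      # step (4)
        used_weights.append(weights[channel])
        saliency = weighted_saliency(maps, used_weights)
        regions = top_regions(saliency, n)            # step (2)
        if contains_target(regions):                  # stand-in for SIFT check
            return True, used, regions                # step (6)
    return False, len(queue), []                      # step (5)


# Made-up 2 x 2 activation maps; the target is assumed to sit at cell (1, 1).
activation = {
    "color": [[0.6, 0.1], [0.2, 0.5]],
    "intensity": [[0.0, 0.1], [0.1, 1.0]],
    "orientation": [[0.3, 0.3], [0.3, 0.3]],
}
weights = {"color": 0.625, "intensity": 0.25, "orientation": 0.125}
found, channels_used, regions = search_target(
    ["color", "intensity", "orientation"],
    weights,
    activation.__getitem__,
    lambda candidate_regions: (1, 1) in candidate_regions,
    n=1,
)
```

In this toy example the color channel alone points at a distractor, and the target is hit once the intensity channel is added, so the orientation channel is never computed. That early exit is the source of the time saving the method aims for.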

EXPERIMENT RESULT

This paper proposed a top-down visual attention method for the case of an unknown interference target, based on a constructed visual feature knowledge base. From the difference between the feature data of the task target and the mean feature data of the image, the key feature queue of the task target is calculated. Feature channels are then filtered according to how critical each feature is and whether the task target has been hit, which improves search efficiency.

ACKNOWLEDGEMENTS

Our work is supported by the science and technology development project of the Jilin Provincial Education Department, China (No. 2016310, No. 2016346).

Corresponding author: Hongchang Ke, [email protected].

REFERENCES

1. Itti, L., Koch, C., Niebur, E. 1998. “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11): 1254-1259.

2. Iva, B., Alexandre, B., Heinz, H., Farine, P. A. 2010. “Dynamic visual attention on the sphere,” Computer Vision and Image Understanding, 114: 100-110.

3. Li, J., Levine, M. D., An, X., Xu, X., and He, H. 2013. “Visual saliency based on scale-space analysis in the frequency domain,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35: 996-1010.

4. Rafael, B., Gomes., Bruno, M., Carvalho, D., Luiz, M., and Garcia, G. 2013. “Visual attention guided features selection with foveated images,” Neurocomputing, 120: 34-44.

5. Mahadevan, V., Vasconcelos, N. 2013. “Biologically inspired object tracking using center-surround saliency mechanisms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(3): 541-554.

6. Fahad, F., Elahi, G., Faouzi, A. C. 2015. “Neural networks based visual attention model for surveillance videos,” Neurocomputing, 149: 1348-1359.

7. Tsotsos, J. K. 2006. “Cognitive Vision Needs Attention to Link Sensing with Recognition,” Cognitive Vision Systems, LNCS, 3948: 25-35.