The Top-down Visual Attention Method with Known Interference Target


2016 International Conference on Electronic Information Technology and Intellectualization (ICEITI 2016) ISBN: 978-1-60595-364-9


Hui Wang*, Lijiao Tian and Hong Chang Ke

ABSTRACT

Simulating visual attention can improve the efficiency of intelligent applications by quickly locating the most salient regions in images or videos while ignoring redundant information. This paper infers the key features of a target from the task scenario, uses them top-down to filter the feature channels of bottom-up visual attention, and proposes a top-down visual attention method with a known interference target. While maintaining the hit rate of task targets, the proposed method reduces the time needed to generate the visual saliency map and improves the efficiency of attention-based target search.

INTRODUCTION

Visual attention is a psychological mechanism by which the human brain processes, selectively and intensively, the massive data it acquires at any one time. A computer model of visual attention can quickly locate the region of interest in an image or video and discard unnecessary information, which improves the efficiency of the related intelligent processing. Because of this capacity to prune massive data, computer models of visual attention have in recent years become one of the focuses of machine vision [1].

Hui Wang*, Lijiao Tian. College of Computer Science and Engineering, Changchun University of Technology, No. 2055 Yan'an Street, Changchun City, Jilin Province, 130012, China

In recent years, many scholars have proposed important visual attention models [2]. Itti proposed a saliency model that extracts the color, intensity and orientation features of the input image, applies multi-scale center-surround computation to simulate the receptive fields of cells and obtain local saliency, and generates the final saliency map, on which winner-take-all (WTA) and inhibition-of-return mechanisms guide the shifts of the focus of attention.

Rafael proposed pruning bottom-up feature extraction with a multi-scale adjustable fovea model to reduce processing time [3]. Zhong used SVR to predict the transition probabilities of a Markov chain from eye-tracking data and estimated the saliency map from the stationary distribution of that chain [4]. Mahadevan combined bottom-up center-surround saliency, the feature selection of feature-based attention and top-down target-detection saliency to track salient targets [5]. Han extracted object bank and other features from the input image and computed their entropy to guide image classification tasks [6]. Fahad used data from psychology and other disciplines, such as eye-tracking experiments, to train artificial neural networks that integrate weighted bottom-up, top-down and salient-motion visual cues for saliency computation [7].

This paper proposes a top-down visual attention method with a known interference target. By analyzing the visual feature data of each target and the task target of each learning stage, the key features that distinguish the task target from the image background are inferred and placed in a key feature queue. The feature channels are then filtered according to how critical each feature is and whether the task target has been hit, which shortens the generation time of the saliency map and improves the search efficiency for task targets.

GENERATING KEY FEATURES OF TASK OBJECT

Let the formal context of a specific task be $(U, A, I)$, where the object set is $U = \{x_1, x_2, \ldots, x_n\}$, and let $L(U, A, I)$ denote the set of all concepts of $(U, A, I)$. Let the task target be $x_i$, and let

$$(x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i}) \in L(U, A, I), \quad (x_j, l_j, rg_j, by_j, o_{1j}, o_{2j}, o_{3j}, o_{4j}) \in L(U, A, I)$$

be the features of task target $x_i$ and of target $x_j$, respectively, where $x$ is the name of the target concept and $l$, $rg$, $by$, $o_1$, $o_2$, $o_3$, $o_4$ are the features of intensity, RG, BY and the $0^\circ$, $45^\circ$, $90^\circ$ and $135^\circ$ orientations. The feature matrix of the RG channel is

$$rg = \begin{bmatrix} rg_{\min} & rg_{\max} \end{bmatrix} \tag{1}$$

where $rg_{\max}$ and $rg_{\min}$ are the maximum and minimum values of the RG feature of the target. The feature matrices of the other channels are defined similarly.

To obtain the key features of the task targets, the discernibility attribute matrix $\Lambda_{FC}$ is constructed, which is used to distinguish $(x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i})$ and $(x_j, l_j, rg_j, by_j, o_{1j}, o_{2j}, o_{3j}, o_{4j})$:

$$\Lambda_{FC} = \bigl\{ DIS_{FC}\bigl((x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i}), (x_j, l_j, rg_j, by_j, o_{1j}, o_{2j}, o_{3j}, o_{4j})\bigr) \;\big|\; (x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i}), (x_j, l_j, rg_j, by_j, o_{1j}, o_{2j}, o_{3j}, o_{4j}) \in L(U, A, I) \bigr\} \tag{2}$$

Here, $l$, $c$ and $o$ are the feature matrices of intensity, color and local orientation. Then

$$DIS_{FC}\bigl((x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i}), (x_j, l_j, rg_j, by_j, o_{1j}, o_{2j}, o_{3j}, o_{4j})\bigr) = \begin{bmatrix} E(l_i, l_j) & E(c_i, c_j) & E(o_i, o_j) \end{bmatrix} \tag{3}$$

Here, $E$ is the Euclidean distance between the features:

$$E(c_i, c_j) = E\bigl((rg_i, by_i)^T, (rg_j, by_j)^T\bigr) \tag{4}$$

$$E(o_i, o_j) = E\bigl((o_{1i}, o_{2i}, o_{3i}, o_{4i})^T, (o_{1j}, o_{2j}, o_{3j}, o_{4j})^T\bigr) \tag{5}$$

The feature differences in the discernibility matrix are normalized and sorted to obtain the key feature queue of the task targets, where the key feature of the task target $x_i$ is the feature with the maximum Euclidean distance from the interference targets.
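The ranking described above can be illustrated with a minimal sketch. It assumes a single interference target, features stored as [min, max] ranges already scaled to [0, 1], and normalization by dividing each distance by the sum of the three distances; the paper does not specify these details, and the function names and sample values are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def flatten(ranges):
    """Concatenate several [min, max] ranges into one flat vector."""
    return [value for r in ranges for value in r]


def discernibility_row(task, other):
    """Return E(l_i, l_j), E(c_i, c_j), E(o_i, o_j) for one pair of targets."""
    return {
        "intensity": euclidean(task["l"], other["l"]),
        "color": euclidean(flatten([task["rg"], task["by"]]),
                           flatten([other["rg"], other["by"]])),
        "orientation": euclidean(flatten(task["o"]), flatten(other["o"])),
    }


def key_feature_queue(task, other):
    """Normalize the distances (assumed: divide by their sum) and sort descending."""
    row = discernibility_row(task, other)
    total = sum(row.values())
    if total == 0:
        # The two targets are indistinguishable on every feature.
        return [(name, 0.0) for name in row]
    weights = {name: value / total for name, value in row.items()}
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature ranges for a task target and one interference target.
task_target = {
    "l": [0.4, 0.6], "rg": [0.7, 0.9], "by": [0.1, 0.2],
    "o": [[0.2, 0.3], [0.2, 0.3], [0.2, 0.3], [0.2, 0.3]],
}
interference_target = {
    "l": [0.4, 0.7], "rg": [0.1, 0.3], "by": [0.1, 0.2],
    "o": [[0.5, 0.6], [0.2, 0.3], [0.2, 0.3], [0.2, 0.3]],
}

queue = key_feature_queue(task_target, interference_target)
for name, weight in queue:
    print(f"{name}: {weight:.3f}")
```

With these sample values the color feature separates the two targets best, so it is placed first in the queue, followed by orientation and intensity.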

EXTRACTING VISUAL ATTENTION FEATURES OF KNOWN INTERFERENCE TARGET

Bottom-up visual attention feature extraction. In the feature vector extraction stage, the method of ref. [2] is applied to the input image to obtain the intensity channel feature map, the RG color channel feature map, the BY color channel feature map and the orientation channel feature maps $O(\theta)$, where $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$. The seven feature maps of the three channels are each filtered with a nine-layer Gaussian pyramid. Layers 2 and 3 of the pyramid are selected and normalized to obtain feature vectors at the same scale. The activity maps are then generated with the method of ref. [8].
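As an illustration of the extraction stage, the sketch below computes the intensity and opponent-color values of one pixel using the channel definitions of ref. [2], and builds a small Gaussian pyramid. The 5-tap binomial kernel, the border replication, the zero-based layer indexing and all function names are assumptions made for this example; the orientation maps and the normalization step are omitted.

```python
KERNEL = (1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16)  # assumed binomial kernel


def opponent_channels(r, g, b):
    """Return (intensity, RG, BY) for one pixel with r, g, b in [0, 1]."""
    intensity = (r + g + b) / 3.0
    # Broadly tuned color channels; negative responses are set to zero.
    red = max(r - (g + b) / 2.0, 0.0)
    green = max(g - (r + b) / 2.0, 0.0)
    blue = max(b - (r + g) / 2.0, 0.0)
    yellow = max((r + g) / 2.0 - abs(r - g) / 2.0 - b, 0.0)
    return intensity, red - green, blue - yellow


def blur_line(values):
    """Filter one row or column with KERNEL, replicating border samples."""
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def pyr_down(image):
    """Blur along rows, then along columns, then keep every second sample."""
    rows = [blur_line(row) for row in image]
    cols = [blur_line(list(col)) for col in zip(*rows)]
    blurred = [list(row) for row in zip(*cols)]
    return [row[::2] for row in blurred[::2]]


def gaussian_pyramid(image, levels):
    """Return `levels` layers; layer 0 is the input image itself."""
    pyramid = [image]
    for _ in range(levels - 1):
        pyramid.append(pyr_down(pyramid[-1]))
    return pyramid


# Toy example: a uniform 8x8 intensity map and a 4-layer pyramid.
flat_map = [[0.5] * 8 for _ in range(8)]
pyramid = gaussian_pyramid(flat_map, 4)
print([(len(layer), len(layer[0])) for layer in pyramid])
print(opponent_channels(1.0, 0.0, 0.0))
```

A nine-layer pyramid needs an input of at least 256 pixels per side to keep every layer larger than one pixel, which is why the toy example uses fewer layers.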


[...] calculated as a whole; instead, the key feature generation method is used to obtain the key feature queue of the task targets, which filters the feature channels of bottom-up visual attention and thereby improves its efficiency. Let the key feature queue be $f = (f_1, f_2, f_3)$, where $E(f_{1i}, f_{1j}) \geq E(f_{2i}, f_{2j}) \geq E(f_{3i}, f_{3j})$. With known interference targets, the top-down guidance based on feature channel filtering proceeds as follows:

(1) Compute only the bottom-up feature channel corresponding to the first feature in the queue (the primary key feature $f_1$ of the task targets) and generate the saliency map from it.

(2) Select the $n$ most salient regions and use the SIFT feature descriptor to determine whether the task target is present in them. If it is, go to (6); otherwise, go to (3).

(3) Determine whether all feature channels have already been selected. If so, go to (5); otherwise, select the channel corresponding to the next key feature in the queue and go to (4).

(4) Compute the activity map of the selected channel and combine it, by weighting, with the activity maps of the previously computed channels to generate the saliency map. The weighting coefficients are given by the normalized feature differences between the task target and the interference target. Go to (2).

(5) The target is not hit; the algorithm fails.

(6) The target is hit; the task is completed.

In the best case, the feature channel filtering method hits the target after computing a single feature channel; in the worst case, it computes all feature channels. The filtering rules therefore effectively shorten the generation time of the bottom-up saliency map and improve the efficiency of visual attention.
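The six steps and the best-case and worst-case behavior can be sketched as a loop over the key feature queue. In this sketch, `compute_activity` and `contains_target` are hypothetical callables standing in for the activity map computation of ref. [8] and the SIFT-based check of the $n$ most salient regions; the toy activity maps, weights and target positions are invented for illustration.

```python
def combine(weighted_maps):
    """Weighted average of activity maps; weights are assumed positive."""
    total = sum(weight for weight, _ in weighted_maps)
    first = weighted_maps[0][1]
    return [[sum(w * m[r][c] for w, m in weighted_maps) / total
             for c in range(len(first[0]))]
            for r in range(len(first))]


def top_regions(saliency, n):
    """Return the (row, col) positions of the n largest saliency values."""
    cells = [(value, (r, c))
             for r, row in enumerate(saliency)
             for c, value in enumerate(row)]
    cells.sort(key=lambda cell: cell[0], reverse=True)
    return [position for _, position in cells[:n]]


def search(queue, compute_activity, contains_target):
    """Add one channel at a time until the target is hit or the queue ends.

    queue: list of (channel_name, weight), most discriminative channel first.
    Returns (hit, names_of_channels_computed).
    """
    used, names = [], []
    for name, weight in queue:                        # steps (1), (3)
        used.append((weight, compute_activity(name)))  # step (4)
        names.append(name)
        if contains_target(combine(used)):            # step (2)
            return True, names                        # step (6)
    return False, names                               # step (5)


def make_detector(target_cell, n):
    """Toy stand-in for SIFT matching: is the known cell among the top n?"""
    def contains_target(saliency):
        return target_cell in top_regions(saliency, n)
    return contains_target


ACTIVITY = {
    "color": [[0.1, 0.5, 0.2], [0.1, 0.1, 0.4]],
    "orientation": [[0.0, 0.1, 0.0], [0.0, 0.0, 1.0]],
    "intensity": [[0.2, 0.2, 0.2], [0.2, 0.2, 0.2]],
}
QUEUE = [("color", 0.6), ("orientation", 0.3), ("intensity", 0.1)]

hit, channels = search(QUEUE, ACTIVITY.__getitem__, make_detector((1, 2), 1))
print(hit, channels)
```

In this toy run the color channel alone points at a distractor, and adding the orientation channel moves the saliency peak onto the target, so the intensity channel is never computed.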

EXPERIMENTAL RESULTS


TABLE I. EXPERIMENTAL RESULT.

                              Original algorithm    Proposed algorithm
Average computing time (s)    9.284                 4.073

According to Table I, the proposed algorithm reduces the average computing time from 9.284 s to 4.073 s, a reduction of about 56%.

ACKNOWLEDGEMENTS

This work is supported by the Science and Technology Development Project of the Jilin Provincial Education Department, China (No. 2016346, No. 2016310).

REFERENCES

1. Borji, A., Itti, L. 2013. “State-of-the-art in Visual Attention Modeling,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(1): 185-207.

2. Itti, L., Koch, C., Niebur, E. 1998. “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11): 1254-1259.

3. Rafael, B., Gomes., Bruno, M., Carvalho, D., Luiz, M., and Garcia, G. 2013. “Visual attention guided features selection with foveated images,” Neurocomputing, 120: 34-44.

4. Zhong, M., Zhao, X., B., Zou, X., C., James, Z. and Wang, W., H. 2014. “Markov chain based computational visual attention model that learns from eye tracking data,” Pattern Recognition Letters, 49(1): 1-10.

5. Mahadevan, V., Vasconcelos, N. 2013. “Biologically inspired object tracking using center-surround saliency mechanisms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(3): 541-554.

6. Han, J., W., Wang, D., Y., Shao, L., Qian, X., L., and Han, J., G. 2014. “Image visual attention computation and application via the learning of object attributes,” Machine Vision and Applications, 25(7): 1671-1683.

7. Fahad, F., Elahi G., Faouzi A., C. 2015. “Neural networks based visual attention model for surveillance videos,” Neurocomputing, 149: 1348-1359.

8. Zhang X., L., Zhao H., W., Wang H., Dai J., B. 2010. “Extracting Attention Information Algorithm Based on Contrast Sensitivity and Markov Chain,” Acta Electronica Sinica, 38(2A):