2016 International Conference on Electronic Information Technology and Intellectualization (ICEITI 2016) ISBN: 978-1-60595-364-9
Research on the Top-down Visual Attention Method with Unknown Interference Target
Hongchang Ke, Hui Wang and Degang Kong
ABSTRACT
Visual attention preferentially allocates limited processing resources to several salient regions of an image, which increases the efficiency of related intelligent applications. Based on the differences between the task target features and the image background features, the key features of the task target can be inferred, and the feature channels of visual attention can be filtered in a top-down manner. A top-down visual attention method with an unknown interference target is proposed. While preserving the target hit rate as far as possible, the method aims to shorten the generation time of the visual attention saliency map and thereby improve the computing efficiency of visual attention.
INTRODUCTION
Visual attention preferentially allocates limited processing resources to several salient regions of an image, which increases the efficiency of image applications such as target searching and target tracking. In recent years, visual attention has gradually become a hot research topic.
Several representative models exist in the visual attention field. Itti et al. proposed a saliency model that extracts the color, intensity, and orientation features of the input image, uses multi-scale center-surround computation to simulate the cell receptive field and obtain local saliency, and generates a final saliency map in which winner-take-all (WTA) and inhibition-of-return mechanisms guide the shift of the focus of attention [1]. Iva et al. combined static features (the intensity, color, and orientation of the input image) with dynamic features to extract the salient regions of a panoramic image sequence in a spherical coordinate system [2].
Hongchang Ke, Degang Kong. School of Computer Technology and Engineering, Changchun Institute of Technology, No. 2494 Hongqi Street, Changchun City, Jilin Province, 130012, China
Li et al. proposed a hypercomplex Fourier transform method for extracting salient features, which applies low-pass filtering to the amplitude spectrum to highlight the salient signal [3]. Rafael et al. proposed a multi-scale adjustable fovea model that cuts down bottom-up feature extraction to reduce processing time [4]. Mahadevan et al. used bottom-up center-surround saliency information, the feature selection of feature-based attention, and top-down saliency for target detection to track the salient target [5]. Fahad et al. drew on data from psychology and other disciplines, such as eye-tracking experiments, to train artificial neural networks that integrate weighted bottom-up, top-down, and salient motion visual cues for saliency computation [6]. Tsotsos et al. proposed a selective visual attention model in which spatial selection is realized top-down by inhibiting irrelevant visual connections in a pyramid [7]. Harel et al. introduced Markov chains into the calculation of local saliency: the stationary distribution of a Markov chain defined on the feature vector is used as the saliency of the activation map [8].
This paper proposes a top-down visual attention method for the case of an unknown interference target. From the difference between the feature data of the task target and the mean feature data of the image, the key features that distinguish the task target from the image background are inferred and placed in a key features queue. The feature channels are then filtered according to how critical each feature is and whether the task target has been hit, which improves search efficiency.
GENERATING KEY FEATURES OF TASK OBJECT WITH UNKNOWN INTERFERENCE TARGET
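Let the formal context of a specific task be $(U, A, I)$, where the object set is $U = \{x_1, x_2, \ldots, x_n\}$, and let $L(U, A, I)$ denote the set of all concepts of $(U, A, I)$. Let the task target be $x_i$, with $(x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i}) \in L(U, A, I)$ giving the features of the task target $x_i$, where $x$ stands for the name of the target concept and $l, rg, by, o_1, o_2, o_3, o_4$ are respectively the intensity, RG, BY, and 0°, 45°, 90°, 135° local orientation features. Each target feature is a matrix; for the RG feature,

$$rg = [\, rg_{\min} \;\; rg_{\max} \,] \tag{1}$$

where $rg_{\max}$ and $rg_{\min}$ are the maximum and minimum values of the RG feature of the target. The feature matrices of the other target features are defined similarly.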
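Let the size of the image be $M \times N$ and let $l_{ij}$ be the intensity at point $(i, j)$. The mean intensity $\bar{l}$ is

$$\bar{l} = \frac{\sum_{i=1}^{M} \sum_{j=1}^{N} l_{ij}}{M \times N} \tag{2}$$

where $1 \le i \le M$ and $1 \le j \le N$.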
The means of the RG and BY color features and of the 0°, 45°, 90°, and 135° local orientation features of the image are defined similarly.
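As a minimal sketch of equation (2) and its analogues for the other channels, the Python fragment below averages each feature map over all M × N points. The channel names, the list-of-rows storage, and the toy 2 × 2 maps are illustrative assumptions; the paper does not specify how the feature maps are represented.

```python
from typing import Dict, List

FeatureMap = List[List[float]]  # M rows by N columns (assumed storage format)


def feature_mean(feature_map: FeatureMap) -> float:
    """Mean of an M x N feature map, following equation (2)."""
    m = len(feature_map)
    n = len(feature_map[0])
    total = sum(sum(row) for row in feature_map)
    return total / (m * n)


def background_features(maps: Dict[str, FeatureMap]) -> Dict[str, float]:
    """Apply the same mean to every channel (l, rg, by, o1..o4)."""
    return {name: feature_mean(fm) for name, fm in maps.items()}


# Toy data: two channels of a 2 x 2 image (values are made up).
maps = {
    "l": [[0.2, 0.4], [0.6, 0.8]],
    "rg": [[0.1, 0.1], [0.3, 0.3]],
}
background = background_features(maps)
```

The resulting dictionary plays the role of the background feature vector: one mean value per channel.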
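The feature information of the background image $y$ is $(y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)$, where $\bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4$ are respectively the means of the intensity, RG, BY, and 0°, 45°, 90°, 135° local orientation features.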
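To obtain the key feature matrix of the task target, the distance between the target features $(x_i, l_i, rg_i, by_i, o_{1i}, o_{2i}, o_{3i}, o_{4i})$ and the background features $(y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)$ must be calculated. Because the RG feature of the target, $rg_i = [\, rg_{\min i} \;\; rg_{\max i} \,]$, is a matrix, it is replaced by its average $\overline{rg}_i$:

$$\overline{rg}_i = \frac{rg_{\min i} + rg_{\max i}}{2} \tag{3}$$

where $rg_{\max i}$ and $rg_{\min i}$ are the maximum and minimum of the RG information of the target. The averages of the other target feature matrices are defined similarly.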
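$(x_i, \bar{l}_i, \overline{rg}_i, \overline{by}_i, \bar{o}_{1i}, \bar{o}_{2i}, \bar{o}_{3i}, \bar{o}_{4i})$ is the transformed feature information of the task target $x_i$. The distance between it and the background feature information $(y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)$ is

$$DISFC\big((x_i, \bar{l}_i, \overline{rg}_i, \overline{by}_i, \bar{o}_{1i}, \bar{o}_{2i}, \bar{o}_{3i}, \bar{o}_{4i}),\, (y, \bar{l}, \overline{rg}, \overline{by}, \bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)\big) = [\, E(\bar{l}_i, \bar{l}) \;\; E(\bar{c}_i, \bar{c}) \;\; E(\bar{o}_i, \bar{o}) \,] \tag{4}$$

$$E(\bar{c}_i, \bar{c}) = E\big((\overline{rg}_i, \overline{by}_i)^T,\, (\overline{rg}, \overline{by})^T\big) \tag{5}$$

$$E(\bar{o}_i, \bar{o}) = E\big((\bar{o}_{1i}, \bar{o}_{2i}, \bar{o}_{3i}, \bar{o}_{4i})^T,\, (\bar{o}_1, \bar{o}_2, \bar{o}_3, \bar{o}_4)^T\big) \tag{6}$$

where $E(\cdot, \cdot)$ is the Euclidean distance.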
The feature differences in the recognizable attribute matrix can be normalized and then sorted to obtain the key features queue of the task target, $f = (f_1, f_2, f_3)$, where

$$E(\bar{f}_{1i}, \bar{f}_1) \ge E(\bar{f}_{2i}, \bar{f}_2) \ge E(\bar{f}_{3i}, \bar{f}_3) \tag{7}$$
The primary key feature of the task target $x_i$ is $f_1$, the feature with the largest Euclidean distance from the background image features.
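The construction of the key features queue can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: each target feature is given as a (min, max) range, the distances are normalized by dividing by their sum (the paper does not state which normalization is used), and the names `GROUPS`, `midpoint`, and `key_feature_queue` as well as all numeric values are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Range = Tuple[float, float]  # (min, max) of a target feature, as in equation (1)

# Feature groups compared in equations (4)-(6): intensity, color, orientation.
GROUPS: Dict[str, List[str]] = {
    "intensity": ["l"],
    "color": ["rg", "by"],
    "orientation": ["o1", "o2", "o3", "o4"],
}


def midpoint(r: Range) -> float:
    """Equation (3): average of the [min, max] range."""
    return (r[0] + r[1]) / 2


def key_feature_queue(
    target: Dict[str, Range], background: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Return (feature group, normalized distance) pairs, largest first."""
    distances: Dict[str, float] = {}
    for group, channels in GROUPS.items():
        t = [midpoint(target[c]) for c in channels]
        b = [background[c] for c in channels]
        distances[group] = math.dist(t, b)  # Euclidean distance E(., .)
    total = sum(distances.values())
    # Assumed normalization: divide by the sum so the weights add up to 1.
    normalized = {g: (d / total if total > 0 else 0.0) for g, d in distances.items()}
    return sorted(normalized.items(), key=lambda kv: kv[1], reverse=True)


# Made-up example: the target differs from the background mainly in RG color.
target = {
    "l": (0.4, 0.6), "rg": (0.7, 0.9), "by": (0.1, 0.3),
    "o1": (0.2, 0.4), "o2": (0.2, 0.4), "o3": (0.2, 0.4), "o4": (0.2, 0.4),
}
background = {"l": 0.4, "rg": 0.2, "by": 0.2, "o1": 0.3, "o2": 0.3, "o3": 0.3, "o4": 0.3}
queue = key_feature_queue(target, background)
```

In this toy case the color group heads the queue, so it would take the role of $f_1$, followed by intensity and then orientation.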
EXTRACTING VISUAL ATTENTION FEATURES OF UNKNOWN INTERFERENCE TARGET
When the interference target is unknown, top-down visual attention guidance based on feature channel filtering proceeds as follows:
(1) First, among the bottom-up visual attention feature channels, keep only the channel corresponding to the first feature in the queue (the primary key feature $f_1$ of the task target), and use this channel to generate the saliency map.
(2) Select the n most salient regions and use the SIFT feature descriptor to determine whether the task target exists in these n regions. If it does, go to step (6); otherwise, go to step (3).
(3) Determine whether all feature channels have already been selected. If so, go to step (5); otherwise, select the channel corresponding to the next key feature in the key features queue of the task target and go to step (4).
(4) Calculate the activation map of the newly selected channel and combine it with the previously calculated channel activation maps in a weighted sum to generate the saliency map, where each weighting coefficient is the normalized difference between the task target and the image background on that feature. Go to step (2).
(5) The target is not hit and the algorithm fails.
(6) The target is hit and the task is completed.
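The control flow of steps (1) to (6) can be sketched in Python as below. The sketch makes several simplifying assumptions that are not in the paper: a "region" is reduced to a single map cell, the per-channel map computation and the SIFT-based check are passed in as hypothetical callables (`channel_map` and `contains_target`), and all names and numeric values are illustrative only.

```python
from typing import Callable, List, Optional, Sequence, Tuple

SaliencyMap = List[List[float]]
Region = Tuple[int, int]  # (row, col) of a salient cell; a simplification


def weighted_sum(maps: Sequence[SaliencyMap], weights: Sequence[float]) -> SaliencyMap:
    """Combine per-channel maps into one saliency map (step 4)."""
    rows, cols = len(maps[0]), len(maps[0][0])
    return [
        [sum(w * m[r][c] for m, w in zip(maps, weights)) for c in range(cols)]
        for r in range(rows)
    ]


def top_regions(saliency: SaliencyMap, n: int) -> List[Region]:
    """Return the n most salient cells, most salient first (step 2)."""
    cells = [(v, (r, c)) for r, row in enumerate(saliency) for c, v in enumerate(row)]
    cells.sort(key=lambda item: item[0], reverse=True)
    return [pos for _, pos in cells[:n]]


def guided_search(
    queue: Sequence[Tuple[str, float]],
    channel_map: Callable[[str], SaliencyMap],
    contains_target: Callable[[Region], bool],
    n: int,
) -> Optional[Region]:
    maps: List[SaliencyMap] = []
    weights: List[float] = []
    for feature, weight in queue:              # steps (1) and (3): next key feature
        maps.append(channel_map(feature))      # step (4): map of the new channel
        weights.append(weight)
        saliency = weighted_sum(maps, weights)
        for region in top_regions(saliency, n):  # step (2): check n best regions
            if contains_target(region):        # stand-in for the SIFT check
                return region                  # step (6): target hit
    return None                                # step (5): all channels used, no hit


# Made-up example: the target sits at cell (1, 1) of a 2 x 2 image.
channel_maps = {
    "color": [[0.5, 0.1], [0.0, 0.4]],
    "intensity": [[0.0, 0.0], [0.0, 1.0]],
}
queue = [("color", 0.6), ("intensity", 0.4)]
hit = guided_search(queue, lambda f: channel_maps[f], lambda reg: reg == (1, 1), n=1)
```

Here the first channel alone points at the wrong cell, and the target is found only after the second channel is added, which is the situation the channel-by-channel filtering is meant to handle cheaply when the first channel already suffices.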
EXPERIMENT RESULT
This paper proposed a top-down visual attention method for the case of an unknown interference target, based on a constructed visual feature knowledge base. From the difference between the feature data of the task target and the mean feature data of the image, the key features queue of the task target can be calculated. The feature channels are then filtered according to how critical each feature is and whether the task target has been hit, which can improve search efficiency.
ACKNOWLEDGEMENTS
Our work is supported by the science and technology development project of the Jilin Provincial Education Department, China (No. 2016310, No. 2016346).
Corresponding author: Hongchang Ke, [email protected].
REFERENCES
1. Itti, L., Koch, C., and Niebur, E. 1998. “A Model of Saliency-Based Visual Attention for Rapid Scene Analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(11): 1254-1259.
2. Iva, B., Alexandre, B., Heinz, H., and Farine, P. A. 2010. “Dynamic visual attention on the sphere,” Computer Vision and Image Understanding, 114: 100-110.
3. Li, J., Levine, M. D., An, X., Xu, X., and He, H. 2013. “Visual saliency based on scale-space analysis in the frequency domain,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35: 996-1010.
4. Rafael, B., Gomes, Bruno, M., Carvalho, D., Luiz, M., and Garcia, G. 2013. “Visual attention guided features selection with foveated images,” Neurocomputing, 120: 34-44.
5. Mahadevan, V., and Vasconcelos, N. 2013. “Biologically inspired object tracking using center-surround saliency mechanisms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(3): 541-554.
6. Fahad, F., Elahi, G., and Faouzi, A. C. 2015. “Neural networks based visual attention model for surveillance videos,” Neurocomputing, 149: 1348-1359.
7. Tsotsos, A., John K. 2006. “Cognitive Vision Needs Attention to Link Sensing with Recognition,” Cognitive Vision Systems, LNCS, 3948: 25-35.