Histogram of Oriented Gradients - Optimization of common computer vision algorithms : beating O

Histogram of oriented gradients (HoG) is a commonly used feature for object detection. It describes the shape of an object as a distribution of its intensity gradients. The image is divided into small spatial regions called cells each containing n_×n pixels. For each cell a histogram of its pixels gradient directions is computed and normalized to make it

more robust to illumination and shadowing. Theﬁnal descriptor is the concatenation of the normalized histograms of each block (group of cells).

First, the horizontal∂x(x, y) and vertical∂x(x, y) image gradients are computed. To do

so, thederivative masks is applied over the image in the vertical and horizontal direction.

[₋1,0,1] and [₋1,0,1]� (5.5)

Second, for each pixel in the image its gradient magnitude and orientation are computed.

m(xi, yi) = � ∂x(xi, yi)2+∂y(xi, yi)2 (5.6) Θ(xi, yi) =arctan � ∂y(xi, yi) ∂x(xi, yi) � (5.7) Then each cell histogram is generated. Histograms x coordinate represent the angle of the gradients which can be unsigned ([0◦,180◦]) or signed ([0◦,360◦])s). Each pixel within the cell falls into a bin of the histogram based on its gradient orientation and casts a weighted vote for that orientation equal to its gradient magnitude.

To improve robustness against illumination and shadow changes a normalization factor is applied to large spatial regions. In other words, normalization is applied over groups of cells named blocks which overlap so each cell contributes to multiple blocks. There are diﬀerent possible block normalization techniques, the one used consists on computing the L2-norm (Equation 5.8) and then clipping the values and re-normalize,

�

�h_�2₂+c2

(5.8)

wherev represents the histogram andc a constant. The ﬁnal descriptor is the concatenation of the normalized histograms.

Temporary planning

In this Chapter the deﬁnition of tasks and their temporal scheduling plan are deﬁned.

6.1 Tasks identiﬁcation

The development of this project will be performed usingAgilemethodologies. Hence, the project is divided into a set of tasks (iterations) each one divided in analysis, implementation, testing, integration and documentation. The application of agile methodologies will allow a more close up feedback relationship with the project supervisors as well as a faster response against obstacles and incidents.

6.1.1 State of the art and learning

In this first task an evaluation of the current state of the art will be analyzed. The objective of this task is to identify the most promising state of the art training algorithms, features and classifiers which do/could fulfill the established detector constraints. Once identified, study all the necessary theory in order to understand and be able to implement the method.

6.1.2 Debugging dataset generation

Search and ﬁlter an on line frontal face dataset to be used for initial debugging of the classiﬁer, features and learning algorithm.

6.1.3 Learning classiﬁer

Once decided in the State of the art this task states the classifier to be used. Its implementation must be highly modular and reusable because different features and learning algorithm would be tested on it. Also, it is not desired to implement it with low level optimizations as multiple version and changes could be made in future iterations. The classifier must be as transparent as possible. To achieve this transparency a set of scripts to plot the current state of the classifier will be implemented.

6.1.4 Features

Implement, test and evaluate all identiﬁed features that could be used. All features must have the same interface for modular and reusable purposes. In this way, changing between features does not aﬀect any of the other modules.

6.1.5 Learning algorithm

Implementation and testing of the learning algorithm to train the classiﬁer. Learning algorithms tend to be a black box. It is desired to make this step as transparent as possible. To do so, a set of scripts to plot the state of each step of the learning process must be implemented.

6.1.6 Final dataset generation

Once the classifier is properly being trained, a rich, wellfiltered dataset will be generated. It is intended to implement a script to scratch the web, downloading face pictures and then implementing a program to fast label each sample. The labeling will consist of defining the face bounding box and position configuration.

6.1.7 Classiﬁer training and validation

In this step the classiﬁer will have to be implemented by cross-validating all the critical parameters.

6.1.8 Classiﬁer implementation optimization

Once the classiﬁer and the features to be used have been decided, highly optimize the code. Run a code proﬁler to identify the most critical parts of the code an apply modern low-level compiler optimizations.

6.1.9 Final documentation

At each iteration the documentation of that task will be generated. Hence, in thisﬁnal iteration onlyﬁnal polishes over the documentation will be made.

In document Optimization of common computer vision algorithms : beating OpenCV face detector (Page 42-46)