4.3 The DCS Algorithm
4.3.5 The Objective Function
The design of the objective function is crucial to the DCS algorithm. Not only does it determine which candidates are kept in HM and ultimately determine the optimum, but it also guides the swap operation by determining which components should be swapped. Its design is significantly influenced by the way image features are compared and the expected accuracy with which positive matches can be identified using only these comparisons. The function is designed to give more or less weight to feature comparisons depending on the method used.
A good correspondence set not only contains strong feature matches but must also be geometrically consistent. To measure this, we make the static world assumption: all observed features are static, and all perceived feature movement between images is due to camera movement only. Since only the camera is moving, all perceived feature movement must be in the same direction. When the perceived feature movement between images in a correspondence set is highly diverse, either the scene contains real moving objects or the correspondence set is incorrect.
Feature movement is calculated from the 3D coordinates of the features in the images. These are obtained independently, either from other sensors such as laser scanners or by using multiple view geometry [108]. The local 3D coordinates of each feature in one image are subtracted from the 3D coordinates of the corresponding feature in the other image. This gives the movement vector for that feature under the correspondence set being tested. An illustration of this process is given in Figure 4.5.
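The subtraction described above can be sketched in a few lines. This is only an illustrative sketch: the function name `motion_vectors`, the use of `-1` as a stand-in for φ, and the sign convention (alternate minus reference) are assumptions made for the example and are not fixed by the text.

```python
import numpy as np

PHI = -1  # hypothetical stand-in for the "unmatched" value (phi)


def motion_vectors(ref_xyz, alt_xyz, correspondence):
    """Return the matched reference indices and their motion vectors.

    ref_xyz: (m, 3) local 3D coordinates of the reference-image features.
    alt_xyz: (n, 3) local 3D coordinates of the alternate-image features.
    correspondence: length-m sequence; entry i is the index of the
        alternate feature paired with reference feature i, or PHI.
    """
    ref_xyz = np.asarray(ref_xyz, dtype=float)
    alt_xyz = np.asarray(alt_xyz, dtype=float)
    corr = np.asarray(correspondence, dtype=int)

    # Only matched entries produce a motion vector.
    matched = np.flatnonzero(corr != PHI)

    # Assumed sign convention: position in alternate image minus
    # position in reference image.
    vectors = alt_xyz[corr[matched]] - ref_xyz[matched]
    return matched, vectors
```

With three features of which two move straight back and one moves diagonally, as in the Figure 4.5 scenario, the third vector visibly differs from the other two.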
The mean and variance of the motion vectors over all features are then calculated. The mean is used to determine which feature correspondences contribute most to lowering the fitness of the correspondence set. The movement that deviates most from the mean corresponds to the worst feature correspondence and represents the component that would likely benefit most from a swap operation.
Figure 4.5: Perceived feature movement due to camera movement is measured using motion vectors. At time T the camera observes three features. Then it moves forward and observes the same features. From this position it appears that two of the features moved backwards while one moved diagonally down. The most likely interpretation is that two of the correspondences are correct while the one that moved differently from the others is incorrect.
Notice, however, that due to perspective geometry, objects that are farther from the camera appear to move less than objects that are close by. To cancel this effect, the motion vectors are normalised before being used in the weight calculation. The mean and variance over all features are then calculated from the normalised vectors.
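A minimal sketch of the normalisation and the statistics derived from it is given below. The text does not specify the normalisation or how a scalar variance is obtained from 3D vectors, so two assumptions are made here: each vector is scaled to unit length, and the variance is taken as the mean squared distance of the unit vectors from their mean. The name `movement_statistics` is hypothetical.

```python
import numpy as np


def movement_statistics(vectors, eps=1e-12):
    """Return (mean, variance, worst_index) for a set of motion vectors.

    Assumptions (not stated in the text): vectors are normalised to
    unit length, and the scalar variance is the mean squared distance
    of the normalised vectors from their mean.
    """
    v = np.asarray(vectors, dtype=float)

    # Normalise to unit length so that distant features, which appear
    # to move less, are weighted like nearby ones.
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    unit = v / np.maximum(norms, eps)

    mean = unit.mean(axis=0)
    sq_dev = np.sum((unit - mean) ** 2, axis=1)

    variance = float(sq_dev.mean())
    # The vector deviating most from the mean marks the candidate
    # component for a swap operation.
    worst_index = int(np.argmax(sq_dev))
    return mean, variance, worst_index
```

Note that `worst_index` refers to a row of `vectors`; if unmatched entries were filtered out beforehand, it has to be mapped back to the index in the correspondence set.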
The variance is used to calculate the fitness weight. A small variance indicates a low diversity in movement vectors and leads to a high fitness. A large variance indicates inconsistent feature movement and leads to a low fitness. The variance term is added to a feature match quality term that depends on the choice of feature representation. In this case it is simply the SSD measure between corresponding features, which was introduced in Section 4.2.2.
In order to discourage DCS from generating correspondence sets with too many φ values, a starvation penalty term is added to the weight to ensure that a sufficient number of good correspondences are found. The penalty is designed so that the fitness is severely penalised when the number of correspondences falls below a certain threshold, but makes only a small contribution when plenty of correspondences are found. The threshold depends on the application for which the correspondence set is used. For example, many good correspondences are needed for accurate visual navigation, and the fitness should be severely lowered if there are fewer than the required minimum number of correspondences.
The equation that combines these terms and calculates the fitness is as follows.
$$f(C_{RA}) = -\left( \sum_{i=1}^{m} C(C_{RA}[i], F_R[i]) + \alpha\sigma + \beta e^{-c\tau} \right), \tag{4.4}$$
where $\sum_{i=1}^{m} C(C_{RA}[i], F_R[i])$ is the SSD measure over all feature representations. The two weights, $\alpha$ and $\beta$, determine the relative importance of the three terms in the final score, while $\sigma$ is the motion vector variance. The final term is the starvation penalty, modelled as an exponential function with $\tau$ controlling the shape and $c$ the number of correspondences in the set.
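As a rough illustration of how the three terms of Equation 4.4 combine, the sketch below evaluates the fitness from precomputed inputs. The function name, the argument layout, and the default values of α, β and τ are assumptions made for the example; the text does not give values for these parameters, nor does it say how φ entries enter the SSD sum, so the caller is assumed to pass SSD values for matched entries only.

```python
import math


def fitness(ssd_costs, sigma, c, alpha=1.0, beta=1.0, tau=1.0):
    """Evaluate the fitness of Equation 4.4 from precomputed terms.

    ssd_costs: SSD values of the matched correspondences (assumed to
        exclude unmatched entries).
    sigma: variance of the normalised motion vectors.
    c: number of correspondences in the set.
    alpha, beta, tau: weights and penalty shape; the defaults are
        placeholders, not values from the text.
    """
    match_quality = sum(ssd_costs)
    consistency = alpha * sigma
    # Starvation penalty: close to beta when c is small and decaying
    # towards zero as more correspondences are found.
    starvation = beta * math.exp(-c * tau)
    return -(match_quality + consistency + starvation)
```

Because the whole sum is negated, a better correspondence set has a fitness closer to zero. With all other terms held fixed, increasing `c` shrinks the penalty and raises the fitness.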
4.3.6 Initialisation
Before optimisation using DCS starts, the HM is initialised with two copies of the prior correspondence set estimate. The remaining vectors are generated randomly. The generation of the prior estimate is very important to the efficiency of the optimiser, as the prior usually becomes the initial best solution and therefore has a large influence on the direction of the search. This step should also be as fast as possible, so that the main loop of the optimiser can run the maximum number of iterations when computational resources are limited.
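The initialisation step can be sketched as follows. The text only states that the remaining vectors are generated randomly, so the random scheme used here is an assumption: each entry is either left unmatched with probability `p_unmatched` or assigned a not yet used alternate feature. The names `initialise_hm`, `random_correspondence`, `p_unmatched`, and the use of `-1` for φ are all hypothetical.

```python
import random

PHI = -1  # hypothetical stand-in for the "unmatched" value (phi)


def random_correspondence(m, n_alt, p_unmatched, rng):
    """Draw a random correspondence vector without duplicate matches."""
    available = list(range(n_alt))
    rng.shuffle(available)
    vector = []
    for _ in range(m):
        if available and rng.random() >= p_unmatched:
            vector.append(available.pop())
        else:
            vector.append(PHI)
    return vector


def initialise_hm(prior, n_alt, hm_size, p_unmatched=0.2, seed=None):
    """Fill the HM with two copies of the prior plus random vectors."""
    if hm_size < 2:
        raise ValueError("hm_size must be at least 2 to hold both prior copies")
    rng = random.Random(seed)
    # Two independent copies of the prior estimate.
    hm = [list(prior), list(prior)]
    # The remaining slots are filled with random correspondence sets.
    while len(hm) < hm_size:
        hm.append(random_correspondence(len(prior), n_alt, p_unmatched, rng))
    return hm
```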
In the DCS algorithm, image features are modelled using the SURF descriptors [99] mentioned in Section 4.2.1. In order to quickly build an initial estimate for the correspondence set, a fast but naive approach is used to match SURF descriptors. Each feature in the reference image is considered independently and compared with all the remaining unmatched features in the alternate image using SSD. The match with the lowest SSD measure is accepted as the correspondence if its SSD is low enough that a true match is likely. If none of the SSD measures is sufficiently low, the feature is considered unmatched and φ is inserted into the prior set at that index. This process continues until all the indices in the prior set have been set.
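The greedy matching described above might look like the following sketch. It takes plain descriptor arrays rather than calling any SURF implementation, and the names `greedy_prior` and `max_ssd` (the acceptance threshold), as well as the use of `-1` for φ, are assumptions introduced for the example.

```python
import numpy as np

PHI = -1  # hypothetical stand-in for the "unmatched" value (phi)


def greedy_prior(ref_desc, alt_desc, max_ssd):
    """Build a prior correspondence set by greedy SSD matching.

    ref_desc: (m, d) descriptors of the reference-image features.
    alt_desc: (n, d) descriptors of the alternate-image features.
    max_ssd: largest SSD still accepted as a likely true match.
    """
    ref_desc = np.asarray(ref_desc, dtype=float)
    alt_desc = np.asarray(alt_desc, dtype=float)
    unmatched = np.ones(len(alt_desc), dtype=bool)
    prior = []

    # The reference features are visited once, in order.
    for descriptor in ref_desc:
        candidates = np.flatnonzero(unmatched)
        if candidates.size == 0:
            prior.append(PHI)
            continue
        # SSD against every alternate feature that is still unmatched.
        ssd = np.sum((alt_desc[candidates] - descriptor) ** 2, axis=1)
        best = int(np.argmin(ssd))
        if ssd[best] <= max_ssd:
            chosen = int(candidates[best])
            prior.append(chosen)
            unmatched[chosen] = False  # no longer available to later features
        else:
            prior.append(PHI)
    return prior
```

Calling `greedy_prior([[0.4, 0.0], [0.0, 0.0]], [[0.0, 0.0]], max_ssd=0.5)` returns `[0, -1]`: the first reference feature claims the only alternate feature even though the second one matches it exactly, which shows the effect of processing order.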
This process generates a very rough estimate of the correct correspondence set, as geometric consistency is completely ignored. Errors are also introduced when a possibly correct match is eliminated because the alternate feature was already incorrectly matched to an earlier feature in the reference image. In other words, the first few features in the reference image have more options to match than those that are processed later. This biases the result in favour of the features processed first and leads to incorrect matches for many of the features that are processed last.
Even though the prior is far from perfect, it is good enough to initialise the HM and gives the search process a good starting point. This method of initialisation is also fast, since the list of features in the reference image is iterated only once.