Graph Cut Methods - Finding the Correspondence Set

4.2.2 Finding the Correspondence Set

4.2.2.2 Graph Cut Methods

Graph cut methods are inspired by the discovery that energy minimisation (or the optimisation of some objective function) can be performed efficiently by modelling the problem as a set of edges and vertices arranged in a graph. Typically, the pixels in the image are represented by vertices in the graph, with neighbouring pixels connected by edges called neighbour links. Special terminal vertices, which represent pixel labels, are also added to the graph and are connected to the pixel vertices by special edges called terminal links.

Each edge has a weight associated with it that is set by the energy function. When the problem is correctly modelled as a graph, combinatorial optimisation theory shows that the optimal labelling, that is, the one that minimises the energy function, is represented by the minimal graph cut [112]. The minimal graph cut is a partition of the graph for which the sum of the weights of the edges severed by the partition is minimised [113]. An example of a pixel labelling problem modelled as a graph is shown in Figure 4.3.
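The construction described above can be illustrated for the simplest case of two labels, where the minimal cut can be found exactly with a maximum-flow computation. The sketch below is an illustration only: the Edmonds-Karp routine is a stand-in and not necessarily the algorithm used in [112], and the function names, the four-pixel scanline, the data costs and the value of K are all hypothetical.

```python
from collections import deque


def min_cut(capacity, source, sink):
    """Edmonds-Karp max-flow. Returns (cut value, vertices on the source side)."""
    # Residual graph: copy the capacities and add zero-capacity reverse edges.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            # No path left: the vertices reached form the source side of a minimal cut.
            return flow, set(parent)
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def label_binary(cost0, cost1, k):
    """Label a 1-D row of pixels with 0 or 1 by cutting a graph (hypothetical helper)."""
    n = len(cost0)
    capacity = {"s": {}, "t": {}}
    for p in range(n):
        # t-links: the link to "s" is cut when p takes label 1, the link to "t" when it takes label 0.
        capacity["s"][p] = cost1[p]
        capacity[p] = {"t": cost0[p]}
    for p in range(n - 1):
        # n-links: separating two neighbours costs k.
        capacity[p][p + 1] = k
        capacity[p + 1][p] = k
    cut_value, source_side = min_cut(capacity, "s", "t")
    labels = [0 if p in source_side else 1 for p in range(n)]
    return labels, cut_value


# Assumed data costs for four pixels: cost0[p] is the cost of giving pixel p label 0.
cost0 = [1, 1, 6, 7]
cost1 = [6, 5, 4, 1]
labels, cut_value = label_binary(cost0, cost1, k=2)
print(labels, cut_value)
```

In this sketch the weight of the cut equals the energy of the labelling it induces, which is why minimising one minimises the other. Pixels that remain connected to the terminal "s" receive label 0 and the rest receive label 1.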

Graph cut minimisation has been used successfully in image segmentation and clustering applications and has been shown to be more efficient than most state-of-the-art methods in these applications [114, 115]. The visual correspondence problem is also essentially a pixel labelling problem that can be solved through energy minimisation.

Figure 4.3: In this diagram an image labelling problem is represented as a graph. The square vertices are pixels in the image and the circular vertices are image labels. All the image labels are connected to all the pixel vertices with terminal links (t-links) (some links are omitted for clarity) and pixels are connected to their neighbours with neighbour links (n-links). The partitions pk to Pk result from a single graph cut that associates pixels with labels and represents a specific pixel labelling. This image was taken from [112].

input:  C_RA = the set of matched features (observations)
input:  M = a camera motion model that can be fitted to matches
input:  n = the minimum size of an inlier set
input:  G_max = the total number of iterations
input:  t = threshold value to decide when a match fits the model
output: M_best = the overall best model
output: S_best = the best inlier set
output: e_best = the fitting error from the best model

i = 0
M_best = {}
S_best = {}
e_best = ∞
while i < G_max do
    S_inliers = n random matches
    M_inliers = the model fitted to S_inliers
    S_consensus = S_inliers
    foreach match ∈ C_RA and ∉ S_inliers do
        if match fits M_inliers with error < t then
            add match to S_consensus
        end
    end
    if |S_consensus| > |S_best| then    // this is a good model
        S_best = S_consensus
    end
    i = i + 1
end
M_best = model fitted to all matches in S_best
e_best = fitting error from M_best
return M_best, S_best, e_best
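The consensus loop listed above follows the familiar RANSAC pattern and can be sketched in Python as shown below. This is a minimal sketch under stated assumptions: a pure 2-D translation stands in for the camera motion model M, the fitting error is taken to be the mean residual over the inlier set, and the function names and the sample data are hypothetical.

```python
import math
import random


def fit_translation(matches):
    """Assumed stand-in for the motion model: the mean displacement of the matches."""
    n = len(matches)
    dx = sum(q[0] - p[0] for p, q in matches) / n
    dy = sum(q[1] - p[1] for p, q in matches) / n
    return (dx, dy)


def match_error(model, match):
    """Distance between the predicted and the observed position of a match."""
    (px, py), (qx, qy) = match
    return math.hypot(px + model[0] - qx, py + model[1] - qy)


def consensus_fit(matches, n, g_max, t, rng):
    """Return (M_best, S_best, e_best) following the steps of the listing."""
    s_best = []
    for _ in range(g_max):
        # Draw n random matches and fit a candidate model to them.
        sample_idx = set(rng.sample(range(len(matches)), n))
        s_inliers = [matches[i] for i in sample_idx]
        m_inliers = fit_translation(s_inliers)
        # Grow the consensus set with every other match that fits the model.
        s_consensus = list(s_inliers)
        for i, match in enumerate(matches):
            if i not in sample_idx and match_error(m_inliers, match) < t:
                s_consensus.append(match)
        if len(s_consensus) > len(s_best):  # this is a good model
            s_best = s_consensus
    if not s_best:  # g_max was zero, so no model was ever fitted
        return None, [], math.inf
    m_best = fit_translation(s_best)
    # Assumption: the fitting error is the mean residual over the best inlier set.
    e_best = sum(match_error(m_best, m) for m in s_best) / len(s_best)
    return m_best, s_best, e_best


# Hypothetical data: eight matches displaced by (3, -2) plus three gross outliers.
points = [(0, 0), (1, 5), (4, 2), (7, 7), (9, 1), (3, 8), (6, 4), (8, 9)]
matches = [((x, y), (x + 3, y - 2)) for x, y in points]
matches += [((2, 2), (42, 27)), ((5, 5), (-25, 15)), ((1, 9), (16, -36))]

m_best, s_best, e_best = consensus_fit(matches, n=2, g_max=50, t=1.0, rng=random.Random(0))
print(m_best, len(s_best), e_best)
```

Passing a seeded `random.Random` instance keeps the run repeatable. A real camera motion model would replace `fit_translation` and `match_error`, while the loop itself would stay the same.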

In traditional energy minimisation the aim is to minimise an energy function that has the following form:

$$E(C_{RA}) = K \cdot E_{smooth}(C_{RA}) + E_{data}(C_{RA}), \tag{4.2}$$

where $C_{RA}$ is a correspondence set between the features of the reference image and the alternate image (see Figure 4.1). The first term is a smoothness term that imposes a penalty on solutions that violate spatial smoothness, while the second term penalises solutions that are inconsistent with the observed data. $K$ is a weighting constant that determines the relative importance of the two terms. When searching for a correspondence set, the data term is typically built around a feature matching measure, and the SSD measure introduced in Equation (4.1) is often used. The smoothness term is more difficult to choose and many variations have been proposed [112].

A popular smoothness term that is often used in graph cut minimisation is the Potts model. The Potts model preserves discontinuities in the image and has various other features that make it preferable for use in graph cut minimisation [112]. The Potts model for the smoothness term is defined as

$$E_{smooth}(C_{RA}) = \sum_{\{p,q\} \in N} T(f_p \neq f_q), \tag{4.3}$$

where T(·) is 1 if its argument is true and 0 otherwise, and f_p denotes the label assigned to pixel p. The sum is taken only over the pixel pairs in N, the set of interacting pixels, which typically consists of those pixels that are adjacent.
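As a small worked illustration of Equation (4.3), the sketch below counts the neighbouring pixel pairs whose labels differ. It assumes that N contains the 4-connected (horizontal and vertical) neighbour pairs of a rectangular grid. The function name and the two example labellings are hypothetical.

```python
def potts_smoothness(labels):
    """Count the 4-connected neighbour pairs {p, q} for which f_p != f_q."""
    rows, cols = len(labels), len(labels[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Visit each pair once: compare with the right and the lower neighbour only.
            if c + 1 < cols and labels[r][c] != labels[r][c + 1]:
                count += 1
            if r + 1 < rows and labels[r][c] != labels[r + 1][c]:
                count += 1
    return count


# Two regions separated by one vertical boundary.
piecewise = [[0, 0, 1],
             [0, 0, 1],
             [0, 0, 1]]
# A checkerboard, where every neighbour pair disagrees.
checkerboard = [[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]]

print(potts_smoothness(piecewise))     # 3
print(potts_smoothness(checkerboard))  # 12
```

The penalty depends only on whether two labels differ and not on how far apart they are, which is why a sharp boundary between two regions costs no more than a gradual one.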

The Potts model encourages solutions consisting of regions where pixels in the same region have equal labels. This type of term is sometimes informally called a piecewise constant model [112]. The Potts model is especially useful when the labels are unordered and the number of labels is small. It has been shown that an energy minimisation problem based on the Potts model can be reduced directly to the multiway graph cut problem [112]. The global minimum of such an energy function therefore corresponds to the minimum cost multiway cut on an appropriately constructed graph. For three or more labels the multiway cut problem is NP-hard, so in practice the cut is computed approximately.
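The piecewise constant behaviour can be seen on a toy problem small enough to enumerate every labelling. The sketch below evaluates an energy of the form of Equation (4.2) with a Potts smoothness term on a row of six pixels and three labels. The table of data costs is invented for the illustration, and the exhaustive search is only a stand-in for a graph cut, which exists precisely to avoid such enumeration.

```python
from itertools import product


def energy(labels, data_cost, k):
    """K * E_smooth + E_data for a 1-D row of pixels with a Potts smoothness term."""
    e_data = sum(data_cost[p][f] for p, f in enumerate(labels))
    e_smooth = sum(1 for p in range(len(labels) - 1) if labels[p] != labels[p + 1])
    return k * e_smooth + e_data


def brute_force_minimum(data_cost, k):
    """Try every labelling and return one with the lowest energy (toy sizes only)."""
    num_labels = len(data_cost[0])
    candidates = product(range(num_labels), repeat=len(data_cost))
    return min(candidates, key=lambda labels: energy(labels, data_cost, k))


# Assumed data costs: data_cost[p][f] is the cost of giving pixel p the label f.
# Pixel 2 is "noisy": taken on its own it slightly prefers label 1.
data_cost = [
    [0, 4, 9],
    [1, 3, 8],
    [3, 2, 7],
    [0, 5, 9],
    [9, 4, 0],
    [8, 5, 1],
]

print(brute_force_minimum(data_cost, k=0))  # (0, 0, 1, 0, 2, 2)
print(brute_force_minimum(data_cost, k=3))  # (0, 0, 0, 0, 2, 2)
```

With K = 0 every pixel simply takes its cheapest label, so the noisy pixel breaks the first region. With K = 3 the cost of the two extra label changes outweighs the small saving in the data term, and the minimum becomes two constant regions.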

Therefore, one can solve the visual correspondence problem by first modelling it as an appropriately constructed graph and then using known combinatorial optimisation

In document The Application of Harmony Search in Computer Vision (Page 140-144)