
1.3 Thesis Overview

1.3.3 Thesis Structure

In Chapter 2 we introduce various forms of regularization for the multi-model fitting problem with fixed matches, together with PEaRL, a graph-cut based algorithm for optimizing them. In Chapter 3 we propose an extension to α-expansion [4] for optimizing multi-labeling problems with spatial smoothness and label costs, i.e. energy (1.23). Finally, in Chapter 4 we propose a joint formulation of the multi-model fitting and matching problems and an approximation algorithm for the corresponding NP-hard optimization problem. The multi-model fitting method covered in Chapter 2 is used as a sub-procedure in one of the steps of the block coordinate descent approach to joint fitting and matching proposed in Chapter 4.

Bibliography

[1] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows: Theory, Algorithms, and Applications. Prentice-Hall, Inc., 1993.

[2] Julian Besag. On the statistical analysis of dirty pictures. Journal of the Royal Statistical Society. Series B (Methodological), pages 259–302, 1986.

[3] Yuri Boykov and Marie-Pierre Jolly. Interactive graph cuts for optimal boundary & region segmentation of objects in N-D images. In International Conference on Computer Vision, volume 1, pages 105–112. IEEE, 2001.

[4] Yuri Boykov, Olga Veksler, and Ramin Zabih. Fast approximate energy minimization via graph cuts. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23:1222–1239, November 2001.

[5] Rainer Burkard, Mauro Dell’Amico, and Silvano Martello. Assignment Problems. Society for Industrial and Applied Mathematics, Philadelphia, USA, 2009.

[6] Philip David, Daniel Dementhon, Ramani Duraiswami, and Hanan Samet. SoftPOSIT: Simultaneous pose and correspondence determination. International Journal of Computer Vision, 59(3):259–284, 2004.

[7] Olivier Faugeras and Quang-Tuan Luong. The geometry of multiple images: the laws that govern the formation of multiple images of a scene and some of their applications. MIT press, 2004.


[8] Martin A Fischler and Robert C Bolles. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6):381–395, 1981.

[9] Stuart Geman and Donald Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 6:721–741, 1984.

[10] R. Hartley and A. Zisserman. Multiple View Geometry in Computer Vision. Cambridge University Press, 2003.

[11] Nikos Komodakis, Nikos Paragios, and Georgios Tziritas. MRF energy minimization and beyond via dual decomposition. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 33(3):531–552, 2011.

[12] Hongdong Li. Two-view motion segmentation from linear programming relaxation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1–8, June 2007.

[13] David G Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.

[14] Y. Ma, S. Soatto, J. Kosecka, and S. Sastry. An Invitation to 3-D Vision: From Images to Geometric Models. Springer Verlag, 2003.

[15] Carsten Rother, Vladimir Kolmogorov, and Andrew Blake. Grabcut: Interactive foreground extraction using iterated graph cuts. In ACM Transactions on Graphics (TOG), volume 23, pages 309–314. ACM, 2004.

[16] Konrad Schindler and David Suter. Two-view multibody structure-and-motion with outliers through model selection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(6):983–995, June 2006.

[17] E. Serradell, M. Özuysal, V. Lepetit, P. Fua, and F. Moreno-Noguer. Combining geometric and appearance priors for robust homography estimation. In European Conference on Computer Vision (ECCV), pages 58–72. Springer, 2010.

[18] Kenneth Steiglitz and Christos H Papadimitriou. Combinatorial optimization: Algorithms and complexity. Prentice-Hall, New Jersey, 1982.

[19] Roberto Toldo and Andrea Fusiello. Robust multiple structures estimation with j-linkage. In European Conference on Computer Vision, pages 537–547. Springer, 2008.

[20] Philip Torr and Andrew Zisserman. MLESAC: A New Robust Estimator with Application to Estimating Image Geometry. Computer Vision and Image Understanding, 78(1):138–156, 2000.

[21] Lorenzo Torresani, Vladimir Kolmogorov, and Carsten Rother. A dual decomposition approach to feature correspondence. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 35(2):259–271, 2013.

[22] Olga Veksler. Efficient graph-based energy minimization methods in computer vision. PhD thesis, Cornell University, 1999.

[23] Jonathan S Yedidia, William T Freeman, and Yair Weiss. Understanding belief propagation and its generalizations. Exploring artificial intelligence in the new millennium, 8:236–239, 2003.

[24] Marco Zuliani, Charles S Kenney, and BS Manjunath. The multiransac algorithm and its application to detect planar homographies. In International Conference on Image Processing (ICIP), volume 3, pages III–153. IEEE, 2005.

Chapter 2

Energy-based Geometric Multi-Model Fitting

Geometric multi-model fitting is a typical chicken-and-egg problem: data points should be clustered based on geometric proximity to models whose unknown parameters must be estimated at the same time. Most existing methods, including generalizations of RANSAC, greedily search for models with the most inliers within a threshold, ignoring the overall classification of points. We formulate geometric multi-model fitting as a multi-labeling problem with a global energy function balancing geometric errors and regularity of inlier clusters. Minimizing such an energy with regularization based on spatial coherence (on some near-neighbor graph) and/or label costs is NP-hard. The combinatorial algorithm introduced in Chapter 3 can minimize such regularized energies over a finite set of labels, but it is not directly applicable to a continuum of labels, e.g. R^2 in line fitting. Our proposed approach, PEaRL, combines model sampling from data points as in RANSAC with iterative re-estimation of inliers and model parameters based on a global regularization functional. In practice, PEaRL converges to an approximate solution of our proposed energy, whose good quality we show empirically, while automatically selecting a small number of models that best explain the whole data set. Our tests demonstrate that our energy-based approach significantly improves on the state of the art in geometric model fitting, which is currently dominated by various greedy generalizations of RANSAC.
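To make the idea of an energy "balancing geometric errors and regularity of inlier clusters" concrete, the following Python sketch evaluates one labeling of 2D points by line models. It is only an illustration under assumptions that this abstract does not fix: the data term is the perpendicular point-to-line distance, spatial coherence is a Potts-style penalty on pairs of neighboring points with different labels, and the label cost is a constant per model in use. The weights `lam` and `beta` and all function names are hypothetical, and outlier handling is omitted.

```python
import math


def point_line_distance(point, line):
    """Perpendicular distance from (x, y) to the line a*x + b*y + c = 0."""
    x, y = point
    a, b, c = line
    return abs(a * x + b * y + c) / math.hypot(a, b)


def labeling_energy(points, labels, models, edges, lam=1.0, beta=1.0):
    """Assumed energy: geometric error + spatial smoothness + label cost.

    points: list of (x, y) tuples
    labels: labels[i] is the index into `models` assigned to points[i]
    models: list of lines (a, b, c)
    edges:  list of (i, j) index pairs from some near-neighbor graph
    lam, beta: hypothetical weights of the two regularization terms
    """
    # Geometric error of every point with respect to its assigned model.
    data_term = sum(point_line_distance(p, models[l])
                    for p, l in zip(points, labels))
    # Number of neighboring pairs that were assigned different models.
    smooth_term = sum(1 for i, j in edges if labels[i] != labels[j])
    # Number of distinct models actually used by the labeling.
    label_term = len(set(labels))
    return data_term + lam * smooth_term + beta * label_term
```

Comparing the value of such a function for two candidate labelings of the same points shows how a larger `beta` favors explanations with fewer models, while a larger `lam` favors spatially coherent clusters.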

(a) low noise (b) high noise

Figure 2.1: Blue dots are data points supporting 8 lines. In multi-model cases, detecting models by maximizing the number of inliers may work for low levels of noise (a). Higher noise levels require larger thresholds to detect inliers (b), but then some random model (red) may have far more inliers than the true model (green). The integers show the number of model inliers for the selected thresholds. Simplistic greedy selection of the models with the largest number of inliers would fail in (b). This example illustrates a general problem affecting many multi-model fitting approaches that greedily select one model at a time while ignoring the overall solution. It also motivates our global energy approach, which considers the quality of the whole solution.

2.1 Introduction

We study a general case of the geometric multi-model fitting problem in which the data is a mixture of outliers and points supporting an unspecified number of models of some known type¹.

The majority of existing algorithms treat inlier classification and parameter estimation as two independent problems. Typically, each model is selected greedily by maximizing the number of inliers within some fixed threshold. This works well when the data supports a single model [14], but we argue that it is fundamentally flawed in multi-model cases, see Fig. 2.1.

(a) fitting homographies (stereo) (b) estimation accuracy vs. λ

Figure 2.2: Motivating spatial regularization in geometric model fitting. In many vision problems, combining geometric errors and spatial coherence terms in energy (2.3) can be justified generatively because clusters of inliers are generated by regular objects (a). Moreover, spatial regularization can also be justified discriminatively. Plot (b) shows the average deviation from the ground truth for optimal models obtained in 100 randomly generated line-fitting tests as in Fig. 2.4(d). Each point in this plot corresponds to some fixed smoothness parameter λ in (2.3). Clearly, the spatial coherence term significantly reduces estimation errors for some λ > 0.

RANSAC [14] is a well-known robust method for dealing with a large number of outliers when the data supports only one model. The main idea is to generate a number of model proposals, each estimated from randomly sampled data points, and then to select the proposal with the largest set of inliers (a.k.a. consensus set) with respect to some fixed threshold. Many publications [28, 33, 40] have proposed generalizations of RANSAC for multi-model fitting. For example, [28, 33] run RANSAC sequentially. Each iteration of these methods selects the randomly sampled model that maximizes either the number of inliers or some similar threshold-based measure. The inliers of the identified model are then removed from the set of data points before the next iteration looks for the next model. Other methods rely on different forms of greedy clustering, e.g. J-linkage [26], implicitly maximizing the number of inliers within a given threshold. One can also apply the Hough transform to formulate multi-model fitting as a clustering problem in the space of model parameters and use mean-shift [9] to identify the modes in this Hough space. It is easy to see that this approach also greedily maximizes the number of inliers. We argue that, in general, greedy selection of one model at a time while ignoring the overall solution is a flawed approach to multi-model cases. Figure 2.1 shows a simple example of the typical problem in the context of greedy inlier maximization: if the level of noise is increased, some random model can have a larger number of inliers than the true models. This also explains our results in Section 2.3 and Figs. 2.7-2.9, which show that many existing greedy methods work only on examples with low levels of noise and clutter.
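The greedy scheme criticized above can be sketched in a few lines of Python for 2D line fitting. This is a minimal illustration, not the implementation of any cited method: the function names, the two-point line estimate, the fixed number of samples, and the `min_inliers` stopping rule are all assumptions made for the example.

```python
import math
import random


def line_through(p, q):
    """Line a*x + b*y + c = 0 through two distinct points, (a, b) of unit length."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    return (a / norm, b / norm, -(a * x1 + b * y1) / norm)


def inliers_of(line, points, threshold):
    """Points whose perpendicular distance to the line is within the threshold."""
    a, b, c = line
    return [p for p in points if abs(a * p[0] + b * p[1] + c) <= threshold]


def ransac_line(points, threshold, num_samples, rng):
    """Single-model RANSAC: keep the proposal with the largest consensus set."""
    best_line, best_inliers = None, []
    for _ in range(num_samples):
        p, q = rng.sample(points, 2)
        if p == q:
            continue  # duplicate points give a degenerate sample
        line = line_through(p, q)
        inliers = inliers_of(line, points, threshold)
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers


def sequential_ransac(points, threshold, num_samples, min_inliers, seed=0):
    """Greedy multi-model fitting: fit one model, remove its inliers, repeat."""
    rng = random.Random(seed)
    remaining = list(points)
    models = []
    while len(remaining) >= 2:
        line, inliers = ransac_line(remaining, threshold, num_samples, rng)
        if line is None or len(inliers) < min_inliers:
            break  # assumed stopping rule: no proposal has enough support
        models.append((line, inliers))
        inlier_set = set(inliers)
        remaining = [p for p in remaining if p not in inlier_set]
    return models
```

Each pass commits to the single proposal with the most inliers and never revisits that decision, so a spurious model that collects many inliers under a large threshold, as in Fig. 2.1(b), also removes points that belong to the true models.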