Conclusions - A saliency based framework for multi-modal registration.

In this chapter we have reviewed the work relevant to this thesis. First, point feature detection in 2D and 3D was reviewed, along with line detection in each modality, owing to their fundamental importance in registration tasks. Second, existing approaches to 2D-3D registration were described, which motivated the use of a feature-based approach for our problem. Third, existing approaches to correspondence-free 2D-3D geometry estimation were reviewed. These approaches typically scale poorly with the number of features; as a result, we aim to detect relatively sparse features. Finally, global optimisation in geometry estimation was reviewed, focusing particularly on Branch-and-Bound (BnB) approaches.

In light of the above review, we are now able to place the contributions of this thesis in context. In Chapter 3, we present a general framework for point feature detection inspired by the histogram-based Kadir-Brady detector [74], and apply it to both 2D and 3D data. In contrast to existing point feature detectors, it naturally detects 3D features based on both the geometry and the texture of the scene and allows for points to be detected meaningfully across each modality. Furthermore, its ability to detect sparse sets of non-repetitive features is of key importance for the later computationally expensive registration phase.

In Chapter 4, we present a novel salient line segment detector. In contrast to existing line detectors, our histogram-based approach relies on the distribution of pixels on either side of the line, allowing it to naturally detect lines in non-repetitive areas and obtain a representative set of lines for the scene. It is extended to 3D data, where it detects lines based on both the geometry and texture of the scene. As in Chapter 3, we detect sparse, non-repetitive lines with the later computationally expensive registration phase in mind.

In Chapter 5, we present a globally optimal approach to 2D-3D registration from points and lines where correspondences are unknown. By construction, it is more robust than existing heuristic approaches, and this is particularly well demonstrated for high rates of outliers. We propose formulations that allow for the speed-up of nested BnB algorithms while preserving optimality properties of the solution. We evaluate by comparing the proposed salient features with state-of-the-art features, and comparing the proposed BnB approach with existing 2D-3D geometry estimation approaches.

Chapter 3

Salient Point Detection

3.1 Introduction

Any feature-based registration pipeline requires, as a first stage, a feature detector that is able to detect a significant proportion of repeatable features across both representations of the data. With such a detector, the subsequent feature matching process becomes far more tractable; conversely, a low repeatability rate may result in a very slow feature matching process, or cause it to fail altogether.

In this thesis, we are particularly interested in multi-modal feature detectors, and as such, we propose a generalisable point feature detector that may be broadly and meaningfully applicable across modalities, where we specifically focus on 2D and 3D. The Kadir-Brady (KB) saliency detector [74] for greyscale images is used as a starting point for this research. Its ability to detect sparse sets of non-repetitive features is furthermore of key importance for the later computationally expensive registration phase.

Existing point feature detection methods are typically centred around images, with applications across a range of subfields (registration, reconstruction, image retrieval, etc.). Recent advances in 3D data acquisition (e.g. Microsoft Kinect) have resulted in significant interest in 3D feature detection [149, 60]. However, the majority of 2D and 3D feature detectors are constructed in very different ways. The more popular 2D feature detectors are based on the derivative of the image, and provide a principled approach to scale selection using scale-space theory [110, 97]. Yet very few may be extended to operate on 3D data, with many 3D feature detectors instead based on surface curvature [149]. Furthermore, the traditional scale-space approach typically cannot be applied to 3D data without altering the geometry. The differences between 2D and 3D feature detectors are further exacerbated by the range of existing 3D data types (point cloud, volumetric, mesh, textured / untextured), leading to different 3D feature detectors for each case [149, 163, 60].

As such, it is very difficult to use existing point feature detectors jointly across 2D and 3D, because their constructions are not comparable. Applications such as registration, which would typically rely on point feature detectors, instead use other techniques in the 2D-3D case (e.g. learning a bag of features across multiple viewpoints [147], or Mutual Information alignment [104]). These approaches are not as general as their feature-based counterparts, often making restrictive assumptions about the scene or requiring a good initial alignment.

To address this issue, here we propose a more general approach to point feature detection, based on the KB saliency detector [74]. Its histogram-based approach does not exclusively depend upon data-type specific quantities such as derivatives or curvatures. Instead, it defines a salient point as having a high information content (as measured by the entropy of its histogram) at a particular scale. This histogram-based approach allows it to be formulated across different modalities in a more meaningful manner than other feature detectors due to the vast array of ways in which histograms may be constructed.
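The entropy measure described above can be illustrated with a minimal sketch. It covers only the entropy term for a greyscale image stored as a list of rows, and it is not the detector developed in this thesis: the function names, the bin count, the intensity range and the rule of simply picking the radius with the highest entropy are all assumptions made for illustration.

```python
import math
from collections import Counter


def local_entropy(image, cx, cy, radius, bins=16, max_value=256):
    """Shannon entropy (in bits) of the intensity histogram inside a disc.

    `image` is a list of rows of integer intensities in [0, max_value).
    The bin count and intensity range are illustrative assumptions.
    """
    counts = Counter()
    for y in range(max(0, cy - radius), min(len(image), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(image[0]), cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                counts[image[y][x] * bins // max_value] += 1
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def most_salient_scale(image, cx, cy, radii):
    """Return (radius, entropy) for the candidate radius with the highest entropy.

    Hypothetical helper: a simplified stand-in for scale selection.
    """
    scored = [(r, local_entropy(image, cx, cy, r)) for r in radii]
    return max(scored, key=lambda pair: pair[1])


if __name__ == "__main__":
    flat = [[100] * 9 for _ in range(9)]
    textured = [[(x * 37 + y * 91) % 256 for x in range(9)] for y in range(9)]
    print(local_entropy(flat, 4, 4, 3))       # uniform patch: no information
    print(most_salient_scale(textured, 4, 4, [1, 2, 3]))
```

A uniform patch fills a single histogram bin and therefore has zero entropy, whereas a varied patch spreads over several bins and scores higher; this is the sense in which entropy acts as a proxy for information content.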

Based upon the KB saliency detector, and inspired by the success of the 2D Harris corner detector [61, 1], we propose a novel extension to the 2D KB saliency detector. Whereas the original KB saliency detector constructs a histogram of pixel intensities in a circular region, we propose a derivative-based approach whereby the histogram is constructed from the distribution of eigenvalues of the second moment matrix. This allows our approach to detect salient points with respect to the derivative of the image, operating in a more general manner than a typical corner detector while avoiding repetitive parts of the scene.
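As a rough illustration of this idea, the sketch below computes the eigenvalues of a 2x2 second moment matrix at each pixel of a circular region, bins the eigenvalue pairs into a joint histogram, and takes its entropy. The precise construction is given later in the chapter; everything here (central-difference gradients, the 3x3 summation window, normalising by the largest eigenvalue in the region, the bin count, and all function names) is an assumption chosen to keep the example short.

```python
import math
from collections import Counter


def gradients(img):
    """Central-difference image gradients with clamped borders."""
    h, w = len(img), len(img[0])
    ix = [[(img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]) / 2.0
           for x in range(w)] for y in range(h)]
    iy = [[(img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]) / 2.0
           for x in range(w)] for y in range(h)]
    return ix, iy


def second_moment_eigenvalues(ix, iy, cx, cy, window=1):
    """Eigenvalues (l1 >= l2 >= 0) of the second moment matrix at (cx, cy).

    The matrix [[a, b], [b, c]] sums gradient products over a square
    window (an assumed, unweighted window for illustration).
    """
    h, w = len(ix), len(ix[0])
    a = b = c = 0.0
    for y in range(max(0, cy - window), min(h, cy + window + 1)):
        for x in range(max(0, cx - window), min(w, cx + window + 1)):
            a += ix[y][x] ** 2
            b += ix[y][x] * iy[y][x]
            c += iy[y][x] ** 2
    mean = (a + c) / 2.0
    diff = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return mean + diff, max(mean - diff, 0.0)  # clamp rounding noise


def eigenvalue_entropy(img, cx, cy, radius, bins=8):
    """Entropy (bits) of the joint histogram of eigenvalue pairs in a disc."""
    ix, iy = gradients(img)
    pairs = []
    for y in range(max(0, cy - radius), min(len(img), cy + radius + 1)):
        for x in range(max(0, cx - radius), min(len(img[0]), cx + radius + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                pairs.append(second_moment_eigenvalues(ix, iy, x, y))
    top = max(l1 for l1, _ in pairs)
    if top <= 0.0:
        return 0.0  # no gradient anywhere in the region
    to_bin = lambda v: min(int(v / top * bins), bins - 1)
    counts = Counter((to_bin(l1), to_bin(l2)) for l1, l2 in pairs)
    total = len(pairs)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    corner = [[255 if x >= 5 and y >= 5 else 0 for x in range(11)]
              for y in range(11)]
    print(eigenvalue_entropy(corner, 5, 5, 3))
```

In this sketch a region containing flat, edge-like and corner-like pixels populates several histogram bins and therefore scores a higher entropy than a region with no gradient at all, which scores zero.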

By using the generalisable histogram-based approach of the KB saliency detector, the above approach may be naturally extended to 3D data by constructing a histogram based on the 3D second moment matrix [140]. Furthermore, the histogram-based approach allows for the detection of salient points based on both the geometry and texture of the scene by constructing a 2D histogram based on the texture of the 3D surface, and combining the 2D and 3D histograms. This allows it to operate in a meaningful manner regardless of whether or not the 3D data is textured, and is able to combine the best of both sets of features for textured data.

To briefly review work similar to ours: the KB saliency detector has already been proposed for 3D data by Fiolka et al. [51], who construct a histogram based on the distribution of normals. By contrast, here we propose a derivative-based 2D KB saliency detector, alongside a 3D KB detector that operates jointly on the geometry and texture of the scene. Furthermore, we propose a framework for generalisable salient point detection, and as such provide a 2D-3D evaluation on a range of synthetic and real data. An earlier version of this work, based on the mean curvature, was published in [21]; however, that was a purely geometry-based KB saliency detector.

The contributions of this chapter are three-fold. Firstly, a generalisation of the KB saliency detector is formulated, demonstrating its broad applicability wherever histograms may be meaningfully constructed within a metric space. Secondly, in light of this generalisation, we propose a 2D derivative-based KB saliency detector based on the second moment matrix. Thirdly, the derivative-based KB saliency detector is naturally extended to 3D, where it may operate on both textured and untextured 3D data. It is, to the best of our knowledge, the first 3D feature detector to operate on both the geometry and texture of the scene simultaneously. The proposed detectors are evaluated in a 2D-3D manner, where they are shown to be more repeatable than existing detectors (Harris 2D and 3D [61, 140], and SIFT 2D and 3D [97, 164]).

This chapter is structured as follows. In Section 3.2, a description of the KB saliency detector is given, along with proposed extensions and modifications [75, 137, 136]. In Section 3.3, we propose a generalisation of the KB saliency detector. The generalisation is subsequently implemented for a 2D derivative-based KB saliency detector (Section 3.4) and a 3D KB saliency detector (Section 3.5) that may operate on textured or untextured 3D data. In Section 3.6, qualitative and quantitative results are given in both 2D and 3D; finally, conclusions are presented in Section 3.7.
