
Chapter 3 A Next-Best-View Solution

3.1 The Next-Best-View Problem

The objective of automated view planning for 3D reconstruction of objects or environments is to automatically determine an efficient set of scanner views such that the acquisition constraints are met and a digital model of the object or environment can be satisfactorily reconstructed. A view is defined as a pose of the scanner and the associated imaging parameters. Examples of acquisition constraints include the field of view of the scanner and the minimum distance that must be observed between the scanner and the object. The efficiency of a set of planned views can be measured by different metrics, such as the total number of scans, the amount of acquired data, or the total scanning time. An object or environment can be satisfactorily reconstructed if all surfaces accessible to the scanner have been acquired with the required quality wherever possible. The relationship between view planning and reconstruction quality is elaborated in Section 3.2.

For 3D reconstruction, a priori knowledge of the object’s or environment’s geometry is not available to the view planner. The first scan is made from a view appropriately selected by a human operator, and for the subsequent scans, the planner must determine the next best views based on the information it has already collected from the previous scans. This is why the view planning problems for 3D reconstruction are also called next-best-view problems. They are part of the more general class of view planning problems, which also includes view planning for objects whose geometry is known a priori; the latter are known as model-based view planning problems.

In the most fundamental model-based view planning problem, where only visibility is considered and each viewpoint can see in all directions, the objective is to determine the minimum number of viewpoints, such that every surface point on the object or in the environment is visible to at least one of the viewpoints. This problem has been shown to be NP-hard, and is equivalent to an extended version of the well-known art-gallery problem [O’Rourke1987]. With suitable discretization of the model surfaces and the viewpoint space, the basic model-based view planning problem can be reduced to the set-covering problem [Tarbox1995, Scott2001d].
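The reduction to set covering can be written as a small integer program. The notation here is assumed for illustration and does not come from the cited works: the discretized model has surface elements $s_1, \dots, s_p$, the discretized viewpoint space has candidates $v_1, \dots, v_q$, $a_{ij} = 1$ if $s_i$ is visible from $v_j$ (and $0$ otherwise), and $x_j = 1$ if $v_j$ is selected.

```latex
\begin{aligned}
\min_{x} \quad & \sum_{j=1}^{q} x_j \\
\text{subject to} \quad & \sum_{j=1}^{q} a_{ij}\, x_j \ge 1, \qquad i = 1, \dots, p, \\
& x_j \in \{0, 1\}, \qquad j = 1, \dots, q.
\end{aligned}
```

Each constraint states that surface element $s_i$ is seen by at least one selected viewpoint, and the objective counts the selected viewpoints.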

Although the set-covering problem is NP-complete, it is well known to have a polynomial-time approximation algorithm whose solution is worse than the optimal by at most a logarithmic factor [Cormen1990]. At each iteration, the algorithm greedily selects the subset that covers the most uncovered elements. In the basic model-based view planning problem, this is equivalent to selecting the scanner view that can acquire the most new information. This greedy approach has been used in many other model-based view planning algorithms.
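The greedy selection rule can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of any cited planner: the function name, the dictionary-of-sets representation, and the toy instance are all assumptions made for the example.

```python
def greedy_set_cover(elements, candidate_views):
    """Greedy set cover.

    elements:        set of surface-element ids that should be covered
    candidate_views: dict mapping a view id to the set of element ids it sees
    Returns (chosen view ids in selection order, elements no view can cover).
    """
    uncovered = set(elements)
    chosen = []
    while uncovered:
        # Select the view that sees the most still-uncovered elements.
        best = max(candidate_views,
                   key=lambda v: len(candidate_views[v] & uncovered))
        gain = candidate_views[best] & uncovered
        if not gain:
            break  # the remaining elements are invisible to every candidate
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Hypothetical toy instance: six surface elements, four candidate views.
surface_elements = {1, 2, 3, 4, 5, 6}
views = {"A": {1, 2, 3, 4}, "B": {3, 4, 5}, "C": {5, 6}, "D": {1, 6}}
chosen, missed = greedy_set_cover(surface_elements, views)
print(chosen, missed)  # ['A', 'C'] set()
```

In this instance, view A is selected first because it sees four new elements, and view C follows because it is the only candidate that sees both remaining elements.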

Since global geometric information is unknown, a next-best-view problem cannot be solved globally to get an optimal solution. It is inherently a local optimization problem [Kakusho1995]. The local optimization is applied to only the known geometric information, which has been acquired by the previous scans. With appropriate representation of the unknown occluded regions, for example, with occlusion surfaces (which are artificial surfaces added to connect surfaces on opposite sides of depth discontinuities caused by occlusions; see Figure 3.9(a)), an optimal solution is the smallest set of viewpoints such that all points on the occlusion surfaces are visible from at least one viewpoint. A model-based view planning algorithm can be applied to get the locally optimal solution. Then, the set of new viewpoints is used to acquire new surfaces to update the partial model, and the local optimization is repeated on the new partial model.

The local problem remains NP-hard, and thus the greedy approximation algorithm is often used in the local optimization. When the greedy approach is used, one would want to keep the partial model as up-to-date as possible by incorporating the latest scan, and ensure that the next computed viewpoint is based on this most up-to-date information. This naturally leads to a tight iterative approach. It is tight in the sense that at the end of each iteration, only a single new viewpoint is produced, and after the scan is made, it is immediately incorporated for the computation of the next viewpoint. This tight iterative greedy approach is used in almost all existing next-best-view algorithms.
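The control flow of this tight iterative greedy approach can be sketched as follows. Everything in the sketch is a hypothetical stand-in chosen to keep the example self-contained: the scene is a ring of cells, a scan acquires a fixed window of cells, and unknown cells adjacent to known ones play the role of occlusion surfaces. It illustrates only the plan, scan, and merge cycle, not the quality of the resulting plan.

```python
def next_best_view_loop(scan, candidate_views, predicted_gain, first_view):
    """Plan exactly one view per iteration and merge each scan immediately."""
    model = set(scan(first_view))      # first view is chosen by the operator
    views = [first_view]
    while True:
        # Greedy step, evaluated on the most up-to-date partial model.
        best = max(candidate_views, key=lambda v: predicted_gain(v, model))
        if predicted_gain(best, model) == 0:
            break                      # no candidate sees any occlusion proxy
        model |= scan(best)            # incorporate the new scan right away
        views.append(best)
    return views, model


# --- Hypothetical toy world (an assumption for illustration only) ---
N = 12  # number of cells on the ring

def window(v):
    """Cells seen from viewpoint v: the cell itself and its two neighbours."""
    return {(v + d) % N for d in (-1, 0, 1)}

def frontier(model):
    """Unknown cells next to known cells; stand-in for occlusion surfaces."""
    return {c for c in range(N)
            if c not in model
            and ((c - 1) % N in model or (c + 1) % N in model)}

def predicted_gain(v, model):
    """Number of frontier cells that viewpoint v would see."""
    return len(window(v) & frontier(model))

views, model = next_best_view_loop(window, range(N), predicted_gain, first_view=0)
print(len(views), model == set(range(N)))
```

The loop terminates in this toy world because any frontier cell is visible from the viewpoint located at that cell, so every iteration acquires at least one new cell.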

However, each iteration of the greedy approach is still very computationally expensive due to the potentially huge search space and expensive visibility computations. This apparent computational difficulty has limited many next-best-view algorithms to incomplete search spaces, simple and small objects, and low-quality acquisition. Some early algorithms even ignore self-occlusion of the objects [Connolly1985].

When only visibility is considered and each viewpoint can see in all directions, the size of the search space is proportional to the size of the viewpoint space partition (VSP) and the aspect graph (dual of the VSP) of the object [Plantiga1990]. Each aspect, or a region in the VSP, of a polyhedral object is a maximal set of viewpoints that can see the same set of polygons, forming images of the same topological appearance, i.e. the structures of the polygons’ boundary line segments and intersection points in the images are the same. For a non-convex polyhedral object, the number of aspects is $O(n^9)$, where $n$ is the number of vertices [Plantiga1990]. At present, the best time bound for computing the aspect graph is $O(n^9 \log n)$.

With the VSP of the partial model, at each iteration of the greedy algorithm, the objective is to select a viewpoint in an aspect that can see the largest total area of occlusion surfaces (see footnote 2). Although the number of vertices, $m$, of the occlusion surfaces can be much less than $n$, the number of aspects is generally larger than $O(m^9)$. This is because occlusion surfaces can be occluded by true surfaces, adding complexity to the VSP. The size of such a VSP is $O(m^3 n^6)$.
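To give a rough sense of scale, the bounds can be evaluated for illustrative values. The numbers below are assumptions chosen for the example, and the constants hidden by the $O$-notation are ignored: take a partial model with $n = 10^3$ vertices and occlusion surfaces with $m = 10^2$ vertices.

```latex
\begin{aligned}
n^9 &= (10^3)^9 = 10^{27}, \\
m^9 &= (10^2)^9 = 10^{18}, \\
m^3 n^6 &= (10^2)^3 \, (10^3)^6 = 10^{6} \cdot 10^{18} = 10^{24}.
\end{aligned}
```

Under these assumed values, the bound for the VSP of occlusion surfaces occluded by true surfaces is six orders of magnitude above $m^9$, and an exhaustive enumeration of aspects is out of reach in either case.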

The above discussion assumes 3D viewpoint locations, unlimited field of view, and direct visibility. In practice, besides the 3D location, other parameters of the scanner have to be determined in the solution, such as the scanner’s orientation, the adjustable field of view, and the scanning resolution. These parameters add further dimensions to the already huge search space.

Footnote 2: All viewpoints in an aspect see the same set of occlusion surfaces but do not necessarily see the same amount of surface area of the occlusion surfaces. Therefore, additional optimization must be performed within the aspect to find the viewpoint that sees the largest area of occlusion surfaces. Since, from within the aspect, the occlusion surfaces have the same topological appearance, the optimization function should be smooth, and a numerical optimization method may be used to find the best viewpoint.

In addition to the simple direct visibility requirement, multiple other requirements and constraints usually need to be taken into account. For example, a surface is considered satisfactorily acquired only if it is entirely visible to the scanner and its sampling density is higher than a threshold. These additional requirements and constraints may further increase the complexity of the search. In the next section, I specify the requirements and constraints specific to the next-best-view problem addressed in this work.
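As a small illustration of the example requirement above (entire visibility plus a sampling-density threshold), the acceptance test for a surface patch might be sketched as below. The record type, its field names, the units, and the density estimate (samples divided by area) are assumptions made for this sketch and are not the definitions used later in this work.

```python
from dataclasses import dataclass


@dataclass
class PatchObservation:
    """Hypothetical summary of how one scan observed one surface patch."""
    visible_fraction: float  # fraction of the patch area seen, in [0, 1]
    samples: int             # number of range samples that fell on the patch
    area: float              # patch area in square metres (assumed unit, > 0)


def is_satisfactorily_acquired(obs, min_density):
    """True only if the patch is entirely visible AND densely enough sampled."""
    entirely_visible = obs.visible_fraction >= 1.0
    density = obs.samples / obs.area  # samples per square metre (assumed model)
    return entirely_visible and density > min_density


# Hypothetical threshold: 1000 samples per square metre.
full_and_dense = PatchObservation(visible_fraction=1.0, samples=500, area=0.25)
partly_hidden = PatchObservation(visible_fraction=0.9, samples=500, area=0.25)
too_sparse = PatchObservation(visible_fraction=1.0, samples=100, area=0.25)
print(is_satisfactorily_acquired(full_and_dense, 1000),
      is_satisfactorily_acquired(partly_hidden, 1000),
      is_satisfactorily_acquired(too_sparse, 1000))
```

A planner that uses such a test can no longer score a view by visibility alone, which is one way the additional requirements enlarge the search.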
