3.3.7 Surface reconstruction from multiple camera images

Surface reconstruction, or polyhedral modelling, in which the surface of the object to be modelled is converted directly into polygons, has been the subject of much less research than volumetric approaches. This is perhaps because accurately calculating the intersections of projected silhouette cones in three dimensions is difficult and computationally expensive. Most research has followed the well-trodden path of first calculating a volumetric intersection and then, if necessary, using a surface recovery approach such as the marching cubes algorithm to determine a polygonal surface model.

The first attempt at directly generating a polyhedral 3D model of an object from multiple views was in Baumgart’s 1974 PhD thesis, in which the shape from silhouette concept was first proposed. Silhouettes were extruded away from the camera to form cone-like shapes, and the 3D intersections of these cones were used to approximate the object surface. Shortly afterwards, [8] used a method in which images of a simple rotating object were used to construct a wire-frame mesh. Image contours were used to identify second order irregularities, which were tracked from view to view in order to build up a network of nodes to be joined by arcs in the final model. Such an approach works reasonably well for simple objects largely composed of a small number of corners joined by straight edges, but probably would not work for more complex objects, particularly those whose edges are largely curved with few irregularities that can be tracked. [37] describes a method for constructing polyhedral models of a rotating object using a laser range finder to determine the distance to the object at various points over the surface. Since accurate depth maps can be obtained from a laser range finder, this essentially becomes an exercise in polygonising a 3D form from progressively rotated depth maps of an object. In principle, the reconstruction of a surface is possible from a set of apparent contours, which can be obtained by circumnavigating the surface. This was first described by Giblin and Weiss (1987), and later generalised by Cipolla and Blake (1990).

Conceptually, the problem of generating the visual hull of an object by intersecting projected silhouette cones is a simple one, and some general advances were made in the efficient intersection of 3D polyhedra by [18] and later by [100]. [116] refine Baumgart’s earlier work on polyhedral intersection by adding a post-intersection mesh simplification step and creating triangular splines which are used to control model fitting, but this work is aimed more at the recognition of reconstructed objects than at efficient reconstruction. It was not until the realisation that 3D silhouette cones are a special type of polyhedron that significant improvements in the speed of the intersection calculation were achieved. Since the projection of a silhouette into 3D has a fixed cross section, the operation of intersecting it with another silhouette cone can be reduced to a 2D operation. A face of the silhouette cone is projected onto the 2D silhouette image from another camera, the intersection between the projected face and the silhouette is calculated, and the result is projected back onto the silhouette cone face. The silhouette cone face can then be intersected with this reprojection to calculate the 3D intersection of that face with the other silhouette cone, thus reducing the cost of the intersection operation and solving numeric rounding problems simultaneously. This was first suggested by [110], in which photographs of trees are used to construct 3D models, and later improved by [79] in a more general work in which polyhedral visual hulls of objects are created efficiently from silhouettes.
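The reduction of the cone intersection to a 2D operation can be illustrated with a minimal sketch. It is deliberately simplified and is not the algorithm of [110] or [79]: instead of a whole cone face it handles a single viewing ray (one bounding edge of a face), and it assumes a pinhole camera given by a 3x4 projection matrix, a silhouette given as a simple polygon in image coordinates, and a ray that starts outside the second cone in general position. The names `project_h` and `ray_cone_intervals` are hypothetical. The ray is projected into the second image, intersected with the silhouette contour in 2D, and each crossing is mapped back to a depth along the ray.

```python
from typing import List, Sequence, Tuple


def project_h(P: Sequence[Sequence[float]], X: Sequence[float]) -> Tuple[float, ...]:
    """Multiply a 3x4 projection matrix by a homogeneous 4-vector."""
    return tuple(sum(P[r][c] * X[c] for c in range(4)) for r in range(3))


def ray_cone_intervals(P2, centre1, direction, silhouette2) -> List[Tuple[float, float]]:
    """Depth intervals of the ray centre1 + t*direction that lie inside the
    silhouette cone of camera 2, found using 2D intersections only."""
    # The homogeneous image of the ray point at depth t is a + t*b.
    a = project_h(P2, (*centre1, 1.0))
    b = project_h(P2, (*direction, 0.0))
    hits = []
    n = len(silhouette2)
    for i in range(n):
        (px, py), (qx, qy) = silhouette2[i], silhouette2[(i + 1) % n]
        # Homogeneous line through the contour edge: (px, py, 1) x (qx, qy, 1).
        line = (py - qy, qx - px, px * qy - py * qx)
        la = sum(line[k] * a[k] for k in range(3))
        lb = sum(line[k] * b[k] for k in range(3))
        if abs(lb) < 1e-12:
            continue  # projected ray is parallel to this contour edge
        t = -la / lb
        w = a[2] + t * b[2]
        if t <= 0.0 or w <= 0.0:
            continue  # behind the ray origin or behind camera 2
        x, y = (a[0] + t * b[0]) / w, (a[1] + t * b[1]) / w
        ex, ey = qx - px, qy - py
        s = ((x - px) * ex + (y - py) * ey) / (ex * ex + ey * ey)
        if 0.0 <= s < 1.0:  # half-open so a shared vertex is counted once
            hits.append(t)
    hits.sort()
    # Crossings alternate between entering and leaving the silhouette.
    return list(zip(hits[0::2], hits[1::2]))


# Camera 2 at the origin looking along +z with unit focal length (assumed).
P2 = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
SQUARE = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
INTERVALS = ray_cone_intervals(P2, (-5.0, 0.0, 2.0), (1.0, 0.0, 0.0), SQUARE)
```

In this example the ray runs parallel to the x axis at depth z = 2 in front of camera 2, so it lies inside the square silhouette cone while its x coordinate is between -2 and 2, which corresponds to ray depths between 3 and 7. All geometric work is done in the image plane of camera 2; no 3D polyhedra are intersected.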

[68] propose a method which does not require 3D polyhedral or volumetric intersections to be calculated. Apparent contours from camera images are used to sweep out a viewing cone which grazes the object tangentially to its surface; this is used to create a “cone strip” that continuously bounds the rim of the object from a camera’s point of view. These cone strips are delimited by intersection curves between two visual cones (which are easier to calculate than a polyhedral intersection). Frontier points (where rims and cones intersect at a point) are used to construct a “rim mesh”, the edges of which have a one-to-one relationship with the faces of the visual hull mesh. Since only the position of the camera centre is required, and not the position of the image plane itself, this method can be used under weak camera calibration. However, the method is only suitable for objects with smooth curved surfaces.

A hybrid approach to reconstruction is presented by [14], in which volumetric and surface based methods are mixed to overcome some of the shortcomings of each. A surface reconstruction approach is employed to create an irregular grid of cells close to the surface of the object to be reconstructed; these cells are then “carved” using volumetric methods according to silhouette information. This results in an approach which maintains the robustness of volumetric approaches whilst providing high precision, yet efficient, results. [39] improve upon this method in Exact Polyhedral Visual Hulls (EPVH) by removing the volumetric step from their algorithm and replacing it with an algorithm that recovers mesh connectivity, resulting in an approach which is quicker and produces an exact polyhedron that is consistent with the silhouette images. This is an important piece of work, as it also removes the topological constraints introduced by numerical instabilities, which forced other polyhedral approaches to restrict themselves to simple objects. In [41] the EPVH algorithm is parallelised for processing in a network distributed setting.
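The carving step used by the hybrid approach can be sketched in a few lines. This is only an illustration of silhouette-based carving, not the method of [14]: the cells are reduced to their centre points, each silhouette is assumed to be a binary mask stored as a list of rows, and each camera is assumed to be a function mapping a 3D point to a pixel. The names `in_silhouette`, `carve`, `view_along_z` and `view_along_x` are hypothetical, and the two toy views are orthographic.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point3 = Tuple[float, float, float]
Pixel = Tuple[int, int]  # (column, row)
View = Tuple[Callable[[Point3], Pixel], Sequence[Sequence[int]]]


def in_silhouette(mask: Sequence[Sequence[int]], pixel: Pixel) -> bool:
    """True if the pixel lies inside the image and on the silhouette."""
    col, row = pixel
    return 0 <= row < len(mask) and 0 <= col < len(mask[0]) and mask[row][col] == 1


def carve(cells: List[Point3], views: List[View]) -> List[Point3]:
    """Keep only the cells whose centre projects inside every silhouette."""
    return [
        c for c in cells
        if all(in_silhouette(mask, project(c)) for project, mask in views)
    ]


# Toy orthographic cameras (assumed): one looks along z, the other along x.
def view_along_z(p: Point3) -> Pixel:
    return (math.floor(p[0]), math.floor(p[1]))


def view_along_x(p: Point3) -> Pixel:
    return (math.floor(p[2]), math.floor(p[1]))


# A 4x4 silhouette whose central 2x2 pixels belong to the object.
MASK = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
VIEWS = [(view_along_z, MASK), (view_along_x, MASK)]
CELLS = [(1.5, 1.5, 1.5), (0.5, 1.5, 1.5), (1.5, 1.5, 3.5)]
KEPT = carve(CELLS, VIEWS)
```

Of the three example cells, only the first survives. The second is rejected by the view along z, and the third passes that view but is rejected by the view along x, which shows why a cell must be consistent with every silhouette to remain part of the model.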

A novel approach to polyhedral reconstruction from silhouettes is presented by [73]. A geodesic sphere is created by successive subdivision of the faces of an icosahedron, and the centre of this sphere is arranged to fall within the volume of the object being reconstructed. Rays are traced out from the centre point to the vertices of the sphere. Using silhouette information, the lengths of these rays are adjusted so that they fall within the silhouette cone projected from each image. The sphere vertices are then adjusted to match the ray lengths. In this way it is possible to reconstruct certain types of objects in a very efficient manner. However, this method only works for objects which are “spherical-terrain-like” (STL). Such objects are ones where the surface can be traced out from a point within the volume; spheres and cubes fall into this category, but a teapot does not, since the handle represents a secondary set of surfaces. Although the paper suggests that the human head is a good example of an STL object, human ears are not consistent with this claim. Nevertheless, for STL objects excellent results are obtained in less time than with other leading polyhedral approaches.
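The ray adjustment step can be sketched as follows. This is an illustrative simplification and not the procedure of [73]: each view is assumed to be a pair of functions (a projection from 3D to image coordinates and a test for whether an image point lies on the silhouette), the centre is assumed to lie inside the object, and the ray length is found by marching outwards and then bisecting. The names `inside_all` and `fit_ray_length` and the two orthographic disc views are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Point2 = Tuple[float, float]
View = Tuple[Callable[[Vec3], Point2], Callable[[Point2], bool]]


def inside_all(point: Vec3, views: Sequence[View]) -> bool:
    """True if the point projects onto the silhouette in every view."""
    return all(inside(project(point)) for project, inside in views)


def fit_ray_length(centre: Vec3, direction: Vec3, views: Sequence[View],
                   max_len: float = 10.0, step: float = 0.05,
                   iters: int = 40) -> float:
    """Longest ray from centre along direction that stays in every cone.

    Assumes the object is spherical-terrain-like about centre, so the ray
    leaves the visual hull exactly once.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = tuple(d / norm for d in direction)

    def at(t: float) -> Vec3:
        return tuple(c + t * u for c, u in zip(centre, unit))

    # March outwards until the sample point leaves at least one silhouette.
    lo, hi = 0.0, step
    while inside_all(at(hi), views):
        if hi >= max_len:
            return max_len
        lo, hi = hi, hi + step
    # Bisect between the last inside and first outside sample.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if inside_all(at(mid), views):
            lo = mid
        else:
            hi = mid
    return lo


def unit_disc(p: Point2) -> bool:
    return p[0] * p[0] + p[1] * p[1] <= 1.0


# Two toy orthographic views (assumed) that both see a unit disc.
VIEWS = [
    (lambda p: (p[0], p[1]), unit_disc),  # looking along z
    (lambda p: (p[1], p[2]), unit_disc),  # looking along x
]
LEN_Y = fit_ray_length((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), VIEWS)
LEN_DIAG = fit_ray_length((0.0, 0.0, 0.0), (1.0, 0.0, 1.0), VIEWS)
```

With these two views the visual hull is the intersection of two cylinders, so the ray along y stops at length 1 while the diagonal ray in the xz plane reaches the square root of 2. In a full implementation the directions would come from the vertices of the subdivided icosahedron, and each vertex would be moved to `centre + length * direction`.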

[69] is a method complementary to EPVH and producing very similar results. Silhouettes are used to reconstruct the visual hull without the need to attempt 3D polyhedra intersections; instead, the visual hull is built up incrementally from features such as frontier points and intersection points that are derived solely from projective and epipolar constraints on the input silhouette images and the resulting silhouette cones. Consequently, the algorithm is able to reconstruct an exact visual hull from weakly calibrated camera images. Performance is similar to that of the EPVH algorithm. [40] significantly improves the performance of polyhedral modelling techniques by reducing the resolution of the silhouette contour polygon, which had previously been modelled through sub-pixel methods. At the same time, it is demonstrated that despite the lower resolution representation of the silhouette contour, the output is pixel consistent with the input images. Large reductions in the execution times of both Franco and Boyer’s previous algorithm and Lazebnik’s algorithm are demonstrated.
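The idea that a contour polygon can have far fewer vertices while remaining consistent with the same pixels can be illustrated with a small sketch. This is not the discretisation used in [40]; it only shows the simplest lossless case, in which vertices lying on a straight run of a pixel-boundary contour are dropped. The contour is assumed to be a closed polygon with integer coordinates and no repeated vertices, and the names `drop_collinear` and `shoelace_area` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def drop_collinear(contour: List[Point]) -> List[Point]:
    """Remove vertices of a closed polygon that lie on a straight run."""
    n = len(contour)
    kept = []
    for i in range(n):
        p, q, r = contour[i - 1], contour[i], contour[(i + 1) % n]
        # Zero cross product means p, q and r are collinear.
        cross = (q[0] - p[0]) * (r[1] - q[1]) - (q[1] - p[1]) * (r[0] - q[0])
        if cross != 0:
            kept.append(q)
    return kept


def shoelace_area(polygon: List[Point]) -> float:
    """Unsigned area of a simple polygon (shoelace formula)."""
    n = len(polygon)
    total = 0
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


# Boundary of a 3x2 pixel block traced one pixel edge at a time.
TRACED = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1),
          (3, 2), (2, 2), (1, 2), (0, 2), (0, 1)]
REDUCED = drop_collinear(TRACED)
```

Here the ten traced vertices reduce to the four corners of the block, and the enclosed area is unchanged, so the reduced polygon covers exactly the same pixels. Since every contour edge generates a face of the silhouette cone, fewer contour vertices mean fewer faces to intersect.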

CHAPTER 3. BACKGROUND AND RELATED WORK 53

In document Improving the performance of video based reconstruction and validating it within a Telepresence context (Page 62-66)