Summary - Learning Non-rigid, 3D Shape Variations using Statistical, Physical and Geometric Models

In this chapter, we have shown how to perform principal geodesic analysis (PGA) in the space of discrete shells. In doing so, we derived an alternative formulation of PGA that avoids any operations in the tangent space and works directly with objects lying on the manifold. The whole approach is based on an elastic energy functional measuring membrane and bending distortion, which makes it well suited to interpolating and extrapolating between sparse sample shapes in a physically meaningful way (Challenge 3). The result is a physically guided statistical shape model that is able to generalise across datasets containing large nonlinear articulations and deformations (Challenge 2). The central tool, the projection onto a submanifold of discrete shells, is well suited as the key ingredient in mesh editing or model fitting. Most importantly, the whole framework does not require any alignment step to remove rigid body motion, a step that is notoriously difficult in shape modelling (Challenge 4).

In comparison to the original PGA model [79], which deals with a low dimensional medial axis description, we consider high dimensional shape manifolds. Furthermore, we extend PGA to the time-discrete setting and introduce a distance measure that is invariant to rigid body motion. This invariance is also a substantial advantage over the Shell PCA model proposed in the previous chapter, which is based on vertex displacements and is hence alignment-dependent. As a consequence, the Shell PCA model only allows for small deformations, i.e. mesh editing and motion tracking

Figure 4.13: Qualitative visualisation of input shape (gray) projected onto model with (cols 2-4) J = 5, 11, 17 dimensions. Col 5 shows residual energy of projection with J = 17.

Figure 4.14: Qualitative reconstructions of input shapes from SCAPE [3]. Top: ground truth; bottom: reconstructions using J = N − 1 = 70.

[Plot. x-axis: Per-vertex RMS Error (0 to 0.18); y-axis: % with error < x (0 to 100). Legend: Freifeld and Black 2012; Gao et al. 2016; Shell PCA (Chapter 3); Ours.]

Figure 4.15: Leave-one-out evaluation of generalisation error on the SCAPE data set compared to [5] (using all shapes), Lie body [6] (60 dimensions) and Shell PCA (Chapter 3).
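The curves in Figure 4.15 plot, for each error value x, the percentage of measurements whose error is below x. A minimal sketch of how such a cumulative error curve could be computed is given here. The function name `cumulative_error_curve` and the toy error values are assumptions made for illustration; how the per-vertex RMS errors themselves are obtained is not restated in this summary and is not reproduced.

```python
import numpy as np

def cumulative_error_curve(errors, thresholds):
    """Percentage of error values strictly below each threshold.

    errors:     per-vertex error values (any shape, flattened here).
    thresholds: 1D array of x-axis values.
    Hypothetical helper for illustration only.
    """
    errors = np.asarray(errors, dtype=float).ravel()
    thresholds = np.asarray(thresholds, dtype=float)
    # Compare every error against every threshold, then average per threshold.
    below = errors[None, :] < thresholds[:, None]
    return 100.0 * below.mean(axis=1)

# Toy error values (made up), evaluated at three thresholds
toy_errors = [0.01, 0.03, 0.05, 0.2]
toy_thresholds = [0.02, 0.04, 0.18]
curve = cumulative_error_curve(toy_errors, toy_thresholds)
```

A curve that rises faster and saturates closer to 100 indicates lower generalisation error.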

Chapter 5: Groupwise Shape Correspondence via Functional Maps

5.1 Introduction

Computing dense correspondence for shapes in a collection (represented as discrete 3D meshes) is a fundamental problem in computer vision research. It arises in a number of applications including statistical shape modelling [24], face morphing [130], motion capture [131], performance-driven animation [132] and face transfer [97]. In essence, dense correspondence allows a collection of 3D shapes to be treated as vectors, and facilitates subsequent analysis such as Principal Component Analysis.
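To make the "shapes as vectors" idea concrete, the following is a minimal sketch, assuming every mesh in the collection has the same number of vertices V and that vertex i corresponds across all meshes. The helper names (`shapes_to_matrix`, `pca`) and the toy data are assumptions made for illustration, not part of the method described in this chapter.

```python
import numpy as np

def shapes_to_matrix(meshes):
    """Stack corresponded meshes (each a (V, 3) array) into an (N, 3V) matrix."""
    return np.stack([np.asarray(m, dtype=float).reshape(-1) for m in meshes], axis=0)

def pca(data, n_components):
    """PCA via SVD of the mean-centred data matrix.

    Returns the mean shape vector, the leading principal directions (as rows)
    and the sample variance captured by each direction.
    """
    mean = data.mean(axis=0)
    centred = data - mean
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    variances = s**2 / (data.shape[0] - 1)
    return mean, vt[:n_components], variances[:n_components]

# Toy collection (made up): 10 meshes with 50 vertices varying along one hidden mode
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 3))
mode = rng.normal(size=(50, 3))
meshes = [base + w * mode for w in np.linspace(-1.0, 1.0, 10)]

X = shapes_to_matrix(meshes)
mean, components, variances = pca(X, n_components=2)

# Project onto the first principal direction and reconstruct
coeffs = (X - mean) @ components[:1].T
X_rec = mean + coeffs @ components[:1]
```

Because the toy shapes vary along a single mode, one principal component reconstructs them up to numerical precision. Without dense correspondence the rows of `X` would not be comparable entry by entry, and the analysis would be meaningless.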

There are two distinct classes of dense correspondence problem. The first version of the problem (which we term within-subject correspondence) is to compute correspondence between scans of the same subject. This is a non-rigid registration problem which, in principle, has a well defined correct solution. Given enough information, it should be possible to uniquely describe a local region of one scan and find its corresponding region in another. A special case (and more constrained version) of this problem is where the scans come from a motion sequence where the shape non-rigidly deforms over time. In this context, temporal consistency means that dense correspondence can be viewed as a tracking problem.

The second version of the problem (which we term between-subject correspondence) is computing dense correspondence between scans of different subjects (i.e. faces with different identities). This is a much harder problem and arguably not well defined. In general, correspondence is a hypothesis of equivalence and defining the objective of the correspondence requires a definition of equivalence. Defining a meaningful notion of equivalence may only be possible in a sparse or low frequency sense. For example, sparse landmark points can be identified across different shapes [133] or it may be meaningful to talk of correspondence between parts or segments [134]. In this case, the correspondence in the remaining regions is interpolated (either explicitly or implicitly). An alternate view is to impose some external desirable criterion on the correspondence. For example, we may require that the correspondence is smooth [24] or that it is optimal with respect to an information theoretic measure (e.g. minimum description length [64]).

In both within-subject and between-subject correspondence, another important distinction is between pairwise and groupwise methods. Pairwise methods compute correspondence between each shape in the collection and a reference shape. This includes all template-based methods. In contrast, groupwise methods explicitly optimise an objective function that measures the quality of the correspondences across the whole set of shapes simultaneously. The advantage of this is that the result does not depend on the choice of reference shape or on the order in which samples are processed. Furthermore, groupwise information can help resolve ambiguities that would be present in pairwise correspondence. For a long time, groupwise approaches to computing correspondence have had limited practical application. This is because the size of the problem space grows very rapidly with the number of samples in the set, leading to a very high dimensional nonlinear optimisation problem.

A recent paradigm shift in non-rigid shape analysis is based on the notion of “functional maps”. The idea is to put real-valued functions on the meshes into correspondence, rather than points on the meshes directly [46]. A functional map can be converted to a point-to-point correspondence, and functional maps have recently been shown to perform very well for point-to-point shape matching [52, 53]. In this chapter, we adopt functional maps as the representation for solving the problem of dense correspondence for 3D shapes. Specifically, we propose a groupwise variant of functional maps. The functional map representation overcomes the computational expense of groupwise methods in two ways. First, functional maps are of much lower dimensionality than the mesh geometry itself. We show in our experiments that functional maps of dimension as low as 30 are enough for high quality correspondence between face meshes containing 250k vertices. Second, functional maps can be composed, meaning that, in our approach, only a minimal subset of maps needs to be optimised, with the remaining maps constructed as compositions of the maps in this minimal subset.
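The two properties above, low dimensionality and composability, can be illustrated with a small sketch. It is not the groupwise optimisation proposed in this chapter. It assumes each shape comes with a k-dimensional function basis (a random orthonormal basis stands in here for Laplace-Beltrami eigenfunctions), that a functional map is the k x k matrix representing the pull-back of functions through a known point-to-point map, and that the three toy shapes are vertex re-orderings of one another with rotated bases. All names are hypothetical.

```python
import numpy as np

def functional_map(phi_src, phi_tgt, tgt_to_src):
    """k x k matrix taking basis coefficients on the source to the target.

    tgt_to_src[i] is the source vertex matched to target vertex i, so a
    function f on the source is pulled back to the target as f[tgt_to_src].
    """
    return np.linalg.pinv(phi_tgt) @ phi_src[tgt_to_src]

def to_point_map(C, phi_src, phi_tgt):
    """Recover tgt_to_src by nearest-neighbour search between basis rows."""
    query = phi_tgt @ C  # target vertices expressed in the source basis
    d2 = ((query[:, None, :] - phi_src[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def random_rotation(k, rng):
    """Random k x k orthogonal matrix (toy stand-in for a basis change)."""
    q, _ = np.linalg.qr(rng.normal(size=(k, k)))
    return q

# Toy setup (made up): shapes A, B, C with V vertices and a k-dimensional basis
rng = np.random.default_rng(0)
V, k = 200, 10
phi_a, _ = np.linalg.qr(rng.normal(size=(V, k)))
b_to_a = rng.permutation(V)
c_to_b = rng.permutation(V)
phi_b = phi_a[b_to_a] @ random_rotation(k, rng)
phi_c = phi_b[c_to_b] @ random_rotation(k, rng)

# Only the maps A->B and B->C are computed; A->C follows by composition
C_ab = functional_map(phi_a, phi_b, b_to_a)
C_bc = functional_map(phi_b, phi_c, c_to_b)
C_ac = C_bc @ C_ab

# Reference: the map built directly from the composed point-to-point map
c_to_a = b_to_a[c_to_b]
C_ac_direct = functional_map(phi_a, phi_c, c_to_a)
recovered = to_point_map(C_ac, phi_a, phi_c)
```

Each map here is a k x k matrix (100 numbers) regardless of the vertex count, whereas a point-to-point map stores one index per vertex. In this idealised setting the composed map agrees with the directly computed one and converts back to the exact point-to-point map. On real scans the agreement is only approximate.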

We apply our method to both between-subject and within-subject correspondence problems. This includes high resolution, high quality static facial expression scans and general objects undergoing large nonlinear deformations, such as human bodies.
