
Landmarks Using a Sparse Shape Model

6.2. BUILDING A MODEL

matrix, X, where each row represents a sample, in this case the landmarks from a face, in the form

f = (x1, y1, z1, ..., xL, yL, zL).

To construct the matrix X, as before, the data must be zero-meaned.

We already have the mean location of the landmarks from the alignment step and equation 6.2.

Once the data matrix has been zero-meaned to give X⁰, we perform PCA using the singular value decomposition, so that:

X⁰ = UΣW^T. (6.8)

The matrix W contains the right singular vectors of X⁰ as its columns; equation 6.7 shows how it is used in the model.
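The decomposition in equation 6.8 can be sketched with NumPy as follows. This is a minimal illustration rather than the implementation used to build the model: the function name `pca_svd` and the random stand-in data are assumptions, and the mean is recomputed here instead of being taken from the alignment step.

```python
import numpy as np

def pca_svd(X):
    """PCA of a data matrix X (one sample per row) via the SVD."""
    mean = X.mean(axis=0)              # mean landmark vector
    X0 = X - mean                      # zero-meaned data matrix
    # Thin SVD: X0 = U @ diag(S) @ Wt, singular values in descending order.
    U, S, Wt = np.linalg.svd(X0, full_matrices=False)
    W = Wt.T                           # columns of W are the basis vectors
    return mean, U, S, W

# Hypothetical stand-in data: 20 faces, 14 landmarks with (x, y, z) each.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3 * 14))
mean, U, S, W = pca_svd(X)
X0 = X - mean
```

The product `U @ np.diag(S) @ W.T` reproduces the zero-meaned matrix, and the columns of `W` are orthonormal, which is what allows them to act as a basis for the face space.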

It is also possible to remove some of the dimensions in the model by removing the last columns of W, which represent dimensions with little variation. Removing these low-variation dimensions reduces the possibility of the model overfitting. When overfitting occurs, the model fitting procedure can use high parameter values in the low-variation dimensions to move the model fit closer to the input points but further from a "normal" face, as defined by the model.
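One way to choose how many columns of W to keep is to retain enough directions to cover a target share of the total variance. The sketch below assumes this criterion; the function name `truncate_basis` and the `keep_fraction` threshold are hypothetical, as the text does not state how many dimensions were removed.

```python
import numpy as np

def truncate_basis(W, S, keep_fraction=0.95):
    """Keep the leading columns of W that capture keep_fraction of the variance.

    keep_fraction is an assumed, illustrative threshold.
    """
    var = np.asarray(S, dtype=float) ** 2        # variance is proportional to sigma_i^2
    cumulative = np.cumsum(var) / var.sum()      # cumulative share of variance
    # First index at which the cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, keep_fraction)) + 1
    k = min(k, W.shape[1])                       # guard against rounding at 1.0
    return W[:, :k], k

# Hypothetical example: three directions with variances 9, 4 and 1.
W_demo = np.eye(3)
W_kept, k = truncate_basis(W_demo, [3.0, 2.0, 1.0], keep_fraction=0.9)
```

With these illustrative values, the first two directions cover 13/14 of the variance, so the third column is dropped.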

Model Parameters

The basis W spans a face space of landmark configurations. As the basis is found by PCA of the aligned landmark points from section 6.2.1, the basis vectors are ordered according to the variation found in the landmark points. The face space defined by W is indexed by b in equation 6.7; this vector represents any allowable configuration of landmarks.

The initial aligned landmarks used to build the model define prior constraints on the allowable configurations.

The basis vectors found using PCA are ordered according to how much variation in the input is captured by each direction. Therefore, the first basis vector is the direction in the model space along which the landmark positions showed the greatest variation in the dataset. The second parameter corresponds to the second basis vector, which is orthogonal to the first and lies in the direction of second greatest variation. By looking at the effect of each parameter on the face landmarks, we can develop an intuitive notion of how the structure of faces varies across the population (assuming the dataset is representative of the population as a whole).

When building the model, we can measure the amount of variation captured by each basis vector.

When a model is constructed using a neutral expression face from every individual in the FRGC dataset [13], the first three parameters captured the following amount of variation in the dataset respectively: 27.2%, 14% and 10.3%.
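The share of variation captured by each basis vector follows from the singular values, since the variance along direction i is proportional to the square of the i-th singular value. The sketch below uses hypothetical singular values for illustration; only the three percentages at the end are taken from the text.

```python
import numpy as np

def explained_variance_ratio(S):
    """Fraction of the total variance captured by each PCA direction."""
    var = np.asarray(S, dtype=float) ** 2   # variance is proportional to sigma_i^2
    return var / var.sum()

# Hypothetical singular values, for illustration only.
ratios = explained_variance_ratio([2.0, 1.0, 1.0])
print(np.round(100 * ratios, 1))

# Cumulative share of the first three parameters reported in the text.
reported = [27.2, 14.0, 10.3]
print(round(sum(reported), 1))   # 51.5
```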

[Figure 6.3 panels: (a) P1: Profile, (b) P1: Front, (e) P3: Profile, (f) P3: Front]

Figure 6.3: The variation in face shape captured by the first three parameters of the model. For each landmark, the mean position is shown together with the positions three standard deviations either side of the mean for each model parameter; the vectors show the relative directions from the mean.

Figure 6.3 shows how the first three parameters affect the positions of the face landmarks. The fourteen face landmarks that are represented in the model are shown in figure 6.1. The effect of each parameter is shown in front and side views. In each diagram, the black centre point represents the mean landmark location, and the red and blue points are at -3 and +3 standard deviations from the mean respectively.
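The red and blue points in such a diagram can be generated by setting a single parameter to ±3 standard deviations and leaving the others at zero. The sketch below rests on two assumptions not confirmed by this excerpt: that equation 6.7 has the usual linear form f = mean + Wb, and that the standard deviation of parameter i is the i-th singular value divided by sqrt(N - 1) for N training faces. The function name `shapes_along_mode` is hypothetical.

```python
import numpy as np

def shapes_along_mode(mean, W, S, n_samples, mode, n_std=3.0):
    """Landmark positions at -n_std and +n_std standard deviations of one parameter.

    Assumes the linear model f = mean + W @ b (assumed form of equation 6.7).
    """
    std = S[mode] / np.sqrt(n_samples - 1)   # assumed std of parameter `mode`
    b = np.zeros(W.shape[1])
    b[mode] = n_std * std                    # vary one parameter only
    offset = W @ b
    minus = (mean - offset).reshape(-1, 3)   # one (x, y, z) row per landmark
    plus = (mean + offset).reshape(-1, 3)
    return minus, plus

# Hypothetical two-landmark model with two basis vectors.
mean_demo = np.zeros(6)
W_demo = np.eye(6)[:, :2]
S_demo = np.array([4.0, 2.0])
minus, plus = shapes_along_mode(mean_demo, W_demo, S_demo, n_samples=5, mode=0)
```

By construction the mean shape lies midway between the two returned shapes, matching the black centre point in each diagram.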

For the first parameter, we can see that each of the landmarks moves away from or towards an approximate centroid. Intuitively, then, the first parameter represents an approximate scale of the face. Since the faces are deliberately not normalised to a fixed scale, we would expect the largest variance in face landmark locations to be due to the size of the face. It would be possible to build a shape model that represents only shape and not scale, but a separate scaling step would then be needed along with the alignment. A full generalised Procrustes analysis performs this scaling along with the alignment, but here we have chosen to keep the scale within the model and perform alignment without scale matching. The reasoning is that the size of the head is intrinsic to its appearance. Among a population of adults, face size does not change dramatically, as parameter 1 in figure 6.3 shows. Therefore, as we expect faces to be roughly the same scale, scale is argued to contribute to appearance and should be included in the model. One exception is children, whose face size changes with age. However, including scale in the model remains meaningful in that case, as it will correlate with age. The FRGC dataset used to build this model does not contain children, so this is not a concern in this application.
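Alignment without scale matching can be illustrated with a rigid (rotation plus translation) fit of one landmark set onto another. The sketch below uses the standard SVD-based solution, often called the Kabsch algorithm; it is offered as one common way to do this and is not necessarily the alignment procedure of section 6.2.1, which is not reproduced here. The function name `rigid_align` is hypothetical.

```python
import numpy as np

def rigid_align(source, target):
    """Rotate and translate source (L x 3) onto target (L x 3), without scaling."""
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    A = source - src_c                       # centred source landmarks
    B = target - tgt_c                       # centred target landmarks
    H = A.T @ B                              # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # least-squares rotation
    return A @ R.T + tgt_c                   # source size is left unchanged

# Hypothetical check: the target is a rotated, shifted and 1.5x larger copy.
rng = np.random.default_rng(1)
source = rng.normal(size=(14, 3))
c, s = np.cos(0.3), np.sin(0.3)
Rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
target = 1.5 * (source @ Rz.T) + np.array([10.0, -5.0, 2.0])
aligned = rigid_align(source, target)
```

In this example the aligned landmarks share the target's orientation and centroid but keep the source's size, so the remaining difference is pure scale, which is the kind of variation the model is left to capture.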

The second parameter seems to control the depth of the facial features. By varying the second parameter, the features move in or out from the mean. However, this depth control does not simply move the landmarks together toward the camera. Instead, the differing colours of the movement vectors show that decreasing the parameter increases the prominence of the nose tip and nasion landmarks and moves the other landmarks backwards. The nasion and pronasale move most over the six standard deviations, by 16.4mm and 19.2mm in total respectively. Interestingly, the lower lip moves more than the upper lip: 7.5mm over the six standard deviations compared to 2.8mm. Both alares stay approximately static when this parameter is varied, moving only approximately 2mm between the two extremes.

The third parameter seems to capture the width of the face. With this parameter, all of the nose landmarks (pronasale, alares and subnasale) are almost completely stationary, the largest movement being less than 2.5mm, while the exocanthions and mouth corners move to control the width, by ∼17mm and ∼8mm respectively.

The first three parameters of the model capture 51.5% of the variation in the faces used to build the model. Therefore the positional variation in the landmark locations for faces is well correlated; most of the variation is due to scale, either uniform or in a particular direction. This supports not including a uniform scaling step with the alignment, because this directional scaling may have been lost.

Landmark            Param 1   Param 2   Param 3
Endocanthion(L)     14.4      8.7       7.3
Nasion              16.2      16.4      3.4
Endocanthion(R)     14.0      9.1       7.8
Ala(L)              5.6       2.2       2.1
Pronasale           9.3       19.2      1.8
Ala(R)              6.0       2.2       2.4
Subnasale           5.2       8.0       0.8
Mouth corner(L)     10.8      7.9       8.7
Mouth corner(R)     10.9      7.7       8.1
Upper lip           10.7      2.8       1.0
Lower lip           12.9      7.6       2.9
Pogonion            23.1      2.7       8.2
Exocanthion(L)      16.6      9.4       17.3
Exocanthion(R)      15.5      9.5       16.0

Table 6.1: Distance in millimeters moved by each landmark when each of the first three parameters is individually varied between ±3 standard deviations from the mean landmark location. The movement directions can be seen in figure 6.3.
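Distances of the kind reported in Table 6.1 could be computed directly from the basis, since moving one parameter from -3 to +3 standard deviations displaces each landmark along a straight line. The sketch below assumes a linear model and takes the standard deviation of parameter i to be the i-th singular value divided by sqrt(N - 1); the function name `landmark_travel` and the toy values are hypothetical and do not reproduce the table.

```python
import numpy as np

def landmark_travel(W, S, n_samples, mode, n_std=3.0):
    """Distance each landmark moves between -n_std and +n_std of one parameter."""
    std = S[mode] / np.sqrt(n_samples - 1)       # assumed std of parameter `mode`
    direction = W[:, mode].reshape(-1, 3)        # per-landmark (x, y, z) direction
    # Total travel is 2 * n_std standard deviations along that direction.
    return 2.0 * n_std * std * np.linalg.norm(direction, axis=1)

# Hypothetical two-landmark basis: only the first landmark moves, along x.
W_toy = np.array([[1.0], [0.0], [0.0], [0.0], [0.0], [0.0]])
S_toy = np.array([2.0])
travel = landmark_travel(W_toy, S_toy, n_samples=5, mode=0)
```

Here the parameter has unit standard deviation, so the first landmark travels six units and the second stays still.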

In document Sparse Shape Modelling for 3D Face Analysis (Page 133-136)