Classification Performance Evaluation on the 3DE-VISIR Database

Chapter 6 3D Face Recognition Using Reconstructed Captures from Photometric

6.6 Classification Performance Evaluation on the 3DE-VISIR Database

The 3DE-VISIR database was originally proposed for expression classification and contains at least three sessions (positive, negative and neutral) for each identity [18]. The captures are obtained under both visible (VIS) and near-infrared (IR) light. The two types of illumination influence the estimation of 3D facial information differently, resulting in 3D captures with different characteristics.

As a subset of the 3DE-VISIR database, the IR captures offer several advantages for 3D face recognition and can contribute significantly to real-world applications. Hansen et al. [10, 18] proposed the use of near-infrared light to capture the 3D face because it is very directional and gives a more accurate overall depth reconstruction than visible light. Furthermore, acquisition systems using near-infrared light are more covert and less intrusive in real-world applications. However, some fine surface details might be lost due to sub-surface scattering [144], even though the overall reconstruction using IR light is better than that using VIS light [18].

Compared to other face databases containing expressions, the 3DE-VISIR database does not contain multiple expressions for each subject and only categorizes the expressions into positive, neutral and negative groups. Such a categorization covers a wider range of happy- and sad-related expressions and makes the subjects feel more relaxed, which is likely to result in more natural expressions [18]. It is therefore a useful source for evaluating expression-robust features.

In addition, the ‘one training sample’ scenario is another challenging problem in biometrics: it requires the features extracted and selected for each identity to be more effective and discriminative for matching. It is also of interest to apply the nasal curves found on the Photoface database to this new database. Therefore, in this section, the neutral capture of each identity is used for training and the other two captures of the same identity, with positive or negative expressions, are used for testing. To compare the VIS and IR captures, the recognition performance under the ‘one training sample’ scenario is evaluated separately for the positive and negative captures of the VIS and IR groups.
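The protocol described above (one neutral gallery capture per identity, expressive captures as probes) can be sketched as follows. This is only an illustrative sketch: it assumes that R1RR denotes the rank-one recognition rate and that matching is done by nearest neighbour under Euclidean distance, neither of which is stated in this section. The function names and the toy feature vectors are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_one_recognition_rate(gallery, probes):
    """Fraction of probes whose nearest gallery entry has the correct identity.

    gallery: dict mapping identity -> feature vector of its single neutral capture.
    probes:  list of (true_identity, feature_vector) from expressive captures.
    The nearest-neighbour matcher is an assumption made for illustration.
    """
    correct = 0
    for true_id, features in probes:
        best_id = min(gallery, key=lambda gid: euclidean(gallery[gid], features))
        if best_id == true_id:
            correct += 1
    return correct / len(probes)


# Toy data (hypothetical): three identities, one neutral capture each.
gallery = {"A": [0.0, 0.0], "B": [1.0, 1.0], "C": [2.0, 0.0]}
# Positive-expression probes; the probe of "C" lies closer to "B" and is misclassified.
positive_probes = [("A", [0.1, 0.1]), ("B", [0.9, 1.2]), ("C", [1.1, 0.9])]

r1rr_positive = rank_one_recognition_rate(gallery, positive_probes)
print(f"R1RR (positive probes): {r1rr_positive:.2%}")
```

The same function would be called once per probe group (positive, negative) and per illumination (VIS, IR) to fill in tables of the kind reported in this section.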

6.6.1 Feature Extraction and Recognition Performance Evaluation Using VIS Captures

The VIS part of the 3DE-VISIR database can be considered an extension of the Photoface database, as both were acquired under the same settings. As a consequence, the features extracted from the 75 nasal curves and the FSFS-based feature selection used for the Photoface database can be applied directly to this database.

After the 75 nasal curves are drawn, feature selection is applied to each component, which results in different curve combinations and can build a stronger classifier. The resulting R1RRs of each component tested on the VIS part are listed in Table 6.2. Features extracted from the depth, SNx, SNz and SI components produce similar recognition performance regardless of whether positive or negative probes are used. The R1RRs obtained from the surface normal and SI components significantly outperform that of the depth component. Furthermore, the resulting R1RRs are higher than those reported in [9].
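The per-component curve selection can be illustrated with a greedy forward search. The sketch below assumes that FSFS refers to a forward sequential feature selection that adds one curve at a time while a score keeps improving; the exact criterion used in the thesis is not given in this section. The `score` callable and the toy weights are hypothetical stand-ins for an R1RR measured on the selected curves.

```python
def forward_select(candidates, score, max_size=None):
    """Greedy forward selection over candidate curve indices.

    candidates: iterable of curve indices (e.g. range(75)).
    score:      callable taking a list of indices and returning a number;
                in practice this would be a recognition rate (assumption).
    Stops when no remaining candidate improves the score.
    """
    selected = []
    best_score = float("-inf")
    remaining = list(candidates)
    while remaining and (max_size is None or len(selected) < max_size):
        # Score every one-curve extension of the current subset.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(trials, key=lambda t: t[0])
        if top_score <= best_score:
            break  # no extension helps, so keep the current subset
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best_score = top_score
    return selected, best_score


# Toy score (hypothetical): each curve has a usefulness weight and every
# selected curve carries a small cost, so weak curves are rejected.
weights = {0: 0.5, 1: 0.3, 2: 0.02, 3: -0.1}


def toy_score(subset):
    return sum(weights[i] for i in subset) - 0.03 * len(subset)


chosen, chosen_score = forward_select(weights.keys(), toy_score)
print(chosen, round(chosen_score, 2))
```

Because the search is run separately for each component and probe group, it naturally returns different numbers and combinations of curves in each run.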

Table 6.2: R1RRs of the best curves combinations of four components using VIS captures under ‘neutral vs. non-neutral’ one training sample scenario.

R1RR     Positive   Negative
Depth    53.49%     51.72%
SNx      70.93%     70.11%
SNy      80.23%     72.09%
SNz      75.58%     71.26%
SI       73.26%     71.26%

Using the positive captures, the R1RR of SNy clearly outperforms the other four components, and its R1RR of 80.23% is produced from only 10 curves of the SNy map. The feature-level fusion of the SNy and SNz components produces an R1RR of 87.21%, which is very competitive in the ‘one training sample’ scenario given that the feature set is very small and the nasal curve extraction has relatively low complexity. For the negative captures, the R1RR of the SNy component (72.09%) is also the highest, but it is much lower than that obtained with the positive probes. The curves used for matching vary across components and probe groups (positive or negative), because FSFS selects a different number or combination of curves in each feature selection.
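Feature-level fusion, as used above for the SNy and SNz components, is commonly realized by concatenating the per-component feature vectors before matching. The sketch below assumes simple concatenation with L2 normalization of each component; the thesis does not specify the fusion details in this section, so both the normalization step and the function names are hypothetical.

```python
import math


def l2_normalise(vector):
    """Scale a vector to unit Euclidean length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def fuse_features(*component_features):
    """Concatenate normalized feature vectors from several components.

    Normalizing first keeps a component with larger raw values from
    dominating the distance computed on the fused vector (assumed choice).
    """
    fused = []
    for features in component_features:
        fused.extend(l2_normalise(features))
    return fused


# Toy feature vectors (hypothetical) for the selected SNy and SNz curves.
sny_features = [3.0, 4.0]
snz_features = [0.0, 2.0]
fused = fuse_features(sny_features, snz_features)
print(fused)
```

The fused vector is then matched in the same way as a single-component feature vector.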

6.6.2 Feature Extraction and Recognition Performance Evaluation Using IR Captures

Compared to landmarking on the VIS captures, the nasal root localization fails on nearly all the IR captures, and the two alar grooves are localized less consistently than on the VIS captures. As can be seen from Figure 1.1, the depth maps of the IR captures do not provide enough fine detail, especially in the nasal and adjoining regions, although the overall estimation of depth information is more accurate [18]. As a consequence, it is hard to directly apply landmarking approaches designed for captures with higher-accuracy depth maps to the IR captures. Instead, the method proposed in Section 6.3, which uses constant distances to localize the root and alar grooves, provides an effective solution to this problem.
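The idea of replacing detected landmarks with constant distances can be sketched as below. The details of Section 6.3 are not reproduced here, so this is only an assumed reading: the landmarks are placed at fixed pixel offsets from a reference point, taken here to be the nose tip. The offset values, the choice of reference point and all names are hypothetical.

```python
from typing import NamedTuple, Tuple

Point = Tuple[int, int]  # (row, column) in image coordinates


class NasalLandmarks(NamedTuple):
    root: Point
    left_alar_groove: Point
    right_alar_groove: Point


def landmarks_from_constant_distances(
    nose_tip: Point,
    root_offset: int = 40,      # hypothetical: rows from tip up to the root
    alar_half_width: int = 25,  # hypothetical: columns from tip to each groove
) -> NasalLandmarks:
    """Place the root and alar grooves at fixed offsets from the nose tip.

    No local surface detail is inspected, so the result does not depend on
    fine detail being present in the depth map.
    """
    row, col = nose_tip
    return NasalLandmarks(
        root=(row - root_offset, col),
        left_alar_groove=(row, col - alar_half_width),
        right_alar_groove=(row, col + alar_half_width),
    )


landmarks = landmarks_from_constant_distances((100, 80))
print(landmarks)
```

A fixed-size structure of this kind trades some anatomical precision for robustness, since it cannot fail on captures where detail-based detectors break down.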

Using the landmarks found by the constant distances, 75 nasal curves extracted from the depth, surface normal and SI maps are used to investigate the IR captures, and the resulting R1RRs are listed in Table 6.3. With the exception of SNy under positive probes (79.07% on IR versus 80.23% on VIS), every component performs better on the IR captures than on the VIS captures. Using positive captures as probes, the R1RR improves by about 9 percentage points for the SNx component and by roughly 3 to 6 percentage points for the depth, SNz and SI components.

Table 6.3: R1RRs of the best curves combinations of four components using IR captures under ‘neutral vs. non-neutral’ one training sample scenario. A fixed sized structure is applied instead of using the detected landmarks.

R1RR     Positive   Negative
Depth    56.98%     60.92%
SNx      80.23%     78.16%
SNy      79.07%     77.01%
SNz      80.23%     80.46%
SI       79.07%     78.16%

In contrast, for the negative probes, the R1RRs of all the components increase, by roughly 5 to 9 percentage points. Moreover, the curves selected from the SNy map no longer give the best performance, although the best surface normal components (SNx and SNz) still match or exceed the depth and SI components. Compared to the VIS captures, the IR captures are more likely to yield discriminative features for recognition, which mainly results from the high accuracy of the surface normal reconstruction under IR light.
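The comparison between the two illumination types can be checked directly from the values in Tables 6.2 and 6.3. The short sketch below computes the IR minus VIS difference in percentage points for each component and probe group; the variable names are illustrative, while the numbers are copied from the two tables.

```python
# R1RRs in percent as (positive probes, negative probes), from Table 6.2 (VIS).
vis = {
    "Depth": (53.49, 51.72),
    "SNx": (70.93, 70.11),
    "SNy": (80.23, 72.09),
    "SNz": (75.58, 71.26),
    "SI": (73.26, 71.26),
}
# R1RRs in percent as (positive probes, negative probes), from Table 6.3 (IR).
ir = {
    "Depth": (56.98, 60.92),
    "SNx": (80.23, 78.16),
    "SNy": (79.07, 77.01),
    "SNz": (80.23, 80.46),
    "SI": (79.07, 78.16),
}

# Gain of IR over VIS in percentage points, per component and probe group.
gains = {
    name: (round(ir[name][0] - vis[name][0], 2), round(ir[name][1] - vis[name][1], 2))
    for name in vis
}

for name, (positive_gain, negative_gain) in gains.items():
    print(f"{name:<5} positive {positive_gain:+.2f}  negative {negative_gain:+.2f}")
```

Listing the differences this way makes it easy to see which components benefit most from IR illumination and where the two probe groups behave differently.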

In document 3D Face Recognition Using Multicomponent Feature Extraction from the Nasal Region and its Environs (Page 133-135)