Problem Analysis
Figure 3.4: An example mesh from the dataset created using the Basel Face Model (BFM): (a) full resolution, (b) low resolution. Each face is generated, then the mesh resolution is reduced before the mesh is recreated.
3.3 Evaluation Criteria
The performance of the landmark localisation will be based on the following criteria:
• Accuracy (accuracy of predicted points)
• Reliability (can a solution be guaranteed)
• Robustness (ability to handle different situations: expression, occlusion)
• Efficiency (speed of computation)
• Autonomy (how much help does the system need?)
Accuracy is a measure of how well each landmark has been localised. The accuracy of the landmark localisation system depends on the distance of the output landmarks from their ground truth locations on the input face. This criterion applies to both the candidate detection process and the model fitting. The candidate detector must find candidates close to the ground truth positions of the landmarks. The model fitting process must accurately match to the landmarks available for a fit and provide an accurate prediction for the location of any missing landmarks.
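The distance-based notion of accuracy described above can be illustrated with a minimal sketch. The landmark labels, coordinates and the function name `landmark_errors` are hypothetical and chosen only for illustration; the sketch assumes that both the predictions and the ground truth are given as 3D points in the same coordinate frame and units.

```python
import math

def landmark_errors(predicted, ground_truth):
    """Return the Euclidean distance for each landmark label.

    Both arguments map a landmark label to an (x, y, z) point.
    Labels missing from `predicted` are skipped in this sketch.
    """
    errors = {}
    for label, truth in ground_truth.items():
        if label in predicted:
            errors[label] = math.dist(predicted[label], truth)
    return errors

# Hypothetical landmark positions (illustrative values only).
ground_truth = {
    "nose_tip": (0.0, 0.0, 0.0),
    "left_eye_inner": (-15.0, 30.0, -20.0),
}
predicted = {
    "nose_tip": (3.0, 4.0, 0.0),
    "left_eye_inner": (-15.0, 30.0, -18.0),
}

errors = landmark_errors(predicted, ground_truth)
mean_error = sum(errors.values()) / len(errors)
print(errors, mean_error)
```

The same measure applies to both stages: to the candidates produced by the detector and to the landmarks output by the model fit.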
There are several limitations to the achievable accuracy of the process. Firstly, the accuracy of the candidates will constrain the accuracy of the model fit. Secondly, the ground truth locations of landmarks are hand-labelled points on the full resolution data. These ground truth locations are only as accurate as the human operator placing the landmarks. In the FRGC dataset, the initial ground truth data was hand placed on a texture image in approximate alignment with the depth image. These ground truth locations were improved by Creusot [126] but may still present slight inaccuracies. Additionally, the lower mesh resolution in our datasets can mean that a ground truth landmark falls between vertices on the mesh.
Reliability is a measure of the consistency of the system, whereas accuracy is a measure of the quality of the results. Ideally, a full set of landmarks will be accurately found on every input face, so accuracy and reliability are related criteria. To measure the reliability of a process, the retrieval rate (true positive rate) will be used.
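As a rough illustration of the retrieval rate, the sketch below counts a landmark as retrieved when its localisation error falls within a distance threshold. The threshold of 10 units, the sample errors and the function name `retrieval_rate` are assumptions made for this example, not values taken from the evaluation.

```python
def retrieval_rate(errors, threshold):
    """Fraction of landmark instances localised within `threshold`.

    `errors` is a list of distances to the ground truth; None marks a
    landmark that was not found at all and is counted as a miss.
    """
    if not errors:
        raise ValueError("no landmark instances to evaluate")
    hits = sum(1 for e in errors if e is not None and e <= threshold)
    return hits / len(errors)

# Hypothetical errors for five landmark instances; one was not found.
errors_mm = [2.1, 4.8, None, 12.5, 7.9]
rate = retrieval_rate(errors_mm, threshold=10.0)
print(rate)
```

Sweeping the threshold over a range of distances would give a curve of retrieval rate against allowed error, which shows accuracy and reliability together.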
Robustness is an important characteristic for any face analysis system; ideally, the performance of the system remains unchanged under different conditions. Expression changes cause the facial surface to deform and bend, so landmarks can move or lose their characteristic shape. With the range scans used in most datasets, any change in the pose of the face results in self-occlusion and missing surface data. Our candidate detection process should be largely unaffected by a pose change because the detections are based on local surface descriptions.
We would expect the detector to continue to find the available landmarks on the surface. The model must be inherently robust to missing data, as the RANSAC fitting algorithm requires a fit using a minimal set of points to test for consensus. Another type of robustness that must be addressed is that of hard failures. The landmark localisation may be accurate most of the time but fail completely for some percentage of inputs; ideally, this hard failure rate will be minimal.
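One possible way to quantify the hard failure rate is sketched below. The definition used here is an assumption made for illustration: a face counts as a hard failure when no landmarks are returned or when its mean landmark error exceeds a cutoff. The cutoff value, the sample data and the function name `hard_failure_rate` are hypothetical.

```python
def hard_failure_rate(per_face_errors, cutoff):
    """Fraction of faces on which localisation fails completely.

    `per_face_errors` holds one list of landmark errors per face.
    Assumed definition: a face is a hard failure if it has no landmarks
    or if its mean landmark error exceeds `cutoff`.
    """
    if not per_face_errors:
        raise ValueError("no faces to evaluate")
    failures = 0
    for errors in per_face_errors:
        if not errors or sum(errors) / len(errors) > cutoff:
            failures += 1
    return failures / len(per_face_errors)

# Hypothetical per-face landmark errors (illustrative values only).
per_face_errors = [
    [2.0, 3.0, 4.0],     # accurate fit
    [50.0, 60.0, 45.0],  # fit landed in the wrong place
    [],                  # no fit returned
    [1.0, 2.0, 30.0],    # one poor landmark, mean still below the cutoff
]
failure_rate = hard_failure_rate(per_face_errors, cutoff=20.0)
print(failure_rate)
```

Reporting this rate separately from the mean error keeps a small number of complete failures from being hidden by otherwise accurate results.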
Efficiency measures the speed of computation of the landmarks. The system becomes impractical if localising landmark points takes a very long time. Additionally, the choice to develop in the MATLAB scripting language can slow computation down compared to developing in a compiled language like C.
Autonomy is the amount of intervention the system needs to function. The ultimate aim is to develop a system that is completely autonomous, requiring no outside intervention at all. The whole system should take a single face scan as input and produce a set of labelled landmarks as output.
3.3.1 Limitations
There are some expected limitations to the performance of the landmark localisation. For the candidate landmarking process the primary limitations are the surface descriptions and the input quality. We aim to determine a good set of landmarks to use based on their ease of detection. This set depends on the specific surface description that is chosen, therefore combining surface descriptions for candidate labelling is unlikely to have a large impact on the results. Additionally, this assumes that the chosen surface description is descriptive enough to distinguish between landmark points. The quality of the input will also affect the candidate landmark results because the surface is not preprocessed. No smoothing, spike removal or hole filling is performed on the input data, but these artefacts can greatly affect the accuracy of the surface descriptions when calculated on the face. Also, the resolution of the input places an upper limit on the possible accuracy of the candidate detection. Vertices from the input mesh will be selected as features and labelled as candidate landmarks, therefore the difference between the ground truth landmark and the closest vertex in the downsampled input results in a loss of accuracy.
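The accuracy limit imposed by mesh resolution can be made concrete with a small sketch: since candidates are mesh vertices, the best achievable error for a landmark is the distance from its ground truth position to the nearest vertex. The vertex coordinates, the grid spacing of 4 units and the function name `nearest_vertex_distance` are hypothetical.

```python
import math

def nearest_vertex_distance(point, vertices):
    """Distance from `point` to the closest vertex in `vertices`.

    With vertex-based candidates, this is the smallest error any
    candidate could achieve for a ground truth landmark at `point`.
    """
    return min(math.dist(point, v) for v in vertices)

# Hypothetical downsampled mesh patch: four vertices spaced 4 units apart.
vertices = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (4.0, 4.0, 0.0)]

# A ground truth landmark that falls in the middle of the patch.
ground_truth = (2.0, 2.0, 0.0)

lower_bound = nearest_vertex_distance(ground_truth, vertices)
print(lower_bound)
```

Computing this bound for every ground truth landmark in a dataset would separate the error caused by downsampling from the error caused by the detector itself.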
The final landmark localisation using the model fit faces limitations in the quality of the candidates that are selected. If landmark points are missed by the candidate detection then they cannot contribute to the overall fit. Similarly to the candidate detection, the model fit will also be constrained to some degree by the input mesh resolution since it is governed by the landmark candidates. The number of false candidates that are produced may also be a limitation for the model fitting procedure. If there are many false candidates then there will be a lower likelihood of choosing a correct set of candidates in the RANSAC fitting algorithm. This could result in the model fit being caught in a local optimum solution or needing to search through many combinations, resulting in a long runtime.
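The effect of false candidates on runtime can be illustrated with the standard RANSAC trial-count estimate. This is a simplified sketch, not the fitting procedure developed in this thesis: it assumes that each minimal sample is drawn uniformly at random from the candidates and that a fraction `inlier_ratio` of them are correct. The function name and the example ratios are hypothetical.

```python
import math

def ransac_trials(inlier_ratio, sample_size, confidence=0.99):
    """Trials needed to draw at least one all-correct minimal sample.

    Uses the standard estimate N = log(1 - p) / log(1 - w**s), where
    w is the fraction of correct candidates, s the minimal sample size
    and p the desired probability of success.
    """
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_good = inlier_ratio ** sample_size
    if p_good >= 1.0:
        return 1  # every sample is correct, one trial is enough
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_good))

# Compare a clean candidate set with one dominated by false candidates,
# using a minimal sample of three points in both cases.
trials_clean = ransac_trials(inlier_ratio=0.5, sample_size=3)
trials_noisy = ransac_trials(inlier_ratio=0.1, sample_size=3)
print(trials_clean, trials_noisy)
```

Under these assumptions the required number of trials grows rapidly as the proportion of correct candidates falls, which is why a large number of false candidates leads to a long search.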
3.4 Conclusion
In this chapter we have defined the problem that we aim to solve, landmark localisation using a sparse shape model, and defined the datasets and success criteria for the evaluation of the proposed system. The following three chapters form the core of this thesis and detail the development of the system. In the next chapter we approach the problem of selecting distinctive landmark points for a sparse shape model and suitable surface descriptions for candidate detection. Once the surface description and landmarks are chosen, a landmark candidate detection system is developed in the following chapter. The final core chapter of the thesis details the development of the sparse shape model and the associated fitting procedures.