5. Evaluation
5.2. Smart manual landmarking
5.2.2. Comparison with automatic correspondence optimization
The smart landmarking approach is compared against two fully automatic algorithms. The first algorithm, which is outlined in Section 5.2.2.1, is based on the iterative closest point (ICP) algorithm [BM92]. ICP-based algorithms are frequently used in the literature and are reported to produce sufficient results in many applications. In Section 5.2.2.2, a second algorithm is described which uses population-based optimization.
Both algorithms receive a set of training meshes as input. These input meshes were reconstructed from binary volumes – segmentations of the test data sets – using the marching cubes algorithm [LC87].
5.2.2.1. ICP-based correspondence
ICP-based correspondence algorithms establish correspondence by aligning the training meshes using the ICP point set registration algorithm [BM92]. A well-known shortcoming of ICP-based approaches is the use of the Euclidean distance as a correspondence measure, because pairs of closest points in registered shapes do not necessarily correspond. This shortcoming is alleviated to some extent by extracting the landmarks from consistent spherical parameterizations, which were computed using parameter space propagation [KW10]. While this approach also builds upon the ICP algorithm, it improves the quality of the correspondence through anisotropic scaling and fuzzy instead of fixed point correspondences. Furthermore, the parameter space representation of the meshes allows for easy interpolation of landmarks from the whole surface of the mesh, which means that the resulting landmark positions are not restricted to the mesh points of the input meshes.
The parameter space propagation method [KW10] works as follows: A reference mesh $M_{\mathrm{ref}}$ is selected from the training set, and an area-preserving spherical parameterization of $M_{\mathrm{ref}}$ is computed. All other meshes are anisotropically scaled such that the variances along their principal axes are identical to the variances along the principal axes of the reference mesh. The scaled meshes are aligned with the reference mesh using the ICP algorithm in order to derive a common coordinate system. Then a fuzzy correspondence between points of each mesh $M$ and points of $M_{\mathrm{ref}}$ is established, where the degree of correspondence between two points is determined by weights which depend on the Euclidean distance of the points in the common coordinate system. Using this fuzzy correspondence, parameter space coordinates for $M$ are computed by interpolating parameter space coordinates of corresponding points of $M_{\mathrm{ref}}$. A subsequent correction step handles overlapping triangles on the spherical parameterization of $M$, such that the mapping from parameter space to world space becomes bijective. Note that the anisotropic scaling is only used in order to derive the correspondence relation and is undone once the parameterization is propagated.
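The propagation step can be illustrated with a minimal sketch. It assumes that the meshes have already been anisotropically scaled and ICP-aligned, and it uses a Gaussian kernel with a hypothetical width `sigma` as the distance-dependent weight; the actual weighting function of [KW10] is not given here, and the correction step for overlapping triangles is omitted. The function name and signature are illustrative only.

```python
import math

def propagate_parameterization(points, ref_points, ref_sphere, sigma=1.0):
    """Assign each mesh point a unit-sphere coordinate by distance-weighted
    interpolation of the reference parameterization.

    points     : 3D points of mesh M in the common coordinate system
    ref_points : 3D points of the reference mesh
    ref_sphere : unit-sphere coordinates of the reference points
    sigma      : width of the assumed Gaussian weighting kernel (hypothetical)
    """
    propagated = []
    for p in points:
        acc = [0.0, 0.0, 0.0]
        for q, s in zip(ref_points, ref_sphere):
            # Fuzzy correspondence: the weight decays with Euclidean distance.
            w = math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2))
            acc = [a + w * c for a, c in zip(acc, s)]
        norm = math.sqrt(sum(a * a for a in acc))
        if norm == 0.0:
            # Weights underflowed or cancelled: fall back to the closest point.
            nearest = min(range(len(ref_points)),
                          key=lambda j: math.dist(p, ref_points[j]))
            acc, norm = list(ref_sphere[nearest]), 1.0
        # Project the interpolated coordinate back onto the unit sphere.
        propagated.append(tuple(a / norm for a in acc))
    return propagated
```

With a small `sigma` the result approaches a fixed closest-point correspondence, while a larger `sigma` blends the parameter space coordinates of several reference points.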
Corresponding landmarks are then extracted from the spherical parameterization using the simple icosahedron sampling technique: The faces of a unit icosahedron are subdivided, and all points of the subdivided icosahedron are then normalized to obtain sampling points on the unit sphere. Using the parameterization of a mesh $M$, sampling points can be mapped to world space in order to obtain the positions of the landmarks for the respective training example $M$.
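The icosahedron sampling can be sketched as follows. The sketch assumes a uniform subdivision in which every edge is split into `n` segments; this yields 10n² + 2 sampling points, which gives 1002 points for n = 10 and 492 points for n = 7. Whether exactly this subdivision scheme was used is an assumption, and the function name is illustrative.

```python
import itertools
import math

def icosahedron_sampling(n):
    """Return unit-sphere sampling points from an icosahedron whose
    edges are each split into n segments (assumed subdivision scheme)."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    verts = []
    for a, b in itertools.product((-1.0, 1.0), (-phi, phi)):
        verts += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]

    def is_edge(p, q):
        # With these coordinates, adjacent vertices are exactly 2 apart.
        return abs(math.dist(p, q) - 2.0) < 1e-9

    # Faces are the vertex triples that are pairwise adjacent.
    faces = [f for f in itertools.combinations(range(12), 3)
             if all(is_edge(verts[i], verts[j])
                    for i, j in itertools.combinations(f, 2))]

    samples = {}
    for a, b, c in faces:
        for i in range(n + 1):
            for j in range(n + 1 - i):
                k = n - i - j
                # Integer barycentric key: points shared between faces
                # (on edges and vertices) are stored only once.
                key = tuple((v, w) for v, w in ((a, i), (b, j), (c, k)) if w > 0)
                point = [sum(w * verts[v][d] for v, w in key) / n for d in range(3)]
                norm = math.sqrt(sum(x * x for x in point))
                samples[key] = tuple(x / norm for x in point)
    return list(samples.values())
```

Mapping these sampling points through the spherical parameterization of each training mesh then gives one landmark per sampling point and per mesh.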
5.2.2.2. Optimization-based correspondence
The optimization-based algorithm treats the problem of establishing correspondence as a problem of reparameterization, as proposed by Kotcheff and Taylor [KT98]. For a detailed description of the general approach, the reader is referred to the book by Davies et al. [DTT08] and the survey of Heimann et al. [HM09]. The DetCov function [KT98], which explicitly favors compact models, is used as the target function of the optimization algorithm. This function performed well in the study of Styner et al. [SRN∗03].
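To illustrate why a determinant-based objective favors compact models, the following sketch evaluates the log-determinant of a regularized covariance matrix of the landmark vectors. The regularization constant `delta` and the function name are assumptions made for illustration; the exact formulation in [KT98] may differ. For efficiency, the sketch uses the small Gram matrix of the shapes, which changes the value only by a term that is constant for a fixed number of shapes and landmarks.

```python
import math

def detcov_objective(shapes, delta=1e-3):
    """Log-determinant of the regularized shape covariance (sketch).

    shapes : list of equally long landmark vectors, one per training shape
    delta  : regularization constant (hypothetical value)
    """
    n, dim = len(shapes), len(shapes[0])
    mean = [sum(s[k] for s in shapes) / n for k in range(dim)]
    centered = [[s[k] - mean[k] for k in range(dim)] for s in shapes]
    # n x n Gram matrix; its nonzero eigenvalues equal those of the covariance.
    gram = [[sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
             + (delta if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    # Cholesky factorization; log det = 2 * sum of log of diagonal entries.
    chol = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = gram[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(s)
                log_det += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return log_det
```

A training set whose variation is concentrated in few, small modes yields a lower value than a set with widely spread landmarks, so minimizing this quantity over the reparameterizations drives the model toward compactness.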
The procedure starts by computing consistent spherical parameterizations using parameter space propagation [KW10]. This means that the ICP-based correspondence solution from Section 5.2.2.1 is the starting solution of the optimization scheme. However, in contrast to the former approach, the positions of the sampling points which are used for remeshing are iteratively optimized. As proposed by Davies et al. [DTT08], only the sampling points of a single shape are manipulated in one iteration, using clamped plate spline warp reparameterizations [DTT08]. For each data set, a fixed number of 3000 iterations is used, after which the objective function did not change significantly.
5.2.2.3. Shape model evaluation
For a quantitative evaluation of smart manual landmarking, two statistical shape models of different anatomical structures have been built. The first model is a statistical shape model of the left kidney using 16 static CT datasets from a Siemens Somatom Sensation scanner as the basis. The in-plane spacing of the CT scans is 0.74 mm. The slices were reconstructed with a thickness of 5 mm. The second statistical shape model has been built from 10 dynamic images of the cardiac left ventricle’s heart cycle. The scans are provided as 3D cine MRI data reconstructed along the ventricle’s main axis and consist of 4 mm slices with an in-plane spacing of 1.36 mm.
The left kidney mesh consists of 1002 points while the cardiac ventricle mesh consists of 480 points. Generating the training set using ICP-based correspondence establishment takes less than a minute on a desktop computer with a 2.4 GHz processor and 3 GB RAM. The population-based optimization requires 45 minutes for establishing correspondence on the ventricle data set and 95 minutes on the kidney data set using the same machine. Both automatic correspondence algorithms produce meshes of 1002 points (kidney) and 492 points (cardiac ventricle) in order to allow a fair comparison with the manual method.
In order to evaluate the quality of the constructed shape models, the common statistical shape model evaluation measures specificity $S$ and generalization $G$ are computed [DTT08]. They are defined as
$$S = \frac{1}{n_s} \sum_{A=1}^{n_s} \max_{i} \Psi(A, i) \quad \text{and} \quad G = \frac{1}{M} \sum_{i=1}^{M} \max_{A} \Psi(A, i). \tag{5.1}$$
$\hat{Y} = \{y_A : A = 1, \ldots, n_s\}$ is a set of shapes sampled from the model’s probability density function and $\hat{X} = \{x_i : i = 1, \ldots, M\}$ is the set of training shapes. $\Psi(A, i)$ denotes a function to compare shape $y_A$ with $x_i$. As Heimann et al. [HWM06] pointed out, it is more meaningful to calculate the similarity based on the resulting binary segmentations instead of using the landmark positions, because in image analysis one is most often interested in the volume encompassed by the model and not in the mesh itself. Therefore, the volumetric overlap $\Psi(A, i) = |Y_A \cap X_i| / |Y_A \cup X_i|$ is used as a similarity measure, with $Y_A$ and $X_i$ being the sets of voxels enclosed by $y_A$ and $x_i$. In this case, $S \in [0, 1]$ and $G \in [0, 1]$, and $S = 1$ and $G = 1$ denote a perfect specificity and generalization, respectively.
In order to have a comparable data basis, the training meshes generated with the smart landmarking method are converted to binary volumes. The binary masks are then taken as input to the automatic landmarking algorithms.
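The measures of Equation (5.1) with the volumetric overlap as similarity can be sketched in a few lines. The sketch assumes that every shape has already been voxelized and is given as a set of voxel index tuples; the function names are illustrative and not taken from the evaluation software.

```python
def volumetric_overlap(a, b):
    """Psi: |a ∩ b| / |a ∪ b| for two sets of voxel indices."""
    union = len(a | b)
    # Two empty volumes are treated as identical (assumption).
    return len(a & b) / union if union else 1.0

def specificity(samples, training):
    """S: for each sampled shape, the best overlap with any training shape."""
    return sum(max(volumetric_overlap(y, x) for x in training)
               for y in samples) / len(samples)

def generalization(samples, training):
    """G: for each training shape, the best overlap with any sampled shape."""
    return sum(max(volumetric_overlap(y, x) for y in samples)
               for x in training) / len(training)
```

Both measures need the full overlap matrix between sampled and training shapes, so the cost grows with the product of the two set sizes and with the number of voxels per shape.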
For the 6 largest modes of variation, a set of sample shapes $\hat{Y}$ is generated. Every set consists of 500 random samples. Using these sets, the generalization ability as well as the specificity of the generated statistical shape models is calculated. The analysis for each model takes 5.7 hours on a 2.4 GHz Quad Core Intel processor with Windows XP and 4 GB of memory. Figure 5.4 shows the results of the evaluation for the built kidney model using smart landmarking in comparison to the ICP-based method and to the DetCov method, while Figure 5.5 shows the same calculations for the cardiac left ventricle model. As can be seen, the smart landmarking approach performs better in terms of generalization (Figure 5.4(a)) and specificity (Figure 5.4(b)) in case of the statistical shape model of the kidney. The cardiac left ventricle model confirms the advantage of smart landmarking for the generalization ability (Figure 5.5(a)), while it performs slightly worse in terms of specificity (Figure 5.5(b)). However, in the latter case the performance of all algorithms is very similar.
Figure 5.4.: (a) Generalization and (b) specificity measures for the built statistical shape model of the kidney [EKW10]. The metric used to compare the shapes is the volumetric overlap between the labeled voxel segmentations. A value of 1.0 denotes a perfect overlap. [Plots: generalization (a) and specificity (b) over modes 1 to 6 for DetCov, ICP and Smart Landmarking.]
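The generation of the sample sets used in this evaluation can be sketched as follows. The sketch assumes a linear shape model given by a mean landmark vector, orthonormal modes and their variances, with independent Gaussian mode coefficients, and it assumes that the set for a given mode count uses the first `num_modes` modes. All names and the fixed seed are illustrative.

```python
import random

def sample_shapes(mean, modes, variances, num_modes, num_samples, seed=0):
    """Draw random shapes from a linear shape model (sketch).

    mean      : mean landmark vector
    modes     : list of mode vectors, sorted by decreasing variance
    variances : variance of each mode
    num_modes : number of leading modes used for sampling
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        shape = list(mean)
        for mode, var in zip(modes[:num_modes], variances[:num_modes]):
            # Coefficient of this mode, drawn with the mode's variance.
            b = rng.gauss(0.0, var ** 0.5)
            for k, component in enumerate(mode):
                shape[k] += b * component
        samples.append(shape)
    return samples
```

Calling `sample_shapes(mean, modes, variances, m, 500)` for `m` from 1 to 6 would yield one set of 500 shapes per mode count, each of which is then voxelized before the overlap is computed.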
5.2.3. Discussion
The results show that smart landmarking is in most cases superior to the standard correspondence establishment methods using ICP-based correspondence and DetCov population-based optimization. An explanation for this outcome is that smart landmarking does not contain any preprocessing and directly works on the generated meshes. The other approaches rely on binary segmentations and therefore apply a smoothing step to remove staircase artifacts, which reduces the degree of match between the generated training sets and the binary segmentations.
It could be argued that other objective functions, for example based on Minimum Description Length [HWM06], would probably improve the results of the automatic correspondence algorithms. However, more sophisticated methods also need considerably more time, especially for large training sets. For example, the statistical shape models of the liver (cf. Section 4.2.1) used in this thesis were constructed from over 200 training meshes. Using smart landmarking, no additional work is required apart from the segmentation step. This is especially useful when an existing statistical shape model is improved by adding new training data. Here, new shapes can be directly included in the training base and the statistical shape model can be created. Since the process of creating the statistical shape model is very fast, this procedure can be repeated for every new training shape without a negative impact on the overall workflow. The developed tool can therefore be integrated into clinical practice when a physician needs to segment a structure manually. This way a very large number of training meshes can be easily generated.
Figure 5.5.: (a) Generalization and (b) specificity measures for the built statistical shape model of the cardiac left ventricle [EKW10]. The metric used to compare the shapes is the volumetric overlap between the labeled voxel segmentations. A value of 1.0 denotes a perfect overlap. [Plots: generalization (a) and specificity (b) over modes 1 to 6 for DetCov, ICP and Smart Landmarking.]
As the results point out, human-guided landmarking is feasible in 3D and also suitable to produce statistical shape models of good quality. However, human interaction always introduces a potential error source. An inexperienced user may not be able to identify all corresponding features of a particular organ. Therefore, the quality of the training shapes may suffer. Furthermore, even the same user may generate different outcomes for the same organ.