
1.3 Generating Local Shape Models

1.3.2 Shape Clustering

Classification is a basic human conceptual activity. In general, the problem of classification consists of assigning a class label to an object, a physical process or an event [165]. Cluster analysis is the generic name for a collection of procedures that can be used to create a classification. These procedures form subsets of highly similar entities, or clusters [4]. Within each subset, the objects resemble one another more than they resemble the objects in other subsets.

This introduces the problem of how to define the resemblance between the objects. Often the similarity or dissimilarity is assessed according to a proximity measure (or distance measure) between the objects [175]. The Minkowski distance is introduced first; different values of its parameter $\rho$ yield different measures. For two $n$-dimensional objects $q_i$ and $q_j$, it is defined as:

$$D(q_i, q_j) = \left( \sum_{l=1}^{n} |q_{il} - q_{jl}|^{\rho} \right)^{1/\rho} \tag{1.2}$$

When $\rho = 2$, the distance becomes the Euclidean distance:

$$D(q_i, q_j) = \left( \sum_{l=1}^{n} |q_{il} - q_{jl}|^{2} \right)^{1/2} \tag{1.3}$$

Two other common special cases of the Minkowski distance are the city-block distance, also called the Manhattan distance, when $\rho = 1$,

$$D(q_i, q_j) = \sum_{l=1}^{n} |q_{il} - q_{jl}| \tag{1.4}$$

and the sup distance when $\rho \to \infty$:

$$D(q_i, q_j) = \max_{1 \le l \le n} |q_{il} - q_{jl}| \tag{1.5}$$

Chapter 1. Introduction

The Mahalanobis distance is another metric and is defined as:

$$D(q_i, q_j) = (q_i - q_j)^{T} C^{-1} (q_i - q_j), \tag{1.6}$$

where $C$ is the covariance matrix defined as $C = E[(q - \mu)(q - \mu)^{T}]$, $\mu$ is the mean vector and $E[\cdot]$ calculates the expected value of a random variable.

Only five proximity measures have been presented here; examples and applications of these and others can be found in [175].
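The distances in Eqs. (1.2) to (1.6) can be illustrated with a minimal NumPy sketch. The function names `minkowski` and `mahalanobis`, the example vectors and the diagonal covariance matrix are assumptions made for illustration only; they do not come from this work. The Mahalanobis function follows Eq. (1.6) as written, without a square root.

```python
import numpy as np


def minkowski(qi, qj, rho):
    """Minkowski distance of order rho, Eq. (1.2); rho = inf gives Eq. (1.5)."""
    diff = np.abs(np.asarray(qi, dtype=float) - np.asarray(qj, dtype=float))
    if np.isinf(rho):
        # Limit rho -> infinity: the largest coordinate-wise difference.
        return float(diff.max())
    return float((diff ** rho).sum() ** (1.0 / rho))


def mahalanobis(qi, qj, cov):
    """Quadratic form of Eq. (1.6) for a given covariance matrix `cov`."""
    d = np.asarray(qi, dtype=float) - np.asarray(qj, dtype=float)
    # Solve C x = d rather than forming the explicit inverse of C.
    return float(d @ np.linalg.solve(cov, d))


# Hypothetical example objects; coordinate differences are (3, 4, 0).
qi = np.array([1.0, 2.0, 3.0])
qj = np.array([4.0, 6.0, 3.0])

euclidean = minkowski(qi, qj, 2)       # 5.0
city_block = minkowski(qi, qj, 1)      # 7.0
sup = minkowski(qi, qj, np.inf)        # 4.0

# Hypothetical diagonal covariance: each squared difference is divided
# by the variance of its feature, giving 9/9 + 16/16 + 0/1.
cov = np.diag([9.0, 16.0, 1.0])
maha = mahalanobis(qi, qj, cov)        # 2.0 up to rounding

print(euclidean, city_block, sup, maha)
```

With an identity covariance matrix the quadratic form in Eq. (1.6) reduces to the squared Euclidean distance, which is a convenient sanity check.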

The problem of clustering arises because in many practical applications the training objects are not labelled, or only a small fraction of them are. In these cases, the structure of the data needs to be discovered without the help of labels. Clustering techniques are generally classified as hard partitional or hierarchical [175]. Hard partitional clustering attempts to divide the data points into a prespecified number of clusters without any hierarchical structure. Hierarchical clustering, on the other hand, groups the data into a sequence of nested partitions, from singleton clusters to a single cluster that includes all individuals. The following is a simple mathematical description. Given a set of input patterns $O = \{o_1, \ldots, o_j, \ldots, o_m\}$, where $o_j = (o_{j1}, o_{j2}, \ldots, o_{jn}) \in \mathbb{R}^n$, with each measure $o_{ji}$ called a feature (attribute, dimension, or variable):

1. Hard partitional clustering attempts to find a $k$-partition of $O$, $Z = \{Z_1, \ldots, Z_k\}$ with $k \le m$, such that:

• $Z_i \neq \emptyset$, $i = 1, \ldots, k$;

• $\bigcup_{i=1}^{k} Z_i = O$;

• $Z_i \cap Z_j = \emptyset$, $i, j = 1, \ldots, k$ and $i \neq j$.

2. Hierarchical clustering attempts to construct a tree-like, nested partition structure of $O$, $H = \{H_1, \ldots, H_l\}$ with $l \le m$, such that $Z_i \in H_s$, $Z_j \in H_z$, and $s \ge z$ imply $Z_i \subset Z_j$ or $Z_i \cap Z_j = \emptyset$ for all $i$, $j \neq i$, $s, z = 1, \ldots, l$.
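The two definitions above can be checked mechanically on small examples. The following is a minimal sketch in plain Python; the function names `is_hard_partition` and `is_nested_hierarchy` and the toy data are hypothetical. The hierarchy check is deliberately direction-agnostic: it only requires that any two clusters are either nested (in either direction) or disjoint, which is an assumption of this sketch rather than a literal transcription of the condition on $s$ and $z$.

```python
from itertools import combinations


def is_hard_partition(data, clusters):
    """Check the three hard-partition conditions for `clusters` over `data`."""
    data = set(data)
    clusters = [set(z) for z in clusters]
    # Condition 1: no cluster is empty.
    if any(len(z) == 0 for z in clusters):
        return False
    # Condition 2: the union of all clusters equals the data set.
    if set().union(*clusters) != data:
        return False
    # Condition 3: clusters are pairwise disjoint. Given condition 2,
    # this holds exactly when the cluster sizes add up to the data size.
    return sum(len(z) for z in clusters) == len(data)


def is_nested_hierarchy(data, levels):
    """Check that every level is a hard partition and all clusters nest."""
    data = set(data)
    levels = [[set(z) for z in level] for level in levels]
    if not all(is_hard_partition(data, level) for level in levels):
        return False
    all_clusters = [z for level in levels for z in level]
    # Any two clusters must be nested (either way) or disjoint.
    return all(
        a <= b or b <= a or a.isdisjoint(b)
        for a, b in combinations(all_clusters, 2)
    )


points = ["a", "b", "c", "d"]
hierarchy = [
    [{"a"}, {"b"}, {"c"}, {"d"}],   # singleton clusters
    [{"a", "b"}, {"c"}, {"d"}],
    [{"a", "b"}, {"c", "d"}],
    [{"a", "b", "c", "d"}],         # one cluster with all individuals
]

print(is_hard_partition(points, hierarchy[2]))        # True
print(is_hard_partition(points, [{"a", "b"}, {"b", "c", "d"}]))  # False
print(is_nested_hierarchy(points, hierarchy))         # True
```

Two partitions that cut across each other, such as `[{"a", "b"}, {"c", "d"}]` and `[{"a", "c"}, {"b", "d"}]`, are each valid hard partitions but do not form a hierarchy.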

Only the two classical types of clustering have been briefly introduced here. Nevertheless, there are further approaches that consider alternative ways to perform clustering, such as neural network-based, kernel-based, sequential data, large-scale data, and high-dimensional data clustering. Observations with thousands of features imply working in high dimensions; consequently, applications require clustering algorithms that are able to process data with more features than observations, hence high-dimensional data clustering. This approach requires diverse algorithms to achieve the clustering. According to [175], the methods can be linear, non-linear, projected and subspace. The method used in this work belongs to the category of non-linear projection, or non-linear dimensionality reduction, algorithms. For a comprehensive review of these methods refer to [175].

Spectral clustering has become one of the most popular modern clustering approaches. Spectral clustering algorithms are simple to implement, can be solved efficiently with standard linear algebra, and very often outperform traditional clustering algorithms. For example, K-means and learning a mixture model using EM are methods based on estimating explicit models of the data, and they provide high-quality results when the data is organised according to the assumed models. However, when the data is arranged in a more complex way and the cluster shapes are unknown, these methods tend to fail. Spectral clustering has been shown to handle such structured data well, since it does not require estimating an explicit model of the data distribution; rather, it uses a spectral analysis of the matrix of point-to-point similarities [180].

Spectral methods for clustering use eigenvectors corresponding to the highest eigenvalues of a matrix derived from the distance between points. They are closely related to spectral graph partitioning, in which the second eigenvector of a graph's Laplacian is used to define cuts over the graph. This analysis can be extended to perform clustering by building a weighted graph in which the nodes correspond to data points and the edge weights are related to the distance between the points [118].
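The graph-partitioning view can be sketched with NumPy as follows. This is an illustrative two-way cut only, not the Diffusion Maps method used in this work. The function name `spectral_bipartition`, the Gaussian similarity with bandwidth `sigma`, the unnormalised Laplacian $L = D - W$ and the toy points are all assumptions of the sketch. For the Laplacian, the relevant "second eigenvector" is the one with the second-smallest eigenvalue.

```python
import numpy as np


def spectral_bipartition(points, sigma=1.0):
    """Split points into two groups using the second Laplacian eigenvector."""
    points = np.asarray(points, dtype=float)
    # Pairwise squared Euclidean distances between all points.
    sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    # Gaussian similarity: edge weights of the fully connected graph.
    weights = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    # Unnormalised graph Laplacian L = D - W, with D the degree matrix.
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    _, eigvecs = np.linalg.eigh(laplacian)
    second = eigvecs[:, 1]
    # The sign of each entry defines the side of the cut; which group
    # receives label 1 is arbitrary because eigenvector signs are.
    return (second > 0).astype(int)


# Hypothetical toy data: two small squares separated along the x-axis.
group_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
group_b = [(4, 0), (4, 1), (5, 0), (5, 1)]
labels = spectral_bipartition(group_a + group_b, sigma=1.0)
print(labels)
```

On this toy data the first four labels agree with each other and differ from the last four. Obtaining more than two clusters would typically use several eigenvectors followed by a simple clustering step in the embedded space.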

Here, the idea is to use Diffusion Maps for non-linear, spectral clustering to build a set of linear shape spaces for statistical shape analysis (figure 1.6). The ideas, mathematical foundations, algorithms, application and examples will be given in Chapter 5.


Figure 1.6: Diffusion Maps clustering is applied to find sets of linear shape spaces useful in the proposed local shape analysis.