Equation 10-1. Sum of Squares Difference

10.2 Segmentation Object Detection

10.2.1 Segmentation Clustering

Knowledge Discovery in Databases (KDD) [130] is the field focussed on extracting knowledge from large data sets of seemingly unconnected records. Some techniques from the KDD field are also suitable for discovering knowledge in digital images. A digital image can be considered a spatial data set with unknown natural groupings, and analysing it with this in mind allows for novel approaches to image analysis. Clustering is a method of grouping like data based on a set of rules. In CV clustering systems, each pixel is examined to ascertain which group it most likely belongs to, while remaining as dissimilar as possible from every other group. Clustering models are highly iterative, examining each pixel and its neighbours before moving on to the next pixel. Three clustering techniques have been effectively applied to image analysis, and are capable of locating objects or regions of an image.

10.2.1. (a) DBSCAN Clustering

Density Based Spatial Clustering of Applications with Noise (DBSCAN) [5] is based on a density factor. It classifies a pixel according to the pixel's relationship with its nearest neighbours and its connectedness with the cluster with which it is to be associated. DBSCAN assigns a pixel to a cluster when a predefined number of pixels within its radius also belong to that cluster and the pixel is directly density reachable from the cluster's core points. From seemingly random spatial data, as seen in Figure 10-2, DBSCAN is capable of classifying not only the homogeneous pixels, but also pixels that are density reachable from core pixels. Any pixel not associated with any cluster is considered noise.

Because every pixel triggers a neighbourhood query, DBSCAN is highly iterative; it has a time complexity of 𝑂(𝑛 log 𝑛), provided the neighbourhood queries are supported by a spatial index. A minimal set of user parameters is necessary for DBSCAN to function. During classification of each pixel, the eps (ε) value determines the neighbourhood radius around the current pixel, and minPts is the minimum number of pixels required within that radius before the pixel is classified as belonging to the cluster. A set of criteria is necessary to decide on pixel suitability. Any pixel not meeting the criteria is ignored, while suitable pixels undergo cluster classification. In the process shown in Figure 10-3, each UNCLASSIFIED pixel is tested against its neighbouring pixels as it is processed. The radius set by eps is typically around two. If the central pixel has at least minPts pixels of similar value within its neighbourhood, it is marked as part of the current cluster, and each suitable pixel found within the neighbourhood is queued for the same test. Pixels which do not reach minPts are marked as NOISE. The process then moves on to the next suitable pixel. Under this model, a cluster grows quickly according to the density of the suitable pixels, as shown in Figure 10-2.
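The classification loop described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the suitable pixels are assumed to have already been reduced to a list of (row, column) coordinates, distance is Euclidean, and neighbours are found by brute force, so this version runs in 𝑂(𝑛²) rather than 𝑂(𝑛 log 𝑛). The names dbscan, region_query, and the label constants are hypothetical.

```python
from math import dist

UNCLASSIFIED = None   # pixel not yet visited
NOISE = -1            # pixel that did not reach minPts


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (itself included)."""
    return [j for j, q in enumerate(points) if dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label every point with a cluster id (0, 1, ...) or NOISE."""
    labels = [UNCLASSIFIED] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not UNCLASSIFIED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE            # may still be claimed later as a border point
            continue
        # points[i] is a core point: start a new cluster and grow it.
        labels[i] = cluster_id
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id   # border point reached from a core point
            if labels[j] is not UNCLASSIFIED:
                continue                 # first in, first served: never reassigned
            labels[j] = cluster_id
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)   # j is also a core point
        cluster_id += 1
    return labels


# Two dense 2x2 blocks and one isolated pixel (illustrative data only).
example = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (5, 5)]
example_labels = dbscan(example, eps=1.5, min_pts=3)
# example_labels == [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The two blocks become clusters 0 and 1, while the isolated pixel has only itself within eps and is therefore marked as NOISE.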

In density based clustering, the relationship in which point 𝑝 is density reachable from core point 𝑞 is not symmetrical, which means that a pixel may be suitable for more than one cluster. Pixels already classified in a cluster have no means of assessing whether they should be associated with a different (better) cluster. The method is first in, first served, so a pixel may remain in one cluster when it is better suited to another. Selecting the values for eps and minPts can therefore be critical to the success of DBSCAN segmentation. Using a priori knowledge is ideal, as setting eps too small will cause a loss of data, while setting it too large will merge neighbouring clusters. Large values of minPts can reduce the level of noise in the final clustered result.
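The asymmetry can be made explicit with the usual DBSCAN definitions. The notation is assumed here for illustration: 𝐷 is the set of suitable pixels and 𝑁_ε(𝑞) is the eps-neighbourhood of pixel 𝑞.

```latex
\begin{align*}
N_{\varepsilon}(q) &= \{\, p \in D \mid \operatorname{dist}(p, q) \le \varepsilon \,\} \\
q \text{ is a core point} &\iff |N_{\varepsilon}(q)| \ge \mathit{minPts} \\
p \text{ is directly density reachable from } q &\iff p \in N_{\varepsilon}(q) \ \text{and}\ |N_{\varepsilon}(q)| \ge \mathit{minPts}
\end{align*}
```

A border pixel 𝑝 can lie inside the neighbourhood of a core pixel 𝑞 without having minPts neighbours of its own, in which case 𝑝 is reachable from 𝑞 but 𝑞 is not reachable from 𝑝. Such a border pixel may sit within eps of core pixels belonging to two different clusters, and is claimed by whichever cluster reaches it first.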

Figure 10-2. Spatial clustering demonstration Left: unstructured data

10.2.1. (b) k-means Clustering

The ‘mean’ clustering methods attempt to build clusters where membership depends on the pixel's distance to the cluster mean. Segmenting images with k-means clustering [131, 233] aims to minimise the distance of the pixels within a homogeneous group to the centre of gravity (K) of the cluster. In real terms, the pixel under consideration is added to the cluster for which its sum of squares distance to the mean is smallest. Clustering with k-means requires a pre-set number of clusters to fill. Initially, each pixel is randomly assigned to one of the k clusters and the mean (centre) of each cluster is determined. From this point, a two-step process is repeated for a set number of iterations, or until the solution converges to a stable system.
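Stated as an objective, k-means seeks the assignment that minimises the total within-cluster sum of squares. The notation is assumed here for illustration and does not come from the source: 𝑥_𝑖 is the value of pixel 𝑖, 𝐶_𝑗 is the set of pixels in cluster 𝑗, and μ_𝑗 is the mean (centre of gravity) of that cluster.

```latex
\begin{align*}
J &= \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
&
\mu_j &= \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
\end{align*}
```

Reassigning a pixel to the cluster with the nearest mean can only lower or preserve 𝐽, as can recomputing each mean from its current members, which is why alternating the two steps settles towards a stable solution.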

Figure 10-4 shows the steps performed to classify pixels using the k-means algorithm. The clustering may not always converge, so a limit on the number of iterations is required. Clustering using k-means has a time complexity of 𝑂(𝑛𝑘𝑖), where 𝑛 is the number of pixels, 𝑘 is the number of required clusters and 𝑖 the number of iterations. Centroid selection is critical to the function of any of the means clustering models. Once all pixels have been allocated to a cluster, the new centroid is calculated as the mean of the pixels within the cluster. This time complexity results in excessive processing times, which preclude k-means from further use in AR RAL environments.

Step one requires each pixel to be reassessed as to whether it deserves to remain in its current cluster. This is determined by its distance to the cluster mean. Minimising the within-cluster sum of squares with respect to the current cluster means drives the new assignment. A pixel is reassigned when doing so reduces the cluster's sum of squares value and places the pixel in a different cluster that is a better fit. Step two recalculates each cluster mean so that the new centroid is available for the next iteration. The process ceases when a predefined number of iterations has been completed, or when the number of pixel reassignments falls to some pre-set minimum count (convergence).
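The two steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: pixels are represented by a flat list of grey-level values, the squared difference to the mean serves as the distance, and an empty cluster is re-seeded from a randomly chosen pixel. The names kmeans_pixels, cluster_means, max_iter, and min_changes are hypothetical.

```python
import random


def cluster_means(values, labels, k, rng):
    """Step two: recompute the mean (centroid) of every cluster."""
    means = []
    for c in range(k):
        members = [v for v, lab in zip(values, labels) if lab == c]
        # Assumption of this sketch: an empty cluster is re-seeded from a random pixel.
        means.append(sum(members) / len(members) if members else rng.choice(values))
    return means


def kmeans_pixels(values, k, max_iter=100, min_changes=0, seed=0):
    """Cluster pixel values into k groups; returns (labels, means)."""
    rng = random.Random(seed)
    # Initial classification: each pixel is randomly placed in one of the k clusters.
    labels = [rng.randrange(k) for _ in values]
    means = cluster_means(values, labels, k, rng)
    for _ in range(max_iter):
        # Step one: move each pixel to the cluster whose mean is nearest.
        new_labels = [min(range(k), key=lambda c: (v - means[c]) ** 2)
                      for v in values]
        changes = sum(a != b for a, b in zip(labels, new_labels))
        labels = new_labels
        # Step two: recompute the means for the next iteration.
        means = cluster_means(values, labels, k, rng)
        # Convergence: stop once the reassignment count falls to the pre-set minimum.
        if changes <= min_changes:
            break
    return labels, means


# Six grey levels forming one dark and one bright group (illustrative data only).
pixel_values = [10, 12, 11, 200, 205, 198]
pixel_labels, pixel_means = kmeans_pixels(pixel_values, k=2)
# sorted(pixel_means) == [11.0, 201.0]
```

Each pass over the data costs one distance evaluation per pixel per cluster, which is the source of the 𝑛𝑘𝑖 growth in processing time.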

Clustering with k-means is computationally expensive, with a time complexity cited as 𝑂(𝑛𝐶𝐷𝐼), where 𝐶 is the number of clusters, 𝐷 the number of dimensions and 𝐼 the number of iterations [131]. A k-means algorithm can take some time to converge, which makes it unsuitable for real-time video tracking. Additionally, k-means models are unable to build non-convex cluster shapes [234].

10.2.1. (c) Fuzzy C-Means Clustering

Fuzzy C-Means (FCM) is a modified version of the k-means clustering model, and is a so-called soft clustering method. FCM is similar to k-means in that the number of clusters is pre-defined and pixels are initially associated with the clusters randomly. It differs in that a pixel may be a member of more than one cluster, each with its own degree of membership, hence fuzzy classification. Within the k-means method, a pixel may be a candidate for more than one cluster, but is forced into a single cluster. A centroid is determined for each cluster, and each pixel is then reassessed as to its suitability for each of the clusters, based on minimising the objective function. Soft clustering has allowed more detail to be maintained in an image [235] in comparison to hard clustering such as DBSCAN. As with the k-means clustering model, a high number of iterations becomes computationally too expensive, which precludes FCM from use in the AR RAL environment.
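The difference between hard and soft assignment can be seen in a short Python sketch of FCM. This is a minimal illustration under stated assumptions, not a definitive implementation: pixels are a flat list of grey-level values, the fuzzifier m is set to 2, and the standard FCM update rules are used for the centroids and memberships. The names fuzzy_c_means, m, tol, and max_iter are hypothetical.

```python
import random


def fuzzy_c_means(values, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Soft-cluster pixel values; returns (memberships, centres).

    memberships[i][j] is the degree to which pixel i belongs to cluster j,
    and every row sums to 1.
    """
    rng = random.Random(seed)
    # Random initial memberships, normalised per pixel.
    u = []
    for _ in values:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([w / total for w in row])

    centres = [0.0] * c
    for _ in range(max_iter):
        # Centroid update: mean of the pixels weighted by membership ** m.
        for j in range(c):
            weights = [u[i][j] ** m for i in range(len(values))]
            total_w = sum(weights)
            if total_w > 0:   # keep the previous centre if the cluster is empty
                centres[j] = sum(w * v for w, v in zip(weights, values)) / total_w
        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        max_change = 0.0
        for i, v in enumerate(values):
            d = [abs(v - centre) for centre in centres]
            if 0.0 in d:
                # Pixel coincides with a centre: full membership there.
                raw = [1.0 if dj == 0.0 else 0.0 for dj in d]
            else:
                raw = [dj ** (-2.0 / (m - 1.0)) for dj in d]
            total = sum(raw)
            new_row = [r / total for r in raw]
            max_change = max(max_change,
                             max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if max_change < tol:   # memberships have stabilised
            break
    return u, centres


# Same illustrative grey levels as a dark and a bright group.
fcm_values = [10, 12, 11, 200, 205, 198]
fcm_u, fcm_centres = fuzzy_c_means(fcm_values, c=2)
```

Every pixel retains a non-zero membership in both clusters. A hard segmentation can be recovered by taking the cluster with the largest membership for each pixel, at the cost of discarding the graded information.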

The fuzzy nature of the model is convenient for some CV systems, such as medical MRI [170] or other image interpretation systems, but is not suitable for AR systems, which require clear segmentation of the OoI from the general digital clutter for tracking purposes. FCM also suffers from the same time complexity issues as the k-means model.

In document Object tracking in augmented reality remote access laboratories without fiducial markers (Page 191-195)