
Equation 4-3. Bitonal pixel model

6.2 Edge Detection

Edge detection is a common first-stage process for many CV analysis systems, and significant research on edge detection methods has been performed over the decades. Edges are mostly resolved by applying full or partial derivatives to detect zero-crossings or maxima within the local environment. Pixel intensity varies considerably across an entire image, causing inconsistent detection results. This is illustrated in Figure 6-5, which shows a portion of the Ground Truth test image GT-08; the red horizontal line indicates the data points used to map the image intensities in the graph superimposed over it. (Note: The image has been stretched so that it aligns with the image intensity graph.) The figure demonstrates the complex nature of pixel intensities and the core problem for most CV image analysis processes: extracting meaningful signal data. Different edge detection models capture different aspects of the intensities to realise object boundaries. Some obvious signal points correspond to the variations shown in the image segment, but determining which peaks or troughs correspond to valid edges remains an issue. For example, the region between points 88 and 97 in Figure 6-5 is noise that could be falsely identified as an edge boundary. Some of this noise can be seen manifested as interest points in Figure 6-10. Moreover, any system that uses a derivative kernel suffers from additional noise, because derivatives amplify noise [191].

Extracting edge data is achieved through three main methods, which are described below.

Figure 6-5. Image intensity variations from image GT-08 (coordinates 100, 51 to 200, 51) - highlighted in red
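The claim that derivatives amplify noise can be illustrated with a minimal sketch. The scan line below is hypothetical (it is not data from GT-08), and the forward-difference and second-difference operators are assumed stand-ins for the derivative kernels discussed in this section.

```python
# Hypothetical 1-D scan line (not data from GT-08): a flat region of intensity 50
# carrying small alternating noise of amplitude 2.
flat = [50 + (2 if i % 2 == 0 else -2) for i in range(8)]


def forward_difference(signal):
    """First-derivative approximation: f[i+1] - f[i]."""
    return [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]


def second_difference(signal):
    """Second-derivative approximation: f[i-1] - 2*f[i] + f[i+1]."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


noise_amplitude = max(abs(v - 50) for v in flat)
first_amplitude = max(abs(v) for v in forward_difference(flat))
second_amplitude = max(abs(v) for v in second_difference(flat))

print(noise_amplitude, first_amplitude, second_amplitude)  # 2 4 8
```

Although the region contains no edge, each differentiation doubles the amplitude of this particular noise pattern, which is why spurious peaks can compete with genuine boundary responses.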

6.2.1 Gradient Strength

Object boundary detection models function mostly through the calculation of image intensity gradients. Gradient vectors indicate both the rate of change of the intensity at an object boundary and the direction of that change. Taking second derivatives of an image, or convolving it with a kernel such as a Laplacian or Gaussian, produces gradient intensity response maps (shown in Figure 6-6). Figure 6-7 shows a typical response to a step edge (the change in image intensity as an object boundary is crossed) in convolution with a derivative kernel: the result is a local maximum that marks the edge. Other kernel types may produce zero-crossing points, or a sudden change in the gradient orientation may indicate points of interest, depending on the CV models employed. Given the varying pixel intensities seen in Figure 6-5, derivatives of such regions of an image result in a response function similar to Figure 6-6, which demonstrates the gradient magnitudes and zero-crossing points. Figure 6-6 also demonstrates the significant complexity in extracting relevant response outputs. The response functions for each edge detection model are presented as each model is defined. The synthetic object shown in Figure 6-8 (which is part of the SUSAN [16] test pattern) is used to visually present the results of each model’s convolution or method.

Figure 6-6. Second derivative response function of GT-08
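The two ways of marking an edge described above (a local maximum of the first derivative, or a zero-crossing of the second derivative) can be sketched in one dimension. The blurred step edge and the finite-difference operators below are illustrative assumptions, not the kernels used to produce Figures 6-6 and 6-7.

```python
# Hypothetical blurred step edge (dark 10 -> bright 90), centred on index 5.
signal = [10, 10, 10, 10, 20, 50, 80, 90, 90, 90, 90]


def first_derivative(s):
    """Central difference (s[i+1] - s[i-1]) / 2, zero-padded at both ends."""
    return [0] + [(s[i + 1] - s[i - 1]) / 2 for i in range(1, len(s) - 1)] + [0]


def second_derivative(s):
    """Second difference s[i-1] - 2*s[i] + s[i+1], zero-padded at both ends."""
    return [0] + [s[i - 1] - 2 * s[i] + s[i + 1] for i in range(1, len(s) - 1)] + [0]


def zero_crossings(values):
    """Positions where the sign flips between the nearest non-zero samples."""
    crossings, last_sign, last_index = [], 0, None
    for i, v in enumerate(values):
        sign = (v > 0) - (v < 0)
        if sign == 0:
            continue
        if last_sign != 0 and sign != last_sign:
            crossings.append((last_index + i) / 2)
        last_sign, last_index = sign, i
    return crossings


d1 = first_derivative(signal)
d2 = second_derivative(signal)
edge_from_maximum = max(range(len(d1)), key=lambda i: d1[i])
edge_from_zero_crossing = zero_crossings(d2)

print(edge_from_maximum, edge_from_zero_crossing)  # 5 [5.0]
```

Both criteria place the edge at the same position for this clean signal; on real image data they can disagree because noise adds extra maxima and extra sign changes.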

Convolving an image with a kernel provides a simplified approximation of full or partial derivatives for discovering edges within an image; the common methods are described below.

6.2.1. (a) Laplacian

Laplacian filter kernels, as shown in Chapter 4, Figure 4-5, are effective at sharpening image edges. Convolving the image with the variation shown in Figure 6-9, however, approximates the second derivative of the image; zero-crossing points are identified as object boundaries, which effectively produces an image edge map. Figure 6-10 demonstrates the effect of the Laplacian edge detection kernel in convolution with ground truth test image GT-08. Substantial noise is present in the left image of Figure 6-10, with many artefacts (for example, around the chimneys) that have no bearing on the actual edges of object boundaries.

Figure 6-8. Test object used for response function demonstration

          − − − − − − − − 1 1 1 1 8 1 1 1 1

Noise within the image edge map causes hard edges to merge with weaker nearby edges, confounding the results and providing less than ideal data for any secondary image processing functions. Applying image filtering to reduce high-frequency noise also reduces the effectiveness of the edge detection process, as can be seen in the right image of Figure 6-10, because edge detection results arise specifically from the high frequencies of object boundaries.
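The trade-off between noise filtering and edge strength can be sketched in one dimension. The ideal step edge, the 3-tap mean filter, and the second-difference operator below are assumptions chosen for illustration; the filtering applied to produce Figure 6-10 is not specified here.

```python
def second_difference(s):
    """Second-derivative approximation: s[i-1] - 2*s[i] + s[i+1]."""
    return [s[i - 1] - 2 * s[i] + s[i + 1] for i in range(1, len(s) - 1)]


def box_smooth(s):
    """3-tap mean filter; the two end samples are copied unchanged."""
    return ([s[0]]
            + [(s[i - 1] + s[i] + s[i + 1]) / 3 for i in range(1, len(s) - 1)]
            + [s[-1]])


# Hypothetical ideal step edge: dark (0) to bright (90).
step = [0] * 5 + [90] * 5
smoothed = box_smooth(step)

peak_raw = max(abs(v) for v in second_difference(step))
peak_smoothed = max(abs(v) for v in second_difference(smoothed))

print(peak_raw, peak_smoothed)  # 90 30.0
```

A single pass of a small low-pass filter reduces the peak second-derivative response at the edge to a third of its original value, since the filter removes the same high frequencies that the detector relies on.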

The results of the Laplacian filter can be readily seen in the response function shown in Figure 6-11. Strong responses appear at high-contrast boundaries, such as the black border regions adjacent to the brighter regions. The vertical central line and the right oblique of the test image show lower intensity gradients, which also appear quite weak in the filter’s edge detection results. Also visible in the response function is some noise along the left oblique edge, hidden by the strength of the edge detection.

Filter edge detection results.

Figure 6-11. Laplacian response function to test object

6.2.1. (b) Laplacian of Gaussian

The Laplacian of Gaussian (LoG) filter is a second-order operator proposed by Marr [14] to detect intensity changes over different scales. It combines the benefits of Gaussian filtering and the Laplacian edge detector, and focuses on rapid changes in image intensity. Equation 6-1 is the variation of the LoG edge detector used for this work; its response curve is shown in Figure 6-12, for which 𝜎 has been set to 1.2. Calculating the LoG kernel is similar to generating Gaussian kernels, in that there are several ways to achieve an effective kernel.
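One of the several possible ways to generate a LoG kernel is to sample the continuous LoG function on an integer grid. The sketch below does this for 𝜎 = 1.2 using the standard closed form of the Laplacian of a normalised Gaussian. The function names, the 9x9 kernel size (radius 4), and the mean subtraction that forces the kernel to sum to zero are assumptions made for illustration, not the procedure used in this work.

```python
import math


def log_value(x, y, sigma):
    """Laplacian of a normalised 2-D Gaussian evaluated at (x, y)."""
    r2 = x * x + y * y
    return ((r2 - 2 * sigma ** 2) / (2 * math.pi * sigma ** 6)
            * math.exp(-r2 / (2 * sigma ** 2)))


def log_kernel(sigma, radius):
    """Sample the LoG on a (2*radius + 1) square grid.

    The mean is subtracted so the kernel sums to zero (an assumed design
    choice), which keeps the response to flat image regions at zero.
    """
    size = 2 * radius + 1
    raw = [[log_value(x, y, sigma) for x in range(-radius, radius + 1)]
           for y in range(-radius, radius + 1)]
    mean = sum(sum(row) for row in raw) / (size * size)
    return [[v - mean for v in row] for row in raw]


kernel = log_kernel(sigma=1.2, radius=4)
for row in kernel:
    print(" ".join(f"{v:8.4f}" for v in row))
```

With this sign convention the centre of the kernel is negative and the surrounding ring is positive, the opposite polarity to a Laplacian kernel with a positive centre. The zero-crossings fall in the same places, only the sign of the response is inverted.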

From the response function of the test object, shown in Figure 6-13, the result appears similar to the Laplacian response. While the edge detection results appear stronger, the response function does not appear to have visibly changed. Closer examination shows that all edges have a stronger gradient effect (gradient magnitude) than in the standard Laplacian response, but the normalisation applied by the graphing functions has minimised the apparent difference. The LoG response has also created larger gradient signals at each of the four outer corners.
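The way plot normalisation can hide a difference in gradient magnitude can be shown with a small sketch. It assumes min-max scaling to the range [0, 1], which is only one plausible form of the normalisation applied by the graphing functions, and the two response profiles are hypothetical.

```python
def normalise(values):
    """Min-max scale to [0, 1], as a plotting routine might do before display."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


weaker = [0, 10, 20, 0, -20, -10, 0]   # hypothetical response profile
stronger = [3 * v for v in weaker]     # same shape, three times the magnitude

print(max(weaker), max(stronger))                 # 20 60
print(normalise(weaker) == normalise(stronger))   # True
```

After scaling, the two profiles are indistinguishable, so a stronger response only becomes apparent when the raw magnitudes are compared.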

$$
\nabla^2 G = \frac{1}{2\pi\sigma^6}\left((x^2 + y^2) - 2\sigma^2\right) e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{6-1}
$$
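As a check on the form of the LoG expression, it can be derived by applying the Laplacian to a two-dimensional Gaussian. The sketch below assumes the standard normalised Gaussian $G(x, y)$ with standard deviation $\sigma$; that definition of $G$ is an assumption, as it is not restated in this section.

```latex
\begin{aligned}
G(x, y) &= \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \\[4pt]
\frac{\partial^2 G}{\partial x^2} &= \left(\frac{x^2}{\sigma^4} - \frac{1}{\sigma^2}\right) G,
\qquad
\frac{\partial^2 G}{\partial y^2} = \left(\frac{y^2}{\sigma^4} - \frac{1}{\sigma^2}\right) G \\[4pt]
\nabla^2 G &= \frac{\partial^2 G}{\partial x^2} + \frac{\partial^2 G}{\partial y^2}
= \frac{(x^2 + y^2) - 2\sigma^2}{\sigma^4}\, G
= \frac{1}{2\pi\sigma^6}\left((x^2 + y^2) - 2\sigma^2\right) e^{-\frac{x^2 + y^2}{2\sigma^2}}
\end{aligned}
```

The expression is zero where $x^2 + y^2 = 2\sigma^2$, so under this assumption the zero-crossing ring of the continuous kernel has radius $\sigma\sqrt{2}$, which is approximately 1.70 pixels for $\sigma = 1.2$.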

In document Object tracking in augmented reality remote access laboratories without fiducial markers (Page 113-117)