The first step in any edge-based segmentation is edge detection. Though methods for the detection of edges in colour images have been proposed [1, 3], the most established methods involve highlighting rapid changes in the image intensity function (grey level) in order to produce a binary edge image with pixel values corresponding to the states ‘edge’ or ‘not edge’. It is to these intensity-based methods of edge detection that we restrict the discussion of this section.
The majority of edge detection methods assume a step edge model in which the edges of objects are assumed to correspond to rapid, high-contrast changes in grey level, which in turn correspond to features in the first- and second-order gradients of the image. Figure 2.2 shows a one-dimensional discontinuity along with its first- and second-order gradient functions. Though the levels of the derivatives have been accentuated for illustrative purposes, it is clear that the location of the discontinuity is signified by the peak in the first derivative and the zero-crossing in the second. The magnitude of the gradient of a continuous function in two dimensions, f(x, y), is given by Equation 2.1 and the second derivative, frequently referred to as the Laplacian, by Equation
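2.2.

$$|\nabla f(x, y)| = \sqrt{\left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2} \tag{2.1}$$

$$\nabla^2 f(x, y) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} \tag{2.2}$$

However, the partial derivatives of discrete data cannot be obtained directly, and so must be approximated by differences.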
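The approximation by differences can be illustrated in one dimension. The following is a minimal sketch, not a definitive implementation: the sample values in `f` are hypothetical and chosen only to resemble a softened discontinuity, and the helper names `convolve_valid` and `zero_crossings` are introduced here for illustration. It shows the peak of the first difference and the zero-crossing of the second difference falling at the same sample.

```python
def convolve_valid(signal, kernel):
    """Discrete convolution, keeping only positions where the kernel fully overlaps."""
    flipped = kernel[::-1]  # convolution flips the kernel before sliding it
    width = len(kernel)
    return [
        sum(flipped[j] * signal[i + j] for j in range(width))
        for i in range(len(signal) - width + 1)
    ]


def zero_crossings(values):
    """Indices at which the sequence changes sign (an exact zero between opposite signs counts once)."""
    found = []
    for i in range(1, len(values)):
        prev, cur = values[i - 1], values[i]
        if prev * cur < 0:
            found.append(i)
        elif cur == 0 and i < len(values) - 1 and prev * values[i + 1] < 0:
            found.append(i)
    return found


# Hypothetical softened edge (illustrative values only).
f = [0, 0, 1, 4, 8, 12, 15, 16, 16]

# Output index i of both results is centred on sample i + 1 of f.
first = convolve_valid(f, [1, 0, -1])   # f[i + 2] - f[i], proportional to a central difference
second = convolve_valid(f, [1, -2, 1])  # f[i] - 2 f[i + 1] + f[i + 2], the second difference

peak = first.index(max(first))   # location of the first-difference maximum
crossing = zero_crossings(second)  # location(s) where the second difference changes sign
```

Here `peak` and the single entry of `crossing` coincide, which is the discrete counterpart of the behaviour shown in Figure 2.2.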
Several methods exist for the estimation of the first derivative of a digital image, all of which rely on the convolution of the signal with a number of kernels, each responding more strongly to a gradient in one specific direction; such kernel sets include the Sobel and Prewitt operators [1]. The gradient magnitude is most commonly evaluated using the Euclidean norm of the results of all the convolutions, and the gradient direction by the kernel responding most strongly or by some trigonometric calculation based on the contributions of orthogonal kernels. Where an edge exists, it lies perpendicular to the gradient direction, which can therefore be useful in edge linking. Having estimated the gradient, the image is subjected to a thresholding in which pixels having an associated gradient higher than a specified value are given the label ‘edge’ and those below the value ‘not edge’. It is the determination of this threshold value that poses most difficulty. Figure 2.3(a) shows a section of the original image of ‘girl’, the gradient magnitude of which is calculated by taking the Euclidean norm of the convolutions of the image with the vectors [1, 0, −1] and [1, 0, −1]^T, Figure 2.3(b). This image is subject to thresholding at
levels equal to one quarter and one twentieth of the maximum value, the results of which are shown in Figures 2.3(c) and 2.3(d), in which the ‘edge’ state is shown in black. It is clear from the results that the higher of the two thresholds misses important edges, including those of the back of the head of the toy. Although the lower threshold detects more of these edges in the image function, it has also detected a number of discontinuities which do not correspond to object boundaries, and the edges already detected at the higher threshold have become thicker. Thick edges are undesirable because pixels defined as edges are not part of any region, and in order to extract a tessellation as required by Definition 1 each ‘edge’ pixel must be assigned to a region. Edges greater than a single pixel in width leave some ambiguity as to the exact location of the edge, and so the correlation of edges with object boundaries is compromised. Thin edges, on the other hand, being of single-pixel thickness, can
Figure 2.3: A section of the original image of ‘girl’ (a), the normalised gradient thereof (b) and examples of thresholding the gradient image at one quarter (c) and one twentieth (d) of its range.
[Figure 2.4 panels: (a) f(x) against x, with features labelled ‘Quantisation Error’, ‘Soft Edge’ and ‘Step Edge’; (b) grad(f(x)) against x.]
Figure 2.4: A synthetic one-dimensional signal (a) and the gradient thereof (b).
easily be assigned to a region without losing edge accuracy. Only in very few cases, where objects are very distinct from the background but are otherwise homogeneous with respect to grey level, does gradient thresholding give results which can be used directly to form a segmentation. This is due to the fact that edge detection strategies requiring the estimation of the gradient are highly sensitive to quantisation error and noise whilst being relatively insensitive to soft edges. Figure 2.4(a) shows a synthetic one-dimensional signal with three features: quantisation error or noise, a soft edge or ramp, and a sharp or step edge. It should be noted that the latter is highly unlikely to occur in real images due to the finite sampling grid and anti-aliasing requirements. Figure 2.4(b) shows the gradient of the signal calculated by convolution with the kernel [−1, 1]. Whilst the sharp edge gives rise to a large peak in the gradient function, the soft edge gives no higher a response than the quantisation noise. Thresholding the gradient of this function will result in either the detection of all three features or only the step edge. While methods such as non-maximal suppression and hysteresis thresholding [1] can improve the standard gradient thresholding techniques, reducing their response to noise, thinning
the regions of state ‘edge’ and increasing the correlation between detected edges and object boundaries, the resulting images still suffer from the problem that the lines defined as edges do not form closed contours from which a tessellation of regions can be easily extracted. Even more computationally expensive methods, such as edge relaxation, suffer from this problem when applied to gradient thresholding results. The drawbacks of first-order gradient thresholding, including thick edges and a tendency to produce open contours, are overcome to some extent by methods which use the second-order gradient or Laplacian of the image.
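The thresholding dilemma described for Figure 2.4 can be reproduced with a small synthetic signal. This is a sketch under stated assumptions: the sample values and the two threshold levels are hypothetical, chosen only to give one noisy stretch, one unit-slope ramp and one step, and `gradient_magnitude` and `threshold` are illustrative helper names.

```python
def gradient_magnitude(signal):
    """Absolute response to convolution with the kernel [-1, 1]."""
    # Convolving with [-1, 1] yields signal[i] - signal[i + 1] at each valid position;
    # the sign is discarded because only the strength of the change matters here.
    return [abs(signal[i] - signal[i + 1]) for i in range(len(signal) - 1)]


def threshold(gradient, level):
    """Indices labelled 'edge': gradient strictly above the given level."""
    return [i for i, g in enumerate(gradient) if g > level]


# Hypothetical signal: quantisation noise (indices 0-6), a soft ramp (7-12)
# and a step edge (between indices 15 and 16).
f = [5, 6, 5, 5, 6, 5, 5, 5, 6, 7, 8, 9, 10, 10, 10, 10, 20, 20, 20]

grad = gradient_magnitude(f)
low = threshold(grad, 0.5)   # low threshold: noise, ramp and step are all labelled
high = threshold(grad, 5)    # high threshold: only the step survives
```

Because the ramp and the noise both produce a response of 1, no threshold on `grad` keeps the soft edge while rejecting the noise; the low threshold also marks the ramp as a run of five adjacent ‘edge’ samples, a one-dimensional instance of a thick edge.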
It has been demonstrated that the second derivative of an image function has a zero-crossing at every extremum of the first derivative (Figure 2.2). As such, detecting the zero-crossings in the Laplacian of an image function will result in huge over-detection, particularly in the presence of noise or quantisation error. One method of reducing this sensitivity is to smooth the image using a linear filter prior to estimating the second derivative. The method discussed here is due to Marr and Hildreth [4]. The filter of choice is the linear Gaussian filter, which offers the optimal compromise between spatial support and frequency bandwidth. The effect of convolving the Gaussian kernel with the image function is to remove noise and soften edges. The second derivative or Laplacian of the smoothed image is then estimated. An expression for the complete process is given in Equation 2.3.
$$\nabla^2\left[G(x, y, \sigma) * f(x, y)\right] \tag{2.3}$$

But, since convolution and differentiation are both linear, shift-invariant operations, we can change the order of the convolution and the Laplacian operator such that we have:

$$\left[\nabla^2 G(x, y, \sigma)\right] * f(x, y) \tag{2.4}$$
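The interchange of the two operations can be checked numerically in one dimension. In the sketch below, which is illustrative only, the binomial kernel `[1, 4, 6, 4, 1]` stands in for a sampled Gaussian and `[1, -2, 1]` for the second derivative; both choices, the sample values in `f` and the helper name `convolve_full` are assumptions made for the example rather than anything prescribed by the method.

```python
def convolve_full(a, b):
    """Full discrete convolution of two sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


g = [1, 4, 6, 4, 1]      # assumed stand-in for a sampled Gaussian (binomial weights)
lap = [1, -2, 1]         # discrete second-derivative operator
f = [3, 3, 4, 9, 15, 16, 16, 12, 5, 5]  # arbitrary illustrative signal

# Smooth the signal first, then differentiate the result.
smooth_then_differentiate = convolve_full(lap, convolve_full(g, f))

# Differentiate the smoothing kernel once, then convolve the signal with it.
combined_kernel = convolve_full(lap, g)
differentiate_kernel_first = convolve_full(combined_kernel, f)
```

With integer inputs the two results agree exactly. Put differently, the combined kernel can be prepared once and the signal convolved a single time, which is the reordering that Equation 2.4 expresses.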
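Applying the Laplacian to the kernel in this way considerably reduces the complexity of the operation, as the Laplacian of Gaussian (LoG) of a unit-volume Gaussian can be computed analytically [1] to be:

$$\mathrm{LoG}(x, y, \sigma) = \nabla^2 G(x, y, \sigma) = \frac{1}{\pi\sigma^4}\left(\frac{x^2 + y^2}{2\sigma^2} - 1\right) e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{2.5}$$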
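The closed form of Equation 2.5 can be recovered in a few lines. The derivation below is a sketch that assumes the Gaussian is normalised to unit volume; the normalisation is not stated explicitly in the text, so it should be read as an assumption.

$$
\begin{aligned}
G(x, y, \sigma) &= \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}} \\
\frac{\partial G}{\partial x} &= -\frac{x}{\sigma^2}\, G \\
\frac{\partial^2 G}{\partial x^2} &= \left(\frac{x^2}{\sigma^4} - \frac{1}{\sigma^2}\right) G \\
\nabla^2 G &= \left(\frac{x^2 + y^2}{\sigma^4} - \frac{2}{\sigma^2}\right) G
= \frac{1}{\pi\sigma^4}\left(\frac{x^2 + y^2}{2\sigma^2} - 1\right) e^{-\frac{x^2 + y^2}{2\sigma^2}}
\end{aligned}
$$

At the origin this evaluates to −1/(πσ⁴), roughly −0.32 for σ = 1 and −1.2 × 10⁻³ for σ = 4, which is in keeping with the vertical ranges of the surface plots in Figure 2.5.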
Examples of the LoG kernel are given in Figure 2.5 for values of σ equal to 1 and 4. Figure 2.6 shows the original image of ‘girl’ (a) along with the results of zero-crossing detection after convolution with a LoG kernel with σ equal to 2, 4 and 6 ((b) to (d)). The Laplacian of Gaussian method of edge detection appears to have some very desirable properties: the lines in the edge image are thin and, though no constraint
[Figure 2.5 panels: surface plots over a grid running from 0 to 50 on both horizontal axes; the vertical axis runs from −0.35 to 0.05 in (a) and from −14 × 10⁻⁴ to 2 × 10⁻⁴ in (b).]
Figure 2.5: Surface plots of the Laplacian of Gaussian kernels for σ equal to 1 (a) and 4 (b).
is placed on the shape of the lines, they tend to form closed contours and may therefore potentially define a segmentation. However, closer inspection reveals that for large σ the edges are very smooth, causing discrepancies between ‘edge’ states and object boundaries at sharp corners, whilst a large number of insignificant edges are detected when σ is small, such as those arising from the texture of the carpet in the image of ‘girl’.
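The dependence on σ can be sketched in one dimension, using the second derivative of a Gaussian as an analogue of the LoG kernel. Several things here are assumptions made for illustration and are not part of the method as described: the synthetic signal (a fine periodic texture superimposed on a single step), the fixed kernel radius, the zero-mean correction of the sampled kernel, and the `min_jump` strength test used to ignore numerically negligible sign changes. The sketch reproduces only the noise-rejection side of the trade-off; the loss of accuracy at sharp corners has no one-dimensional counterpart.

```python
import math


def second_derivative_of_gaussian(sigma, radius):
    """Sampled 1-D analogue of the LoG kernel, shifted to have zero mean."""
    kernel = [
        (m * m / sigma**4 - 1 / sigma**2)
        * math.exp(-m * m / (2 * sigma**2))
        / (math.sqrt(2 * math.pi) * sigma)
        for m in range(-radius, radius + 1)
    ]
    mean = sum(kernel) / len(kernel)
    # Removing the mean makes the response to a constant signal vanish (up to rounding).
    return [k - mean for k in kernel]


def zero_crossings(signal, sigma, radius=16, min_jump=0.05):
    """Signal indices where the filtered response changes sign by more than min_jump."""
    kernel = second_derivative_of_gaussian(sigma, radius)
    # The kernel is symmetric, so correlation and convolution coincide.
    # Only positions where the kernel fully overlaps the signal are evaluated.
    response = [
        sum(k * signal[n + m] for m, k in zip(range(-radius, radius + 1), kernel))
        for n in range(radius, len(signal) - radius)
    ]
    return [
        radius + i
        for i in range(1, len(response))
        if response[i - 1] * response[i] < 0
        and abs(response[i] - response[i - 1]) > min_jump
    ]


# Hypothetical signal: a texture of amplitude 1 on top of a step of height 40 at index 40.
texture = [1, 1, -1, -1]
signal = [(10 if n < 40 else 50) + texture[n % 4] for n in range(80)]

fine = zero_crossings(signal, sigma=1)    # small sigma: the texture produces many crossings
coarse = zero_crossings(signal, sigma=4)  # large sigma: only the step remains
```

With the small σ the texture generates a crossing every two samples in addition to the one at the step, whereas with the large σ the texture response falls far below `min_jump` and only the crossing at the step is reported.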
In conclusion, whilst many techniques exist for the detection and refinement of edges in an image, those based upon the first derivative suffer badly from noise and quantisation error and fail to produce the closed contours necessary for defining a segmentation. This is due in part to the fact that the operators which estimate the image gradient are relatively insensitive to soft edges, which are more likely to occur than the model step edge. Second-order gradient methods have a tendency to produce thin, closed contours, but there is a trade-off between the accuracy of the edge locations and the noise rejection.
Figure 2.6: The original image of ‘girl’ (a) and the zero-crossing detection of the Laplacian of Gaussian with σ equal to 2 (b), 4 (c) and 6 (d).