Chapter 4 Object Segmentation

4.5 Proposed Method for Image Segmentation

The proposed object segmentation method comprises three stages. In the first stage, the image is prepared and the focus value map is generated. The second stage initialises the active contours. In the third stage, the OoI is segmented using Whitaker’s SFM, described in Section 4.4. The second and third stages of the method are presented in this section. For the first stage, the focus assessment, the reader is referred to the method described in Chapter 3.

4.5.1 Contour Initialisation

A good initial boundary (i.e., an initialisation mask) for the active contours algorithm not only speeds up the segmentation process but is also vital for the accuracy of the segmentation. As the SFM only looks for new contours in the vicinity of the contour from the previous iteration, the final segmentation is dependent on the initial one. Ideally, an initialisation should encompass the entire OoI whilst excluding as much background as possible. A grid-based approach is adopted to generate the binary initialisation mask used in the SFM. The focus map is first split into boxes whose height and width are determined by the size of the image, chosen to give the best initialisation mask, i.e.,

$$
\text{box size (pixels)} =
\begin{cases}
300 & \text{if } (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) \geq 3000 \\
200 & \text{if } 3000 > (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) \geq 2000 \\
100 & \text{if } 2000 > (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) \geq 1000 \\
30 & \text{if } (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) < 1000.
\end{cases}
\tag{4.34}
$$

The maximum focus value in each box of the grid is assigned to all pixels within that grid square. Otsu’s thresholding method [Otsu, 1979] is then applied to this grid, and blocks with a value above 0.5 times Otsu’s threshold are assigned a value of 1 (denoted by white) and all others are assigned 0 (denoted by black). This is to ensure that the OoI lies within the initial contour. The initialisation process is illustrated in Figure 4.4.

Figure 4.4: Focus assessment and contour initialisation of an image with a watch as the OoI: (a) image; (b) focus energy map; (c) maximum values are assigned to each square in the grid; and (d) the corresponding initialisation mask after thresholding.
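The initialisation steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the original implementation: the function names (`box_size`, `otsu_threshold`, `initialisation_mask`) are hypothetical, the focus map is assumed to be a 2-D floating-point array, and Otsu's threshold is computed from a 256-bin histogram, which is an assumption about the discretisation.

```python
import numpy as np


def box_size(height, width):
    """Box side length in pixels, following Eq. (4.34)."""
    total = height + width
    if total >= 3000:
        return 300
    if total >= 2000:
        return 200
    if total >= 1000:
        return 100
    return 30


def otsu_threshold(values, bins=256):
    """Otsu's threshold: maximise the between-class variance of a histogram.

    The number of bins is an assumption made for this sketch.
    """
    hist, edges = np.histogram(values, bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist)                  # pixel count at or below each bin
    w1 = w0[-1] - w0                      # pixel count above each bin
    s0 = np.cumsum(hist * centres)
    m0 = s0 / np.maximum(w0, 1)           # class means (guarding empty classes)
    m1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (m0 - m1) ** 2
    return centres[np.argmax(between)]


def initialisation_mask(focus_map):
    """Binary mask: 1 where the block maximum exceeds 0.5 x Otsu's threshold."""
    h, w = focus_map.shape
    b = box_size(h, w)
    grid = np.empty_like(focus_map, dtype=float)
    # Assign the maximum focus value of each box to every pixel in that box
    for r in range(0, h, b):
        for c in range(0, w, b):
            grid[r:r + b, c:c + b] = focus_map[r:r + b, c:c + b].max()
    return (grid > 0.5 * otsu_threshold(grid)).astype(np.uint8)
```

Boxes at the right and bottom edges are simply truncated when the image size is not a multiple of the box size; the source does not state how such boxes are handled, so this is one possible choice.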

4.5.2 Object Segmentation

The automatically generated binary initialisation mask is used to create the initial condition for an active contours algorithm. An implicit level set active contours method is adopted, using the energy function defined by Chan and Vese [Chan and Vese, 2001]. The energy of the contour, $C$, is repeated here for clarity, i.e.,

$$
E(C) = \lambda_1 \int_{\text{inside}(C)} |I(x, y) - c_1|^2 \, dx \, dy + \lambda_2 \int_{\text{outside}(C)} |I(x, y) - c_2|^2 \, dx \, dy + \mu \cdot \text{length}(C) + \nu \cdot \text{area}(\text{inside}(C)),
\tag{4.35}
$$

where the contour $C$ is represented by the zero level set of a continuous Lipschitz function, $I$ is the image, $c_1$ and $c_2$ are respectively the average pixel values inside and outside the contour $C$, $\text{length}(\cdot)$ and $\text{area}(\cdot)$ respectively impose length and area constraints on the contour, and $\lambda_1$, $\lambda_2$, $\nu$ and $\mu$ are fixed parameters with $\lambda_1, \lambda_2 > 0$ and $\nu, \mu \geq 0$. In the proposed implementation, $\lambda_1 = \lambda_2 = 1$ and $\nu = 0$, as in [Chan and Vese, 2001], and $\mu = 0.4$, as the goal is to segment large objects and reduce the likelihood of the contour leaking into object boundaries. Using the Active Contours Without Edges model means that areas in focus can be segmented even if there is no well-defined boundary in the focus energy map. To speed up the active contours algorithm, Whitaker’s sparse field method [Whitaker, 1998] (see Section 4.4) is used so that calculations are performed only around the zero level set, thus improving the efficiency of the algorithm.
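To make the terms of Eq. (4.35) concrete, the following sketch evaluates a discrete approximation of the energy for a given binary mask. It is illustrative only: the function name `chan_vese_energy` is hypothetical, the integrals are replaced by pixel sums, and the contour length is approximated by counting label changes between 4-connected neighbours, which is an assumption rather than the discretisation used in the original work.

```python
import numpy as np


def chan_vese_energy(image, mask, lam1=1.0, lam2=1.0, mu=0.4, nu=0.0):
    """Discrete approximation of Eq. (4.35) for a binary mask (1 = inside C)."""
    inside = np.asarray(mask).astype(bool)
    outside = ~inside
    # c1 and c2: mean pixel values inside and outside the contour
    c1 = image[inside].mean() if inside.any() else 0.0
    c2 = image[outside].mean() if outside.any() else 0.0
    fit_inside = ((image[inside] - c1) ** 2).sum()
    fit_outside = ((image[outside] - c2) ** 2).sum()
    # Assumed length estimate: number of inside/outside transitions
    # between vertically and horizontally adjacent pixels
    m = inside.astype(int)
    length = np.abs(np.diff(m, axis=0)).sum() + np.abs(np.diff(m, axis=1)).sum()
    area = inside.sum()
    return lam1 * fit_inside + lam2 * fit_outside + mu * length + nu * area
```

With the default parameters taken from the text, a mask that matches a uniform bright region has zero fitting error, so its energy reduces to the length penalty, whereas a misplaced mask of the same size is penalised by both fitting terms.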

[...] appear. This means that holes within an object, e.g., as with a donut, are considered parts of the object. However, this is an advantage when segmenting weakly textured objects: the object pixels return only small focus values, but as they are contained within more dominant object edges they are still segmented correctly. The active contours algorithm is applied, using the initialisation mask, to a downsampled focus energy map for 200 iterations. The downsampling increases the segmentation speed for larger images and is performed with a factor of $2^n$, where $n$ is determined by the image size to obtain the best segmentation, i.e.,

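$$
n =
\begin{cases}
3 & \text{if } (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) \geq 3000 \\
2 & \text{if } 3000 > (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) \geq 2000 \\
1 & \text{if } 2000 > (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) \geq 1000 \\
0 & \text{if } (\text{Height}_{\text{Image}} + \text{Width}_{\text{Image}}) < 1000.
\end{cases}
\tag{4.36}
$$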
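The choice of downsampling factor in Eq. (4.36) can be sketched as follows. The function names are hypothetical, and because the text does not specify the resampling scheme, block averaging is assumed here purely for illustration; rows and columns that do not fill a complete block are cropped.

```python
import numpy as np


def downsample_exponent(height, width):
    """Exponent n of the downsampling factor 2**n, following Eq. (4.36)."""
    total = height + width
    if total >= 3000:
        return 3
    if total >= 2000:
        return 2
    if total >= 1000:
        return 1
    return 0


def downsample(focus_map, n):
    """Reduce each dimension by 2**n using block averaging (an assumption)."""
    factor = 2 ** n
    h, w = focus_map.shape
    h_small, w_small = h // factor, w // factor
    # Crop so that both dimensions are exact multiples of the factor
    cropped = focus_map[:h_small * factor, :w_small * factor]
    blocks = cropped.reshape(h_small, factor, w_small, factor)
    return blocks.mean(axis=(1, 3))
```

For example, an image of 1200 x 900 pixels has a height-plus-width of 2100, giving n = 2 and a focus map reduced by a factor of 4 in each dimension.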

A binary segmentation is obtained, with interior pixels assigned the value 1 and exterior pixels the value 0. The binary segmentation is then used as the initialisation mask for a further 200 iterations. The reinitialisation prevents the level set function from becoming too flat, as in [Chan and Vese, 2001]. This two-stage process is repeated until the method converges on a solution, giving an initial scaled-down binary segmentation $S_i(x, y)$. This is then upscaled by interpolation by a factor of $2^n$ to the size of the original image, with non-zero values assigned the value 1, thus giving the final binary segmentation $S(x, y)$.
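The iterate, re-binarise and upscale procedure can be outlined as below. Several details are assumptions made for this sketch and are not taken from the source: `evolve` is a hypothetical callable standing in for the sparse field solver, convergence is taken to mean that the binary mask no longer changes between rounds, `max_rounds` is a safety limit, and nearest-neighbour repetition is used as the interpolation.

```python
import numpy as np


def iterate_until_converged(focus_small, init_mask, evolve,
                            iterations=200, max_rounds=50):
    """Repeat evolve-then-rebinarise rounds until the mask stops changing."""
    mask = np.asarray(init_mask).astype(bool)
    for _ in range(max_rounds):
        # `evolve` (hypothetical) runs the active contour for `iterations`
        # steps from `mask` and returns the new interior region
        new_mask = np.asarray(evolve(focus_small, mask, iterations)).astype(bool)
        if np.array_equal(new_mask, mask):
            break  # assumed convergence test: no pixel changed
        mask = new_mask
    return mask.astype(np.uint8)


def upscale_mask(mask_small, n, out_shape):
    """Upscale by 2**n (nearest neighbour) and set non-zero values to 1."""
    factor = 2 ** n
    up = np.repeat(np.repeat(mask_small, factor, axis=0), factor, axis=1)
    full = np.zeros(out_shape, dtype=np.uint8)
    h = min(out_shape[0], up.shape[0])
    w = min(out_shape[1], up.shape[1])
    # Any pixels not covered after upscaling are left as background
    full[:h, :w] = up[:h, :w] != 0
    return full
```

With a smoother interpolation, such as bilinear, the thresholding of non-zero values described in the text would slightly dilate the mask; with nearest-neighbour repetition it has no additional effect.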

The final segmentation can then be used to obtain a view of the segmented object using a simple method. An object-segmented image $I(x, y)$ is generated with pixels of value 0 being the background, i.e.,

$$
I(x, y) =
\begin{cases}
G(x, y) & \text{if } S(x, y) = 1 \\
0 & \text{if } S(x, y) = 0,
\end{cases}
\tag{4.37}
$$

where $G$ is the original greyscale image. This is illustrated in Figure 4.5.
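Eq. (4.37) amounts to a single masking operation. A minimal sketch, assuming the greyscale image and the binary segmentation are NumPy arrays of the same shape (the function name `extract_object` is hypothetical):

```python
import numpy as np


def extract_object(grey, seg):
    """Keep G(x, y) where S(x, y) = 1 and set all other pixels to 0."""
    grey = np.asarray(grey)
    seg = np.asarray(seg)
    return np.where(seg == 1, grey, 0)
```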

In document Object segmentation from low depth of field images and video sequences (Page 78-80)