Object with axis of other orientations - Symmetry Shape Prior for Object Segmentation

To handle orientations other than vertical, symmetry detection can be repeated for image ro- tated at a discrete set of angles. This is relatively efficient, since complexity for one orientation is linear in the number of pixels. Alternatively, one can extract only vertical and horizontal seams and match them under a discrete set of axis orientations.

3.4. Object with axis of other orientations 37

Figure 3.3: Dynamic programming results for matching two blue seams. Middle pixelsmithat

have label “on axis” are in green, label “not axis” in red. Left result is for parameter choice for a stricter (more straight) symmetry axis, resulting in five smaller “axis” subparts. Right result is for a more relaxed symmetry axis, resulting in two larger “axis” subparts.

38 Chapter3. Symmetry detection

Chapter 4 Symmetry object segmentation

In this chapter we formulate our energy function for symmetric object segmentation. There are four terms in our segmentation energy function. First, we introduce the data term, which is based on the color appearance model of the foreground and background. A set of axis candidates are obtained in chapter 3. We use the axis and the supporting region of the axis to determine the appearance model. The pixels around the axis and in the supporting area tend to be foreground. So we use the color histogram of the supporting area to initialize the foreground appearance model. The background appearance model is initialized by pixels on the image border of a small margin.

Second, we explain the boundary term. The boundary term models the penalty for discon- tinuity between two neighboring pixels. We use the standard boundary terms that encourages segmentation boundary to align with image edges.

Finally, we introduce our novel symmetry constraints in the energy function. The symmetry constraints consist of two parts. The first part is the weight between two pixels that are symmetric to each other according to the symmetry axis. The second symmetry part prohibits assigning foreground to pixels which do not have a symmetric counterpart across the axis. Figure 4.1 show the flow chart of this section.

4.1 Data term

This section explains the data term in segmentation energy function. We use the data term as in the graph cut segmentation framework Boykov, Yuri Y and Jolly, Marie-Pierre [10]. Letθ0_, θ1 _{be the background and foreground appearance models, respectively. Like Vicente, Sara et} al. [56], we build appearance models based on histograms of quantized image colors.

Let pbe an image pixel andPthe set of all pixels. LetL ={0,1}be the label set, where 0

40 Chapter4. Symmetry object segmentation

4.2. Boundary term 41

means background and 1 means object. Letxp ∈ Lbe a variable corresponding to pixel p, and

x=(xp|p∈ P) be a vector of all variables.

Let J be the quantized set of colors, andip ∈ J be the quantized color of pixel p. Then

Pr(ip|θ0), Pr(ip|θ1) are probabilities that pis a background and object pixel, respectively. The

appearance term is:

fa(x, θ0, θ1)= X

p∈P

−logPr(ip|θxp) (4.1)

When it comes to single symmetric object case, since the initial axis estimate may be inaccurate, we search over symmetry axis locations as part of optimization. Let v be the x- coordinate of the vertical symmetry axis, which is completely specified by the value ofv. We begin by detecting a set of symmetry axis candidates as in Chapter 3. We choose the largest score (i.e. largest length) candidate. Letv0 be its x-coordinate. We usev0 to initializev. The region supporting the symmetry axis (Figure 3.2(d) ) initializes object appearance. Image border of small margin initializes background appearance. We can see in Figure 3.2(d) , the area inside the red rectangle reflects the supporting area. The pixels in the supporting area can build the foreground appearance model.

4.2 Boundary term

The boundary term in this thesis is the same as in the graph cut segmentation framework Boykov, Yuri Y and Jolly, Marie-Pierre [10]. The boundary term constructs the boundary constraints in segmentation. The length of the boundary is encouraged to be small. If the nearby pixels have the same label, then there is no penalty between them. The weight of the edges is usually a non-increasing function of the local image intensity. In our approachwpqis

a function that decreases when the difference between neighboring pixels increases.

LetN be a 4-connected neighborhood of ordered pixel pairs. The boundary regularization term is: fr(x)= X (p,q)∈N wpq· h xp , xq i , (4.2) wpq= exp(− 1 2(cp−cq) T_Σ−1 c (cp−cq)), (4.3)

where cp is the color of pixel p, Σ is a 3× 3 diagonal matrix whose elements correspond to

42 Chapter4. Symmetry object segmentation

Figure 4.2: Illustrates symmetry neighborhood system

4.3 Symmetry prior

There are two main utilization of the symmetry axis.

The first one is to consider about the pixels that are symmetric to each other according to the symmetry axis. Because in the supporting area, pixels that have the same distance to the symmetry axis both tend to be the foreground. While outside the supporting area, pixels that have the same distance to the symmetry axis both tend to be the background. However, the supporting area may be inaccurate. Therefore, we put symmetry constraints for pixels outside the supporting area as well.

We wish to enforce symmetry more strictly for pixels that are closer to the symmetry axis. For pixels that are further away, we wish the symmetry constraints to be more loose, to allow the segmented object to be not absolutely symmetric. Therefore we select a certain parameter

d, which is a range of distance. Pixels within this range of distance have a weight according to the distance between the pixel and the symmetry axis. The nearer the distance is, the larger the weight between the two pixels will be. While pixels outside this range of distance have a fixed weighttbetween each other. tis the minimum of the weightupq. We can see the relationship

of pandqin Figure 4.2.

We now define neighborhood Ns(v) for symmetry constraints. Let (px,py) be the coordi-

nates of pixel p. We have:

Ns(v)= {(p,q)∈ P | v− px = qx−v>0,py =qy} (4.4)

4.4. Energy function 43

assigned label 0, or both assigned label 1. This is a submodular Endre Boros and Peter L. Hammer [8] pairwise term, and therefore is easy to optimize. Novel symmetry term is:

fs(x,v)= X

(p,q)∈Ns(v)

upq·[xp , xq]. (4.5)

A constant upq would penalize all symmetry violations equally. To allow the object be less

symmetric further away from the symmetry axis, we use the following model:

upq =          t if|px−v| ≥d d−|px−v| d−1 t+ |px−v|−1 d−1 T, if 1<|px−v|<d (4.6)

The maximum of upq is T and it is achieved one pixel away from the symmetry axis. The

minimum ofupqist, and it is achieved when distance between pandqis 2dor larger.

Second, there is an additional constraint from symmetry for pixels whose symmetric counterpart across the axis falls outside image. Such pixels cannot be assigned to the object. The pixels in the green rectangle in Figure 4.3 cannot be assigned to the object. Let 1 be the smallest andN the largestxcoordinates. Let

P(v)= {p∈ P | 2v− px > N or 2v− px < 1} (4.7)

We hard-constrain pixels inP(v) not to take label 1:

fh(x,v)= X p∈P(v) w∞· xp, (4.8) wherew∞ = ∞.

4.4 Energy function

The full energy function before adding the symmetry prior is: ˆ

f(x, θ0, θ1)= fˆa(x, θ0, θ1)+λrfr(x), (4.9)

and we need to optimize it over segmentationsxand appearance modelsθ0,θ1. For a fixedx, the optimalθ0, θ1are simply the normalized histograms over the object and background pixels. Let: fa(x)= X j∈J X s∈L −ns_j(x) logn s j(x) ns₍_x₎, (4.10)

wherens(x)= |{p|xp = s}|is the number of pixels with labelsin segmentation x, and nsj(x) =

44 Chapter4. Symmetry object segmentation

Figure 4.3: Illustrates symmetry constraints. Thick red dashed line is the symmetry axis. Pixels at equal distances on the opposite sided from the symmetry axis are encouraged to take the same label. The green box shows the area where the object is forbidden.

Sara et al. [56] they show that minimization of ˆf(x, θ0, θ1) is equivalent to minimization of the following f(x) overx:

f(x)= fa(x)+λrfr(x), (4.11)

The energy in Eq 4.11 is iteratively optimized with block coordinate descent. Starting from initialx,ns₍_x_{) and}_ns

j(x) are computed, thenxis re-computed, and so on until convergence to a

local minimum of this NP-hard problem.

After adding the symmetry prior, the complete energy function is:

f(x,v)= fa(x)+λrfr(x)+λsfs(x,v)+ fh(x,v). (4.12)

4.5 Axis re-estimation

In this section, we present a dynamic programming method for symmetry axis re-estimation based on segmentation result. This algorithm is efficient in practice and is guaranteed to find a globally optimal solution that minimizes the energy function.

Recall that an object with label 1 ≤ l ≤ khas a symmetry axis with xcoordinate denoted byvl. The value ofvl has influence on two terms. The first one is:

f_sl(x,vl)= X

(p,q)∈Ns(vl)

4.5. Axis re-estimation 45 where upq =          t if|px−v| ≥d d−|px−v| d−1 t+ |px−v|−1 d−1 T, if 1<|px−v|<d The second one is:

f_hl(x,vl)= X p∈P(vl)

w∞[xp= l]. (4.13)

To find the optimalvl, we compute fsl(x,vl)+ f_hl(x,vl) for each possible position and choose

vlwith lowest energy. However, a naive implementation will lead to an algorithm withO(M2×

C) complexity, where M is the number of rows andC is the number of columns. This is too slow.

We develop a dynamic programming algorithm to accelerate symmetry re-estimation. The main idea is to divide each row into several intervals so that all computations, such as moving the symmetry axis to a nearby position, or computing the symmetry energy can be performed efficiently based on these intervals.

We compute two kinds of intervals: background and asymmetric. A background interval is a maximal sequence of consecutive pixels of the same row that have label other thanl. Let

p = |px − vl|. Suppose p,q are two endpoints of a background interval with px ≤ qx and

py = qy. We denote background intervals on the left side ofvlas [dlp,d l q]

b and those on the right

side as [dl p,d l q] R b.

In the initialization stage, the value ofvl is set to that of the current solution. We find the

background intervals on the left side and on the right side of vl separately by searching the

whole image. Figure 4.4 gives a simple example of a row with 17 pixels. The background intervals in Figure 4.4 are [8,8]l

b,[5,3] l b,[3,4] r b,[6,7] r b.

Once the background intervals are calculated, it is easy to update them when we move vl

one pixel to the left or to the right. For example, if we move vl in Figure 4.4 one pixel to

the left, we can update the background interval by adding 1 to the endpoints on the right and subtracting 1 from endpoints on the left. When we move the symmetry axis to a new position, we update the background intervals in each row of the image. Since the number of intervals in a row is usually small, the update process is very efficient in practice.

Once the background intervals are updated, it is easy to calculate the asymmetry intervals. An asymmetric interval is a maximal connected set of pixels that are labeled with l but can not find the symmetric counterpart with respect to axisvl. Let pandqbe two endpoints of an

asymmetric interval. We denote the asymmetric interval on the left side ofvl as [dlp,d l q]

L a and

on the right side of vl as [dlp,d l q]

a. The asymmetric intervals in Figure 4.4 are [7,6] L a, [5,5]

R a

and [8,8]R

46 Chapter4. Symmetry object segmentation

Figure 4.4: An illustration of the symmetry axis re-estimation. Pixels labeled with l are in green. Other pixels are in blue. The asymmetric intervals are outlined with a red dashed line.

We can obtain the symmetry energy directly from the asymmetric intervals. Due to the specific definition of upq, the symmetry energy of an asymmetric interval can be written in

terms of the sum of one or two arithmetic sequences. Below, we enumerate all possible cases. Given an asymmetric interval [dl

p,d l q]

a on the left side of vl, we define El(p,q) to be its

symmetry energy. Let∆ = T_d₋−₁t. We have:

Case 1: If [dlp,d l q] L a T_P

(vl) , ∅, El(p,q) = ∞. In this situation,fhl(x,vl) = ∞ so vl can

not be the symmetry axis and we should move to a new position directly. We assume that [dlp,d l q] L a T_P

(vl)= ∅in the following cases.

Case 2: Ifdlp ≥d l q ≥d, we have El(p,q)=t(dlp−d l q+1) (4.14) Case 3: Ifdl p ≥d ≥dlq, we have El(p,q)=(d_pl −d_ql +1)t+ d−d l q 2 ∆(d−d l q+1) (4.15) Case 4: Ifd≥ dl p ≥ d l q, we have El(p,q)= [T −∆(d l p+d l q 2 −1)](d l p−d l q+1) (4.16)

We add the symmetry energy of all asymmetric intervals in every row of the to obtain fl

s(x,vl)+

f_hl(x,vl). We computing fsl(x,vl)+ f_hl(x,vl) for every possible position ofvl, and choosevl that

minimize the energy function. Our algorithm can find the global optimal symmetry axis of an 600×800 image in about 0.01 seconds.

The time is proportional to the number of intervals and the image size. Since the number of intervals is usually small, this is very efficient in practice.

For the axis re-estimation to be effective, we found that annealing of parameterλsis helpful.

4.5. Axis re-estimation 47

Figure 4.5: Axis re-estimation.

symmetry constraints are noticeably violated. Then the symmetry axis is updated to a more accurate estimate, andλsis increased to better enforce symmetry.

Figure 4.5(a) is an example where the initial symmetry axis is inaccurate, causing incom- plete object segmentation without axis re-estimation, Figure 4.5(b). Without annealing, axis re-estimation is not effective, Figure 4.5(c). With annealing, axis is accurately re-estimated, improving segmentation, Figure 4.5(d).

We optimize the energy in Eq. 4.12 overx andv in block-coordinate fashion. We initialize foreground/background models andv as describe in the beginning of this section, update appearance models, computexwith a graph cut Boykov, Yuri Y and Jolly, Marie-Pierre [10], re-estimate appearance models andv. The energy is guaranteed to go down1.

1_{If annealing, the energy is guaranteed to start decreasing when}_λ

sreaches its highest allowed value. During

Chapter 5 Experimental Results

5.1 Image Data

To make quantitative comparisons, we build a new symmetry object segmentation dataset which consists of 57 images with manual labeled ground truth. To the best of our knowl- edge, we are the first to build a symmetry object dataset which tries to combine symmetry object detection with image segmentation. Based on the degree of difficulties in segmentation, we roughly divide the dataset into 3 categories, the easy level, the medium level and the difficult level. The easy level contains 17 images. The backgrounds of these images are almost in pure color and are of high contrast with the foreground objects. The medium level contains 28 images. The background of these images are not in pure color but are still very different from the foreground objects and does not contain much complex texture. The remaining12 images are grouped into difficult level. These images are either of complex texture or their background and foreground colors are quite similar.

5.2 Evaluation metrics

We first demonstrate the effectiveness of symmetry segmentation by comparing the segmentation results of full symmetry utilization, partial symmetry utilization and no symmetry utilization. In full symmetry utilization, symmetry is not only used as a cue to initialize the appearance model, but also served as a shape prior to encourage robust objects segmentation. In partial symmetry utilization, we only use symmetry for appearance model initialization. In no symmetry utilization examples, we simply initialize the appearance model with a bounding box located in the center of the image. No symmetry cue is used for segmentation at all.

We compare their performances using two commonly used evaluation metrics. One is the

5.3. Quantitative comparison 49

Table 5.1: Quantitative comparison among full symmetry utilization, partial symmetry utilization, no symmetry utilization in terms of segmentation accuracy and time efficiency. From the top to the bottom are the results of full symmetry utilization, partial symmetry utilization and no symmetry utilization.

Method F measure Error Rate Time(s) Full Symmetry 0.878 6.66% 15.1 Partial Symmetry 0.812 9.51% 12.5

No Symmetry 0.780 11.5% 13.9

error rate, which estimates the percentage of pixels that are labeled with wrong label. The other is the F-meature, which is a more standard metric computed as Cheng, Ming et al. [13]

Fβ = (β

2+₁₎_×_Precision_×_Recall

β2_×_Precision+_Recall (5.1)

We setβ2 = 0.3 to weigh precision more than recall. As for each of the three algorithm, we choose the parameters that optimize the error rate for comparison. The running time is also

In document Symmetry Shape Prior for Object Segmentation (Page 46-60)