
We introduce a novel shape descriptor, called a chordiogram, which addresses most of the challenges outlined above [Toshev et al., 2010]. It captures both the object boundary and its interior in a holistic fashion. In addition, it is invariant to certain rigid transformations and robust to shape deformations. Most importantly, however, it can be applied in images with severe clutter, which allows for recognition in unsegmented images.

2.2.1 Definition of the Chordiogram

Let us denote by $C$ all the boundary points of a segmented object. To define the chordiogram, consider for a moment a pair of boundary edges $p$ and $q$ from $C$. We will call such a pair $(p, q)$ a chord. We can think of a chord as a way to express a dependency between edges $p$ and $q$. One can define various features which describe the geometry of the chord, which we will denote by $f_{pq} \in \mathbb{R}^D$, as we will see in the subsequent section. These features capture geometrical relationships between the two boundary points.

Figure 2.4: Chord features and orientation of the normals at boundary edges. (a) Translation-invariant features. (b) Rotation-invariant features. (c) Normals.

We describe the shape of a segmented object by capturing the features of all chords. In this way we attempt to capture all dependencies among boundary points and achieve a holistic description. More precisely, the chordiogram $\mathrm{ch}$ is defined as a $K$-dimensional histogram of all chords, where the $m$th chordiogram element is given by:

$$\mathrm{ch}_m = \#\{(p, q) \mid f_{pq} \in \mathrm{bin}(m),\ p, q \in C\}, \qquad m = 1, \ldots, K \tag{2.1}$$

To define the chordiogram, one needs to sample points from the shape. In our definition we take every pixel on the shape.
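Equation (2.1) is a plain counting rule, which the following minimal Python sketch makes concrete. The function name `chordiogram`, the callables `chord_feature` and `bin_index`, and the choice to count ordered pairs with distinct endpoints are assumptions made for illustration; the toy feature (chord length, binned by its integer part) is not one of the binning schemes used in this work.

```python
import math
from collections import Counter
from itertools import permutations


def chordiogram(boundary, chord_feature, bin_index):
    """Count, per bin m, the chords (p, q) whose feature f_pq falls into bin m."""
    counts = Counter()
    # Assumption: chords are ordered pairs of two different boundary points.
    for p, q in permutations(boundary, 2):
        counts[bin_index(chord_feature(p, q))] += 1
    return counts


def chord_length(p, q):
    """Toy one-dimensional chord feature: Euclidean length of the chord."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


# Three boundary points forming a 3-4-5 right triangle.
points = [(0, 0), (3, 0), (0, 4)]
ch = chordiogram(points, chord_length, lambda f: int(f))
```

Each of the three side lengths is counted twice here, once per chord direction, so the histogram holds six chords in total.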

2.2.2 Chord Features

The exact features that are used to describe a chord determine the properties of the chordiogram. In our work, we use the following two possible sets of chord features.

I. Translation-invariant chord features. One potential chord characterization can be achieved if we focus on the relative geometric configuration of the two boundary edges. More precisely, we define four chord features (see Fig. 2.4(a)):

• Chord length $l_{pq}$ and orientation $\psi_{pq}$ of the vector connecting $p$ and $q$.

• Normals $\theta_p$ and $\theta_q$ to the object boundary at $p$ and $q$.

Thus, the chord features can be written as:

$$f^{(t)}_{pq} = (l_{pq},\ \psi_{pq},\ \theta_p - \psi_{pq},\ \theta_q - \psi_{pq})^T$$

The normals are defined such that they point towards the interior of the object. In this way not only the contour shape at points p and q is captured but also the relation of the interior to the chord. For example, in Fig. 2.4(c) the chords at the two L-junctions at the bottom of the swan's neck differ because the object interior is positioned differently with respect to the two junctions.
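The four translation-invariant features can be computed for a single chord as in the sketch below. It assumes each boundary edge is given as a tuple `(x, y, theta)`, where `theta` is the angle of the inward-pointing normal; this layout and the helper names are hypothetical and chosen only for the example.

```python
import math

TWO_PI = 2.0 * math.pi


def translation_invariant_features(p, q):
    """Return (l_pq, psi_pq, theta_p - psi_pq, theta_q - psi_pq) for one chord.

    p and q are (x, y, theta) tuples; theta is the inward normal angle
    (an assumed representation, not prescribed by the text).
    """
    (px, py, theta_p), (qx, qy, theta_q) = p, q
    dx, dy = qx - px, qy - py
    l_pq = math.hypot(dx, dy)             # chord length
    psi_pq = math.atan2(dy, dx) % TWO_PI  # chord orientation, wrapped modulo 2*pi
    return (
        l_pq,
        psi_pq,
        (theta_p - psi_pq) % TWO_PI,      # normal at p relative to the chord
        (theta_q - psi_pq) % TWO_PI,      # normal at q relative to the chord
    )


def translate(edge, tx, ty):
    """Shift a boundary edge; the normal angle is unaffected by translation."""
    x, y, theta = edge
    return (x + tx, y + ty, theta)


p = (0.0, 0.0, math.pi / 2)
q = (1.0, 0.0, math.pi / 2)
f = translation_invariant_features(p, q)
f_shifted = translation_invariant_features(
    translate(p, 5.0, -3.0), translate(q, 5.0, -3.0)
)
```

Because only the difference between the two positions enters the computation, `f` and `f_shifted` agree. Swapping the roles of p and q flips the chord orientation by π, so the chord is treated as an ordered pair in this sketch.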

Since the features are real-valued, to compute the above histogram one needs to quantize the features into bins. The lengths $l_{pq}$ are binned into $b_l$ bins in log space, which allows for larger shape deformation between points lying further apart. The length $h$ of the largest bin determines the scale of the descriptor: every two boundary points lying within distance $h$ of each other will be captured by the descriptor. To guarantee that the descriptor is global, we set $h$ equal to the diameter of the object in the case of pre-segmented object masks. The remaining three features are angles lying in $[0, 2\pi)$ and are binned uniformly: the chord orientation in $b_r$ bins and each of the two normal angles in $b_n$ bins. This binning strategy results in an $N = b_l \times b_r \times b_n^2$ dimensional shape descriptor at scale $h$. The chord features are summarized in Table 2.1.

Figure 2.5: For each pair of shapes (upper row), we show the chordiogram computed over the normal features only (middle row) and over the chord length and orientation (lower row). (a) Coarse shape. (b) Fine shape.
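The binning described above can be sketched as a mapping from a feature vector to a single histogram index. The text does not specify the exact bin edges, so this example assumes log-spaced length bins between a hypothetical smallest resolved length `l_min` and the scale `h`; the function names and the row-major index layout are likewise illustrative choices.

```python
import math

TWO_PI = 2.0 * math.pi


def length_bin(l, h, b_l, l_min=1.0):
    """Log-space bin in [0, b_l) for a chord length l, or None if l exceeds h.

    Assumes h > l_min; l_min is a hypothetical lower bound on resolved lengths.
    """
    if l > h:
        return None                       # chord lies outside the descriptor scale
    l = max(l, l_min)                     # clamp very short chords into the first bin
    t = math.log(l / l_min) / math.log(h / l_min)  # normalized log length in [0, 1]
    return min(int(t * b_l), b_l - 1)


def angle_bin(angle, b):
    """Uniform bin in [0, b) for an angle, taken modulo 2*pi."""
    return min(int((angle % TWO_PI) / TWO_PI * b), b - 1)


def flat_bin(f, h, b_l, b_r, b_n):
    """Map (l_pq, psi_pq, theta_p - psi_pq, theta_q - psi_pq) to an index in [0, N)."""
    l_pq, psi_pq, rel_p, rel_q = f
    i = length_bin(l_pq, h, b_l)
    if i is None:
        return None
    j = angle_bin(psi_pq, b_r)
    k = angle_bin(rel_p, b_n)
    m = angle_bin(rel_q, b_n)
    return ((i * b_r + j) * b_n + k) * b_n + m


b_l, b_r, b_n, h = 4, 8, 8, 100.0
N = b_l * b_r * b_n ** 2
first = flat_bin((1.0, 0.0, 0.0, 0.0), h, b_l, b_r, b_n)
last = flat_bin((100.0, 6.28, 6.28, 6.28), h, b_l, b_r, b_n)
too_long = flat_bin((150.0, 0.0, 0.0, 0.0), h, b_l, b_r, b_n)
```

With these example settings the index range has size N = 4 × 8 × 8² = 2048, and chords longer than `h` are dropped rather than binned.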

Chord Features Analysis. The chord features determine the invariance of the chordiogram to geometric transformations. Since we do not capture absolute location information, the resulting descriptor is translation invariant. However, the chord orientation prevents the descriptor from being rotation invariant. Similarly, the chord length prevents the chordiogram from being scale invariant. This design choice is motivated by the fact that translation is the largest dimension of a similarity transformation that we have to search along during detection. Moreover, the above version of the chordiogram is tailored towards image datasets that exhibit the characteristics of personal photo collections: users tend to take pictures of objects in their natural pose, which usually means that we do not need to search over possible rotations.

| Feature | Description | Binning | # bins | Rotation invariant | Scale invariant | Translation invariant |
|---|---|---|---|---|---|---|
| $l_{pq}$ | chord length | log space | $b_l$ | yes | no | yes |
| $l_p$ | distance to center | uniform | $b_d$ | yes | no | no |
| $\psi_{pq}$ | chord orientation | uniform | $b_r$ | no | yes | yes |
| $\theta_p - \psi_{pq}$ | relative normal | uniform | $b_n$ | yes | yes | yes |

Table 2.1: Summary of the chord features and their properties. Note that both the chord length and the distance to the object center also depend on the scale, defined as the boundary of the largest bin.

other types of transformations. However, this would decrease the expressiveness of our representation.
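The invariance entries of Table 2.1 for the translation-invariant feature set can be checked directly from the definitions. The short derivation below assumes a similarity transformation that maps every boundary point $x$ to $x' = s R_\alpha x + t$, with scale $s > 0$, rotation $R_\alpha$ by an angle $\alpha$, and translation $t$; these symbols are introduced here for illustration only, and all angles are understood modulo $2\pi$.

$$
\begin{aligned}
q' - p' &= s R_\alpha (q - p) && \text{(the translation } t \text{ cancels)} \\
l'_{pq} &= s \, l_{pq} && \text{(changes only with scale)} \\
\psi'_{pq} &= \psi_{pq} + \alpha && \text{(changes only with rotation)} \\
\theta'_p - \psi'_{pq} &= (\theta_p + \alpha) - (\psi_{pq} + \alpha) = \theta_p - \psi_{pq} && \text{(unchanged)}
\end{aligned}
$$

The same argument applies to $\theta_q - \psi_{pq}$. Hence all four features are unaffected by $t$, only $\psi_{pq}$ depends on $\alpha$, and only $l_{pq}$ depends on $s$.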

The chord features are chosen such that they completely describe the geometry of a chord. When it comes to the chordiogram, the features capture different shape properties. The chord length and orientation capture global coarse shape properties, while the fine information is captured by the normals.

To see this, consider the example given in Fig. 2.5. We can restrict the computation of the chordiogram to a subset of the features. If we only use the normals at the boundary points, then the fine boundary information shown in Fig. 2.5(b) can be distinguished. If we, however, use only the chord length and orientation, then we can discriminate based on coarse shape, as visualized in Fig. 2.5(a).

II. Rotation-invariant chord features. In certain applications, such as videos or multiple images of the same scene, we could use motion information or stereo to detect the rough location and support of a foreground object (see Chapter 4). In such situations, translation and scale invariance are irrelevant, while we need to deal with rotation.

To introduce a rotation-invariant variant of the chordiogram, consider the center of mass of the object outline, defined as

$$c = \frac{1}{|C|} \sum_{p \in C} p$$

For each boundary point $p \in C$, denote by $l_p = |pc|$ the distance to the object center $c$. Then the rotation-invariant features are (see Fig. 2.4(b)):

• Chord length $l_{pq}$ of the vector connecting $p$ and $q$, and distances to center $l_p$ and $l_q$.

• Normals $\theta_p$ and $\theta_q$ to the object boundary at $p$ and $q$. To achieve rotation invariance of these features, the angles are normalized with respect to the chord orientation.

Thus, the chord features can be written as:

$$f^{(r)}_{pq} = (l_{pq},\ l_p,\ l_q,\ \theta_p - \psi_{pq},\ \theta_q - \psi_{pq})^T$$

The distances $l_p$ and $l_q$ are binned uniformly into $b_d$ bins, while the remaining features are binned as above. This gives us an $N = b_l \times b_d^2 \times b_n^2$ dimensional descriptor. The chord features are summarized in Table 2.1.
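The rotation-invariant features can be sketched in the same way as the translation-invariant ones. The example below again assumes boundary edges stored as `(x, y, theta)` tuples with `theta` the inward normal angle, and checks numerically that rotating the whole outline leaves the feature vector unchanged; the helper names and the sample points are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def centroid(boundary):
    """Center of mass c of the boundary points (mean of their positions)."""
    n = len(boundary)
    return (
        sum(x for x, _, _ in boundary) / n,
        sum(y for _, y, _ in boundary) / n,
    )


def rotation_invariant_features(p, q, c):
    """Return (l_pq, l_p, l_q, theta_p - psi_pq, theta_q - psi_pq) for one chord."""
    (px, py, theta_p), (qx, qy, theta_q) = p, q
    psi_pq = math.atan2(qy - py, qx - px)     # chord orientation, used only for normalization
    return (
        math.hypot(qx - px, qy - py),         # chord length l_pq
        math.hypot(px - c[0], py - c[1]),     # distance l_p from p to the center
        math.hypot(qx - c[0], qy - c[1]),     # distance l_q from q to the center
        (theta_p - psi_pq) % TWO_PI,
        (theta_q - psi_pq) % TWO_PI,
    )


def rotate(edge, alpha):
    """Rotate a boundary edge about the origin; its normal angle turns by alpha too."""
    x, y, theta = edge
    ca, sa = math.cos(alpha), math.sin(alpha)
    return (x * ca - y * sa, x * sa + y * ca, theta + alpha)


outline = [
    (2.0, 0.0, math.pi),
    (0.0, 2.0, 1.5 * math.pi),
    (-1.0, -1.0, 0.25 * math.pi),
]
rotated = [rotate(e, 0.7) for e in outline]

f = rotation_invariant_features(outline[0], outline[1], centroid(outline))
f_rot = rotation_invariant_features(rotated[0], rotated[1], centroid(rotated))
```

Up to floating-point error, `f` and `f_rot` coincide, because lengths are preserved by rotation and both normals turn by the same angle as the chord.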