Model-Based Vision - 2461059 Computer Vision Solution Manual

PROBLEMS

18.1. Assume that we are viewing objects in a calibrated perspective camera and wish to use a pose consistency algorithm for recognition.

(a) Show that three points is a frame group.

(b) Show that a line and a point is not a frame group.

(d) Is a circle and a point not on its axis a frame group?

Solution

(a) We have a calibrated perspective camera, so in the camera frame we can construct the three rays through the focal point corresponding to each image point. We must now slice these three rays with some plane to get a prescribed triangle (the three points on the object). If there is only a discrete set of ways of doing this, we have a frame group, because we can recover the rotation and translation of the camera from any such plane. Now choose a point along ray 1 to be the first object point. There are at most two possible points on ray 2 that could be the second object point — see this by thinking about the 12 edge of the triangle as a link of fixed length, and swinging this around the first point; it forms a sphere, which can intersect a line in at most two points.

Choose one of these points. Now we have fixed one edge of our triangle in space — can we get the third point on the object triangle to intersect the third image ray? In general, no, because we can only rotate the triangle about the 12 edge, which means the third point describes a circle; but a circle will not in general intersect a ray in space, so we have to choose a special point for along ray 1 to be the object point. It follows that only a discrete set of choices are possible, and we are done.

(b) We use a version of the previous argument. We have a calibrated perspective camera, and so can construct in the camera frame the plane and ray corre-sponding respectively to the image line and image point. Now choose a line on the plane to be the object line. Can we find a solution for the object point?

The object point could lie anywhere on a cylinder whose axis is the chosen line and whose radius is the distance from line to point. There are now two general cases — either there is no solution, or there are two (where the ray intersects the cylinder). But for most lines where there are two solutions, slightly moving the line results in another line for which there are two solu-tions, so there is a continuous family of available solusolu-tions, meaning it can’t be a frame group.

(d) Yes, for a calibrated perspective camera.

18.2. We have a set of plane points Pj; these are subject to a plane affine transformation.

Show that

det£

P_iP_jP_k¤ det£

P_iP_jP_l¤

is an affine invariant (as long as no two of i, j, k, and l are the same and no three of these points are collinear).

Solution Write Q_i=MPi for the affine transform of point Pi. Now det£

Q_iQ_jQ_k¤ det£

Q_iQ_jQ_l¤ ⁼

det(M£

P_iP_jP_k¤ ) det(M£

P_iP_jP_l¤

) = det(M) det£

P_iP_jP_k¤ det(M) det£

P_iP_jP_l¤ ⁼ det£

P_iP_jP_k¤ det£

P_iP_jP_l¤

18.3. Use the result of the previous exercise to construct an affine invariant for:

(a) four lines,

(b) three coplanar points,

Solution

(a) Take the intersection of lines 1 and 2 as Pi, etc.

(b) Typo! can’t be done; sorry - DAF.

(c) Construct the line joining the two points; these points, with the intersection between the lines, give three collinear points. The ratio of their lengths is an affine invariant. Easiest proof: an affine transformation of the plane restricted to this line is an affine transformation of the line. But this involves only scaling and translation, and the ratio of lengths is invariant to both.

18.4. In chamfer matching at any step, a pixel can be updated if the distances from some or all of its neighbors to an edge are known. Borgefors counts the distance from a pixel to a vertical or horizontal neighbor as 3 and to a diagonal neighbor as 4 to ensure the pixel values are integers. Why does this mean√

2 is approximated as 4/3? Would a better approximation be a good idea?

18.5. One way to improve pose estimates is to take a verification score and then optimize it as a function of pose. We said that this optimization could be hard particularly if the test to tell whether a backprojected curve was close to an edge point was a threshold on distance. Why would this lead to a hard optimization problem?

Solution Because the error would not be differentiable — as the backprojected outline moved, some points would start or stop contributing.

18.6. We said that for an uncalibrated affine camera viewing a set of plane points, the effect of the camera can be written as an unknown plane affine transformation.

Prove this. What if the camera is an uncalibrated perspective camera viewing a set of plane points?

18.7. Prepare a summary of methods for registration in medical imaging other than the geometric hashing idea we discussed. You should keep practical constraints in mind, and you should indicate which methods you favor, and why.

18.8. Prepare a summary of nonmedical applications of registration and pose consistency.

Programming Assignments

18.9. Representing an object as a linear combination of models is often represented as abstraction because we can regard adjusting the coefficients as obtaining the same view of different models. Furthermore, we could get a parametric family of models

80 Chapter 18 Model-Based Vision

by adding a basis element to the space. Explore these ideas by building a system for matching rectangular buildings where the width, height, and depth of the building are unknown parameters. You should extend the linear combinations idea to han-dle orthographic cameras; this involves constraining the coefficients to represent rotations.

C H A P T E R 19

In document 2461059 Computer Vision Solution Manual (Page 78-81)