Edge Detection - 2461059 Computer Vision Solution Manual

PROBLEMS

8.1. Each pixel value in a 500×500 pixel image I is an independent, normally distributed random variable with zero mean and standard deviation one. Estimate the number of pixels where the absolute value of the x derivative, estimated by forward differences (i.e., $|I_{i+1,j} - I_{i,j}|$), is greater than 3.

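Solution The signed difference has mean 0 and standard deviation $\sqrt{2}$. There are 500 rows and 499 differences per row, so a total of 500 × 499 differences. The probability that the absolute value of a difference is larger than 3 is

$$P(|\mathrm{diff}| > 3) = 2 \int_{3}^{\infty} \frac{1}{\sqrt{4\pi}}\, e^{-x^2/4}\,dx.$$

$P(|\mathrm{diff}| > 3)$ can be obtained from tables for the complementary error function, defined by

$$\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_{z}^{\infty} e^{-u^2}\,du,$$

by a change of variables in the integral ($u = x/2$), so that $P(|\mathrm{diff}| > 3) = \mathrm{erfc}(3/2) \approx 0.034$, which can be looked up in tables. The expected number of such pixels is therefore about $500 \times 499 \times 0.034 \approx 8{,}500$.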
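As a numerical check, the estimate can be evaluated directly rather than read from tables. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and do not come from the text.

```python
import math

# Forward differences of i.i.d. N(0, 1) pixels are N(0, 2), so sigma = sqrt(2).
sigma = math.sqrt(2.0)
threshold = 3.0

# For d ~ N(0, sigma^2), P(|d| > t) = erfc(t / (sigma * sqrt(2))), here erfc(3/2).
p = math.erfc(threshold / (sigma * math.sqrt(2.0)))

n_differences = 500 * 499          # 499 forward differences in each of 500 rows
expected_count = n_differences * p

print(f"P(|diff| > 3)  = {p:.4f}")
print(f"expected count = {expected_count:.0f}")
```

The count is an expectation; any particular noise image will scatter around it.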

8.2. Each pixel value in a 500×500 pixel image I is an independent, normally distributed random variable with zero mean and standard deviation one. I is convolved with the (2k + 1) × (2k + 1) kernel G. What is the covariance of pixel values in the result?

There are two ways to do this; on a case-by-case basis (e.g. at points that are greater than 2k + 1 apart in either the x or y direction, the values are clearly independent) or in one fell swoop. Don’t worry about the pixel values at the boundary.

Solution The value of each pixel in the result is a weighted sum of pixels from the input. Each pixel in the input is independent. For two pixels in the output to have non-zero covariance, they must share some elements in their sum. The covariance of two pixels with shared elements is the expected value of a product of sums, that is

$$E(R_{ij} R_{uv}) = E\left[\left(\sum_{l,m} G_{lm} I_{i-l,j-m}\right)\left(\sum_{s,t} G_{st} I_{u-s,v-t}\right)\right].$$

Now some elements of these sums are shared, and it is the shared values that produce covariance. In particular, the shared terms occur when $i - l = u - s$ and $j - m = v - t$.

The covariance will be the variance (which is one here) times the weights with which these shared terms appear. Hence

$$E(R_{ij} R_{uv}) = \sum_{i-l=u-s,\; j-m=v-t} G_{lm} G_{st}.$$
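The constrained sum is the autocorrelation of the kernel evaluated at the offset between the two output pixels. The sketch below evaluates it for a given offset; the function name `output_covariance` and the 3×3 box filter are assumptions chosen for illustration, and the input noise is taken to have unit variance as in the problem.

```python
import numpy as np

def output_covariance(G, du, dv):
    """Covariance of R[i, j] and R[i + du, j + dv] for unit-variance white noise.

    The shared-term condition i - l = u - s, j - m = v - t gives
    s = l + du and t = m + dv, so the sum runs over overlapping kernel entries.
    """
    n_rows, n_cols = G.shape
    total = 0.0
    for l in range(n_rows):
        for m in range(n_cols):
            s, t = l + du, m + dv
            if 0 <= s < n_rows and 0 <= t < n_cols:
                total += G[l, m] * G[s, t]
    return total

# Illustrative 3x3 box filter (k = 1); any (2k+1) x (2k+1) kernel can be used.
G = np.full((3, 3), 1.0 / 9.0)
print(output_covariance(G, 0, 0))   # variance of each output pixel
print(output_covariance(G, 1, 0))   # output pixels one step apart
print(output_covariance(G, 3, 0))   # kernel supports no longer overlap
```

Only the difference of indices matters, so indexing the kernel from 0 rather than from −k does not change the result.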

8.3. We have a camera that can produce output values that are integers in the range from 0 to 255. Its spatial resolution is 1024 by 768 pixels, and it produces 30 frames a second. We point it at a scene that, in the absence of noise, would produce the constant value 128. The output of the camera is subject to noise that we model as zero mean stationary additive Gaussian noise with a standard deviation of 1. How long must we wait before the noise model predicts that we should see a pixel with a negative value? (Hint: You may find it helpful to use logarithms to compute the answer as a straightforward evaluation of exp(−128²/2) will yield 0; the trick is to get the large positive and large negative logarithms to cancel.)

Solution The hint is unhelpful; DAF apologizes. The most important issue here is $P(\text{value of noise} < -128)$. This is

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{-128} e^{-x^2/2}\,dx,$$

which can be looked up in tables for the complementary error function, as above.

There are 30 × 1024 × 768 samples per second, each of which has probability $p = P(\text{value of noise} < -128)$ of having a negative value. The probability of obtaining a run of samples that is $N$ long and contains no negative value is $(1 - p)^N$. Assume we would like a run that has a 0.9 probability of having a negative value in it; then $(1 - p)^N \le 0.1$, so it must have at least $\log(0.1)/\log(1 - p)$ samples in it. Dividing this count by the number of samples per second gives the waiting time in seconds.
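Because $p$ underflows in floating point, the calculation has to be carried out in logarithms. The sketch below is one way to do this, under two stated assumptions: the Gaussian tail is replaced by its leading asymptotic term, and the run is required to contain a negative value with probability 0.9, so that $(1 - p)^N \le 0.1$. The function name `log_gaussian_tail` is illustrative.

```python
import math

def log_gaussian_tail(x):
    """Natural log of P(Z > x) for a standard normal Z, for large x.

    Uses the leading asymptotic term exp(-x^2 / 2) / (x * sqrt(2 * pi)),
    whose relative error is of order 1 / x^2.
    """
    return -0.5 * x * x - math.log(x) - 0.5 * math.log(2.0 * math.pi)

log_p = log_gaussian_tail(128.0)          # log P(noise < -128), by symmetry

# For tiny p, log(1 - p) is approximately -p, so N = log(0.1) / log(1 - p)
# is approximately ln(10) / p; work with log N to avoid overflow.
log_n = math.log(math.log(10.0)) - log_p

samples_per_second = 30 * 1024 * 768
log10_seconds = (log_n - math.log(samples_per_second)) / math.log(10.0)

print(f"log p = {log_p:.2f}")
print(f"wait is roughly 10^{log10_seconds:.0f} seconds")
```

The exponent is in the thousands, so under this noise model a negative pixel value is, for practical purposes, never observed.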

8.4. In Section 8.3.1, we said that a sensible 2D analogue to the 1D second derivative must be rotationally invariant. Why is this true?

Solution This depends on whether we are looking for directed or undirected edges. If we look for maxima of the magnitude of the gradient, this says nothing about the direction of the edge (we have to look at the gradient direction for this), and so we can mark edge points by looking at local maxima without worrying about the direction of the edge. To do this with a second derivative operator, we need one that will be zero whatever the orientation of the edge; i.e., rotating the operator will not affect the response. This means it must be rotationally invariant.
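A small experiment can make the distinction concrete. The sketch below is an illustration only: it uses a 5-point discrete Laplacian with periodic boundaries (an assumption, chosen so that boundaries do not interfere) and tests only 90-degree rotations, which are the rotations a pixel grid represents exactly. The continuous Laplacian is invariant under all rotations.

```python
import numpy as np

def laplacian(img):
    """5-point discrete Laplacian with periodic boundaries."""
    return (np.roll(img, 1, axis=0) + np.roll(img, -1, axis=0)
            + np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1) - 4.0 * img)

def d2_dx2(img):
    """Second difference along axis 1 only; a directional operator."""
    return np.roll(img, 1, axis=1) + np.roll(img, -1, axis=1) - 2.0 * img

rng = np.random.default_rng(0)
img = rng.normal(size=(16, 16))

# An operator is invariant to this rotation if rotating then filtering
# gives the same result as filtering then rotating.
lap_commutes = np.allclose(laplacian(np.rot90(img)), np.rot90(laplacian(img)))
d2x_commutes = np.allclose(d2_dx2(np.rot90(img)), np.rot90(d2_dx2(img)))
print(lap_commutes, d2x_commutes)
```

The Laplacian commutes with the rotation, while the single directional second derivative does not, so its zero crossings would depend on edge orientation.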

Programming Assignments

8.5. Why is it necessary to check that the gradient magnitude is large at zero crossings of the Laplacian of an image? Demonstrate a series of edges for which this test is significant.

8.6. The Laplacian of a Gaussian looks similar to the difference between two Gaussians at different scales. Compare these two kernels for various values of the two scales. Which choices give a good approximation? How significant is the approximation error in edge finding using a zero-crossing approach?
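One possible starting point for the kernel comparison is sketched below. It samples both kernels on a pixel grid, scales each to unit norm, and reports the distance between them for several scale ratios. The function names, the grid size, and the choice of placing the two Gaussian scales symmetrically about the Laplacian of Gaussian scale (in the geometric-mean sense) are assumptions made for this sketch. It compares kernel shapes only and says nothing by itself about zero-crossing locations.

```python
import numpy as np

def gaussian(x, y, sigma):
    """Isotropic 2D Gaussian with unit integral."""
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)

def laplacian_of_gaussian(x, y, sigma):
    """Analytic Laplacian of the 2D Gaussian above."""
    r2 = x**2 + y**2
    return (r2 - 2.0 * sigma**2) / sigma**4 * gaussian(x, y, sigma)

def shape_error(sigma, ratio, half_width=12):
    """Distance between unit-norm LoG and DoG kernels on a pixel grid.

    The DoG scales are sigma / sqrt(ratio) and sigma * sqrt(ratio), so their
    geometric mean stays at sigma (an illustrative choice).
    """
    coords = np.arange(-half_width, half_width + 1, dtype=float)
    x, y = np.meshgrid(coords, coords)
    log_kernel = laplacian_of_gaussian(x, y, sigma)
    dog_kernel = (gaussian(x, y, sigma * np.sqrt(ratio))
                  - gaussian(x, y, sigma / np.sqrt(ratio)))
    log_kernel = log_kernel / np.linalg.norm(log_kernel)
    dog_kernel = dog_kernel / np.linalg.norm(dog_kernel)
    return np.linalg.norm(log_kernel - dog_kernel)

for ratio in (1.01, 1.2, 1.6, 2.5, 4.0):
    print(f"scale ratio {ratio:4.2f}: shape error {shape_error(2.0, ratio):.4f}")
```

The wider Gaussian is subtracted from nothing and the narrower one is subtracted from it, so both kernels are negative at the origin and the comparison does not need a sign flip.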


8.7. Obtain an implementation of Canny’s edge detector (you could try the vision home page; MATLAB has an implementation in the image processing toolbox, too) and make a series of images indicating the effects of scale and contrast thresholds on the edges that are detected. How easy is it to set up the edge detector to mark only object boundaries? Can you think of applications where this would be easy?

8.8. It is quite easy to defeat hysteresis in edge detectors that implement it — essentially, one sets the lower and higher thresholds to have the same value. Use this trick to compare the behavior of an edge detector with and without hysteresis. There are a variety of issues to look at:

(a) What are you trying to do with the edge detector output? It is sometimes helpful to have linked chains of edge points. Does hysteresis help significantly here?

(b) Noise suppression: We often wish to force edge detectors to ignore some edge points and mark others. One diagnostic that an edge is useful is high contrast (it is by no means reliable). How reliably can you use hysteresis to suppress low-contrast edges without breaking high-contrast edges?

CHAPTER 9

Texture

PROBLEMS

9.1. Show that a circle appears as an ellipse in an orthographic view, and that the minor axis of this ellipse is the tilt direction. What is the aspect ratio of this ellipse?

Solution The circle lies on a plane. An orthographic view of the plane is obtained by projecting along some family of parallel rays onto another plane. Now on the image plane there will be some direction that is parallel to the object plane; call this T. Choose another direction on the image plane that is perpendicular to this one, and call it B. Now I can rotate the coordinate system on the object plane without problems (it's a circle!) so I rotate it so that the x direction is parallel to T. The y-coordinate projects onto the B direction (because the image plane is rotated about T with respect to the object plane) but is foreshortened. This means that the point (x, y) in the object plane projects to the point (x, αy) in the T, B coordinate system on the image plane (0 ≤ α ≤ 1 is a constant to do with the relative orientation of the planes). This means that the curve (cos θ, sin θ) on the object plane goes to (cos θ, α sin θ) on the image plane, which is an ellipse. Its major axis lies along T and its minor axis lies along B, which is the direction in which the plane is foreshortened, i.e., the tilt direction. The aspect ratio (minor axis over major axis) is α, the cosine of the angle between the object plane and the image plane.
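The construction can be checked numerically. The sketch below places a unit circle on the object plane, rotates that plane about the T axis by a hypothetical slant angle of 60 degrees (the angle between the two planes; the value is arbitrary), and projects orthographically by dropping the depth coordinate.

```python
import numpy as np

slant = np.deg2rad(60.0)                 # hypothetical angle between the planes
theta = np.linspace(0.0, 2.0 * np.pi, 3600, endpoint=False)

# Unit circle on the object plane, with the x axis along T.
circle = np.stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

# Rotate the object plane about the T (x) axis by the slant angle.
c, s = np.cos(slant), np.sin(slant)
rotation = np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])
rotated = rotation @ circle

# Orthographic projection along z keeps only the (T, B) coordinates.
t_coord, b_coord = rotated[0], rotated[1]

aspect_ratio = (b_coord.max() - b_coord.min()) / (t_coord.max() - t_coord.min())
print(aspect_ratio, np.cos(slant))
```

The extent along B shrinks by the cosine of the slant while the extent along T is unchanged, so the two printed values agree.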

9.2. We will study measuring the orientation of a plane in an orthographic view, given the texture consists of points laid down by a homogeneous Poisson point process. Recall that one way to generate points according to such a process is to sample the x and y coordinate of the point uniformly and at random. We assume that the points from our process lie within a unit square.

(a) Show that the probability that a point will land in a particular set is proportional to the area of that set.

(b) Assume we partition the area into disjoint sets. Show that the number of points in each set has a multinomial probability distribution.

We will now use these observations to recover the orientation of the plane. We partition the image texture into a collection of disjoint sets.

(c) Show that the area of each set, backprojected onto the textured plane, is a function of the orientation of the plane.

(d) Use this function to suggest a method for obtaining the plane’s orientation.

Solution The answer to (d) is no. The rest is straightforward.
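Parts (a) and (b) can be illustrated by simulation. The sketch below draws points uniformly in the unit square and counts how many fall in each cell of a partition; the partition into three vertical strips, the number of points, and the random seed are all arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 100_000
points = rng.uniform(0.0, 1.0, size=(n_points, 2))

# Hypothetical partition of the unit square into three vertical strips.
edges = np.array([0.0, 0.2, 0.5, 1.0])
areas = np.diff(edges)                    # each strip has height 1

# Count the points whose x coordinate falls in each strip.
counts, _ = np.histogram(points[:, 0], bins=edges)
expected = n_points * areas               # multinomial mean N * p_k, p_k = area_k

print("observed counts:", counts)
print("expected counts:", expected)
```

The observed counts track the strip areas, and they always sum to the total number of points, as a multinomial count vector must.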

Programming Assignments

9.3. Texture synthesis: Implement the non-parametric texture synthesis algorithm of Section 9.3.2. Use your implementation to study:

(a) the effect of window size on the synthesized texture;

(b) the effect of window shape on the synthesized texture;

(c) the effect of the matching criterion on the synthesized texture (i.e., using weighted sum of squares instead of sum of squares, etc.).

9.4. Texture representation: Implement a texture classifier that can distinguish between at least six types of texture; use the scale selection mechanism of Section 9.1.2, and compute statistics of filter outputs. We recommend that you use at least the mean and covariance of the outputs of about six oriented bar filters and a spot filter. You may need to read up on classification in Chapter 22; use a simple classifier (nearest neighbor using Mahalanobis distance should do the trick).
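The classification step could be sketched as follows. This is a minimal illustration, not a full solution: the feature vectors are synthetic stand-ins for filter-output statistics, the function names `fit_mahalanobis_nn` and `predict` are hypothetical, and the covariance is estimated from all training vectors together, which is one of several reasonable choices.

```python
import numpy as np

def fit_mahalanobis_nn(features, labels):
    """Store the training data and the inverse covariance of all features."""
    cov = np.cov(features, rowvar=False)
    # A small ridge keeps the inverse stable; its size is an arbitrary choice.
    inv_cov = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return features, np.asarray(labels), inv_cov

def predict(model, query):
    """Label of the training vector nearest to query in Mahalanobis distance."""
    features, labels, inv_cov = model
    diff = features - query
    # Squared Mahalanobis distance from the query to every training vector.
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return labels[np.argmin(d2)]

# Synthetic two-class data standing in for texture feature vectors.
rng = np.random.default_rng(0)
class_a = rng.normal(loc=[0.0, 0.0], scale=[1.0, 0.1], size=(50, 2))
class_b = rng.normal(loc=[0.0, 3.0], scale=[1.0, 0.1], size=(50, 2))
features = np.vstack([class_a, class_b])
labels = np.array(["a"] * 50 + ["b"] * 50)

model = fit_mahalanobis_nn(features, labels)
print(predict(model, np.array([0.5, 3.0])))
print(predict(model, np.array([-0.5, 0.0])))
```

For the assignment itself, each row of `features` would hold the filter-output statistics computed from one texture sample.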

CHAPTER 10
