Image-Based Computer Graphics
CGRA 352
How to automatically stitch these images?
• Giga-pixel stitching by Microsoft
Homography transformation to align the content
Advantages of local features
• Locality
– features are local, so robust to occlusion and clutter
• Distinctiveness
– can differentiate a large database of objects
• Quantity
– hundreds or thousands in a single image
• Efficiency
– real-time performance achievable
• Generality
Local Image Point Applications
• Image alignment (stitching, mosaics)
• 3D reconstruction
• Motion tracking
• Object recognition
Challenges
• Invariance
Find good features
What about edges?
Suppose you have to click on some points, go away, come back after I deform the image, and click on the same points again.
• Want uniqueness
• Leads to unambiguous matches in other images
• Look for “interest points”: image regions that are unusual
• How to define “unusual”?
Suppose we only consider a small window of pixels
• What defines whether a feature is a good or bad candidate?
– Consider uniqueness, repeatability and invariance…
The math
• Consider shifting the window W by (u,v)
• How do the pixels in W change?
• Compare each pixel before and after using the sum of squared differences (SSD):
$E(u,v) = \sum_{(x,y) \in W} \left[ I(x+u, y+v) - I(x,y) \right]^2$
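The SSD comparison can be tried on a tiny synthetic image. This is a minimal sketch, not the course's reference code: the function name `ssd_error`, its window arguments and the test image are all assumptions made for illustration.

```python
import numpy as np

def ssd_error(image, top, left, size, u, v):
    """E(u, v): SSD between window W and the same window shifted by (u, v).

    (top, left, size) locate the square window W (hypothetical parameters);
    u shifts along columns, v shifts along rows.
    """
    window = image[top:top + size, left:left + size].astype(float)
    shifted = image[top + v:top + v + size, left + u:left + u + size].astype(float)
    return float(np.sum((shifted - window) ** 2))

# Synthetic image: dark background with a bright square in the lower right.
image = np.zeros((20, 20))
image[10:, 10:] = 1.0

flat = ssd_error(image, 2, 2, 5, u=1, v=0)          # window on a flat region
edge_along = ssd_error(image, 12, 8, 5, u=0, v=1)   # on a vertical edge, shift along it
edge_across = ssd_error(image, 12, 8, 5, u=1, v=0)  # on a vertical edge, shift across it
corner = [ssd_error(image, 8, 8, 5, u, v) for u, v in [(1, 0), (0, 1), (1, 1)]]
print(flat, edge_along, edge_across, corner)
```

In this toy case the flat window never changes, the edge window changes only for shifts across the edge, and the corner window changes for every shift.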
What is the meaning of eigenvectors?
Simple example
Want E(u,v) to be large in all directions
• The minimum of E(u,v) should be large over all unit vectors [u v]
• This minimum is given by the smaller eigenvalue $\lambda_-$ of H
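To see how the smaller eigenvalue separates flat regions, edges and corners, here is a hedged sketch. It assumes H is the unweighted sum of gradient products over the window (the slides' exact weighting is not shown), and the helper names are hypothetical.

```python
import numpy as np

def second_moment_matrix(window):
    """H = sum over W of [[Ix^2, Ix*Iy], [Ix*Iy, Iy^2]] (assumed unweighted window)."""
    Iy, Ix = np.gradient(window.astype(float))  # derivatives along rows, then columns
    return np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                     [np.sum(Ix * Iy), np.sum(Iy * Iy)]])

def eigenvalues(H):
    """Return (lambda_minus, lambda_plus) of the symmetric matrix H."""
    lam = np.linalg.eigvalsh(H)  # eigvalsh returns eigenvalues in ascending order
    return float(lam[0]), float(lam[1])

flat_window = np.zeros((7, 7))
edge_window = np.zeros((7, 7))
edge_window[:, 3:] = 1.0        # vertical step edge
corner_window = np.zeros((7, 7))
corner_window[3:, 3:] = 1.0     # one corner of a bright square

for name, w in [("flat", flat_window), ("edge", edge_window), ("corner", corner_window)]:
    print(name, eigenvalues(second_moment_matrix(w)))
```

Only the corner window has a smaller eigenvalue well above zero; the edge has one large and one zero eigenvalue, and the flat window has two zeros.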
Corner Response Function (in Harris detector)
• Computing eigenvalues is expensive
• The Harris corner detector uses the following alternative:
$R = \det(H) - k\,\operatorname{trace}(H)^2 = \lambda_+ \lambda_- - k\,(\lambda_+ + \lambda_-)^2$
• det is the determinant; trace is the sum of the diagonal elements of a matrix
• Very similar to $\lambda_-$, but less expensive (no eigenvalue computation)
• R depends only on eigenvalues
• R is large for a corner
• R is negative with large magnitude for an edge
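A small numeric check of the response function, as a sketch only. The value k = 0.05 is an assumed typical choice (the slides do not give one), and the two example matrices are made up to stand for a corner and an edge.

```python
import numpy as np

def corner_response(H, k=0.05):
    """R = det(H) - k * trace(H)^2; k = 0.05 is an assumed, commonly used value."""
    return float(np.linalg.det(H) - k * np.trace(H) ** 2)

H_corner = np.array([[2.0, 0.25], [0.25, 2.0]])  # two large eigenvalues
H_edge = np.array([[3.5, 0.0], [0.0, 0.0]])      # one large, one zero eigenvalue

for name, H in [("corner", H_corner), ("edge", H_edge)]:
    lam_minus, lam_plus = np.linalg.eigvalsh(H)
    # The same R written directly in terms of the eigenvalues.
    from_eigs = lam_plus * lam_minus - 0.05 * (lam_plus + lam_minus) ** 2
    print(name, corner_response(H), from_eigs)
```

Both forms agree, which is why the detector can skip the eigenvalue computation.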
Harris detector
1. Compute Gaussian derivatives at each pixel
2. Compute the second moment matrix H in a Gaussian window around each pixel
3. Compute corner response function R
4. Threshold R
5. Find local maxima of response function (non-maximum suppression)
C. Harris and M. Stephens. “A Combined Corner and Edge Detector.”
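The five steps can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not Harris and Stephens' implementation: the parameter names and defaults (`sigma_d`, `sigma_i`, `k`, `rel_threshold`), the zero-padded separable blur and the 3x3 suppression window are all choices made here.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = int(3 * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return g / g.sum()

def smooth(image, sigma):
    """Separable Gaussian blur (rows, then columns) with zero padding at the borders."""
    g = gaussian_kernel(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, g, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, g, mode="same")

def harris(image, sigma_d=1.0, sigma_i=2.0, k=0.05, rel_threshold=0.1):
    # 1. Gaussian derivatives: blur, then finite differences.
    Iy, Ix = np.gradient(smooth(image.astype(float), sigma_d))
    # 2. Entries of the second moment matrix, accumulated in a Gaussian window.
    Sxx, Syy, Sxy = smooth(Ix * Ix, sigma_i), smooth(Iy * Iy, sigma_i), smooth(Ix * Iy, sigma_i)
    # 3. Corner response R = det - k * trace^2.
    R = Sxx * Syy - Sxy ** 2 - k * (Sxx + Syy) ** 2
    # 4. Threshold relative to the strongest response.
    strong = R > rel_threshold * R.max()
    # 5. Non-maximum suppression over each 3x3 neighbourhood.
    h, w = R.shape
    padded = np.pad(R, 1, mode="constant", constant_values=-np.inf)
    neighbourhood = np.stack([padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                              for dy in (-1, 0, 1) for dx in (-1, 0, 1)])
    is_max = R >= neighbourhood.max(axis=0)
    return np.argwhere(strong & is_max), R

# Synthetic test image: a bright square on a dark background.
image = np.zeros((40, 40))
image[10:30, 10:30] = 1.0
corners, R = harris(image)
print(corners)
```

On this square the detections should cluster near its four corners, while R is negative along the sides.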
Harris descriptor
Harris Detector: Summary
• Average intensity change in direction [u,v] can be expressed as a bilinear form:
$E(u,v) = \begin{bmatrix} u & v \end{bmatrix} H \begin{bmatrix} u \\ v \end{bmatrix}$
• Describe a point in terms of the eigenvalues of H: a measure of corner response
• A good (corner) point should have a large intensity change in all directions, i.e. R should be large and positive
Invariance of Eigenvalue-based feature detectors
• Ellipse rotates but its shape (i.e. eigenvalues) remains the same
– What if you change the brightness?
Scale invariant interest point detection
• But a Harris detector gives different results at different scales.
– Consider how it is defined: the basic unit is one pixel…
• What is most different between different scales?
– Differences between inside and outside
What can be used to measure it?
• Normally, if there are salient edges around the window/circle at that scale, they can be detected.
• Recall how we detect edges.
Edge detection
• To suppress noise, first smooth the image with a Gaussian kernel.
So we use the response after Laplacian of Gaussian filtering (LoG)
Important: Now we are doing Laplacian filtering on the result of Gaussian filtering!
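A one-dimensional sketch of "Gaussian first, then Laplacian" on a step edge. The kernel radius of 4σ, the edge padding and the discrete Laplacian [1, -2, 1] are assumptions made for this illustration.

```python
import numpy as np

def gaussian_kernel(sigma):
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return g / g.sum()

def log_response_1d(signal, sigma):
    """Smooth with a Gaussian, then apply the discrete Laplacian [1, -2, 1]."""
    g = gaussian_kernel(sigma)
    radius = len(g) // 2
    # Edge padding avoids fake edges at the ends of the signal.
    padded = np.pad(signal.astype(float), radius + 1, mode="edge")
    smoothed = np.convolve(padded, g, mode="valid")                # length n + 2
    return np.convolve(smoothed, [1.0, -2.0, 1.0], mode="valid")   # length n

signal = np.zeros(100)
signal[50:] = 1.0                      # step edge between samples 49 and 50
response = log_response_1d(signal, sigma=3.0)
print(np.argmax(response), np.argmin(response))
```

The response has a positive lobe before the edge and a negative lobe after it, with the zero crossing at the edge itself.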
How do we use the response?
Input image (one dimensional slice)
Laplacian of Gaussian
Edge -> blob
• Keep in mind, we want to find an optimal scale.
– At which scale is the response significantly different?
• We want to find the characteristic scale of the blob by convolving it with Laplacians at several scales and looking for the maximum response
Shapes of LoG
To make the response more meaningful:
• The response of a derivative of Gaussian filter to a perfect step edge decreases as σ increases! (scale normalization)
Recall:
Filtering is linear, so the response at each point is a weighted sum of the input. The area under each lobe of the derivative-of-Gaussian kernel carries a factor of 1/σ, so the response at every point carries a factor of 1/σ.
To keep the response the same (scale-invariant), we must multiply the Gaussian derivative by σ!
So what should the LoG, the second derivative of the Gaussian, be multiplied by? By σ².
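The 1/σ behaviour can be checked numerically. This sketch assumes a sampled derivative-of-Gaussian kernel truncated at 4σ and a unit step edge; the function names are hypothetical.

```python
import numpy as np

def gaussian_derivative_kernel(sigma):
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1, dtype=float)
    g = np.exp(-x ** 2 / (2.0 * sigma ** 2)) / (sigma * np.sqrt(2.0 * np.pi))
    return -x / sigma ** 2 * g          # d/dx of the Gaussian

def peak_edge_response(sigma, n=400):
    """Peak magnitude of the derivative-of-Gaussian response to a unit step edge."""
    step = (np.arange(n) >= n // 2).astype(float)
    # mode="valid" keeps only positions where the kernel fits, so no border artefacts.
    response = np.convolve(step, gaussian_derivative_kernel(sigma), mode="valid")
    return float(np.abs(response).max())

sigmas = [2.0, 4.0, 8.0, 16.0]
raw = [peak_edge_response(s) for s in sigmas]
normalized = [s * r for s, r in zip(sigmas, raw)]
print(raw)
print(normalized)
```

The raw peaks shrink as σ grows, while the σ-multiplied peaks stay close to 1/√(2π), the value expected for a continuous Gaussian.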
LoG in 2D
• At what scale does the Laplacian achieve a maximum response to a binary circle of radius r?
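• To get the maximum response, the zeros of the Laplacian have to be aligned with the circle. Write out the LoG (up to a constant factor):
$\nabla^2 G \propto \frac{x^2 + y^2 - 2\sigma^2}{\sigma^6}\, e^{-(x^2 + y^2)/(2\sigma^2)}$
The zeros lie on the circle $x^2 + y^2 = 2\sigma^2$. Therefore, the maximum response occurs at $\sigma = r/\sqrt{2}$.
[Figure: a binary circle of radius r in the image, aligned with the zero crossing of the Laplacian of Gaussian]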
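A brute-force check of the σ = r/√2 result: sum the scale-normalized LoG over a binary disc and scan σ. This is a sketch; the 1/(2π) normalisation of the Gaussian and the σ grid are assumptions, and the sum over pixels only approximates the continuous integral.

```python
import numpy as np

def normalized_log(x, y, sigma):
    """sigma^2 times the Laplacian of a 2D Gaussian (assumed 1/(2*pi*sigma^2) normalisation)."""
    r2 = x ** 2 + y ** 2
    return (r2 - 2.0 * sigma ** 2) / (2.0 * np.pi * sigma ** 4) * np.exp(-r2 / (2.0 * sigma ** 2))

def center_response(radius, sigma):
    """Response at the centre of a binary disc: sum of the kernel over the disc pixels."""
    n = int(np.ceil(radius))
    y, x = np.mgrid[-n:n + 1, -n:n + 1].astype(float)
    disc = x ** 2 + y ** 2 <= radius ** 2
    return float(np.sum(normalized_log(x, y, sigma)[disc]))

radius = 10.0
sigmas = np.arange(3.0, 12.05, 0.1)
responses = np.array([center_response(radius, s) for s in sigmas])
best_sigma = float(sigmas[np.argmax(np.abs(responses))])
print(best_sigma, radius / np.sqrt(2.0))
```

The scanned optimum should land close to r/√2, up to the σ step and pixel discretisation.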
Characteristic scale
• We define the characteristic scale of a blob as the scale that produces peak of Laplacian response in the blob center
characteristic scale
T. Lindeberg (1998). "Feature detection with automatic scale selection." International Journal of
Computer Vision 30 (2): pp 77--116.
Different σ means different scale
Scale-space blob detector
1. Convolve image with scale-normalized Laplacian at several scales
2. Find maxima of squared Laplacian response in scale-space
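The two steps can be sketched with FFT-based convolution. This is a simplified illustration, not a full detector: it returns only the single strongest response, it assumes circular (wrap-around) borders, and the names `normalized_log_kernel` and `blob_detect` are made up here.

```python
import numpy as np

def normalized_log_kernel(shape, sigma):
    """Scale-normalized LoG sampled on a grid centred at (h // 2, w // 2)."""
    h, w = shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    r2 = (y - h // 2) ** 2 + (x - w // 2) ** 2
    return (r2 - 2.0 * sigma ** 2) / (2.0 * np.pi * sigma ** 4) * np.exp(-r2 / (2.0 * sigma ** 2))

def blob_detect(image, sigmas):
    """Return (sigma, (row, col)) of the largest squared response in scale space."""
    F = np.fft.fft2(image)
    stack = []
    for sigma in sigmas:
        # ifftshift moves the kernel centre to index (0, 0) for circular convolution.
        kernel = np.fft.ifftshift(normalized_log_kernel(image.shape, sigma))
        response = np.real(np.fft.ifft2(F * np.fft.fft2(kernel)))
        stack.append(response ** 2)
    stack = np.array(stack)
    s, row, col = np.unravel_index(np.argmax(stack), stack.shape)
    return float(sigmas[s]), (int(row), int(col))

# Synthetic image: one bright disc of radius 8 centred at (32, 40).
yy, xx = np.mgrid[0:96, 0:96]
image = ((yy - 32) ** 2 + (xx - 40) ** 2 <= 8 ** 2).astype(float)
sigmas = 2.0 * np.sqrt(2.0) ** np.arange(6)   # 2, 2.83, 4, 5.66, 8, 11.3
best_sigma, center = blob_detect(image, sigmas)
print(best_sigma, center)
```

A real detector would keep every local maximum in (scale, row, column) above a threshold rather than just the global one.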
Difference of Gaussian(DoG) and LoG
• Approximating the Laplacian with a difference of Gaussians (DoG):
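Scale-normalized Laplacian: $L = \sigma^2 \left( G_{xx}(x, y, \sigma) + G_{yy}(x, y, \sigma) \right)$
Difference of Gaussians: $DoG = G(x, y, k\sigma) - G(x, y, \sigma)$
DoG is a good approximation of LoG; this can be understood from the heat diffusion equation.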
• The factor (k − 1) in the approximation $G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\,\sigma^2 \nabla^2 G$ is constant over all scales and therefore does not influence the location of extrema
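How good the DoG approximation is can be measured directly from the closed-form Gaussians. A minimal sketch, assuming the standard 2D Gaussian with 1/(2πσ²) normalisation; the error measure (maximum difference relative to the largest LoG value) is a choice made here.

```python
import numpy as np

def gaussian(x, y, sigma):
    return np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2)

def laplacian_of_gaussian(x, y, sigma):
    r2 = x ** 2 + y ** 2
    return (r2 - 2.0 * sigma ** 2) / sigma ** 4 * gaussian(x, y, sigma)

def dog_relative_error(sigma, k, half_size=20):
    """Max |DoG - (k - 1) sigma^2 LoG| relative to max |(k - 1) sigma^2 LoG|."""
    y, x = np.mgrid[-half_size:half_size + 1, -half_size:half_size + 1].astype(float)
    dog = gaussian(x, y, k * sigma) - gaussian(x, y, sigma)
    approx = (k - 1.0) * sigma ** 2 * laplacian_of_gaussian(x, y, sigma)
    return float(np.max(np.abs(dog - approx)) / np.max(np.abs(approx)))

err_small_k = dog_relative_error(sigma=3.0, k=1.01)
err_large_k = dog_relative_error(sigma=3.0, k=1.1)
print(err_small_k, err_large_k)
```

The approximation tightens as k approaches 1, as expected from a first-order expansion in σ.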
DoG for blob detection
Then the final important invariance:
What is invariant after rotation?
Find a dominant orientation and rotate the whole “patch” so that this orientation points in a fixed direction; the rotated patch is then the same regardless of the original rotation.
anticlockwise rotation
How to get dominant orientation?
Gradient magnitude Gradient orientation
Peaks in the orientation histogram correspond to dominant directions of local gradients.
Gradient Orientation Histogram
The method in SIFT:
(SIFT) SCALE INVARIANT FEATURE TRANSFORM by David Lowe
Basic idea to get dominant orientation to align:
• Take a 16x16 square window around the detected interest point (8x8 shown below)
• Compute edge orientation (angle of the gradient minus 90°) for each pixel
• Throw out weak edges (threshold gradient magnitude)
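The window, orientation and threshold steps can be sketched as a magnitude-weighted histogram. Assumptions in this sketch: it bins the gradient orientation itself (the edge orientation is the same angle minus 90°), uses 36 bins of 10°, and the threshold `min_magnitude` is an arbitrary illustrative value.

```python
import numpy as np

def dominant_orientation(patch, n_bins=36, min_magnitude=1e-3):
    """Peak of the magnitude-weighted gradient orientation histogram, in degrees."""
    gy, gx = np.gradient(patch.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx)) % 360.0
    keep = magnitude > min_magnitude            # throw out weak gradients
    hist, edges = np.histogram(orientation[keep], bins=n_bins, range=(0.0, 360.0),
                               weights=magnitude[keep])
    peak = int(np.argmax(hist))
    return 0.5 * (edges[peak] + edges[peak + 1]), hist

# 16x16 test patch: an intensity ramp whose gradient points at 45 degrees.
y, x = np.mgrid[0:16, 0:16].astype(float)
patch = x * np.cos(np.radians(45.0)) + y * np.sin(np.radians(45.0))
angle, hist = dominant_orientation(patch)
print(angle)
```

Rotating the patch by minus this angle would bring it to a standard orientation before a descriptor is computed.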
• Once the dominant orientation is known, rotate the window to a standard orientation
The descriptor in SIFT
Other important questions in SIFT
• How to determine where the feature points are
– Recall what we did in detecting blobs, Harris corner points
Keypoint localization in SIFT
• Use the DoG between Gaussians at different scales
– Find local maxima across scale/space
– Scale space is separated into octaves:
• Octave 1 uses scale σ
• Octave 2 uses scale 2σ
– In each octave, the initial image is repeatedly convolved with Gaussians to produce a set of scale space images.
– Adjacent Gaussians are subtracted to produce the DoG
• After each octave, the Gaussian image is down-sampled by a factor of 2 to start the next octave
• Detect maxima of the difference-of-Gaussian in scale space
• Each point is compared to its 8 neighbors in the current image and 9 neighbors each in the scales above and below
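The neighbor comparison amounts to testing a sample against the other 26 values in its 3x3x3 scale-space cube. A minimal sketch, assuming the DoG images are stacked into one array indexed [scale, row, column]; the function name is hypothetical.

```python
import numpy as np

def is_scale_space_extremum(dog, s, y, x):
    """True if dog[s, y, x] is strictly above or below all 26 neighbours.

    dog is assumed to be indexed [scale, row, col]; (s, y, x) must be interior.
    """
    cube = dog[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    center = cube[1, 1, 1]
    neighbours = np.delete(cube.ravel(), 13)   # flat index 13 is the centre sample
    return bool(center > neighbours.max() or center < neighbours.min())

dog = np.zeros((3, 5, 5))
dog[1, 2, 2] = 1.0
print(is_scale_space_extremum(dog, 1, 2, 2))   # the peak itself
print(is_scale_space_extremum(dog, 1, 2, 3))   # its right-hand neighbour
```

Both maxima and minima count, so a strongly negative sample is accepted as well.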
• Once a keypoint candidate is found, perform a detailed fit to nearby data to determine location, scale, and ratio of principal curvatures
• In initial work keypoints were found at location and scale of a central sample point.
• In newer work, they fit a 3D quadratic function to improve interpolation accuracy.
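One way to realise the 3D quadratic fit is a second-order Taylor expansion built from finite differences, solved for the offset of the extremum. This is a sketch of that idea under stated assumptions (a 3x3x3 cube of DoG samples indexed [scale, row, column]); it is not claimed to match any particular implementation.

```python
import numpy as np

def refine_extremum(cube):
    """Sub-sample offset and value of the extremum of a quadratic fitted to a 3x3x3 cube.

    cube[1, 1, 1] is the candidate; axes are assumed to be (scale, row, col).
    """
    c = cube[1, 1, 1]
    # Gradient by central differences.
    g = 0.5 * np.array([cube[2, 1, 1] - cube[0, 1, 1],
                        cube[1, 2, 1] - cube[1, 0, 1],
                        cube[1, 1, 2] - cube[1, 1, 0]])
    # Hessian by second differences.
    H = np.empty((3, 3))
    H[0, 0] = cube[2, 1, 1] - 2 * c + cube[0, 1, 1]
    H[1, 1] = cube[1, 2, 1] - 2 * c + cube[1, 0, 1]
    H[2, 2] = cube[1, 1, 2] - 2 * c + cube[1, 1, 0]
    H[0, 1] = H[1, 0] = 0.25 * (cube[2, 2, 1] - cube[2, 0, 1] - cube[0, 2, 1] + cube[0, 0, 1])
    H[0, 2] = H[2, 0] = 0.25 * (cube[2, 1, 2] - cube[2, 1, 0] - cube[0, 1, 2] + cube[0, 1, 0])
    H[1, 2] = H[2, 1] = 0.25 * (cube[1, 2, 2] - cube[1, 2, 0] - cube[1, 0, 2] + cube[1, 0, 0])
    offset = -np.linalg.solve(H, g)            # where the fitted quadratic is stationary
    value = c + 0.5 * g @ offset               # interpolated value at that point
    return offset, float(value)

# Test data: an exact quadratic peak at a known sub-sample offset.
A = np.array([[2.0, 0.3, 0.1], [0.3, 1.5, 0.2], [0.1, 0.2, 1.0]])
true_offset = np.array([0.2, -0.3, 0.1])
cube = np.empty((3, 3, 3))
for i in range(3):
    for j in range(3):
        for l in range(3):
            p = np.array([i - 1.0, j - 1.0, l - 1.0]) - true_offset
            cube[i, j, l] = 5.0 - p @ A @ p
offset, value = refine_extremum(cube)
print(offset, value)
```

For an exact quadratic the finite differences are exact, so the known offset and peak value are recovered.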
Different octaves
[Figure: octave 1 and octave 2 arranged along the scale axis]
More steps to pick better feature points
(a) The 233x189 pixel original image.
(b) The initial 832 keypoints locations at maxima and minima of the difference-of-Gaussian function. Keypoints are displayed as vectors indicating scale, orientation, and location.
(c) After applying a threshold on minimum contrast, 729 keypoints remain.
Sum up SIFT: main steps
Overall Procedure at a High Level
1. Scale-space extrema detection
Search over multiple scales and image locations.
2. Keypoint localization
Fit a model to determine location and scale. Select keypoints based on a measure of stability.
3. Orientation assignment
Compute best orientation(s) for each keypoint region.
How to match the SIFT descriptor
• SSD difference on 128 dimensional vectors
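Nearest-neighbour matching by SSD can be sketched in a few lines. The descriptors below are random stand-ins rather than real SIFT output, and the function name `match_descriptors` is an assumption.

```python
import numpy as np

def match_descriptors(desc_a, desc_b):
    """For each row of desc_a, the index of the row of desc_b with the smallest SSD."""
    diff = desc_a[:, None, :] - desc_b[None, :, :]   # shape (len_a, len_b, 128)
    ssd = np.sum(diff ** 2, axis=2)
    return np.argmin(ssd, axis=1), ssd

# Stand-in data: 20 random 128-D descriptors, and a shuffled, slightly noisy copy.
rng = np.random.default_rng(0)
desc_b = rng.random((20, 128))
perm = rng.permutation(20)
desc_a = desc_b[perm] + 0.01 * rng.standard_normal((20, 128))

matches, ssd = match_descriptors(desc_a, desc_b)
print(np.array_equal(matches, perm))
```

With real images many features have no true counterpart, so a plain nearest neighbour is usually combined with some rejection test on the SSD values.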