4. Rectification of Wide Baseline Images
4.3 Uncalibrated Image Rectification using structural congruency
In this section, we will explain the basic idea of the proposed rectification method and state it formally.
4.3.1 Basic Ideas
The basic idea is illustrated in Figure 4-4 and Figure 4-5. Let's say we take two images of a triangle. Because of the wide baseline of the imaging system, we may get two images like those shown in Figure 4-4(a) and (b). Notice that the bottom side of the triangle may look very different in the two views because of perspective distortion.
[Figure 4-4 panels: H maps (a) to (a'); H' maps (b) to (b').]
Figure 4-4. Image rectification with distortion minimized. By applying homographies H and H' to (a) and (b) respectively, we obtain image (a') and (b'). The corresponding pixels are on the same scanline after the operations but it is still very difficult to establish pixel to
The traditional method of rectification minimizes the local distortion introduced by the rectifying homographies, so a possible result may look like Figure 4-4(a') and (b'). The two images are undoubtedly rectified. However, it is very difficult to obtain a fine dense matching for them because the sizes of the triangles in the two images are quite different. We can therefore try applying an affine transformation after the two images are rectified, as shown in Figure 4-5.
[Figure 4-5 panels: H maps (a) to (a'); H' maps (b) to (b'); A maps (a') to (a''), chosen to maximize the shape congruency of (a'') and (b').]
Figure 4-5. Using proposed method, we apply one more affine transformation A on intermediate result (a') to maximize the shape congruency of the scenes and objects in the
We use an affine transformation here because it changes only the x-coordinates of the rectified image: corresponding pixels stay on the same scanlines, and no severe additional distortion is introduced. Using an optimization method, we can find the affine transformation that makes the transformed triangle look as much like the other one as possible.
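To make the scanline-preserving property concrete, here is a minimal sketch. It assumes the affine transformation has the form A = [[a, b, c], [0, 1, 0], [0, 0, 1]]; the parameter names a, b, c and the function name are illustrative and are not taken from the text.

```python
def apply_x_affine(a, b, c, points):
    """Apply A = [[a, b, c], [0, 1, 0], [0, 0, 1]] to a list of (x, y) points.

    Assumed form of A: only x is changed, y (the scanline) is left as is.
    """
    return [(a * x + b * y + c, y) for (x, y) in points]


# A triangle in the rectified image, and a hypothetical choice of (a, b, c).
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
warped = apply_x_affine(0.5, 0.25, 2.0, triangle)

# Every vertex keeps its y-coordinate, so rectification is preserved.
same_scanlines = all(p[1] == q[1] for p, q in zip(triangle, warped))
```

Because the second and third rows of A are those of the identity, any choice of (a, b, c) with a nonzero a keeps the pair rectified; the optimization only has to decide how to scale, shear and shift along x.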
4.3.2 The Delaunay Triangulation
Based on the previous analysis, it is natural to use the feature points detected in the input images to generate a network of control triangles, and then to consider the net shape difference over the triangle network. Fortunately, there is a readily available tool for generating such a network: the Delaunay triangulation. This is a well-developed research topic in computational geometry; for the details of how to construct a triangle network from a point set, see [Ber-08]. For a sample Delaunay triangle network, see Figure 4-6. Each triangle is identified by the list of its vertices. For example, in Figure 4-6 the triangles marked by yellow deltas are (62,8,21) and (87,20,62).
Figure 4-6. The Delaunay triangle network generated using the feature points detected on an image of a silo.
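As an illustration of how a triangle network can be stored as a list of vertex-index triples, the following sketch builds a Delaunay triangulation by brute force using the empty-circumcircle property. It is not one of the efficient constructions described in [Ber-08]; it assumes a small point set in general position (no four points on a common circle), and all function names are hypothetical.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, d):
    """True if d lies strictly inside the circumcircle of the CCW triangle abc."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        rows.append((dx, dy, dx * dx + dy * dy))
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    return det > 0


def delaunay_brute_force(points):
    """Return the triangles (i, j, k) whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        o = orient(a, b, c)
        if o == 0:
            continue  # collinear triple, not a triangle
        if o < 0:
            b, c = c, b  # reorder to counter-clockwise for the in-circle test
        others = (m for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, points[m]) for m in others):
            triangles.append((i, j, k))
    return triangles


points = [(0, 0), (2, 0), (0, 2), (3, 3)]
triangles = delaunay_brute_force(points)  # -> [(0, 1, 2), (1, 2, 3)]
```

The returned list plays the role of the triangle lookup table: each entry names a triangle by the indices of its three feature points, in the same way as (62,8,21) in Figure 4-6. The brute-force test costs O(n^4) and is only meant to make the definition concrete.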
4.3.3 The Proposed Algorithm for Wide Baseline Stereo Rectification
Based on the discussion in the previous section, it is clear that epipolar rectification and distortion compensation are needed before dense matching can be applied to wide baseline stereo image pairs. Epipolar rectification simplifies the dense matching task by reducing the 2D search to a 1D search, while distortion compensation compensates for the perspective distortion and makes the two views look similar in shape and pose. We define wide baseline image rectification as "epipolar rectification with shape difference minimization". We now propose a new algorithm that combines epipolar rectification and shape distortion compensation. The epipolar geometry of a wide baseline stereo system is illustrated in Figure 4-7.
Figure 4-7. The epipolar geometry
To rectify the epipolar lines, two homographies $H$ and $H'$ can be applied to $I_1$ and $I_2$, respectively. To make the conjugate epipolar lines ($\overline{\mathbf{m}\mathbf{e}}$ and $\overline{\mathbf{m}'\mathbf{e}'}$ in Figure 4-7) parallel to the x-axis and collinear, the equalities $H\mathbf{e} = (1,0,0)^T$ and $H'\mathbf{e}' = (1,0,0)^T$ are necessary. The fundamental matrix for the rectified stereo pair is given by

$$\bar{F} = [H'\mathbf{e}']_\times = [(1,0,0)^T]_\times,$$

which is a $3 \times 3$ skew-symmetric matrix. A major constraint on the two rectifying homographies is given by

$$F = H'^T \bar{F} H, \quad \text{where} \quad \bar{F} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}.$$

As usual for homogeneous quantities, these equalities hold up to a nonzero scale factor.
Because the first row and first column of the rectified fundamental matrix are zero, this equation imposes no constraint on the first rows of H and H'. This leaves degrees of freedom in choosing H and H' that can be used to meet additional objectives.
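The freedom in the first rows can be checked numerically. The sketch below assumes the constraint has the standard form F = H'^T F_bar H, with F_bar the rectified fundamental matrix given above; the matrices H and H_prime are arbitrary illustrative values, not homographies computed from real images.

```python
def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A):
    return [list(row) for row in zip(*A)]


# Rectified fundamental matrix as given in the text.
F_BAR = [[0, 0, 0],
         [0, 0, 1],
         [0, -1, 0]]


def induced_F(H, H_prime):
    """F = H'^T * F_bar * H (assumed form of the rectification constraint)."""
    return matmul(matmul(transpose(H_prime), F_BAR), H)


# Arbitrary integer matrices standing in for H and H'.
H = [[1, 2, 3], [0, 1, 4], [2, 0, 1]]
H_prime = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

# Replace only the first rows with unrelated values.
H_alt = [[7, -5, 9]] + H[1:]
H_prime_alt = [[-4, 8, 6]] + H_prime[1:]

same = induced_F(H, H_prime) == induced_F(H_alt, H_prime_alt)  # True
```

The induced F is unchanged, so the first rows can be tuned freely, for example to reduce distortion or, as proposed here, to improve structural congruency.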
The outline of the proposed rectification method with maximized structural congruency is as follows:
Proposed Algorithm for Wide Baseline Stereo Rectification
1. Detect and match two sets of $n$ feature points, $\{\mathbf{m}_i\} \leftrightarrow \{\mathbf{m}'_i\}$, $i = 1, 2, \ldots, n$, where $\mathbf{m}_i \in I_1$ and $\mathbf{m}'_i \in I_2$.
2. Estimate the fundamental matrix $F$ using the matched point sets. Solve for the epipoles $\mathbf{e}$ and $\mathbf{e}'$.
3. Apply a homography $H'$ to $I_2$ to get the image $\bar{I}_2$, such that the epipole $\mathbf{e}'$ is sent to infinity.
4. Generate the 2D triangulation net $N_2$ for $\bar{I}_2$ using the vertices $\{H'\mathbf{m}'_i\}$, $i = 1, 2, \ldots, n$, and construct a lookup table of the triangles of the net.
5. Apply a compatible homography $H_0$ to $I_1$ such that $H_0\mathbf{e} = (1,0,0)^T$.
6. Apply an optimized affine transformation $A$ to $H_0 I_1$ to maximize the structural congruency between the triangulation nets.
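Step 2 asks for the epipoles, which are the null vectors of F and of its transpose. The sketch below is one simple way to obtain them for an exactly rank-2 matrix, using cross products of rows; with noisy estimates a singular value decomposition is the more usual tool. The function names and the example matrix are illustrative only.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def null_vector(rows):
    """Vector orthogonal to every row of a rank-2 3x3 matrix.

    The cross product of any two independent rows works; the pair with the
    largest cross product is used for numerical robustness.
    """
    candidates = [cross(rows[i], rows[j]) for i, j in ((0, 1), (0, 2), (1, 2))]
    return max(candidates, key=lambda v: sum(c * c for c in v))


def epipoles(F):
    """Return (e, e_prime) with F e = 0 and F^T e_prime = 0, up to scale."""
    e = null_vector(F)
    e_prime = null_vector([list(col) for col in zip(*F)])
    return e, e_prime


# A rank-2 example matrix built by hand: e is proportional to (1, 1, 3)
# and e_prime to (1, 2, 3).
F = [[0, -6, 2],
     [3, 0, -1],
     [-2, 2, 0]]
e, e_prime = epipoles(F)
```

Both vectors are homogeneous, so only their direction matters; dividing by the third component gives pixel coordinates when that component is nonzero.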
Essentially, image $I_2$ is transformed by a quasi-rigid transformation to $\bar{I}_2$. The inherent structure of $\bar{I}_2$ is represented by the triangulation net $N_2$, with the feature points as its vertices. $N_2$ is then used as the reference for the rectified structure. Finally, the homography applied to $I_1$ is optimized to maximize the structural congruency between the rectified images $\bar{I}_1$ and $\bar{I}_2$.
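To give a feel for the final optimization, the sketch below fits an x-only affine transformation by least squares so that transformed vertices line up with the x-coordinates of the reference vertices. This x-distance criterion is a simple stand-in in the spirit of [Har-99]; it is an assumption made for illustration and is not necessarily the structural congruency measure used by the proposed method. All names are hypothetical.

```python
def det3(M):
    """Determinant of a 3x3 matrix stored as nested lists."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def fit_x_affine(src, ref_x):
    """Least-squares (a, b, c) minimizing sum((a*x + b*y + c - x_ref)**2).

    src   : vertices (x, y) of the image to be adjusted
    ref_x : x-coordinates of the matching vertices in the reference net
    """
    # Accumulate the 3x3 normal equations M p = r for the basis (x, y, 1).
    M = [[0.0] * 3 for _ in range(3)]
    r = [0.0] * 3
    for (x, y), xr in zip(src, ref_x):
        basis = (x, y, 1.0)
        for i in range(3):
            r[i] += basis[i] * xr
            for j in range(3):
                M[i][j] += basis[i] * basis[j]
    d = det3(M)
    if abs(d) < 1e-12:
        raise ValueError("degenerate configuration (collinear vertices)")
    # Solve by Cramer's rule, which is adequate for a 3x3 system.
    params = []
    for col in range(3):
        Mc = [row[:] for row in M]
        for i in range(3):
            Mc[i][col] = r[i]
        params.append(det3(Mc) / d)
    return tuple(params)


# Synthetic check: reference x-coordinates generated with a=0.5, b=0.25, c=2.
src = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0), (5.0, 2.0)]
ref_x = [0.5 * x + 0.25 * y + 2.0 for (x, y) in src]
a, b, c = fit_x_affine(src, ref_x)
```

A shape-based objective defined over the triangles of the net would replace the sum of squared x-differences here, but the three free parameters being optimized stay the same.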
Typically, we can choose $H'$ and $H_0$ such that Equation (4-3) is satisfied. As a first step in our exploration, we adopt the choices described in [Har-99] and [Har-03] in our experiments.
$H'$ is the composition of three sequential operations. First, the origin of the image coordinate system is moved to the center of $I_2$ by a translation matrix $T$. Next, a rotation $R_\varphi$ is applied to bring the epipole $\mathbf{e}' = (e'_x, e'_y, 1)^T$ onto the x-axis, with rotation angle $\varphi = -\tan^{-1}(e'_y / e'_x)$. Finally, a quasi-rigid projective transformation $G$ is applied to send $\bar{\mathbf{e}}' = (\bar{e}'_x, 0, 1)^T$ to $(1,0,0)^T$. Equivalently, $H' = G R_\varphi T$, where

$$G = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1/\bar{e}'_x & 0 & 1 \end{pmatrix}$$
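The three-step construction of H' described above (translation, rotation, then the projective map G) can be checked with a short sketch. It assumes a finite epipole that does not coincide with the image center, takes the epipole coordinates after the translation, and uses atan2 rather than a plain arctangent so that the rotated epipole always lands on the positive x-axis. The function names and the numeric values are illustrative only.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(H, p):
    """Apply a 3x3 matrix to a homogeneous 3-vector."""
    return [sum(H[i][k] * p[k] for k in range(3)) for i in range(3)]


def rectifying_homography(e_prime, center):
    """Build H' = G * R * T sending the epipole e_prime to (1, 0, 0)^T up to scale."""
    cx, cy = center
    # T: move the origin to the image center.
    T = [[1.0, 0.0, -cx], [0.0, 1.0, -cy], [0.0, 0.0, 1.0]]
    ex, ey, w = apply(T, e_prime)
    ex, ey = ex / w, ey / w  # assumes a finite epipole (w != 0)
    # R: rotate by phi = -atan2(ey, ex) so the epipole lies on the x-axis.
    phi = -math.atan2(ey, ex)
    R = [[math.cos(phi), -math.sin(phi), 0.0],
         [math.sin(phi), math.cos(phi), 0.0],
         [0.0, 0.0, 1.0]]
    # After the rotation the epipole is (ex_rot, 0, 1) with ex_rot > 0.
    ex_rot = math.hypot(ex, ey)
    # G: send (ex_rot, 0, 1) to a point at infinity on the x-axis.
    G = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0 / ex_rot, 0.0, 1.0]]
    return matmul(G, matmul(R, T))


# Hypothetical epipole (in pixels) and image center for a 640x480 image.
H_prime = rectifying_homography([700.0, 300.0, 1.0], (320.0, 240.0))
mapped = apply(H_prime, [700.0, 300.0, 1.0])
# mapped is proportional to (1, 0, 0): y and w vanish up to rounding.
```

The image center itself is mapped to the origin with third component 1, which reflects the quasi-rigid behavior of G near the center.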