4. VISUAL NAVIGATION USING HETEROGENEOUS LANDMARKS AND
4.3 System Design and Multilayer Feature Graph
4.3.3 Camera Pose Estimation
Figure 4.6: An example of our two-stage approach to ideal line matching. (a) and (b): Ideal line matches found in Stage 1 by the PCLM method; each pair of matched lines is plotted in the same color, and small circles represent the point correspondences used by PCLM. (c) and (d): Additional matches found by the FGLM method in Stage 2. (Best viewed in color)
With 2D feature correspondences obtained, estimating the 6 degrees of freedom (DoF) camera pose $R_k$ and $t_k$ for $I_k$ is a key step in constructing and updating the 3D MFG. Existing methods (e.g. [59]) usually solve this problem with a 3-point algorithm based on the 3D-2D correspondences $\{P_i \leftrightarrow p_i(k)\}$ between known 3D points and their observations in $I_k$. This approach omits the 2D-2D correspondences between $I_{k-1}$ and $I_k$ whose 3D positions are not yet known, which leads to large estimation errors when few 3D points are observed. Various approaches exist to handle this issue, e.g. using Kalman filtering [79] or three-view constraints [80]. A good fit for our system is the method proposed by Tardif et al. [67], which decouples the estimation of $R_k$ from that of $t_k$ in two steps. We adopt this method with the following modification.
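Step 1: Estimate the essential matrix $E$ from the 2D-2D correspondences between $I_{k-1}$ and $I_k$, and decompose $E$ to recover the relative camera rotation $R^{k-1}_k$ and translation $t^{k-1}_k$, with $\|t^{k-1}_k\|$ unknown.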
Step 2: Compute the translation distance $\|t^{k-1}_k\|$ using 3D-2D correspondences through a RANSAC process in which a single correspondence suffices for a minimal solution. This completes the 6 DoF estimation.
In Step 2 of [67], Tardif et al. estimate all 3 DoFs of $t^{k-1}_k$, using two 3D-2D correspondences for a minimal solution. This difference is justified by the different cameras used: an omnidirectional camera with a 360° horizontal field of view (HFOV) in [67] vs. a regular camera with a 40°-80° HFOV in our case. A narrower HFOV results in fewer observable 3D landmarks in view and thus fewer 3D-2D correspondences, especially in a turning situation. Therefore, we choose to reduce the problem dimension in Step 2 to fit our needs.
It is worth noting that when $k = 1$, we do not need Step 2, but set $\|t^{k-1}_k\| = 1$. This fixes the scale of the subsequent estimations.
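As an illustration of Step 2, the following is a minimal sketch of a one-point RANSAC for the translation scale. It is not the implementation used in this work. It assumes that a 3D point $P$ expressed in frame $k-1$ maps to frame $k$ as $R P + s\,\hat{t}$, where $\hat{t}$ is the unit translation direction from Step 1 and $s = \|t^{k-1}_k\|$, and that observations are given in normalized homogeneous image coordinates. The function names, the inlier threshold `thresh`, and the iteration count `iters` are hypothetical.

```python
import numpy as np

def scale_from_matches(R, t_dir, P, x):
    """Least-squares scale s such that each x is parallel to R @ P + s * t_dir.

    P: (n, 3) 3D points in frame k-1 (assumed convention).
    x: (n, 3) homogeneous normalized observations in frame k.
    A single match (n = 1) is a minimal sample.
    """
    a = np.cross(x, t_dir)      # x cross t_dir
    b = np.cross(x, P @ R.T)    # x cross (R P)
    denom = np.sum(a * a)
    # x cross (R P + s t_dir) = 0  =>  b + s a = 0  =>  s = -(a.b)/(a.a)
    return None if denom < 1e-12 else -np.sum(a * b) / denom

def reprojection_error(R, t_dir, s, P, x):
    """Distance in normalized image coordinates; inf for points behind the camera."""
    Y = P @ R.T + s * t_dir
    err = np.full(len(P), np.inf)
    front = Y[:, 2] > 1e-9
    err[front] = np.linalg.norm(
        Y[front, :2] / Y[front, 2:3] - x[front, :2] / x[front, 2:3], axis=1)
    return err

def ransac_scale(R, t_dir, P, x, thresh=1e-3, iters=100, seed=0):
    """One-point RANSAC: each hypothesis comes from a single 3D-2D match."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(P), dtype=bool)
    for _ in range(iters):
        i = rng.integers(len(P))
        s = scale_from_matches(R, t_dir, P[i:i + 1], x[i:i + 1])
        if s is None or s <= 0:
            continue
        inliers = reprojection_error(R, t_dir, s, P, x) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    if not best.any():
        return None, best
    # Refit the scale on all inliers of the best hypothesis.
    return scale_from_matches(R, t_dir, P[best], x[best]), best
```

Because the unknown is a single scalar, one correspondence already yields a hypothesis, which keeps the number of RANSAC iterations small even when few 3D landmarks are visible.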
4.3.4 3D MFG Update
We initialize $M_k$ by letting $M_k = M_{k-1}$ and then perform the 3D MFG update for $M_k$ using the 2D information just obtained. This is a process of associating 2D features in $m_k$ with 3D landmarks in $M_k$ and introducing new 3D landmarks into $M_k$. We present the details for each type of landmark as follows.
4.3.4.1 Key Point Update
Key point update involves associating image observations to existing 3D key points, and establishing new 3D key points using new 2D key point correspondences (see Boxes 4.1-4.6 in Fig. 4.3).
A 2D point correspondence must have sufficient parallax to be used for computing a 3D point. Here we define the parallax of a 2D key point correspondence $p_{i,k_1} \leftrightarrow p_{j,k_2}$ as
\[ \rho(p_{i,k_1}, p_{j,k_2}) := \langle K^{-1} H_r\, p_{i,k_1},\; K^{-1} p_{j,k_2} \rangle \tag{4.6} \]
with
\[ H_r = K R^{k_1}_{k_2} K^{-1} \tag{4.7} \]
where $H_r$ represents a rotational homography [45], $\langle \cdot, \cdot \rangle$ indicates the angle between two vectors, and $R^{k_1}_{k_2}$ has been computed in Section 4.3.3.
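The parallax of (4.6)-(4.7) can be sketched in a few lines. This is an illustrative example only: it assumes that `K` is the 3x3 camera calibration matrix, that `R` rotates directions from the camera frame at $k_1$ into the frame at $k_2$, and that points are homogeneous pixel coordinates with positive scale. The function names are hypothetical.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between two 3-vectors (numerically stable form)."""
    return np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v))

def rotational_homography(K, R):
    """H_r = K R K^{-1}, as in (4.7)."""
    return K @ R @ np.linalg.inv(K)

def point_parallax(K, R, p1, p2):
    """Parallax of p1 <-> p2 as in (4.6): the angle that remains between the
    two viewing rays after compensating for the relative rotation R."""
    K_inv = np.linalg.inv(K)
    Hr = rotational_homography(K, R)
    return angle_between(K_inv @ Hr @ p1, K_inv @ p2)
```

Under these assumptions, a pure camera rotation gives zero parallax regardless of scene depth, so only the translational part of the motion contributes to the measure.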
For a 2D key point correspondence $p_{i,k-1} \leftrightarrow p_{j,k}$,
• if it is a re-observation of key point $P_\iota$, we make the association by letting $p_\iota(k) = p_{j,k}$.
• if it is a newly discovered point, we compute its parallax $\rho(p_{i,k-1}, p_{j,k})$ using (4.6). If $\rho(p_{i,k-1}, p_{j,k}) > \tau_\rho$, where $\tau_\rho$ is a parallax threshold, we triangulate it and add the 3D point to $M_k$ as a new key point. Otherwise, we set up a new 2D key point track $Q_q = \{p_{i,k-1}, p_{j,k}\}$ to keep track of it for potential triangulation in the future. A 2D key point track is a collection of 2D key points corresponding to a 3D point whose position has not been computed yet due to insufficient parallax.
• if it is an observation of an existing 2D key point track $Q_q$, we append it to the track, $Q_q = Q_q \cup \{p_{j,k}\}$, and check whether $Q_q$ can be converted to a 3D key point. To do this, we compute the parallax between $p_{j,k}$ and each of the other points in $Q_q$. If any of them is larger than $\tau_\rho$, we compute a 3D point from all points in $Q_q$ and add it to $M_k$; $Q_q$ is then deleted.
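The third case above (extending a 2D key point track and promoting it once parallax suffices) can be sketched as follows. This is a simplified illustration: `update_track` is a hypothetical helper, and the `parallax` and `triangulate` callables are assumed to be supplied by the caller (for instance, a parallax measure in the spirit of (4.6) and any multi-view triangulation routine).

```python
from typing import Any, Callable, List, Optional

def update_track(track: List[Any],
                 obs: Any,
                 parallax: Callable[[Any, Any], float],
                 tau: float,
                 triangulate: Callable[[List[Any]], Any]) -> Optional[Any]:
    """Append a new observation to a 2D key point track.

    Returns a 3D point if the new observation has parallax larger than tau
    with respect to at least one earlier observation in the track, and
    None otherwise. The caller is expected to delete the track once a
    3D point has been returned.
    """
    earlier = list(track)
    track.append(obs)
    if any(parallax(q, obs) > tau for q in earlier):
        # All observations in the track contribute to the 3D point.
        return triangulate(track)
    return None
```

A toy usage with scalar stand-ins for observations: with `parallax=lambda a, b: abs(a - b)` and `tau=0.5`, appending `0.2` to the track `[0.0, 0.1]` returns `None`, while a later observation `0.9` exceeds the threshold against `0.0` and triggers `triangulate` on the full track.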
4.3.4.2 Vanishing Point Update
Vanishing point update is straightforward (see Boxes 5.1-5.2 in Fig. 4.3). Given a 2D vanishing point $v_{i,k}$, if it is a re-observation of an existing $V_j$, let $v_j(k) = v_{i,k}$. Otherwise, establish a new vanishing point node $V_j = [v_{i,k}^{\mathsf T} R_k,\ 0]^{\mathsf T}$. It is trivial but important to update the edges between ideal lines and vanishing points whenever a new ideal line or vanishing point node is added.
4.3.4.3 Ideal Line Update
Before presenting the ideal line update algorithm, we need to define the parallax for ideal lines. Generally speaking, parallax has not been clearly defined for lines. Here we propose a heuristic parallax measurement for ideal lines by leveraging their line segment endpoints. For a 2D ideal line correspondence $l_{i,k_1} \leftrightarrow l_{j,k_2}$, define
\[ \varrho(l_{i,k_1}, l_{j,k_2}) := \frac{1}{n} \sum_{\iota=1}^{n} \rho(d_{\iota,k_1}, d^{+}_{\iota,k_2}) \tag{4.8} \]
where $\{d_{\iota,k_1} \mid \iota = 1, \cdots, n\}$ denotes the endpoints of the line segments that support $l_{i,k_1}$, and $d^{+}_{\iota,k_2}$ is the perpendicular foot of $d'_{\iota,k_2} := H_r d_{\iota,k_1}$ on $l_{j,k_2}$ in $I_{k_2}$, as illustrated in Fig. 4.7.
Figure 4.7: Illustration of parallax computation for 2D ideal lines. $H_r$ is a rotational homography defined in (4.7). Bold lines are supporting line segments of the underlying (thin) ideal line. $\rho(d_{\iota,k_1}, d^{+}_{\iota,k_2})$ is the parallax between points $d_{\iota,k_1}$ and $d^{+}_{\iota,k_2}$.
The rationale is that we want to reward line correspondences that are farther apart in their perpendicular direction. If $l'_{i,k_2} := H_r^{-\mathsf T} l_{i,k_1}$ overlaps with $l_{j,k_2}$, their parallax should be zero.
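The line parallax of (4.8) can be sketched as below. This is an illustrative example under stated assumptions: `K` is the calibration matrix, `R` rotates directions from frame $k_1$ to frame $k_2$, endpoints are homogeneous pixel coordinates, and a 2D line is a homogeneous 3-vector $(a, b, c)$ with $ax + by + c = 0$. The function names are hypothetical.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between two 3-vectors."""
    return np.arctan2(np.linalg.norm(np.cross(u, v)), np.dot(u, v))

def perpendicular_foot(d, line):
    """Foot of the perpendicular from homogeneous point d onto line (a, b, c)."""
    a, b, c = line
    x, y = d[0] / d[2], d[1] / d[2]
    k = (a * x + b * y + c) / (a * a + b * b)
    return np.array([x - k * a, y - k * b, 1.0])

def line_parallax(K, R, endpoints, line2):
    """Average point parallax between the rotation-compensated endpoints of
    the line in image k1 and their perpendicular feet on line2 in image k2."""
    K_inv = np.linalg.inv(K)
    Hr = K @ R @ K_inv                 # rotational homography, as in (4.7)
    total = 0.0
    for d in endpoints:
        d_rot = Hr @ d                 # d' = H_r d
        d_rot = d_rot / d_rot[2]       # fix the homogeneous scale (assumes w > 0)
        d_foot = perpendicular_foot(d_rot, line2)   # d^+
        total += angle_between(K_inv @ d_rot, K_inv @ d_foot)
    return total / len(endpoints)
```

When `line2` equals $H_r^{-\mathsf T}$ applied to the line in image $k_1$, every rotated endpoint already lies on `line2`, so each perpendicular foot coincides with its point and the measure is zero, in line with the rationale above.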
With the parallax defined, the ideal line update is performed in a similar fashion to the key point case (i.e. Boxes 4.1-4.6 in Fig. 4.3), and thus skipped here.
Remark 1. 3D Line segments are also updated in this process. Since a line segment always has an ideal line parent, when a 2D ideal line is converted to 3D, its associated line segments are also converted to 3D. Their 3D positions are computed based on the 3D ideal line parameters.
4.3.4.4 Primary Plane Update
Detecting primary planes is of great importance for robot navigation. Here we detect primary planes by finding coplanar 3D key points and ideal lines using RANSAC. To be specific, let C be a collection of 3D key points and ideal lines which are not yet associated with any primary plane. We briefly describe two key steps of RANSAC below.
1. Compute a plane candidate Γ from a minimal solution set, which could include either 3 key points, or 2 parallel ideal lines, or 1 key point plus 1 ideal line.
2. $\forall c \in C$, compute a consensus score $f(c, \Gamma)$ as follows:
\[ f(c, \Gamma) = \begin{cases} \delta_\perp(c, \Gamma) & \text{if } c \text{ is a key point} \\ \frac{1}{n} \sum_{i=1}^{n} \delta_\perp(D_i, \Gamma) & \text{if } c \text{ is an ideal line} \end{cases} \tag{4.9} \]
where $\delta_\perp(\cdot, \cdot)$ denotes the perpendicular distance from a point to a plane in 3D, and $\{D_i \mid i = 1, \cdots, n\}$ is the set of 3D endpoints associated with ideal line $c$. Therefore, if $c$ is an ideal line, $f(c, \Gamma)$ is the average of the distances from its associated line segment endpoints to $\Gamma$.
If the size of the largest consensus set is greater than a threshold $N_{cp}$, we add the corresponding plane candidate to $M_k$ as a primary plane, and establish edges between it and the key points and ideal lines in the consensus set. To control the problem size, we do not include all 3D key points or ideal lines in $C$. Instead, we only take into account recently established landmarks. Here we enforce $|C| \le 450$.
Moreover, when new 3D key points or ideal lines are established, we check if they belong to existing primary planes using the metric defined by (4.9) and add edges accordingly. An ideal line may have two parent primary planes if it is a boundary line.
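The plane detection described in this subsection can be sketched as follows. This is a simplified illustration, not the implementation used in this work: minimal samples are drawn from 3 key points only (the text also allows 2 parallel ideal lines, or 1 key point plus 1 ideal line), each ideal line is represented by an array of its 3D segment endpoints, and the names `dist_thresh`, `min_size` (playing the role of $N_{cp}$), `iters`, and `seed` are hypothetical parameters.

```python
import numpy as np

def plane_from_points(p0, p1, p2):
    """Plane (n, d) with unit normal n and n.x + d = 0; None if degenerate."""
    n = np.cross(p1 - p0, p2 - p0)
    norm = np.linalg.norm(n)
    if norm < 1e-9:
        return None                     # collinear sample
    n = n / norm
    return n, -float(n @ p0)

def consensus_score(c, plane):
    """Score in the spirit of (4.9): point-to-plane distance for a key point
    of shape (3,), or the mean endpoint distance for an ideal line given as
    an (m, 3) array of its 3D endpoints."""
    n, d = plane
    pts = np.atleast_2d(c)
    return float(np.mean(np.abs(pts @ n + d)))

def detect_primary_plane(key_points, ideal_lines, dist_thresh, min_size,
                         iters=200, seed=0):
    """RANSAC over 3-key-point samples; returns (plane, consensus indices).

    Indices refer to the concatenation key_points + ideal_lines."""
    rng = np.random.default_rng(seed)
    candidates = list(key_points) + list(ideal_lines)
    best_plane, best_set = None, []
    for _ in range(iters):
        idx = rng.choice(len(key_points), size=3, replace=False)
        plane = plane_from_points(*(key_points[i] for i in idx))
        if plane is None:
            continue
        consensus = [i for i, c in enumerate(candidates)
                     if consensus_score(c, plane) < dist_thresh]
        if len(consensus) > len(best_set):
            best_plane, best_set = plane, consensus
    # Accept only if the largest consensus set exceeds the size threshold.
    if len(best_set) > min_size:
        return best_plane, best_set
    return None, []
```

The same `consensus_score` could also serve to test whether a newly established key point or ideal line belongs to an existing primary plane.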