3.2 Direct methods
4.1.2 The ESM for vision-based control
4.1.2.1 Improving image-based visual servoing schemes
Similarly to vision-based estimation methods, most vision-based control approaches are based on the extraction of visual features [Espiau 92, Cervera 03]. Let me consider, for example, interest points as visual features. The standard approach consists in building, from the selected features, a task function ε that is diffeomorphic to the camera pose. We can then use the standard control laws described in the previous section to regulate the task function to zero. This approach generally works very well if the starting pose of the camera is not too far from the reference pose. Otherwise, the behavior of the robot in the Cartesian space may not be satisfactory. For example, it is now well known [Chaumette 98] that if the initial camera displacement is a rotation around the $\vec{z}$ axis, standard control laws induce an undesirable motion. If we use the Gauss-Newton control law (the interaction matrix is updated at each iteration), the camera moves backward while rotating. On the other hand, if we use the Efficient Gauss-Newton control law (the interaction matrix is constant and computed at the equilibrium), the camera moves forward while rotating. Figure 4.1 illustrates these two problems. The images show the isolines of the cost function projected into the subspace ($\vec{t}_z$, $\vec{r}_z$). We repeated several simulations with an increasing initial rotation. Since the initial displacement is a pure rotation, the ideal path (in the Cartesian space) should be a straight line perpendicular to the $\vec{t}_z$ axis (i.e. $t_z = 0$). On the contrary, we observe that as $r_z$ approaches ±π the translational motion becomes larger, since the isolines become perpendicular to the $\vec{t}_z$ axis (i.e. the steepest descent direction is along $\vec{t}_z$).
These problems can be solved by using the ESM scheme presented in Chapter 2. The ESM control law is:
$$ v = -\left(\tfrac{1}{2}\big(L(\eta, x) + L(\eta, 0)\big)\right)^{-1} f(x) $$
Figure 4.1 shows that, using this control law, the camera performs a pure rotation around the $\vec{z}$ axis. The benefits of using the ESM control law are not limited to the behavior in this particular case, and a more detailed discussion can be found in [Malis 04a]. Note that around the equilibrium point the three control laws have the same behavior. Indeed, if x ≈ 0 then $L(\eta, x) \approx L(\eta, 0) \approx \tfrac{1}{2}\big(L(\eta, x) + L(\eta, 0)\big)$. Thus, the stability and robustness analysis presented in the next section applies to all of them.
Fig. 4.1 – [Panels: Efficient Gauss-Newton, Gauss-Newton, ESM.]
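The difference between the three control laws for a pure rotation around the optical axis can be checked numerically. The following is a minimal sketch under stated assumptions, not the implementation used for Figure 4.1: it assumes the classical interaction matrix of normalized image points, a fronto-parallel target at constant depth Z, a task function f(x) equal to the stacked difference between current and reference point coordinates, a unit gain, and a pseudo-inverse in place of the inverse because the stacked matrix is not square. The names `interaction_matrix` and `first_step_velocities` and the 3 × 3 grid of points are hypothetical choices made for illustration.

```python
import numpy as np


def interaction_matrix(points, depths):
    """Stack the classical 2x6 interaction matrices of normalized image points."""
    rows = []
    for (x, y), Z in zip(points, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x])
    return np.array(rows)


def first_step_velocities(theta, Z=1.0):
    """Velocity (vx, vy, vz, wx, wy, wz) commanded by each law at the first
    iteration, when the image points are rotated by theta around the z axis."""
    grid = np.linspace(-0.2, 0.2, 3)
    ref = np.array([(x, y) for x in grid for y in grid])  # reference points
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    cur = ref @ rot.T                    # current points (rotated in the image)
    depths = np.full(len(ref), Z)        # a rotation around z keeps the depths

    f = (cur - ref).ravel()              # task function: x1, y1, x2, y2, ...
    L_cur = interaction_matrix(cur, depths)  # L(eta, x), updated matrix
    L_ref = interaction_matrix(ref, depths)  # L(eta, 0), matrix at equilibrium

    laws = {
        "gauss_newton": L_cur,
        "efficient_gauss_newton": L_ref,
        "esm": 0.5 * (L_cur + L_ref),
    }
    return {name: -np.linalg.pinv(L) @ f for name, L in laws.items()}


if __name__ == "__main__":
    for name, v in first_step_velocities(np.pi / 3).items():
        print(f"{name:24s} vz = {v[2]: .4f}   wz = {v[5]: .4f}")
```

In this configuration the translational component along the optical axis is negative for Gauss-Newton (the camera backs away), positive for Efficient Gauss-Newton (the camera advances), and zero up to rounding for ESM, which commands a pure rotation around the z axis.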
4.1.2.2 Stability and robustness analysis
In an image-based visual servoing approach, the task function can be computed directly from image data (e.g. the features extracted in the image). On the other hand, the interaction matrix L(η, x), which links the derivative of the task function to the velocity of the camera, depends on the camera intrinsic parameters and may also depend on some information about the structure of the target (all these parameters are gathered in the vector η). For example, when the selected features are interest points, the interaction matrix depends on the depths (the Z coordinates) of the corresponding 3D points. A good estimate of the interaction matrix is necessary in order to build a stable control law. It was observed experimentally that a rough estimate $\hat{\eta}$ of the parameters was sufficient for a stable control:
$$ v = -\left(\tfrac{1}{2}\big(L(\hat{\eta}, x) + L(\hat{\eta}, 0)\big)\right)^{-1} f(x) $$
However, it is important to understand theoretically how large the estimation error $\|\hat{\eta} - \eta\|$ on the parameters can be while the control law remains stable. Due to the complexity of the theoretical analysis, very few results have been reported in the literature. Results have been obtained only in a few simple special cases [Espiau 93] [Cheah 98] [Deng 02], often considering a simplified camera model and always supposing that the 3D structure is perfectly estimated. We have studied the robustness of standard image-based visual servoing control laws with respect to uncertainties on the structure of the target [Malis 03a] [Malis 02a]. We proved theoretically that even small errors on the depths may lead to unstable control laws [Malis 03c]. The proof has been extended to any central camera in [Mezouar 04]. We provided not only necessary and sufficient conditions for local stability but also sufficient conditions that can be tested more easily. From these conditions we can measure the "size" of the admissible errors on the depths.
Figure 4.2 illustrates the results of the theoretical analysis with the example of a planar target. When the target is planar, the depths are related to the normal vector n to the plane. Without loss of generality, let me suppose here that $\|n\| = 1$. Then n can be written as a function of two parameters: n(θ, φ) = (cos(θ) sin(φ), sin(θ) sin(φ), cos(φ)). Thus all the estimated depths $\hat{Z}_i$ can be obtained from an approximation $\hat{n}(\hat{\theta}, \hat{\phi})$ of the normal. The figure shows the stability regions for a pinhole and for a catadioptric camera as a function of ($\hat{\theta}$, $\hat{\phi}$) for 8 or 16 points on the same plane. The true normal is n = (0.5, 0, 0.866) (i.e. the black cross at θ = 0 and φ = π/6). If we choose the estimated parameters ($\hat{\theta}$, $\hat{\phi}$) in the green region, the control law is locally asymptotically stable. On the other hand, if we choose the estimated parameters in the red region, the control law is locally unstable. The normals obtained for parameters in the blue region are discarded since they yield at least one negative depth, which is impossible. Note that the two cameras have similar stability regions. Increasing the number of points on the target shrinks the unstable region but does not eliminate it completely. More complete results can be found in [Malis 03c, Mezouar 04].
[Figure panels: Pinhole camera (8 points), Pinhole camera (16 points), Catadioptric camera (8 points), Catadioptric camera (16 points). Axes: θ and φ, each ranging from −90 to 90.]
Fig. 4.2 – Stability regions for planar targets with 8 or 16 points observed by a pinhole or a catadioptric camera. Parameters (φ, θ) selected in the green regions lead to stable control while parameters selected in the red regions lead to unstable control.
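The way such a stability map can be sampled is sketched below for the pinhole case. This is an illustrative sketch under several assumptions and does not reproduce the analysis of [Malis 03c]: it assumes the classical point interaction matrix, a plane written as n · X = d with the distance d known and equal to 1, eight hypothetical points (a 3 × 3 grid of normalized coordinates without its centre), and the commonly used local criterion that the eigenvalues of $\hat{L}^{+} L$ at the equilibrium have positive real parts. The function names are hypothetical.

```python
import numpy as np


def interaction_matrix(points, depths):
    """Stack the classical 2x6 interaction matrices of normalized image points."""
    rows = []
    for (x, y), Z in zip(points, depths):
        rows.append([-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y])
        rows.append([0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x])
    return np.array(rows)


def plane_normal(theta, phi):
    """Unit normal n(theta, phi) as parameterized in the text."""
    return np.array([np.cos(theta) * np.sin(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(phi)])


# Hypothetical target: 8 normalized image points (3x3 grid without its centre).
_grid = np.linspace(-0.2, 0.2, 3)
POINTS = np.array([(x, y) for x in _grid for y in _grid if x != 0.0 or y != 0.0])
TRUE_NORMAL = plane_normal(0.0, np.pi / 6)  # n = (0.5, 0, 0.866)


def classify(theta_hat, phi_hat, d=1.0):
    """Return 'discarded', 'stable' or 'unstable' for an estimated normal."""
    m = np.column_stack([POINTS, np.ones(len(POINTS))])  # rays (x, y, 1)

    # A point on the plane n . X = d with X = Z * m has depth Z = d / (n . m).
    denom_hat = m @ plane_normal(theta_hat, phi_hat)
    if np.any(denom_hat <= 0.0):
        return "discarded"  # at least one estimated depth is not positive

    L_true = interaction_matrix(POINTS, d / (m @ TRUE_NORMAL))
    L_hat = interaction_matrix(POINTS, d / denom_hat)

    # Assumed local criterion: all eigenvalues of pinv(L_hat) @ L_true
    # must have a strictly positive real part.
    eigenvalues = np.linalg.eigvals(np.linalg.pinv(L_hat) @ L_true)
    return "stable" if np.all(eigenvalues.real > 0.0) else "unstable"


if __name__ == "__main__":
    for deg in (-60, -30, 0, 30, 60):
        print(deg, classify(0.0, np.radians(deg)))
```

Sweeping `classify` over a grid of estimated angles gives a map with the same three classes as the figure. The exact shape of the regions depends on the assumed points, on the distance d and on the camera model, so it should not be expected to match Figure 4.2.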