Weighted Least-squares Estimation - Visual Odometry and Sparse Scene Reconstruction for UAVs wi

There are different techniques for estimating unknown parameters from given observations; one of them is the well-known least-squares estimation. Least-squares estimation is a very effective numerical method and leads to best unbiased estimators for linear relationships and observations disturbed by Gaussian noise.

In this section, we will briefly introduce the Gauss–Markov model, which contains a functional and a stochastic model to frame the observation process. The functional model specifies the assumed relation between the acquired observations and the unknown parameters as an explicit function, which usually results from physical or geometrical laws. The stochastic model specifies the statistical properties of the observation process, and is assumed to be sufficiently described by the first and second moments of a normal distribution. The Gauss–Markov model covers many practical estimators including maximum likelihood (ML) and maximum a posteriori (MAP) estimators. For a detailed introduction to estimation theory with emphasis on least-squares estimation please refer to the books of Koch (1999, Chap. 3) or Förstner and Wrobel (2016, Chap. 4).

2.3.1 Estimation with Non-linear Gauss–Markov Model

The Gauss–Markov model starts from $N$ observations $l = [l_n]$, $n = 1, \dots, N$, which are assumed to be a sample of a multivariate Gaussian distribution $\mathcal{N}(\tilde{l}, \Sigma_{ll})$ around a true but unknown observation vector $\tilde{l}$ with a symmetric and positive definite covariance matrix $\Sigma_{ll}$.

Due to the noise induced by the observation process, there are in general no parameters $x$ for which the functional model $f(x) = l$ holds. Therefore the goal is to find corrections $\hat{v}$ for the observations $l$ and best estimates $\hat{x}$ such that the relation

$$f(\hat{x}) = l + \hat{v} = \hat{l} \qquad (2.24)$$

between the fitted observations $\hat{l} = l + \hat{v}$ and the estimated parameters $\hat{x}$ holds and the weighted sum of the squared residuals

$$\Omega(\hat{x}) = \hat{v}^T \Sigma_{ll}^{-1} \hat{v} \qquad (2.25)$$

is minimal.

The optimization problem therefore reads as

$$\hat{x} = \operatorname{argmin}_x \, (f(x) - l)^T \Sigma_{ll}^{-1} (f(x) - l), \qquad (2.26)$$

which leads to estimated parameters $\hat{x}$ that have minimal variance, i.e., are best. For a nonlinear function $f(x)$ the solution is iterative. Starting from initial values $\hat{x}^{(\nu=0)}$ for the estimated parameters $\hat{x}$ in the first iteration $\nu = 0$, we determine updates $\widehat{\Delta x}^{(\nu=0)}$, which give the parameters of the next iteration:

$$\hat{x}^{(\nu+1)} = \hat{x}^{(\nu)} + \widehat{\Delta x}^{(\nu)}. \qquad (2.27)$$

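Each following iteration solves for the updates $\widehat{\Delta x}^{(\nu)}$ of the linearized function

$$l + \hat{v}^{(\nu)} = f(\hat{x}^{(\nu)}) + A\,\widehat{\Delta x}^{(\nu)} \qquad (2.28)$$

with Jacobian matrix

$$A = \left.\frac{\partial f(x)}{\partial x}\right|_{x = \hat{x}^{(\nu)}} \qquad (2.29)$$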

evaluated at the current parameter estimates $\hat{x}^{(\nu)}$. With the reduced observations

$$\Delta l^{(\nu)} = l - f(\hat{x}^{(\nu)}) \qquad (2.30)$$

we can determine the unknown parameter updates $\widehat{\Delta x}^{(\nu)}$ from the normal equation system

$$A^T \Sigma_{ll}^{-1} A \,\widehat{\Delta x}^{(\nu)} = A^T \Sigma_{ll}^{-1} \Delta l^{(\nu)} \qquad (2.31)$$

for example with Cholesky factorization (Golub and Van Loan, 1996, Sec. 4.2).
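The normal equations follow from minimizing the weighted sum of squared residuals of the linearized model. The short derivation below is a sketch under the assumption that the reduced observations are defined as $\Delta l^{(\nu)} = l - f(\hat{x}^{(\nu)})$, so that Eq. (2.28) gives $\hat{v}^{(\nu)} = A\,\widehat{\Delta x}^{(\nu)} - \Delta l^{(\nu)}$; iteration indices are dropped for readability.

```latex
\begin{align*}
\Omega(\Delta x) &= (A\,\Delta x - \Delta l)^T \Sigma_{ll}^{-1} (A\,\Delta x - \Delta l), \\
\frac{\partial \Omega}{\partial \Delta x} &= 2\, A^T \Sigma_{ll}^{-1} (A\,\Delta x - \Delta l) = 0
\quad\Longrightarrow\quad
A^T \Sigma_{ll}^{-1} A \,\widehat{\Delta x} = A^T \Sigma_{ll}^{-1} \Delta l .
\end{align*}
```

Since $\Sigma_{ll}$ is positive definite, the normal matrix $A^T \Sigma_{ll}^{-1} A$ is positive definite whenever $A$ has full column rank, which is what makes the Cholesky factorization applicable.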

The corrections of the observations can be determined linearly after each iteration by

$$\hat{v}^{(\nu)} = A\,\widehat{\Delta x}^{(\nu)} - \Delta l^{(\nu)}, \qquad (2.32)$$

which after convergence are equal to the non-linearly determined corrections

$$\hat{v} = f(\hat{x}) - l. \qquad (2.33)$$

We arrive at $\hat{x} := \hat{x}^{(\nu)}$ in case of convergence, i.e., $\widehat{\Delta x} \to 0$. Convergence is achieved if all updates of the parameters $\hat{x}$ are small compared to their standard deviations, $|\widehat{\Delta x}_u / \sigma_{\hat{x}_u}| < T_c$, e.g. with a threshold $T_c = 0.01$, requiring the updates to be less than 1 % of their standard deviation.
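The iteration of Eqs. (2.27) to (2.31) together with the convergence test can be sketched as follows. This is a minimal sketch, not the author's implementation: the function name `gauss_newton`, the exponential test model and all of its numbers are hypothetical. The reduced observations are taken as $l - f(\hat{x}^{(\nu)})$, the sign that is consistent with Eqs. (2.28), (2.31) and (2.32), and the standard deviations in the convergence test are assumed to come from the inverse normal matrix with a variance factor of 1.

```python
import numpy as np

def gauss_newton(f, jac, x0, l, Sigma_ll, T_c=0.01, max_iter=50):
    """Weighted non-linear least squares by iterated linearization."""
    x = np.asarray(x0, dtype=float)
    W = np.linalg.inv(Sigma_ll)                # weight matrix Sigma_ll^{-1}
    for _ in range(max_iter):
        A = jac(x)                             # Jacobian at the current estimate
        dl = l - f(x)                          # reduced observations (assumed sign)
        N = A.T @ W @ A                        # normal matrix A^T W A
        h = A.T @ W @ dl                       # right-hand side A^T W dl
        C = np.linalg.cholesky(N)              # N = C C^T, C lower triangular
        dx = np.linalg.solve(C.T, np.linalg.solve(C, h))  # two solves with the factor
        x = x + dx                             # parameter update, Eq. (2.27)
        # Assumption: sigma_x taken from N^{-1}, i.e. variance factor set to 1.
        sigma_x = np.sqrt(np.diag(np.linalg.inv(N)))
        if np.all(np.abs(dx) / sigma_x < T_c):
            break
    return x

# Hypothetical example: l_n = a * exp(-b * t_n) plus Gaussian noise.
t = np.linspace(0.0, 2.0, 20)
f = lambda x: x[0] * np.exp(-x[1] * t)
jac = lambda x: np.column_stack([np.exp(-x[1] * t),
                                 -x[0] * t * np.exp(-x[1] * t)])
rng = np.random.default_rng(0)
sigma = 0.01
l = f(np.array([2.0, 1.5])) + rng.normal(0.0, sigma, t.size)
x_hat = gauss_newton(f, jac, [1.5, 1.0], l, sigma**2 * np.eye(t.size))
print(x_hat)  # estimates of a and b, close to the simulated values 2.0 and 1.5
```

Forming the explicit inverse of the normal matrix in every iteration is done here only to keep the convergence test readable; for large problems one would reuse the Cholesky factor instead.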

The full covariance matrix of the estimated parameters is obtained by

$$\Sigma_{\hat{x}\hat{x}} = \hat{\sigma}_0^2 \, (A^T \Sigma_{ll}^{-1} A)^{-1} \qquad (2.34)$$

with estimated variance factor

$$\hat{\sigma}_0^2 = \frac{\hat{v}^T \Sigma_{ll}^{-1} \hat{v}}{R} \qquad (2.35)$$

and the redundancy $R = N - U$ of the optimization problem, where $N$ is the number of observations, i.e. the dimension of the vector $l$, and $U$ is the number of unknown parameters, i.e. the dimension of the vector $x$.
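A small numerical illustration of Eqs. (2.34) and (2.35) may help. The sketch below uses a hypothetical straight-line fit with three observations and unit covariance; the function name `variance_factor_and_covariance` and the data are assumptions made for this example only.

```python
import numpy as np

def variance_factor_and_covariance(A, Sigma_ll, v_hat):
    """Estimated variance factor (Eq. 2.35) and parameter covariance (Eq. 2.34)."""
    n_obs, n_par = A.shape
    R = n_obs - n_par                                   # redundancy R = N - U
    W = np.linalg.inv(Sigma_ll)
    sigma0_sq = float(v_hat @ W @ v_hat) / R            # Eq. (2.35)
    Sigma_xx = sigma0_sq * np.linalg.inv(A.T @ W @ A)   # Eq. (2.34)
    return sigma0_sq, Sigma_xx, R

# Hypothetical linear example: fit l = x0 + x1 * t to three observations.
t = np.array([0.0, 1.0, 2.0])
l = np.array([1.0, 2.0, 4.0])
A = np.column_stack([np.ones_like(t), t])
Sigma_ll = np.eye(3)
W = np.linalg.inv(Sigma_ll)
x_hat = np.linalg.solve(A.T @ W @ A, A.T @ W @ l)   # linear model: one step from x = 0
v_hat = A @ x_hat - l                               # corrections, Eq. (2.33)
sigma0_sq, Sigma_xx, R = variance_factor_and_covariance(A, Sigma_ll, v_hat)
print(sigma0_sq, R)
print(np.sqrt(np.diag(Sigma_xx)))                   # standard deviations of x0 and x1
```

With only one redundant observation the estimated variance factor is itself very uncertain, which is why a high redundancy is desirable in practice.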

2.3.2 Robust Estimation

The presented least-squares estimation is highly sensitive to outliers in the observations, as the weighted sum of squared residuals is minimized. Observations are usually considered as outliers if the realized measurement lies significantly outside the dispersion range around the expected value. Within an estimation procedure, outliers can be detected based on the magnitude of a computed residual $\hat{v}_n$. Following Baarda (1967), for uncorrelated observations the test value

$$T_n = \frac{\hat{v}_n}{\sigma_{\hat{v}_n}} \qquad (2.36)$$

with

$$\Sigma_{\hat{v}\hat{v}} = \Sigma_{ll} - A \Sigma_{\hat{x}\hat{x}} A^T \qquad (2.37)$$

follows the standard normal distribution $T_n \sim \mathcal{N}(0, 1)$ if there are no gross errors in the observations. Assuming all observations to have an equally high influence on the parameter vector, one could use $\sigma_{l_n}$ instead of $\sigma_{\hat{v}_n}$ in Eq. (2.36). If $T_n$ deviates significantly from the standard normal distribution, the corresponding observation can be assumed to be an outlier and should thus be eliminated from the estimation process. Rigorous testing for outliers by means of hypothesis testing is treated by Koch (1999).
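The test values of Eq. (2.36) can be illustrated with a hypothetical line fit that contains a single gross error. The sketch below makes several assumptions that are not taken from the text: the function name `baarda_test_values`, the data, the cut-off of 3.0 for flagging an observation, and the use of the parameter covariance with a variance factor of 1 inside Eq. (2.37).

```python
import numpy as np

def baarda_test_values(A, Sigma_ll, v_hat):
    """Test values T_n = v_n / sigma_vn of Eq. (2.36)."""
    W = np.linalg.inv(Sigma_ll)
    # Assumption: parameter covariance with variance factor 1.
    Sigma_xx = np.linalg.inv(A.T @ W @ A)
    Sigma_vv = Sigma_ll - A @ Sigma_xx @ A.T       # Eq. (2.37)
    return v_hat / np.sqrt(np.diag(Sigma_vv))

# Hypothetical example: exact line l = 1 + 2 t with one gross error at index 3.
t = np.arange(6, dtype=float)
l = 1.0 + 2.0 * t
l[3] += 5.0
A = np.column_stack([np.ones_like(t), t])
Sigma_ll = np.eye(6)                               # sigma_ln = 1 for all n
x_hat = np.linalg.solve(A.T @ A, A.T @ l)          # weights are the identity here
v_hat = A @ x_hat - l                              # corrections, Eq. (2.33)
T = baarda_test_values(A, Sigma_ll, v_hat)
suspects = np.flatnonzero(np.abs(T) > 3.0)         # 3.0 is an assumed threshold
print(T.round(2), suspects)
```

The gross error also leaks into the residuals of the neighbouring observations, so their test values are not zero; the contaminated observation nevertheless has the largest test value in magnitude.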

Alternatively, the influence of large residuals on the cost function can be reduced by robust estimation techniques such as reweighting procedures, which can be directly incorporated into the iterative estimation procedure of non-linear least squares. Assuming again stochastically uncorrelated observations, Eq. (2.25) can be rewritten, up to a constant factor of 1/2 that does not change the minimizer, as

$$\Omega(x) = \sum_n \frac{1}{2} \left( \frac{v_n}{\sigma_{l_n}} \right)^2 = \sum_n \rho(y_n) \qquad (2.38)$$

with normalized residuals $y_n = v_n / \sigma_{l_n}$ and piecewise influence functions

$$\rho(y_n) = \frac{1}{2} y_n^2. \qquad (2.39)$$

To arrive at a robust estimation procedure, Huber (1981) proposes using a probability density function for the observations which consists of a normal distribution in the middle and of a Laplace distribution at the ends. This way the density function has more probability mass at the ends and thus allows modeling a certain amount of gross errors in the observations. The modified influence function $\rho_H(y_n)$ is defined as

$$\rho_H(y_n) = \begin{cases} \frac{1}{2} y_n^2 & \text{for } |y_n| \le k, \\ k \left( |y_n| - \frac{k}{2} \right) & \text{otherwise,} \end{cases} \qquad (2.40)$$
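Eq. (2.40) can be evaluated directly, as in the following sketch. The function names are hypothetical. The weight function is an assumption: it is the common reweighting factor $\rho_H'(y)/y$ used in iteratively reweighted least squares, which the text does not spell out. The default $k = 1.345$ is a widely used tuning value, not one given in the text.

```python
import numpy as np

def huber_rho(y, k=1.345):
    """Huber function of Eq. (2.40) for normalized residuals y."""
    a = np.abs(y)
    return np.where(a <= k, 0.5 * a**2, k * (a - 0.5 * k))

def huber_weight(y, k=1.345):
    """Assumed reweighting factor rho'(y)/y: 1 inside [-k, k], k/|y| outside."""
    return k / np.maximum(np.abs(y), k)

y = np.array([0.0, 0.5, 1.0, 3.0])     # normalized residuals v_n / sigma_ln
print(huber_rho(y, k=1.0))             # quadratic up to |y| = k, linear beyond
print(huber_weight(y, k=1.0))          # large residuals are down-weighted
```

In a reweighting scheme of this kind, each diagonal entry of $\Sigma_{ll}^{-1}$ would be multiplied by the corresponding weight before the normal equations of the next iteration are formed.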

In document Visual Odometry and Sparse Scene Reconstruction for UAVs with a Multi-Fisheye Camera System (Page 32-35)