Predicting Control Uncertainty in Ground Robots

We investigate the use of task-driven dictionaries for uncertainty quantification of control decisions based on sensory information received on-board by a ground robot. The motivation for this work is as follows: maneuvering along a planned path is the most basic control task in ground robotics, but a robot’s ability to accomplish this task is governed by a plethora of physical phenomena, such as ground interactions and shear deformations, that are either difficult to model from first principles, or yield dynamical systems for which obtaining control inputs is intractable. Though driving slowly enough may hide the difference between the kinematic models that underlie most path planning schemes and the ground truth, this precludes any mission with an operational tempo or scale that necessitates operation at high speeds.

By abstracting away the complicated physical phenomena using statistical models of the difference between expected and true state behavior, which we call the model disturbance, we may obtain the basic predictive feed-forward power we need to drive a modern control system, such as a classically-inspired two-degree-of-freedom controller [10] or a sampling-based receding-horizon controller [76]. Here we consider the model disturbance to be a function of both model mismatch and unknown environmental effects. In the past, statistical models of ground robotic velocity disturbances have been built by batch-fitting statistics to an entire dataset of one operational environment [65, 162]. The descriptive capability of the statistical model which captures unexpected maneuvers experienced by the platform is inherently contingent on the variation of the terrain in which the vehicle operates, as well as the sophistication of the parametric form of the chosen model.

In practice, we might not have access to a dataset for a particular operational environment, or the environment may have such dramatically different surfaces that a generic model is too imprecise. Moreover, such techniques would not allow real-time adaptation to environmental effects as they are experienced by the platform. In this case, an adaptive approach, where one sequentially revises estimates of the model disturbance based on new information, is advantageous. Classical approaches such as the Extended Kalman Filter have been successfully applied to handle this adaptive state estimation problem in ground robotics [162, 174]. However, the memory tuning of this estimator plays a large role in its predictive capability: with long memory, the resulting model will eventually end up trying to poorly fit the entire environment; with short memory, the resulting model will be unable to make use of data collected on earlier traversals of the same location or surface type.

One solution to this memory window issue is to develop multiple models in parallel, one for each distinct type of surface, and use a separate estimation procedure, such as visual classification, to decide on which surface the robot is driving and learn a model associated with only that particular surface [106]. In the adaptive setting [142], this classifier must also be learned online, but the number of classes needed to properly model the environment is seldom known in advance, and online multi-class classification is a computationally expensive procedure. With this motivation in mind, we tailor statistical learning techniques to address these shortcomings. In particular, in this chapter we

1. develop an approach to perform online regression over the disturbance statistical model, jointly with a representational basis, or dictionary, of the feature space which consists of control and perception information, by making use of supervised dictionary learning techniques [12, 93] (Sections 4.2 and 4.3);

2. quantify the advantages of using this learning technique which incorporates robotic perception as compared to approaches based on batch statistics or simple adaptive approaches on a ground robot which collects visual and odometric data while driving continuously over multiple surfaces (Section 4.4).

The resulting approach, by incorporating the robot’s real-time sensing and perception capabilities into an adaptive disturbance predictor, effectively bypasses the need for an explicit surface classification step by parameterizing the statistical disturbance model over the visual features and control signals that are observed while the platform experiences the disturbances. Related approaches to predicting steering mistakes based on statistical learning have been considered, e.g., Gaussian-process-based models [147], or treating model mismatch and environmental effects separately [4, 5]. Our approach considers these effects jointly, and makes use of discriminative matrix-factorization-based methods. Furthermore, we empirically demonstrate the proposed framework’s ability to quantify uncertainty that comes from unmodelled system dynamics and unknown environmental effects along a given path in real-time on a ground robot.

4.2.1 Control Uncertainty Forecasting

We consider the problem of learning unmodelled system dynamics and exogenous environmental effects, which we call the model disturbance, on the platform’s path planning scheme. To do so, we adopt an approach similar to [147] by considering the following discrete-time nonlinear state-space system of equations

$$x_{k+1} = s(x_k, u_k) + g(a_k), \tag{4.7}$$

where $x_k \in \mathcal{X}$ is the system state, which consists of pose and possibly velocity information, $u_k$ is the control input at time index $k$, and the map $s : \mathcal{X} \times \mathcal{U} \to \mathcal{X}$ is a simple kinematic model that is chosen a priori, such as an effective wheel-base [222] or general kinematic slip model [175]. We concatenate control inputs and signals $z_k$ observed by the platform, such as visual, acoustic, or LIDAR information, into feature vectors $a_k := (u_k; z_k) \in \mathcal{A} \subset \mathbb{R}^p$ upon which unexpected maneuvers depend. To be specific, we define the model disturbance $g : \mathcal{A} \to \mathcal{X}$ in (4.7) as a general nonlinear map that captures unmodelled system dynamics and environmental effects that may be both structural and random, which we assume is a function of only control and sensory information. A key point of departure between this formulation of model disturbance and that of prior works is its dependence on the environment, which is parameterized by the perceptive and sensing capabilities of the platform.

In this work, control decisions are provided externally to the system by a user as an open-loop system, and their impacts are then empirically measured. However, the predictive technique developed in the next section may be fed into a model predictive control framework to adjust its cost functional according to learned disturbances.

We consider the model disturbance as a stochastic process depending on feature vectors $a \in \mathcal{A}$ which aggregate past control and sensory information. Consider $x_{k-1}$ as the system state which follows a purely kinematic model that is chosen a priori, the estimated state $x_k$ obtained via on-board sensor measurements, and the control inputs $u_{k-1}$ and signals $z_{k-1}$ which have been observed at the previous time slot $k-1$. Then we may rearrange (4.7) to obtain an estimate for the model disturbance $\hat{g}(a_{k-1})$, i.e.

$$\hat{g}(a_{k-1}) = x_k - s(x_{k-1}, u_{k-1}). \tag{4.8}$$

Our goal is to characterize the unknown true disturbance mapping $g : a_k \mapsto g(a_k)$ based upon realizations of the pairs $(a_k, \hat{g}(a_k))$ of feature vectors and physical measurements which estimate the model disturbance. For simplicity, henceforth we will consider $\hat{g}(a)$ to be scalar-valued.
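The residual obtained by rearranging (4.7) can be sketched as follows. This is a minimal illustration, not the model used on the platform: the unicycle model `unicycle_step`, the time step `dt`, and the example numbers are all assumptions introduced here to stand in for the a priori kinematic map $s$.

```python
import math


def unicycle_step(state, control, dt=0.1):
    """Hypothetical kinematic model s(x_k, u_k): a unicycle with pose (x, y, heading)."""
    x, y, theta = state
    v, omega = control  # commanded forward speed and turn rate
    return (
        x + dt * v * math.cos(theta),
        y + dt * v * math.sin(theta),
        theta + dt * omega,
    )


def disturbance_estimate(prev_state, prev_control, measured_state, dt=0.1):
    """Componentwise residual g_hat(a_{k-1}) = x_k - s(x_{k-1}, u_{k-1}).

    Heading wrap-around is ignored to keep the sketch short.
    """
    predicted = unicycle_step(prev_state, prev_control, dt)
    return tuple(m - p for m, p in zip(measured_state, predicted))


# Assumed example: the robot is commanded straight ahead but slips sideways.
prev_state = (0.0, 0.0, 0.0)
prev_control = (1.0, 0.0)
measured_state = (0.09, 0.02, 0.0)
g_hat = disturbance_estimate(prev_state, prev_control, measured_state)
```

Any single component of `g_hat` (for instance the lateral error) could play the role of the scalar-valued disturbance considered in the text.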

Observe that if a generic path planner’s kinematic model perfectly captures the ground truth state, the expression in (4.8) will be null. Since this is not the case, the model disturbance is a quantifiable phenomenon, especially across varying operating terrains for the platform. Thus, we seek to characterize the map $g$ by learning a Gaussian approximation of the exogenous environmental and dynamical effects. In particular, we consider the random pair $(a, \hat{g}(a))$ to be related through a conditional Gaussian distribution of the form $\hat{g}(a) \mid a \sim \mathcal{N}(\mu(a), \sigma^2(a))$ with unknown mean $\mu(a)$ and scalar variance $\sigma^2(a)$ that depend on the system state and observed sensory information $a$, i.e.

$$\mathbb{P}[\hat{g}(a) \mid a] = \frac{1}{\sqrt{2\pi\sigma^2(a)}} \exp\left\{ -\frac{(\hat{g}(a) - \mu(a))^2}{2\sigma^2(a)} \right\}. \tag{4.9}$$
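As a small illustration of the conditional Gaussian model in (4.9), the sketch below evaluates the likelihood of an observed disturbance and its negative logarithm, which is the quantity one would typically minimize when fitting the mean and variance. The function names are hypothetical, and the mean and variance are passed in as plain numbers rather than as functions of $a$.

```python
import math


def gaussian_likelihood(g_hat, mu, sigma2):
    """Density (4.9) of an observed disturbance g_hat given mean mu and variance sigma2."""
    if sigma2 <= 0.0:
        raise ValueError("variance must be positive")
    return math.exp(-((g_hat - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def negative_log_likelihood(g_hat, mu, sigma2):
    """Negative log of (4.9): 0.5*log(2*pi*sigma2) + (g_hat - mu)^2 / (2*sigma2)."""
    if sigma2 <= 0.0:
        raise ValueError("variance must be positive")
    return 0.5 * math.log(2.0 * math.pi * sigma2) + (g_hat - mu) ** 2 / (2.0 * sigma2)
```

Working with the negative log-likelihood rather than the density avoids numerical underflow when the observed disturbance lies far from the predicted mean.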

Information about the map $g(\cdot)$, in the form of realizations of the random pair $(a, \hat{g}(a))$, is sequentially revealed as the robot explores the feature space associated with its operating environment. In order for information about the disturbance to be leveraged for path planning, disturbance predictions must be made on an incremental basis, which motivates the formulation of learning the Gaussian approximation of the model disturbance as an online learning problem, whereby we seek to repeatedly revise predictions of the Gaussian disturbance approximation based on newly available information.

Figure 4.1: Overview of our system. The platform’s state $x$ and control $u$ intended by a kinematic planner differ from the ground truth $\hat{x}$ measured by an inertial measurement unit by the model disturbance $g$ due to factors such as modelling errors in the motor control and environmental effects. Our goal is to develop a learning procedure to sequentially estimate the probability distribution of $g$ based upon just $u$ and images $z$ of the terrain.

Since $g(\cdot)$ in the state space model given in (4.7) represents a complicated relationship between robotic sensory perception and unexpected effects of control decisions, we expect the relationship between $(a, \hat{g}(a))$ to be highly nonlinear, in which case the performance of a simple regressor on the likelihood given in (4.9) may be boosted by learning an alternative feature encoding of signals $a$. Motivated by this observation, we seek to represent realizations $a_k$ as a combination of $m$ common basis elements (or atoms) $d_l$ which are unknown and must be learned from data. We stack the atoms $d_l$ into a matrix which we call the dictionary matrix $D \in \mathbb{R}^{|\mathcal{A}| \times m}$ and denote the coding of $a_k$ as $\alpha_k \in \mathbb{R}^m$. For a given dictionary, the coding problem calls for finding a representation $\alpha_k$ such that the signal $a_k$ is close to its dictionary representation $D\alpha_k$, which may be mathematically formulated by introducing a loss function that depends on the proximity between $D\alpha_k$ and the data point $a_k$. Specifically, we consider an elastic-net minimization [2]

$$\alpha^*(D; a_k) := \operatorname*{argmin}_{\alpha_k \in \mathbb{R}^m} \; \frac{1}{2}\|a_k - D\alpha_k\|_2^2 + \lambda_1 \|\alpha_k\|_1 + \frac{\lambda_2}{2}\|\alpha_k\|_2^2, \tag{4.10}$$

where $\lambda_1, \lambda_2 \geq 0$ are regularization parameters, which may be efficiently solved [57]. Hereafter, we assume that basis elements are normalized to have norms $\|d_l\| \leq 1$ so that the dictionary is restricted to the convex compact set $\mathcal{D} := \{ D \in \mathbb{R}^{|\mathcal{A}| \times m} : \|d_l\| \leq 1 \text{ for all } l \}$.
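The coding step and the norm constraint on the atoms can be sketched as below. This is an illustrative solver only, assuming the standard elastic-net objective $\frac{1}{2}\|a - D\alpha\|_2^2 + \lambda_1\|\alpha\|_1 + \frac{\lambda_2}{2}\|\alpha\|_2^2$; it uses plain proximal gradient (ISTA) iterations rather than the solver of [57], and the names `lam1`, `lam2`, `elastic_net_code`, and `project_atoms` are hypothetical. The dictionary is stored as a list of rows.

```python
import math


def soft_threshold(value, threshold):
    """Proximal operator of threshold * |.| applied to one coordinate."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net_code(D, a, lam1=0.1, lam2=0.01, iters=500):
    """Approximate the elastic-net code of signal a (length p) for dictionary D (p x m)."""
    p, m = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the squared spectral norm of D,
    # so 1 / (||D||_F^2 + lam2) is a safe (conservative) step size.
    step = 1.0 / (sum(D[i][j] ** 2 for i in range(p) for j in range(m)) + lam2)
    alpha = [0.0] * m
    for _ in range(iters):
        residual = [sum(D[i][j] * alpha[j] for j in range(m)) - a[i] for i in range(p)]
        grad = [
            sum(D[i][j] * residual[i] for i in range(p)) + lam2 * alpha[j]
            for j in range(m)
        ]
        alpha = [soft_threshold(alpha[j] - step * grad[j], step * lam1) for j in range(m)]
    return alpha


def project_atoms(D):
    """Rescale every column (atom) with norm above 1 back onto the unit ball."""
    p, m = len(D), len(D[0])
    scales = [
        1.0 / max(1.0, math.sqrt(sum(D[i][j] ** 2 for i in range(p))))
        for j in range(m)
    ]
    return [[D[i][j] * scales[j] for j in range(m)] for i in range(p)]
```

For an identity dictionary with `lam2=0.0`, the code reduces to coordinatewise soft-thresholding of the signal, which gives a quick sanity check on the iteration.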

The dictionary learning problem associated with the elastic net entails finding a dictionary $D$ such that the signals $a_k$ are close to their representations $D\alpha^*(D; a_k)$ for all possible $k$. Here, however, we focus on discriminative problems where the goal is to find a dictionary that is well adapted to a specific classification or regression task, as in [12]. To do so, we use the coding $\alpha^*(D; a)$ in (4.10) as a feature representation of the signal $a$ and introduce regressors $w_1$ and $w_2$ that are used to predict the first and second-order statistics $\mu(a)$ and $\sigma^2(a)$ when given the signal $\alpha^*(D; a)$ through general maps of the form

$$\hat{\mu}(a) = h(w_1, \alpha^*(D; a)), \qquad \hat{\sigma}^2(a) = l(w_2, \alpha^*(D; a)), \tag{4.11}$$

where $\hat{\mu}(a)$ and $\hat{\sigma}^2(a)$ denote estimators for the true moments of the model disturbance, and $h$ and $l$ map regressors and sparse codes to their estimators. Specific forms for (4.11) are given in Section 4.3.2, (4.17)-(4.18). In the next section, we develop our feed-forward scheme to predict the model disturbance.
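To make the role of the regressors concrete, the sketch below shows one hypothetical choice for the maps $h$ and $l$: a linear map for the mean and an exponentiated linear map for the variance, which keeps the variance positive. These forms are assumptions made purely for illustration; the forms actually used are those referenced in Section 4.3.2 and are not reproduced here.

```python
import math


def predict_mean(w1, alpha):
    """Assumed form of h: a linear map from the sparse code to the mean."""
    return sum(w * c for w, c in zip(w1, alpha))


def predict_variance(w2, alpha):
    """Assumed form of l: exp of a linear map, so the variance is always positive."""
    return math.exp(sum(w * c for w, c in zip(w2, alpha)))


def prediction_interval(w1, w2, alpha, z=1.96):
    """Symmetric interval mean +/- z * standard deviation for the disturbance."""
    mu = predict_mean(w1, alpha)
    sigma = math.sqrt(predict_variance(w2, alpha))
    return mu - z * sigma, mu + z * sigma


# Assumed example values for a two-atom code and its regressors.
alpha = [0.9, 0.0]
w1 = [0.5, -2.0]
w2 = [0.0, 1.0]
low, high = prediction_interval(w1, w2, alpha)
```

An interval of this kind is one way a downstream planner could consume the predicted mean and variance, although the text itself does not prescribe how the predictions are used.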
