Stabilization via Penalty Function - Advances in System Identification: Gaussian Regression and

where $M(\psi, P)$ has been defined in (5.12). Thus, Problem 5.3.1 can be formalized as:

\[
\hat{\psi}, \hat{P} = \arg\min_{\psi, P} \; \| \psi - P \hat{w}^{y}_{EB} \|^2 \quad \text{s.t.} \quad M(\psi, P) \geq 0, \quad \mathrm{Tr}(P) = n, \quad P = P^\top \geq 0 \tag{5.13}
\]

where the constraint $\mathrm{Tr}(P) = n$ is added to improve the numerical conditioning; see Miller and de Callafon (2013) for further details.

The solution $\hat{w}^y$ of Problem 5.3.1 is finally computed as:

\[
\hat{w}^y = \hat{P}^{-1} \hat{\psi} \tag{5.14}
\]
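As a minimal sketch of the last step only, (5.14) can be evaluated by solving a linear system instead of forming the inverse explicitly. The function name `recover_impulse_response` and the numerical values are hypothetical, chosen for illustration; the matrix inequality $M(\psi, P) \geq 0$ from (5.12) is not reproduced here, so this is not an implementation of problem (5.13).

```python
import numpy as np

def recover_impulse_response(P_hat, psi_hat):
    """Return w = P_hat^{-1} psi_hat, as in (5.14), without forming the inverse."""
    P_hat = np.asarray(P_hat, dtype=float)
    psi_hat = np.asarray(psi_hat, dtype=float)
    # Solving P_hat w = psi_hat is numerically safer than inv(P_hat) @ psi_hat.
    return np.linalg.solve(P_hat, psi_hat)

# Illustrative values only: a symmetric positive definite P with Tr(P) = n = 2.
P_hat = np.array([[1.2, 0.3], [0.3, 0.8]])
w_true = np.array([0.5, -0.25])
psi_hat = P_hat @ w_true
w_hat = recover_impulse_response(P_hat, psi_hat)
```

The normalization $\mathrm{Tr}(P) = n$ keeps the scale of $\hat{P}$ fixed, which is presumably why the solve above stays well conditioned in practice.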

In the remainder of the chapter, the model $\hat{G}(z)$ obtained by plugging into (5.6) the estimators $\hat{w}^y$ and $\hat{w}^u_{EB}$, obtained respectively from (5.14) and from the EB procedure in (5.5), will be called the “LMI” model.

5.4 Stabilization via Penalty Function

The second stabilization technique is developed to act directly inside the Gaussian regression procedure. As discussed in Section 2.4, a crucial step is the estimation of the hyperparameter vector $\eta$, which can be done, e.g., through marginal likelihood optimization (2.48). It turns out that some hyperparameters $\eta$ may lead to estimators (5.5) which do not correspond to stable models $\hat{G}(z)$ and $\hat{H}(z)$. Thus, one possible remedy is to restrict the set of admissible hyperparameters to a subset $\Omega_S$ which leads to stable models. This is not entirely trivial, as the estimators (and thus the set $\Omega_S$) depend on the measured data $Y$, $U$. Accordingly, Problem 5.2.1 can be formulated as follows.

Problem 5.4.1 (Reformulation). Estimate the hyperparameters $\eta$ restricting the search

\[
\hat{\eta} = \arg\max_{\eta \in \Omega_S} p_\eta(Y) = \arg\min_{\eta \in \Omega_S} -\ln p_\eta(Y) \tag{5.15}
\]

to the set $\Omega_S = \{\eta \mid \hat{A}(z) \text{ stable}\}$, i.e., the set of hyperparameters which lead to stable models $\hat{G}(z)$, $\hat{H}(z)$.

Since the set $\Omega_S$ cannot be determined a priori because it is data dependent, a penalty function is added to the cost in (5.15); it can be

102 Enforcing Model Stability in Nonparametric Gaussian Regression

interpreted as a barrier to push the estimate $\hat{\eta}$ into $\Omega_S$, or equivalently, to keep $\hat{\eta}$ away from the set of hyperparameters $\eta$ which lead to an unstable $A(z)$.

Denote by $A_\eta(z)$ the polynomial $A(z)$ in (5.7) built with the estimator

\[
\hat{w}^y_\eta := E_\eta[w^y \mid Y], \tag{5.16}
\]

where the subscript indicates that $\hat{w}^y_\eta$ is obtained with the specific hyperparameters $\eta$, and define the dominant root of $A_\eta(z)$ as $\bar{\rho}_\eta := \max |\sigma(A_\eta(z))|$, where $\sigma(A(z))$ denotes the set of roots of the polynomial $A(z)$.
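The dominant root can be computed numerically from the polynomial coefficients. The sketch below is illustrative only: the function name `dominant_root` is hypothetical, and it assumes the coefficients of $A_\eta(z)$ are available as a vector ordered from the highest power of $z$ to the constant term, since the construction in (5.7) is not reproduced here.

```python
import numpy as np

def dominant_root(coeffs):
    """Largest root modulus of a polynomial.

    coeffs: coefficients ordered from the highest power to the constant term
    (an assumed convention; it matches numpy.roots).
    """
    roots = np.roots(coeffs)
    if roots.size == 0:
        return 0.0  # a constant polynomial has no roots
    return float(np.max(np.abs(roots)))

# (z - 0.7)(z - 0.8) = z^2 - 1.5 z + 0.56: all roots inside the unit circle.
rho_stable = dominant_root([1.0, -1.5, 0.56])
# (z - 2)(z - 0.5) = z^2 - 2.5 z + 1: one root outside the unit circle.
rho_unstable = dominant_root([1.0, -2.5, 1.0])
```

With this convention, the stability check used in the rest of the section reduces to testing whether the returned value is below 1.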

Next, the penalty function $J(\bar{\rho}_\eta)$ can be defined:

\[
J(\bar{\rho}_\eta) = \frac{1}{\left(\alpha(\delta - \bar{\rho}_\eta)\right)^{\alpha}} - \frac{1}{(\alpha\delta)^{\alpha}} \tag{5.17}
\]

where $\delta \geq 1$ is a scalar which determines the limit point corresponding to an infinite value of the function and $\alpha$ is a positive scalar which adjusts the steepness of the function.
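A direct transcription of (5.17) may help in checking its behaviour at the two ends of the range. This is a sketch under stated assumptions: the function name `penalty` and the default values of `alpha` and `delta` are hypothetical, and returning an infinite value for $\bar{\rho}_\eta \geq \delta$ is a convention adopted here, because (5.17) itself is only meaningful for $\bar{\rho}_\eta < \delta$.

```python
import math

def penalty(rho_bar, alpha=1.0, delta=1.1):
    """Penalty J of (5.17) for a dominant root modulus rho_bar >= 0."""
    if alpha <= 0.0:
        raise ValueError("alpha must be positive")
    if delta < 1.0:
        raise ValueError("delta must be at least 1")
    if rho_bar >= delta:
        # Assumed convention: at or beyond the limit point the penalty is infinite.
        return math.inf
    first = (alpha * (delta - rho_bar)) ** (-alpha)
    second = (alpha * delta) ** (-alpha)
    return first - second

j_origin = penalty(0.0)      # the two terms cancel exactly at rho_bar = 0
j_mid = penalty(0.5)
j_near = penalty(1.05)       # close to the limit point delta = 1.1
j_limit = penalty(1.1)
```

The second term in (5.17) only shifts the curve so that it passes through zero at $\bar{\rho}_\eta = 0$; the growth towards the limit point comes entirely from the first term.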

Figure 5.1: Representation of the penalty function $J(\bar{\rho}_\eta)$. The red bullet represents the value of the penalty function associated with a specific $\bar{\rho}_\eta$ in an illustrative example of an unstable polynomial $A_\eta(z)$. The blue arrows show the effects of the penalty function on $\bar{\rho}_\eta$ while estimating the hyperparameters. The black arrows show the effects of changing the parameters $\alpha$ and $\delta$.


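The penalty function diverges ($J(\bar{\rho}_\eta) \to \infty$) when $\bar{\rho}_\eta \to \delta$, and $J(\bar{\rho}_\eta) \to 0$ when $\bar{\rho}_\eta \to 0$. Thus, when (5.17) is added to the minimization problem (5.15), the effect is to penalize the solutions $\eta$ which yield $\bar{\rho}_\eta$ outside the stability region.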

As will be shown in Algorithm 11, the two parameters $\alpha$ and $\delta$ are iteratively adjusted until the estimated hyperparameters lead to a stable forward model which solves the constrained problem (5.15).

Note that when $\alpha \to 0$, $J(\bar{\rho}_\eta)$ gives no penalty for $\bar{\rho}_\eta < \delta$ and an infinite penalty for $\bar{\rho}_\eta \geq \delta$. Elaborating upon the intuition above, it is easy to prove that the solution of Problem 5.4.1 can be found by the algorithm described below:

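Algorithm 11 Stabilization via Penalty Function

1: Init:
2:   Compute $\eta_0$ through marginal likelihood maximization (Section 2.4.3),
3:   Compute the predictor impulse response $\hat{w}^y_{\eta_0}$ using (5.16),
4:   Compute $A_{\eta_0}(z)$ and $\bar{\rho}_{\eta_0}$ associated to $\hat{w}^y_{\eta_0}$,
5:   Set $\alpha = 1$.
6: while $\bar{\rho}_{\eta_k} \geq 1$ do
7:   Set $\delta = \bar{\rho}_{\eta_k}(1 + \epsilon)$,
8:   Compute

     \[
     \eta_k = \arg\min_{\eta} \; -\ln p_\eta(Y) + J(\bar{\rho}_\eta) \tag{5.18}
     \]

     and the associated $\bar{\rho}_{\eta_k}$,
9:   if $-\ln p_{\eta_k}(Y) + J(\bar{\rho}_{\eta_k}) = -\ln p_{\eta_{k-1}}(Y) + J(\bar{\rho}_{\eta_{k-1}})$ then
10:    $\alpha = \alpha - \Delta\alpha$, with $\Delta\alpha$ sufficiently small,
11:    $\delta = \delta - \Delta\delta$, with $\Delta\delta$ sufficiently small,
12: Set $\alpha = \epsilon$ and $\delta = 1$.
13: The solution of Problem 5.4.1 is given by:

\[
\hat{\eta} = \arg\min_{\eta} \; -\ln p_\eta(Y) + J(\bar{\rho}_\eta) \tag{5.19}
\]

\[
\hat{w}^y_{\hat{\eta}} = E_{\hat{\eta}}[w^y \mid Y], \qquad \hat{w}^u_{\hat{\eta}} = E_{\hat{\eta}}[w^u \mid Y] \tag{5.20}
\]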

In the remainder of the chapter, the model obtained by (5.6) using (5.20) will be called the “ML + PF” model.
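The loop of Algorithm 11 can be sketched on a toy problem. Everything below is an illustration under explicit assumptions rather than the procedure used in the experiments: the hyperparameter is a scalar, the minimizations (5.18) and (5.19) are replaced by an exhaustive search over a grid, the callables `neg_log_ml` and `dominant_root` and the names `stabilize`, `eps`, `d_alpha`, `d_delta` and `max_iter` are hypothetical, the same small constant `eps` is assumed in steps 7 and 12, and steps 10 and 11 are read as "decrement, then re-solve immediately".

```python
import numpy as np

def penalty(rho, alpha, delta):
    """J of (5.17), elementwise; +inf where rho >= delta (assumed convention)."""
    rho = np.atleast_1d(np.asarray(rho, dtype=float))
    out = np.full(rho.shape, np.inf)
    ok = rho < delta
    out[ok] = (alpha * (delta - rho[ok])) ** (-alpha) - (alpha * delta) ** (-alpha)
    return out

def stabilize(neg_log_ml, dominant_root, grid,
              eps=0.1, d_alpha=0.1, d_delta=0.01, max_iter=50):
    """Grid-search sketch of Algorithm 11 for a scalar hyperparameter."""
    nll = np.array([neg_log_ml(e) for e in grid])
    rho = np.array([dominant_root(e) for e in grid])

    def solve(alpha, delta):
        # Stand-in for (5.18)/(5.19): minimize the penalized cost over the grid.
        cost = nll + penalty(rho, alpha, delta)
        i = int(np.argmin(cost))
        return i, cost[i]

    k = int(np.argmin(nll))                 # steps 1-4: eta_0 and its dominant root
    alpha, prev_cost = 1.0, np.inf          # step 5
    for _ in range(max_iter):
        if rho[k] < 1.0:                    # step 6: stop once the model is stable
            break
        delta = rho[k] * (1.0 + eps)        # step 7
        k, cost = solve(alpha, delta)       # step 8
        if cost == prev_cost:               # step 9: no progress
            alpha = max(alpha - d_alpha, d_alpha)   # step 10, kept positive
            delta = max(delta - d_delta, 1.0)       # step 11, kept >= 1
            k, cost = solve(alpha, delta)
        prev_cost = cost
    k, _ = solve(eps, 1.0)                  # steps 12-13: final penalized problem
    return float(grid[k]), float(rho[k])

# Toy problem: the unpenalized optimum eta = 1.5 has dominant root 1.5 (unstable).
grid = np.linspace(0.0, 2.0, 2001)
eta_hat, rho_hat = stabilize(lambda e: (e - 1.5) ** 2, lambda e: e, grid)
```

In this toy setting the final estimate moves from the unstable optimum to a point just inside the stability region, which is the qualitative behaviour the barrier is meant to produce. A continuous optimizer would replace the grid search in any realistic use.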

Remark 5.4.2. Notice that the iterative procedure which updates $\delta$ and $\alpha$ is needed because, in general, there is no guarantee that one can find an initial value of $\eta \in \Omega_S$.

Note also that the set $\Omega_S$ is always non-empty provided the hyperparameter $\eta$ includes a scaling factor for the kernel, i.e., a scalar variable which multiplies the kernel. In fact, if this is the case, there exist values of $\eta$ which lead to an estimator $\hat{w}^y = 0_{n \times 1}$ which,


5.5 Stabilization via a Full Bayes Sampling Approach

In document Advances in System Identification: Gaussian Regression and Robot Inverse Dynamics Learning (Page 109-112)