3.5 Stability
4.1.3 Model Structure Selection
A model structure is a mathematical relationship between input and output variables that contains unknown parameters. Examples of model structures are transfer functions with adjustable poles and zeros, state space equations with unknown system matrices, and nonlinear parameterized functions.
The system identification process requires that one choose a model structure and apply estimation methods to determine the numerical values of the model parameters. There are two basic approaches to selecting a model structure: black-box modeling and grey-box modeling.
In the black-box modeling approach, the system behavior is inferred simply by relating system inputs and outputs from the data, without needing to know anything about the internal structure of the system. Black-box modeling is usually a trial-and-error process, where one estimates the parameters of various structures and compares the results. Typically, one starts with a simple linear model structure and progresses to more complex structures. One can configure a model structure using the model order. The definition of model order varies depending on the type of model one selects. For example, in the case of a transfer function representation, the model order is related to the number of poles and zeros. For a state-space representation, the model order corresponds to the number of states. If the simple model structures do not produce good models, one can select more complex model structures by:
• Specifying a higher model order for the same linear model structure. A higher model order increases the model flexibility for capturing complex phenomena. However, unnecessarily high orders can make the model less reliable.
• Explicitly modeling the noise by using, for instance, an additive disturbance model, which treats the disturbance as the output of a linear system driven by a white noise source. Using a model structure that explicitly models the additive disturbance can help to improve the accuracy of the measured system component. Furthermore, such a model structure is useful when the main interest is in using the model to predict future response values.
• Using a different linear model structure.
• Using a nonlinear model structure. Nonlinear models have more flexibility in capturing complex phenomena than linear models of similar orders.
Ultimately, one chooses the simplest model structure that provides the best fit to the measured data.
In the grey-box modeling approach, the first principles that describe the behavior of the system are known (e.g., in the form of a set of difference equations) but their parameters are not, and thus need to be estimated from data. In general, the process for building grey-box models involves three steps:
1. Creating a template model structure.
2. Configuring the model parameters with initial values and constraints (if any).
3. Applying an estimation method to the model structure and computing the model parameter values.
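The three steps above can be illustrated with a minimal sketch. It assumes a hypothetical first-order template x(k+1) = a·x(k) + (1 − a)·u(k), y(k) = x(k), with the single unknown parameter a constrained to [0, 1]. The template, the bounds, the function names and the use of a golden-section search are all assumptions made for this example; they are not taken from this thesis.

```python
import math

def simulate(a, u, x0=0.0):
    """Step 1 (template): x(k+1) = a*x(k) + (1 - a)*u(k), y(k) = x(k)."""
    y, x = [], x0
    for uk in u:
        y.append(x)
        x = a * x + (1.0 - a) * uk
    return y

def cost(a, u, y_meas):
    """Sum of squared output errors for a candidate parameter value."""
    return sum((ym - ys) ** 2 for ym, ys in zip(y_meas, simulate(a, u)))

def estimate(u, y_meas, lo=0.0, hi=1.0, tol=1e-6):
    """Steps 2-3: constrain a to [lo, hi] and minimise the cost
    with a golden-section search (assumes a single minimum in the interval)."""
    g = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    while hi - lo > tol:
        if cost(c, u, y_meas) < cost(d, u, y_meas):
            hi = d          # the minimum lies in [lo, d]
        else:
            lo = c          # the minimum lies in [c, hi]
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
    return 0.5 * (lo + hi)

u = [1.0] * 30              # step input
y = simulate(0.8, u)        # noise-free "measurements" generated with a = 0.8
a_hat = estimate(u, y)      # should be close to 0.8
```

With noisy measurements the same procedure applies, but the estimate would only approximate the true parameter.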
In this thesis, the focus is on the black-box modeling approach.
Black-box modeling can be performed in both a nonparametric and a parametric way. In a nonparametric model, the system is described by its impulse response or frequency response. The impulse response reveals the time-domain properties of the system, such as time delay and damping. The impulse response of a dynamic system can be estimated by means of least-squares and correlation analysis methods [112]. The frequency response reveals the complete frequency-domain characteristics of the system, including properties like the natural frequency of the system. The frequency response of a dynamic system can be estimated with Fourier analysis or with spectral analysis methods [32,112]. New approaches to nonparametric system identification also include theories of modern
The other type of black-box model is the parametric model, which describes a system in terms of difference or differential equations, depending on whether the system is represented by a discrete or a continuous model, respectively. As the name implies, such models have a specific analytic form and depend on a certain number of parameters, whose values need to be estimated from data. Parametric model structures for linear systems include transfer-function and state-space model structures.
The transfer function (or polynomial) model structure is a family of linear models that can be represented by the following general time-varying polynomial equation:
$$A_k(q)\,y(k) = \frac{B_k(q)}{F_k(q)}\,u(k - n_k) + \frac{C_k(q)}{D_k(q)}\,e(k) \qquad (4.1)$$
where $A_k$, $B_k$, $C_k$, $D_k$ and $F_k$ are time-varying polynomial matrices expressed in terms of the time-shift operator $q$,¹ $n_k$ is the time delay associated with the input,² and $e(k)$ is a zero-mean stochastic process with a given finite variance, modeling the white noise.
A similar formulation is available for LTI systems, where the polynomial matrices are independent of time:
$$A(q)\,y(k) = \frac{B(q)}{F(q)}\,u(k - n_k) + \frac{C(q)}{D(q)}\,e(k) \qquad (4.2)$$
To estimate polynomial models, it is necessary to specify the model order as a set of integers that represent the number of coefficients of each polynomial of the selected structure. For the general model of Eq. (4.1), this means estimating the numbers $n_a$, $n_b$, $n_c$, $n_d$, and $n_f$, corresponding to the number of coefficient matrices of $A_k$, $B_k$, $C_k$, $D_k$, and $F_k$, respectively. It is also necessary to specify the number of samples $n_k$ corresponding to the input delay, given by the number of samples before the output responds to the input. Typically, one begins modeling using simpler forms of this generalized structure and, if necessary, increases the model complexity.

¹ The time-shift operator $q$ is such that, when applied to an operand $x(k)$ as $q^{\tau} x(k)$, it shifts the time $k$ of its operand $x(\cdot)$ by $\tau$ steps forward or backward, according to the sign of $\tau$; for instance, $q^{-1} x(k) \triangleq x(k-1)$. The notation $A(q)$ represents the time-shift operator applied to a polynomial, such that $A(q)\,y(k) \triangleq \left(1 + \sum_{i=1}^{n_a} a_i q^{-i}\right) y(k) = y(k) + a_1 y(k-1) + \cdots + a_{n_a} y(k-n_a)$. Similarly, for a polynomial matrix, $A(q)\,y(k) \triangleq \left(I + \sum_{i=1}^{n_a} A_i q^{-i}\right) y(k) = y(k) + A_1 y(k-1) + \cdots + A_{n_a} y(k-n_a)$.
² It is worth noting that, in general, there may be a different input delay $n_{k_j}$ for each component $u_j(\cdot)$ of the input $u(\cdot)$. However, to keep the notation simple, we assume the same input delay for all input components.
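The action of a polynomial in the time-shift operator on a sampled signal can be sketched as follows for the scalar case. The function name and the convention that samples before k = 0 are zero are assumptions made for this illustration.

```python
def apply_polynomial(coeffs, x, k):
    """Evaluate A(q) x(k) = x(k) + a_1 x(k-1) + ... + a_na x(k-na).

    coeffs = [a_1, ..., a_na]; samples with a negative index are taken as zero.
    """
    total = x[k]
    for i, a_i in enumerate(coeffs, start=1):
        if k - i >= 0:
            total += a_i * x[k - i]
    return total

x = [1.0, 2.0, 3.0, 4.0]
# A(q) = 1 - 0.5 q^-1 applied at k = 3: x(3) - 0.5 x(2) = 4 - 1.5 = 2.5
value = apply_polynomial([-0.5], x, 3)
```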
There are different specializations of the general structure of Eq. (4.1). Among them, two widely adopted model structures are:
• AutoRegressive with eXogenous variables (ARX):
– LTV formulation:
$$A_k(q)\,y(k) = B_k(q)\,u(k - n_k) + e(k) \qquad (4.3)$$
– LTI formulation:
$$A(q)\,y(k) = B(q)\,u(k - n_k) + e(k) \qquad (4.4)$$
In this model structure, the model order is given by $n_a$ and $n_b$. Usually, the notation ARX($n_a$, $n_b$, $n_k$) is employed to indicate an ARX model with orders $n_a$ and $n_b$ and input delay $n_k$. This is one of the simplest model structures. Its major drawback is that, because the noise term is coupled to the model dynamics, ARX does not allow noise and dynamics to be modeled independently.
• AutoRegressive Moving Average with eXogenous variables (ARMAX):
– LTV formulation:
$$A_k(q)\,y(k) = B_k(q)\,u(k - n_k) + C_k(q)\,e(k) \qquad (4.5)$$
– LTI formulation:
$$A(q)\,y(k) = B(q)\,u(k - n_k) + C(q)\,e(k) \qquad (4.6)$$
In this model structure, the model order is given by $n_a$, $n_b$, and $n_c$. Usually, the notation ARMAX($n_a$, $n_b$, $n_c$, $n_k$) is employed to indicate an ARMAX model with orders $n_a$, $n_b$ and $n_c$ and input delay $n_k$. This model structure extends ARX by adding the polynomial $C$, which gives more flexibility in modeling the noise (using a moving average of white noise). For this reason, it is particularly suited to modeling systems where the dominating disturbances enter at the input.
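As a minimal sketch of the two structures above, the code below simulates a hypothetical SISO LTI ARMAX(1,1,1,1) model, which reduces to ARX(1,1,1) when the noise coefficient is zero, and estimates the ARX parameters by ordinary least squares. The function names, the coefficient values, and the convention B(q) = b1 with a one-sample input delay are assumptions made for this illustration.

```python
import random

def simulate_armax(a1, b1, c1, u, e):
    """y(k) + a1*y(k-1) = b1*u(k-1) + e(k) + c1*e(k-1); c1 = 0 gives ARX(1,1,1)."""
    y = []
    for k in range(len(u)):
        y_prev = y[k - 1] if k > 0 else 0.0
        u_prev = u[k - 1] if k > 0 else 0.0
        e_prev = e[k - 1] if k > 0 else 0.0
        y.append(-a1 * y_prev + b1 * u_prev + e[k] + c1 * e_prev)
    return y

def fit_arx_111(u, y):
    """Least-squares estimate of (a1, b1) in y(k) + a1*y(k-1) = b1*u(k-1) + e(k)."""
    # Regressor phi(k) = [-y(k-1), u(k-1)]; accumulate the 2x2 normal equations.
    s11 = s12 = s22 = r1 = r2 = 0.0
    for k in range(1, len(y)):
        p1, p2 = -y[k - 1], u[k - 1]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        r1 += p1 * y[k]
        r2 += p2 * y[k]
    det = s11 * s22 - s12 * s12
    return (s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det

# Response to a unit noise impulse with zero input: the C polynomial
# reshapes how the noise enters, while the poles (from A) stay the same.
e_imp, zeros = [1.0, 0.0, 0.0, 0.0], [0.0] * 4
arx_noise = simulate_armax(-0.5, 1.0, 0.0, zeros, e_imp)    # [1.0, 0.5, 0.25, 0.125]
armax_noise = simulate_armax(-0.5, 1.0, 0.5, zeros, e_imp)  # [1.0, 1.0, 0.5, 0.25]

# Noise-free ARX data generated with a1 = -0.7, b1 = 0.5 and a random binary input.
rng = random.Random(0)
u = [rng.choice([-1.0, 1.0]) for _ in range(200)]
y = simulate_armax(-0.7, 0.5, 0.0, u, [0.0] * len(u))
a_hat, b_hat = fit_arx_111(u, y)    # recovers (-0.7, 0.5) up to rounding
```

On noise-free data the least-squares estimate recovers the generating coefficients; with noisy data it would only approximate them.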
Several techniques have been proposed in the literature to estimate both the model
order and the input delay (e.g., see [91,112]). For instance, in [112], a model
structure selection based on statistical hypothesis testing is proposed, whereby each model structure is evaluated according to a specific goodness-of-fit criterion such as Akaike's Information Criterion (AIC) [124] or the Minimum Description Length (MDL) criterion [141].
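To illustrate how such criteria trade goodness of fit against complexity, the sketch below uses one common form of each criterion, AIC = N ln V + 2d and MDL = N ln V + d ln N, where N is the number of samples, V the mean squared prediction error and d the number of estimated parameters. The exact expressions used in [124] and [141] may differ by scaling or constant terms, and the loss values in the example are hypothetical, not results from this thesis.

```python
import math

def aic(n_samples, loss, n_params):
    """One common form of Akaike's criterion: N ln V + 2 d."""
    return n_samples * math.log(loss) + 2 * n_params

def mdl(n_samples, loss, n_params):
    """One common form of the MDL criterion: N ln V + d ln N."""
    return n_samples * math.log(loss) + n_params * math.log(n_samples)

def select(candidates, n_samples, criterion):
    """candidates maps a structure label to (loss, n_params);
    return the label with the smallest criterion value."""
    return min(candidates, key=lambda name: criterion(n_samples, *candidates[name]))

# Hypothetical losses: the fit barely improves beyond second order.
candidates = {
    "ARX(1,1,1)": (0.50, 2),
    "ARX(2,2,1)": (0.20, 4),
    "ARX(4,4,1)": (0.199, 8),
}
best_aic = select(candidates, 200, aic)
best_mdl = select(candidates, 200, mdl)
```

In this example both criteria prefer the second-order structure, since the marginal loss reduction of the fourth-order model does not offset its larger parameter count.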
The other type of parametric linear model structure is the state-space model. In this model, the relationship between input, output and noise signals is represented, by means of auxiliary state variables x(·), as a system of first-order difference equations instead of one or more nth-order difference equations:
x(k + 1) = A(k)x(k) + B(k)u(k) + w(k), (4.7a)
y(k) = C(k)x(k) + D(k)u(k) + v(k), (4.7b)
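with the noise covariance matrix:
$$E\left[\begin{pmatrix} w(p) \\ v(p) \end{pmatrix} \begin{pmatrix} w^T(q) & v^T(q) \end{pmatrix}\right] = \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \delta_{pq} \geq 0 \qquad (4.8)$$
where $w(\cdot) \in \mathbb{R}^{n_w}$ is the disturbance input vector, $v(\cdot) \in \mathbb{R}^{n_v}$ is the noise vector, and $\delta_{pq}$ is the Kronecker delta. Usually, $w(\cdot)$ and $v(\cdot)$ are assumed to be zero-mean stationary white noise stochastic processes with a given variance.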
An alternative and frequently used form for state-space model structures is the one where the disturbances w(·) and v(·) are related to each other by the Kalman gain matrix K, such that w(k) = Kv(k). In this case, the noise vector v(k) is usually denoted by e(k), and the resulting state-space model structure is called the innovation form:
x(k + 1) = A(k)x(k) + B(k)u(k) + K(k)e(k), (4.9a)
y(k) = C(k)x(k) + D(k)u(k) + e(k), (4.9b)
with the noise covariance matrix:
$$E\left[e(p)\,e^T(q)\right] = S\,\delta_{pq} \geq 0 \qquad (4.10)$$
Similar formulations are available for LTI systems, where the matrices are independent of time. Thus, for Eq. (4.7), the following LTI form is obtained:
x(k + 1) = Ax(k) + Bu(k) + w(k), (4.11a)
y(k) = Cx(k) + Du(k) + v(k), (4.11b)
while the LTI innovation form is given by:
x(k + 1) = Ax(k) + Bu(k) + Ke(k), (4.12a)
y(k) = Cx(k) + Du(k) + e(k) (4.12b)
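The LTI innovation form of Eq. (4.12) can be simulated with a few lines of code. The sketch below is a minimal illustration using plain Python lists; the helper names and the second-order example matrices are assumptions chosen for this example, not a model identified in this thesis.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def simulate_innovation(A, B, C, D, K, u, e, x0):
    """Iterate x(k+1) = A x(k) + B u(k) + K e(k), y(k) = C x(k) + D u(k) + e(k).

    u and e are lists of input and innovation vectors; returns the output vectors.
    """
    x, ys = list(x0), []
    for uk, ek in zip(u, e):
        ys.append([c + d + n for c, d, n in zip(matvec(C, x), matvec(D, uk), ek)])
        x = [a + b + g for a, b, g in zip(matvec(A, x), matvec(B, uk), matvec(K, ek))]
    return ys

# Hypothetical SISO model of order 2 (two states), driven by a unit input impulse
# with zero innovations.
A = [[0.5, 1.0], [0.0, 0.25]]
B = [[0.0], [1.0]]
C = [[1.0, 0.0]]
D = [[0.0]]
K = [[0.0], [0.0]]
u = [[1.0], [0.0], [0.0], [0.0]]
e = [[0.0]] * 4
ys = simulate_innovation(A, B, C, D, K, u, e, [0.0, 0.0])
# ys == [[0.0], [0.0], [1.0], [0.75]]
```

Here the impulse reaches the output only after passing through both states, which is the state-space counterpart of a second-order difference equation.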
As with the transfer function model structure, the state-space model structure can also include time delays. However, unlike the transfer function representation, there are different types of time delays:
• input delays, which model delays at the input;
• output delays, which model delays at the output;
• I/O delays, which model independent transport delays from a given input to a given output of a MIMO transfer function model;
• internal delays, which model interconnection of systems with input, output, or I/O delays, including feedback loops with delays. Internal delays can arise, for instance, from:
– concatenating state-space models with input and output delays,
– feeding back a delayed signal, and
– converting a MIMO transfer function with I/O delays to a state-space model.
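As a small worked example of how a discrete-time delay can be absorbed into the state, consider a hypothetical scalar system with a one-sample input delay, x(k+1) = a x(k) + b u(k−1), y(k) = x(k). This system is chosen only for illustration and is not a construction taken from this thesis. Introducing the auxiliary state z(k) = u(k−1) gives an equivalent delay-free model of order two:

```latex
\begin{aligned}
\begin{pmatrix} x(k+1) \\ z(k+1) \end{pmatrix}
&= \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}
   \begin{pmatrix} x(k) \\ z(k) \end{pmatrix}
 + \begin{pmatrix} 0 \\ 1 \end{pmatrix} u(k), \\
y(k) &= \begin{pmatrix} 1 & 0 \end{pmatrix}
        \begin{pmatrix} x(k) \\ z(k) \end{pmatrix}.
\end{aligned}
```

Each delayed sample handled this way adds one state, so representing delays explicitly keeps the model order smaller than absorbing them into the state vector.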
The state-space model structure is a good choice for quick estimation because it requires only the estimation of the model order (an integer equal to the dimension of the state vector, which relates to the number of delayed inputs and outputs used in the corresponding linear difference equation) and, possibly, one or more delays.
Compared to nonparametric models, parametric models might provide a more accurate estimation if one has prior knowledge about the system dynamics to determine parameters like model orders and time delays. Nonparametric model estimation is more efficient, but often less accurate, than parametric estimation. One possible use of nonparametric models is as a preliminary estimation step to obtain useful information about a system before applying parametric model estimation. For example, one can use nonparametric model estimation to determine whether the system requires preconditioning, what the time delay of the system is, what model order to select, and so on. Another possible use of nonparametric model estimation is for parametric model verification. For instance, one can compare the Bode plot³ of a parametric model with the frequency response of the nonparametric model.
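As an illustration of using a nonparametric estimate to obtain the time delay, the sketch below applies correlation analysis. For a white input with unit variance, the cross-correlation between output and input at lag τ approximates the impulse response at τ, so the first lag with a clearly nonzero value indicates the delay. The simulated system, the detection threshold and the function names are assumptions made for this example.

```python
import random

def cross_correlation(y, u, max_lag):
    """Sample cross-correlation R_yu(tau) = (1/N) * sum_k y(k) u(k - tau), tau = 0..max_lag."""
    n = len(y)
    return [sum(y[k] * u[k - tau] for k in range(tau, n)) / n
            for tau in range(max_lag + 1)]

def estimate_delay(y, u, max_lag, threshold):
    """Return the first lag whose cross-correlation magnitude exceeds the threshold."""
    for tau, value in enumerate(cross_correlation(y, u, max_lag)):
        if abs(value) > threshold:
            return tau
    return None

# Hypothetical system y(k) = 0.8 y(k-1) + u(k-3), excited by a random binary
# (white, unit-variance) input; the true input delay is 3 samples.
rng = random.Random(1)
u = [rng.choice([-1.0, 1.0]) for _ in range(2000)]
y = []
for k in range(len(u)):
    y_prev = y[k - 1] if k > 0 else 0.0
    u_del = u[k - 3] if k >= 3 else 0.0
    y.append(0.8 * y_prev + u_del)

delay = estimate_delay(y, u, max_lag=10, threshold=0.5)
```

The delay found this way can then be passed as $n_k$ when estimating a parametric structure such as ARX.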
In this thesis, the focus is on parametric model structures.