Robust output feedback stabilization via risk-sensitive control
Valery A. Ugrinovskii∗, Ian R. Petersen
School of Electrical Engineering, Australian Defence Force Academy, Canberra ACT 2600, Australia Received 12 February 2001; received in revised form 24 October 2001; accepted 21 November 2001
Abstract
We consider a problem of robust linear quadratic Gaussian (LQG) control for discrete-time stochastic uncertain systems with partial state measurements. For a finite-horizon case, the problem was recently introduced by Petersen et al. (IEEE Trans. Automat. Control 45 (2000) 398). In this paper, an infinite-horizon extension of the results of Petersen et al. (IEEE Trans. Automat. Control 45 (2000) 398) is discussed. We show that for a broad class of uncertain systems under consideration, a controller constructed in terms of the solution to a specially parameterized risk-sensitive stochastic control problem absolutely stabilizes the stochastic uncertain system. © 2002 Elsevier Science Ltd. All rights reserved.
Keywords: Robust control; LQG control; Stochastic control; Stochastic risk-sensitive control
1. Introduction
For systems with partial state measurements, the linear quadratic Gaussian (LQG) methodology has proved to be a useful technique for designing output feedback controllers. However, the development of a robust version of the LQG technique has been a challenging problem. It is known that LQG controllers may lead to very poor robustness in terms of gain and phase robustness margins (Doyle, 1978). Therefore, considerable efforts have recently been undertaken to develop a robust version of the LQG control synthesis methodology. For different uncertainty models, this problem has been attacked using the $H^\infty$ and mixed $H^2/H^\infty$ control approach (e.g., see Mustafa & Bernstein, 1991; Haddad et al., 1991; Başar & Bernhard, 1995), the guaranteed cost and quadratic stabilization approach (Fu, de Souza, & Xie, 1991), and the minimax optimization approach based on sum quadratic constraints (Moheimani, Savkin, & Petersen, 1995).
It has recently been demonstrated in Petersen, James, and Dupuis (2000a) that one possibility to enhance the robustness of an LQG controller is to use a risk-sensitive control approach to the controller synthesis. The risk-sensitive control approach has a number of attractive features. Apart from the enhanced robustness of a controller, this approach leads to a tractable minimax design procedure which can be viewed as a direct extension of the existing LQG control technique; see Petersen, Ugrinovskii, and Savkin (2000b). Note that the results in Petersen et al. (2000a), which concern the output feedback robust control of linear systems, are incomplete in that they address the finite-horizon control problem and lead to time-varying controllers, whereas in practical robust controller design problems, time-invariant controllers are most useful. The derivation of time-invariant output feedback robust controllers requires that an infinite-horizon version of the partial information minimax optimal control problem of Petersen et al. (2000a) be considered. Although a steady-state version of the minimax optimal controller was discussed by Petersen et al. (2000a), the stabilizing properties of this controller were not addressed. In this paper, we rigorously address an infinite-horizon robust control problem which can be regarded as an extension of the problems considered by Petersen et al. (2000a) to the infinite time horizon case. The contribution of this paper is to show that the resulting optimal control schemes guarantee the absolute stability of the closed-loop system against the class of admissible uncertainty perturbations under consideration; see also Ugrinovskii and Petersen (2001a), Petersen et al. (2000b) and Dupuis et al. (1998) in the continuous-time case.

This work was supported by The Australian Research Council and The Defence Science and Technology Organization. This paper was not presented at any IFAC meeting.
∗Corresponding author. Tel.: 6268-8219; fax: +61-2-6268-8443.
0005-1098/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S0005-1098(01)00288-6
The main result of the paper is a robust LQG control synthesis procedure based on a pair of discrete-time algebraic Riccati equations arising in risk-sensitive optimal control; see Glover and Doyle (1988). We show that solutions to a certain specially parameterized risk-sensitive control problem provide us with a controller which guarantees an optimal upper bound on the time-averaged performance of the closed-loop system in the presence of admissible uncertainties. This result is analogous to the corresponding continuous-time result of Ugrinovskii and Petersen (2001a).
2. Definitions
We adopt the stochastic uncertain system model introduced in Petersen et al. (2000a). This model includes a nominal system model and also a description of a class of admissible uncertain perturbations. In this section, we present a modification of the model of Petersen et al. (2000a) which is required in order to obtain an infinite-horizon minimax LQG control result. The nominal system and measurement dynamics are described by the following equations:
\[
\begin{aligned}
x_{t+1} &= A x_t + B_1 u_t + B_2 w_{t+1},\\
z_t &= C_1 x_t + D_1 u_t,\\
y_{t+1} &= C_2 x_t + v_{t+1}.
\end{aligned} \tag{1}
\]
In the above equations, $x_t \in \mathbf{R}^n$ is the state, $z_t \in \mathbf{R}^p$ is the uncertainty output, and $y_t \in \mathbf{R}^q$ is the measured output. Also, $w_1^\infty := \{w_t\}_{t=1}^\infty$ and $v_1^\infty := \{v_t\}_{t=1}^\infty$ are two sequences¹ of i.i.d. Gaussian variables with zero mean and covariance matrices $W > 0$, $\Gamma > 0$, respectively. It is assumed that $w_t \in \mathbf{R}^r$, $v_t \in \mathbf{R}^q$, $t = 1, 2, \ldots$. We let $P^w$, $P^v$ denote the marginal probability measures of $w_t$, $v_t$. The sequence $w_1^\infty$ corresponds to the system noise, while the sequence $v_1^\infty$ is referred to as the measurement noise. For simplicity, we suppose that the reference system noise and the reference measurement noise are independent. All coefficients in Eqs. (1) are assumed to be constant matrices of corresponding dimensions. Also for the sake of simplicity, we assume that $C_1' D_1 = 0$.

The initial state of system (1) is a Gaussian random vector $x_0 \in \mathbf{R}^n$ with mean $\check{x}_0$ and non-singular covariance matrix $Y_0$. The corresponding Gaussian probability measure associated with $x_0$ is denoted $P^x$. The random variable $x_0$ is assumed to be independent of the reference noise sequences $w_1^\infty$, $v_1^\infty$. The joint probability measure on the initial condition vector $x_0$ and the reference noise sequences $w_1^T$ and $v_1^T$ is denoted by $P^T$:
\[
P^T(dx_0 \times dw_1^T \times dv_1^T) = P^x(dx_0) \times \prod_{t=1}^{T} P^w(dw_t) \times \prod_{t=1}^{T} P^v(dv_t).
\]
This probability measure is defined on measurable sets of the filtration $\{\mathcal{F}_T,\ T = 0, 1, \ldots\}$. For each $T = 0, 1, \ldots$, the $\sigma$-algebra $\mathcal{F}_T$ is generated by the initial condition $x_0$ of system (1) and the noise sequences $w_1^T$, $v_1^T$, and is completed by including all sets of $P^T$-probability zero. The sets of the $\sigma$-algebra $\mathcal{F}_T$ are subsets of the 'noise space' $\Omega$,
\[
\Omega = \{(x_0, w_1^\infty, v_1^\infty)\}.
\]
From Kolmogorov's theorem, the family of probability measures $P^T$ gives rise to a probability measure $P$ defined on the measurable space $(\Omega, \mathcal{F})$, $\mathcal{F} = \{\mathcal{F}_T,\ T = 0, 1, \ldots\}$.

¹ Throughout the paper, $\alpha_t^T$, $\alpha_t^\infty$ will denote sequences of random variables as follows: $\alpha_t^T := \{\alpha_t, \alpha_{t+1}, \ldots, \alpha_T\}$; $\alpha_t^\infty := \{\alpha_t, \alpha_{t+1}, \ldots\}$.
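As an illustration of the nominal model (1), the following minimal sketch simulates a scalar instance of the system under the reference (Gaussian, independent) noise. All numerical values, the function name `simulate_nominal`, and the use of a fixed open-loop control sequence are assumptions made for this example only; they do not come from the paper.

```python
import random


def simulate_nominal(a, b1, b2, c1, d1, c2, w_var, v_var, x0, controls, seed=0):
    """Simulate a scalar instance of system (1) for a given control sequence.

    Returns (x, z, y), where x has len(controls) + 1 entries and
    y[t] stores the measurement y_{t+1} = c2 * x_t + v_{t+1}.
    """
    rng = random.Random(seed)
    x = [x0]
    z = []
    y = []
    for u in controls:
        # Reference noises: independent zero-mean Gaussians.
        w_next = rng.gauss(0.0, w_var ** 0.5)
        v_next = rng.gauss(0.0, v_var ** 0.5)
        z.append(c1 * x[-1] + d1 * u)               # uncertainty output z_t
        y.append(c2 * x[-1] + v_next)               # measurement y_{t+1}
        x.append(a * x[-1] + b1 * u + b2 * w_next)  # state update x_{t+1}
    return x, z, y


if __name__ == "__main__":
    xs, zs, ys = simulate_nominal(0.9, 1.0, 1.0, 1.0, 0.0, 1.0,
                                  w_var=1.0, v_var=0.5, x0=0.0,
                                  controls=[0.0] * 5, seed=1)
    print(xs)
```

Setting both noise variances to zero reduces the sketch to the deterministic recursion, which is a convenient sanity check of the indexing convention in (1).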
2.1. Stochastic uncertain system
We now introduce an uncertainty model. The uncertainty will be modeled in terms of perturbations of the corresponding joint probability distributions of the noise inputs $w_1^\infty$, $v_1^\infty$ and the initial condition of the system. It is demonstrated in Section 2.3 that such a model provides a meaningful description of stochastic uncertain systems modeled using a linear fractional transformation (LFT); also, see Petersen et al. (2000a,b) and Ugrinovskii and Petersen (1999). The rigorous definition of our uncertainty model is the following. Let $Q := \{Q_1, Q_2, \ldots\}$ be a collection of conditional probability measures referred to as an uncertainty; i.e., for each $t = 1, 2, \ldots$, $Q_t(dw_t \times dv_t \mid x_0, w_1^{t-1}, v_1^{t-1})$ is a probability measure defined on measurable subsets of $\mathbf{R}^r \times \mathbf{R}^q$. It is assumed that
\[
Q_t(dw_t \times dv_t \mid x_0, w_1^{t-1}, v_1^{t-1}) \ll P(dw_t \times dv_t \mid x_0, w_1^{t-1}, v_1^{t-1}), \qquad t = 1, 2, \ldots \tag{2}
\]
Here, the notation $Q \ll P$ means that the probability measure $Q$ is absolutely continuous with respect to the reference probability measure $P$. Using the collection $Q$, the joint probability measure of the initial condition vector $x_0$ and the perturbed noise sequences $w_1^T$ and $v_1^T$ is defined by
\[
Q^T(dx_0 \times dw_1^T \times dv_1^T) = P^x(dx_0) \times \prod_{t=1}^{T} Q_t(dw_t \times dv_t \mid x_0, w_1^{t-1}, v_1^{t-1}).
\]
It follows from Eq. (2) that $Q^T \ll P^T$. Also, for any $\mathcal{F}_{T-1}$-measurable set $\Lambda \subset \Omega$, $Q^T(\Lambda) = Q^{T-1}(\Lambda)$.

As in Petersen et al. (2000a), we will require that each uncertainty $Q = \{Q_1, Q_2, \ldots\}$ has the property
\[
h(Q^T \| P^T) < \infty \quad \text{for all } T = 1, 2, \ldots
\]
Here, $h(Q^T \| P^T)$ is the relative entropy functional; e.g., see Dupuis and Ellis (1997). This requirement will allow us to rule out the possibility of singular uncertain probability measure perturbations.

The class of uncertain systems considered in this paper comprises the systems of the form (1) in which the marginal conditional probability measures of the noise inputs $w_1^\infty$, $v_1^\infty$ are required to satisfy the above requirements. The set of such uncertainties $Q$ will be denoted as $\mathcal{Q}$. From the definition and properties of the relative entropy functional, it follows that the set $\mathcal{Q}$ is convex.
2.2. Relative entropy constraint on stochastic uncertainty
LFT uncertainty models require that a magnitude constraint be imposed on the uncertainty. For example, it is often required that the unmodeled dynamics transfer function has a bounded $H^\infty$ norm. As shown in Ugrinovskii and Petersen (1999, 2001a) and Petersen et al. (2000b), for stochastic LFT systems with additive noise, the constraint on the $H^\infty$ norm of the uncertainty transfer function can be replaced by a constraint on the relative entropy of the associated perturbation probability measure. For discrete-time systems, this leads to an extension of the sum quadratic constraint uncertainty description (Moheimani et al., 1995); see Section 2.3.
Let $E^{Q^T}$ denote the expectation with respect to the probability measure $Q^T$.
Definition 1. Let $d$ be a given positive constant. A collection of conditional probability measures $Q \in \mathcal{Q}$ is said to define an admissible uncertainty if the following stochastic uncertainty constraint is satisfied:
\[
\frac{1}{T} h(Q^T \| P^T) \le d + \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} \|z_s\|^2 + \varepsilon(T) \tag{3}
\]
for all $T = 1, 2, \ldots$. In (3), $z_t$ is the uncertainty output defined by Eq. (1). Also, $\varepsilon(T) \to 0$ as $T \to \infty$.
We denote the set of uncertainties $Q \in \mathcal{Q}$ satisfying condition (3) by $\Xi$. The corresponding probability measures $Q^T$ are also called admissible probability measures.
Observe that the set $\Xi$ is not empty. Indeed, consider the reference probability measure $P$. Since the Gaussian random variables $w_t$, $v_t$ are independent, the corresponding reference conditional probability measures are simply the marginal probability measures:
\[
P_t(dw_t \times dv_t \mid x_0, w_1^{t-1}, v_1^{t-1}) = P^w(dw_t) \times P^v(dv_t).
\]
Also, the fact that $h(P^T \| P^T) = 0$ and the condition $d > 0$ imply that the corresponding collection of conditional probabilities $P := \{P_1, P_2, \ldots\}$ is admissible, $P \in \Xi$. Note that in this case, constraint (3) is satisfied strictly. Hence, $P$ is an interior point of the set $\Xi$.
Fig. 1. An uncertain control system.
2.3. A connection between uncertainty input signals and uncertain probability measures
In order to give further insight into Definition 1, we show that the uncertainty class introduced in Section 2.2 includes uncertainty models which often arise in control systems, such as $H^\infty$ norm-bounded uncertainty and bounded exogenous uncertainty.
Consider the uncertain system shown in Fig. 1. The system is described by the equations
\[
\begin{aligned}
x_{t+1} &= A x_t + B_1 u_t + B_2(\xi_t^\Delta + \xi_t^{ex} + \tilde{w}_{t+1}),\\
z_t &= C_1 x_t + D_1 u_t,\\
y_{t+1} &= C_2 x_t + (\eta_t^\Delta + \eta_t^{ex} + \tilde{v}_{t+1}),
\end{aligned} \tag{4}
\]
and is driven by the system and measurement noises $\tilde{w}_t$, $\tilde{v}_t$, exogenous disturbance processes $\xi_t^{ex}$, $\eta_t^{ex}$ and uncertainty inputs $\xi_t^\Delta$, $\eta_t^\Delta$ which are generated by a stable linear time-invariant uncertainty. The latter is due to the presence of unmodeled dynamics which are described by a stable transfer function $\Delta(z)$. It is assumed that
\[
\sup_{\omega \in [-\pi, \pi]} \left\| \begin{bmatrix} W^{-1/2} & 0 \\ 0 & \Gamma^{-1/2} \end{bmatrix} \Delta(e^{j\omega}) \right\| \le \frac{1}{2}. \tag{5}
\]
Condition (5) constitutes an $H^\infty$ type norm bound on the size of unmodeled dynamics in system (4).
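System (4) is considered in a complete probability space $(\Omega, \mathcal{F}, \tilde{Q})$; $\tilde{w}_1^\infty$, $\tilde{v}_1^\infty$ are Gaussian white noise processes in this probability space, $E^{\tilde{Q}} \tilde{w}_t \tilde{w}_t' = W$, $E^{\tilde{Q}} \tilde{v}_t \tilde{v}_t' = \Gamma$. It follows from (5) that the processes $\xi_0^{\Delta,\infty}$, $\eta_0^{\Delta,\infty}$ are adapted to the filtration $\{\mathcal{F}_T,\ T = 0, 1, \ldots\}$ and
\[
E^{\tilde{Q}} \sum_{t=0}^{T-1} \left( \|W^{-1/2} \xi_t^\Delta\|^2 + \|\Gamma^{-1/2} \eta_t^\Delta\|^2 \right) \le \frac{1}{2} E^{\tilde{Q}} \sum_{t=0}^{T-1} \|z_t\|^2.
\]
Also, it is assumed that the exogenous disturbances $\xi_0^{ex,\infty}$, $\eta_0^{ex,\infty}$ are adapted to the filtration $\{\mathcal{F}_T,\ T = 0, 1, \ldots\}$ and
\[
\frac{1}{T} E^{\tilde{Q}} \sum_{t=0}^{T-1} \left( \|W^{-1/2} \xi_t^{ex}\|^2 + \|\Gamma^{-1/2} \eta_t^{ex}\|^2 \right) \le d + \varepsilon(T), \tag{6}
\]
where $d > 0$ is a given constant, and $\varepsilon(T) \to 0$ as $T \to \infty$. Condition (6) imposes a bound on the size of uncertain exogenous perturbations in system (4). In particular, one can see the role of the constant $d$ from Definition 1 in this example. The introduction of this constant reflects the fact that the power seminorm of the input disturbances $\xi_0^{ex,\infty}$, $\eta_0^{ex,\infty}$ does not exceed $d$.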
The above constraints on the exogenous disturbance and unmodeled dynamics lead to an uncertainty constraint which can be represented in the form of the relative entropy uncertainty constraint (3). Indeed, the above assumptions imply that the processes
\[
\xi_t := \xi_t^\Delta + \xi_t^{ex}, \qquad \eta_t := \eta_t^\Delta + \eta_t^{ex}
\]
satisfy the condition
\[
\frac{1}{2T} E^{\tilde{Q}} \sum_{t=0}^{T-1} \left( \|W^{-1/2} \xi_t\|^2 + \|\Gamma^{-1/2} \eta_t\|^2 \right) \le d + \frac{1}{2T} E^{\tilde{Q}} \sum_{t=0}^{T-1} \|z_t\|^2 + \varepsilon(T). \tag{7}
\]
In order to express constraint (7) in the form of condition (3), we introduce the probability measure transformation
\[
\left. \frac{dP}{d\tilde{Q}} \right|_{\mathcal{F}_T} = \prod_{t=0}^{T-1} \exp\left\{ -\xi_t' W^{-1} \tilde{w}_{t+1} - \frac{1}{2} \xi_t' W^{-1} \xi_t \right\} \times \exp\left\{ -\eta_t' \Gamma^{-1} \tilde{v}_{t+1} - \frac{1}{2} \eta_t' \Gamma^{-1} \eta_t \right\}. \tag{8}
\]
Using the discrete-time version of Girsanov's theorem (Elliott, Aggoun, & Moore, 1994), we conclude that under the probability measure $P$ the sequences $w_1^\infty$, $v_1^\infty$ defined by the equations
\[
w_{t+1} = \tilde{w}_{t+1} + \xi_t, \qquad v_{t+1} = \tilde{v}_{t+1} + \eta_t, \qquad t = 0, 1, 2, \ldots \tag{9}
\]
are sequences of i.i.d. Gaussian random variables. Furthermore, when considered with respect to the probability measure $P$, system (4) becomes a system of the form (1) driven by the i.i.d. Gaussian white noises $w_t$, $v_t$. Finally, using the chain rule (Dupuis & Ellis, 1997), we obtain
\[
h(\tilde{Q}^T \| P^T) = \frac{1}{2} E^{\tilde{Q}} \sum_{t=0}^{T-1} \left( \|W^{-1/2} \xi_t\|^2 + \|\Gamma^{-1/2} \eta_t\|^2 \right).
\]
Here, $\tilde{Q}^T$ and $P^T$ denote the restrictions of the probability measures $\tilde{Q}$ and $P$ to $(\Omega, \mathcal{F}_T)$. Thus, from (7), condition (3) of Definition 1 follows. Therefore, the uncertain system considered in this section belongs to the class of uncertain systems defined in Definition 1. It can be shown in a similar fashion that Definition 1 encompasses some other important classes of uncertainty arising in control systems such as, for example, cone-bounded uncertainty.
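The relative entropy identity above can be checked numerically in the simplest setting: a single scalar noise sample whose mean is shifted by a constant. The sketch below is illustrative only; it assumes a scalar noise with variance `var` and a deterministic shift `shift` (neither comes from the paper), and compares a midpoint-rule evaluation of the relative entropy between N(shift, var) and N(0, var) with the closed form shift²/(2·var), which is the scalar, one-step analogue of the expression obtained from the chain rule.

```python
import math


def relative_entropy_numeric(shift, var, half_width=10.0, steps=4000):
    """Midpoint-rule estimate of h(Q || P) for Q = N(shift, var), P = N(0, var)."""
    sd = math.sqrt(var)
    lo = shift - half_width * sd
    dx = 2.0 * half_width * sd / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        # log(dQ/dP)(x), computed from the Gaussian exponents to avoid underflow.
        log_ratio = (x * x - (x - shift) ** 2) / (2.0 * var)
        q_density = math.exp(-(x - shift) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)
        total += q_density * log_ratio * dx
    return total


def relative_entropy_closed_form(shift, var):
    """Closed form 0.5 * shift^2 / var for a mean shift of a Gaussian."""
    return 0.5 * shift ** 2 / var


if __name__ == "__main__":
    print(relative_entropy_numeric(1.5, 2.0), relative_entropy_closed_form(1.5, 2.0))
```

In the multi-step setting of this section the shifts are random and adapted, so the closed form holds only in expectation and summed over time, as stated in the text.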
2.4. Absolutely stabilizing control
In this paper, our attention will be restricted to linear output-feedback controllers of the form
\[
\begin{aligned}
\hat{x}_{t+1} &= A_c \hat{x}_t + B_c y_{t+1},\\
u_t &= C_c \hat{x}_t,
\end{aligned} \tag{10}
\]
where $\hat{x} \in \mathbf{R}^{\hat{n}}$ is the state of the controller, $A_c \in \mathbf{R}^{\hat{n} \times \hat{n}}$, $B_c \in \mathbf{R}^{\hat{n} \times q}$, and $C_c \in \mathbf{R}^{m \times \hat{n}}$. Let $\mathcal{U}$ denote this class of linear controllers. Note that controller (10) is adapted to the filtration $\{\mathcal{Y}_t,\ t \ge 1\}$ generated by the observation process $y$; $\mathcal{Y}_t = \sigma\{y_s,\ s = 1, \ldots, t\}$.

The closed-loop system corresponding to system (1) and controller (10) is described by a linear difference equation of the form
\[
\bar{x}_{t+1} = \bar{A} \bar{x}_t + \bar{B} \bar{w}_{t+1}, \qquad z_t = \bar{C} \bar{x}_t, \qquad u_t = [0 \;\; C_c]\, \bar{x}_t. \tag{11}
\]
In Eq. (11), $\bar{x} = [x' \;\; \hat{x}']' \in \mathbf{R}^{n + \hat{n}}$ and $\bar{w}_t = [w_t' \;\; v_t']'$ are the state and the noise input of the closed-loop system. Also, the following notation is used:
\[
\bar{A} = \begin{bmatrix} A & B_1 C_c \\ B_c C_2 & A_c \end{bmatrix}, \qquad
\bar{B} = \begin{bmatrix} B_2 & 0 \\ 0 & B_c \end{bmatrix}, \qquad
\bar{C} = [C_1 \;\; D_1 C_c]. \tag{12}
\]
In the sequel, we will consider a subclass of controllers (10) which satisfy some additional stabilizability and observability requirements. We say that a controller $K$ belongs to the class $\mathcal{U}_0 \subset \mathcal{U}$ if for this controller, the corresponding matrix pair $(\bar{A}, \bar{B})$ is stabilizable² and the pair $(A_c, C_c)$ is observable.

It was shown in Petersen et al. (2000a) that in the steady-state case, the minimax optimal LQG controller has the structure given in Eq. (10). As mentioned above, we wish to investigate the stabilizing properties of this optimal controller as $T \to \infty$. First, we introduce a definition of stabilizability which allows for uncertain control systems driven by additive noise whose solutions are not square summable.

² For example, this condition holds if the matrix pairs $(A, B_2)$ and
Definition 2. A controller $K$ of the form (10) is said to be an absolutely stabilizing controller for the uncertain system (1) with uncertainty satisfying the relative entropy constraint (3), if the system state process $x_0^\infty$, the controller state process $\hat{x}_0^\infty$ and the control process $u_0^\infty$ defined by the closed-loop system (11) corresponding to this controller satisfy the following condition. There exist constants $c_1 > 0$, $c_2 > 0$ such that for any admissible uncertainty $Q \in \Xi$,
\[
\limsup_{T \to \infty} \frac{1}{T} \left[ E^{Q^T} \sum_{t=0}^{T-1} \left( \|x_t\|^2 + \|\hat{x}_t\|^2 + \|u_t\|^2 \right) + h(Q^T \| P^T) \right] \le c_1 + c_2 d. \tag{13}
\]
In particular, for the uncertain system shown in Fig. 1, the above property means that in a stable closed-loop system, the time-averaged norms of the system state, the controller state, the control input and the uncertainty inputs are bounded regardless of the value of the transfer function $\Delta(z)$ and the disturbances $\xi_0^{ex,\infty}$, $\eta_0^{ex,\infty}$.

In the special case of the uncontrolled uncertain system (1), (3), the above definition reduces to the property of absolute stability of the system. That is, the uncertain system (1), (3) with $u_t = 0$ is absolutely stable if for any admissible uncertainty $Q \in \Xi$,
\[
\limsup_{T \to \infty} \frac{1}{T} \left[ E^{Q^T} \sum_{t=0}^{T-1} \|x_t\|^2 + h(Q^T \| P^T) \right] \le c_1 + c_2 d. \tag{14}
\]
Lemma 1. Suppose the stochastic nominal closed-loop system (11) is mean square stable; i.e.,
\[
\limsup_{T \to \infty} \frac{1}{T} E \sum_{t=0}^{T-1} \|\bar{x}_t\|^2 < \infty. \tag{15}
\]
Also, suppose the pair $(\bar{A}, \bar{B})$ is stabilizable. Then, the matrix $\bar{A}$ must be stable.

The proof of this lemma is identical to the proof of the corresponding continuous-time result in Ugrinovskii and Petersen (2001a,b) and Petersen et al. (2000b).
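To make the conclusion of Lemma 1 concrete, the following sketch assembles the closed-loop matrix from (12) for a scalar plant and a scalar controller of the form (10) and checks its stability. Two points are assumptions of this example rather than statements from the paper: "stable" is read in the usual discrete-time sense (all eigenvalues strictly inside the unit circle), and all numerical values and function names are made up for illustration.

```python
import cmath


def closed_loop_matrix(a, b1, c2, a_c, b_c, c_c):
    """Closed-loop state matrix from (12) for scalar plant and controller data."""
    return [[a, b1 * c_c],
            [b_c * c2, a_c]]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix via the characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + disc) / 2.0), abs((trace - disc) / 2.0))


if __name__ == "__main__":
    # Unstable plant (a = 1.2) in closed loop with a hypothetical controller.
    a_bar = closed_loop_matrix(a=1.2, b1=1.0, c2=1.0, a_c=0.0, b_c=1.0, c_c=-0.7)
    print(spectral_radius_2x2(a_bar) < 1.0)
```

With the controller gain `c_c` set to zero the same check reports the open-loop spectral radius 1.2, so the plant alone is not stable in this example.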
3. The main results
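The steady-state minimax LQG control problem considered in Petersen et al. (2000a) was to find a controller which attained an upper value
\[
\inf_{K} \sup_{Q \in \Xi} J(K, Q) \tag{16}
\]
of the cost functional
\[
J(K, Q) = \limsup_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{t=0}^{T-1} F(x_t, u_t), \qquad F(x, u) := x' R x + u' G u, \tag{17}
\]
where $R$ and $G$ are symmetric positive definite matrices, $R \in \mathbf{R}^{n \times n}$, $G \in \mathbf{R}^{m \times m}$. Here, $x_t$ denotes the solution to system (1) corresponding to a given controller $K$ and an admissible uncertainty $Q \in \Xi$. The minimax optimal controller for problem (1), (3), (17) proposed in Petersen et al. (2000a) was constructed using a pair of parameter dependent algebraic Riccati equations
\[
X_\tau = A' \left[ X_\tau^{-1} + B_1 (G + \tau D_1' D_1)^{-1} B_1' - \frac{1}{\tau} B_2 W B_2' \right]^{-1} A + R + \tau C_1' C_1, \tag{18}
\]
\[
Y_\tau = A \left[ Y_\tau^{-1} + C_2' \Gamma^{-1} C_2 - \frac{1}{\tau} R - C_1' C_1 \right]^{-1} A' + B_2 W B_2'. \tag{19}
\]
Here, $\tau$ is a positive constant which was chosen as follows. For each $\tau > 0$ such that the Riccati equations (18), (19) admit positive definite stabilizing solutions satisfying the conditions
\[
Y_\tau^{-1} + C_2' \Gamma^{-1} C_2 - \frac{1}{\tau} R - C_1' C_1 > 0, \qquad
X_\tau^{-1} - \frac{1}{\tau} B_2 W B_2' > 0, \qquad
X_\tau^{-1} - \frac{1}{\tau} Y_\tau > 0, \tag{20}
\]
define the quantity
\[
V_\tau = -\frac{\tau}{2} \log\det\left[ I - \frac{1}{\tau} Y_\tau (R + \tau C_1' C_1) \right] - \frac{\tau}{2} \log\det\left[ I - \frac{1}{\tau} \Sigma X_\tau \left( I - \frac{1}{\tau} Y_\tau X_\tau \right)^{-1} \right], \tag{21}
\]
where
\[
\Sigma := K \left[ \Gamma + C_2 \left( I - \frac{1}{\tau} Y_\tau (R + \tau C_1' C_1) \right)^{-1} Y_\tau C_2' \right] K', \qquad
K := A \left[ Y_\tau^{-1} + C_2' \Gamma^{-1} C_2 - \frac{1}{\tau} R - C_1' C_1 \right]^{-1} C_2' \Gamma^{-1}. \tag{22}
\]
The set of the parameters $\tau > 0$ satisfying the above requirement is denoted $\mathcal{T}$. In order to obtain the minimax optimal LQG controller, the parameter $\tau > 0$ is chosen to achieve the infimum in
\[
\inf_{\tau \in \mathcal{T}} (V_\tau + \tau d). \tag{23}
\]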
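For intuition about the control Riccati equation (18), the sketch below solves its scalar version, X = a²/(1/X + b1²/(g + τ·d1²) − b2²·w/τ) + r + τ·c1², by a plain fixed-point iteration, and then checks the second condition in (20). This is a minimal illustration under stated assumptions: all numerical values are invented, the quantities `c1_sq` and `d1_sq` stand for the scalars C1'C1 and D1'D1, and a simple fixed-point iteration is not guaranteed in general to converge or to return the stabilizing solution; it is used here only because it contracts for the chosen data.

```python
def riccati_x_scalar(a, b1, b2, w, r, g, c1_sq, d1_sq, tau, iters=200):
    """Fixed-point iteration for the scalar version of Riccati equation (18)."""
    s = b1 ** 2 / (g + tau * d1_sq) - b2 ** 2 * w / tau
    q = r + tau * c1_sq
    x = q  # initial guess
    for _ in range(iters):
        inner = 1.0 / x + s
        if inner <= 0.0:
            raise ValueError("bracketed term is not positive; tau is too small")
        x = a * a / inner + q
    return x


def riccati_x_residual(x, a, b1, b2, w, r, g, c1_sq, d1_sq, tau):
    """Residual of the scalar equation (18) at a candidate solution x."""
    s = b1 ** 2 / (g + tau * d1_sq) - b2 ** 2 * w / tau
    return x - (a * a / (1.0 / x + s) + r + tau * c1_sq)


if __name__ == "__main__":
    params = dict(a=0.5, b1=1.0, b2=1.0, w=1.0, r=1.0, g=1.0,
                  c1_sq=0.1, d1_sq=0.1, tau=10.0)
    x_tau = riccati_x_scalar(**params)
    # Second condition in (20): 1/X - (1/tau) * b2^2 * w > 0.
    print(x_tau, 1.0 / x_tau - params["b2"] ** 2 * params["w"] / params["tau"] > 0.0)
```

In a design along the lines of (23), such a solve would be repeated over a grid of τ values, discarding those for which the conditions in (20) fail.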
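Using this optimal value of $\tau$, the corresponding minimax optimal steady-state controller $K_\tau$ is defined by the equations
\[
u_t = L_\tau \hat{x}_t, \tag{24}
\]
\[
L_\tau := -(G + \tau D_1' D_1)^{-1} B_1' \left[ X_\tau^{-1} + B_1 (G + \tau D_1' D_1)^{-1} B_1' - \frac{1}{\tau} B_2 W B_2' \right]^{-1} A \left( I - \frac{1}{\tau} Y_\tau X_\tau \right)^{-1},
\]
where $\hat{x}_t$ is generated by the filter
\[
\hat{x}_{t+1} = A \hat{x}_t + B_1 u_t + K (y_{t+1} - C_2 \hat{x}_t) + \frac{1}{\tau} A \left[ Y_\tau^{-1} + C_2' \Gamma^{-1} C_2 - \frac{1}{\tau} R - C_1' C_1 \right]^{-1} (R + \tau C_1' C_1)\, \hat{x}_t, \qquad \hat{x}_0 = \check{x}_0. \tag{25}
\]
The above solution to the minimax optimal LQG control problem was obtained in Petersen et al. (2000a) by letting $T \to \infty$ in the corresponding finite-horizon problem. However, as we let $T$ approach $\infty$, two important issues arise which were not addressed in Petersen et al. (2000a).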
(1) The ability of an optimal controller to stabilize the system is an important issue in optimal control on an infinite time interval. In our case, it is desired that the resulting minimax optimal controller stabilizes the system (1) for all admissible uncertainties $Q \in \Xi$.
(2) It was shown in Petersen et al. (2000a) that on a finite interval, the minimax optimal LQG controller exists if and only if the associated parameter dependent difference Riccati equations have positive definite solutions. In the infinite-horizon case, the design methodology outlined in Petersen et al. (2000a) provides only a sufficient condition for the minimax optimal controller to exist. However, it is often useful to know that the "only if" claim holds as well. Necessary conditions usually give a good indication as to how conservative the proposed controller is.

These issues are addressed in the statements given below. Theorem 1 shows that the existence of stabilizing solutions to the algebraic Riccati equations (18) and (19) is a sufficient as well as a necessary condition for the infinite-horizon minimax optimal LQG control problem (26) to have a solution. Furthermore, Theorem 2 shows that the minimax LQG controller proposed in Petersen et al. (2000a) is an absolutely stabilizing controller.
Theorem 1. (i) If the set $\mathcal{T}$ is non-empty, then the minimax optimal control problem
\[
\inf_{K \in \mathcal{U}} \sup_{Q \in \Xi} J(K, Q) \tag{26}
\]
has a finite value.
(ii) Conversely, if there exists an absolutely stabilizing controller $K \in \mathcal{U}_0$ which attains the infimum in (26), then the set $\mathcal{T}$ is not empty.
The infinite-horizon version of the minimax optimal LQG control result of Petersen et al. (2000a) now follows from Theorem 1.
Theorem 2. Suppose that $\mathcal{T} \neq \emptyset$ and $\tau_* \in \mathcal{T}$ attains the infimum in (23). Then, the corresponding controller $K_* = K_{\tau_*}$ defined by (24), (25) guarantees that the worst case of the cost functional (17) does not exceed the value (23). Furthermore, this controller is an absolutely stabilizing controller for the stochastic uncertain system (1) satisfying the relative entropy constraint (3).
Remark 1. Note that the controller resulting from Theorem 2 is the steady-state limit of the finite-horizon minimax optimal controller proposed in Petersen et al. (2000a). Also, the worst-case performance (23) is the steady-state limit of the performance guaranteed by the finite-horizon minimax optimal controller.
The proofs of the above theorems will be given in Sections 3.3 and 3.4.
3.1. Preliminary remarks
The proof of Theorem 1 relies on a duality relationship between free energy and relative entropy established in Dupuis and Ellis (1997) and Dai Pra, Meneghini and Runggaldier (1996). Associated with system (1), consider the parameter dependent risk-sensitive cost functional
\[
I_{T,\tau}(K) = \frac{\tau}{T} \log E \exp\left\{ \frac{1}{2\tau} \sum_{t=0}^{T-1} F_\tau(x_t, u_t) \right\}, \tag{27}
\]
where $\tau > 0$ is a given constant and $K$ is a controller defined by Eq. (10). Also,
\[
F_\tau(x, u) := x' (R + \tau C_1' C_1) x + u' (G + \tau D_1' D_1) u. \tag{28}
\]
When applied to system (1) and the risk-sensitive cost functional (27), the duality result states that for each admissible controller $K$,
\[
I_{T,\tau}(K) = \sup_{Q^T :\, h(Q^T \| P^T) < \infty} \frac{1}{T} \left[ \frac{1}{2} E^{Q^T} \sum_{t=0}^{T-1} F_\tau(x_t, u_t) - \tau h(Q^T \| P^T) \right]. \tag{29}
\]
The use of the above duality result is a key step in this paper in that it enables us to replace a guaranteed cost control problem by the following risk-sensitive optimal control problem:
Find an output feedback controller of form (10) which attains the infimum in
\[
\inf_{K} \lim_{T \to \infty} I_{T,\tau}(K). \tag{30}
\]
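The duality between free energy and relative entropy can be checked by hand in a one-dimensional toy case, which may help in reading (29). The sketch below is an illustration under explicit assumptions, not the paper's setting: the reference measure is P = N(0, 1), the cost is θ·x²/2 with θ < 1, and the supremum is taken only over zero-mean Gaussian measures Q = N(0, s2). For this quadratic cost the maximizing measure is known to lie in that family, so the restricted supremum coincides with the free energy log E exp(θ·x²/2) = −½·log(1 − θ).

```python
import math


def free_energy(theta):
    """log E exp(theta * x^2 / 2) for x ~ N(0, 1), valid for theta < 1."""
    return -0.5 * math.log(1.0 - theta)


def dual_objective(theta, s2):
    """E_Q[theta * x^2 / 2] - h(Q || P) for Q = N(0, s2) and P = N(0, 1)."""
    expected_cost = 0.5 * theta * s2
    relative_entropy = 0.5 * (s2 - 1.0 - math.log(s2))
    return expected_cost - relative_entropy


def dual_sup_on_grid(theta, grid_size=20000, s2_max=20.0):
    """Maximize the dual objective over a grid of variances in (0, s2_max]."""
    return max(dual_objective(theta, s2_max * (i + 1) / grid_size)
               for i in range(grid_size))


if __name__ == "__main__":
    theta = 0.5
    print(free_energy(theta), dual_sup_on_grid(theta))
```

The maximizing variance is 1/(1 − θ): the worst-case measure inflates the noise, and it does so more aggressively as θ approaches the point where the free energy blows up.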
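In this paper, we use the solution to this risk-sensitive control problem presented in Whittle (1990). Note that similar results concerning problem (30) were obtained in Iglesias, Mustafa, and Glover (1990) and Glover and Doyle (1988),³ where the problem was shown to be equivalent to a problem of constructing an output feedback controller which maximizes the entropy functional
\[
I_d(K, \tau) = \frac{\tau}{2\pi} \int_{-\pi}^{\pi} \log\left| \det\left( I - H_\tau(e^{j\omega}) H_\tau^*(e^{j\omega}) \right) \right| d\omega \tag{31}
\]
over the set of stabilizing output feedback controllers satisfying the following bound:
\[
\|H_\tau\|_\infty := \sup_{\omega \in [-\pi, \pi]} \left\| H_\tau^*(e^{j\omega}) H_\tau(e^{j\omega}) \right\| < 1. \tag{32}
\]
Here, $H_\tau(s)$ is the transfer function of the closed-loop system defined by the equations
\[
\begin{aligned}
x^\dagger_{t+1} &= A x^\dagger_t + B_1 u_t + [B_2 W^{1/2} \;\; 0]\, \zeta_{t+1},\\
z^\dagger_t &= \begin{bmatrix} \frac{1}{\sqrt{\tau}} R^{1/2} \\ 0 \\ C_1 \end{bmatrix} x^\dagger_t + \begin{bmatrix} 0 \\ \frac{1}{\sqrt{\tau}} G^{1/2} \\ D_1 \end{bmatrix} u_t,\\
y_{t+1} &= C_2 x^\dagger_t + [0 \;\; \Gamma^{1/2}]\, \zeta_{t+1},
\end{aligned} \tag{33}
\]
and a stabilizing linear controller $u_t = K(y_1^t)$. In particular, for a linear controller of the form (10), the transfer function $H_\tau(s)$ is given by
\[
H_\tau(s) := \begin{bmatrix} \frac{1}{\sqrt{\tau}} R^{1/2} & 0 \\ 0 & \frac{1}{\sqrt{\tau}} G^{1/2} C_c \\ C_1 & D_1 C_c \end{bmatrix} (sI - \bar{A})^{-1} \begin{bmatrix} B_2 W^{1/2} & 0 \\ 0 & B_c \Gamma^{1/2} \end{bmatrix}. \tag{34}
\]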
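A frequency-domain bound of the kind appearing in (32) can be checked numerically by sampling the unit circle. The sketch below does this for a hypothetical scalar transfer function H(z) = c·b/(z − a) with |a| < 1; the function names, the numerical values and the grid size are assumptions for illustration, and a finite grid only yields a lower estimate of the true supremum.

```python
import cmath
import math


def freq_response(a, b, c, omega):
    """Value of the scalar transfer function c * b / (z - a) at z = exp(j * omega)."""
    return c * b / (cmath.exp(1j * omega) - a)


def hinf_norm_on_grid(a, b, c, n=2001):
    """Largest gain over n equally spaced frequencies in [-pi, pi]."""
    return max(abs(freq_response(a, b, c, -math.pi + 2.0 * math.pi * k / (n - 1)))
               for k in range(n))


if __name__ == "__main__":
    gain = hinf_norm_on_grid(a=0.5, b=0.4, c=1.0)
    # For 0 < a < 1 the peak gain is at omega = 0 and equals |c * b| / (1 - a).
    print(gain, gain < 1.0)
```

For matrix-valued transfer functions such as (34), the same sampling idea applies with the absolute value replaced by the largest singular value at each frequency.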
It follows from the results of Iglesias et al. (1990), Glover and Doyle (1988) and Whittle (1990) that the solution to the above maximum entropy problem and, equivalently, the solution to the risk-sensitive control problem (30) is given by the central solution to the corresponding $H^\infty$ control problem; see Başar and Bernhard (1995).⁴ This controller is defined by Eqs. (24) and (25) and involves positive definite solutions to the Riccati equations (18) and (19). Furthermore, the minimum risk-sensitive cost corresponding to this controller is given by Eq. (21).

³ Iglesias et al. (1990) and Glover and Doyle (1988) consider a slightly different information pattern which allows the control $u_t$ to instantaneously access $y_{t+1}$. This leads to an optimal controller which is not strictly causal.
⁴ Also, the links between the problem considered in this paper and $H^\infty$ control were discussed in Petersen et al. (2000a).
3.2. Absolutely stabilizing properties of risk-sensitive control
In this section, we establish the absolutely stabilizing properties of the risk-sensitive controller.
Lemma 2. Let $K$ be a controller which guarantees a finite risk-sensitive cost:
\[
V_K := \lim_{T \to \infty} I_{T,\tau}(K) < \infty.
\]
Then, the controller $K$ is an absolutely stabilizing controller for the stochastic uncertain system (1) satisfying the relative entropy constraint (3). Furthermore,
\[
\sup_{Q \in \Xi} J(K, Q) \le V_K + \tau d. \tag{35}
\]
Proof. We wish to prove that the controller $K$ satisfies condition (13) of Definition 2.
Consider an uncertain system with an admissible uncertainty $Q \in \Xi$ governed by the controller $K$. Recall that the system is described by Eq. (1), where the control process $u_0^\infty$ is generated by the given controller $K$. Also, the joint probability distribution of the noise inputs $w_1^\infty$, $v_1^\infty$ and the initial condition $x_0$ is defined by the given collection of conditional probability measures $Q \in \Xi$.
Note that since the uncertainty $Q$ is admissible, the corresponding probability measures $Q^T$ have the property $h(Q^T \| P^T) < \infty$ for every $T = 1, 2, \ldots$. This allows us to apply the duality condition (29). From this equation, we obtain
\[
\limsup_{T \to \infty} \left\{ \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} F(x_s, u_s) + \frac{\tau}{T} \left[ \frac{1}{2} E^{Q^T} \sum_{s=0}^{T-1} \|z_s\|^2 - h(Q^T \| P^T) \right] \right\} \le V_K. \tag{36}
\]
Furthermore, since $Q \in \Xi$, satisfaction of Eq. (35) follows from (36) and (3).
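The step from (36) and (3) to (35) can be spelled out as follows. This is a sketch of the omitted argument, using only the quantities already defined; here $\tau$ denotes the positive parameter of the risk-sensitive cost (27), $\|\cdot\|$ the Euclidean norm, and $\varepsilon(T)$ the vanishing term in constraint (3).

```latex
% Constraint (3), rearranged and multiplied by tau > 0:
\frac{\tau}{T}\left[\frac{1}{2}E^{Q^T}\sum_{s=0}^{T-1}\|z_s\|^2 - h(Q^T\|P^T)\right]
  \;\ge\; -\tau d - \tau\varepsilon(T).
% Hence, for every T,
\frac{1}{2T}E^{Q^T}\sum_{s=0}^{T-1}F(x_s,u_s)
  \;\le\;
  \frac{1}{2T}E^{Q^T}\sum_{s=0}^{T-1}F(x_s,u_s)
  + \frac{\tau}{T}\left[\frac{1}{2}E^{Q^T}\sum_{s=0}^{T-1}\|z_s\|^2 - h(Q^T\|P^T)\right]
  + \tau d + \tau\varepsilon(T).
% Taking limsup as T -> infinity, using (36) and epsilon(T) -> 0:
J(K,Q) \;\le\; V_K + \tau d \qquad \text{for every } Q \in \Xi .
```

Since the right-hand side does not depend on the particular admissible uncertainty, taking the supremum over the admissible set gives (35).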
We now show that the controller $K$ is absolutely stabilizing. Since the matrices $R$ and $G$ are positive definite, inequality (35) implies
\[
\limsup_{T \to \infty} \frac{1}{T} E^{Q^T} \sum_{s=0}^{T-1} \left( \|x_s\|^2 + \|u_s\|^2 \right) \le \alpha (V_K + \tau d), \tag{37}
\]
where $\alpha > 0$ is a constant which depends only on $R$ and $G$. Next, we show that there exist constants $c_1, c_2 > 0$ such that
\[
\limsup_{T \to \infty} \frac{1}{T} h(Q^T \| P^T) \le c_1 + c_2 d. \tag{38}
\]
To this end, we note that for any $Q \in \Xi$,
\[
\sup_{t \ge T} \frac{1}{t} h(Q^t \| P^t) \le d + \hat{\varepsilon}_T + \sup_{t \ge T} \frac{1}{2t} E^{Q^t} \sum_{s=0}^{t-1} \|z_s\|^2. \tag{39}
\]
Here, $\hat{\varepsilon}_T := \sup_{t \ge T} |\varepsilon(t)| \to 0$ as $T \to \infty$. Inequality (39) takes into account the fact that the probability measure $Q^t$ satisfies the relative entropy constraint (3). Therefore, inequalities (37) and (39) imply
\[
\limsup_{T \to \infty} \frac{1}{T} h(Q^T \| P^T) \le d + \limsup_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} \|z_s\|^2 \le c_1 + c_2 d,
\]
where the constants $c_1$, $c_2$ are defined by $V_K$, $\tau$, $\alpha$, $C_1$, $D_1$, and hence are independent of $Q \in \Xi$.
To complete the proof, it remains to prove that the process $\hat{x}_0^\infty$ satisfies condition (13) of Definition 2. Indeed, consider the uncertain closed-loop system (11) corresponding to the controller $K$ and an admissible uncertainty $Q \in \Xi$. Using the duality relation between free energy and relative entropy established in Dupuis and Ellis (1997) and Dai Pra et al. (1996), we obtain
\[
\sup_{Q^T \in \Xi} \left[ \frac{\delta}{2} E^{Q^T} \sum_{t=0}^{T-1} \|\bar{x}_t\|^2 - h(Q^T \| P^T) \right]
\le \sup_{Q^T :\, h(Q^T \| P^T) < \infty} \left[ \frac{\delta}{2} E^{Q^T} \sum_{t=0}^{T-1} \|\bar{x}_t\|^2 - h(Q^T \| P^T) \right]
= \log E \exp\left\{ \frac{\delta}{2} \sum_{t=0}^{T-1} \|\bar{x}_t\|^2 \right\}.
\]
From this equation, it follows that for any $Q \in \Xi$,
\[
\limsup_{T \to \infty} \frac{\delta}{2T} E^{Q^T} \sum_{t=0}^{T-1} \|\bar{x}_t\|^2 \le \limsup_{T \to \infty} \frac{1}{T} h(Q^T \| P^T) + \lim_{T \to \infty} \frac{1}{T} \log E \exp\left\{ \frac{\delta}{2} \sum_{t=0}^{T-1} \|\bar{x}_t\|^2 \right\}. \tag{40}
\]
Note that the process $\bar{x}_0^\infty$ on the left-hand side of Eq. (40) corresponds to the uncertain system (11), while the process $\bar{x}_0^\infty$ that appears in the second term on the right-hand side of Eq. (40) is generated by the nominal closed-loop system (11). Since this nominal system is driven by the Gaussian process $\bar{w}_1^\infty$, one can choose a sufficiently small $\delta > 0$ such that the second term on the right-hand side of Eq. (40) is finite; e.g., see Glover and Doyle (1988) and Whittle (1990). This term is independent of $Q$. Since we have already proved (38), condition (13) follows.
Remark 2. It follows from Lemma 2 that the controller solving the risk-sensitive control problem (30) is an absolutely stabilizing controller for the uncertain system under consideration.
Lemma 3. Suppose⁵ that for any $K \in \mathcal{U}_0$,
\[
\sup_{Q \in \mathcal{Q}} J(K, Q) = \infty. \tag{41}
\]
If there exists an absolutely stabilizing controller $\tilde{K} \in \mathcal{U}_0$ such that
\[
\sup_{Q \in \Xi} J(\tilde{K}, Q) < c < \infty, \tag{42}
\]
then there exists a $\tau > 0$ such that $V_\tau < \infty$.
Proof. Since the given controller $\tilde{K} \in \mathcal{U}_0$ absolutely stabilizes the uncertain system (1), (3), condition (13) of Definition 2 is satisfied. Note that it follows from (13) and Lemma 1 that the matrix $\bar{A}$ corresponding to the controller $\tilde{K}$ is stable. Also, condition (13) implies that there exist positive constants $\tilde{c} > 0$, $\delta > 0$ such that for all $Q \in \Xi$,
\[
\liminf_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} F(x_s, u_s) + \delta \liminf_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} \left( \|x_s\|^2 + \|\hat{x}_s\|^2 \right) \le \tilde{c}. \tag{43}
\]
Consider the functionals
\[
\begin{aligned}
G_0(Q) &:= \tilde{c} - \liminf_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} F(x_s, u_s) - \delta \liminf_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{s=0}^{T-1} \left( \|x_s\|^2 + \|\hat{x}_s\|^2 \right),\\
G_1(Q) &:= -d - \liminf_{T \to \infty} \frac{1}{T} \left[ \frac{1}{2} E^{Q^T} \sum_{s=0}^{T-1} \|z_s\|^2 - h(Q^T \| P^T) \right].
\end{aligned} \tag{44}
\]
It is readily proved that satisfaction of condition (43) implies that the following condition is satisfied:
\[
\text{If } G_1(Q) \le 0 \text{ then } G_0(Q) \ge 0. \tag{45}
\]
The proof of this fact follows along the same lines as the proof of the corresponding fact in Ugrinovskii and Petersen (2001a); see also Petersen et al. (2000b). Furthermore, the set of uncertainties satisfying the condition $G_1(Q) \le 0$ has an interior point; see the remark following Definition 1. Also, it follows from the properties of the relative entropy functional that the functionals $G_0(\cdot)$ and $G_1(\cdot)$ are convex.

We have now verified all of the conditions needed to apply the Lagrange multiplier result (e.g., see Luenberger, 1969). Indeed, Theorem 1 on p. 217 of Luenberger (1969) implies that there exists a constant $\tau \ge 0$ such that
\[
\liminf_{T \to \infty} \frac{1}{T} E^{Q^T} \sum_{t=0}^{T-1} \frac{1}{2\tau} F(x_t, u_t) + \frac{\delta}{\tau} \liminf_{T \to \infty} \frac{1}{2T} E^{Q^T} \sum_{t=0}^{T-1} \left( \|x_t\|^2 + \|\hat{x}_t\|^2 \right) + \liminf_{T \to \infty} \frac{1}{T} \left[ \frac{1}{2} E^{Q^T} \sum_{t=0}^{T-1} \|z_t\|^2 - h(Q^T \| P^T) \right] \le \frac{\tilde{c}}{\tau} - d \tag{46}
\]
for all $Q \in \mathcal{Q}$. Also, condition (41) rules out the possibility that $\tau = 0$. Thus, $\tau > 0$.

⁵ Condition (41) is similar to a corresponding condition in Petersen et al. (2000a). This condition was introduced in Petersen et al. (2000a) to rule out the case $\tau = 0$ in the proof of Theorem 3.1. In the continuous-time case, this condition holds if the pair $(\bar{A}, \bar{B})$ corresponding to the given controller $K \in \mathcal{U}$ is controllable; e.g., see Ugrinovskii and Petersen (2001a) and Petersen et al. (2000b). Obviously, in the discrete-time case, condition (41) can be replaced by a similar controllability condition.
Condition (46) implies the satisfaction of condition (32) for the transfer function H2(s) corresponding to the system
(33) and the given controller ˜K. This claim can be estab-lished using the same arguments as those used in proving the corresponding fact in Ugrinovskii and Petersen (2001b) and Petersen et al. (2000b). For the sake of brevity, we only outline the proof.
Consider an augmented version of the closed loop system corresponding to system (33) and the given controller ˜K:
Wx†t+1= WA Wx†t + B2W1=2 0 0 Bc1=2 8t; (47) z:;t= 1 √2R1=2 0 0 √1 2G1=2Cc C1 D1Cc : 2I 0 0 : 2I Wx† t:
This system is supposed to be driven by a deterministic input disturbance $\zeta|_0^\infty$ which has a finite autocorrelation matrix and also has a power spectral density function (Mäkilä, Partington, & Norlander, 1998; Ljung, 1987). We denote the set of such signals by $P_+$. The transfer function of system (47) is denoted $H_{\tau,\delta}(z)$. We will establish condition (32) by showing that
$$\|H_{\tau,\delta}\|_\infty \le 1. \tag{48}$$
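As a rough numerical illustration of a norm bound of the form (48), the sketch below estimates the $H_\infty$ norm of a scalar discrete-time transfer function by sampling its magnitude on the unit circle. The transfer function used here, $0.4/(z-0.5)$, is a hypothetical stand-in and is not system (47); the function names and the grid size are likewise assumptions. Gridding only yields a lower estimate of the true norm, so this is a sanity check rather than a proof of the bound.

```python
import cmath
import math


def hinf_norm_estimate(transfer, n_grid=2048):
    """Estimate the supremum of |transfer(z)| over the unit circle by gridding.

    `transfer` maps a complex z to a complex value. Sampling gives a lower
    estimate of the true supremum; it is a sketch, not a certified bound.
    """
    peak = 0.0
    for k in range(n_grid):
        omega = 2.0 * math.pi * k / n_grid   # frequency in [0, 2*pi)
        z = cmath.exp(1j * omega)            # point on the unit circle
        peak = max(peak, abs(transfer(z)))
    return peak


def toy_transfer(z):
    # Hypothetical stable first-order system (pole at 0.5), not system (47).
    return 0.4 / (z - 0.5)


peak_gain = hinf_norm_estimate(toy_transfer)
satisfies_bound = peak_gain <= 1.0   # analogue of the check in (48)
```

For this toy system the peak gain is attained at $z = 1$, where $|z - 0.5|$ is smallest on the unit circle, so the estimate equals $0.4/0.5 = 0.8$ and the bound holds.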
The proof is by establishing a contradiction. First, we observe that the failure of condition (48) to hold must lead to the existence of a sequence of deterministic inputs $\{\zeta^N|_0^\infty,\ N = 1, 2, \ldots\} \subset P_+$ such that
$$\frac{1}{2T}\sum_{t=0}^{T-1}\left(\|z_{\delta,t}\|^2 - \|\zeta^N_t\|^2\right) \ge N - \varepsilon \tag{49}$$
for all $T > T(\varepsilon, N)$, where $\varepsilon > 0$ is a sufficiently small constant. Then, we show that the satisfaction of condition (49) must lead to a contradiction with (46). Indeed, for each input $\zeta^N|_0^\infty$, a collection of conditional probability measures $Q^N = \{Q^N_1, Q^N_2, \ldots\} \in \mathcal{Q}$ can be defined using a corresponding probability measure transformation; see Section 2.3. To this end, the input $\zeta^N|_0^\infty$ has to be partitioned as follows:
&N t N t = W1=2 0 0 1=2 8N t :
Also as in Section 2.3, the probability measures QN;T are
de6ned, h(QN;TPT) =1 2 T−1 t=0 W−1=2&N t 2+ −1=2Nt 2 : (50) Next, it is shown using (49) that the closed loop system (11) driven by the deterministic inputs &N
0∞, N0∞and considered
on the probability space (W; FT; QN;T), satis6es the
fol-lowing condition: MN := limT→∞2T1 EQN; T T−1 t=0 1 2F(xt; ut) + : 2 Wxt2 + zt2− W−1=2&N t 2− −1=2Nt 2 ¿ N: (51) Letting N → ∞ in Eq. (51) and using Eq. (50), we obtain the following contradiction with (46):
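$$\sup_{Q\in\mathcal{Q}}\left\{\liminf_{T\to\infty}\frac{1}{T}E^{Q^T}\sum_{t=0}^{T-1}\frac{1}{2\tau}F(x_t,u_t) + \frac{\delta}{\tau}\liminf_{T\to\infty}\frac{1}{2T}E^{Q^T}\sum_{t=0}^{T-1}\|\bar x_t\|^2 + \liminf_{T\to\infty}\frac{1}{T}\left[\frac{1}{2}E^{Q^T}\sum_{t=0}^{T-1}\|z_t\|^2 - h(Q^T\|P^T)\right]\right\} \ge \sup_{N>0} M_N = \infty.$$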
The above contradiction shows that condition (48) must hold. Since condition (48) implies (32), the lemma now follows from condition (32) and the results of Glover and Doyle (1988) and Whittle (1990).
3.3. Proof of Theorem 1
Part (i) of the theorem follows from Lemma 2. Indeed, if $\tau \in \mathcal{T}$, then the corresponding Riccati equations (18) and (19) have stabilizing solutions. This ensures that the corresponding controller (24) solves the maximum entropy control problem and, equivalently, the risk-sensitive control problem (30). The corresponding value of the risk-sensitive cost in (30) is therefore finite. Thus, controller (24) corresponding to $\tau \in \mathcal{T}$ also satisfies the conditions of Lemma 2. From this lemma, part (i) of Theorem 1 follows.
Part (ii) follows from Lemma 3.
3.4. Proof of Theorem 2
It has been observed that for each $\tau \in \mathcal{T}$, controller (24), (25) solves the corresponding risk-sensitive optimal control problem. Hence, from Lemma 2, this controller is an absolutely stabilizing controller. Also, from Eq. (35) in Lemma 2, it follows that the risk-sensitive controller corresponding to $\tau^*$ guarantees that the cost is not greater than the value defined by Eq. (23).
4. Illustrative example
To illustrate our theory, we consider an example discussed in Petersen et al. (2000a). In this example, an uncertain system was considered which had the structure shown in Fig. 1. The system is defined by the equations
$$\begin{aligned} x_{t+1} &= \begin{bmatrix} 1.1052 & 0.1105 \\ 0 & 1.1052 \end{bmatrix} x_t + \begin{bmatrix} 0.0053 \\ 0.1052 \end{bmatrix} u_t + \begin{bmatrix} 0.0053 \\ 0.1052 \end{bmatrix} (\xi_t + \tilde w_{t+1}), \\ z_t &= 0.5\,u_t, \\ y_{t+1} &= [1 \;\; 0]\,x_t + (\eta_t + \tilde v_{t+1}). \end{aligned} \tag{52}$$
The reference noise signals $w_t$, $v_t$ are assumed to be Gaussian white noise signals with covariances $W = 1$ and $10^{-4}$, respectively. The uncertainty is assumed to satisfy the stochastic sum constraint (7) with $d = 10^{-8}$. Also in Petersen et al. (2000a), cost functional (17) was considered in which
$$R = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \qquad G = 10^{-4}.$$
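To see why feedback is needed in this example, the following sketch (an illustration only, not part of the original example) takes the state matrix of system (52), computes its spectral radius from the characteristic polynomial, and evaluates the running average $\frac{1}{T}\sum_{t=0}^{T-1}\|x_t\|^2$ for the uncontrolled, noise-free recursion $x_{t+1} = A x_t$. The initial state, the horizon and the helper names are assumptions made for illustration.

```python
import cmath

# State matrix of system (52), as given in the example.
A = [[1.1052, 0.1105],
     [0.0, 1.1052]]


def spectral_radius_2x2(m):
    """Largest eigenvalue modulus of a 2x2 matrix via its characteristic polynomial."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return max(abs((trace + root) / 2.0), abs((trace - root) / 2.0))


def running_average_sq_norm(m, x0, horizon):
    """Return (1/T) * sum_{t<T} ||x_t||^2 for T = 1..horizon, with x_{t+1} = m x_t."""
    x = list(x0)
    total = 0.0
    averages = []
    for t in range(horizon):
        total += sum(v * v for v in x)          # add ||x_t||^2
        averages.append(total / (t + 1))        # average over t + 1 samples
        x = [sum(m[i][j] * x[j] for j in range(2)) for i in range(2)]
    return averages


rho = spectral_radius_2x2(A)   # about 1.1052, so the open loop is unstable
# Assumed initial state [1, 0] and horizon 50, chosen only for illustration.
averages = running_average_sq_norm(A, [1.0, 0.0], 50)
```

Because the spectral radius exceeds one, the running average grows without bound in open loop, which is the behaviour a stabilizing controller is expected to prevent.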
For this system, the robust LQG controller designed in Petersen et al. (2000a) using the steady-state Riccati
Fig. 2. The graph of $\frac{1}{T}\sum_{t=0}^{T-1}\|x_t\|^2$ versus $T$. Curves are labeled by the value of $\Delta$.
equations (18), (19) is given by the transfer function
$$H(z) = \frac{-76.85z + 78.03}{z^2 + 0.4018z + 0.386} \tag{53}$$
which corresponds to $\tau^* = 1.18$. To verify the stabilizing properties of this controller, the closed loop system (52), (53) was simulated with different values of $\Delta$ and the graphs of
$$\frac{1}{T}\sum_{t=0}^{T-1}\|x_t\|^2, \qquad \frac{1}{T}\sum_{t=0}^{T-1}\|u_t\|^2, \qquad \frac{1}{T}\sum_{t=0}^{T-1}\|\xi_t\|^2$$
were plotted. It was observed that the above quantities showed a tendency to remain bounded over the time of simulation. One of these graphs is shown in Fig. 2.
References
Başar, T., & Bernhard, P. (1995). H∞-optimal control and related minimax design problems: A dynamic game approach (2nd ed.). Boston: Birkhäuser.
Dai Pra, P., Meneghini, L., & Runggaldier, W. (1996). Connections between stochastic control and dynamic games. Mathematics of Control, Signals and Systems, 9(4), 303–326.
Doyle, J. C. (1978). Guaranteed margins for LQG regulators. IEEE Transactions on Automatic Control, 23.
Dupuis, P., & Ellis, R. (1997). A weak convergence approach to the theory of large deviations. New York: Wiley.
Dupuis, P., James, M. R., & Petersen, I. R. (1998). Robust properties of risk-sensitive control. In Proceedings of the IEEE Conference on Decision and Control, Vol. 2 (pp. 2365–2370), Tampa, FL, December.
Elliott, R. J., Aggoun, L., & Moore, J. B. (1994). Hidden Markov models: Estimation and control. New York: Springer.
Fu, M., de Souza, C. E., & Xie, L. (1991). Quadratic stabilization and H∞ control of discrete-time uncertain systems. In Proceedings of the International Symposium on Mathematical Theory of Networks and Systems, Kobe, Japan.
Glover, K., & Doyle, J. C. (1988). State-space formulae for all stabilizing controllers that satisfy an H∞-norm bound and relations to risk-sensitivity. Systems & Control Letters, 11, 167–172.
Haddad, W. H., Bernstein, D. S., & Mustafa, D. (1991). Mixed-norm H2/H∞ regulation and estimation: The discrete-time case. Systems & Control Letters, 16(4), 235–248.
Iglesias, P. A., Mustafa, D., & Glover, K. (1990). Discrete time H∞ controllers satisfying a minimum entropy criterion. Systems & Control Letters, 14, 275–286.
Ljung, L. (1987). System identification: Theory for the user. Englewood Cliffs, NJ: Prentice-Hall.
Luenberger, D. G. (1969). Optimization by vector space methods. New York: Wiley.
Mäkilä, P. M., Partington, J. R., & Norlander, T. (1998). Bounded power signal spaces for robust control and modeling. SIAM Journal on Control and Optimization, 37(1), 92–117.
Moheimani, S. O. R., Savkin, A. V., & Petersen, I. R. (1995). A connection between H∞ control and the absolute stabilizability of discrete-time uncertain systems. Automatica, 31(8), 1193–1195.
Mustafa, D., & Bernstein, D. S. (1991). LQG bounds in discrete-time H2/H∞ control. Transactions of the Institute of Measurement and Control, 13, 269–275.
Petersen, I. R., James, M. R., & Dupuis, P. (2000a). Minimax optimal control of stochastic uncertain systems with relative entropy constraints. IEEE Transactions on Automatic Control, 45, 398–412.
Petersen, I. R., Ugrinovskii, V., & Savkin, A. V. (2000b). Robust control design using H∞ methods. Berlin: Springer.
Ugrinovskii, V. A., & Petersen, I. R. (1999). Finite horizon minimax optimal control of stochastic partially observed time varying uncertain systems. Mathematics of Control, Signals and Systems, 12(1), 1–23.
Ugrinovskii, V. A., & Petersen, I. R. (2001a). Minimax LQG control of stochastic partially observed uncertain systems. SIAM Journal on Control and Optimization, 40(4), 1189–1226.
Ugrinovskii, V. A., & Petersen, I. R. (2001b). Robust stability and performance of stochastic uncertain systems on an infinite time interval. Systems & Control Letters, 44(4), 291–308.
Whittle, P. (1990). Risk-sensitive optimal control. Chichester, UK: Wiley.
Ian R. Petersen was born in Victoria, Australia in 1956. He received the Bachelor of Engineering (Electrical) degree from the University of Melbourne in 1979. He received a Master of Science degree in 1984 and a Ph.D. in Electrical Engineering in 1984, both from the University of Rochester. From 1983 to 1985 he was a Postdoctoral Fellow in the Department of Systems Engineering, Australian National University. In 1985 he was appointed as a Lecturer in the Department of Electrical Engineering, Australian Defence Force Academy, and he is currently a Professor in this department. In 1989, he was a visitor in the Department of Engineering, Cambridge University. He served as an Associate Editor for the IEEE Transactions on Automatic Control and Systems and Control Letters. Currently he is an Associate Editor for Automatica and SIAM Journal on Control and Optimization. His main research interests are in robust control theory, H∞ control, robust filtering, and optimal control theory.
Valery A. Ugrinovskii was born in Ukraine in 1960. He received the M.Sc. degree in Applied Mathematics in 1982 and a Ph.D. in Physics and Mathematics in 1990, both from the State University of Nizhny Novgorod, Russia. From 1982 to 1995 he held research positions with the Radiophysical Research Institute, Nizhny Novgorod. From 1995 to 1996 he was a Postdoctoral Fellow at the University of Haifa. Since 1996, he has been a Research Associate and Lecturer in the School of Electrical Engineering, Australian Defence Force Academy, Canberra. He is the coauthor of the research monograph Robust Control Design using H∞ Methods, Springer, London, 2000, with Ian R. Petersen and Andrey V. Savkin. His current research interests include stochastic control theory, robust control theory and H∞ control.