Mixed deterministic and random optimal control of linear stochastic systems with quadratic costs


RESEARCH Open Access


Ying Hu · Shanjian Tang

Received: 26 November 2018 / Accepted: 11 December 2018

© The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Abstract In this paper, we consider the mixed optimal control of a linear stochastic system with a quadratic cost functional, with two controllers: one, called the deterministic controller, can choose only deterministic time functions, while the other, called the random controller, can choose adapted random processes. The optimal control is shown to exist under suitable assumptions. The optimal control is characterized via a system of fully coupled forward-backward stochastic differential equations (FBSDEs) of mean-field type. We solve the FBSDEs via solutions of two (but decoupled) Riccati equations, and give the respective optimal feedback law for both deterministic and random controllers, using solutions of both Riccati equations. The optimal state satisfies a linear stochastic differential equation (SDE) of mean-field type. Both the singular and infinite time-horizon cases are also addressed.

Keywords Stochastic LQ · Differential/algebraic Riccati equation · Mixed deterministic and random control · Singular LQ · Infinite-horizon

AMS subject classification 93E20

1 Introduction and formulation of the problem

Let $T>0$ be given and fixed. Denote by $\mathbb{S}^n$ the totality of $n\times n$ symmetric matrices, and by $\mathbb{S}^n_+$ its subset of all $n\times n$ nonnegative matrices. We mean by an $n\times n$ matrix $S\ge 0$ that $S\in\mathbb{S}^n_+$, and by a matrix $S>0$ that $S$ is positive definite. For a matrix-valued function $R:[0,T]\to\mathbb{S}^n$, we mean by $R\gg 0$ that $R(t)$ is uniformly positive, i.e., there is a positive real number $\alpha$ such that $R(t)\ge\alpha I$ for any $t\in[0,T]$.

Ying Hu: Univ Rennes, CNRS, IRMAR - UMR 6625, 35000 Rennes, France

Shanjian Tang: Department of Finance and Control Sciences, School of Mathematical Sciences, Fudan University, Shanghai, 200433 China

In this paper, we consider the following linear controlled stochastic differential equation (SDE)

$$dX_s=\left(A_sX_s+B_s^1u_s^1+B_s^2u_s^2\right)ds+\sum_{j=1}^d\left(C_s^jX_s+D_s^{1j}u_s^1+D_s^{2j}u_s^2\right)dW_s^j,\quad s>0;\qquad X_0=x_0, \tag{1}$$

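with the following quadratic cost functional

$$J(u)=\frac12E\int_0^T\left[\langle Q_sX_s,X_s\rangle+\langle R_s^1u_s^1,u_s^1\rangle+\langle R_s^2u_s^2,u_s^2\rangle\right]ds+\frac12E\left[\langle GX_T,X_T\rangle\right]. \tag{2}$$

Here, $(W_t)_{0\le t\le T}=(W_t^1,\cdots,W_t^d)_{0\le t\le T}$ is a $d$-dimensional Brownian motion on a probability space $(\Omega,\mathcal{F},P)$. Denote by $(\mathcal{F}_t)$ the augmented natural filtration generated by $(W_t)$. $A$, $B^1$, $B^2$, $C^j$, $D^{1j}$ and $D^{2j}$ are all bounded Borel measurable functions from $[0,T]$ to $\mathbb{R}^{n\times n}$, $\mathbb{R}^{n\times l_1}$, $\mathbb{R}^{n\times l_2}$, $\mathbb{R}^{n\times n}$, $\mathbb{R}^{n\times l_1}$, and $\mathbb{R}^{n\times l_2}$, respectively. $Q$, $R^1$, and $R^2$ are nonnegative definite, and they are all essentially bounded measurable functions on $[0,T]$ with values in $\mathbb{S}^n$, $\mathbb{S}^{l_1}$, and $\mathbb{S}^{l_2}$, respectively. In the first four sections, $R^1$ and $R^2$ are further assumed to be positive definite. $G\in\mathbb{S}^n$ is positive semi-definite. A control $u=(u^1,u^2)$ is a process $u\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^{l_1+l_2})$, and $X^u\in L^2_{\mathcal{F}}(\Omega;C(0,T;\mathbb{R}^n))$ is the corresponding state process with initial value $x_0\in\mathbb{R}^n$.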

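We will use the following notations. Let $l>0$ be an integer. For a given $\sigma$-field $\mathcal{G}\subset\mathcal{F}$, $L^2_{\mathcal{G}}(\Omega;\mathbb{R}^l)$ is the set of random variables $\xi:(\Omega,\mathcal{G})\to(\mathbb{R}^l,\mathcal{B}(\mathbb{R}^l))$ with $E|\xi|^2<+\infty$. $L^\infty_{\mathcal{G}}(\Omega;\mathbb{R}^l)$ is the set of essentially bounded random variables $\xi:(\Omega,\mathcal{G})\to(\mathbb{R}^l,\mathcal{B}(\mathbb{R}^l))$. $L^2_{\mathcal{F}}(t,T;\mathbb{R}^l)$ is the set of $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted $\mathbb{R}^l$-valued processes $f=\{f_s:t\le s\le T\}$ such that $E\int_t^T|f_s|^2\,ds<\infty$, and it is denoted by $L^2(t,T;\mathbb{R}^l)$ if the underlying filtration is the trivial one. $L^\infty_{\mathcal{F}}(t,T;\mathbb{R}^l)$ is the set of essentially bounded $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted $\mathbb{R}^l$-valued processes. $L^2_{\mathcal{F}}(\Omega;C(t,T;\mathbb{R}^l))$ is the set of continuous $\{\mathcal{F}_s\}_{s\in[t,T]}$-adapted $\mathbb{R}^l$-valued processes $f=\{f_s:t\le s\le T\}$ such that $E\left[\sup_{s\in[t,T]}|f_s|^2\right]<\infty$. All vectors used in this paper are column vectors. For a matrix $M$, $M^\top$ is its transpose, and $|M|=\sqrt{\sum_{i,j}m_{ij}^2}$ is the Frobenius norm. For a vector- or matrix-valued function $K$ defined on the time interval $[0,T]$ and differentiable at time $t\in[0,T]$, $\dot K_t$ or $\dot K(t)$ stands for the derivative at time $t$. Define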

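$$B:=(B^1,B^2),\quad D:=(D^1,D^2),\quad R:=\operatorname{diag}(R^1,R^2),\quad u:=(u^1,u^2)$$

$$(C_tx+D_tu)\,dW_t=\sum_{j=1}^d\left(C_t^jx+D_t^{1j}u^1+D_t^{2j}u^2\right)dW_t^j;\qquad C_t^\top K:=\sum_{j=1}^d(C_t^j)^\top K_j;$$

$$D_t^\top KD_t:=\sum_{j=1}^d(D_t^j)^\top KD_t^j,\quad C_t^\top KD_t:=\sum_{j=1}^d(C_t^j)^\top KD_t^j,\quad C_t^\top KC_t:=\sum_{j=1}^d(C_t^j)^\top KC_t^j.$$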

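If both $u^1$ and $u^2$ are adapted to the natural filtration of the underlying Brownian motion $W$ (i.e., $u^i\in\mathcal{U}^i_{ad}=L^2_{\mathcal{F}}(0,T;\mathbb{R}^{l_i})$ for $i=1,2$), it is well-known that the optimal control exists and can be synthesized into the following feedback of the state:

$$u_t=-\left(R_t+D_t^\top K_tD_t\right)^{-1}\left(K_tB_t+C_t^\top K_tD_t\right)^\top X_t,\quad t\in[0,T]. \tag{4}$$

Here $K$ solves the following Riccati differential equation:

$$\dot K_s+A_s^\top K_s+K_sA_s+C_s^\top K_sC_s+Q_s-\left(K_sB_s+C_s^\top K_sD_s\right)\left(R_s+D_s^\top K_sD_s\right)^{-1}\left(K_sB_s+C_s^\top K_sD_s\right)^\top=0,\quad s\in[0,T];\qquad K_T=G. \tag{5}$$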

See Wonham (1968), Haussmann (1971), Bismut (1976,1977), Peng (1992), and Tang (2003) for more details on the general Riccati equation arising from linear quadratic optimal stochastic control with both state- and control-dependent noises and deterministic coefficients.
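To make the classical feedback (4) and Riccati equation (5) concrete, the following is a minimal sketch for the scalar, time-invariant case $n=l=d=1$. The lower-case names `a, b, c, d, q, r, g` are illustrative stand-ins for $A, B, C, D, Q, R, G$ and are not notation from the paper; the backward Runge-Kutta integrator is one possible numerical choice, not a method taken from the references.

```python
import math

def riccati_rhs(K, a, b, c, d, q, r):
    """dK/ds from the scalar form of (5):
    K' = -(2aK + c^2 K + q) + (Kb + cKd)^2 / (r + d^2 K)."""
    gain_num = K * b + c * K * d
    return -(2.0 * a * K + c * c * K + q) + gain_num ** 2 / (r + d * d * K)

def solve_riccati(a, b, c, d, q, r, g, T, steps=2000):
    """Integrate (5) backward from K(T) = g with classical RK4.
    Returns the list K[0..steps], where K[i] approximates K(i*T/steps)."""
    h = T / steps
    K = [0.0] * (steps + 1)
    K[steps] = g
    f = lambda x: riccati_rhs(x, a, b, c, d, q, r)
    for i in range(steps, 0, -1):
        k1 = f(K[i])
        k2 = f(K[i] - 0.5 * h * k1)
        k3 = f(K[i] - 0.5 * h * k2)
        k4 = f(K[i] - h * k3)
        K[i - 1] = K[i] - h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return K

def feedback_gain(K, b, c, d, r):
    """Scalar gain of (4): u_t = feedback_gain(K_t, ...) * X_t."""
    return -(K * b + c * K * d) / (r + d * d * K)
```

When $b=d=0$ the quadratic term vanishes and (5) becomes linear, so the numerical solution can be compared with the closed form $K(t)=(g+q/\lambda)e^{\lambda(T-t)}-q/\lambda$, $\lambda=2a+c^2$.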

In this paper, we consider the following situation: there are two controllers, called the deterministic controller and the random controller. The former can impose a deterministic action $u^1$ only, i.e., $u^1\in\mathcal{U}^1_{ad}=L^2(0,T;\mathbb{R}^{l_1})$, while the latter can impose a random action $u^2$, more precisely $u^2\in\mathcal{U}^2_{ad}=L^2_{\mathcal{F}}(0,T;\mathbb{R}^{l_2})$. Firstly, we apply the conventional variational technique to characterize the optimal control via a system of fully coupled forward-backward stochastic differential equations (FBSDEs) of mean-field type. Then we give the solution of the FBSDEs with two (but decoupled) Riccati equations, and derive the respective optimal feedback law for both deterministic and random controllers, using solutions of both Riccati equations. Existence and uniqueness are established for both Riccati equations. The optimal state is shown to satisfy a linear stochastic differential equation (SDE) of mean-field type. Both the singular and infinite time-horizon cases are also addressed.

2 Necessary and sufficient condition of mixed optimal controls

The following necessary and sufficient condition can be proved in a straightforward way.

Theorem 1 Assume that $u^*$ is an optimal control, and $X^*$ is the corresponding solution. Then there is a unique pair of processes $\left(p(\cdot),k:=(k_j(\cdot))_{j=1,\cdots,d}\right)\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\times\left(L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\right)^d$ (hereafter called the adjoint processes) satisfying the BSDE (6):

$$dp(s)=-\left[A_s^\top p(s)+C_s^\top k(s)+Q_sX_s^*\right]ds+k(s)\,dW_s,\quad s\in[0,T],\qquad p(T)=GX_T^*; \tag{6}$$

and the following optimality conditions hold true:

$$E\left[(B_s^1)^\top p(s)+(D_s^1)^\top k(s)+R_s^1u_s^{1*}\right]=0, \tag{7}$$

$$(B_s^2)^\top p(s)+(D_s^2)^\top k(s)+R_s^2u_s^{2*}=0. \tag{8}$$

These optimality conditions are also sufficient for $(X^*,u^*)$ to be an optimal pair.

Proof It is obvious that BSDE (6) has a unique solution $(p(\cdot),k(\cdot))\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\times\left(L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\right)^d$.

Using the convex perturbation, we obtain in a straightforward way the necessary conditions of the optimal control $u^*$:

$$E\int_0^T\left\langle(B_s^i)^\top p(s)+(D_s^i)^\top k(s)+R_s^iu_s^{i*},\,u_s^i\right\rangle ds=0,\quad\forall u^i\in\mathcal{U}^i_{ad};\quad i=1,2. \tag{9}$$

Clearly, the preceding conditions are equivalent to (7) and (8). The sufficiency can be proved in a standard way.

3 Synthesis of the mixed optimal control

3.1 Ansatz

Let u be a control, X be the corresponding state process, and (p, k) the unique solution to (6) such that both (7) and (8) are satisfied. Define

$$\bar X:=E[X],\quad\tilde X:=X-\bar X;\qquad\bar u^2:=E\left[u^2\right],\quad\tilde u^2:=u^2-\bar u^2. \tag{10}$$

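We expect a feedback of the following form

$$p(s)=P_1(s)\tilde X_s+P_2(s)\bar X_s \tag{11}$$

for some $n\times n$ matrix-valued absolutely continuous functions $P_1(\cdot)$ and $P_2(\cdot)$ defined on the time interval $[0,T]$. Applying Itô's formula, we have

$$dp(s)=\dot P_1(s)\tilde X_s\,ds+P_1(s)\left(A_s\tilde X_s+B_s^2\tilde u_s^2\right)ds+P_1(s)\left[C_sX_s+D_su_s\right]dW_s+\dot P_2(s)\bar X_s\,ds+P_2(s)\left(A_s\bar X_s+B_s^1u_s^1+B_s^2\bar u_s^2\right)ds. \tag{12}$$

Hence

$$k(s)=P_1(s)\left(C_sX_s+D_su_s\right). \tag{13}$$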

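Define for $i=1,2$,

$$\Lambda_i(S):=R^i+(D^i)^\top SD^i,\quad S\in\mathbb{S}^n; \tag{14}$$

$$\widehat\Lambda(S):=\Lambda_1(S)-(D^1)^\top SD^2\Lambda_2^{-1}(S)(D^2)^\top SD^1,\quad S\in\mathbb{S}^n; \tag{15}$$

and

$$\Theta_i:=(B^2)^\top P_i+(D^2)^\top P_1C. \tag{16}$$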

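Plugging Eqs. (11) and (13) into the optimality conditions (7) and (8):

$$(B_s^1)^\top P_2(s)\bar X_s+(D_s^1)^\top P_1(s)\left(C_s\bar X_s+D_s^1u_s^1+D_s^2\bar u_s^2\right)+R_s^1u_s^1=0, \tag{17}$$

$$(B_s^2)^\top\left[P_1(s)\tilde X_s+P_2(s)\bar X_s\right]+(D_s^2)^\top P_1(s)\left[C_s\left(\tilde X_s+\bar X_s\right)+D_s^1u_s^1+D_s^2u_s^2\right]+R_s^2u_s^2=0. \tag{18}$$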

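From the last equality, we have

$$u^2=-\Lambda_2^{-1}(P_1)\left[\Theta_1\tilde X+\Theta_2\bar X+(D^2)^\top P_1D^1u^1\right] \tag{19}$$

and consequently

$$\bar u^2=-\Lambda_2^{-1}(P_1)\left[\Theta_2\bar X+(D^2)^\top P_1D^1u^1\right]. \tag{20}$$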

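In view of (17), we have

$$(B_s^1)^\top P_2(s)\bar X_s+(D_s^1)^\top P_1(s)C_s\bar X_s+(D_s^1)^\top P_1(s)D_s^2\bar u_s^2+\Lambda_1(P_1(s))u_s^1=0 \tag{21}$$

and therefore,

$$\Lambda_1(P_1(s))u_s^1+(B_s^1)^\top P_2(s)\bar X_s+(D_s^1)^\top P_1(s)C_s\bar X_s-(D_s^1)^\top P_1(s)D_s^2\Lambda_2^{-1}(P_1)\left[\Theta_2(s)\bar X_s+(D_s^2)^\top P_1(s)D_s^1u_s^1\right]=0 \tag{22}$$

or equivalently

$$\left[\Lambda_1(P_1)-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)(D^2)^\top P_1D^1\right]u^1=-\left[(B^1)^\top P_2+(D^1)^\top P_1C-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)\Theta_2\right]\bar X_s. \tag{23}$$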

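We have

$$u^1=M_1\bar X,\qquad u^2=M_2\tilde X+M_3\bar X \tag{24}$$

where

$$M_1:=-\left[\Lambda_1(P_1)-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)(D^2)^\top P_1D^1\right]^{-1}\left[(B^1)^\top P_2+(D^1)^\top P_1C-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)\Theta_2\right], \tag{25}$$

$$M_2:=-\Lambda_2^{-1}(P_1)\Theta_1, \tag{26}$$

$$M_3:=-\Lambda_2^{-1}(P_1)\left[\Theta_2+(D^2)^\top P_1D^1M_1\right]. \tag{27}$$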

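In view of (12) and (6), we have

$$dp(s)=\dot P_1\tilde X\,ds+P_1\left(A\tilde X+B^2M_2\tilde X\right)ds+k\,dW_s+\dot P_2\bar X\,ds+P_2\left(A\bar X+B^1M_1\bar X+B^2M_3\bar X\right)ds \tag{28}$$

$$=-\left\{A_s^\top\left[P_1(s)\tilde X_s+P_2(s)\bar X_s\right]+\left[Q_s+C_s^\top P_1(s)C_s\right]\left(\tilde X_s+\bar X_s\right)+C_s^\top P_1(s)\left[D_s^1M_s^1\bar X_s+D_s^2\left(M_s^2\tilde X_s+M_s^3\bar X_s\right)\right]\right\}ds+k(s)\,dW_s. \tag{29}$$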

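We expect the following system for $(P_1,P_2)$:

$$\dot P_1+P_1A+A^\top P_1+C^\top P_1C+Q-\left(P_1B^2+C^\top P_1D^2\right)\Lambda_2^{-1}(P_1)\left(P_1B^2+C^\top P_1D^2\right)^\top=0,\qquad P_1(T)=G \tag{30}$$

and

$$\dot P_2+P_2A+A^\top P_2+C^\top P_1C+Q+C^\top P_1D^1M_1+C^\top P_1D^2M_3+P_2B^1M_1+P_2B^2M_3=0,\qquad P_2(T)=G. \tag{31}$$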

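The last equation can be rewritten into the following one:

$$\dot P_2+P_2\widehat A(P_1)+\widehat A^\top(P_1)P_2+\widehat Q(P_1)-P_2N(P_1)P_2=0,\qquad P_2(T)=G, \tag{32}$$

where for $S\in\mathbb{S}^n_+$,

$$U(S):=S-SD^2\Lambda_2^{-1}(S)(D^2)^\top S,$$

$$\widehat Q(S):=Q+C^\top U(S)C-C^\top U(S)D^1\widehat\Lambda^{-1}(S)(D^1)^\top U(S)C,$$

$$\widehat A(S):=A-B^2\Lambda_2^{-1}(S)(D^2)^\top SC-\left[B^1-B^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]\widehat\Lambda^{-1}(S)(D^1)^\top U(S)C,$$

$$N(S):=B^2\Lambda_2^{-1}(S)(B^2)^\top+\left[B^1-B^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]\widehat\Lambda^{-1}(S)\left[B^1-B^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]^\top.$$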

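We have the following representation for $M_1$ and $M_3$:

$$M_1=-\widehat\Lambda^{-1}(P_1)\left[(B^1)^\top P_2+(D^1)^\top U(P_1)C-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)(B^2)^\top P_2\right],$$

$$M_3=-\Lambda_2^{-1}(P_1)\left\{(B^2)^\top P_2+(D^2)^\top P_1C-(D^2)^\top P_1D^1\widehat\Lambda^{-1}(P_1)\left[(B^1)^\top P_2+(D^1)^\top U(P_1)C-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)(B^2)^\top P_2\right]\right\}. \tag{33}$$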
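The passage from (31) to (32) is purely algebraic, so it can be spot-checked numerically. The sketch below does this in the scalar case $n=l_1=l_2=d=1$: it evaluates the non-derivative part of (31), with $M_1$ and $M_3$ taken from (25) and (27), and the non-derivative part of (32), and the two should agree for any $P_1,P_2\ge0$. The lower-case argument names and the function names are illustrative assumptions, not notation from the paper.

```python
def lam2(S, d2, r2):
    """Scalar form of (14) for i = 2: R^2 + (D^2)' S D^2."""
    return r2 + d2 * d2 * S

def lam_hat(S, d1, d2, r1, r2):
    """Scalar form of (15)."""
    return r1 + d1 * d1 * S - (d1 * S * d2) ** 2 / lam2(S, d2, r2)

def drift_31(P1, P2, a, b1, b2, c, d1, d2, q, r1, r2):
    """Non-derivative part of (31), with M1 from (25) and M3 from (27)."""
    L2 = lam2(P1, d2, r2)
    Lh = lam_hat(P1, d1, d2, r1, r2)
    theta2 = b2 * P2 + d2 * P1 * c  # scalar form of (16) for i = 2
    M1 = -(b1 * P2 + d1 * P1 * c - d1 * P1 * d2 * theta2 / L2) / Lh
    M3 = -(theta2 + d2 * P1 * d1 * M1) / L2
    return (2 * a * P2 + c * c * P1 + q
            + (c * P1 * d1 + P2 * b1) * M1
            + (c * P1 * d2 + P2 * b2) * M3)

def drift_32(P1, P2, a, b1, b2, c, d1, d2, q, r1, r2):
    """Non-derivative part of (32), built from U, Q-hat, A-hat and N."""
    L2 = lam2(P1, d2, r2)
    Lh = lam_hat(P1, d1, d2, r1, r2)
    U = P1 - P1 * d2 * d2 * P1 / L2
    beta = b1 - b2 * d2 * P1 * d1 / L2  # B^1 - B^2 Lambda_2^{-1} (D^2)' S D^1
    Q_hat = q + c * c * U - (c * U * d1) ** 2 / Lh
    A_hat = a - b2 * d2 * P1 * c / L2 - beta * d1 * U * c / Lh
    N = b2 * b2 / L2 + beta * beta / Lh
    return 2 * A_hat * P2 + Q_hat - N * P2 * P2
```

For example, `drift_31(1.2, 0.8, 0.3, 1.0, 0.7, 0.5, 0.4, 0.6, 1.0, 1.5, 2.0)` and `drift_32` with the same arguments should coincide up to rounding.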

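Lemma 1 For $S\in\mathbb{S}^n_+$, we have $\widehat Q(S)\ge0$.

Proof First, we show that $U(S)\ge0$. In fact, setting $\widehat D^2:=S^{1/2}D^2$, we have

$$U(S)=S-S^{1/2}\widehat D^2\left[R^2+(\widehat D^2)^\top\widehat D^2\right]^{-1}(\widehat D^2)^\top S^{1/2}\ge S-S^{1/2}IS^{1/2}=0. \tag{34}$$

Here we have used the following well-known matrix inequality:

$$D\left(R+D^\top FD\right)^{-1}D^\top\le F^{-1} \tag{35}$$

for $D\in\mathbb{R}^{n\times m}$, and positive matrices $F\in\mathbb{S}^n$ and $R\in\mathbb{S}^m$. Using again the inequality (35), we have, setting $\widehat D^1:=[U(S)]^{1/2}D^1$,

$$\begin{aligned}\widehat Q(S)&=Q+C^\top U(S)C-C^\top U(S)D^1\left[R^1+(D^1)^\top SD^1-(D^1)^\top SD^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]^{-1}(D^1)^\top U(S)C\\&=Q+C^\top U(S)C-C^\top U(S)D^1\left[R^1+(D^1)^\top U(S)D^1\right]^{-1}(D^1)^\top U(S)C\\&=Q+C^\top U(S)C-C^\top[U(S)]^{1/2}\widehat D^1\left[R^1+(\widehat D^1)^\top\widehat D^1\right]^{-1}(\widehat D^1)^\top[U(S)]^{1/2}C\\&\ge Q+C^\top U(S)C-C^\top[U(S)]^{1/2}I[U(S)]^{1/2}C=Q\ge0.\end{aligned} \tag{36}$$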
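As a quick sanity check of Lemma 1, consider the scalar case $n=l_1=l_2=d=1$. The lower-case symbols $s,q,c,d_1,d_2,r_1,r_2$ below are illustrative scalar stand-ins (not notation from the paper) for $S,Q,C,D^1,D^2,R^1,R^2$, with $s\ge0$, $q\ge0$ and $r_1,r_2>0$. Under these assumptions, $U$ and the quantity bounded in the lemma (written $\widehat Q$ here) reduce to:

```latex
\begin{aligned}
U(s) &= s - \frac{d_2^2\, s^2}{r_2 + d_2^2\, s}
      = \frac{r_2\, s}{r_2 + d_2^2\, s} \;\ge\; 0,\\[4pt]
\widehat Q(s) &= q + c^2 U(s) - \frac{c^2 d_1^2\, U(s)^2}{r_1 + d_1^2\, U(s)}
      = q + \frac{c^2 r_1\, U(s)}{r_1 + d_1^2\, U(s)} \;\ge\; q \;\ge\; 0 .
\end{aligned}
```

The second line uses the scalar identity $r_1+d_1^2s-d_1^2d_2^2s^2/(r_2+d_2^2s)=r_1+d_1^2U(s)$, which is the scalar counterpart of the second equality in the proof.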

3.2 Existence and uniqueness of optimal control

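Theorem 2 Assume that $R^1\gg0$ and $R^2\gg0$. Then, there is a unique optimal control $u^*$, and Riccati Eqs. (30) and (32) have unique nonnegative solutions $P_1$ and $P_2$. The optimal control $u^*=(u^{1*},u^{2*})$ has the following feedback form:

$$u^{1*}=M_1\bar X^*,\qquad u^{2*}=M_2\tilde X^*+M_3\bar X^*=M_2X^*+\left(M_3-M_2\right)\bar X^* \tag{37}$$

where $X^*$ is the state process corresponding to the optimal control $u^*$, $\bar X_t^*:=E\left[X_t^*\right]$, and $\tilde X_t^*:=X_t^*-\bar X_t^*$ for $t\in[0,T]$. The optimal feedback system is given by

$$X_t^*=x_0+\int_0^t\left[\left(A+B^2M_2\right)X_s^*+\left(B^1M_1-B^2M_2+B^2M_3\right)\bar X_s^*\right]ds+\int_0^t\left[\left(C+D^2M_2\right)X_s^*+\left(D^1M_1-D^2M_2+D^2M_3\right)\bar X_s^*\right]dW_s,\quad t\ge0. \tag{38}$$

It is a mean-field stochastic differential equation. The expected optimal state $\bar X_t^*$ is governed by the following ordinary differential equation:

$$\bar X_t^*=x_0+\int_0^t\left(A+B^1M_1+B^2M_3\right)\bar X_s^*\,ds,\quad t\ge0; \tag{39}$$

and $\tilde X_t^*$ is governed by the following stochastic differential equation:

$$\tilde X_t^*=\int_0^t\left(A+B^2M_2\right)\tilde X_s^*\,ds+\int_0^t\left[\left(C+D^2M_2\right)\tilde X_s^*+\left(C+D^1M_1+D^2M_3\right)\bar X_s^*\right]dW_s,\quad t\ge0. \tag{40}$$

The optimal value is given by

$$J(u^*)=\langle P_2(0)x_0,x_0\rangle. \tag{41}$$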

Proof Since $R^1$ and $R^2$ are uniformly positive, the cost functional $J(u)$ is strictly convex in $u$ and thus has a unique minimizer. Therefore, we have a unique optimal control. Define

$$\hat u^1:=M_1\bar X^*,\quad\hat u^2:=M_2\tilde X^*+M_3\bar X^*,\quad\hat u:=\left(\hat u^1,\hat u^2\right) \tag{42}$$

and

$$\hat p:=P_1\tilde X^*+P_2\bar X^*,\quad\hat k:=P_1\left(CX^*+Du^*\right). \tag{43}$$

We can check that the pair $(\hat p,\hat k)$ is an adapted solution to BSDE (6), and that $(\hat u,\hat p,\hat k)$ satisfies the optimality condition. Hence, $\hat u$ is the optimal control $u^*$.

The formula (41) is derived from the computation of $\langle\hat p_s,X_s^*\rangle$ with Itô's formula. All other assertions are obvious.

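It can also be proved in a direct way. In fact, if $(X,u,p,k)$ is an alternative solution to the FBSDE and satisfies the optimality condition, then by setting

$$\delta p:=p-\left(P_1\tilde X+P_2\bar X\right),\quad\delta k:=k-P_1(CX+Du),$$

and by putting

$$p=\delta p+P_1\tilde X+P_2\bar X,\quad k=\delta k+P_1(CX+Du)$$

into (7) and (8), we have $\delta p_T=0$ and the following two equalities

$$E\left\{(B^1)^\top\left(\delta p+P_1\tilde X+P_2\bar X\right)+(D^1)^\top\left[\delta k+P_1(CX+Du)\right]+R^1u^1\right\}=0, \tag{44}$$

$$(B^2)^\top\left(\delta p+P_1\tilde X+P_2\bar X\right)+(D^2)^\top\left(\delta k+P_1(CX+Du)\right)+R^2u^2=0. \tag{45}$$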

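Therefore, we have

$$\bar u^2=-\Lambda_2^{-1}(P_1)\left[(B^2)^\top\overline{\delta p}+(D^2)^\top\overline{\delta k}+\Theta_2\bar X+(D^2)^\top P_1D^1u^1\right]. \tag{46}$$

In view of (44) and (45), we have

$$u^1=L_1\overline{\delta p}+L_2\overline{\delta k}+M_1\bar X \tag{47}$$

and

$$u^2=L_3\,\delta p+L_4\,\delta k+L_5\overline{\delta p}+L_6\overline{\delta k}+M_2\tilde X+M_3\bar X$$

where

$$L_1:=-\widehat\Lambda^{-1}(P_1)\left[(B^1)^\top-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)(B^2)^\top\right],$$

$$L_2:=-\widehat\Lambda^{-1}(P_1)\left[(D^1)^\top-(D^1)^\top P_1D^2\Lambda_2^{-1}(P_1)(D^2)^\top\right],$$

$$L_3:=-\Lambda_2^{-1}(P_1)(B^2)^\top,\qquad L_4:=-\Lambda_2^{-1}(P_1)(D^2)^\top,$$

$$L_5:=-\Lambda_2^{-1}(P_1)(D^2)^\top P_1D^1L_1,\qquad L_6:=-\Lambda_2^{-1}(P_1)(D^2)^\top P_1D^1L_2.$$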

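Define a new function $f$ as follows:

$$\begin{aligned}f(s,p,k,P,K):={}&\left[A_s^\top+P_1(s)B_s^2L_3(s)+C_s^\top P_1(s)D_s^2L_3(s)\right]p+\left[C_s^\top+P_1(s)B_s^2L_4(s)+C_s^\top P_1(s)D_s^2L_4(s)\right]k\\&+\left[C_s^\top P_1(s)D_s^1L_1(s)+P_2(s)B_s^1L_1(s)+P_2(s)B_s^2L_3(s)-P_1(s)B_s^2L_3(s)+P_2(s)B_s^2L_5(s)+C_s^\top P_1(s)D_s^2L_5(s)\right]P\\&+\left[C_s^\top P_1(s)D_s^1L_2(s)+P_2(s)B_s^1L_2(s)+P_2(s)B_s^2L_4(s)-P_1(s)B_s^2L_4(s)+P_2(s)B_s^2L_6(s)+C_s^\top P_1(s)D_s^2L_6(s)\right]K.\end{aligned}$$

Then $(\delta p,\delta k)$ satisfies the following linear homogeneous BSDE of mean-field type:

$$d\,\delta p_s=-f\left(s,\delta p_s,\delta k_s,\overline{\delta p}_s,\overline{\delta k}_s\right)ds+\delta k_s\,dW_s,\qquad\delta p_T=0. \tag{48}$$

In view of Buckdahn et al. (2009), Theorem 3.1, it admits a unique solution $(\delta p,\delta k)=(0,0)$.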

4 Particular cases

4.1 The classical optimal stochastic LQ case: $B^1=0$ and $D^1=0$.

In this case, let $P_1$ be the unique nonnegative solution to Riccati Eq. (30). Then, $P_1$ is also the solution of Riccati Eq. (32), and the optimal control reduces to the conventional feedback form.
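The claim of Section 4.1 can be illustrated numerically. The sketch below integrates (30) and (31) backward in the scalar, time-invariant case and compares the two solutions when $b_1=d_1=0$. All lower-case names (`a, b1, b2, c, d1, d2, q, r1, r2`) and both helper functions are illustrative assumptions; the RK4 scheme is just one convenient integrator.

```python
def rk4_backward(f, y_T, T, steps):
    """Integrate the autonomous system y' = f(y) backward from y(T) = y_T.
    y is a tuple of floats; the return value approximates y(0)."""
    h = T / steps
    y = tuple(y_T)
    for _ in range(steps):
        k1 = f(y)
        k2 = f(tuple(yi - 0.5 * h * ki for yi, ki in zip(y, k1)))
        k3 = f(tuple(yi - 0.5 * h * ki for yi, ki in zip(y, k2)))
        k4 = f(tuple(yi - h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi - h * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
                  for yi, s1, s2, s3, s4 in zip(y, k1, k2, k3, k4))
    return y

def riccati_pair_rhs(a, b1, b2, c, d1, d2, q, r1, r2):
    """Return f(P1, P2) = (dP1/ds, dP2/ds) from the scalar forms of (30) and (31)."""
    def f(y):
        P1, P2 = y
        L2 = r2 + d2 * d2 * P1                              # (14), i = 2
        Lh = r1 + d1 * d1 * P1 - (d1 * P1 * d2) ** 2 / L2   # (15)
        dP1 = -(2 * a * P1 + c * c * P1 + q) + (P1 * b2 + c * P1 * d2) ** 2 / L2
        theta2 = b2 * P2 + d2 * P1 * c                      # (16), i = 2
        M1 = -(b1 * P2 + d1 * P1 * c - d1 * P1 * d2 * theta2 / L2) / Lh
        M3 = -(theta2 + d2 * P1 * d1 * M1) / L2
        dP2 = -(2 * a * P2 + c * c * P1 + q
                + (c * P1 * d1 + P2 * b1) * M1
                + (c * P1 * d2 + P2 * b2) * M3)
        return (dP1, dP2)
    return f
```

With `b1 = d1 = 0` the two components returned by `rk4_backward(riccati_pair_rhs(...), (g, g), T, steps)` should agree up to rounding, while a nonzero `b1` leaves the first component unchanged and lowers the second.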

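4.2 The deterministic control of linear stochastic system with quadratic cost: $B^2=0$ and $D^2=0$.

In this case, $B=B^1$ and $D=D^1$, and Riccati Eq. (30) takes the following form (we write $R=R^1$ to simplify the exposition):

$$\dot P_1+P_1A+A^\top P_1+C^\top P_1C+Q=0,\qquad P_1(T)=G,$$

which is a linear Lyapunov equation. Riccati Eq. (32) takes the following form:

$$\dot P_2+P_2\widehat A+\widehat A^\top P_2+\widehat Q-P_2B\left(R+D^\top P_1D\right)^{-1}B^\top P_2=0,\qquad P_2(T)=G$$

with

$$\widehat A:=A-B\left(R+D^\top P_1D\right)^{-1}D^\top P_1C$$

and

$$\widehat Q:=Q+C^\top P_1C-C^\top P_1D\left(R+D^\top P_1D\right)^{-1}D^\top P_1C.$$

The optimal control takes the following feedback form:

$$u^*=-\left(R+D^\top P_1D\right)^{-1}\left(B^\top P_2+D^\top P_1C\right)\bar X^*.$$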
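For intuition, the Lyapunov equation of this subsection can be solved in closed form in the scalar time-invariant case. Assuming constant scalars $a,c,q,g$ (illustrative stand-ins for $A,C,Q,G$, not notation from the paper) and $\lambda:=2a+c^2\neq0$:

```latex
\dot P_1 + \lambda P_1 + q = 0,\quad P_1(T) = g
\quad\Longrightarrow\quad
P_1(t) = \Big(g + \frac{q}{\lambda}\Big)\, e^{\lambda (T-t)} - \frac{q}{\lambda},
\qquad t\in[0,T].
```

Differentiating gives $\dot P_1=-\lambda\left(g+q/\lambda\right)e^{\lambda(T-t)}=-\lambda P_1-q$, as required. In this case $P_1$ does not depend on $B$, $D$ or $R$; the control enters only through the equation for $P_2$.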

5 Some solvable singular cases

In this section, we study the possibility that $R^1=0$ or $R^2=0$. We have

Theorem 3 Assume that $R^1\gg0$ and

$$R^2\ge0,\quad(D^2)^\top D^2\gg0,\quad G>0. \tag{49}$$

Then Riccati Eqs. (30) and (32) have unique nonnegative solutions $P_1\gg0$ and $P_2$, respectively. The optimal control is unique and has the following feedback form:

$$u^{1*}=M_1\bar X^*,\qquad u^{2*}=M_2\tilde X^*+M_3\bar X^*=M_2X^*+\left(M_3-M_2\right)\bar X^*. \tag{50}$$

The optimal feedback system and the optimal value take identical forms to those of Theorem 2.

Proof In view of the conditions (49), the existence and uniqueness of the solution $P_1\gg0$ to Riccati Eq. (30) can be found in Kohlmann and Tang (2003), and those of the solution $P_2\ge0$ to Riccati Eq. (32) come from the fact that $\widehat\Lambda(P_1)\gg0$ as a consequence of the condition that $R^1\gg0$.

Theorem 4 Assume that $R^2\gg0$ and

$$R^1\ge0,\quad(D^1)^\top D^1\gg0,\quad G>0. \tag{51}$$

Then Riccati Eqs. (30) and (32) have unique nonnegative solutions $P_1\gg0$ and $P_2$, respectively. The optimal control is unique and has the following feedback form:

$$u^{1*}=M_1\bar X^*,\qquad u^{2*}=M_2\tilde X^*+M_3\bar X^*=M_2X^*+\left(M_3-M_2\right)\bar X^*. \tag{52}$$

The optimal feedback system and the optimal value take identical forms to those of Theorem 2.

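Proof The existence and uniqueness of the solution $P_1$ to Riccati Eq. (30) are well-known. In view of the condition $G>0$, we have $P_1\gg0$. We now prove those of the solution $P_2\ge0$ to Riccati Eq. (32).

In view of the well-known matrix inverse formula:

$$\left(A+BD^{-1}C\right)^{-1}=A^{-1}-A^{-1}B\left(D+CA^{-1}B\right)^{-1}CA^{-1} \tag{53}$$

for $B\in\mathbb{R}^{n\times m}$, $C\in\mathbb{R}^{m\times n}$ and invertible matrices $A\in\mathbb{R}^{n\times n}$, $D\in\mathbb{R}^{m\times m}$ such that $A+BD^{-1}C$ and $D+CA^{-1}B$ are invertible, we have the following identity:

$$\widehat\Lambda(P_1)=R^1+(D^1)^\top\left[P_1-P_1D^2\left(R^2+(D^2)^\top P_1D^2\right)^{-1}(D^2)^\top P_1\right]D^1=R^1+(D^1)^\top\left[P_1^{-1}+D^2(R^2)^{-1}(D^2)^\top\right]^{-1}D^1. \tag{54}$$

Noting the condition $(D^1)^\top D^1\gg0$, we have $\widehat\Lambda(P_1)\gg0$.

Other assertions can be proved in an identical manner as in Theorem 2.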
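The identity (54) is what keeps the relevant matrix positive even when $R^1=0$. A minimal scalar sketch, assuming $n=l_1=l_2=d=1$ and illustrative lower-case names (`P1, d1, d2, r1, r2` are stand-ins, not notation from the paper), evaluates both sides of (54) so they can be compared:

```python
def lam_hat_direct(P1, d1, d2, r1, r2):
    """First expression in (54), scalar case."""
    inner = P1 - P1 * d2 * (1.0 / (r2 + d2 * P1 * d2)) * d2 * P1
    return r1 + d1 * inner * d1

def lam_hat_inverse_form(P1, d1, d2, r1, r2):
    """Second expression in (54), scalar case; requires P1 > 0 and r2 > 0."""
    return r1 + d1 * (1.0 / (1.0 / P1 + d2 * d2 / r2)) * d1
```

With `r1 = 0.0` and `d1 != 0`, both functions return the same strictly positive number, which mirrors the role of the condition $(D^1)^\top D^1\gg0$ in the proof.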

6 The infinite time-horizon case

In this section, we consider the time-invariant situation of all the coefficients $A$, $B$, $C$, $D$, $Q$ and $R$ in the linear controlled stochastic differential equation (SDE)

$$dX_s=\left(AX_s+B^1u_s^1+B^2u_s^2\right)ds+\left(CX_s+D^1u_s^1+D^2u_s^2\right)dW_s,\quad s>0;\qquad X_0=x_0, \tag{55}$$

and the quadratic cost functional

$$J(u)=\frac12E\int_0^\infty\left[\langle QX_s,X_s\rangle+\langle R^1u_s^1,u_s^1\rangle+\langle R^2u_s^2,u_s^2\rangle\right]ds \tag{56}$$

for $(u^1,u^2)\in L^2(0,\infty;\mathbb{R}^{l_1})\times L^2_{\mathcal{F}}(0,\infty;\mathbb{R}^{l_2})$ and $u:=(u^1,u^2)$.

The admissible class of controls for the deterministic controller $u^1$ is $L^2(0,\infty;\mathbb{R}^{l_1})$ and for the random controller $u^2$ is $L^2_{\mathcal{F}}(0,\infty;\mathbb{R}^{l_2})$. For simplicity of

Assumption 1 There is $K\in\mathbb{R}^{l_2\times n}$ such that the unique solution $X$ to the following linear matrix stochastic differential equation

$$dX_s=\left(A+B^2K\right)X_s\,ds+\left(C+D^2K\right)X_s\,dW_s,\quad s>0;\qquad X_0=I, \tag{57}$$

lies in $L^2_{\mathcal{F}}(0,\infty;\mathbb{R}^{n\times n})$. That is, our linear control system (55) is stabilizable using only the control $u^2$.

Remark 2 By applying Itô's formula to $|X_s|^2$ and taking the expectation, it is straightforward to see that if there exists $K\in\mathbb{R}^{l_2\times n}$ such that

$$\left(A+B^2K\right)+\left(A+B^2K\right)^\top+\left(C+D^2K\right)^\top\left(C+D^2K\right)<0,$$

then Assumption 1 is satisfied. In particular, if

$$A+A^\top+C^\top C<0,$$

then Assumption 1 is satisfied.
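In the scalar case the criterion of Remark 2 can be evaluated directly, because the solution of (57) with $X_0=1$ satisfies $E|X_s|^2=e^{\rho s}$ with $\rho=2(a+b_2K)+(c+d_2K)^2$. The sketch below is a minimal illustration under that scalar assumption; the lower-case names and both function names are hypothetical stand-ins for $A,B^2,C,D^2$ and are not notation from the paper.

```python
import math

def mean_square_rate(a, b2, c, d2, K):
    """Scalar version of the matrix in Remark 2: 2(a + b2*K) + (c + d2*K)^2."""
    return 2.0 * (a + b2 * K) + (c + d2 * K) ** 2

def squared_l2_norm(a, b2, c, d2, K):
    """Integral over (0, inf) of E|X_s|^2 = exp(rate * s) for scalar (57), X_0 = 1.
    Finite exactly when the rate is negative, i.e. when K is stabilizing."""
    rate = mean_square_rate(a, b2, c, d2, K)
    return -1.0 / rate if rate < 0 else math.inf
```

For instance, with `a = 0.5, b2 = 1.0, c = 0.2, d2 = 0.1`, the uncontrolled system (`K = 0.0`) has a positive rate, whereas `K = -2.0` makes the rate negative, so Assumption 1 holds for these values.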

We have

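Lemma 2 Assume that $Q>0$, that Assumption 1 holds, and that one of the following three sets of conditions holds true:

(i) $R^1>0$ and $R^2>0$; (ii) $R^1>0$, $R^2\ge0$, $(D^2)^\top D^2>0$, and $G>0$; and (iii) $R^1\ge0$, $(D^1)^\top D^1>0$, $R^2>0$, and $G>0$.

Then, the algebraic Riccati equation

$$P_1A+A^\top P_1+C^\top P_1C+Q-\left(P_1B^2+C^\top P_1D^2\right)\Lambda_2^{-1}(P_1)\left(P_1B^2+C^\top P_1D^2\right)^\top=0 \tag{58}$$

has a unique positive solution $P_1$, and the algebraic Riccati equation

$$P_2\widehat A(P_1)+\widehat A^\top(P_1)P_2+\widehat Q(P_1)-P_2N(P_1)P_2=0 \tag{59}$$

has a positive solution $P_2$. Here for $S\in\mathbb{S}^n_+$,

$$U(S):=S-SD^2\Lambda_2^{-1}(S)(D^2)^\top S;$$

$$\widehat Q(S):=Q+C^\top U(S)C-C^\top U(S)D^1\widehat\Lambda^{-1}(S)(D^1)^\top U(S)C;$$

$$\widehat A(S):=A-B^2\Lambda_2^{-1}(S)(D^2)^\top SC-\left[B^1-B^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]\widehat\Lambda^{-1}(S)(D^1)^\top U(S)C;$$

$$N(S):=B^2\Lambda_2^{-1}(S)(B^2)^\top+\left[B^1-B^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]\widehat\Lambda^{-1}(S)\left[B^1-B^2\Lambda_2^{-1}(S)(D^2)^\top SD^1\right]^\top.$$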

Eq. (59). We use an approximation method, considering finite time-horizon Riccati equations.

For any $T>0$, let $P_1^T$ and $P_2^T$ be the unique solutions to Riccati Eqs. (30) and (32), with $G=0$. It is well-known that $P_1^T$ converges to the constant matrix $P_1$ as $T\to\infty$. We now show the convergence of $P_2^T$. Firstly, $P_2^T(t)$ is nondecreasing in $T$ for any $t\ge0$ due to the following representation formula: for $(t,x)\in[0,T]\times\mathbb{R}^n$,

$$\left\langle P_2^T(t)x,x\right\rangle=\inf_{\substack{u^1\in L^2(t,T;\mathbb{R}^{l_1})\\u^2\in L^2_{\mathcal{F}}(t,T;\mathbb{R}^{l_2})}}\frac12E^{t,x}\int_t^T\left[\langle QX_s,X_s\rangle+\langle R^1u_s^1,u_s^1\rangle+\langle R^2u_s^2,u_s^2\rangle\right]ds, \tag{60}$$

whose proof is identical to that of the formula (41). From Assumption 1, it is straightforward to show that there is $C_t>0$ such that $\left|P_2^T(t)\right|\le C_t$. Then $P_2^T(t)$ converges to $P_2(t)$ as $T\to\infty$. Furthermore, since all the coefficients are time-invariant and $\left(P_1^T(T),P_2^T(T)\right)=0$ for any $T>0$, we have

$$\left(P_1^{T+s}(t+s),P_2^{T+s}(t+s)\right)=\left(P_1^T(t),P_2^T(t)\right). \tag{61}$$

Taking the limit $T\to\infty$ yields that $P_2(t+s)=P_2(t)$. Therefore, $P_2$ is a constant matrix.

Taking the limit $T\to\infty$ in the integral form of Riccati Eq. (32), we show that $P_2$ solves Algebraic Riccati Eq. (59).

Finally, in view of $Q>0$, we have $P_2^1(0)>0$. Hence $P_2\ge P_2^1(0)>0$.
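The approximation argument of this proof can be mimicked numerically in the scalar case. Because the coefficients are time-invariant and $G=0$, the pair $\left(P_1^T(0),P_2^T(0)\right)$, viewed as a function of the horizon $T$, solves (30) and (31) in reversed time $\tau=T-t$ started from zero. The sketch below follows that trajectory with RK4; all names and the parameter values are illustrative assumptions (here $a=-1$, $c=0.5$, so $2a+c^2<0$ and the scalar condition of Remark 2 holds with $K=0$).

```python
def reversed_time_rhs(P1, P2, a, b1, b2, c, d1, d2, q, r1, r2):
    """Right-hand sides of (30) and (31) in reversed time tau = T - t (scalar case)."""
    L2 = r2 + d2 * d2 * P1
    Lh = r1 + d1 * d1 * P1 - (d1 * P1 * d2) ** 2 / L2
    f1 = 2 * a * P1 + c * c * P1 + q - (P1 * b2 + c * P1 * d2) ** 2 / L2
    theta2 = b2 * P2 + d2 * P1 * c
    M1 = -(b1 * P2 + d1 * P1 * c - d1 * P1 * d2 * theta2 / L2) / Lh
    M3 = -(theta2 + d2 * P1 * d1 * M1) / L2
    f2 = (2 * a * P2 + c * c * P1 + q
          + (c * P1 * d1 + P2 * b1) * M1
          + (c * P1 * d2 + P2 * b2) * M3)
    return f1, f2

def horizon_sweep(coeffs, T_max, h=0.005):
    """RK4 in tau from (0, 0); entry k approximates (P1^T(0), P2^T(0)) at T = k*h."""
    f = lambda y: reversed_time_rhs(y[0], y[1], *coeffs)
    y = (0.0, 0.0)
    path = [y]
    for _ in range(int(round(T_max / h))):
        k1 = f(y)
        k2 = f((y[0] + 0.5 * h * k1[0], y[1] + 0.5 * h * k1[1]))
        k3 = f((y[0] + 0.5 * h * k2[0], y[1] + 0.5 * h * k2[1]))
        k4 = f((y[0] + h * k3[0], y[1] + h * k3[1]))
        y = (y[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             y[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
        path.append(y)
    return path

# Illustrative coefficients: a, b1, b2, c, d1, d2, q, r1, r2
coeffs = (-1.0, 1.0, 1.0, 0.5, 0.3, 0.4, 1.0, 1.0, 1.0)
path = horizon_sweep(coeffs, T_max=30.0)
P1_inf, P2_inf = path[-1]
```

Along `path`, the second component should be nondecreasing in the horizon, and at the final point both right-hand sides should be numerically zero, so `(P1_inf, P2_inf)` approximates a solution of the scalar forms of (58) and (59).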

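Theorem 5 Let Assumption 1 be satisfied. Assume that $Q>0$ and that one of the following three sets of conditions holds true:

(i) $R^1>0$ and $R^2>0$; (ii) $R^1>0$, $R^2\ge0$, $(D^2)^\top D^2>0$, and $G>0$; and (iii) $R^1\ge0$, $(D^1)^\top D^1>0$, $R^2>0$, and $G>0$.

Let $P_1$ be the unique positive solution to the algebraic Eq. (58), and $P_2$ a positive solution to the algebraic Eq. (59). Let $(u^*,X^*)$ be an optimal pair with $u^*=(u^{1*},u^{2*})$. Then the optimal control is unique and has the following feedback form:

$$u^{1*}=M_1\bar X^*,\qquad u^{2*}=M_2\tilde X^*+M_3\bar X^*=M_2X^*+\left(M_3-M_2\right)\bar X^* \tag{62}$$

where $\bar X_t^*:=E\left[X_t^*\right]$ and $\tilde X_t^*:=X_t^*-\bar X_t^*$ for $t\ge0$. The optimal feedback system is given by

$$X_t^*=x_0+\int_0^t\left[\left(A+B^2M_2\right)X_s^*+\left(B^1M_1-B^2M_2+B^2M_3\right)\bar X_s^*\right]ds+\int_0^t\left[\left(C+D^2M_2\right)X_s^*+\left(D^1M_1-D^2M_2+D^2M_3\right)\bar X_s^*\right]dW_s,\quad t\ge0. \tag{63}$$

It is a mean-field stochastic differential equation. The expected optimal state $\bar X_t^*$ is governed by the following ordinary differential equation:

$$\bar X_t^*=x_0+\int_0^t\left(A+B^1M_1+B^2M_3\right)\bar X_s^*\,ds,\quad t\ge0; \tag{64}$$

and $\tilde X_t^*$ is governed by the following stochastic differential equation:

$$\tilde X_t^*=\int_0^t\left(A+B^2M_2\right)\tilde X_s^*\,ds+\int_0^t\left[\left(C+D^2M_2\right)\tilde X_s^*+\left(C+D^1M_1+D^2M_3\right)\bar X_s^*\right]dW_s,\quad t\ge0. \tag{65}$$

The optimal value is given by

$$J(u^*)=\langle P_2x_0,x_0\rangle, \tag{66}$$

which implies the uniqueness of the positive solution to Algebraic Riccati Eq. (59).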

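Proof The uniqueness of the optimal control is an immediate consequence of the strict convexity of the cost functional in both control variables $u^1$ and $u^2$. We now show that $u^*$ is optimal.

Note that the constant positive matrix $P_2$ solves the Riccati differential Eq. (32) with $G:=P_2$. Define for $T>0$, $(u^1,u^2)\in L^2(0,T;\mathbb{R}^{l_1})\times L^2_{\mathcal{F}}(0,T;\mathbb{R}^{l_2})$, and $u:=(u^1,u^2)$,

$$J_T(u):=\frac12E\int_0^T\left[\langle QX_s,X_s\rangle+\langle R^1u_s^1,u_s^1\rangle+\langle R^2u_s^2,u_s^2\rangle\right]ds.$$

For any admissible pair $(u^1,u^2)$, from Theorem 2, we have

$$E\left\langle P_2X_T^u,X_T^u\right\rangle+J_T(u)\ge\langle P_2x_0,x_0\rangle.$$

Therefore, letting $T\to\infty$, we have $J(u)\ge\langle P_2x_0,x_0\rangle$.

On the other hand, define $p_s:=P_1\tilde X_s^*+P_2\bar X_s^*$ for $s\ge0$. Using Itô's formula to compute the inner product $\langle p,X^*\rangle$, we have

$$E\left\langle P_1\tilde X_T^*+P_2\bar X_T^*,X_T^*\right\rangle+J_T\left(u^*|_{[0,T]}\right)=\langle P_2x_0,x_0\rangle,\quad\forall T>0. \tag{67}$$

Since

$$E\left\langle P_1\tilde X_T^*+P_2\bar X_T^*,X_T^*\right\rangle=E\left\langle P_1\tilde X_T^*,\tilde X_T^*\right\rangle+E\left\langle P_2\bar X_T^*,\bar X_T^*\right\rangle\ge0,$$

we have $J_T\left(u^*|_{[0,T]}\right)\le\langle P_2x_0,x_0\rangle$ for any $T>0$, and thus $X^*$ is stable and $u^*$ is admissible.

Passing to the limit $T\to\infty$ in (67), we have

$$J(u^*)=\langle P_2x_0,x_0\rangle. \tag{68}$$

The proof is complete.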

Acknowledgements Hu acknowledges research supported by Lebesgue center of mathematics “Investissements d’avenir” program-ANR-11-LABX-0020-01, by CAESARS-ANR-15-CE05-0024 and by MFG-ANR-16-CE40-0015-01. Tang acknowledges research supported by National Science Foundation of China (Grant No. 11631004) and Science and Technology Commission of Shanghai Municipality (Grant No. 14XD1400400).

Availability of data and materials


Authors’ contributions

Both authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

References

Bensoussan, A.: Lectures on stochastic control. In: Mitter, S.K., Moro, A. (eds.) Nonlinear filtering and stochastic control, Proceedings of the 3rd 1981 Session of the Centro Internazionale Matematico Estivo (C.I.M.E.), Held at Cortona, July 1–10, 1981, pp. 1–62. Lecture Notes in Mathematics 972. Springer-Verlag, Berlin (1982)

Bismut, J.M.: Linear quadratic optimal stochastic control with random coefficients. SIAM J. Control Optim. 14, 419–444 (1976)

Bismut, J.M.: On optimal control of linear stochastic equations with a linear-quadratic criterion. SIAM J. Control Optim. 15, 1–4 (1977). http://epubs.siam.org/doi/10.1137/0315001

Buckdahn, R., Li, J., Peng, S.: Mean-field backward stochastic differential equations and related partial differential equations. Stoch. Process. Appl. 119, 3133–3154 (2009)

Haussmann, U.G.: Optimal stationary control with state and control dependent noise. SIAM J. Control 9, 184–198 (1971)

Kohlmann, M., Tang, S.: Minimization of risk and linear quadratic optimal control theory. SIAM J. Control Optim. 42, 1118–1142 (2003)

Peng, S.: Stochastic Hamilton-Jacobi-Bellman equations. SIAM J. Control Optim. 30, 284–304 (1992)

Tang, S.: General linear quadratic optimal stochastic control problems with random coefficients: linear stochastic Hamilton systems and backward stochastic Riccati equations. SIAM J. Control Optim. 42, 53–75 (2003)

Wonham, W.M.: On a matrix Riccati equation of stochastic control. SIAM J. Control 6, 681–697 (1968)

Wu, H., Zhou, X.: Stochastic frequency characteristics. SIAM J. Control Optim. 40, 557–576 (2001)

Yong, J., Zhou, X.Y.: Stochastic Controls: Hamiltonian Systems and HJB Equations. Springer–Verlag,