Linear quadratic optimal control of conditional McKean-Vlasov equation with random coefficients and applications

(1)

Probability, Uncertainty and Quantitative Risk (2016) 1:7

DOI 10.1186/s41546-016-0008-x

R E S E A R C H Open Access

Linear quadratic optimal control of conditional

McKean-Vlasov equation with random coefficients and

applications

Huyˆen Pham

Received: 25 April 2016 / Accepted: 10 August 2016 /

© The Author(s). 2016Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Abstract We consider the optimal control problem for a linear conditional

McKean-Vlasov equation with quadratic cost functional. The coefficients of the system and the weighting matrices in the cost functional are allowed to be adapted processes with respect to the common noise filtration. Semi closed-loop strategies are introduced, and following the dynamic programming approach in (Pham and Wei, Dynamic programming for optimal control of stochastic McKean-Vlasov dynamics, 2016), we solve the problem and characterize time-consistent optimal control by means of a system of decoupled backward stochastic Riccati differential equations. We present several financial applications with explicit solutions, and revisit, in particu-lar, optimal tracking problems with price impact, and the conditional mean-variance portfolio selection in an incomplete market model.

Keywords Stochastic McKean-Vlasov SDEs·Random coefficients·Linear

quadratic optimal control·Dynamic programming·Riccati equation·Backward stochastic differential equation

MSC Classification49N10·49L20·60H10·93E20

H. Pham ()

Laboratoire de Probabilités et Modèles Aléatoires, CNRS, UMR 7599, Université Paris Diderot, Paris, France

e-mail: [email protected] H. Pham

(2)

Introduction and problem formulation

Let us formulate the linear quadratic optimal control of conditional (also called stochastic) McKean-Vlasov equation with random coefficients (LQCMKV in short form). Consider the controlled stochastic McKean-Vlasov dynamics inRd _{given by}

dXt = bt

Xt,E[Xt|W0], αt

dt+σt

dWt

+σt0

dWt0, 0≤t≤T , X0 = ξ0. (1)

HereW, W0 are two independent one-dimensional Brownian motions on some

probability space(,F,P),F0 = (Ft0)0≤t_≤T is the natural filtration generated by W0,F=(Ft)0≤t≤T is the natural filtration generated by(W, W0), augmented with an independentσ-algebra G,ξ0 ∈ L2(G;Rd)is a square-integrableG-measurable

random variable with values in _Rd_,_E_[_X

t|W0] denotes the conditional expecta-tion of Xt given the whole σ-algebra F_T0 of W0, and the control process α is anF0_{-progressively measurable process with values in}_A _{equal either to}_Rm _{or to} L(Rd;Rm)the set of Lipschitz functions fromRd intoRm. This distinction of the

control sets will be discussed later in the introduction, but for the moment, one may

interpret roughly the case whenA = Rm _{as the modeling for open-loop control}

and the case whenA= L(Rd;Rm)as the modeling for closed-loop control. When A=Rm, we require thatαsatisfies the square-integrability conditionL2(×[0, T]),

i.e.,ET 0 |αt|2dt

<∞, and we denote byAthe set of control processes. The

coef-ficientsbt(x,x, a), σ¯ t(x,x, a), σ¯ t0(x,x, a),¯ 0 ≤ t ≤ T, areF0-adapted processes with values inRd_{, for any}_x,_x_¯ _∈_Rd_,_a_∈_A_{, and of linear form:}

bt(x,x, a)¯ =

b_t0+Btx+ ¯Btx¯+Cta ifA=Rm bt0+Btx+ ¯Btx¯+Cta(x)ifA=L(Rd;Rm)

σt(x,x, a)¯ =

γt+Dtx+ ¯Dtx¯+Fta ifA=Rm γt+Dtx+ ¯Dtx¯+Fta(x) ifA=L(Rd;Rm)

σt0(x,x, a)¯ =

γ_t0+D_t0x+ ¯D_t0x¯+F_t0a ifA=Rm

γ_t0+D_t0x+ ¯D_t0x¯+F_t0a(x) ifA=L(Rd;Rm),

(2)

whereb0, γ , γ0areF0-adapted processes vector-valued inRd, satisfying a

square-integrability conditionL2(×[0, T]):E₀T|bt|2+ |b0t|2+ |γt|2+ |γt0|2dt

<∞,

B,B¯,D,D, D¯ 0,D¯0are essentially boundedF0-adapted processes matrix-valued in Rd×d_{, and}_C_,_F_,_F0_{are essentially bounded}_F0_{-adapted processes matrix-valued in} Rd×m_{. For any}_α_∈_A_{, there exists a unique strong solution}_X₌_Xα_{to (1), which is}

F-adapted, and satisfies the square-integrability conditionS2₍_{× [}₀_{, T}_]₎_:

E sup

0≤t≤T| Xαs|2

≤ Cα

1+E|ξ0|2

< ∞, (3)

for some positive constantCαdepending onα: whenA=Rm, Cαdepends onαvia

ET 0 |αt|2dt

(3)

The cost functional to be minimized overα∈Ais:

J (α) = E T

0 ft

Xαt,E

Xtα|W0

, αt

dt+gX_Tα,EXα_T|W0,

→ V0 := inf α∈AJ (α),

where_{ft(x,x, a),¯ 0 ≤ t ≤ T}, is anF0-adapted real-valued process,g(x,x)¯ is a F0

T-measurable random variable, for anyx,x¯∈R

d_,_a_∈_A_{, of quadratic form:}

ft(x,x, a)¯ =

xQtx+ ¯xQ¯tx¯+Mtx+aNta ifA=Rm xQtx+ ¯xQ¯tx¯+Mtx+a(x)Nta(x) ifA=L(Rd;Rm) g(x,x)¯ =xP x+ ¯xP¯x¯+Lx,

(4)

whereQ,Q¯ are essentially boundedF0-adapted processes, with values inSdthe set

of symmetric matrices inRd×d_,_P_,_P¯_{are essentially bounded}_F0

T-measurable random

matrices inSd_,_N_{is an essentially bounded}_F0_{-adapted process, with values in}_Sm_,_M is anF0_{-adapted process with values in}_Rd_{, satisfying a square integrability condition} L2(× [0, T]),Lis anF_T0-measurable square integrable random vector inRd, and _{denotes the transpose of any vector or matrix.}

The above control formulation of stochastic McKean-Vlasov equations provides a unified framework for some important classes of control problems. In particular, it is motivated in particular by the asymptotic formulation of cooperative equilibrium for a large population of particles (players) in mean-field interaction under common noise (see, e.g., (Carmona and Zhu 2016; Carmona et al. 2013)) and also occurs when the cost functional involves the first and second moment of the (conditional) law of the state process, for example in (conditional) mean-variance portfolio selection problem (see, e.g., (Basak and Chabakauri 2010; Borkar and Kumar 2010; Li and Zhou 2000)). WhenA=L(Rd;Rm), this corresponds to the problem of a

(represen-tative) agent, using a controlαbased on her/his current private stateXt at timet, and of the information brought by the common noiseF0

t, typically the conditional mean

E[Xt|W0], which represents, in the large population equilibrium interpretation, the limit of the empirical mean of the state of all the players when their number tend to infinity from the propagation of chaos. In other words, the controlαmay be viewed

as a semi closed-loop control, i.e., closed-sloop w.r.t. the state process, and open-loop w.r.t. the common noiseW0, or alternatively as aF0-progressively measurable

ran-dom field controlα = αt(x),0≤t≤T , x∈Rd

. This class of semi closed-loop control extends the class of closed-loop strategies for the LQ control of McKean-Vlasov equations (or mean-field stochastic differential equations) without common noiseW0, as recently studied in (Li et al. 2016) where the controls are chosen at any

timetin linear form w.r.t. the current state valueXt and the deterministic expected valueE[Xt]. WhenA=Rm, the LQCMKV problem may be viewed as a special par-tial observation control problem for a state dynamics like in 1 where the controls are of open-loop form, and adapted w.r.t. an observation filtrationFI ₌_F0_{generated by}

some exogenous random factor processI driven byW0. In the case whereσ =0,

we see that the processXisF0_{-adapted, hence}_E_[_X

(4)

1999)) with random coefficients, with open-loop controls forA = Rm _or closed-loop controls forA=L(Rd;Rm). Note that this distinction between open-loop and

closed-loop strategies for LQ control problems has been recently introduced in (Sun and Yong 2014) where closed-loop controls are assumed of linear form w.r.t. the cur-rent state value, while it is considered here a priori only Lipschitz w.r.t. the curcur-rent state value.

Optimal control of McKean-Vlasov equation is a rather new topic in the area of stochastic control and applied probability, and addressed, e.g., in (Andersson and Djehiche 2010; Bensoussan et al. 2013; Buckdahn et al. 2011; Carmona and Delarue 2015; Pham and Wei 2015). In this McKean-Vlasov context, the class of lin-ear quadratic optimal control, which provides a typical case for solvable applications, has been studied in several papers, among them (Hu et al. 2012; Huang et al. 2015; Sun 2015; Yong 2013) where the coefficients are assumed to be deterministic. It is often argued that due to the presence of the law of the state in a nonlinear way (here for the LQ problem, the square of the expectation), the problem is time-inconsistent in the sense that an optimal control viewed from today is no more optimal when viewed from tomorrow, and this would prevent a priori the use of the dynamic pro-gramming method. To tackle time inconsistency, one then focuses typically on either pre-commitment strategies, i.e. controls that are optimal for the problem viewed at the initial time, but may be not optimal at future date, or game-equilibrium strate-gies, i.e., control decisions considered as a game against all the future decisions the controller is going to make.

In this paper, we shall focus on the optimal control for the initial value V0 of

the LQCMKV problem with random coefficients, but following the approach devel-oped in (Pham and Wei 2016), we emphasize that time consistency can be actually restored for pre-commitment strategies, provided that one considers as state variable the conditional law of the state process instead of the state itself, therefore making possible the use of the dynamic programming method. We show that the dynamic version of the LQCMKV control problem defined by a random field value function, has a quadratic structure with respect to the conditional law of the state process, lead-ing to a characterization of the optimal control in terms of a decoupled system of backward stochastic Riccati equations (BSREs) whose existence and uniqueness are obtained in connection with a standard LQ control problem. The main ingredient for such derivation is an Itˆo’s formula along a flow of conditional measures and a suit-able notion of differentiability with respect to probability measures. We illustrate our results with several financial applications. We first revisit the optimal trading and benchmark tracking problem with price impact for general price and target processes, and obtain closed-form solutions extending some known results in the literature. We next solve a variation of the mean-variance portfolio selection problem in an incom-plete market with random factor. Our last example considers an interbank systemic risk model with random factor in a common noise environment.

(5)

equations” section is devoted to the characterization of the optimal control by means of a system of BSREs in the case of both a control setA=Rm_and_A₌_L(_Rd_;_Rm₎_. We develop in “Applications” section the applications.

We end this introduction with some notations.

Notations. We denote byP2(Rd)the set probability measuresμonRd, which are square integrable, i.e.,μ2₂ :=_Rd|x|2μ(dx) <∞. For anyμ∈P2(Rd), we denote byL2_μ(Rq)the set of measurable functionsϕ:Rd→Rq, which are square integrable

with respect toμ, byL2_μ_⊗_μ(Rq)the set of measurable functionsψ:Rd×Rd→Rq,

which are square integrable with respect to the product measureμ⊗μ, and we set

μ(ϕ):=

ϕ(x) μ(dx),μ¯ :=

xμ(dx), μ⊗μ(ψ) :=

ψ(x, x)μ(dx)μ(dx).

We also defineL∞μ(Rq)(resp.L∞μ⊗μ(Rq)) as the subset of elementsϕ∈L2μ(Rq) (resp.L2_μ_⊗_μ(Rq₎_{) which are bounded}_μ_(resp._μ_⊗_μ_{) a.e., and}_ϕ

∞is their essen-tial supremum. For any random variableX on (,F,P), we denote byL(X) its

probability law (or distribution) under _P, by L(X|W0) its conditional law given

F0

T, and we shall assume w.l.o.g. thatGis rich enough in the sense thatP2(Rd)=

L(ξ ):ξ ∈L2(G;Rd).

Preliminaries

For anyα∈A, andXα=(Xtα)0≤t≤T the solution to (1), we defineραt =L(Xtα|W0) as the conditional law of Xαt givenFT0 for 0 ≤ t ≤ T. Since Xα is F-adapted, and W0 is a (P,F)-Wiener process, we notice that ρ_tα(dx) = PXα_t ∈dx|F_T0

=PXα_t ∈dx|F_t0, and thusρ_tα,0≤t ≤Tadmits anF0-progressively

measur-able modification (see, e.g., Theorem 2.24 in (Bain and Crisan 2009)), that will be identified with itself in the sequel, and is valued inP2(Rd)by (3), namely:

E sup

0≤t≤T ρ_tα2₂

≤ Cα

1₊_E_|ξ0|2

. (5)

Moreover, we mention that the processρα=(ρtα)0≤t≤T has continuous trajec-tories as it is valued in P2(C([0, T];Rd)the set of square integrable probability measures on the spaceC([0, T];Rd)of continuous functions from[0, T]intoRd.

Now, by the law of iterated conditional expectations, and recalling thatα∈Ais F0_{-progressively measurable, we can rewrite the cost functional as}

J (α) = E T

0 E

ft

Xtα,ρ¯tα, αt Ft0

dt+EgX_Tα,ρ¯_TαF_T0

= E T

0 ρ α t

ft

.,ρ¯tα, αt

dt+ρ_Tαg.,ρ¯_Tα

= E T

0 fˆt

(6)

where we used in the second equality the fact that{ft(x,x, a), x,¯ x¯ ∈Rd, a∈A,0≤ t ≤ T}, is a random fieldF0-adapted process,g(x)isF_T0-measurable, and theF0

-adapted processfˆt(μ, a),0≤t ≤T

, theF0

T-measurable random variableg(μ)ˆ , forμ∈P2(Rd),a∈A, are defined by

_ˆ

ft(μ, a) := μ (ft(.,μ, a))¯ =

ft(x,μ, a)μ(dx)¯ ˆ

g(μ) := μ (g(.,μ))¯ = g(x,μ)μ(dx).¯

From the quadratic forms of f, g in (4), the random fields fˆt(μ, a) and

ˆ

g(μ), (t, μ, a)∈ [0, T] ×P₂(Rd)×A, are given by

ˆ

ft(μ, a) =

⎧ ⎪ ⎪ ⎨ ⎪ ⎪ ⎩

Var(μ, Qt)+v2(μ, Qt+ ¯Qt)

+v1(μ, Mt)+aNta ifA=Rm Var(μ, Qt)+v2(μ, Qt+ ¯Qt)

+v1(μ, Mt)+

[a(x)Nta(x)]μ(dx)ifA=L

Rd_;_Rm_, ˆ

g(μ) = Var(μ, P )+v2(μ, P+ ¯P )+v1(μ, L),

(7)

where we define the functions onP2(Rd)×SdandP2(Rd)×Rdby:

Var(μ, k):=

(x− ¯μ)k(x− ¯μ)μ(dx), μ∈P2

Rd

, k∈Sd,

v2(μ, ) := ¯μμ,¯ μ∈P2

Rd

, ∈Sd

v1(μ, y):= yμ,¯ μ∈P₂Rd, y∈Rd.

We shall make the following assumptions on the coefficients of the model:

(H1)Q,Q+ ¯Q,P,P + ¯P,Nare nonnegative a.s.; (H2)One of the two following conditions holds:

(i) Nis uniformly positive definite i.e.Nt ≥δIm,0≤t ≤T, a.s. for someδ >0; (ii) P orQis uniformly positive definite, and F is uniformly nondegenerate, i.e.

|Ft| ≥δ,0≤t≤T, a.s., for someδ >0.

Let us define the dynamic formulation of the stochastic McKean-Vlasov control problem. For anyt∈ [0, T], ξ∈L2G;Rd, andα∈A, there exists a unique strong

solution, denoted by_{Xt,ξ,αs , t ≤ s ≤ T}, to the Eq. 1 starting from ξ at time t, and by noting thatXt,ξ,α _{is also unique in law, we see that the conditional law of} Xt,ξ,αs givenF_T0 depends onξ only through its lawL(ξ )=Lξ|W0(recall thatG is independent ofW0). Then, recalling also thatGis rich enough, the relation

ρ_st,μ,α :=LXt,ξ,α_s |W0, t≤s≤T , μ=L(ξ ),

defines for any t ∈ [0, T], μ∈ P₂(Rd), andα ∈ A, an F0-progressively

measur-able continuous process (up to a modification)ρst,μ,α, t≤s≤T

, with values in P2(Rd), and as a consequence of the pathwise uniqueness of the solution{X

t,ξ,α

s , t≤

s≤T}, we have the flow property for the conditional law (see Lemma 3.1 in (Pham

(7)

ραs = ρ t,ρtα,α

s , t ≤s≤T , α∈A. (8) We then consider the conditional cost functional

Jt(μ, α)=E

T

t ˆ

fsρst,μ,α, αsds+ ˆg

ρ_Tt,μ,αFt0

, t∈[0, T],μ∈P2(Rd), α∈A,

which is well-defined by (5) and under the boundedness assumptions on the weight-ing matrices of the quadratic cost function. We next define theF0_{-adapted random}

field value function

vt(μ) = ess inf

α∈A Jt(μ, α), t ∈ [0, T], μ∈P2

Rd

,

so that

V0 := inf

α∈AJ (α) = v0(L(ξ0)), (9) which may take a priori for the moment the value_−∞. We shall see later that the Assumptions(H1)and(H2)will ensure thatV0is finite and there exists an optimal

control. The dynamic counterpart of (9) is given by

Vtα := ess inf β∈At(α)

Jt(ρtα, β) = vt(ρtα), t ∈ [0, T], α∈A, (10)

whereAt(α)= {β ∈ A: βs = αs, s ≤t}, and the second equality in (10) follows from the flow property (8) and the observation thatρtβ =ρtαforβ ∈At(α).

By using general results in (El Karoui ) for dynamic programming, one can show (under the condition that the random field v(μ) is finite) that the process

vt(ρtα)+

t

0fˆs(ρsα, αs)ds,0≤t ≤T

is a(P,F0)-submartingale, for anyα∈ A,

andα∗∈Ais an optimal control forV0if and only if{vt(ρα

∗

t )+

t 0fˆs(ρα

∗

s , αs∗)ds,0≤ t ≤ T}is a(P,F0)-martingale. We shall use a converse result, namely a dynamic

programming verification theorem, which takes the following formulation in our context.

Lemma 2.1Suppose that one can find anF0-adapted random field{wt(μ),0≤ t≤T , μ∈P2(Rd)}satisfying the quadratic growth condition

|wt(μ)| ≤ Cμ2₂+It, μ∈P2(Rd), 0≤t≤T , a.s. (11)

for some positive constant C, and nonnegative F0_{-adapted process I with} Esup₀_≤_t_≤_T_|It|<∞, such that

(i) wT(μ)= ˆg(μ),μ∈P2

Rd_; (ii) wt(ρtα)+

t 0fˆs

ρ_sα, αs

ds,0≤t≤Tis a(P,F0)local submartingale, for

anyα∈A;

(iii) there exists αˆ ∈ A such that

wt(ρtαˆ)+

t 0fˆs

ρsαˆ,αˆs

ds,0≤t≤T

is a

(P,F0)local martingale.

Thenαˆ is an optimal control forV0, i.e.V0=J (α)ˆ , and

(8)

Moreover,αˆ is time consistent in the sense that

V_tαˆ = Jt

ρα_tˆ,αˆ, ∀0≤t ≤T .

ProofBy the local submartingale property in condition (ii), there exists a nonde-creasing sequence ofF0_{-stopping times}_(τ

n)n, τnT a.s., such that

E

wτn

ρταn

+ τn 0

ˆ

ft

ρtα, αt

dt

≥ w0ρ₀α = w0(L(ξ0)), ∀α∈A (12)

From the quadratic form off in (4), we easily see that for alln,

E τn 0

ˆ

ft

ρ_tα, αt

dt

≤ Cα

1+E sup

0≤t_≤T ρ_tα2₂

,

for some positive constantCαdepending onα(whenA=Rm, Cαdepends onαvia

ET 0 |αt|2dt

<∞, and whenA=L(Rd;Rm), Cαdepends onαvia its Lipschitz constant). Together with the quadratic growth condition ofw, and from (5), one can then apply dominated convergence theorem by sendingnto infinity into (12), and get

w0(L(ξ0))≤E

wT

ρ_Tα+

T

0

ˆ

ft

ρα_t, αt dt = E ˆ

gρ_Tα+

T

0

ˆ

ft

ρ_tα, αt

dt

= J (α)

where we used the terminal condition (i), and the expression (6) of the cost functional. Sinceαis arbitrary inA, this shows thatw0(L(ξ0))≤V0. The equality is obtained

with the local martingale property forαˆin condition (iii).

From the flow property (8), and sinceρtβ =ρtαˆ forβ ∈At(α)ˆ , we notice that the local submartingale and martingale properties in (ii) and (iii) are formulated on the interval[t, T]as:

• ws

ρt,ρtαˆ,β

s

+s t fˆu

ρt,ραtˆ,β

u , βu

du, t≤s≤T

is a (P,F0) local

sub-martingale, for anyβ∈At(α)ˆ ; • ws

ρt,ρˆ

α t,αˆ s

+s tfˆu

ρt,ρˆ

α t,αˆ u ,αˆu

du, t≤s≤T

is a(P,F0)local

martin-gale.

By the same arguments as for the initial date, this implies thatVtαˆ=Jt

ρtαˆ,αˆ

=

wt(ρtαˆ), which means thatαˆ is an optimal control over[t, T], once we start at timet from the initial stateρtαˆ, i.e., the time consistency ofαˆ.

The practical application of Lemma 2.1 consists in finding a random field

wt(μ), μ∈P2(Rd),0≤t ≤T

_{, smooth (in a sense to be precised), so that one can}

apply an Itˆo’s formula towt(ρtα)+

t

0fˆsρsα, αsds,0≤t ≤T

, and check that the finite variation term is nonnegative for anyα∈A(the local submartingale

(9)

(Lions 2012). We briefly recall the basic definitions and refer to (Cardaliaguet 2012) for the details, see also (Buckdahn et al. 2014, Chassagneux et al. 2015). This notion is based on theliftingof functionsudefined onP2(Rd)into functionsUdefined on

L2(G;Rd)by settingU (X)=u(L(X)). We say thatuis differentiable (resp.C1) on

P2(Rd)if the liftUis Fréchet differentiable (resp. Fréchet differentiable with con-tinuous derivatives) onL2(G;Rd). In this case, the Fréchet derivative viewed as an

elementDU (X)ofL2(G;Rd)by Riesz’s theorem can be represented as

DU (X) =∂μu(L(X))(X),

for some function∂μu(L(X)):Rd →Rd, which is called derivative ofuatμ = L(X). Moreover,∂μu(μ)∈ L2μ(Rd)forμ ∈P2(Rd)=

L(X):X∈L2(G;Rd).

Following (Chassagneux et al. 2015), we say thatuis fullyC2_{if it is}_C1_{, the mapping} (μ, x)∈P₂(Rd)×Rd→∂μu(μ)(x)is continuous and

(i) for each fixedμ∈P₂(Rd), the mappingx∈Rd→∂μu(μ)(x)is differentiable in the standard sense, with a gradient denoted by∂x∂μu(μ)(x)∈Rd×d, and s.t. the mapping(μ, x)∈P2(Rd)×Rd→∂x∂μu(μ)(x)is continuous;

(ii) for each fixed x∈ Rd_{, the mapping}_μ _∈ _P

2(Rd) →∂μu(μ)(x) is differen-tiable in the above lifted sense. Its derivative, interpreted thus as a mapping

x∈Rd →∂μ∂μu(μ)(x)(x)∈Rd×d inL2μ(Rd×d), is denoted byx ∈Rd → ∂μ2u(μ)(x, x), and s.t. the mapping(μ, x, x)∈ P2(Rd)×Rd ×Rd →

∂μ2u(μ)(x, x)is continuous. We say thatu∈C2

b(P2(Rd))if it is fullyC2, ∂x∂μu(μ)∈L∞μ(Rd×d), ∂μ2u(μ)∈ L∞_μ_⊗_μ(Rd×d)for anyμ∈P₂(Rd), and for any compact setKofP₂(Rd), we have

sup μ∈K

Rd

∂μu(μ)(x)2μ(dx)+ ∂x∂μu(μ)_∞+ ∂μ2u(μ)∞

< ∞.

We next need an Itˆo’s formula along a flow of conditional measures proved in (Carmona and Delarue 2014) for processes with common noise. In our context, for the flow of the conditional lawρ_tα, 0≤t ≤T , α∈A, it is formulated as follows. Let

u∈C2

b(P2(Rd)). Then, for allt∈ [0, T], we have

u(ρ_tα) =u(L(ξ0))+ t

0 ρ α t

Lαt

t u(ρtα)

+ρ_tα⊗ρ_tαMαt t u(ρtα)

dt

+

t

0 ρ α t

Dαt

t u(ρtα)

dW_t0, (13)

where for(t, μ, a)∈ [0, T] ×P₂(Rd)×A,La_tu(μ),Da_tu(μ)are theF_t0-measurable

random functions inL2_μ(R)defined by

La

tu(μ)(x) := bt(x,μ, a).∂¯ μu(μ)(x)+ 1 2tr

∂x∂μu(μ)(x)

σtσt+σt0(σt0)

(x,μ, a)¯ ,

Da

tu(μ)(x) := ∂μu(μ)(x)σt0(x,μ, a),¯

and_Ma

tu(μ)is theFt0-measurable random function inL2μ⊗μ(R)defined by

Ma

tu(μ)(x, x):= 1 2tr

(10)

The dynamic programming verification result in Lemma 2.1 and Itˆo’s for-mula (13) are valid for a general stochastic McKean-Vlasov equation (beyond the LQ framework), and by combining with an Itˆo-Kunita type formula for ran-dom field processes, similar to the one in (Kunita 1982), one could apply it to

wt(ρtα)+

t 0fˆs

ρsα, αs

ds,0≤t ≤T

in order to derive a form of stochastic Hamilton-Jacobi-Bellman, i.e., a backward stochastic partial differential equation (BSPDE) forwt(μ), as done in (Peng 1992) for controlled diffusion processes with random coefficients. We postpone this general approach for further study and, in the next sections, return to the important special case of LQCMKV problem[s] for which we show that BSPDE[s] are reduced to backward stochastic Riccati equations (BSRE) as in the classical LQ framework.

Backward stochastic Riccati equations

We search for anF0_{-adapted random field solution to the LQCMKV problem in the}

quadratic form

wt(μ) = Var(μ, Kt)+v2(μ, t)+v1(μ, Yt)+χt, (14)

for someF0_{-adapted processes}_{(K, , Y, χ )}_{, with values in}_Sd_×_Sd_×_Rd_×_R_{, and} in the backward SDE form

⎧ ⎪ ⎪ ⎨ ⎪ ⎪ ⎩

dKt = ˙Ktdt+ZKt dWt0, 0≤t≤T , KT =P dt = ˙tdt+Zt dWt0, 0≤t ≤T , T =P + ¯P

dYt = ˙Ytdt+ZtYdWt0, 0≤t ≤T , YT =L dχt = ˙χt+ZtχdWt0, 0≤t≤T , χT =0,

(15)

for some_F0_{-adapted processes} _K,_˙ _{, Z}_˙ K_{, Z} _{with values in} _Sd_,_{Y , Z}˙ Y _with val-ues in_Rd_{, and}_{χ , Z}_˙ χ _{with values in}_R_{. Notice that the terminal conditions in (15)} ensure by (7) thatwin (14) satisfies:wT(μ)= ˆg(μ), and we shall next determine the generatorsK,˙ ,˙ Y˙, andχ˙ in order to satisfy the local (sub)martingale

condi-tions of Lemma 2.1. Notice that the funccondi-tions Var, v2, v1are smooth w.r.t. both their

arguments, and we have

∂μVar(μ, k)(x) = 2k(x− ¯μ), ∂x∂μVar(μ, k)(x) = 2k = −∂μ2Var(μ, k)(x, x), ∂kVar(μ, k) = Var(μ) := (x− ¯μ)(x− ¯μ)μ(dx)

∂μv2(μ, )(x) = 2μ, ∂¯ x∂μv2(μ, )(x) = 0, ∂μ2v2(μ, )(x, x) = 2, ∂v2(μ, ) = ¯μμ¯

∂μv1(μ, y) = y, ∂x∂μv1(μ, y) = 0 = ∂μ2v1(μ, y)(x, x), ∂yv1(μ, y) = ¯μ.

(16) Let us denote, for any α ∈ A, by Sα the F0-adapted process equal to S_tα = wt(ρtα)+

t

0fˆs(ρsα, αs)ds,0≤t ≤T, and observe then by Itˆo’s formula (13) that it is of the form

(11)

with a drift termDtα=Dt

ρtα, αt, Kt, t, Yt

given by

Dt(μ, a, k, , y) = ˆft(μ, a)+μLatVar(μ, k)+Latv2(μ, )+Latv1(μ, y)

+μ⊗μMa_tVar(μ, k)+Ma_tv2(μ, )+Ma_tv1(μ, y)

+tr∂kVar(μ, k)K˙t

+tr∂v2(μ, )˙t

+∂yv1(μ, y)Y˙t+ ˙χt

+tr∂kμ

Da

tVar(μ, k)

ZK_t

+tr∂μ

Da tv2(μ, )

Z_t

+∂yμDatv1(μ, )

ZY_t,

for allt∈ [0, T], μ∈P₂Rd, k, ∈Sd,y∈Rd,a∈A. (The second-order

deriva-tives terms w.r.t.k,andydo not appear since the functionsv2,Var andv1are linear,

respectively, ink,andy, respectively). From the derivatives expression of Var, v2

andv1in (16), we then have

Dt(μ, a, k, , y)= ˆft(μ, a)+

bt(x,μ, a)¯ [2k(x− ¯μ)+2μ¯ +y]μ(dx)

+ σt(x,μ, a)¯ kσt(x,μ, a)¯ +σt0(x,μ, a)¯ kσ 0 t(x,μ, a)¯

μ(dx)

+

σ_t0(x,μ, a)μ(dx)¯

(−k)

σ_t0(x,μ, a)μ(dx)¯

+Var(μ,K˙t)+v2(μ,˙t)+v1(μ,Y˙t)+ ˙χt

+ σ_t0(x,μ, a)¯ 2ZK_t (x− ¯μ)+2Z_tμ¯ +ZY_tμ(dx).

(17)

We now distinguish between the cases when the control setAisRm_(LQCMKV1)

orL(Rd;Rm)(LQCMKV2).

Control setA=Rm

From the linear form ofbt, σt, σt0in (2), and the quadratic form offˆt in (7), after some straightforward calculations, we have:

Dt(μ, a, k, , y) = Var

μ, t

k, ZK_t + ˙Kt

+v2μ, t

k, , Z_t+ ˙t

+v1

μ, t

k, , Zt, y, ZYt

+ ˙Yt

+t

k, , y, ZYt

+ ˙χt

+at(k, )a+

2Ut

k, , Zt

¯

μ+Rt

k, , y, ZtY

a

(12)

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

tk, ZKt

= Qt+Btk+kBt+DtkDt+Dt0

kD_t0+D0_tZ_tK+Z_tKD_t0 t

k, , Z t

= Qt+ ¯Qt+

Dt+ ¯Dt

kDt+ ¯Dt

+D0_t + ¯D0_tD0_t + ¯D_t0

+(Bt+ ¯Bt)+(Bt+ ¯Bt)+

D_t0+ ¯Dt0

Z_t +Z_t

D0_t + ¯Dt0

tk, , Zt, y, ZtY

= Mt+(Bt+ ¯Bt)y+2b0t +2(Dt+ ¯Dt)kγt+2

D_t0+ ¯D0_tγ_t0

+ D_t0+ ¯D0_tZY

t +2Ztγt0 tk, , y, ZtY

= yb0_t +γ_tkγt+γt0

γ_t0+Z_tYγ_t0 t(k, ) = Nt+FtkFt+Ft0

F_t0 Ut

k, , Z t

= (Dt+ ¯Dt)kFt+

D0_t + ¯D_t0F_t0+Ct+Zt Ft0

Rtk, , y, ZYt

= 2F_tkγt+2Ft0

γ_t0+C_ty+F_t0ZY_t.

(18) Then, after square completion under the condition thatt(k, )is positive definite inSm_{, we have}

Dt(μ, a, k, , y)=Var

μ, t

k, ZK_t + ˙Kt

+v2

μ, t

k, , Z_t −Ut

k, , Z_t−_t1(k, )U_tk, , Z_t+ ˙t

+v1

μ,t

k, , Zt, y, ZYt

−Ut

k, , Zt

t−1(k, )Rt

k, , y, ZYt

+ ˙Yt

+t

k, , y, Z_tY−1 4R

t

k, , y, Z_tY−_t1(k, )Rt

k, , y, Z_tY+ ˙χt

+a− ˆat(μ, k, , y)¯ t(k, )a− ˆat(μ, k, , y)¯ ,

where

ˆ

at(μ, k, , y)¯ = −−t 1(k, )

U_tk, , Z_t μ¯+1

2Rt

k, , y, Z_tY.

Therefore, whenever

˙

Kt+t

Kt, ZKt

= 0,

˙

t+t

Kt, t, Zt

−Ut

Kt, t, Zt

−_t1(Kt, t)Ut

Kt, t, Zt

= 0,

˙

Yt+t

Kt, t, Zt, Ys, ZtY

−UtKt, t, Zt

−_t1(Kt, t)Rt

Kt, t, Yt, ZYt

= 0,

˙

χt+t

Kt, t, Yt, ZYt

−1₄R_tKt, t, Yt, ZtY

_t−1(Kt, t, Yt) Rt

Kt, t, Yt, ZYt

= 0,

holds for all 0≤t≤T, we have

Dα_t = Dt(ρtα, αt, Kt, t, Yt) (19) = αt− ˆatρ¯tα, Kt, t, Ytt(Kt, t)αt− ˆatρ¯tα, Kt, t, Yt,

which implies that Dtα ≥ 0, 0 ≤ t ≤ T, for all α ∈ A, i.e. Stα = wt(ρtα)+

t

0fˆs(ρsα, αs)ds,0≤t ≤T satisfies the

(13)

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

dKt = −tKt, ZKt

dt+Z_tKdW_t0, 0≤t≤T , KT =P dt = −

t(Kt, t, Zt)−Ut

Kt, t, Zt

_t−1(Kt, t)Ut

Kt, t, Zt

dt

+Z_tdW_t0, 0≤t≤T , T =P+ ¯P dYt = −

t

Kt, t, Zt , Yt, ZYt

−Ut

Kt, t, Zt

t−1(Kt, t)Rt

Kt, t, Yt, ZYt

dt

+ZY

t dWt0, 0≤t≤T , YT =L, dχt = −

t

Kt, t, Yt, ZYt

−1 4Rt

Kt, t, Yt, ZYt

t−1(Kt, t)Rt

Kt, t, Yt, ZYt

dt

+Z_tχdW_t0, 0≤t≤T , χT =0.

(20)

Definition 3.1A solution to the system of BSDE (20) is a quadruple of pair (K, ZK), (, Z), (Y, ZY), (χ , Zχ)ofF0-adapted processes, with values,

respec-tively, inSd _×_Sd_,_Sd _×_Sd_,_Rd _×_Rd_,_R_×_R_{, respectively, such that}T

0 |ZtK|2+ |Zt|2+ |ZtY|2+ |Z

χ

t |2dt <∞a.s., the matrix process(K, )with values inSm

is positive definite a.s., and the following relation

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

Kt =P+ T

t s(Ks, ZsK)ds− T

t ZsKdWs0,

t =P+ ¯P+_tTs(Ks, s, Zs)+UsKs, s, Zs

_s−1(Ks, s)Us

Ks, s, Zs

ds

−T

t Z s dWs0, Yt =L+

T ts

Ks, s, Zs, Ys, ZYs

−Us

Ks, s, Zs

s−1(Ks, s)Rs

Ks,s,Ys,ZsY

ds

−T

t ZsYdWs0, χt =_tTs(Ks, s, Ys, ZYs)−14Rs Ks, s, Ys, ZYs

_s−1(Ks, s)RsKs, s, Ys, ZsY

ds

−T

t Z χ sdWs0,

is satisfied for all t∈ [0, T].

The following verification result makes the connection between the system (20) and the LQCMKV1 control problem.

Proposition 3.1Assume thatK, ZK,, Z,Y, ZY, (χ , Zχ)is a solution to BSDE (20) such thatK, , −1(K, )are essentially bounded,Zlies inL2(×

[0, T]), i.e.,E₀T |Z_t|2dt<∞, Y lies inS2(× [0, T]), i.e.E|sup₀_≤t_≤T|Yt|2

<∞, andχ lies inS1(× [0, T]), i.e.E|sup₀_≤t≤T |χt|<∞Then, the control

process

αt∗ = ˆat

EXt∗|W0

, Kt, t, Yt

(21)

= −−_t 1(Kt, t)

Ut

Kt, t, Zt

EX_t∗|W0+1

2Rt

Kt, t, Yt, ZtY

,0≤t≤T ,

whereX∗=Xα∗is the state process with the feedback controlaˆt(., Kt, t, Yt), is an

optimal control for the LQCMKV1 problem, i.e.,V0=J (α∗), and we have

V0 = Var(L(ξ0), K0)+v2(L(ξ0), 0)+v1(L(ξ0), Y0)+χ0.

ProofConsider(K, ZK), (, Z), (Y, ZY), (χ , Zχ)a solution to the BSDE (20),

(14)

S1₍_{× [}₀_{, T}_]₎_{. Moreover, we have the terminal condition}_w

T(μ) = ˆg. Next, by construction, the processDtα=Dt(ρtα, αt, Kt, t, Yt),0≤ t ≤ T, is nonnegative, which means thatS_tα =wt(ρtα)+

t

0fˆs(ρsα, αs)ds,0 ≤ t ≤ T, is a(P,F0)-local submartingale. Moreover, by choosing the controlα∗ in the form (21), we notice

thatX∗, the solution to a linear stochastic McKean-Vlasov dynamics, satisfies the

square integrability condition:E[sup₀_≤t≤T |Xt∗|2] <∞, thus E[

T

0 |α∗t|2dt]<∞,

since U (K, , Z) inherits from Z the square integrability condition L2( ×

[0, T]), −1(K, ) is essentially bounded, and soα∗ ∈ A. Finally, from (19) we

see thatDα∗ =0, which gives the(P,F0)-local martingale property ofSα∗, and we

conclude by the dynamic programming verification Lemma 2.1.

Let us now show, under assumptions(H1)and(H2), the existence of a solution

to the BSDE (20) satisfying the integrability conditions of Proposition 3.1. We point out that this system is decoupled:

(i) One first considers the BSDE for(K, ZK)whose generator(k, z)∈Sd×Sd

→ t(k, z)∈ Sd is linear, with essentially bounded coefficients. Since the terminal conditionPis also essentially bounded, it is known by standard results for linear BSDEs that there exists a unique solutionK, ZKwith values in Sd_×_Sd_{, s.t.}_K_{is essentially bounded and}_ZK_{lies in}_L2₍_×[₀_{, T}_]₎_{. Moreover,}

since P andt(0,0) =Qt are nonnegative under(H1), we also obtain by standard comparison principle for BSDE thatKt is nonnegative, for all 0 ≤ t ≤T.

(ii) Given K, we next consider the BSDE for , Z with generator: (, z)∈

Sd_×_Sd_→

t(Kt, , z)−Ut(Kt, , z)t−1(Kt, )Ut(Kt, , z)∈Sd, and ter-minal conditionP+ ¯P. This is a backward stochastic Riccati equation (BSRE),

and it is well-known (see, e.g., (Bismut 1976)) that it is associated with a stochastic standard LQ control problem (without McKean-Vlasov dependence) with controlled linear dynamics:

dX˜t =

(Bt + ¯Bt)X˜t+Ctαt

dt+D_t0+ ¯D0_tX˜t+Ft0αt

dW_t0,

and quadratic cost functional

˜

JK(α) =E T

0

˜

XtQKt X˜t+αtNtKαt+2X˜tMtKαt

dt + ˜X_T(P+ ¯P )X˜T

,

whereQKt =Qt+ ¯Qt+(Dt+ ¯Dt)Kt(Dt+ ¯Dt),NtK=Nt+FtKtFt,MtK =(Dt + ¯Dt)KtFt. Under the condition thatNK is positive definite, we can rewrite this cost functional after square completion as

˜

JK(α) =E T

0

˜

XtQ˜Kt X˜t+ ˜αtNtKα˜t

dt + ˜X_T(P+ ¯P )X˜T

,

withQ˜Kt =QKt −MtK

NtK

−1 MtK

,α˜t =αt +

NtK

−1 MtK

_˜ Xt. By noting thatQ˜Kt ≥Qt+ ¯Qt, it follows that the symmetric matricesQ˜KandP+

¯

(15)

is uniformly positive definite, we obtain from (Tang 2003) the existence and uniqueness of a solution, Zto this BSRE, withbeing nonnegative and

essentially bounded, andZsquare integrable inL2(×[0, T]). This implies,

in particular, that−1(K, )is well-defined and essentially bounded. Since

K is nonnegative under(H1), notice that the uniform positivity condition on NK is satisfied under(H2): this is clear whenNis uniformly positive definite

(as usually assumed in LQ problem), and holds also true whenFis uniformly nondegenerate, andKis uniformly positive definite, which occurs whenPorQ

is uniformly positive definite from comparison principle for the linear BSDE forK.

(iii) Given K, , Z, we consider the BSDE for Y, ZY with generator:

(y, z) ∈ Rd ×Rd → Gt(y, z) :=t(Kt, t, Zt , y, z) −Ut

Kt, t, Zt

−t 1(Kt, t)Rt(Kt, t, y, z)with values inRd, and terminal conditionL. This is a linear BSDE and{Gt(0,0),0 ≤ t ≤ T}lies inL2(× [0, T])(recall that b0, γ, and γ0 are assumed square integrable). By standard results for

BSDEs, we then know that there exists a unique solutionY, ZYs.t.Ylies in

S2₍_{× [}₀_{, T}_]₎_{, and}_Z_{lies in}_L2₍_{× [}₀_{, T}_]₎_.

(iv) Finally, givenK, , Y, ZY, we solve the backward stochastic equation forχ,

which is explicitly written as

χt =E T

t s

Ks, s, Ys, ZYs

−1₄R_s

Ks, s, Ys, ZYs

−_s1(Ks, s)Rs

Ks, s, Ys, ZsY

dsF_t0

,0≤t≤T ,

andχsatisfies theS1(× [0, T])integrability condition.

To sum up, we have proved the following result:

Theorem 3.1Under assumptions(H1)and(H2), there exists a unique solution

(K, ZK), (, Z), (Y, ZY), (χ , Zχ) to the BSDE (20) satisfying the integrability

condition of Proposition 3.1, and consequently we have an optimal control for the LQCMKV1 problem given by (21).

Control setA=L(Rd;Rm)

From the linear form ofbt, σt, σt0 in (2), and the quadratic form of fˆt in (7), the random field process in (17) is given, after some calculations by

Dt(μ, a, k, , y) = Var

μ, t

k, ZKt

+ ˙Kt

+v2μ, tk, , Zt

+ ˙t

+v1

μ, t

k, , Zt, y, ZYt

+ ˙Yt

+t

k, , y, ZYt

+ ˙χt +Var(a μ, t(k, k))+a μt(k, )a μ

+2

(x− ¯μ)Vt

k, Z_tKa(x)μ(dx)

+ 2Ut

k, , Z_t μ¯+Rt

(16)

for allt ∈ [0, T], μ∈P2

Rd_{, k,}_∈_Sd_,_y_∈_Rd_,_a_∈_L(_Rd_;_Rm₎_{, where}_{a μ}_∈ P2(Rm)denotes the image byaofμ,

a μ =

a(x)μ(dx), Var(a μ, k) =

(a(x)−a μ)k (a(x)−a μ) μ(dx),

and we keep the same notations as in (18) with the additional term:

Vt

k, ZK_t = DtkFt+

D_t0kF_t0+kCt+ZtKFt0. (22)

Then, after square completion under the condition thatt(k, )is positive definite inSm_{, we have}

Dt(μ, a, k, , y) = Var

μ, t

k, ZK_t −Vt

k, Z_tK−_t1(k, k)Vt

k, ZK_t + ˙Kt

+v2

μ, t

k, , Z_t−Ut

k, , Z_t_t−1(k, )U_tk, , Z_t + ˙t

+v1

μ,t

k,,Zt , y,ZtY

−Ut

k, , Zt

t−1(k, )Rt

k, , y, ZtY

+ ˙Yt

+t

k, , y, ZY_t−1 4R

t

k, , y, Z_tY−_t1(k, )Rt

k, , y, Z_tY+ ˙χt

+Var(a− ˆat)(.,μ, k, , y) μ, ¯ t(k, k)

+(a− ˆat)(.,μ, k, , y) μ¯

t(k, )(a− ˆat)(.,μ, k, , y) μ¯

whereaˆt(.,μ, k, , y)¯ :Rd→Rmis defined by

ˆ

at(x,μ, k, , y)¯ = −t−1(k, k)Vt

k, Z_tK(x− ¯μ)

−t−1(k, )

Ut

k, , Zt

¯

μ+1

2Rt

k, , y, ZYt

, x∈Rd.

We then consider the system of BSDEs:

⎧ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ ⎩

dKt = −

t

Kt, ZtK

−Vt

Kt, ZtK

t−1(Kt, Kt)Vt

Kt, ZKt

dt

+ZK

t dWt0, 0≤t≤T , KT =P dt = −

t

Kt, t, Zt

−Ut

Kt, t, Zt

t−1(Kt, t)Ut

Kt, t, Zt

dt

+Z

t dWt0, 0≤t≤T , T =P+ ¯P dYt = −

tKt, t, Zt, Yt, ZtY

−UtKt, t, Zt

−t1(Kt, t)RtKt, t, Yt, ZtY

dt

+ZtYdWt0, 0≤t≤T , YT =L, dχt = −

tKt, t, Yt, ZtY

−1 4Rt

Kt, t, Yt, ZtY

_t−1(Kt, t)RtKt, t, Yt, ZYt

dt

+ZtχdWt0, 0≤t≤T , χT =0,

(23) and by the same arguments as in Proposition 3.1, we have the following verification result making the connection between the system (23) and the LQCMKV2 control problem.

Proposition 3.2Assume thatK, ZK,, Z,Y, ZY, (χ , Zχ)is a solution

to the BSDE (23) such that K, , −1(K, ) are essentially bounded, Y lies in

S2₍_{× [}₀_{, T}_]₎_{, and}_χ _{lies in} _S1₍_{× [}₀_{, T}_]₎_{. Then, the control process}_α∗ _with

(17)

α∗t(x)= ˆat

x,E

X∗t|W0

, Kt, t, Yt

= −_t−1(Kt, Kt)Vt

Kt, ZtK

x−EX∗_t|W0

−−_t1(Kt, t)

U_tKt, t, Zt

EX_t∗|W0+1 2Rt

Kt,t,Yt,ZtY

,x∈Rd,0≤t≤T ,

whereX∗=Xα∗is the state process with the feedback controlaˆt(., ., Kt, t, Yt), is

an optimal control for the LQCMKV2 problem, i.e.,V0=J (α∗), and we have V0 = Var(L(ξ0), K0)+v2(L(ξ0), 0)+v1(L(ξ0), Y0)+χ0.

Let us now discuss the existence of a solution to the BSDE (23) satisfying the integrability conditions of Proposition 3.2. As for (20), this system is decoupled. The difference w.r.t to the LQCMKV1 problem is in the BSDE forK, ZK, where the

generator(k, z)∈ Sd ×Sd → t(k, z)−Vt(k, z)t−1(k, k)Vt(k, z) ∈ Sd is now of the Riccati type. In general, it is not in the class of BSREs related to LQ control problem, but existence can be obtained in some particular cases:

(1) The coefficientsB,C,D,F,D0, F0,Q,P,Nare deterministic. In this case, the

BSRE forKis reduced to a matrix Riccati ordinary differential equation:

−dKt

dt = t(Kt,0)−Vt(Kt,0) −1

t (Kt, Kt)Vt(Kt,0), 0≤t≤T , KT = P .

This problem is associated to the LQ problem with controlled linear dynamics

dX˜t = (BtX˜t+Ctα˜t)dt+(DtX˜t+Ftα˜t)dWt +

D0_tX˜t+Ft0α˜t

dW_t0,

where the control processα˜ is anF-adapted process with values inRm, and the cost functional to be minimized overα˜ is

˜

J (α)˜ = E T

0

˜

XtQtX˜t + ˜αtNtα˜t

dt+ ˜X_TPX˜T

.

It was solved in (Wonham 1968) under assumption(H1)and the condition

(H2)(i) thatN is uniformly positive definite, and this gives the existence and

uniqueness ofK∈C1[0, T];Sd, which is nonnegative.

(2) D ≡F ≡ 0. In this case, the BSRDE forK, ZK is associated to the LQ

problem with controlled linear dynamics

dX˜t =(BtX˜t+Ctα˜t)dt+

D_t0X˜t +Ft0α˜t

dW_t0,

where the control processα˜ is anF0-adapted process with values inRm, and the cost functional to be minimized overα˜ is

˜

J (α)˜ = E T

0

˜

XtQtX˜t + ˜αtNtα˜t

dt+ ˜X_TPX˜T

.

It is then known from (Tang 2003) that under assumptions(H1)and(H2)(i),

there exists a unique pair(K, ZK)solution to the BSRDE, withKnonnegative,

(18)

(3) N≡0,Pis uniformly positive,m=d, andFis invertible withF−1bounded.

In this case, the BSDE forKis reduced to the linear BSDE:

dKt = −

t

Kt, ZKt

−CtFt−1Dt

Kt+Kt

CtFt−1Dt

−DtKtDt−KtCtFtKtFt− 1

CtKt

dt+Z_tKdW_t0, 0≤t≤T , KT =P ,

for which it is known that there exists a unique solution K, ZK, with K

positive, and essentially bounded.

It is an open question whether existence of a solution forKto the BSRE (23) holds in the general case. Anyway, once a solutionKexists, and is given, the BSDEs for the pairs, Z,Y, ZY, (χ , Zχ)are the same as in (20), and then their existence

and uniqueness are obtained under the same conditions.

Applications

Trading with price impact and benchmark tracking

We consider an agent trading in a financial market with an inventory Xt, i.e., a number of shares held at timetin a risky stock, governed by

dXt = αtdt,

where the controlα, a real-valuedF0-progressively measurable process inL2(×

[0, T]), represents the trading rate. Given a real-valuedF0-adapted stock price

pro-cess(St)0≤t≤T inL2(× [0, T]), a real-valuedF0-adapted target process(It)0≤t≤T inL2(×[0, T]), and a terminal benchmarkHas a square integrableF_T0-measurable

random variable, the objective of the agent is to minimize over control processesαa

cost functional of the form:

J (α) = E T

0

αt(St+ηαt)+q(Xt −It)2

dt+λ(XT −H )2

, (24)

whereη >0,q≥0, andλ≥0 are constants.

(19)

completion as

J (α) = E T

0

ηα˜2_t +q(Xt−It)2

dt+λ(XT −H )2

− E

T

0 St2 4ηdt

,

withα˜t=αt+S₂_ηt, we see that this problem fits into the LQCMKV1 framework (with b0_t = −St

2η, without McKean-Vlasov dependence but with random coefficients), and Assumptions(H1),(H2)are satisfied. From Theorem 3.1, the optimal control is then

given by

αt∗ = − 1

η

tX∗t + Yt

2

− St

2η, 0≤t ≤T , (25)

whereis solution to the (ordinary differential) Riccati equation

dt = −

q−

2 t η

dt, 0≤t≤T , T = λ, (26)

andYis solution to the linear BSDE

dYt =

2qIt+ t

η St+ t

η Yt

dt+ZtYdWt0, 0≤t ≤T , YT = −2λH. (27)

The solution to the Riccati equation is

t η =

! q/η

√

q/ηsinh(√q/η(T −t))+λ/ηcosh(√q/η(T −t))

λ/ηsinh(√q/η(T −t))+√q/ηcosh(√q/η(T −t)), 0≤t ≤T ,

while the solution to the linear BSDE is given by

Yt = −2E

e−

T

t sηds_λH+

T

t e−

s t uη du

qIs+

s η Ss

dsF_t0

, 0≤t≤T .

By integrating the function/η, we have

e− s

t uηdu = √t/η q/η

√

q/ηcosh(√q/η(T −s))+λ/ηsinh(√q/η(T −s))

√

q/ηsinh(√q/η(T −t))+λ/ηcosh(√q/η(T −t))

= t

s

√

q/ηsinh(√q/η(T−s))+λ/ηcosh(√q/η(T−s))

√

q/ηsinh(√q/η(T−t))+λ/ηcosh(√q/η(T−t)), t≤s≤T , and plugging into the expectation form of Y, the optimal control in (25) is then expressed as

αt∗ = − t

η

X∗t − ˆItH

+₂1

η

E T

t t

η

ω(t, T ) ω(s, T )Ssds|F

0 t

−St

=: αt∗,I H+α∗ ,S

t , 0≤t ≤T , (28)

where

ˆ

ItH =E

ω(t, T )H+(1−ω(t, T )) T

t

IsK(t, s)dsFt0

with a weight valued in[0,1]

ω(t, T ) = √ λ/η

(20)

and a kernel

K(t, s) =!q/η

√

q/ηcosh(√q/η(T−t))+λ/ηsinh(√q/η(T−t))

√

q/ηsinh(√q/η(T−t))+λ/η(cosh(√q/η(T−t))−1), 0≤t≤s≤T .

The optimal trading rule in (28) is decomposed in two parts:

(i) The first termα∗,I H prescribes the agent to trade optimally towards a

weigh-ted averageIˆtH, rather than the current target positionI. Indeed,IˆHis a convex combination of the expected future of the terminal random targetH, and of a weighted average of the running targetI (notice thatK(t, .)is a nonnegative

kernel integrating to one over_[t, T]). The rate towards this target is at a speed

proportional to its distance w.r.t the current investor’s position, and the coeffi-cient of proportionality is determined by the costs parametersη,q,λand the

time to maturityT −t. We retrieve the interpretation and results obtained in

(Bank et al. 2015) in the limiting cases whereλ=0 (no constraint on the

termi-nal position), andλ= ∞(constraint on the terminal positionXT =H). In the case whereq=0, we havet/η=λ/(η+λ(T −t)),ItˆH=E[H|Ft0], and we retrieve, in particular, the expressionα∗,I H = −X_t∗/(T−t), of optimal trading

rate whenH=0, andλ→ ∞corresponding to the optimal execution problem

with terminal liquidationXT =0.

(ii) The second termα∗,S related to the stock price, is an incentive to buy or sell

depending on whether the weighted average of expected future value of the stock is larger or smaller than its current value. In particular, when the price process is a martingale, then

αt∗,S = − St 2η

√

q/η

√

q/ηcosh(√q/η(T −t))+λ/ηsinh(√q/η(T −t))

which is nonpositive for nonnegative price St, hence meaning that due to the

price impact, one must sell. Moreover, in the limiting case where λ → ∞,

i.e., the terminal inventory XT is constrained to achieve the target H, then α∗,S is zero: we retrieve the result that the optimal trading rate does not

depend on the price process when it is a martingale, see (Alfonsi et al. 2010; Predoiu et al. 2011.

On the other hand, by applying Itˆo’s formula to (25), and using (26)-(27), we have

d

α∗_t + St

2η

= q

η(X

∗

t −It)ds− 1 2ηZ

Y t dWs0,

which implies the notable property:

α_t∗+ St

2η− q η

t

0 (X

∗