CONTINUOUS DEPENDENCE OF THE VALUE FUNCTION ON A CLASS OF ONE-DIMENSIONAL AUTONOMOUS OPTIMAL CONTROL PROBLEMS AND EXISTENCE OF SOLUTIONS

M’hamed Kesri

Faculty of Mathematics, USTHB University, B.P.: 32 El Alia, 16111

Bab-Ezzouar Algiers, Algeria

e-mail: [email protected]

(Received 30 January 2017; after final revision 29 April 2018;

accepted 11 May 2018)

In this paper we prove the existence of solutions for a class of one-dimensional autonomous optimal control problems. The second result concerns convergent subsequences of solutions to these problems: we show that the value function depends continuously on the problem.

Key words : Optimal control problems; infinite horizon; value function; global attractor.

1. INTRODUCTION

Many problems in engineering, ecology and resource exploitation can be formulated as infinite horizon optimal control problems (P) of the form

$$\text{Maximize } \int_{t_0}^{\infty} e^{-\delta t} F(t, x(t), u(t))\,dt$$

over measurable functions u(.) and absolutely continuous functions x(.) satisfying

$$\dot{x}(t) = f(t, x(t), u(t)) \quad \text{for a.e. } t \ge t_0, \tag{1.1}$$

$$u(t) \in U(t, x(t)) \quad \text{for a.e. } t \ge t_0,$$

$$x(t) \in \Omega(t) \quad \text{for } t \ge t_0,$$

where $t_0 \in \mathbb{R}$, $x = (x_1, \dots, x_n)$ and $u = (u_1, \dots, u_m)$ are, respectively, the state and the control, $f : [t_0,+\infty[ \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ and $F : [t_0,+\infty[ \times \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}$ are functions and, for $t \ge t_0$, the sets $U = U(t, x(t)) \subseteq \mathbb{R}^m$ and $\Omega(t) \subseteq \mathbb{R}^n$ and a point $x_0 \in \mathbb{R}^n$ are given. δ is a given positive discount rate. A measurable function $u(.) : [t_0,+\infty[ \to \mathbb{R}^m$ satisfying $u(t) \in U(t, x(t))$ a.e. is called an admissible control (for (P)). An absolutely continuous function $x(.) : [t_0,+\infty[ \to \mathbb{R}^n$ satisfying (1.1), (1.2) and $x(t) \in \Omega(t)$ for $t \ge t_0$ is called an admissible state trajectory (corresponding to u(.)). Suitable conditions on the data ensure that, for a given control function u(.) and initial data $(\tau, \xi) \in [t_0,+\infty[ \times \mathbb{R}^n$, there is a unique absolutely continuous function $x(.) : [\tau,+\infty[ \to \mathbb{R}^n$ satisfying $\dot{x}(t) = f(t, x(t), u(t))$ a.e. on $[\tau,+\infty[$, $x(\tau) = \xi$, and that the previous integral makes sense. In autonomous problems it is assumed that F(x, u) is measurable and locally bounded in x uniformly in u; thus, for any admissible pair (x, u) and δ > 0, the integral $\int_0^{\infty} e^{-\delta t} F(x(t), u(t))\,dt$ makes sense.

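The problem (P) is embedded in a family of optimal control problems $\{(A_{\tau,\xi}) : (\tau, \xi) \in [t_0,+\infty[ \times \mathbb{R}^n\}$, parameterized by the initial time and state $(\tau, \xi) \in [t_0,+\infty[ \times \mathbb{R}^n$:

$$(A_{\tau,\xi}) \quad \text{Maximize } \left\{ \int_{\tau}^{\infty} e^{-\delta t} F(t, x(t), u(t))\,dt \;:\; \dot{x} = f(t, x, u),\ u(t) \in U(t, x(t)) \text{ a.e. on } [\tau,+\infty[,\ x(t) \in \Omega(t) \text{ for } t \ge \tau \text{ and } x(\tau) = \xi \right\}.$$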

In this case $(P) = (A_{t_0,x_0})$. With this family of problems, the value function is

$$V(\tau, \xi) = \sup\{(A_{\tau,\xi})\},$$

which describes the variation of the optimal rent functional for $(A_{\tau,\xi})$ when $(\tau, \xi)$ ranges over $[t_0,+\infty[ \times \mathbb{R}^n$.

The dynamic programming approach leads to the Hamilton-Jacobi-Bellman (HJB) equation characterization of the value function. Because of the importance of this equation in optimal feedback control, it has been discussed in the literature for many years. Many of these discussions were devoted to showing the existence of a unique non-smooth solution called a 'viscosity solution', see [7, 8, 11]. However, the (HJB) equation is, in general, not analytically solvable, and the literature contains only a few numerical approximations and resolution algorithms for the equation, such as the one reported in [12]. In the autonomous infinite horizon optimal control problem (P), the (HJB) equation can be considered as a first-order Hamilton-Jacobi equation,

$$H(x, \varphi(x), \nabla\varphi(x)) = 0 \quad \text{in } \Omega.$$

For some classes of Hamilton-Jacobi equations, Trélat proves that, under suitable assumptions, the viscosity solution is sub-analytic, see [15]. (HJB) operators have been the object of intensive study during the last years; for a general review of their theory and applications see [4-6, 10, 13, 14].

The value function may be characterized as the unique (generalized) solution of the Hamilton-Jacobi (HJ) equation. This important result in nonlinear deterministic control can take a number of guises depending on the class of optimal control problems considered and the notion of "generalized" solution adopted. As is well known, (HJ) equations may fail to have $C^1$ solutions φ. It then becomes necessary to give meaning to functions φ drawn from a larger function class, which are solutions to (HJ) equations in some extended sense. Many concepts of extended solutions to (HJ) equations, each providing a characterization of V, have been proposed, such as lower Dini solutions and others. For the most part, the various extended solution concepts (e.g. the various forms of viscosity solutions [7, 11]) coincide when the functions φ involved are locally Lipschitz continuous.

Thanks to [9], the solution x(.) (optimal state trajectory) of the one state variable autonomous infinite horizon optimal control problem (P) must always be monotonic if the Hamiltonian is strictly concave in the control u. This property of the state trajectory is used in the rest of the paper, and it has the following corollary.

Lemma 1.1 — Suppose the problem (P),

$$\begin{cases} \max \displaystyle\int_0^{\infty} e^{-\delta t} F(x(t), u(t))\,dt \\ \dot{x}(t) = u(t),\ t \ge 0, \quad x(0) = a \\ u : [0,+\infty) \to \mathbb{R} \text{ measurable} \end{cases}$$

admits a solution x(.). If F is strictly concave in u, then x(.) is monotone.

PROOF: If F is strictly concave in u, then the Hamiltonian is also strictly concave in u; from [9, p. 164] the solution x(.) is monotone.

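In this paper, we prove the existence of solutions for the class of optimal control problems we consider, without the compactness condition on the set of controls. The second result in the paper is about convergent subsequences of solutions to problems: we show that the value function depends continuously on the problem.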

Specification of a problem set

From now on we shall consider only problems (P) of the following type:

$$\begin{cases} \max \displaystyle\int_0^{\infty} e^{-\delta t} F(x(t), u(t))\,dt \\ \dot{x}(t) = u(t),\ t \ge 0, \quad x(0) = a \\ u : [0,+\infty) \to \mathbb{R} \text{ measurable},\ u \in U. \end{cases}$$

The problem may be interpreted as the optimal exploitation of a stock, where x(t) is a real-valued capital stock at time t, $\dot{x}(t)$ is the rate of investment, U is the set of admissible controls, F(x(t), u(t)) is the resulting net income rate, and δ is a fixed discount rate; the discounted integral of the net income, when maximized, is the present value of the initial capital stock a.

We specify such a problem by a pair (F, δ) = P and an initial state a. If an initial state is fixed, the problem is denoted by P(a). The family of problems P(a) is denoted by P = (F, δ).

To further simplify the analysis, we consider only those F(., .) for which investment $\dot{x}(t) > 0$ for large x(t), as well as disinvestment $\dot{x}(t) < 0$ for large −x(t), is not profitable, and therefore the optimally steered stock x(t) decreases for large initial stock x(0) and increases for large −x(0). We give the following definition.

Definition 1.1 — An interval [a, b] (a < b) is called a (global) attractor for F, if

1/ $F(x, 0) \ge F(y, u)$ whenever $b \le x \le y$, $0 \le u$,

2/ $F(x, 0) \ge F(y, u)$ whenever $y \le x \le a$, $0 \ge u$.

Remark 1.1 : That F admits a global attractor means that it is not profitable to invest (take a strategy u ≥ 0) for a large stock x(t) (x(t) > b), because the resulting net income F decreases, passing from F(x, 0) to F(y, u) once b ≤ x ≤ y (see 1/ of Definition 1.1). Therefore, whenever we have at time t a large stock x(t) (x(t) > b), we must disinvest (take a strategy u ≤ 0); this implies that the optimally steered stock x(t) decreases for a large initial stock x(0).

Likewise, it is not profitable to disinvest (take a strategy u ≤ 0) for large −x(t) (x(t) < a), because the resulting net income F also decreases, passing from F(x, 0) to F(y, u) once y ≤ x ≤ a (see 2/ of Definition 1.1). Therefore, whenever we have at time t a large −x(t) (x(t) < a), we must invest (take a strategy u ≥ 0); this implies that the optimally steered stock x(t) increases for large −x(0).

In conclusion, to the right of the global attractor [a, b] (a < b) it is profitable to disinvest, and to its left it is profitable to invest.
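The two inequalities of Definition 1.1 can be tested numerically on a finite grid. The sketch below is illustrative only: the income function F(x, u) = −x² − u², the candidate interval [−1, 1], the grids and the helper name `is_attractor_on_grid` are assumptions that do not come from the paper, and a grid test can refute the attractor property but cannot prove it.

```python
import itertools

def F(x, u):
    # Hypothetical net income rate (an assumption, not from the paper):
    # continuous and strictly concave in u.
    return -x**2 - u**2

def is_attractor_on_grid(F, a, b, xs, us, tol=1e-12):
    """Check conditions 1/ and 2/ of Definition 1.1 on grid points only."""
    for x, y, u in itertools.product(xs, xs, us):
        # 1/ F(x,0) >= F(y,u) whenever b <= x <= y and 0 <= u
        if b <= x <= y and u >= 0 and F(x, 0.0) < F(y, u) - tol:
            return False
        # 2/ F(x,0) >= F(y,u) whenever y <= x <= a and 0 >= u
        if y <= x <= a and u <= 0 and F(x, 0.0) < F(y, u) - tol:
            return False
    return True

xs = [i * 0.25 for i in range(-20, 21)]   # states sampled in [-5, 5]
us = [i * 0.5 for i in range(-6, 7)]      # controls sampled in [-3, 3]
print(is_attractor_on_grid(F, -1.0, 1.0, xs, us))
```

An income that keeps growing with the stock, such as the hypothetical F(x, u) = x, fails condition 1/ on the same grid, which matches the interpretation that investing at a large stock must not pay.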

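Definition 1.2 — A set W of controls is called equi-integrable, if

1/ $\forall \varepsilon > 0\ \exists \delta > 0$ such that $\int_A |w| < \varepsilon$ $\forall A \subset \mathbb{R}$ measurable with $|A| < \delta$, $\forall w \in W$,

2/ $\forall \varepsilon > 0\ \exists B \subset \mathbb{R}$ measurable with $|B| < \infty$ such that $\int_{\mathbb{R} \setminus B} |w| < \varepsilon$ $\forall w \in W$.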

Let $\mathcal{P}$ be the set of problems (F, δ) such that δ is a positive number, the set U of admissible controls is equi-integrable, and $F : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is a continuous function which is strictly concave in u and admits a global attractor.

Definition 1.3 — The function

$$V(a) = \sup \left\{ \int_0^{\infty} e^{-\delta t} F(x(t), u(t))\,dt \;:\; u : [0,+\infty) \to \mathbb{R} \text{ measurable},\ \dot{x} = u,\ u \in U \text{ and } x(0) = a \right\}$$

is called the value of a (for the problem P), and $V = V_P$ is called the value function.

If (x(.), u(.)) is optimal, then

$$V(x(0)) = \int_0^{\infty} e^{-\delta t} F(x(t), u(t))\,dt.$$
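As a sanity check on the discounted integral, the payoff of the constant trajectory x(t) = a with control u = 0 has the closed form F(a, 0)/δ. The sketch below compares this closed form with a trapezoidal approximation on a truncated horizon; the function F, the values of δ and a, the horizon T and the step count are illustrative assumptions, not data from the paper.

```python
import math

def discounted_payoff(F, x, u, delta, T=60.0, n=60000):
    """Trapezoidal approximation of the integral of e^{-delta t} F(x(t), u(t)) over [0, T].

    x and u are callables t -> float; T truncates the infinite horizon,
    which is reasonable when e^{-delta T} is negligible.
    """
    h = T / n
    total = 0.0
    for k in range(n + 1):
        t = k * h
        w = 0.5 if k in (0, n) else 1.0   # trapezoid end-point weights
        total += w * math.exp(-delta * t) * F(x(t), u(t))
    return total * h

# Hypothetical data (not from the paper).
F = lambda x, u: -x**2 - u**2
delta, a = 0.5, 2.0

approx = discounted_payoff(F, lambda t: a, lambda t: 0.0, delta)
exact = F(a, 0.0) / delta     # closed form for the constant trajectory
print(approx, exact)
```

Any admissible pair (x, u) can be passed in the same way, so the routine gives a lower bound candidate for V(a) whenever the pair is admissible.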

2. SPECIFICATION OF A TOPOLOGY ON P AND PRELIMINARY RESULTS

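For open intervals I, J in $\mathbb{R}$ put

$$|F|_{I,J} = \sup_{x \in I,\ u \in J} |F(x, u)|.$$

For $(F, \delta) \in \mathcal{P}$ and open intervals I, J such that I contains an attractor for F, put

$$V(F, \delta, I, J, \varepsilon) = \left\{ (F', \delta') \in \mathcal{P} \;:\; |\delta - \delta'| < \varepsilon,\ I \text{ contains an attractor for } F' \text{ and } |F - F'|_{I,J} < \varepsilon \right\}.$$

Obviously the $V(F, \delta, I, J, \varepsilon)$ form a filter base of subsets of $\mathcal{P}$ containing (F, δ).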


τ is a Hausdorff topology admitting a countable base. It is this topology which will be considered

for the rest of the article.
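The seminorm $|F|_{I,J}$ and the two distance conditions used in the basic neighbourhoods of (F, δ) defined above can be approximated on a grid. In the sketch below the sample functions, intervals, tolerance and grid size are assumptions chosen for illustration, and the attractor condition that also enters the definition of a neighbourhood is deliberately not checked.

```python
def sup_norm(F, I, J, n=200):
    """Grid approximation of |F|_{I,J} = sup over x in I, u in J of |F(x, u)|.

    I and J are (left, right) end points of bounded open intervals;
    only interior (midpoint) grid points are sampled.
    """
    (xl, xr), (ul, ur) = I, J
    xs = [xl + (xr - xl) * (i + 0.5) / n for i in range(n)]
    us = [ul + (ur - ul) * (j + 0.5) / n for j in range(n)]
    return max(abs(F(x, u)) for x in xs for u in us)

def close_in_seminorm(F, delta, F2, delta2, I, J, eps):
    """Check |delta - delta2| < eps and |F - F2|_{I,J} < eps (grid version)."""
    diff = lambda x, u: F(x, u) - F2(x, u)
    return abs(delta - delta2) < eps and sup_norm(diff, I, J) < eps

# Hypothetical problems (not from the paper).
F = lambda x, u: -x**2 - u**2
F2 = lambda x, u: -x**2 - u**2 + 0.01 * x
I, J = (-2.0, 2.0), (-3.0, 3.0)
print(close_in_seminorm(F, 0.5, F2, 0.51, I, J, eps=0.05))
```

Because the supremum is taken only over the bounded intervals I and J, two income functions can be close in this sense while differing arbitrarily far away from the attractor.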

In the following lemma we show the existence of a solution when both the control and the state trajectory are bounded.

Lemma 2.1 — For all bounded intervals $I_1$, $I_2$ of the real line $\mathbb{R}$ with $I_1$ not separated from zero, the following problem P(a) ($a \in \mathbb{R}$) admits a solution:

$$\begin{cases} \max \displaystyle\int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt \\ \dot{x}(t) = u(t),\ t \ge 0 \\ u(t) \in I_1,\ t \ge 0, \quad x(t) \in I_2,\ t \ge 0 \\ x(0) = a. \end{cases}$$

PROOF: Let x(.) be an absolutely continuous function with x(0) = a. Put

$$V(x(.)) = \int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt.$$

The value of a (for P) is

$$V(a) = \sup\{V(x(.)) : x(.) \text{ admissible},\ x(0) = a\}.$$

There is a sequence of absolutely continuous functions $x_n(.)$ ($n \in \mathbb{N}$) with $x_n(0) = a$ and

$$V(a) = \lim_n V(x_n(.)).$$

The sequence $x_n(.)$ is bounded in $L^{\infty}([0,+\infty))$. Moreover, the control u is assumed to be bounded. Therefore the sequence $x_n(.)$ is bounded in $W^{1,\infty}([0,+\infty))$ and hence, by the Banach-Alaoglu-Bourbaki theorem (see [2]), it converges in the weak-* topology to some x(.), up to a subsequence.

One deduces from the concavity of F that $x_n(.)$ converges uniformly to x(.), which is a solution of P(a), and $\lim_n V(x_n(.)) = V(x(.))$. Q.e.d.

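Let $P^+(a)$ denote the problem

$$\begin{cases} \max \displaystyle\int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt \\ \dot{x}(t) = u(t),\ t \ge 0, \quad u \in U \\ x : [0,+\infty[ \to \mathbb{R} \text{ absolutely continuous and either increasing or decreasing} \\ x(0) = a. \end{cases}$$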

Lemma 2.2 — P(a) and $P^+(a)$ are equivalent, i.e. they have the same solutions.

PROOF: 1) Let $x^+(.)$ be a solution of the problem $P^+(a)$; we show that $x^+(.)$ is also a solution of the problem P(a).

Let x(.) be any absolutely continuous function such that the function $t \mapsto e^{-\delta t} F(x(t), \dot{x}(t))$ is integrable, x(0) = a, $\dot{x} \in U$, and put

$$V(x(.)) = \int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt.$$

Using dominated convergence (Lebesgue's theorem), one notes that there is a sequence of absolutely continuous functions $x_n$ such that

$$-n \le x_n(t) \le n, \quad -n \le \dot{x}_n(t) \le n \quad (t \ge 0)$$

and

$$V(x(.)) = \lim_n \int_0^{\infty} e^{-\delta t} F(x_n(t), \dot{x}_n(t))\,dt.$$

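For each $n \in \mathbb{N}$, consider the problem $P_n(a)$:

$$\begin{cases} \max \displaystyle\int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt \\ x : [0,+\infty[ \to \mathbb{R} \text{ absolutely continuous} \\ -n \le x(t) \le n, \quad -n \le \dot{x}(t) \le n \quad (t \ge 0) \\ x(0) = a. \end{cases}$$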

From Lemma 2.1, each $P_n(a)$ admits a solution $y_n(.)$ and, due to Lemma 1.1, $y_n(.)$ is either increasing or decreasing. Let

$$V_n(a) := \int_0^{\infty} e^{-\delta t} F(y_n(t), \dot{y}_n(t))\,dt.$$

Hence $V_n$ is the value function for $P_n$. Let

$$V(a) = \sup\{V(x(.)) : x(.) \text{ admissible for } P^+(a)\}.$$

V is the value function for $P^+$. Obviously

$$V_n(a) \le V_{n+1}(a); \quad \lim_n V_n(a) = V(a) \quad (n \in \mathbb{N}).$$

Since $V_n$ is the value function for $P_n$, we have for each $n \in \mathbb{N}$,

$$V_n(a) \ge \int_0^{\infty} e^{-\delta t} F(x_n(t), \dot{x}_n(t))\,dt.$$

By passing to the limit, we deduce

$$V(a) \ge V(x(.)).$$

But by assumption $x^+(.)$ is a solution of the problem $P^+(a)$, so

$$V(a) = \int_0^{\infty} e^{-\delta t} F(x^+(t), \dot{x}^+(t))\,dt.$$

This proves that $x^+(.)$ is a solution of the problem P(a), since x(.) is an arbitrary admissible trajectory for P(a).

2) Conversely, let x(.) be a solution of the problem P(a). From Lemma 1.1, x(.) is either increasing or decreasing, so obviously x(.) is a solution of the problem $P^+(a)$.

Therefore, from 1) and 2), P(a) and $P^+(a)$ are equivalent.

3. MAIN RESULTS

Theorem 3.1 — Every P(a) ($P \in \mathcal{P}$, $a \in \mathbb{R}$) admits a solution.

PROOF: By Lemma 2.2, it suffices to prove that $P^+(a)$ admits a solution.

We now prove a priori estimates for $P^+(a)$ using the assumption that there is an attractor $S = [s_1, s_2]$ for F.

For $y \in \mathbb{R}$ put [...]

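If x is increasing, then $x(0) \le x(t)$ ($t \ge 0$). If, however, $x(0) \ge s_2$, then the constant trajectory x(0) satisfies

$$\frac{1}{\delta} F(x(0), 0) = \int_0^{\infty} e^{-\delta t} F(x(0), 0)\,dt \ge \int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt.$$

Hence we may assume without loss of generality that $x(t) \le \overline{x(0)}$ ($t \ge 0$), and by the same argument $\underline{x(0)} \le x(t)$ ($t \ge 0$).

That is, we do not alter the solution set of $P^+(a)$ if we add the constraint

$$\underline{a} = \underline{x(0)} \le x(t) \le \overline{x(0)} = \overline{a} \quad (t \ge 0). \tag{2.1}$$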

If x is increasing we have

$$\int_0^t |\dot{x}(s)|\,ds = x(t) - x(0) \le s_1 - x(0) \quad (t \ge 0).$$

If x is decreasing we have

$$\int_0^t |\dot{x}(s)|\,ds = -x(t) + x(0) \le x(0) - s_2 \quad (t \ge 0).$$

Hence

$$\int_0^t |\dot{x}(s)|\,ds \le \max(|s_1 - x(0)|, |x(0) - s_2|) \quad (t \ge 0). \tag{2.2}$$

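As for the value of x(.), we always assume that

$$\frac{1}{\delta} F(x(0), 0) \le \int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt =: V(x).$$

And therefore,

$$\begin{aligned} \int_0^{\infty} e^{-\delta t} |F(x(t), \dot{x}(t))|\,dt &\le \int_0^{\infty} e^{-\delta t} |F_{x(0)} - F(x(t), \dot{x}(t))|\,dt + \frac{1}{\delta} |F_{x(0)}| \\ &= \frac{1}{\delta} F_{x(0)} - V(x) + \frac{1}{\delta} |F_{x(0)}| \\ &\le \frac{2}{\delta} |F_{x(0)}| - \frac{1}{\delta} F(x(0), 0). \end{aligned}$$

Hence we have

$$\int_0^{\infty} e^{-\delta t} |F(x(t), \dot{x}(t))|\,dt \le \frac{2}{\delta} |F_{x(0)}| - \frac{1}{\delta} F(x(0), 0). \tag{2.3}$$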

We now prove the existence of solutions to $P^+(a)$ using the estimates (2.1), (2.2), (2.3) and the standard argument, see [1]. Let

$$V(a) = \sup \left\{ V(x(.)) \;:\; x \text{ absolutely continuous and either increasing or decreasing},\ \dot{x} = u,\ u \in U \text{ and } x(0) = a \right\}.$$

There is a sequence of absolutely continuous functions $x_n(.)$ ($n \in \mathbb{N}$) with $\dot{x}_n = u_n$, $u_n \in U$, $x_n(0) = a$ and

$$V(a) = \lim_n V(x_n(.))$$

and such that each $x_n$ is either increasing or decreasing. Passing to a subsequence, we may without loss of generality assume that all $x_n$ are, say, increasing. In any case we may assume that (2.1), (2.2), (2.3) hold for $x = x_n$ ($n \in \mathbb{N}$).

From (2.2) the sequence $\dot{x}_n$ is bounded in $L^1(\mathbb{R})$. Moreover, $\dot{x}_n = u_n$, $u_n \in U$, and the set U of admissible controls is assumed to be equi-integrable, so $\{\dot{x}_n,\ n \in \mathbb{N}\}$ is equi-integrable.

Therefore, by the Dunford-Pettis theorem (see [3]), there is a subsequence of $\dot{x}_n$ which converges weakly in $L^1(\mathbb{R})$ to some Lebesgue integrable function u.

Hence

$$\lim_n x_n(t) = a + \lim_n \int_0^t \dot{x}_n(s)\,ds = a + \lim_n \int_0^t u_n(s)\,ds = a + \int_0^t u(s)\,ds =: x(t).$$

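Due to (2.3) we may also assume that the functions

$$s \mapsto e^{-\delta s} F(x_n(s), \dot{x}_n(s)) =: v_n(s)$$

converge to some Lebesgue integrable function v(.). Put

$$y_n(t) = \int_0^t v_n(s)\,ds, \qquad y(t) = \int_0^t v(s)\,ds,$$

$$H\!\left(t, \begin{pmatrix} y \\ x \end{pmatrix}\right) = \left\{ \begin{pmatrix} v \\ u \end{pmatrix} \;:\; u \in \mathbb{R},\ v \in \mathbb{R},\ v \le e^{-\delta t} F(x, u) \right\}.$$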

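As F(x, .) is concave, $H\!\left(t, \begin{pmatrix} y \\ x \end{pmatrix}\right)$ is closed and convex.

For every $n \in \mathbb{N}$:

$$\begin{pmatrix} \dot{y}_n(t) \\ \dot{x}_n(t) \end{pmatrix} \in H\!\left(t, \begin{pmatrix} y_n(t) \\ x_n(t) \end{pmatrix}\right) \quad \text{for almost all } t \ge 0.$$

The weak convergence of $\begin{pmatrix} \dot{y}_n(t) \\ \dot{x}_n(t) \end{pmatrix}$ implies that

$$\begin{pmatrix} \dot{y}(t) \\ \dot{x}(t) \end{pmatrix} \in H\!\left(t, \begin{pmatrix} y(t) \\ x(t) \end{pmatrix}\right) \quad \text{for almost all } t \ge 0.$$

Hence for all $t \ge 0$:

$$\dot{y}(t) \le e^{-\delta t} F(x(t), \dot{x}(t))$$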

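and

$$V(a) = \lim_n \int_0^{\infty} v_n(t)\,dt = \int_0^{\infty} v(t)\,dt \le \int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt.$$

Therefore,

$$V(a) = \int_0^{\infty} e^{-\delta t} F(x(t), \dot{x}(t))\,dt$$

and x(.) is optimal for $P^+(a)$ and hence for P(a). Q.e.d.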

The second result is about convergent subsequences of solutions $x_n$ to problems $P_n \in \mathcal{P}$. It shows that the value function V depends continuously on the problem P.

Theorem 3.2 — Let $P_n$ be a sequence in $\mathcal{P}$ and $P \in \mathcal{P}$ such that $\lim_n P_n = P$, and let $x_n$ be optimal for $P_n$ with $x_n(0) = a$ for all $n \in \mathbb{N}$. Then there is a subsequence $(x_{n_k})$ of $(x_n)$ such that $\lim_k x_{n_k} = x$ locally uniformly on $[0,+\infty)$ and x is optimal for P.

PROOF: Let $x_n$ be optimal for $P_n(a)$, $P_n = (F_n, \delta_n) \in \mathcal{P}$ and $P = \lim_n P_n = (F, \delta)$. There are bounded open intervals $I = ]\sigma_1, \sigma_2[ \subset J$ such that, for large n, every $F_n$ admits an attractor $I_n \subset I$, $\lim_n \delta_n = \delta$ and $\lim_n |F_n - F|_{I,J} = 0$. Since we may assume, e.g., that all $x_n$ are increasing, this implies:

$$\min(a, \sigma_1) = \underline{a} \le x_n(t) \le \overline{a} = \max(a, \sigma_2) \quad (t \ge 0).$$

From this, we deduce

$$\int_0^{\infty} |\dot{x}_n(s)|\,ds \le \max(|s_1 - a|, |s_2 - a|) \quad (n \in \mathbb{N})$$

with $[s_1, s_2]$ an attractor for F.

The proof then proceeds as does the proof of Theorem 3.1.

Corollary 3.1 — The value function V depends continuously on the problem P, $P \in \mathcal{P}$.

PROOF: If $V_n$ (resp. V) is the value function for $P_n$ (resp. P), then using Theorem 3.2 we can deduce that $\lim_n V_n(b) = V(b)$ for all $b \in \mathbb{R}$.

CONCLUSION

It was shown in this paper that a problem P(a), where P belongs to the family $\mathcal{P}$ we considered, always admits a solution, and that the value function V depends continuously on the problem P. Most problems of optimal exploitation of capital stocks or resource stocks admit a global attractor for the net [...] is to say, that an extension of the above results to include these problems seems to be an important task.

ACKNOWLEDGEMENT

The author would like to express his thanks to the unknown referee for his valuable comments and

suggestions.

REFERENCES

1. J. P. Aubin and H. Frankowska, Set valued analysis, Birkhäuser, Boston, 1990.

2. H. Brezis, Analyse fonctionnelle, théorie et applications, Masson, Paris, 1983.

3. H. Brezis, Functional analysis, Sobolev spaces and partial differential equations, Springer, 2010, DOI 10.1007/978-0-387-70914-7.

4. X. Cabre, Elliptic PDEs in probability and geometry, Discrete Contin. Dyn. Syst. Ser. A, 20(3) (2008), 425-457.

5. J. M. Coron, Phantom tracking method, homogeneity and rapid stabilization, Math. Control Relat. Fields, 3(3) (2013), 303-322.

6. J. M. Coron, Controllability and nonlinearity, ESAIM, Proc., 22 (2008), 21-39.

7. M. G. Crandall and P. L. Lions, Viscosity solutions of Hamilton-Jacobi equations, Trans. Amer. Math. Soc., 277 (1983), 1-42.

8. M. G. Crandall, L. C. Evans, and P. L. Lions, Some properties of viscosity solutions of Hamilton-Jacobi equations, Trans. Amer. Math. Soc., 282(2) (1984), 487-502.

9. M. I. Kamien and N. L. Schwartz, Dynamic optimization: The calculus of variations and optimal control in economics and management, North-Holland, Amsterdam, 1981, p. 164.

10. N. V. Krylov, Fully nonlinear second order elliptic equations: Recent development, Ann. Sc. Norm. Pisa, 25(3-4) (1997), 569-595.

11. P. L. Lions, Generalized solutions of Hamilton-Jacobi equations, Pitman Research Notes in Mathematics, Longman Scientific and Technical, Harbour, 69 (1982).

12. V. P. Maslov, On a new principle of superposition for optimisation problems, Russian Maths. Survey, 42(3) (1987), 43-84.

13. J. Cristiana Silva, F. M. Delfim Torres, and E. Trélat, On optimal control and its applications, Bol. Soc. Port. Mat., 61 (2009), 11-37.

14. E. Trélat, Optimal control and applications to aerospace: Some results and challenges, J. Optim. Theory Appl., 154(3) (2012), 713-758.