On the Optimization Problem of Stochastic Observations of Random Walks


Alexander A. Butov

Faculty of Mathematics and Information Technologies, Ulyanovsk State University, Ulyanovsk, Russian Federation

*Corresponding Author: [email protected]

Abstract

The optimal control problem for the intensity of observation events of a random walk is considered in semimartingale terms for the case where the observation times form a Poisson counting process. The loss function combines a linear function of the intensity (the cost of observations) with the expected value of a quadratic form of the estimation errors (the cost of an error). An analogous result is presented for the problem of the optimal intensity of stochastic approximation.

Keywords

Random Walk, Poisson Process, Optimal Control, Estimation, Semimartingale

1. Introduction and Definitions

Let $\mathcal{B} = (\Omega, \mathcal{F}, \mathbb{F} = (\mathcal{F}_t)_{t \ge 0}, \mathbf{P})$ be a stochastic basis satisfying the «usual conditions» of Dellacherie, [1]. We consider the following model. On $\mathcal{B}$ a process $X = (X_t)_{t \ge 0}$ is a random walk on the lattice $\mathbb{Z} = \{\dots, -1, 0, 1, 2, \dots\}$ with trajectories in the Skorokhod space and $\Delta X_t = X_t - X_{t-} \in \{-1, 0, 1\}$, [2]. Along with $X = (X_t)_{t \ge 0}$ we shall consider on $\mathcal{B}$ a Poisson process $N = (N_t)_{t \ge 0}$ with intensity $\lambda > 0$. For all $k = 1, 2, \dots$ define stopping times $\tau(k)$ such that $\Delta N_{\tau(k)} = 1$:

$$\tau(k) = \inf\{t : t > 0,\ N_t = k\} \tag{1}$$

and denote $\tau(0) = 0$ for $k = 0$. Then the random variables $\{\theta(k)\}_{k = 1, 2, \dots}$ with $\theta(k) = \tau(k) - \tau(k-1)$ are independent and identically distributed with the density function $\rho(x)$: $\rho(x) = 0$ for $x < 0$ and

$$\rho(x) = \lambda \cdot \exp\{-\lambda \cdot x\} \quad \text{for } x \ge 0. \tag{2}$$

The mathematical expectations and variances of $\theta(k)$ and $\tau(k)$ for all $k \ge 1$ are equal to $\mathbf{E}\theta(k) = 1/\lambda$, $\mathbf{D}\theta(k) = 1/\lambda^2$ and $\mathbf{E}\tau(k) = k/\lambda$, $\mathbf{D}\tau(k) = k/\lambda^2$ respectively. Suppose that the processes $X$ and $N$ are independent and hence the times of their jumps cannot be simultaneous:

$$\mathbf{P}\Big\{\sum_{0 \le s \le t} \Delta X_s \cdot \Delta N_s = 0\Big\} = 1 \quad \text{for all } t \ge 0.$$

In the model $N$ is supposed to be the counting process of observation events. The initial value $X_0$ is observable: $Y_0(\omega) = X_0(\omega)$ for all $\omega \in \Omega$, and hence we set $\tau(0) = 0$. Because the jump $\Delta N_t$ of the counting process is equal to zero on the interval $t \in [0, \tau(1))$, the observation $Y_t$ remains equal to $X_{\tau(0)} = X_0$ for any time $t$ from this interval: $Y_t = Y_{\tau(0)} = X_0$.

The observation $Y_t$ can change its value only at the stopping time $\tau(1)$ (the time of the first observation): as follows from the definition (1), $Y_{\tau(1)} = X_{\tau(1)}$. Hence

$$Y_{\tau(1)} = Y_{\tau(1)-} + \Delta Y_{\tau(1)} = Y_{\tau(0)} + (X_{\tau(1)} - Y_{\tau(0)}) \cdot 1 = Y_{\tau(0)} + (X_{\tau(1)-} - Y_{\tau(1)-}) \cdot \Delta N_{\tau(1)}.$$

Analogously we can describe the algorithm of observation at times $\tau(k)$ for all $k \ge 1$: $Y_{\tau(k)} = X_{\tau(k)}$ and $Y_t = Y_{\tau(k)}$ for times $t \in [\tau(k), \tau(k+1))$. Hence the process of observations $Y = (Y_t)_{t \ge 0}$ is a solution of the following stochastic equation:

$$Y_t = X_0 + \int_0^t (X_{s-} - Y_{s-})\,dN_s.$$
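The observation scheme above can be illustrated by a short simulation. The following Python sketch is an illustration only, not part of the original construction: it generates the jump times of $X$ and $N$ as independent Poisson streams (anticipating the Poisson assumption on the positive and negative jumps made later in the paper) and applies the update $dY = (X_{s-} - Y_{s-})\,dN_s$ at each observation time. The function names `poisson_times` and `simulate_observations` and all parameter values are assumptions chosen for the example.

```python
import random


def poisson_times(rate, horizon, rng):
    """Jump times on (0, horizon] of a Poisson process with the given rate."""
    times = []
    t = 0.0
    while rate > 0:
        t += rng.expovariate(rate)  # exponential gap with mean 1/rate
        if t > horizon:
            break
        times.append(t)
    return times


def simulate_observations(alpha, beta, lam, horizon, x0=0, seed=0):
    """Simulate X (up-rate alpha, down-rate beta) and the held observation Y.

    Returns a time-ordered list of events (t, kind, X_t, Y_t), where kind is
    "A" (jump +1 of X), "B" (jump -1 of X) or "N" (observation event).
    """
    rng = random.Random(seed)
    stream = [(t, "A") for t in poisson_times(alpha, horizon, rng)]
    stream += [(t, "B") for t in poisson_times(beta, horizon, rng)]
    stream += [(t, "N") for t in poisson_times(lam, horizon, rng)]
    stream.sort()

    x = y = x0  # Y_0 = X_0: the initial value is observable
    events = []
    for t, kind in stream:
        if kind == "A":
            x += 1
        elif kind == "B":
            x -= 1
        else:
            y += x - y  # dY = (X_{s-} - Y_{s-}) dN_s, so Y jumps to X
        events.append((t, kind, x, y))
    return events


if __name__ == "__main__":
    for event in simulate_observations(1.0, 0.5, 2.0, horizon=5.0, seed=1):
        print("t=%.3f %s X=%d Y=%d" % event)
```

Between observation events $Y$ stays constant while $X$ keeps moving, which is exactly the sample-and-hold behaviour described by the stochastic equation.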

Note that a random walk $X$ can be represented as a difference of two counting processes:

$$X = X_0 + A - B, \tag{3}$$

where $A = (A_t)_{t \ge 0}$ and $B = (B_t)_{t \ge 0}$ are the counting processes of the numbers of positive and negative jumps of $X$ respectively:

$$A_t = \sum_{0 < s \le t} I(\Delta X_s = 1), \qquad B_t = \sum_{0 < s \le t} I(\Delta X_s = -1), \qquad \sum_{0 < s \le t} \Delta A_s \cdot \Delta B_s = 0 \quad \text{for all } t \ge 0.$$


The representation (3) of random walks with trajectories from the Skorokhod space on the lattice $\mathbb{Z}$ is the most general possible, but in this article we shall restrict the model by assuming very simple distributions of the processes $A$ and $B$. Let $A$ and $B$ be independent Poisson processes with intensities $\alpha \ge 0$ and $\beta \ge 0$ respectively. According to the well-known Doob–Meyer decomposition of submartingales, [3], $A$ and $B$ can be represented as

$$A_t = \alpha \cdot t + m_t^A, \qquad B_t = \beta \cdot t + m_t^B, \tag{4}$$

where $m^A = (m_t^A)_{t \ge 0}$ and $m^B = (m_t^B)_{t \ge 0}$ are square integrable martingales on $\mathcal{B}$ with quadratic characteristics

$$\langle m^A \rangle_t = \alpha \cdot t, \qquad \langle m^B \rangle_t = \beta \cdot t, \qquad \langle m^A, m^B \rangle_t = 0.$$

It is clear that the process $Y = (Y_t)_{t \ge 0}$ is a stochastic discrete-time approximation of the process $X = (X_t)_{t \ge 0}$. In this simple case it is possible to estimate the value of $X_t$ given the observations $\{Y_s,\ 0 \le s \le t\}$ for all $t \ge 0$.

The higher the rate of observations, the better the approximation and the smaller the error of estimation can be. In many applications, however, the cost of observations is not negligible. The problem is therefore to find a compromise intensity of observations $\lambda$ that minimizes a loss function accounting for both the cost of observations and the cost of an estimation error.

In the model the cost of observations is supposed to depend linearly on their averaged and normalized number $N_T$ over a time span $T > 0$, and therefore on the intensity $\lambda$ of the Poisson process $N$. The cost of errors of observations is the expected value of a quadratic form of errors. The process of estimation is considered here as a continuous-time construction of a (discontinuous) process $\hat{X} = (\hat{X}_t)_{t \ge 0}$ with the random variable $\hat{X}_t$ defined as an optimal mean square estimate of $X_t$ given $\{Y_s,\ 0 \le s \le t\}$ for all $t \ge 0$ (i.e. given a discrete set of observations $Y_{\tau(k)} = X_{\tau(k)}$ for all $k$ with $\tau(k) \le t$, $k = 0, 1, 2, \dots$): $\hat{X}_t = \mathbf{E}(X_t \mid \mathcal{F}_t^Y)$, where the $\sigma$-algebra $\mathcal{F}_t^Y = \sigma(Y_s;\ s \le t)$ of the non-decreasing family $\mathbb{F}^Y = (\mathcal{F}_t^Y)_{t \ge 0}$ is completed by the sets from $\mathcal{F}$ of $\mathbf{P}$-measure zero. Thus the cost of an error of estimation is a normalized and expected value of an integral of the variance $\gamma_t = \mathbf{E}\varepsilon_t^2$ of the error of estimation $\varepsilon_t = X_t - \hat{X}_t$ for $t \ge 0$. Along with the errors of estimation given the observations $\mathbb{F}^Y$ it is possible to examine the proper errors of approximation $\delta_t = X_t - Y_t$ and the expected value of the quadratic form of errors with $\Gamma_t = \mathbf{E}\delta_t^2$.

2. Results

Let us define the loss process $\varphi = (\varphi(t))_{t \ge 0}$ as a linear function of the number of observations (the cost of observations) and the quadratic form of the errors of estimation (the cost of an error) with positive constants $h$ and $g$:

$$\varphi(t) = h \cdot \int_0^t \varepsilon_s^2\,ds + g \cdot N_t. \tag{5}$$

The expected and normalized value $\psi(\lambda)$ of the loss function $\varphi(t)$ corresponding to the intensity $\lambda$ is

$$\psi(\lambda) = \lim_{T \to \infty} \frac{1}{T}\,\mathbf{E}\varphi(T). \tag{6}$$

In terms of optimal control the problem is to find an intensity $\lambda^*$ such that

$$\psi(\lambda^*) = \min_{\lambda \ge 0} \psi(\lambda). \tag{7}$$

In order to investigate the variances of errors $\gamma_t = \mathbf{E}\varepsilon_t^2$ we first formulate preliminary results for the auxiliary random variables $\gamma_t(u)$, the conditional variances $\gamma_t(u) = \mathbf{E}\big((\varepsilon_t(u))^2 \mid \mathcal{F}_u^Y\big)$ of the error $\varepsilon_t(u) = X_{t+u} - \hat{X}_t(u)$ of the estimate $\hat{X}_t(u) = \mathbf{E}\big(X_{t+u} \mid \mathcal{F}_u^Y\big)$ for $t \ge 0$ and $u \ge 0$ (the times $t$ and $u$ can be considered as arbitrary random $\mathbb{F}^Y$-adapted stopping times). The properties of $\gamma_t(u)$ are studied in the following lemma:

Lemma 1. For all $k = 1, 2, \dots$ and stopping times $\tau(k)$ for $\lambda > 0$ the following equality holds:

$$\mathbf{E}\int_{\tau(k-1)}^{\tau(k)} \gamma_t\,dt = \frac{\alpha + \beta}{\lambda^2}.$$

Proof. According to the semimartingale presentation (4) of the components (3) of $X$ and the equality $Y_{\tau(k-1)} = X_{\tau(k-1)}$, the estimate $\hat{X}_t(\tau(k-1)) = \mathbf{E}\big(X_{t+\tau(k-1)} \mid \mathcal{F}^Y_{\tau(k-1)}\big)$ is equal to $X_{\tau(k-1)} + (\alpha - \beta) \cdot t$. Therefore the conditional error can be presented as

$$\varepsilon_t(\tau(k-1)) = \big(m^A_{\tau(k-1)+t} - m^A_{\tau(k-1)}\big) - \big(m^B_{\tau(k-1)+t} - m^B_{\tau(k-1)}\big).$$

Because the martingales $m^A$ and $m^B$ in (4) are independent, the conditional variance $\gamma_t(\tau(k-1)) = \mathbf{E}\big((\varepsilon_t(\tau(k-1)))^2 \mid \mathcal{F}^Y_{\tau(k-1)}\big)$ is equal to

$$\big(\langle m^A \rangle_{\tau(k-1)+t} - \langle m^A \rangle_{\tau(k-1)}\big) + \big(\langle m^B \rangle_{\tau(k-1)+t} - \langle m^B \rangle_{\tau(k-1)}\big) = t \cdot (\alpha + \beta). \tag{8}$$


Since $N$ is independent of $X$ and the distribution density $\rho(x)$ of $\theta(k)$ is exponential, see (2), for all $k \ge 1$

$$\mathbf{E}\int_{\tau(k-1)}^{\tau(k)} \gamma_t\,dt = \mathbf{E}\int_0^{\theta(k)} t \cdot (\alpha + \beta)\,dt = \int_0^\infty \rho(x) \int_0^x t \cdot (\alpha + \beta)\,dt\,dx = \frac{\alpha + \beta}{\lambda^2}.$$

Lemma 1 is proved.
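Lemma 1 can be checked numerically. The sketch below is a Monte Carlo illustration under the model assumptions (Poisson jumps of $X$ with rates $\alpha$ and $\beta$, exponential observation gaps with rate $\lambda$); the function names, sample size, and parameter values are illustrative assumptions. It simulates one inter-observation interval at a time, integrates the squared error $\varepsilon_t = (X_{\tau+t} - X_\tau) - (\alpha - \beta) t$ exactly between jumps, and compares the sample mean with $(\alpha + \beta)/\lambda^2$.

```python
import random


def cycle_error_integral(alpha, beta, lam, rng):
    """Integral of eps_t**2 over one inter-observation interval [0, theta)."""
    theta = rng.expovariate(lam)  # interval length, density (2)
    jumps = []
    for rate, sign in ((alpha, 1), (beta, -1)):
        t = 0.0
        while rate > 0:
            t += rng.expovariate(rate)
            if t >= theta:
                break
            jumps.append((t, sign))
    jumps.sort()
    jumps.append((theta, 0))  # sentinel closing the last segment

    drift = alpha - beta
    total, start, level = 0.0, 0.0, 0
    for end, sign in jumps:
        # On [start, end) the error is eps_t = level - drift * t.
        total += (level * level * (end - start)
                  - level * drift * (end ** 2 - start ** 2)
                  + drift * drift * (end ** 3 - start ** 3) / 3.0)
        level += sign
        start = end
    return total


def estimate_lemma1(alpha, beta, lam, n_cycles=200_000, seed=0):
    """Sample mean of the cycle integral; Lemma 1 predicts (alpha+beta)/lam**2."""
    rng = random.Random(seed)
    return sum(cycle_error_integral(alpha, beta, lam, rng)
               for _ in range(n_cycles)) / n_cycles


if __name__ == "__main__":
    a, b, lam = 1.0, 0.5, 2.0
    print("simulated:", estimate_lemma1(a, b, lam))
    print("predicted:", (a + b) / lam ** 2)
```

The simulated value fluctuates around the prediction; the agreement tightens as `n_cycles` grows.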

The following lemma gives a way to calculate $\psi(\lambda)$ in (6) in terms of $\varphi(t)$ and $\tau(k)$:

Lemma 2. For the loss process $\varphi$ and $k = [\lambda \cdot T]$ the following convergence takes place:

$$\psi(\lambda) = \lim_{k \to \infty} (\lambda / k) \cdot \mathbf{E}\varphi(\tau(k)),$$

where $[\,\cdot\,]$ is the greatest integer function.

Proof. As follows from the definition (1), the equality $(\lambda / k) \cdot g \cdot N_{\tau(k)} = g \cdot \lambda$ holds for all $k \ge 1$. The equality $(1/T) \cdot g \cdot \mathbf{E}N_T = g \cdot \lambda$ takes place for all $T > 0$ because the intensity of the Poisson process $N$ is equal to $\lambda$. Hence for the normalized and expected right summand in (5) at time $\tau(k)$ the following equality holds:

$$\lim_{k \to \infty} \frac{\lambda}{k}\,\mathbf{E}\{g \cdot N_{\tau(k)}\} = \lim_{T \to \infty} \frac{1}{T}\,\mathbf{E}\{g \cdot N_T\} = g \cdot \lambda. \tag{9}$$

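Consider the normalized and expected left summand in (5) at time $\tau(k)$, with $T = k / \lambda$, $k \ge 1$. It is clear that

$$\mathbf{E}\int_0^{\tau(k)} \varepsilon_t^2\,dt = \mathbf{E}\int_0^{\tau(k)} \gamma_t\,dt = \sum_{i=1}^{k} \mathbf{E}\int_{\tau(i-1)}^{\tau(i)} \gamma_t\,dt$$

and therefore, as follows from Lemma 1,

$$\frac{\lambda}{k} \cdot h \cdot \mathbf{E}\int_0^{\tau(k)} \varepsilon_t^2\,dt = \frac{\lambda}{k} \cdot h \cdot k \cdot \frac{\alpha + \beta}{\lambda^2} = h \cdot \frac{\alpha + \beta}{\lambda}.$$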

Hence the proof of Lemma 2 is reduced to the verification of the following statement:

$$\lim_{T \to \infty} \frac{1}{T}\,\mathbf{E}\int_0^T \varepsilon_t^2\,dt = \frac{\alpha + \beta}{\lambda}. \tag{10}$$

Let us consider the auxiliary random variable $\zeta(T)$ with values in $[0, T]$ for $T > 0$ defined as

$$\zeta(T) = \sup\{s : s \le T,\ \Delta N_s = 1\} = \inf\{s : s \in [0, T],\ N_s = N_T\}.$$

It is clear that $\zeta(t)$ is not a stopping time on the basis $\mathcal{B}$ (because in the general case the set $\{\omega : \zeta(t) \le u\} \notin \mathcal{F}_u$ for $u < t$).

Nevertheless it is possible to investigate the set $\{\Theta(T)\}_{T > 0}$ of random variables

$$\Theta(T) = \int_0^{\zeta(T)} \gamma_t\,dt \quad \text{for } T > 0$$

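in terms of inverse time. It is clear that from the convergence $\tau(k) / (k / \lambda) \to 1$ $\mathbf{P}$-a.s. (as $k \to \infty$) it follows that (for $k \to \infty$ and $T = k / \lambda \to \infty$) $\mathbf{P}$-a.s.

$$\frac{1}{T}\,\Theta(T) = \sum_{i=1}^{\infty} I\{\tau(i) = \zeta(T)\} \cdot \frac{1}{T} \int_0^{\tau(i)} \gamma_t\,dt = \sum_{i=1}^{\infty} I\{\tau(i) = \zeta(T)\} \cdot \frac{i}{T} \cdot \frac{1}{i} \int_0^{\tau(i)} \gamma_t\,dt \to \lambda \cdot \frac{\alpha + \beta}{\lambda^2} = \frac{\alpha + \beta}{\lambda}, \tag{11}$$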

where $I\{\cdot\}$ is an indicator function ($I\{\text{true}\} = 1$, $I\{\text{false}\} = 0$). So the proof of the lemma would follow from the convergence

$$\frac{1}{T} \cdot \mathbf{E}\Big(\int_0^T \gamma_t\,dt - \Theta(T)\Big) \to 0. \tag{12}$$

Note that

$$\int_0^T \gamma_t\,dt - \Theta(T) = \int_{\zeta(T)}^T \gamma_t\,dt \tag{13}$$

and in the inverse time presentation the process $R = (R_u)_{u \in [0, T]}$ for $u = T - t$ with $R_u = N_{T-u}$ is $\mathbb{F}^R$-adapted, where $\mathbb{F}^R = (\mathcal{F}_u^R)_{0 \le u \le T}$ and $\mathcal{F}_u^R = \sigma\{R_v;\ 0 \le v \le u\} = \sigma\{N_t,\ T - u \le t \le T\}$. The process $R$ is a supermartingale and therefore admits the following Doob–Meyer decomposition:

$$R_u = N_T - r_u + m_u^r,$$

where $r = (r_u)_{0 \le u \le T}$ is a compensator and $m^r = (m_u^r)_{0 \le u \le T}$ is a square-integrable martingale. The compensator $r_u$ is equal to

$$r_u = \int_0^u \frac{R_v}{T - v}\,dv \tag{14}$$

(note that formula (14) is similar to the semimartingale presentation of a Brownian bridge and results from the infinitesimal semimartingale presentation of a Poisson process in inverse time). According to (14) and the well-known formula of Dellacherie $dr_x = dF_{\tilde\zeta}(x) / (1 - F_{\tilde\zeta}(x-))$ (see, e.g., [1]) it follows that the conditional distribution function $F_{\tilde\zeta}(x)$ and the density $\rho_{\tilde\zeta}(x)$ of the time $\tilde\zeta$ of the first jump of the process $R$, given the random value $N_T$, satisfy

$$F_{\tilde\zeta}(x) = 1 - (1 - x/T)^{N_T}, \qquad \rho_{\tilde\zeta}(x) = N_T \cdot (T - x)^{N_T - 1} / T^{N_T}. \tag{15}$$

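Because $R$ coincides with $N$ in inverse time, $\tilde\zeta = T - \zeta(T)$. From the formula (8) and the independence of the processes $X = (X_t)_{t \ge 0}$ and $N = (N_t)_{t \ge 0}$ it follows that

$$\frac{1}{T} \cdot \mathbf{E}\int_{\zeta(T)}^T \gamma_t\,dt = \frac{1}{T} \cdot \mathbf{E}\int_0^{T - \zeta(T)} \gamma_u\,du = (\alpha + \beta) \cdot \mathbf{E}\,\frac{1}{2T} \int_0^T \rho_{\tilde\zeta}(x) \cdot x^2\,dx,$$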

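where $\gamma_u = \gamma_{T-t}$ for $u = T - t$. From (15) we receive

$$\frac{1}{2T} \int_0^T \rho_{\tilde\zeta}(x) \cdot x^2\,dx = \frac{1}{N_T} \cdot I\{N_T \ge 1\}.$$

Because $\mathbf{P}\{1 \le N_T \le T/2\} \to 0$ for $T \to \infty$ and

$$\mathbf{E}\big\{I\{N_T \ge 1\} / N_T\big\} \le \mathbf{E}\Big\{1 \cdot I\{1 \le N_T \le T/2\} + \frac{1}{T/2} \cdot I\{T/2 < N_T\}\Big\} \le \mathbf{P}\{1 \le N_T \le T/2\} + \frac{1}{T/2},$$

then

$$\frac{1}{T} \cdot \mathbf{E}\int_{\zeta(T)}^T \gamma_t\,dt \to 0. \tag{16}$$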

The convergences (11) and (16) together with the equality (13) imply (12) and hence the statement of Lemma 2. Lemma 2 is proved.

The value of $\psi(\lambda)$ in (6) is obtained in the following lemma.

Lemma 3. Under the assumptions of Theorem 1 the function $\psi(\lambda)$ is equal to

$$\psi(\lambda) = h \cdot (\alpha + \beta) / \lambda + g \cdot \lambda.$$

Proof of the lemma follows from the statements (9) and (10).

Here is the solution of the problem (6)-(7):

Theorem 1. Let $h \ge 0$, $g > 0$. Then the value $\lambda^*$ in the problem (7) for the loss function (6) is equal to

$$\lambda^* = \sqrt{(h / g) \cdot (\alpha + \beta)}, \tag{17}$$

and the value of the loss function is

$$\psi(\lambda^*) = 2 \cdot \sqrt{h \cdot g \cdot (\alpha + \beta)}. \tag{18}$$
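Formulas (17) and (18) are straightforward to evaluate. The following sketch assumes the loss rate $\psi(\lambda) = h(\alpha + \beta)/\lambda + g\lambda$ from Lemma 3 and computes the optimal intensity and the minimal loss; the function names and the numerical values are illustrative assumptions.

```python
import math


def psi(lam, alpha, beta, h, g):
    """Loss rate from Lemma 3: estimation cost plus observation cost."""
    return h * (alpha + beta) / lam + g * lam


def optimal_intensity(alpha, beta, h, g):
    """Return (lambda*, psi(lambda*)) according to (17) and (18)."""
    lam_star = math.sqrt(h * (alpha + beta) / g)
    min_loss = 2.0 * math.sqrt(h * g * (alpha + beta))
    return lam_star, min_loss


if __name__ == "__main__":
    alpha, beta, h, g = 1.0, 0.5, 2.0, 3.0
    lam_star, min_loss = optimal_intensity(alpha, beta, h, g)
    print(lam_star, min_loss)                # 1.0 6.0
    print(psi(lam_star, alpha, beta, h, g))  # 6.0
```

For these values the optimum balances the two costs: at $\lambda^* = 1$ both $h(\alpha + \beta)/\lambda$ and $g\lambda$ are equal to 3.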

Proof of the theorem follows from the statement of Lemma 3 and the equation $d\psi(\lambda) / d\lambda = 0$, resulting in (17) and (18).

Now we study the intensity of observations $\lambda$ as a rate of stochastic approximation of $X$ with $Y$. In order to investigate the rates of approximation we define functions $\Phi(t)$ and $\Psi(\lambda)$ by analogy with $\varphi(t)$ and $\psi(\lambda)$, substituting the estimates $\hat{X}_s$ by the observations $Y_s$ (and hence substituting $\varepsilon_s$ by $\delta_s$) in (5)-(6):

$$\Phi(t) = h \cdot \int_0^t \delta_s^2\,ds + g \cdot N_t, \qquad \Psi(\lambda) = \lim_{T \to \infty} \frac{1}{T}\,\mathbf{E}\Phi(T), \tag{19}$$

and we consider the optimal control problem of finding an intensity of stochastic approximation $\Lambda$ such that

$$\Psi(\Lambda) = \min_{\lambda \ge 0} \Psi(\lambda). \tag{20}$$

By analogy with $\gamma_t(u)$ we define $\Gamma_t(u) = \mathbf{E}\big((\delta_t(u))^2 \mid \mathcal{F}_u^Y\big)$ with $\delta_t(u) = X_{t+u} - Y_u$.

Then the following result for $\Gamma_t(u)$ is true:

Lemma 4. For all $k = 1, 2, \dots$ and stopping times $\tau(k)$ for $\lambda > 0$ there holds

$$\mathbf{E}\int_{\tau(k-1)}^{\tau(k)} \Gamma_t\,dt = \frac{\alpha + \beta}{\lambda^2} + \frac{2 (\alpha - \beta)^2}{\lambda^3}.$$

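Proof of the lemma follows from the equality

$$\mathbf{E}\int_{\tau(k-1)}^{\tau(k)} \Gamma_t\,dt = \mathbf{E}\int_{\tau(k-1)}^{\tau(k)} \gamma_t\,dt + \int_0^\infty \rho(x) \int_0^x ((\alpha - \beta) \cdot t)^2\,dt\,dx,$$

from (2) and from the statement of Lemma 1. Indeed, after the observation at $\tau(k-1)$ the approximation error is $\delta_t(\tau(k-1)) = \varepsilon_t(\tau(k-1)) + (\alpha - \beta) \cdot t$, where the first term has zero conditional mean and conditional variance $(\alpha + \beta) \cdot t$ by (8), so that $\Gamma_t(\tau(k-1)) = (\alpha + \beta) \cdot t + (\alpha - \beta)^2 \cdot t^2$; the last integral equals $(\alpha - \beta)^2 \cdot \mathbf{E}\theta(k)^3 / 3 = 2 (\alpha - \beta)^2 / \lambda^3$.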
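The value in Lemma 4 can be double-checked by numerical integration. The sketch below assumes the conditional mean square approximation error $\Gamma_t = (\alpha + \beta) t + (\alpha - \beta)^2 t^2$ after an observation, integrates it in closed form over an interval of length $x$, and averages over the exponential density (2) with a composite Simpson rule. The truncation point, the number of panels, and the function names are assumptions made for the illustration.

```python
import math


def inner_integral(x, alpha, beta):
    """Integral over [0, x] of (alpha+beta)*t + (alpha-beta)**2 * t**2."""
    return (alpha + beta) * x ** 2 / 2.0 + (alpha - beta) ** 2 * x ** 3 / 3.0


def lemma4_quadrature(alpha, beta, lam, panels=20_000, cutoff=60.0):
    """Simpson approximation of E of the inner integral at an Exp(lam) time.

    The infinite range is truncated at cutoff / lam, where the exponential
    weight is of order exp(-cutoff).
    """
    upper = cutoff / lam
    step = upper / panels  # panels must be even for Simpson's rule

    def f(x):
        return lam * math.exp(-lam * x) * inner_integral(x, alpha, beta)

    total = f(0.0) + f(upper)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * f(i * step)
    return total * step / 3.0


def lemma4_closed_form(alpha, beta, lam):
    """Right-hand side of Lemma 4."""
    return (alpha + beta) / lam ** 2 + 2.0 * (alpha - beta) ** 2 / lam ** 3


if __name__ == "__main__":
    print(lemma4_quadrature(1.0, 0.5, 2.0))
    print(lemma4_closed_form(1.0, 0.5, 2.0))
```

The second term of the closed form vanishes when $\alpha = \beta$, in which case the approximation error and the estimation error of Lemma 1 have the same mean square.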

The next result gives a way to find $\Psi(\lambda)$ in terms of $\Phi(t)$ and the stopping times $\tau(k)$:

Lemma 5. For the loss process $\Phi$ and $k = [\lambda \cdot T]$ the following convergence takes place:

$$\Psi(\lambda) = \lim_{k \to \infty} (\lambda / k) \cdot \mathbf{E}\Phi(\tau(k)).$$

Proof is similar to that of Lemma 2.


Lemma 6. Let $h \ge 0$, $g > 0$. Then for the function $\Psi(\lambda)$ there holds

$$\Psi(\lambda) = 2 \cdot h \cdot (\alpha - \beta)^2 / \lambda^2 + h \cdot (\alpha + \beta) / \lambda + g \cdot \lambda. \tag{21}$$

Proof is similar to the proof of Lemma 3.

The next result gives a way to solve the problem (20):

Theorem 2. Let $h \ge 0$, $g > 0$. Then the value $\Lambda$ in the problem (20) for the loss function (19) is a solution of the following equation:

$$\lambda^3 - (h / g) \cdot (\alpha + \beta) \cdot \lambda - 4 \cdot (h / g) \cdot (\alpha - \beta)^2 = 0.$$

The value of the loss function $\Psi(\Lambda)$ is defined by (21).

Proof is analogous to that of Theorem 1 and follows from the requirement $d\Psi(\lambda) / d\lambda = 0$.
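The cubic equation of Theorem 2 has no convenient closed form in general, but its positive root is easy to find numerically. The sketch below uses plain bisection, which is an illustrative choice and not a method prescribed by the paper; the function names and parameter values are assumptions. By Descartes' rule of signs the cubic has at most one positive root, so bisection on a bracket with a sign change returns $\Lambda$.

```python
def approximation_loss(lam, alpha, beta, h, g):
    """Loss rate (21) for approximating X by the held observation Y."""
    return (2.0 * h * (alpha - beta) ** 2 / lam ** 2
            + h * (alpha + beta) / lam + g * lam)


def cubic(lam, alpha, beta, h, g):
    """Left-hand side of the equation in Theorem 2."""
    ratio = h / g
    return (lam ** 3 - ratio * (alpha + beta) * lam
            - 4.0 * ratio * (alpha - beta) ** 2)


def optimal_approximation_intensity(alpha, beta, h, g, tol=1e-12):
    """Positive root of the cubic, found by bisection."""
    lo, hi = 0.0, 1.0
    while cubic(hi, alpha, beta, h, g) <= 0.0:  # expand until the sign changes
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if cubic(mid, alpha, beta, h, g) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    alpha, beta, h, g = 1.0, 0.5, 2.0, 3.0
    big_lambda = optimal_approximation_intensity(alpha, beta, h, g)
    print("Lambda:", big_lambda)
    print("loss  :", approximation_loss(big_lambda, alpha, beta, h, g))
```

When $\alpha = \beta$ the drift term vanishes and the root coincides with $\lambda^*$ from (17); when $\alpha \ne \beta$ the cubic is negative at $\lambda^*$, so $\Lambda > \lambda^*$.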

3. Conclusion

The method discussed in the paper is based on the semimartingale approach developed for random walks of a general type in [2]. This approach was shown to be useful for limit theorems and for problems of estimation. Problems of the optimal intensity of observations have been developed mostly for Gaussian or stationary systems (see, e.g., [4-6]) but not yet for random walks. Nevertheless, new results can be obtained by means of this (martingale) approach. The simple case of the random walk investigated in the article is interesting because it demonstrates the method for a non-stationary (and non-ergodic) case. Thus it can be readily applied to a variety of processes (including random walks in conditionally Markov random environments, considered as an example of the martingale approach in [2]) and to point counting processes for the numbers of observations. The main result of the paper, stated in Theorem 1, makes it possible to consider optimal control problems for the rate of instant observations of non-stationary systems (e.g., streams of data in queueing systems similar to [7]). The problems of comparing optimal intensities of observations (for estimation) and optimal rates of approximation are of special interest, and the statement of Theorem 2 (along with the results of Theorem 1) can be considered as a simple approach to problems of this type.

Acknowledgements

The author is grateful to the referee for helpful comments that led to an improved manuscript. This research was partially supported by the Ministry of Education and Science of the Russian Federation (Research Projects of Ulyanovsk State University).

REFERENCES

[1] C. Dellacherie. Capacités et processus stochastiques. Springer-Verlag, Berlin, Heidelberg, New York, 1972.

[2] A. A. Butov. Random walks in random environments of a general type. Stochastics and Stochastics Reports, Gordon and Breach Science Publishers S.A., Vol. 48, pp. 145-160, 1994.

[3] R. Sh. Liptser, A. N. Shiryaev. Theory of martingales. Dordrecht, Kluwer Academic Publishers, 1989.

[4] M. Ades, P. E. Caines, R. P. Malhame. Stochastic optimal control under Poisson-distributed observations. Automatic Control, IEEE Transactions on, Vol. 45, Issue 1, pp. 3-13, 2000.

[5] R. A. Davis, W. T. M. Dunsmuir, S. B. Streett. Observation-driven models for Poisson counts. Biometrika, Vol. 90, No. 4, pp. 777-790, 2003.

[6] Tang Shanjian, Hou Shui-hung. Optimal Control of Point Processes with Noisy Observations: The Maximum Principle. Applied Mathematics & Optimization, Vol. 45, Issue 2, p. 185, 2002.