�NYf has bounded momen ts of some order �these are random variables

(1)

The Monte-Carlo method for ltering with discrete-time

observations

Pierre Del Moral

, Jean Jacod

y

and Philip Protter

z

September 28, 1999

1 Introduction

1)

The problem which we address in this work is the following: we have an

IR

d_-valued

state process

X

= (

X

t)t0 which evolves according to a stochastic dierential equation of

the form

dX

t =

a

(

X

t)

dt

+

b

(

X

t)

dW

t

L(

X

0) =

(1.1)

where

is a known distribution on

IR

d_{, and}

_a

_,

_b

_{are known functions, and}

_W

_{is a}

_d

-dimensional Wiener process. We have noisy observations

Y

1

:::Y

N at

N

regularly spaced

times, and without loss of generality we will assume that these times are 1

:::N

. We wish to compute the conditional expectations

YN

f

=

E

(

f

(

X

N)j

Y

0

:::Y

N) (1.2)

for all reasonable functions

f

on

IR

d_.

This is a ltering problem, but although

X

evolves according to \continuous" time this problem is essentially discrete in time, and the integer

N

, although perhaps large, is xed throughout (in contrast with the usual continuous-time non-linear ltering problem).

We will consider below three dierent ways in which the noisy observations occur. But in all cases, we have no explicit form for the transition semigroup (

Q

t)t0 of the

Markov process

X

, so that an explicit form for the lter

YN is not available, and we will

approximate this lter by Monte-Carlo simulations.

The performance is measured in terms of how many \single" random variables are necessary to simulate in order to achieve a given error in our approximation. More pre-cisely, we count as a \single" variable any variable of a xed dimension which we have to

LSP, B^at. 1R1, Universite Paul Sabatier, 118 route de Narbonne, 31062 Toulouse, France (CNRS

UMR C5583)

yLaboratoire de Probabilites, Universite Paris 6, 4 place Jussieu, 75252 Paris, France (CNRS UMR

7599)

zMathematics and Statistics Departments, Purdue University, W.Lafayette, IN 47907-1395, USA.

(2)

simulate. Then if we are allowed to simulate

r

single variables, we try to nd a Monte-Carlo algorithm giving an estimate ^_rYN

f

for

YN

f

in such a way that the sequence

r

1=(^rYN

f

;

YN

f

) has bounded moments of some order (these are random variables

dened on the space on which the simulated variables are dened themselves), for some

>

0 as small as possible: it means that if we wish to have a (mean) error less that

"

we need a constant times (1

="

) _{simulated variables.}

As said before, we wish to have

as small as possible, together with the following important properties:

1)

does not depend on the number

N

of observations (of course the bounds on the moments will depend on

N

, and in an exponential way as a matter of fact: so when

N

is big the method does not work well).

2)

does not depend on the observed points (

Y

1

:::Y

N), nor on the bounded measurable

function

f

.

2)

We single out three observations schemes:

Case A)

For all

i

= 1

:::N

we have

Y

i =

h

(

X

i

"

i)

(1.3)

where the

"

i are i.i.d.

q

0-dimensional variables, independent of

X

and with a law having

a known density, and

h

is a known function from

IR

d

IR

q 0

into

IR

q_.

Case B)

The observed values

Y

i are the values at times

i

of an

IR

q-valued process (

Y

t)t0

which satises the following:

dY

t =

h

(

X

t)

dt

+

dW

0

t

Y

0 = 0

(1.4)

where

h

is a known function and

W

0 is a

q

-dimensional Wiener process, independent of

W

, and is an invertible

q

-matrix.

Case C)

The same situation as Case B, except that the equation is

dY

t =

a

0(

X

t

Y

t)

dt

+

b

0(

X

t

Y

t)

dW

0

t

Y

0= 0

(1.5)

where

a

0 and

b

0 are known functions and

W

0 is as above.

In Cases B and C the initial value

Y

0= 0 is just for convenience, and for homogeneity

of notation we also set

Y

0 = 0 in case A. In the three cases, the two sequences (

X

i)i2IN

and (

X

i

Y

i)i2IN are Markov chains, with transitions denoted by

Q

and

P

respectively (we

have

Q

(

xdx

0) =

P

(

xy

dx

0

IR

q) for all

y

).

Let us now state the assumptions which will be in force in the paper. We do not seek the weakest possible assumptions here. First, in case (A) and (C) (resp. (B)) we will suppose (E1) (resp. (E2)) below about Equation (1.1):

(3)

(E2)

The functions

a

and

b

are bounded and Lipschitz continuous. For case (C) we also need an assumption on Equation (1.5), which is:

(E3)

The functions

a

0 and

b

0 are 2 times dierentiable with bounded derivatives of all

orders up to 2, and the matrix (

b

0

b

0)(

x

) is uniformly non-degenerate.

For case (A) we also need an assumption on the function

h

and on the i.i.d. variables

"

i:

(A1)

For any

x

2

IR

d the variable

Y

=

h

(

x"

1) admits a density

y

7!

g

(

xy

) and the

function

g

is bounded and explicitly known.

Observe that (A1) is satised as soon as

Y

i =

H

(

X

i) +

"

i and the

"

i's have a bounded

known density, whatever the known function

H

is.

For case (B) we need as well an assumption on the the function

h

occuring in (1.4) and on :

(B1)

The function

h

is bounded and Lipschitz continuous, and the matrix is invertible. We will exhibit Monte-Carlo algorithms with the following performances. Below, the constant

B

N depends on the quantities

a

,

b

,

a

0,

b

0,

h

, involved in the equations, but

not on the observed value (

Y

1

:::Y

N) = (

y

1

:::y

N) except through a minoration of the

continuous version of the density of the variable (

Y

1

:::Y

N) at the point (

y

1

:::y

N) (see

(2.14) below the assumptions made ensure the existence of this continuous density). As indicated by our notation,

B

N also depends in a crucial way on the number

N

of steps,

and in fact the dependency is exponential: we have

B

N =

CC

0N for two constants

C >

0

and

C

0

>

1.

The main results are as follows:

Case A:

Under (E1) and (A1) we have

E

(j^rYN

f

;

YN

f

j)

BNkfk

r1=3 , where

k

f

kis the

sup-norm of

f

.

Case B:

Under (E2) and (B1) we have

E

(j^rYN

f

;

YN

f

j)

BNkfk 0

r1=4 , where k

f

k

0 is

the Lipschitz norm of

f

, supposed to be a bounded Lipschitz function.

Case C:

Under (E1) and (E3) we have

E

(j^rYN

f

;

YN

f

j)

BNkfk

r1=(3+q).

Together with these we will also obtain exponential bounds.

Note that B is a particular case of C, provided (E1) holds and

h

is of class

C

2: in

this case we can also apply the result of case C, thus obtaining a majoration involving

k

f

k instead ofk

f

k

0. When

q

= 1 the rate is then the same when

f

is Lipschitz or simply

(4)

3)

The paper is organized as follows: Some preliminaries are provided in Section 2. Sec-tions 3, 4 and 5 are devoted to the proofs of the above results in cases A, B and C respectively. In Section 6 we rst state some extensions, without proofs. Then we give some numerical results: indeed the theory gives us constants

B

N above with the form

B

N =

C

N for some

C >

1, which seems to indicate that the method is unfeasible in

practice even for

N

moderately large. To understand what these constants

B

N really are,

we have made a numerical study in a very simple case: (

XY

) is a 2-dimensional Gaus-sian diusion the simulations show that

B

N stays reasonably small when the diusion is

recurrent, while it explodes as

N

increases in the transient case: this is of course to be expected, although we do not have a proof for this fact.

2 Preliminaries

This section is devoted to recalling some known facts about stochastic dierential equa-tions, and to provide some preliminary results. As a rule in this paper,

C

and

C

0 will

denote constants which may depend on the coecients of the equations (1.1), (1.3), (1.4) and possibly on the function

g

in (A1), but on nothing else, and they change from line to line.

2-1)

Since we cannot simulate exactly a random variable having the same law as

X

1

(starting at any

X

0 =

x

), we will approximate it by the well known Euler scheme. More

precisely, for any integer

m

1 and any starting point

x

, we dene recursively:

X

(

xm

)0 =

x

X

(

xm

)i+1 =

X

(

xm

)i+ 1

ma

(

X

(

xm

)i)+ 1p

mb

(

X

(

xm

)i)

U

i

(2.1)

where the variables

U

i are i.i.d.

d

-dimensional centered Gaussian with unit covariance

matrix.

Under (E1), if

Q

(m)(

x:

) denotes the law of

X

(

xm

)

m, then

Q

and

Q

(m)have densities

q

and

q

(m) satisfying

q

(

xx

0) +

q

(m)(

xx

0)

Ce

;C 0

jx;x 0

j 2

(2.2)

j

x

;

x

0

j

>

2

m

) j

q

(

xx

0)

;

q

(m)(

xx

0) j

C

me

;C

0

jx;x 0

j 2

:

(2.3)

(2.2) is a standard fact under (E1), and (2.3) is proved by Bally and Talay 1], 2] (since

bb

is assumed to be uniformly elliptic we do not need to consider the \perturbed" scheme

introduced in 2], and the

C

2-regularity is enough). This implies in particular that for any

bounded Borel function

f

we have for another constant still denoted by

C

:

j

Q

(m)

f

(

x

)

;

Qf

(

x

)j

C

m

k

f

k

:

(2.4)

Next, if we simply assume (E2) and if we denote by

X

(

x

) the solution to (1.1) starting at

X

0 =

x

and if we take

U

i =

p

m

(

W

i+1

m ;

W

im) in (2.1) with the same Wiener process

W

as in (1.1), we have (see e.g. 3]):

E

( sup

1im

j

X

(

xm

)i;

X

(

x

)imj 2

)

C

(5)

Under the same assumption (E2) we also have

E

(sup_s

1

j

X

(

x

)s;

X

(

y

)sj) + sup

m

E

( sup1im

j

X

(

xm

)i;

X

(

ym

)ij)

C

j

x

;

y

j

:

(2.6)

2-2)

In order to study case C, we also need to perform an Euler scheme for Equation (1.5). In fact we do this simultaneously for both equations (1.1) and (1.5): starting at

x

2

IR

d

and

y

2

IR

q, we dene

X

(

xm

)i by (2.1), and set

Y

(

xym

)0 =

y

Y

(

xym

)i+1 =

Y

(

xym

)i+

+1

m

a

0(

X

(

xm

)

i

Y

(

xym

)i) + 1 p

m

b

0(

X

(

xm

)

i

Y

(

xym

i)

U

0

i

9

=

(2.7)

where the variables

U

0

i are i.i.d.

q

-dimensional centered Gaussian with unit covariance

matrix. We clearly have the same properties for the pair (

X

(

xm

)i

Y

(

xym

)i) as we

had above for

X

(

xm

)i. In particular under (E1) and (E3), the pair (

XY

) satises an

equation of type (1.1) whose coecients satisfy (E1) on

IR

d+q. Therefore if

P

(m)(

xy

:

)

denotes the law of (

X

(

xm

)m

Y

(

xym

)m), the transitions

P

and

P

(m) have densities

p

and

p

(m) satisfying

p

(

xy

x

0

y

0) +

p

(m)(

xy

x

0

y

0)

Ce

;C 0

(jx;x 0

j 2

+jy;y 0

j 2

)

(2.8)

j

x

;

x

0

j+j

y

;

y

0

j

>

2

m

) j

p

(

xy

x

0

y

0

);

p

(m)

(

xy

x

0

y

0

)j

C

me

;C(jx;x 0

j 2

+jy;y 0

j 2

)

:

(2.9)

2-3)

We x the observation sequence (

y

0 = 0

y

1

y

2

:::

), and instead of

YN in (1.2) we

simply write

n

f

.

Our assumptions imply that there is a function

G

on

IR

d

IR

q

IR

d

IR

q such

that

y

0

7!

G

(

X

i

Y

i

X

i +1

y

0) is a version of the density of the law of

Y

i+1, conditional on

(

X

i

Y

i

X

i+1): this is trivial in case A with

G

(

xy

x

0

y

0) =

g

(

x

0

y

0) (see (A1)), and in case

C with

G

(

xy

x

0

y

0) =

p

(

xy

x

0

y

0)

=q

(

xx

0). In case B, if

'

is the density of the normal

law N(0

) on

IR

q (recall that is invertible) and if

k

(

xx

0

du

) denotes a version of

the law of the variable R 1

0

h

(

X

s)

ds

conditionally on

X

0 =

x

and

X

1 =

x

0, then the above

property is satised with the function

G

(

xy

x

0

y

0) = R

'

(

y

0

;

y

;

u

)

k

(

xx

0

du

).

Now, a version of

N

f

is given by

N

f

=

R

(

dx

0)

Q

(

x

0

dx

1)

G

(

x

0

y

0

x

1

y

1)

:::Q

(

x

N;1

dx

N)

G

(

x

N;1

y

N;1

xy

N)

f

(

x

N) R

(

dx

0)

Q

(

x

0

dx

1)

G

(

x

0

y

0

x

1

y

1)

:::Q

(

x

N;1

dx

N)

G

(

x

N;1

y

N;1

x

N

y

N)

:

(2.10) There is a recursive way to write the \lter"

N, which is as follows. First, we clearly

have

0

f

=

(

f

) = Z

f

(

x

)

(

dx

)

:

(2.11)

Next, we introduce the kernels

H

j from

IR

d into itself by

H

j

f

(

x

) =

Z

Q

(

xdx

0)

G

(

xy

j

x

0

y

(6)

Then

N is also given by the following recursive formula, starting with (2.11):

N

f

=

N;1

H

N

f

N;1

H

N1

:

(2.13)

Note that the denominator of (2.10) is

H

1

:::H

N1, and it is of course natural to

assume that it is not equal to 0 (it is the value of the density of (

Y

1

:::Y

N) at the

observed values (

y

1

:::y

N)). We give a name to this number:

"

=

H

1

:::H

N1

:

(2.14)

Next, as seen before the function

G

is bounded in case A (use (A1)) and B (because

'

is bounded) in case C we have

G

(

xy

x

0

y

0)

C=q

(

xx

0) by (2.8). Then (2.12) implies that

there exists a constant

K

such that

H

j1(

x

)

K

8

x

2

IR

d

j

= 1

:::N:

(2.15)

Then (2.14) and (2.15) yield

H

1

:::H

j1

H

1

:::H

j+11

K

:::

"

K

N;j

:

By applying (2.15) several times we also get

H

1

:::H

j;11

K

j

;2. Then since

j

f

= H1:::Hjf

H1:::Hj

1 by (2.13), we readily deduce the following estimates (for the right side, use the

fact that

j;1 is a transition measure and (2.15) again):

"

K

N;1

j ;1

H

j1

K:

(2.16)

2-4. The approximate lter.

We will replace the true lter (given by formula (2.13) for example) by an approximate lter constructed via the Euler approximation mentioned above. More precisely, we replace the kernel

H

j of (2.12) by another kernel

H

jm such that

for any bounded function

f

,

k

H

m

j

f

;

H

j

f

k

mk

f

k (2.17)

for some

m

>

0 which will go to 0 as

m

!1 (this kernel

H

nmwill be related to the Euler

approximation of stepsize 1

=m

in a way which will depend on the cases A, B or C). Then we dene the probability measures

mn by induction on

n

as follows:

m

0 =

mN

f

=

_mN;1

H

m N

f

_mN;1

H

m

N1

:

(2.18)

The quality of this approximation of

nis provided by the following result:

Proposition 2.1

Assume (2.17). For all bounded Borel functions

f

on

IR

d _{and all}

_j

₌

1

:::N

, we have

j

mj

f

;

j

f

j

mk

f

k

A

j

(2.19)

where

A

j =

j

+1

;

K

(

;1)

(7)

Proof

. a) We will prove (2.19) by induction on

j

. For

j

= 0 it is trivial with

A

0 = 0, so

we assume that it holds for

j

;1. We may write

_mj

f

;

j

f

=

1

j;1

H

j1

(

_mj;1

H

m j

f

;

j

;1

H

j

f

) + (

mj

f

)(

j;1

H

j1 ;

mj

;1

H

m j 1)

:

Since

_mj;1 and

mj are probability measures, we obtain by (2.17): j

mj

;1

H

m

j

f

;

mj ;1

H

j

f

j

mk

f

k

j

mj

f

j k

f

k

:

Taking also k

H

j

f

k

K

k

f

k (see (2.15)) into account and using the inductive hypothesis

(2.19) for

j

;1, we then easily deduce that

j

mj

f

;

j

f

j

K

N;1

"

2

mk

f

k(1 +

KA

j ;1)

:

Therefore (2.19) holds, provided we have

A

j = 2

K

N

;1

"

(1 +

KA

j;1) =

1

K

+

A

j;1

:

Since

A

0 = 0, this gives (2.20) for

A

j. 2

Finally we have another version of this result which proves useful for case B. First, we introduce the following norm on Lipschitz functions on

IR

d_:

k

f

k 0 =

k

f

k+ sup

x6=y

j

f

(

x

);

f

(

y

)j j

x

;

y

j

(2.21) (with k

f

k

0 =

1 by convention if

f

is not Lipschitz). Next we assume that we have for

some constant

K

0:

k

H

j

f

k 0

K

0

k

f

k 0

8

j

= 1

::: N:

(2.22)

Observe that this implies (2.15) with

K

0. Finally, suppose that for each

n

we have

other kernels

H

_jm _satisfying

k

H

jm

f

;

H

j

f

k

0

mk

f

k 0

(2.23) for some

0

m

>

0. Dene again the probability measures

mn by induction on

n

by (2.18).

Proposition 2.2

Assume (2.22) and (2.23). For all bounded Lipschitz functions

f

on

IR

d _{and all}

_j

_{= 1}

_:::N

_{, we have}

j

mj

f

;

j

f

j

0

mk

f

k 0

A

0

j

(2.24)

where

A

0

j =

0j+1 ;

0

K

0(

0 1)

0

= 2

K

0N

(8)

Proof

. The proof is exactly the same as for Proposition 2.1: we use the properties

j

mj ;1

H

m

j

f

;

mj ;1

H

j

f

j

0

mk

f

k 0

j

mj

f

j k

f

k k

f

k

0

and k

H

j

f

k 0

K

0

k

f

k

0 by (2.22) and the fact that under (2.22) we have (2.16) with

K

0

instead of

K

. 2

3 Case A

In this section we study our rst case A, which is the easiest one. The assumptions are (E1) and (A1) recall that

g

is dened in (A1).

A possible procedure goes as follows. We x an integer

n

, and we take for

m

n the

smallest integer p

n

. Then we perform the following steps:

Step 1:

We simulate

n

i.i.d. variables (

X

j

0)

1jn according to the law

, and for each

j

a variable

X

j

1 according to the law

Q

(mn)(

X

j 0

:

).

Notation:

At the end of Step

k

1 we will know the variables (

X

j

k)1jn, so we can

set for every function

f

on

IR

d_:

W

nk

f

= 1

_n

Xn

i=1

f

(

X

ik)

g

(

X

ik

y

k)

:

(3.1)

Then we introduce the following random probability measure on

IR

d_:

nk =

8

<

: 1

nW_nk1 P

ni=1

g

(

X

ik

y

k)

"

Xik if

W

nk1

>

0

"

0 otherwise

(3.2)

(the measure

"

0 above could indeed be replaced by any probability measure on

IR

d_).

Step

k

2

:

We simulate

n

i.i.d. variables (

X

0j

k;1)

1jn according to the law nk ;1.

Then for each

j

= 1

::: n

we simulate the random variable

X

_kj according to the law

Q

(mn)(

X

0j

k;1

:

).

We stop at the end of Step

N

, and our approximation of

N

f

will be nN

f

: as always

in this paper,

N is dened by (2.13) with a given observation (

y

1

:::y

N), while the

expectations below refer to the probability underlying our Monte-Carlo simulations.

Theorem 3.1

Assume (E1) and (A1). For all bounded Borel functions

f

on

IR

d_{, all}

k

= 1

:::N

and all

n

, we have

E

(jnk

f

;

k

f

j) Fk p

nk

f

k

P

(jnk

f

;

k

f

j

)

F

kexp

;

n

^

Fkkfk

2 !

9

>

=

>

(9)

where (with

= 2

K

N

_="

_{: recall that}

_"

_{is given in (2.14):}

F

k =

C

(4

)k

+1

;4

4

;1

F

k = 3k+1

F

^

k =

C

(8

)k (3.4)

and

C

depends only on

a

,

b

and

g

.

Remark 1:

Since

E

(

U

) =R

1

0

P

(

U

u

)

du

holds for any nonnegative variable

U

, the rst

property in (3.3) readily follows from the second one and from the fact thatR 1

0

e

;y

2= 2

dy

= p

=

2, except that the constant

F

k has to be substituted with

F

k

F

^kp

, which is much bigger: so if one is interested only in the rates of convergence (i.e. the behavior as

n

!1),

then there is no need to prove the rst half of (3.3).

Remark 2:

More generally,

E

(

U

p_{) =}R

1

0

P

(

U

u

1=p)

du

. So the second property in (3.3)

yields

E

(jnk

f

;

k

f

jp) 1=p

F

k(

p

)

p

n

k

f

k (3.5)

with

F

k(

p

) =

C

p

F

^k

F

1=p

k for some constant

C

p depending on

p

only exactly as in Remark

1, the constants

F

k(

p

) thus obtained are far too big, and a direct proof would yield smaller

constants.

Proof

. 1) For further reference, let us recall some well known estimates for i.i.d. random variables. Suppose that

1

:::

nare random variables which conditionally on a -eld

G

are independent and centered. Then we have the following two Kolmogorov exponential inequalities (see e.g. Lemma 1.5 and 1.6 in 4]), where

is arbitrary positive:

j

ij

a E

(j

ij 2

jG)

b

)

P

(j

1

n

X

i=1

ij

) 8

>

<

>

:

2exp(;

n

2

2a 2)

2exp ;

n

2

2b

2;

e

a=b

:

In the case where

i =

i;

E

(

ijG), we deduce that

j

ij

a E

(j

ij 2

jG)

b

)

P

(j

1

n

X

i=1

ij

) 8

>

<

>

:

2exp(;

n

2

8a 2)

2exp ;

n

2

2b

2;

e

2a=b

(3.6)

because with the above notation we have j

ij2

a

and

E

(j

ij 2

jG)

b

. Taking the latter

into account, we also have

E

(j

ij 2

jG)

b

)

E

(j

1

n

X

i=1

ij 2

)

b

n:

(3.7)

2) We can introduce another random measure on

IR

d _by0nk = 1

nP

ni=1

"

X

0ik (setting

(10)

and

_nj is dened by the recursive formula (2.18). In virtue of (2.4) the estimate (2.17) holds with

n=

k

g

k

=m

n

K=

p

n

, where

is some constant. In fact we will prove together the following estimates:

E

(jnk

f

;

nk

f

j)

G

k

p

n

k

f

k

= 1

:::N

(3.8)

P

(jnk

f

;

nk

f

j

) 2

I

kexp

;

n

2

8

J

kk

f

k 2

!

k

= 1

:::N

(3.9)

E

(j 0nk

f

;

nk

f

j)

G

0

k

p

n

k

f

k

= 0

:::N

;1

:

(3.10)

P

(j 0nk

f

;

nk

f

j

) 2

I

0

kexp

;

n

2

8

J

0

kk

f

k 2

!

k

= 0

:::N

;1

:

(3.11)

for some

G

k,

G

0

k,

I

k,

I

0

k,

J

k and

J

0

k to be computed later.

3) Let us begin with some preliminary estimates. Below

f

is a bounded Borel function on

IR

d_{. If}

_k

1 we have

0nk

f

;nk

f

= 1

nP

ni=1

i, where

i =

f

(

X

0ik)

;nk

f

. These

variables satisfy the assumptions for (3.6) and (3.7), with

i =

f

(

X

0ik) and w.r.t. the

-eld G generated by the variables

X

ik :

i

= 1

:::n

and with

a

= k

f

k and

b

= k

f

k 2.

Hence

E

(j 0nk

f

;nk

f

j 2)

kfk

2

n

P

(j

0nk

f

;nk

f

j

) 2exp

;

n

2

8kfk 2

:

9 > = > (3.12)

Next, the kernel

H

_nk takes the form

H

_nk

f

(

x

) =

Q

(mn)(

fg

k)(

x

), where

g

k(

x

0) =

g

(

x

0

y

k).

We have

W

_nk

f

; 0nk

;1

H

nk

f

= 1

nP

ni=1

i, where

i =

f

(

X

ik)

g

k(

X

ik);

Q

(mn)(

fg

k)(

X

0ik ;1)

:

These variables satisfy the assumptions for (3.6) and (3.7), with

i =

f

(

X

ik)

g

k(

X

ik) and

w.r.t. the -eldG generated by the variables

X

0ik

;1 :

i

= 1

::: n

and with

a

=

K

k

f

kand

b

=

K

2 k

f

k

2 (recall that here

K

=

k

g

k). Hence

E

(j

W

nk

f

; 0nk

;1

H

nk

f

j

2)

kfk 2K2

n

P

(j

W

nk

f

;

0nk ;1

H

nk

f

j

) 2exp

;

n

2

8kfk 2K2

:

9 > = > (3.13)

4) Now we proceed to prove (3.8), (3.9), (3.10) and (3.11) by induction on

k

. Since

n

0 =

and the variables

X

0i

0 are i.i.d. with law

, we can again apply (3.7) and

(3.6) to the variables

i =

f

(

X

0i 0)

;

(

f

), thus obtaining (3.10) and (3.11) for

k

= 0 with

G

0 0 =

I

0

0 =

J

0

0 = 1.

Now let

k

1 and assume that (3.10) and (3.11) hold for

k

;1. Recalling (2.19) and

(2.17) and

H

_nk1

K

and

n

K=

p

n

, we get

j

nk ;1

H

nk1

;

k ;1

H

k1

j

nk ;1

j

H

nk1;

H

k1j+j

nk ;1

H

k1

;

k ;1

H

k1

j

n(1 +

KA

k ;1)

KAp k

(11)

where we used the property 1 +

KA

k;1 =

A

k

=

coming from (2.20). Since (2.16) gives

k;1

H

k1

2

K=

, it follows that p

n

A

k )

nk ;1

H

nk1

K

:

(3.14)

In view of (2.13) and (3.2), we have _nk

f

;

nk

f

=

1

_nk;1

H

nk1

W

_nk

f

W

_nk1

;

_nk;1

H

nk1

;

W

nk1

+;

W

_nk

f

;

nk ;1

H

nk

f

!

if

W

_nk1

>

0 and _nk

f

;

nk

f

=

f

(0);

nk

f

otherwise. Therefore if

Lf

=

W

nk

f

;

nk

;1

H

nk

f

,

we obtain as soon as the condition in (3.14) is met and since obviouslyj

W

nk

f

jk

f

k

W

nk1:

jnk

f

;

nk

f

j 8

<

:

K(j

Lf

j+k

f

kj

L

1j) if

W

nk1

>

0

2k

f

k otherwise

j

Lf

j j

W

nk

f

; 0nk

;1

H

nk

f

j+j

0nk ;1

H

nk

f

;

nk ;1

H

nk

f

j

:

9

>

=

>

(3.15)

Now, (3.13) and (3.10) for

k

;1 andk

H

nk

f

k

K

k

f

kyield

E

(j

Lf

j) kfkK

p

n (1 +

G

0

k;1)

P

(j

Lf

j

) 2

e

;n

2= 32kfk

2K2

+ 2

I

0

k;1

e

;n

2= 32J

0

k;1 kfk

2K2

:

9

>

=

>

(3.16)

Now we will prove (3.8) and (3.10) for

k

. Assume rst that p

n

K

. In this

case, (3.14) yields that

W

_nk1 = 0 ) j

L

1j

K=

: thus (3.16) yields

P

(

W

nk1 = 0)

p

n(1 +

G

0

k;1). Then putting together (3.15) and the rst line of (3.16), we get (3.8) for

k

, provided

G

k 4

(1 +

G

0

k;1). On the other hand the left side of (3.8) is smaller than

2k

f

k, so (3.8) will hold for all

n

having p

n < A

k as soon as

G

k2

A

k. Further, under

(3.8) we readily deduce (3.10) from (3.12) for

k

, with

G

0

k = 1 +

G

k. Therefore (3.8) and

(3.10) will be true for all

k

,

n

, as soon as

G

k ;

4

(1 +

G

0

k;1)

_

(

A

k)

G

0

k = 1 +

G

k

G

0 0 = 1

:

At this point, one easily checks that the above properties are satised if

G

k = (2_

)(4

)k+1 ;4

4

;1

:

(3.17)

Next we prove (3.9) and (3.11) for

k

. Assume rst that p

n

A

k. Then (3.14) and

(3.15) yield

P

(jnk

f

;

nk

f

j

)

P

(j

Lf

j

K

2

) +

P

(j

L

1j

K

2

k

f

k

) +

P

(

W

nk1 = 0)1f2kfkg

:

Recall also that

P

(

W

_nk1 = 0)

P

(j

L

1j

K=

)

P

(j

L

1j

K=

2

k

f

k) if 2k

f

k

. Thus

(12)

of (3.9) is 0 as soon as

>

2k

f

kand is always smaller than 1, we see that (3.9) holds for

all

n

with p

n < A

k as soon as

I

k2 and

J

k(

A

k)

2

=

2. Further, taking into account

(3.12) we deduce from (3.9) that (3.11) also holds for

k

, as soon as

I

0

k = 1 +

I

k and

J

0

k = 4

J

k. Therefore (3.9) and (3.11) will be true for all

k

,

n

, as soon as

I

k

;

3(1 +

I

0

k;1

_

2

J

k

(4

)2

J

0

k;1

_(

A

k) 2

2

I

0

k = 1 +

I

k

J

0

k = 4

J

k

I

0 0 =

J

0

0 = 1

:

At this point, one easily checks that the above property are satised if

J

k =

1_

2

K

2 !

(26

2)k

I

k = 3k+1

;3

:

(3.18)

5) So far we have (3.8) and (3.9) for all

k

, with constants given by (3.17) and (3.18). It remains to apply (2.19). Since

n

K=

p

n

, the rst part of (3.3) is obvious with

F

k given in (3.4), because the function

x

7! x

k+1 ;x

x;1 is increasing over (0

1), and with

C

= 2 + 2

.

In order to obtain the second part of (3.3) we observe that when

nk

f

k

A

k

=

2, i.e.

when

n

2

=

k

f

k

2

(2

KA

k)

2, then the left side of (3.3) is smaller than the left side of (3.9)

written with

=

2 otherwise it is certainly smaller than 1: so the result will hold with ^

F

k

and

F

k given by (3.4), in view of (3.18). 2

Now, if we are allowed to simulate at most

r

\single" random variables, we give our nal result under the assumptions of this section. For a given

n

, the previous procedure necessitates

n

(1 +

m

n)

N

single simulations (counting for 1 each variable

X

0i and

X

0ik and

for

m

n each variable

X

ik for

k

= 1

:::N

because of the Euler scheme which we use). So

we choose

n

=

n

(

r

) to be the biggest integer

n

(1 +

m

n)

N

r

, and the simulated lter is

^_rN = n(r)

N

:

(3.19)

Then (3.3) gives, as soon as

r

6

N

(hence

n

(

r

)(

r=

2

N

) 2=3):

E

(j^rN

f

;

N

f

j) (2N)

1=3F

N

r1=3 k

f

k

P

(j^rN

f

;

N

f

j

)

F

^Nexp

;

r

2=3

FN(2N) 1=3

kfk

2

:

9

>

=

>

(3.20)

We also observe that we have strong consistency for our estimates ^_rN: indeed, taking

=

r

;1=6 in the second estimate (3.20) yields that X

r1

P

(j^rN

f

;

N

f

j

1

r

1=6)

<

1

so that ^_rN

f

converges almost surely to

N

f

as

r

!1 by Borel-Cantelli lemma.

Finally, let us emphasize that the constants obtained above are perhaps (very) big, but they also are \uniform" in the observations (

y

1

:::y

N) as soon as

H

1

:::H

N1

"

(13)

4 Case B

In this section we study our second case B. The assumptions are (E2) and (B1). The function

G

is not explicitely known in this case, but the kernel

H

j is given by the following

formula, where as above

'

is the density of the normal lawN(0

) in

IR

q:

H

j

f

(

x

) =

E

f

(

X

(

x

)1)

'

y

j;

y

j ;1

; Z

1

0

h

(

X

(

x

)s)

ds

:

(4.1)

Thus in view of (B1) and (2.6) and since the function

g

is also bounded and Lipschitz, we have (2.22) with some constant

K

0 depending on

a

,

b

and

h

.

A possible procedure goes as follows. We x an integer

n

, and we perform the following steps:

Step 1:

We simulate

n

i.i.d. variables (

X

i 0)

1in according to the law

, and we set

X

0i 0 =

X

i

0. Then for each

i

we simulate the variables (

X

i

1j)

1jn along the Euler scheme

(2.1) with stepsize 1

=n

and starting point

X

0i

0, i.e. with the same law as the sequence

(

X

(

X

0i 0

n

)j)

1jnif we use the notation (2.1).

Notation:

At the end of Step

k

1 we will know the variables (

X

ikj)

1in1jn, so

we can set for every function

f

on

IR

d_:

W

nk

f

= 1

_n

Xn

i=1

f

(

X

ikn)

'

0

@

y

k ;

y

k

;1 ;

1

n

X

j=1

h

(

X

ikj)

1

A

:

(4.2)

IR

d_:

nk =

_nW

1_nk₁Xn

i=1

'

0

@

y

k ;

y

k

;1 ;

1

n

X

j=1

h

(

X

ikj)

1

A

"

X_ikn (4.3)

(observe that here

' >

0, hence

W

_nk1

>

0 identically).

Step

k

2

:

We simulate

n

i.i.d. variables (

X

0j

k;1)

1jn according to the law nk ;1.

Then for each

i

= 1

:::n

we simulate the random variables (

X

_ikj)1jnaccording to the

law of the sequence (

X

(

X

0ik

;1

n

)j) 1jn.

N

. Our approximation of

N

f

will be nN

f

. Below, recall

that k

f

k

0 is the Lipschitz norm given by (2.21).

Theorem 4.1

Assume (E2) and (B1). For all bounded Lipschitz functions

f

on

IR

d_{, all}

k

= 1

:::N

and all

n

, we have

E

(jnk

f

;

k

f

j)

F0

k

p

nk

f

k 0

P

(jnk

f

;

k

f

j

)

F

0

kexp

;

n

^

F0

k f 0

2 !

9

>

=

>

(14)

where (with

0 = 2

K

0N

="

):

F

0

k =

C

(4

0)k+1 ;4

0

4

0 ;1

F

0

k = 3k+1

F

^0

k =

C

(8

)k

(4.5)

and

C

depends only on

a

,

b

,

h

and .

Proof

. 1) We prove rst that (2.23) holds with

0

m =

K

0

=

p

m

for some constant

, provided we choose for

H

_km _{the kernel dened by}

H

_km

_f

₍

_x

_{) =}

_E

0

@

f

(

X

(

xm

)m)

'

0

@

y

k ;

y

k

;1 ;

1

m

X

j=1

h

(

X

(

xm

)j)

1

A 1

A

(4.6)

and

_mk is dened by the recursive formula (2.18). We have

j

H

m

j

f

(

x

);

H

j

f

(

x

)j

E

(j

U

m(

x

);

V

m(

x

)j) +

E

(j

V

m(

x

);

V

(

x

)j)

where

U

m(

x

) =

f

(

X

(

xm

)m)

'

0

@

y

n ;

y

n

;1 ;

1

m

X

j=1

h

(

X

(

xm

)j)

1

A

V

m(

x

) =

f

(

X

(

x

)1)

'

0

@

y

n ;

y

n

;1 ;

1

m

X

j=1

h

(

X

(

x

)j=m)

1

A

V

(

x

) =

f

(

X

(

x

)1)

'

y

n;

y

n ;1

; Z

1

0

h

(

X

(

x

)s

ds

:

Since

'

is bounded and

f

is bounded and Lipschitz, we have

j

U

m(

x

);

V

m(

x

)

C

k

f

k 0 sup

1jm

j

X

(

xm

)j ;

X

(

x

)j=mj

and thus (2.5) yields

E

(j

U

m(

x

);

V

m(

x

)j)

C

k

f

k 0 1

p

m:

We also have

E

(j

V

m(

x

);

V

(

x

)j)

C

k

f

k

m

X

j=1 Z jm

j;1

m

E

(

j

X

(

x

)s;

X

(

x

)jmj)

ds

and by well known results about Euler schemes (see 3]) (E2) implies that this expression is smaller than

C=

p

m

. Putting these facts together shows that (2.23) holds with

0

m =

K

0

=

p

m

.

2) Now the result is proved exactly as in Theorem 3.1: we introduce another random measure on

IR

d _by0nk =

1

nP

ni=1

"

X

0ik (setting

X

0i 0 =

X

i

(15)

same way: we have to replace

n by

0

n, and

K

and

by

K

0 and

0, and use Proposition

2.2 instead of Proposition 2.1. We have to take

i =

f

(

X

ikmn)

'

0

@

y

k;

y

k ;1

;

1

n

mn

X

j=1

h

(

X

_ikj)

1

A

;

H

nk(

X

0ik

;1)

for the proof of (3.13), and the fact that k

f

k

0 comes in instead of the smaller number k

f

k

is due to the use of Proposition 2.2 (since (2.17) does not hold here). 2

r

n

(1+

n

)

N

single simulations. So we choose

n

=

n

(

r

) to be the biggest integer

n

(1+

n

)

N

r

, and the simulated lter is again given by (3.19). Then (4.4) gives, as soon

as

r

8

N

:

E

(j^rN

f

;

N

f

j) (2N)

1=4F0

N

r1=4 k

f

k

0

P

(j^rN

f

;

N

f

j

)

F

^ 0

Nexp

;

r

1=2

F0

N(2N) 1=4

kfk 0

2 !

:

9

>

=

>

(4.7)

Here also we have strong consistency for our estimates ^_rN.

5 Case C

Now we study our last case C. The assumptions are (E1) and (E3). We have seen that in this case the function

G

of subsection 2-3 is

G

(

xy

x

0

y

0) =

p

(

xy

x

0

y

0)

=q

(

xx

0), so that

H

k

f

(

x

) =

Z

p

(

xy

k;1

x

0

y

k)

f

(

x

0)

dx

0

:

(5.1)

We choose a Borel bounded function

'

from

IR

q _{into 0}

1) such that Z

'

(

y

)

dy

= 1

:= Z

j

y

j

'

(

y

)

dy <

1

:

(5.2)

We can then propose the following procedure: we x an integer

n

1. Then

m

n

denotes the smallest integer

n

1=(2+q), and we set

'

n(

y

) =

n

q=(2+q)

'

(

yn

1=(2+q))

:

(5.3)

Then we perform the following steps:

Step 1:

we simulate

n

i.i.d. variables (

X

i 0)

1in according to the law

, and for each

i

a variable (

X

i

1

Y

j

1) according to the law

P

(mn)(

X

i 0

y

(16)

Notation:

at the end of Step

k

we will know the variables (

X

ik

Y

ik)

1in, so we can set

for every function

f

on

IR

d_:

W

nk

f

= 1

_n

Xn

i=1

f

(

X

ik)

'

n(

Y

ik;

y

k)

:

(5.4)

IR

d_:

nk =

8

<

: 1

nW_nk1 P

ni=1

'

n(

Y

ik

;

y

k)

"

Xik

:

if

W

nk1

>

0

"

0 otherwise

(5.5)

Step

k

2

:

We simulate

n

i.i.d. variables (

X

0ik

;1)

1in according to the law nk ;1.

Then for each

j

= 1

::: n

we simulate the random variable (

X

_kj

Y

_kj) according to the law

P

(mn)(

X

0j

k;1

y

k ;1

:

).

N

. Our approximation of

N

f

will be nN

f

.

Theorem 5.1

Assume (E1) and (E3). For all bounded Borel functions

f

on

IR

d_{, all}

k

= 1

:::N

and all

n

, we have

E

(jnk

f

;

k

f

j) Fk

n1=(2+q) k

f

k

P

(jnk

f

;

k

f

j

)

F

kexp

;

n

2=(2+q)

^

Fkkfk

2 !

9

>

=

>

(5.6)

where

F

k,

Fk

and ^

F

k are given by (3.4) with the same

and another constant

C

depending

on

a

,

b

,

a

0,

b

0 and

'

.

Proof

. Here again the proof will be similar to the proof of Theorem 3.1. 1) We denote by

H

_nk the kernel dened by

H

nk

f

(

x

) = Z

p

(mn)

(

xy

k;1

x

0

y

0

)

f

(

x

0

)

'

n(

y

0

;

y

k)

dx

0

dy

0

(5.7) and

_nk is dened by the recursive formula (2.18).

In view of (5.1), (5.2) and (5.3) (which yieldsR

'

n(

y

)

dy

= 1) we get

H

_nk

f

(

x

);

H

k

f

(

x

) =

R

(

p

(mn)(

xy

k;1

x

0

y

0)

;

p

(

xy

k ;1

x

0

y

0))

f

(

x

0)

'

n(

y

0

;

y

k)

dx

0

dy

0

+R

(

p

(

xy

k;1

x

0

y

0)

;

p

(

xy

k ;1

x

0

y

k))

f

(

x

0)

'

n(

y

0

;

y

k)

dx

0

dy

0

:

(5.8)

The rst integral above can be divided into two parts: rst, the integral over

A

n =

f(

x

0

y

0) :

j

x

;

x

0

j+j

y

;

y

k ;1

j

>

2

=m

ng, and this bit is smaller than

C

k

f

k

=m

n by (2.9)

and R

'

n(

y

)

dy

= 1 second, the integral over the complement

A

cn which is smaller than

C

k

f

k

Z

fx 0

:jx 0

;xj2=mng

dx

0 Z

'

n(

y

0

;

y

k)

dy

0

(17)

by (2.8). Recalling that the derivative

@p

(

xy

x

0

y

0)

=@y

0 also satises (2.8) under (E1)

and (E3), and sinceR

j

y

j

'

n(

y

)

dy

=

n

;1=(2+q), the second integral in (5.8) is smaller than

C

k

f

k

n

;1=(2+q). So in view of the denition of

m

nwe nally get

j

H

nk

f

(

x

);

H

k

f

(

x

)j

C

k

f

k

1

n

1=(2+q)

:

In other words, we have

k

H

nk

f

;

H

k

f

k

nk

f

k with

n =

C

n

1=(2+q)

:

(5.9)

2) We introduce another random measure on

IR

d _by0nk= 1

nP

ni=1

"

X

0ik (setting

X

0i 0 =

X

i

0). We will prove the following estimates

E

(jnk

f

;

nk

f

j)

G

k

n

1=(2+q)

k

f

k

= 1

:::N

(5.10)

P

(jnk

f

;

nk

f

j

) 2

I

kexp

;

n

2

2+q

2

J

kk

f

k 2

!

k

= 1

:::N

(5.11)

E

(j 0nk

f

;

nk

f

j)

G

0

k

n

1=(2+q)

k

f

k

= 0

:::N

;1

:

(5.12)

P

(j 0nk

f

;

nk

f

j

) 2

I

0

kexp

;

n

2

2+q

2

J

0

kk

f

k 2

!

k

= 0

:::N

;1

:

(5.13)

for some

G

k,

G

0

k,

I

k,

I

0

k,

J

k and

J

0

k to be computed later.

3) Here again we give preliminary estimates. First (3.12) is proved exactly as in Theorem 3.1. Next,

W

_nk

f

;

0nk

;1

H

nk

f

= 1

nP

ni=1

i, where

i =

f

(

X

ik)

'

n(

X

ik ;

y

k);

H

nk

f

(

X

0ik

;1)

:

These variables satisfy the assumptions for (3.6), with

i =

f

(

X

ik)

'

n(

X

ik ;

y

k) and

w.r.t. the -eld G generated by the variables

X

0ik

;1 :

i

= 1

:::n

and with

a

=

a

f :=

C

1 k

f

k

n

q=

(2+q) and

b

=

b

f :=

C

2 k

f

k

2

n

q=(2+q) for some constants

C

1 and

C

2: indeed, the

rst of these assumptions is obvious by (5.3), and for the second one we may write

E

(j

ij 2

jF 0

k;1)

C

k

f

k

2

n

q=(2+q)

E

(

'

n(

Y

ik;

y

k)jG) =

C

k

f

k

2

n

q=(2+q)

H

nk

1(

X

0ik ;1)

and

H

_nk1

K

+

n

C

(see (2.15) and (5.9)). Then instead of (3.13) we deduce from

(3.7) and from the second inequality in (3.7) that for yet another constant

C

:

E

(j

W

nk

f

; 0nk

;1

H

nk

f

j

2)

C

kfk 2K2

n2=(2+q)

P

(j

W

nk

f

; 0nk

;1

H

nk

f

j

) 2exp

;

n

2

2+q 2

Ckfk 2

9

>

=

>

(5.14)

(up to replacing the constant

C

2 in the above denition of

b

f, we can always assume that

2

a

f

=b

f 1

=

2, hence 2

e

a=b 2 p

(18)

<

2k

f

k: in this case the second inequality comes from (3.6) since on the other hand its

left side equals 0 when

>

k

f

k, it always holds).

4) Now we proceed to prove (5.10), (5.11), (5.12) and (5.13) by induction on

k

. The last two estimates hold for

k

= 0 with

G

0

0 =

I

0

0 =

J

0

0 = 1, as in Theorem 3.1.

Now let

k

1 and assume that (5.12) and (5.13) hold for

k

;1. Using (5.9) instead

of

n

K=

p

n

, (3.14) gets substituted with (

being still another positive constant)

n

1 2+q

A

k )

nk ;1

H

nk1

K

:

(5.15)

Then under (5.15) we still have (3.15), hence (3.12) and (5.14) allow to replace (3.16) by

E

(j

Lf

j) kfkK

p

n (

C

+

G

0

k;1)

P

(j

Lf

j

) 2

e

;n

2=(2+q)2= 4Ckfk

2K2

+ 2

I

0

k;1

e

;n

2=(2+q)2= 4J

0

k;1 kfk

2K2

:

9 > = > (5.16)

Then as in Theorem 3.1 we see that (5.10) and (5.12) for

k

, provided

G

k ;

4

(

C

+

G

0

k;1)

_

(

A

k)

G

0

k = 1 +

G

k

G

0 0 = 1

which are satised by

G

k =

(1 +

C

)_

(4

)k+1 ;4

4

;1

:

(5.17)

Similarly, we obtain that (3.9) and (3.11) hold for

k

, provided

I

k ;

3(1 +

I

0

k;1

_

2

J

k

(4

)2(

J

0

k;1 _

C

)

_(

A

k)2

2

I

0

k = 1 +

I

k

J

0

k 4(

J

k _

C

)

I

0 0 =

J

0

0 = 1

:

The above properties are satised if

J

k =

C

+ 1_

2

K

2 !

(26

2)k

I

k = 3k+1

;3

:

(5.18)

5) So far we have (5.10) and (3.9') for all

k

, with constants given by (5.17) and (5.18). It remains to apply (2.19) and the fact that

n =

C=n

1=(2+q) by (5.9): Again exactly as

in Theorem 3.1 we conclude that (5.6) holds with the prescribed constants. 2

r

n

(1 +

m

n)

N

single simulations. So we choose

n

=

n

(

r

) to be the biggest

integer

n

(1 +

m

n)

N

r

, and the simulated lter is again given by (3.19). Then (5.6)

gives, as soon as

r

8

N

:

E

(j^rN

f

;

N

f

j) (2N)

1=(3+q)F

N

r1=(3+q) k

f

k

P

(j^rN

f

;

N

f

j

)

F

^Nexp

;

r

2=(3+q)

FN(2N) 1=(3+q)

kfk 2

9 > > = > > (5.19)

(19)

6 Comments and numerical results

6-1)

The regularity assumptions in (E1), (E2) and (E3) are reasonably weak, but the non-degeneracy ones could obviously be weakened: taking advantage of the result of Bally and Talay we could replace the uniform ellipticity by weak H ormander's conditions, at the expense of much stronger regularity on the coecients.

Another interesting point is when degenerates in (1.3), for example when = 0. Even then our ltering problem is not trivial, since the knowledge of R

1

0

h

(

X

t)

dt

does not

entail full knowledge of

X

1 in general. In this situation we can treat case B as a particular

case of C with degenerate coecient

b

0

b

0.

Still another interesting situation is when

b

= 0, i.e.

X

is a deterministic (not observ-able) function. The above method does not work since in the \simulation" of

X

we get a single deterministic path. However one could add a \small noise", i.e. consider (1.1) with

b

(

x

) =

"

and run the algorithm: this is a standard regularization technique used in practice.

In a dierent direction, it is interesting to see what happens when

f

is not bounded. Because of the bounds (2.2) the same results will hold when

f

hasj

f

(

x

)j

(1+j

x

jp), with k

f

kreplaced by a number depending on

and

p

, and except of course for the exponential

bounds (the second halves of (3.20), (4.7) and (5.19)).

6-2)

Our second comment is that it makes no practical sense here to achieve an \innite" precision on the approximate lter. In practice one usually wants to deduce the value

f

(

X

N) for some given function

f

(possibly a family of such functions) from the observations

(

y

1

:::y

N). Then even under the true lter

N the order of magnitude of the error made

is the standard deviation !N = !n(

y

1

:::y

N) of

f

under

N. Thus in this situation it is

somewhat meaningless to take the number

r

of simulated variables such thatj^rN

f

;

N

f

j

is much smaller than !N.

6-3)

An obvious drawback of our method is the form of the constants

F

N,.... For example

F

N is roughly of the form

F

N =

CC

0N where

C

0

>

1. If the \true" value of

F

N is really

exponential in

N

, then clearly the method is not feasible.

In order to get an idea of the true value of

F

N we have conducted a series of simulations

in the following very special case, where

d

=

q

= 1 and

dX

t =

X

t

dt

+

dW

t

X

0 = 1

dY

t =

X

t

dt

+

dW

0

t

Y

0 = 0

:

9

=

(6.1)

The true lter

N is known (and given by the Kalman-Bucy discrete-time lter). We have

run the algorithm given for case C, for the function

f

(

x

) =

x

(this is not bounded, but see the comments in 6-2 above), and for an observation set (

y

1

:::y

N) obtained by a rst

preliminary simulation of a particular path of the pair (

XY

).

(20)

Case 1:

=;

:

5, = 1, so

X

is a recurrent Ornstein-Uhlenbeck process. The standard

deviation of

f

under

N is !1 =

:

74, !2 =

:

79 and !N =

:

8 for all

N

3 (the two rst

values are smaller than the others because in our model we exactly know

X

0).

Case 2:

=;

:

5, =

:

1: this is like the above, except that the smaller value of implies

that we get \more information" on

X

from

Y

, and this is apparent from the standard deviation for the lter which is now !1 =

:

49 and !N =

:

51 for all

N

2.

Case 3:

=

:

1, = 1, so

X

is a transient Ornstein-Uhlenbeck process. The standard deviation of

f

under

N is !1 =

:

94, !2= 1

:

06 and !N = 1

:

07 for all

N

2.

The function

'

of (5.2) is the normalized indicator function of an interval ;

uu

] where

u

has been empirically adjusted so that in (5.5) one does not encounter the second stated possibility. This led us to take

u

= 20 in cases 1 and 2, and

u

= 40 in case 3.

The algorithm has been run for the following values of

n

in Theorem 5.1:

n

= 100

500

1000

2000, using of course the Euler approximation for

X

and

Y

. For each of these values it has been run 20 times, and the following tables give for some values of

N

the average absolute value of the normalized error

n

1=3

jnN

f

;

N

f

j over these 20 runs

(recall that here

N

f

is known).

Case 1

N n=100 n=500 n=1000 n=2000

5 1.3 2.1 3.3 4.4

10 2.7 4.9 5.0 7.0

20 1.1 4.6 6.0 5.4

30 1.1 2.1 3.1 4.1

40 1.1 1.9 3.0 3.1

50 1.9 4.0 5.2 7.6

60 1.4 1.5 3.4 2.5

70 5.9 9.8 10.6 13.9

80 3.0 4.9 5.7 8.5

90 3.0 6.1 9.3 10.4

Case 2

N n=100 n=500 n=1000 n=2000

5 9.6 18.6 24.0 27.3

10 0.9 2.6 3.5 4.5

20 6.7 10.9 12.5 16.8

30 4.7 7.4 9.1 10.8

40 7.2 14.2 20.2 21.7

50 2.1 3.3 4.9 7.2

60 2.4 3.3 4.3 5.6

70 4.6 8.7 9.7 10.9

80 2.7 4.9 4.9 7.8

(21)

Case 3

N n=100 n=500 n=1000 n=2000

5 4 3 12 20

10 14 21 41 74

20 40 145 214 238

30 56 444 359 164

40 146 984 893 369

50 375 1970 1581 907

60 1052 2952 4457 2760

70 2910 4973 6284 7883

80 7902 13541 17033 21516

90 21525 36873 46500 58475

These numerical results are all surprising: in the recurrent cases, the constant

F

N

seems indeed fairly independent from

N

, presumably because the process

X

forgets rather quickly its past values. In the transient case, the constant

F

N seems to behave somewhat

exponentially in

N

, as expected. There is however something unexpected, namely that the results are worse in case 2 than in case 1, while the observation noise is smaller.

Note also that since the standard deviation of

f

under

N is of the order of magnitude

equal to 1, there is no reason to increase the precision of our estimate of

N

f

much further

than 1. Recalling that the above tables give the error multiplied by

n

1=3, and looking at

the data for

N

= 90 for example, we see that we could stop at

n

= 100 already for case 1, while for the other two cases

n

= 2000 is not enough, and in case 3 the results are clearly totally inecient:

n

should be much larger (and hopefully when

n

gets large the normalized error stabilizes...).

References

1] V. Bally, D. Talay: The law of the Euler scheme for stochastic dierential equations: I. Convergence rate of the distribution fuction. Probab. Th. Related Fields,

104

, 43-60, (1996).

2] V. Bally, D. Talay: The law of the Euler scheme for stochastic dierential equations: II. Approximation of the density, Monte Carlo Methods and Applications,

2

, 93-128, (1996).

3] T.G. Kurtz, P. Protter: Wong Zakai corrections, random evolutions, and simulation schemes for SDE's, Stochastic Analysis, ed. Mayer-Wolf, Merzbach and Schwarz, Aca-demicPress: New York, 1991, 331-146.