• No results found

Imprecise stochastic processes

N/A
N/A
Protected

Academic year: 2021

Share "Imprecise stochastic processes"

Copied!
58
0
0

Loading.... (view fulltext now)

Full text

(1)

Imprecise stochastic processes

Alexander Erreygers

11

th

November 2019

(2)
(3)

discrete-time

(4)

discrete-time

(5)

We consider an infinite sequence

X

1

, X

2

, X

3

, . . . , X

n

, . . .

of uncertain variables that take values in the finite state space

X .

Example

X

n

is the weather in Oviedo n days from now, and

X

= {

,

,

}

.

We want to make inferences, for example answer the following questions:

What is the probability of

in 4 days?

What is the expected number of days until the next

day?

(6)

We consider an infinite sequence

X

1

, X

2

, X

3

, . . . , X

n

, . . .

of uncertain variables that take values in the finite state space

X .

Example

X

n

is the weather in Oviedo n days from now, and

X

= {

,

,

}

.

We want to make inferences, for example answer the following questions:

What is the probability of

in 4 days?

(7)

Modelling our uncertainty

First, we construct a tree with nodes (or situations)

s

= (

x

1

, . . . , x

n

)

,

x

i

X .

For example,

(

x

1

, x

2

, x

3

) = (

,

,

)

.

Second, we turn this into a probability tree by specifying a local probability mass

function p

s

:

X

→ [

0, 1

]

for every situation s

= (

x

1

, . . . , x

n

)

:

(8)
(9)

Modelling our uncertainty

First, we construct a tree with nodes (or situations)

s

= (

x

1

, . . . , x

n

)

,

x

i

X .

Second, we turn this into a probability tree by specifying a local probability mass

function p

s

:

X

→ [

0, 1

]

for every situation s

= (

x

1

, . . . , x

n

)

:

(10)
(11)

This way, we construct a probability measure P and

we can make inferences

—that is, compute E

P

(

f

|

s

)

for sufficiently nice functions f on

Ω—

by using backwards recursion due to the law of total probability (aka the law of

iterated expectation);

(12)

To make this tractable, one may assume that the local models

p

(

x

1

,...,x

n

)

=

p

n,x

n

=

p

x

n

1. only depend on the present,

[Markovianity]

2. do not change over time.

[time homogeneity]

This way, we end up with a homogeneous

Markov chain

and

(13)

To make this tractable, one may assume that the local models

p

(

x

1

,...,x

n

)

=

p

n,x

n

=

p

x

n

1. only depend on the present,

[Markovianity]

2. do not change over time.

[time homogeneity]

This way, we end up with a homogeneous

Markov chain

and

(14)

To make this tractable, one may assume that the local models

p

(

x

1

,...,x

n

)

=

p

n,x

n

=

p

x

n

1. only depend on the present,

[Markovianity]

2. do not change over time.

[time homogeneity]

This way, we end up with a homogeneous

Markov chain

and

(15)
(16)

What if we cannot specify

the local models

p

s

precisely?

¡impr

ecise

(17)

What if we cannot specify

the local models

p

s

precisely?

¡impr

ecise

(18)

Imprecise probabilities

is a collection of theories

(19)

Let

L denote the real vector space of all real-valued functions on X ,

and let

Σ

X

denote the subset of all probability mass functions on

X :

Σ

X

=

n

p

L : p

0,

x

X

p

(

x

) =

1

o

.

A probability mass function p induces an

expectation

operator E

p

:

L

R,

defined by

E

p

(

f

) =

x

X

p

(

x

)

f

(

x

)

for all f

L .

Recall that

E1. E

p

(

f

) ≥

min f for all f

L ;

[boundedness]

E2. E

p

(

f

+

g

) =

E

p

(

f

) +

E

p

(

g

)

for all f , g

L ;

[additivity]

(20)

Let

L denote the real vector space of all real-valued functions on X ,

and let

Σ

X

denote the subset of all probability mass functions on

X :

Σ

X

=

n

p

L : p

0,

x

X

p

(

x

) =

1

o

.

A probability mass function p induces an

expectation

operator E

p

:

L

R,

defined by

E

p

(

f

) =

x

X

p

(

x

)

f

(

x

)

for all f

L .

Recall that

E1. E

p

(

f

) ≥

min f for all f

L ;

[boundedness]

E2. E

p

(

f

+

g

) =

E

p

(

f

) +

E

p

(

g

)

for all f , g

L ;

[additivity]

(21)

Let

L denote the real vector space of all real-valued functions on X ,

and let

Σ

X

denote the subset of all probability mass functions on

X :

Σ

X

=

n

p

L : p

0,

x

X

p

(

x

) =

1

o

.

A probability mass function p induces an

expectation

operator E

p

:

L

R,

defined by

E

p

(

f

) =

x

X

p

(

x

)

f

(

x

)

for all f

L .

Recall that

E1. E

p

(

f

) ≥

min f for all f

L ;

[boundedness]

E2. E

p

(

f

+

g

) =

E

p

(

f

) +

E

p

(

g

)

for all f , g

L ;

[additivity]

(22)

Credal set

Instead of a single probability mass function p, we now consider

a

credal set

M

Σ

X

,

a non-empty, closed and convex set of probability mass functions.

The credal set

M induces a set of expectations:

{

E

p

(

f

)

: p

M

}

.

Specifically of interest are the bounds

E

M

(

f

)

:

=

min

{

E

p

(

f

)

: p

M

}

and

E

M

(

f

)

:

=

max

{

E

p

(

f

)

: p

M

}

.

(23)

Credal set

Instead of a single probability mass function p, we now consider

a

credal set

M

Σ

X

,

a non-empty, closed and convex set of probability mass functions.

A credal set is defined by constraints of the form

c

f

x

X

p

(

x

)

f

(

x

) =

E

p

(

f

)

.

The credal set

M induces a set of expectations:

{

E

p

(

f

)

: p

M

}

.

Specifically of interest are the bounds

E

M

(

f

)

:

=

min

{

E

p

(

f

)

: p

M

}

and

E

M

(

f

)

:

=

max

{

E

p

(

f

)

: p

M

}

.

(24)

Credal set

Instead of a single probability mass function p, we now consider

a

credal set

M

Σ

X

,

a non-empty, closed and convex set of probability mass functions.

The credal set

M induces a set of expectations:

{

E

p

(

f

)

: p

M

}

.

Specifically of interest are the bounds

(25)

Lower expectation

An operator E :

L

R is called a

lower expectation

(coherent lower prevision) if

LE1. E

(

f

) ≥

min f for all f

L ;

[boundedness]

LE2. E

(

f

+

g

) ≥

E

(

f

) +

E

(

g

)

for all f , g

L ;

[super-additivity]

LE3. E

(

λ f

) =

λE

(

f

)

for all f

L and λ

R

>

0

.

[positive homogeneity]

Theorem

An operator E :

L

R is a lower expectation if and only if it is the lower envelope of

some credal set

M

Σ

X

, meaning that

(26)

Lower expectation

An operator E :

L

R is called a

lower expectation

(coherent lower prevision) if

LE1. E

(

f

) ≥

min f for all f

L ;

[boundedness]

LE2. E

(

f

+

g

) ≥

E

(

f

) +

E

(

g

)

for all f , g

L ;

[super-additivity]

LE3. E

(

λ f

) =

λE

(

f

)

for all f

L and λ

R

>

0

.

[positive homogeneity]

Theorem

An operator E :

L

R is a lower expectation if and only if it is the lower envelope of

some credal set

M

Σ

X

, meaning that

(27)

Lower expectation

An operator E :

L

R is called a

lower expectation

(coherent lower prevision) if

LE1. E

(

f

) ≥

min f for all f

L ;

[boundedness]

LE2. E

(

f

+

g

) ≥

E

(

f

) +

E

(

g

)

for all f , g

L ;

[super-additivity]

LE3. E

(

λ f

) =

λE

(

f

)

for all f

L and λ

R

>

0

.

[positive homogeneity]

Theorem

An operator E :

L

R is a lower expectation if and only if it is the lower envelope of

some credal set

M

Σ

X

, meaning that

(28)

Assume that we can only assess that

p



M



and

p

(

x

1

,...,x

n

)

M

x

n

,

(1)

where

M



and

M

x

for all x

X are credal sets.

We consider three nested sets of probability trees that satisfy (1):

P

CHM

: all compatible homogeneous Markov chains;

P

CM

: all compatible Markov chains;

P

C

: all compatible probability trees.

Can we compute

E

P

(

f

|

s

)

:

=

inf

{

E

P

(

f

|

s

)

: P

P

}

and

(29)
(30)
(31)

Assume that we can only assess that

p



M



and

p

(

x

1

,...,x

n

)

M

x

n

.

(1)

We consider three nested sets of probability trees that satisfy (1):

P

CHM

: all compatible homogeneous Markov chains;

P

CM

: all compatible Markov chains;

P

C

: all compatible probability trees.

Can we compute

E

P

(

f

|

s

)

:

=

inf

{

E

P

(

f

|

s

)

: P

P

}

and

(32)

Imprecise Markov chains

Computing these tight lower and upper bounds turns out to be

intractable for

P

CHM

,

intractable for

P

CM

—at least in general,

tractable for

P

C

, because we can use backwards recursion due to the

(33)

Gert de Cooman and Filip Hermans. “Imprecise probability trees: Bridging two

theories of imprecise probability”.

In: AI 172.11 (2008), pp. 1400–1427

Gert de Cooman, Filip Hermans, and Erik Quaeghebeur. “Imprecise Markov

chains and their limit behavior”.

In: PEIS 23.4 (2009), pp. 597–635

Damjan Škulj. “Discrete time Markov chains with interval probabilities”.

In:

IJAR 50.8 (2009), pp. 1314–1329

Stavros Lopatatzidis. “Robust modelling and optimisation in stochastic

processes using imprecise probalities, with an application to queueing theory”.

PhD thesis. Ghent University, 2017

(34)

continuous-time

(35)

We consider the collection

{

X

τ

: τ

R

0

}

of uncertain variables that take values in the finite state space

X .

Example

X

τ

is the weather in Oviedo τ time units from now, and

X

= {

,

,

}

.

We want to make inferences, for example answer questions like:

What is the probability of

after 4 days?

How long do I have to wait until it is

again?

(36)

We consider the collection

{

X

τ

: τ

R

0

}

of uncertain variables that take values in the finite state space

X .

Example

X

τ

is the weather in Oviedo τ time units from now, and

X

= {

,

,

}

.

We want to make inferences, for example answer questions like:

What is the probability of

after 4 days?

(37)
(38)

Imprecise continuous-time Markov chains

Similar results as for imprecise discrete-time Markov chains,

but—for now—limited to inferences that depend on a finite number of time

points.

Damjan Škulj. “Efficient computation of the bounds of continuous time

imprecise Markov chains”.

In: AMC 250 (2015), pp. 165–180

(39)

A counting process is a model for a stream of events

X

τ

: the number of events that have occurred up to time τ, so

X

=

Z

0

.

Example

X

τ

: the number of

lightning strikes that have hit the cathedral of Oviedo.

We want to answer questions like:

What is the probability of at least one

in some time period?

What is the expected number of

in the following year?

(40)

A counting process is a model for a stream of events

X

τ

: the number of events that have occurred up to time τ, so

X

=

Z

0

.

Example

X

τ

: the number of

lightning strikes that have hit the cathedral of Oviedo.

We want to answer questions like:

What is the probability of at least one

in some time period?

What is the expected number of

in the following year?

(41)

Counting processes in general

τ

|

0

|

t

1

X

t

1

=

x

1

|

t

n

X

t

n

=

x

n

|

t

X

t

=

x

|

t

+

X

t

+

=

y

We model our beliefs by means of the transition probabilities

P

(

X

t

+

=

y

|

X

t

=

x, X

t

n

=

x

n

, . . . , X

t

1

=

x

1

|

{z

}

X

u

=

x

u

(42)

The Poisson process in particular

τ

|

0

|

t

1

X

t

1

=

x

1

|

t

n

X

t

n

=

x

n

|

t

X

t

=

x

|

t

+

X

t

+

=

y

For the Poisson process, we additionally assume that the transition probabilities

P

(

X

t

+

=

y

|

X

t

=

x, X

u

=

x

u

)

=

P

(

X

=

y

x

|

X

0

=

0

)

1. only depend on the present,

[Markovianity]

(43)

The Poisson process in particular

τ

|

0

|

t

1

X

t

1

=

x

1

|

t

n

X

t

n

=

x

n

|

t

X

t

=

x

|

t

+

X

t

+

=

y

For the Poisson process, we additionally assume that the transition probabilities

P

(

X

t

+

=

y

|

X

t

=

x, X

u

=

x

u

) =

P

(

X

t

+

=

y

|

X

t

=

x

)

1. only depend on the present,

[Markovianity]

(44)

The Poisson process in particular

τ

|

0

|

t

1

X

t

1

=

x

1

|

t

n

X

t

n

=

x

n

|

0

X

0

=

x

|

X

=

y

For the Poisson process, we additionally assume that the transition probabilities

P

(

X

t

+

=

y

|

X

t

=

x, X

u

=

x

u

) =

P

(

X

=

y

|

X

0

=

x

)

1. only depend on the present,

[Markovianity]

2. only depend on the length of the time period,

[time homogeneity]

(45)

The Poisson process in particular

τ

|

0

|

t

1

X

t

1

=

x

1

|

t

n

X

t

n

=

x

n

|

0

X

0

=

0

|

X

=

y

x

For the Poisson process, we additionally assume that the transition probabilities

P

(

X

t

+

=

y

|

X

t

=

x, X

u

=

x

u

) =

P

(

X

=

y

x

|

X

0

=

0

)

1. only depend on the present,

[Markovianity]

(46)

The rate parameter

A Poisson process is uniquely characterised by a single parameter:

the rate λ!

It has multiple interpretations, for instance:

the expected number of new events in any time period is proportional to λ:

E

P

(

X

t

+

|

X

t

=

x, X

u

=

x

u

) =

x

+

λ

∆;

(47)

The rate parameter

A Poisson process is uniquely characterised by a single parameter:

the rate λ!

It has multiple interpretations, for instance:

the expected number of new events in any time period is proportional to λ:

E

P

(

X

t

+

|

X

t

=

x, X

u

=

x

u

) =

x

+

λ

∆;

(48)

The rate parameter

A Poisson process is uniquely characterised by a single parameter:

the rate λ!

It has multiple interpretations, for instance:

the expected number of new events in any time period is proportional to λ:

E

P

(

X

t

+

|

X

t

=

x, X

u

=

x

u

) =

x

+

λ

∆;

(49)

What if we do not know the rate λ precisely,

but only know that it belongs to

(50)

The general approach

Let

P be a set of counting processes characterised by the rate interval

[

λ

, λ

]

,

and define the lower expectation

E

P

(

f

|

X

t

=

x, X

u

=

x

u

)

:

=

inf

{

E

P

(

f

|

X

t

=

x, X

u

=

x

u

)

: P

P

}

.

Choose

P such that

(i) computing E

P

(

f

|

X

t

=

x, X

u

=

x

u

)

is tractable,

(ii) E

P

(· | ·)

is Poisson-like, in the sense that

(51)

A naive imprecise Poisson process

If

P is the set of all Poisson processes with rate λ in the rate interval

[

λ

, λ

]

, then

computing E

P

(

f

|

X

t

=

x, X

u

=

x

u

)

is a one-parameter optimisation

problem;

E

P

(· | ·)

is Poisson-like;

(52)
(53)
(54)

A more involved imprecise Poisson process

If

P is the set of processes that are consistent with the rate interval

[

λ

, λ

]

,

in the sense that

λ

+

o

(

) ≤

P

(

X

t

+

=

x

+

1

|

X

t

=

x, X

u

=

x

u

) ≤

λ

+

o

(

)

,

then

a P in

P is not necessarily Markov nor homogeneous;

computing E

P

(

f

|

X

t

=

x, X

u

=

x

u

)

is non-trivial (if not infeasible).

However, it turns out that

computing E

P

(

g

(

X

t

+

) |

X

t

=

x, X

u

=

x

u

)

is tractable;

(55)

A more involved imprecise Poisson process

If

P is the set of processes that are consistent with the rate interval

[

λ

, λ

]

,

in the sense that

λ

+

o

(

) ≤

P

(

X

t

+

=

x

+

1

|

X

t

=

x, X

u

=

x

u

) ≤

λ

+

o

(

)

,

then

a P in

P is not necessarily Markov nor homogeneous;

computing E

P

(

f

|

X

t

=

x, X

u

=

x

u

)

is non-trivial (if not infeasible).

However, it turns out that

computing E

P

(

g

(

X

t

+

) |

X

t

=

x, X

u

=

x

u

)

is tractable;

(56)

References

Related documents

This report, prepared by Promar International for the United Soybean Board, focuses on the potential consumer cost of additional regulation of animal agriculture, and on food

The Daiwa Data Although the Recruit data have the advantage that they cover a large number of units over a long period, they only cover rental prices adopted in new contracts

En este trabajo se ha llevado a cabo una revisión y actualización de diferentes aspectos relacionados con la calidad del aceite de oliva virgen, se han estudiado las

Upper gastrointestinal endoscopy including esophagus, stomach, and either the duodenum and/or jejunum as appropriate; with transendoscopic ultrasound-guided intramural

Syfte: Syftet är att beskriva hur föräldrar och barn hanterar informationen när föräldern ska berätta för barnet att den har cancer, samt deras behov av stöd från vården i

Introduction: Concealing the Deep Embedding of DSLs as: trait Vector[T] { def map[U]fn: T => U: Vector[U] } The interface of the deep embedding, however, fundamentally differs in

Socio-economic situation of local population within and outside of the protected area Marek Martyniuk (District Office in Bielsk Podlaski) Martin Welp / Christoph Nowicki

In this study, we present the largest analysis of both chronic and occult HBV infection prevalence in HIV-infected and HIV-uninfected pregnant women in Botswana, a sub-Saharan