

2.1.1 Markov Chains

Consider a discrete-time stochastic process $\{X_i \mid i \ge 1\}$ on an appropriate probability space $(\Omega, \mathcal{F}, \mathbb{P})$, where each random variable$^1$ $X_i$ is discrete and assumes values in a countable set of states $\mathcal{S}$. The set $\mathcal{S}$ is called the state space of $\{X_i \mid i \ge 1\}$ and for each $S \in \mathcal{S}$, $\mathbb{P}(X_i = S)$ indicates the probability that $\{X_i \mid i \ge 1\}$ is in $S$ or visits $S$ at time-epoch $i$. If for all $i \ge 1$ and $S_1, \ldots, S_{i+1} \in \mathcal{S}$

\[ \mathbb{P}(X_{i+1} = S_{i+1} \mid X_k = S_k \text{ for all } 1 \le k \le i) = \mathbb{P}(X_{i+1} = S_{i+1} \mid X_i = S_i) \tag{2.1} \]

then $\{X_i \mid i \ge 1\}$ is said to satisfy the Markovian property and is called a discrete-time Markov chain. For any $S, T \in \mathcal{S}$, the probability $\mathbb{P}(X_{i+1} = T \mid X_i = S) = P_{S,T}(i)$ is called the (one-step) transition probability from $S$ to $T$ and indicates the probability that the discrete-time Markov chain $\{X_i \mid i \ge 1\}$ transfers from $S$ to $T$ at time-epoch $i$.

Of special interest are time-homogeneous discrete-time Markov chains, for which the transition probability $P_{S,T}(i)$ is independent of $i$ for all $S, T \in \mathcal{S}$. If $P_{S,T}(i) = P_{S,T}(j)$ for any two time-epochs $i, j$ and all $S, T \in \mathcal{S}$, then the discrete-time Markov chain is said to have stationary transition probabilities. In that case, $P_{S,T}$ is defined as

\[ P_{S,T} = \mathbb{P}(X_{i+1} = T \mid X_i = S) \]

for all time-epochs $i \ge 1$ and $S, T \in \mathcal{S}$. For any fixed enumeration of the states in $\mathcal{S}$, the matrix $P = (P_{S,T})$ with $S, T \in \mathcal{S}$ is called the transition matrix of $\{X_i \mid i \ge 1\}$.

$^1$ Strictly speaking, the $X_i$ are not random variables in the sense of appendix A because $\mathcal{S}$ is not a subset of $\bar{\mathbb{R}}$. This is however immaterial, since a complete function can be defined that uniquely assigns an element of $\bar{\mathbb{R}}$ to each element of $\mathcal{S}$.

Denoting $\mathbb{P}(X_1 = S)$ by $I_S$ for $S \in \mathcal{S}$, the $n$-dimensional joint probabilities of a time-homogeneous discrete-time Markov chain $\{X_i \mid i \ge 1\}$ can be written as

\[ \mathbb{P}(X_i = S_i \text{ for all } 1 \le i \le n) = I_{S_1} \cdot \prod_{i=1}^{n-1} P_{S_i, S_{i+1}} \]

The probability $I_S$ is called the initial probability of $S$ and indicates the probability that $\{X_i \mid i \ge 1\}$ departs from state $S$. The distribution $I$ of initial probabilities over $\mathcal{S}$ is referred to as the initial distribution of $\{X_i \mid i \ge 1\}$. In this thesis, time-homogeneous discrete-time Markov chains are simply called Markov chains.
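The product formula for the joint probabilities can be sketched in a few lines of Python. This is a minimal illustration only: the function name `path_probability`, the dictionary representation and the two-state chain are assumptions made for this example and do not come from the thesis. Exact fractions are used so that the result can be compared without rounding.

```python
from fractions import Fraction


def path_probability(initial, transition, path):
    """Joint probability I_{S_1} * prod_{i=1}^{n-1} P_{S_i, S_{i+1}}.

    `initial` maps states to initial probabilities, `transition` maps
    (S, T) pairs to transition probabilities; missing entries count as 0.
    `path` is assumed to be a non-empty sequence of states.
    """
    prob = initial.get(path[0], Fraction(0))
    for s, t in zip(path, path[1:]):
        prob *= transition.get((s, t), Fraction(0))
    return prob


# Hypothetical two-state chain, used only for illustration.
initial = {"A": Fraction(1, 4), "B": Fraction(3, 4)}
transition = {
    ("A", "A"): Fraction(1, 2), ("A", "B"): Fraction(1, 2),
    ("B", "A"): Fraction(1, 3), ("B", "B"): Fraction(2, 3),
}

# P(X_1 = A, X_2 = B, X_3 = B) = 1/4 * 1/2 * 2/3 = 1/12
p = path_probability(initial, transition, ["A", "B", "B"])
```

For a path of length one the product is empty, so the function returns the initial probability itself.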

Definition 2.1 (Markov Chain) A Markov chain is a discrete-time stochastic process $\{X_i \mid i \ge 1\}$, where each discrete random variable $X_i$ assumes values in a countable state space $\mathcal{S}$ such that, for each time-epoch $i \ge 1$, the transition probabilities

\[ \mathbb{P}(X_{i+1} = S_{i+1} \mid X_k = S_k \text{ for all } 1 \le k \le i) = \mathbb{P}(X_{i+1} = S_{i+1} \mid X_i = S_i) \]

are independent of $i$, for any $S_1, \ldots, S_{i+1} \in \mathcal{S}$.
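To make the definition concrete, the following sketch samples a finite prefix of a trajectory. It is an illustration under stated assumptions, not part of the thesis: the function `simulate`, the nested-dictionary layout and the two-state chain are hypothetical. Each step draws the next state from the row of the current state only, which mirrors the Markovian property, and the same rows are used at every step, which mirrors time-homogeneity.

```python
import random


def simulate(initial, transition, n, seed=0):
    """Sample (X_1, ..., X_n) for n >= 1 from a time-homogeneous chain.

    `initial` maps states to initial probabilities and `transition[s]`
    maps successor states to transition probabilities out of s.
    """
    rng = random.Random(seed)
    states = list(initial)
    current = rng.choices(states, weights=[initial[s] for s in states])[0]
    path = [current]
    for _ in range(n - 1):
        row = transition[current]  # depends on the current state only
        current = rng.choices(list(row), weights=list(row.values()))[0]
        path.append(current)
    return path


# Hypothetical two-state chain, used only for illustration.
initial = {"A": 0.25, "B": 0.75}
transition = {
    "A": {"A": 0.5, "B": 0.5},
    "B": {"A": 1 / 3, "B": 2 / 3},
}
path = simulate(initial, transition, 10)
```

Fixing the seed makes the sampled prefix reproducible, which is convenient when comparing runs.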

An important theorem in Markov theory is the existence theorem [39, 24]. It states that for any countable set of states $\mathcal{S}$, sequence $\{I_S \mid S \in \mathcal{S}\}$ and matrix $(P_{S,T})$ with $S, T \in \mathcal{S}$, satisfying respectively

\[ I_S \ge 0 \ \text{ and } \ \sum_{S \in \mathcal{S}} I_S = 1, \qquad P_{S,T} \ge 0 \ \text{ and } \ \sum_{T \in \mathcal{S}} P_{S,T} = 1 \]

there exists a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and a Markov chain $\{X_i \mid i \ge 1\}$ defined on it with state space $\mathcal{S}$, initial distribution $I$ and transition matrix $P = (P_{S,T})$. As a result, the Markov chain $\{X_i \mid i \ge 1\}$ is completely determined by the triple $(\mathcal{S}, I, P)$, with which it is therefore often conveniently represented.

If the state space $\mathcal{S}$ is finite, the Markov chain can be visualised as a graph. Each state in $\mathcal{S}$ is represented as a node labelled with the corresponding name of the state. For every non-zero transition probability $P_{S,T}$, a directed arrow is drawn from the node representing $S$ to the node representing $T$, labelled with transition probability $P_{S,T}$. For any state $S \in \mathcal{S}$ with non-zero initial probability, a symbol > directed towards the node representing $S$ is drawn, labelled with initial probability $I_S$. In case $I_S = 1$ for state $S$, the label 1 to the symbol > is usually omitted. This is illustrated with an example.

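Example 2.1 Let $(\mathcal{S}, I, P)$ represent a Markov chain, where $\mathcal{S} = \{A, B, C\}$, $I$ is defined as $I_A = 1$, $I_B = I_C = 0$, and where the transition matrix $P$ is given by

\[ P = \begin{pmatrix} P_{A,A} & P_{A,B} & P_{A,C} \\ P_{B,A} & P_{B,B} & P_{B,C} \\ P_{C,A} & P_{C,B} & P_{C,C} \end{pmatrix} = \begin{pmatrix} 0 & \tfrac{1}{3} & \tfrac{2}{3} \\ 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ 0 & 1 & 0 \end{pmatrix} \]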

Figure 2.1: Visualisation of a Markov chain as a graph.

A graphical representation of this Markov chain is depicted in figure 2.1.
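The triple of Example 2.1 can also be written down programmatically. The sketch below assumes that the transition matrix has the rows (0, 1/3, 2/3), (0, 1/2, 1/2) and (0, 1, 0) for the enumeration A, B, C; the helper names `is_valid_chain`, `graph_edges` and `start_markers` are hypothetical. It checks the conditions of the existence theorem and lists the arrows and start markers that the graph visualisation would contain.

```python
from fractions import Fraction as F

states = ["A", "B", "C"]
initial = {"A": F(1), "B": F(0), "C": F(0)}
# Assumed reading of the matrix of Example 2.1 (rows A, B, C).
P = {
    "A": {"A": F(0), "B": F(1, 3), "C": F(2, 3)},
    "B": {"A": F(0), "B": F(1, 2), "C": F(1, 2)},
    "C": {"A": F(0), "B": F(1), "C": F(0)},
}


def is_valid_chain(states, initial, P):
    """Check I_S >= 0, sum I_S = 1, P_{S,T} >= 0 and every row sum = 1."""
    ok_initial = (all(initial[s] >= 0 for s in states)
                  and sum(initial[s] for s in states) == 1)
    ok_rows = all(
        all(P[s][t] >= 0 for t in states)
        and sum(P[s][t] for t in states) == 1
        for s in states
    )
    return ok_initial and ok_rows


def graph_edges(states, P):
    """One labelled arrow per non-zero transition probability."""
    return [(s, t, P[s][t]) for s in states for t in states if P[s][t] != 0]


def start_markers(states, initial):
    """One '>' marker per state with non-zero initial probability."""
    return [(s, initial[s]) for s in states if initial[s] != 0]


valid = is_valid_chain(states, initial, P)
edges = graph_edges(states, P)
markers = start_markers(states, initial)
```

Under the assumed matrix the graph has five arrows and a single start marker at A, whose label 1 would be omitted in the drawing.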

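Probability Space The remainder of this section elaborates on the probability space $(\Omega, \mathcal{F}, \mathbb{P})$ on which a Markov chain $\{X_i \mid i \ge 1\}$ represented by $(\mathcal{S}, I, P)$ is defined. The sample space $\Omega$ is the set of all infinite sequences $\mathbf{S} = (S_1, S_2, \ldots)$, where $S_i \in \mathcal{S}$ for all $i \ge 1$. Hence, $\Omega$ is the set $\mathcal{S}^{\infty}$ of all infinite state sequences. A cylinder of rank $n$ is a subset $C_n$ of $\Omega$ of the form $C_n = \{\mathbf{S} \in \Omega \mid \mathbf{S}_{1..n} \in A\}$ where $A \subseteq \mathcal{S}^n$ and $\mathbf{S}_{1..n}$ denotes the prefix $(S_1, \ldots, S_n)$ of $\mathbf{S}$. In case $A$ is a singleton set $\{\mathbf{S}\}$, the set $\{\mathbf{T} \in \Omega \mid \mathbf{T}_{1..n} = \mathbf{S}\}$, which is denoted by $\overline{\mathbf{S}}$, is called a thin cylinder of rank $n$. The σ-algebra $\mathcal{F}$ is the σ-algebra generated by the set of cylinders and the probability measure $\mathbb{P}$ is defined as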

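\[ \mathbb{P}(\overline{\mathbf{S}_{1..n}}) = I_{S_1} \cdot \prod_{i=1}^{n-1} P_{S_i, S_{i+1}} \]

for each thin cylinder $\overline{\mathbf{S}_{1..n}} \in \mathcal{F}$ with $\mathbf{S} = (S_1, S_2, \ldots)$ in $\Omega$. Since any cylinder $C_n$ can be written as a countable union of disjoint thin cylinders of the same rank,

\[ \mathbb{P}(C_n) = \sum_{\overline{\mathbf{S}_{1..n}} \subseteq C_n} \mathbb{P}(\overline{\mathbf{S}_{1..n}}) \]

by the property of countable additivity. Furthermore, it can be shown that any subset $C$ of $\Omega$ can be written as a countable intersection of cylinders in $\mathcal{F}$. For example,

\[ \{\mathbf{S}\} = \bigcap_{n=1}^{\infty} \overline{\mathbf{S}_{1..n}} \]

for any $\mathbf{S} \in \Omega$. Since $\mathcal{F}$ is closed under countable intersection, it follows that $C \in \mathcal{F}$. As a result $\mathcal{F} = 2^{\Omega}$, which implies that any discrete random variable defined on $(\Omega, \mathcal{F}, \mathbb{P})$ is measurable. For brevity, the probability $\mathbb{P}(\{\mathbf{S}\})$ on a singleton set $\{\mathbf{S}\} \in \mathcal{F}$ is also denoted by $\mathbb{P}(\mathbf{S})$.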
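The probabilities on thin cylinders and on cylinders of rank $n$ can be computed directly from the defining formulas. The following sketch is an illustration only: the two-state chain and the function names `thin_cylinder_prob` and `cylinder_prob` are assumptions. A cylinder is represented by its base, that is, the set of length-$n$ prefixes it admits, and its probability is the sum over the corresponding disjoint thin cylinders.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical two-state chain, used only for illustration.
states = ["A", "B"]
initial = {"A": F(1, 4), "B": F(3, 4)}
P = {
    ("A", "A"): F(1, 2), ("A", "B"): F(1, 2),
    ("B", "A"): F(1, 3), ("B", "B"): F(2, 3),
}


def thin_cylinder_prob(prefix):
    """Probability of the thin cylinder fixed by a non-empty prefix."""
    prob = initial[prefix[0]]
    for s, t in zip(prefix, prefix[1:]):
        prob *= P[(s, t)]
    return prob


def cylinder_prob(base):
    """Probability of the cylinder whose base is a set of equal-length prefixes.

    Distinct prefixes of the same length give disjoint thin cylinders,
    so the probabilities add up.
    """
    return sum(thin_cylinder_prob(prefix) for prefix in base)


# Cylinder of rank 3: all infinite sequences whose third state is B.
base = {p for p in product(states, repeat=3) if p[2] == "B"}
prob_third_is_B = cylinder_prob(base)

# The cylinder whose base is all of S^3 is the whole sample space.
prob_whole_space = cylinder_prob(set(product(states, repeat=3)))
```

Taking the base to be all prefixes of a given length gives a quick sanity check, since that cylinder equals $\Omega$ and must have probability 1.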

Next to the probability on infinite state sequences, the probability on finite state sequences is frequently used in this chapter. Although the probability measure $\mathbb{P}$ is only defined for (sets of) infinite state sequences, the notation $\mathbb{P}(\mathbf{S})$ is also used for any finite state sequence $\mathbf{S} \in \mathcal{S}^n$, which is justified by defining $\mathbb{P}(\mathbf{S}) = \mathbb{P}(\overline{\mathbf{S}})$, the probability of the thin cylinder $\overline{\mathbf{S}}$ of rank $n$.

Figure 2.2: Visualisation of different cylinder types. a) Thin cylinder $\overline{\mathbf{S}_{1..n}}$ of rank $n$. b) Cylinder of rank $n$. c) Generalised cylinder.

Now, a generalisation of the probability on cylinders is introduced. Let $U$ be a set of finite state sequences (possibly of different lengths). The set $U$ is called proper if no state sequence in $U$ is a prefix of any other state sequence in $U$. In case $U$ is proper, the probability $\mathbb{P}(U)$ on $U$ is defined as $\mathbb{P}(U) = \mathbb{P}(\overline{U})$, where

\[ \overline{U} = \bigcup_{\mathbf{S} \in U} \overline{\mathbf{S}} \]

and $\overline{\mathbf{S}}$ is the thin cylinder of all infinite state sequences that start with $\mathbf{S}$. Such a set $\overline{U}$ will be called a generalised cylinder. Figure 2.2 illustrates the differences between thin cylinders, cylinders and generalised cylinders.
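The properness condition and the probability on a generalised cylinder can be sketched as follows. The chain and the names `is_proper` and `generalised_prob` are assumptions made for this illustration. Because no sequence in a proper set is a prefix of another, the corresponding thin cylinders are pairwise disjoint, so the probability of their union is the sum of the individual probabilities.

```python
from fractions import Fraction as F

# Hypothetical two-state chain, used only for illustration.
initial = {"A": F(1, 4), "B": F(3, 4)}
P = {
    ("A", "A"): F(1, 2), ("A", "B"): F(1, 2),
    ("B", "A"): F(1, 3), ("B", "B"): F(2, 3),
}


def prob(seq):
    """Probability on a finite state sequence (its thin cylinder)."""
    p = initial[seq[0]]
    for s, t in zip(seq, seq[1:]):
        p *= P[(s, t)]
    return p


def is_proper(U):
    """True if no sequence in U is a prefix of another sequence in U."""
    return not any(s != t and t[:len(s)] == s for s in U for t in U)


def generalised_prob(U):
    """Probability on a proper set U of finite sequences of any lengths."""
    if not is_proper(U):
        raise ValueError("U must be proper (prefix-free)")
    return sum(prob(seq) for seq in U)


# Sequences that reach A for the first time within three time-epochs.
U = {("A",), ("B", "A"), ("B", "B", "A")}
p_U = generalised_prob(U)  # 1/4 + 1/4 + 1/6

# Not proper: ("A",) is a prefix of ("A", "B").
improper = {("A",), ("A", "B")}
```

The properness check is quadratic in the size of the set, which is adequate for small illustrative sets.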

A proper set of finite state sequences $U$ will be called an initial set if all the state sequences in $U$ have the same initial state. For an initial set $U$, $\mathbb{P}^*(U)$ is defined as

\[ \mathbb{P}^*(U) = \frac{1}{I_{S_1}} \sum_{\mathbf{S} \in U} \mathbb{P}(\mathbf{S}) \]

and will be referred to as the conditional probability on $U$. Intuitively, $\mathbb{P}^*(U)$ denotes the sum of the probabilities on the state sequences $\mathbf{S}$ in $U$ conditional on departing from state $S_1$. A proper set of finite state sequences $U$ is called a final set if all state sequences in $U$ have the same final state.

Consider a final set of finite state sequences $U$ and an initial set of finite state sequences $V$ such that the initial state of the state sequences in $V$ is equal to the final state of the state sequences in $U$. Then the concatenation of $U$ and $V$, denoted by $U \circ V$, is defined as

\[ U \circ V = \{(S_1, \ldots, S_{n-1}, S_n = T_1, T_2, \ldots, T_m) \mid (S_1, \ldots, S_n) \in U,\ (T_1, \ldots, T_m) \in V\} \]

Because both $U$ and $V$ are proper, the concatenation $U \circ V$ is also proper. Hence, $\mathbb{P}(U \circ V) = \mathbb{P}(U) \cdot \mathbb{P}^*(V)$. If $U$ is also an initial set, then $\mathbb{P}^*(U \circ V) = \mathbb{P}^*(U) \cdot \mathbb{P}^*(V)$.
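The product rule for concatenations can be checked numerically on a small case. The sketch below is an illustration under assumptions: the two-state chain, the sets `U` and `V`, and the helper names `cond_prob` and `concat` are hypothetical, and the conditional probability on an initial set is computed as the sum of sequence probabilities divided by the initial probability of the shared first state, which is assumed to be non-zero.

```python
from fractions import Fraction as F

# Hypothetical two-state chain, used only for illustration.
initial = {"A": F(1, 4), "B": F(3, 4)}
P = {
    ("A", "A"): F(1, 2), ("A", "B"): F(1, 2),
    ("B", "A"): F(1, 3), ("B", "B"): F(2, 3),
}


def prob(seq):
    """Probability on a finite state sequence."""
    p = initial[seq[0]]
    for s, t in zip(seq, seq[1:]):
        p *= P[(s, t)]
    return p


def prob_set(U):
    """Probability on a proper set of finite state sequences."""
    return sum(prob(seq) for seq in U)


def cond_prob(V):
    """Conditional probability on an initial set V."""
    (first,) = {seq[0] for seq in V}  # ValueError unless V is an initial set
    return prob_set(V) / initial[first]  # assumes I_first > 0


def concat(U, V):
    """Glue each sequence of U to each sequence of V at the shared state."""
    return {u + v[1:] for u in U for v in V}


U = {("A", "B"), ("B", "B")}        # final set: all sequences end in B
V = {("B", "A"), ("B", "B", "A")}   # initial set: all sequences start in B

lhs = prob_set(concat(U, V))
rhs = prob_set(U) * cond_prob(V)
```

Here `lhs` and `rhs` agree, as the product rule predicts for a final set followed by a matching initial set.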

It can now be observed that the initial probability $I_S = \mathbb{P}(X_1 = S)$, indicating that the Markov chain $\{X_i \mid i \ge 1\}$ departs from state $S$, is in fact the probability $\mathbb{P}((S))$ on the finite state sequence $(S)$. On the other hand, the probability $P_{S,T} = \mathbb{P}(X_{i+1} = T \mid X_i = S)$, indicating that the Markov chain $\{X_i \mid i \ge 1\}$ transfers from state $S$ to state $T$ (at any time-epoch $i \ge 1$), is in fact the conditional probability $\mathbb{P}^*((S, T))$ on the finite state sequence $(S, T)$ by the property of time-homogeneity.
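Both observations follow in one line from the definitions, as the short derivation below shows. It assumes that the probability on $(S, T)$ is meant as the conditional probability $\mathbb{P}^*$ on the initial set $\{(S, T)\}$ and that $I_S > 0$, so that the division is defined.

```latex
\[
  \mathbb{P}((S)) = I_S ,
  \qquad
  \mathbb{P}^*((S, T))
    = \frac{\mathbb{P}((S, T))}{I_S}
    = \frac{I_S \cdot P_{S,T}}{I_S}
    = P_{S,T}
  \quad (I_S > 0).
\]
```

The first identity is the product formula for a sequence of length one, where the product over transitions is empty.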