Lecture_07-09 Markov Processes

(1)

Markov processes

A Markov process is a random process where the value of the random variable at instant n depends only on its immediate past value at instant n − 1.

In a Markov process the random variable represents the state of the system at a given instant n.

If the state space of a Markov process is discrete, the Markov process is called a Markov chain. In that case the states are labeled by the integers 0, 1, 2, etc. A Markov chain stays in a particular state for a certain amount of time called the hold time.

(2)

Discrete-Time Markov Chains

In a discrete-time Markov chain the hold time assumes discrete values. Time is measured at specific instants:

t = T₀, T₁, T₂, . . .

Most often the discrete time values are equally spaced:

t = nT

(3)

Memoryless Property

of Markov Chains

In a discrete-time Markov chain the present state of the system S(n) depends only on its immediate past state S(n − 1).

The probability that the Markov chain is in state s_i at time n is a function of its past state s_j at time n − 1 only:

p[S(n) = s_i] = f(s_j)

for all i ∈ S and j ∈ S, where S is the set of all possible states of the system.

Transition from a state to the next state is determined by a transition

(4)

Markov Chain

Transition Matrix

The conditional probability p_ij(n) is the probability that the system is in state i at time step n given that the past state at time n − 1 was state j:

p_ij(n) = p[S(n) = i | S(n − 1) = j]

If the transition probability is independent of the time step index n, we have a homogeneous Markov chain:

p_ij = p[S(n) = i | S(n − 1) = j]

The probability of finding the system in state i at the nth step is

s_i(n) = p[S(n) = i]

(5)

Markov Chain

Transition Matrix

Assume the number of possible states is m, so that the indices i and j lie in the ranges 1 ≤ i ≤ m and 1 ≤ j ≤ m. Then

s(n) = P s(n − 1)

where P is the state transition matrix of dimension m × m and s(n) is the distribution vector (or state vector), whose components are the probabilities of the system being in each state at time step n:

s(n) = [s_1(n) s_2(n) . . . s_m(n)]^t

(6)

Markov Chain

Transition Matrix

The component s_i(n) of the distribution vector s(n) at time n indicates the probability of finding our system in state s_i at that time. Because s describes the probabilities of all m possible states, we must have

s_1(n) + s_2(n) + . . . + s_m(n) = 1

(7)

Markov Chain

Transition Matrix

The columns of a Markov chain transition matrix represent the present state, while the rows represent the next state.

(8)

Example 1

Assume an on–off data source that generates equal-length packets with probability a per time step. The channel introduces errors in the transmitted packets such that the probability that a packet is received in error is e.

(9)

(10)

Markov Matrices

The definition of the transition matrix P results in a matrix with peculiar properties:

1. The number of rows equals the number of columns. Thus P is a square matrix.

2. All the elements of P are real numbers. Thus P is a real matrix.

3. 0 ≤ p_ij ≤ 1 for all values of i and j. Thus P is a nonnegative matrix.

4. The sum of each column is exactly 1.

5. The magnitudes of all eigenvalues obey the condition |λ_i| ≤ 1. Thus the spectral radius of P equals 1.
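Properties 1 to 4 can be checked mechanically. The following is a minimal Python sketch, assuming P is stored as a list of rows so that P[i][j] holds p_ij; the function name `is_column_stochastic`, the tolerance, and the example matrix are illustrative choices and do not come from the lecture.

```python
def is_column_stochastic(P, tol=1e-12):
    """Check properties 1-4: square, real entries in [0, 1], unit column sums."""
    m = len(P)
    # Property 1: every row must have m entries for P to be square.
    if any(len(row) != m for row in P):
        return False
    # Properties 2 and 3: every entry is a real number between 0 and 1.
    if any(not (0.0 <= p <= 1.0) for row in P for p in row):
        return False
    # Property 4: each column sums to 1 (up to rounding error).
    return all(abs(sum(P[i][j] for i in range(m)) - 1.0) <= tol
               for j in range(m))


# Hypothetical two-state example: column j lists where state j goes next.
P_example = [[0.9, 0.5],
             [0.1, 0.5]]
print(is_column_stochastic(P_example))
```

Property 5 is not tested here because it needs an eigenvalue routine, which is outside the standard library.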

(11)

Markov Matrices

The transition matrix is square, real, and nonnegative, and each of its columns sums to 1. Such a matrix is termed a column stochastic matrix or Markov matrix.

Theorem 1 Let P be any m × m column stochastic matrix. Then P has 1 as an eigenvalue.
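A short justification of Theorem 1, given here as a sketch and not necessarily the proof used in the lecture: write 1 for the all-ones column vector (a symbol introduced only for this note). Because every column of P sums to 1,

```latex
\mathbf{1}^{t} P = \mathbf{1}^{t}
\quad\Longleftrightarrow\quad
P^{t}\,\mathbf{1} = \mathbf{1},
\qquad
\det(P - \lambda I) = \det(P^{t} - \lambda I).
```

So λ = 1 is an eigenvalue of P^t, and since P and P^t share the same characteristic polynomial, it is also an eigenvalue of P.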

(12)

Queuing systems

Queuing systems are a special type of Markov chain in which customers arrive and line up to be serviced by servers.

A queue is characterized by

• the number of arriving customers at a given time step,

• the number of servers,

• the size of the waiting area for customers,

• the number of customers that can leave in one time step.

(13)

The Diagonals of P

(14)

Eigenvalues and

Eigenvectors of P

Theorem 3 Given a column stochastic matrix P and the eigenvector x corresponding to the eigenvalue λ = 1, the sum of the elements of x is nonzero and can be taken as unity, i.e. σ(x) = 1.

Theorem 4 Given a column stochastic matrix P and an eigenvector x

corresponding to the eigenvalue λ = 1, the sum of the elements of x

(15)

Constructing

the State Transition Matrix P

1. Verify that the system under study displays the Markov property.

2. Identify and label all possible states of the system.

3. Draw all possible transitions between the states on the state transition diagram, or identify the corresponding elements of the state transition matrix.

4. Obtain the probability of every transition in the state diagram.

5. Construct the transition matrix.
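The last three steps can be sketched in Python. This is only an illustration under stated assumptions: states are numbered from 0, the transition diagram is given as a dictionary mapping (present state, next state) to a probability, and the helper name `build_transition_matrix` and the two-state diagram are hypothetical.

```python
def build_transition_matrix(m, transitions, tol=1e-12):
    """Build a column stochastic matrix from {(present, next): probability}.

    States are numbered 0..m-1 (an assumption of this sketch).
    """
    P = [[0.0] * m for _ in range(m)]
    for (present, nxt), prob in transitions.items():
        # Row index = next state, column index = present state.
        P[nxt][present] = prob
    # Sanity check: the probabilities leaving each state must sum to 1.
    for j in range(m):
        total = sum(P[i][j] for i in range(m))
        if abs(total - 1.0) > tol:
            raise ValueError(f"transitions out of state {j} sum to {total}, not 1")
    return P


# Hypothetical two-state diagram, used only for illustration.
diagram = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}
P = build_transition_matrix(2, diagram)
```

The column check catches a common mistake in step 4, namely forgetting a transition (often the self-loop) out of some state.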

(16)

Transient Behavior

We can write the distribution vector at time step n = 1 as

s(1) = P s(0)

and at time step n = 2 as

s(2) = P s(1) = P [P s(0)] = P^2 s(0)

The distribution vector at step n is

s(n) = P^n s(0)

(17)

Example 2

A computer memory system is composed of very fast on-chip cache, fast on-board RAM, and slow hard disk. When the computer accesses a block from one memory system, the next block required could come from any of the three available memory systems.

This is modeled as a Markov chain with the state of the system representing the memory from which the current block came:

• state s1 corresponds to the cache,

• state s2 corresponds to the RAM,

• state s3 corresponds to the hard disk.

(18)

Example 2

Find the probability that after three consecutive block accesses the system will read a block from the cache.

The starting distribution vector is s(0) = [1 0 0]^t

The distribution vector after three iterations is

(19)

Properties of P^n

Lemma 1 Given a column stochastic matrix P, then P^n, for n ≥ 0, is also column stochastic.

Lemma 2 The state vector s(n) at instant n is given by

s(n) = P^n s(0)

This vector must be a distribution vector for all values of n ≥ 0.
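Lemma 2 can be illustrated with a short Python sketch that applies s ← P s a total of n times, which gives the same result as forming P^n first. The function name `distribution_at`, the two-state matrix, and the starting vector are assumptions made for illustration, not data from the lecture.

```python
def distribution_at(P, s0, n):
    """Return s(n) = P^n s(0) by applying s <- P s a total of n times."""
    m = len(P)
    s = list(s0)
    for _ in range(n):
        s = [sum(P[i][j] * s[j] for j in range(m)) for i in range(m)]
    return s


# Hypothetical chain and starting vector (not taken from the lecture).
P = [[0.9, 0.5],
     [0.1, 0.5]]
s2 = distribution_at(P, [1.0, 0.0], 2)
print(s2, sum(s2))
```

Up to rounding, the components of the returned vector add up to 1 for every n, as the lemma requires.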

• P^n remains a column stochastic matrix.

• A nonzero element in P can increase or decrease in P^n but can never become zero.

• A zero element in P could remain zero or increase in P^n but can never

(20)

Finding s(n)

Alternative techniques for obtaining an expression for s(n) or P^n include the following:

1. Repeated multiplications to get P^n.

2. Expanding the initial distribution vector s(0).

3. Diagonalizing the matrix P.
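The first technique, repeated multiplication, might look as follows in Python. This is a minimal sketch: the helper names `mat_mul` and `mat_pow` and the example matrix are hypothetical, and no attempt is made at efficiency.

```python
def mat_mul(A, B):
    """Product of two m x m matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mat_pow(P, n):
    """Return P^n for n >= 0 by multiplying n times, starting from the identity."""
    m = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = mat_mul(P, result)
    return result


# Hypothetical two-state matrix, used only for illustration.
P = [[0.9, 0.5],
     [0.1, 0.5]]
P5 = mat_pow(P, 5)
column_sums = [sum(P5[i][j] for i in range(2)) for j in range(2)]
print(column_sums)
```

The column sums of P^5 stay at 1 up to rounding, which is what one would expect for a power of a column stochastic matrix.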

(21)

Renaming the States

Renaming or relabeling the states amounts to exchanging the rows and columns of the transition matrix.

The exchange of states is achieved by pre- and post-multiplying the transition matrix:
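As a sketch of one way to carry out such an exchange, assume the multiplying matrix is an elementary exchange matrix E (the identity with two rows swapped, a name chosen for this note). Pre-multiplying by E swaps two rows of P and post-multiplying by E swaps the matching columns. The functions and the three-state matrix below are hypothetical.

```python
def exchange_matrix(m, a, b):
    """Identity matrix of size m with rows a and b swapped."""
    E = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    E[a], E[b] = E[b], E[a]
    return E


def mat_mul(A, B):
    """Product of two m x m matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def rename_states(P, a, b):
    """Swap the labels of states a and b by computing E P E."""
    E = exchange_matrix(len(P), a, b)
    # E is symmetric and equal to its own inverse, so E P E swaps
    # rows a and b and then columns a and b.
    return mat_mul(mat_mul(E, P), E)


# Hypothetical three-state matrix (each column sums to 1).
P = [[0.5, 0.2, 0.0],
     [0.5, 0.3, 0.4],
     [0.0, 0.5, 0.6]]
Q = rename_states(P, 0, 2)
```

After the call, the entry that described the old state 2 staying in state 2 sits in position Q[0][0].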

(22)

Markov Chains at Equilibrium

A homogeneous Markov chain is a Markov chain in which the transition probabilities are not a function of time t or n, for the continuous-time or discrete-time cases, respectively.

At steady state, as n → ∞, the distribution vector s settles down to a unique value and satisfies the equation

P s = s

In that case s is an eigenvector of P with corresponding eigenvalue λ = 1.

(23)

Finding Steady-State

Distribution Vector s

We can use one of the following approaches for finding the steady-state distribution vector s.

1. Repeated multiplication of P to obtain P^n for high values of n.

2. Eigenvector corresponding to eigenvalue λ = 1 for P.

3. Difference equations.

4. Z-transform (generating functions).

5. Direct numerical techniques for solving a system of linear equations.

6. Iterative numerical techniques for solving a system of linear equations.
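The simplest of these approaches to sketch is repeated application of P until the vector stops changing. The Python below is a minimal illustration, assuming a uniform starting vector and a nonperiodic chain; the function name `steady_state`, the stopping tolerance, and the example matrix are assumptions of this note.

```python
def steady_state(P, tol=1e-12, max_iter=10000):
    """Iterate s <- P s from a uniform start until successive vectors agree."""
    m = len(P)
    s = [1.0 / m] * m
    for _ in range(max_iter):
        nxt = [sum(P[i][j] * s[j] for j in range(m)) for i in range(m)]
        if max(abs(a - b) for a, b in zip(nxt, s)) < tol:
            return nxt
        s = nxt
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical two-state chain, used only for illustration.
P = [[0.9, 0.5],
     [0.1, 0.5]]
s = steady_state(P)
print(s)
```

For this example the result can be checked by hand: P s = s requires 0.1 s_1 = 0.5 s_2, which together with s_1 + s_2 = 1 gives s = [5/6 1/6]^t.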

(24)

Balance Equations

In steady state the probability of finding ourselves in state s_i is given by the balance equation

From the definition of transition probability, we can write

(25)

Balance Equations

The LHS represents all the probabilities of flowing out of state i.

The RHS represents all the probabilities of flowing into state i.

The above equation describes the flow balance for state i.
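For reference, here is a sketch of how a flow balance of this kind follows from the definitions used earlier (P s = s at steady state and unit column sums). The slide's own equations are not reproduced in this text, so the form below is a reconstruction under those assumptions.

```latex
% Row i of P s = s at steady state:
s_i = \sum_{j=1}^{m} p_{ij}\, s_j
% Column i of P sums to 1, because P is column stochastic:
\sum_{j=1}^{m} p_{ji} = 1
% Multiplying the left-hand side of the first line by the second:
s_i \sum_{j=1}^{m} p_{ji} = \sum_{j=1}^{m} p_{ij}\, s_j
```

The left-hand side is the probability of being in state i times the total probability of moving from i to any state j. The right-hand side collects the probability arriving in state i from every state j.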

(26)

Reducible Markov Chains

Reducible Markov chains describe systems that have particular states such that once we visit one of those states, we cannot visit other states.

If, starting at any state, we are able to reach any other state directly, in one step, or indirectly, through one or more intermediate states, the Markov chain is termed an irreducible Markov chain.
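The reachability condition can be tested with a graph search over the nonzero entries of P. The sketch below assumes the column convention used in these slides (a nonzero P[i][j] means a transition from state j to state i); the function names and both example matrices are hypothetical.

```python
from collections import deque


def reachable_from(P, start):
    """Set of states reachable from `start` in zero or more steps."""
    m = len(P)
    seen = {start}
    queue = deque([start])
    while queue:
        j = queue.popleft()
        for i in range(m):
            # A nonzero P[i][j] is a one-step transition from j to i.
            if P[i][j] > 0 and i not in seen:
                seen.add(i)
                queue.append(i)
    return seen


def is_irreducible(P):
    """True when every state can reach every other state."""
    m = len(P)
    return all(len(reachable_from(P, j)) == m for j in range(m))


# Hypothetical examples: the second chain can never leave state 0.
P_irreducible = [[0.9, 0.5],
                 [0.1, 0.5]]
P_reducible = [[1.0, 0.3],
               [0.0, 0.7]]
print(is_irreducible(P_irreducible), is_irreducible(P_reducible))
```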

(27)

Closed and Transient States

The states of a reducible Markov chain are divided into two sets: closed states (C) and transient states (T).

When the system is in T, it can make a transition to either T or C. However, once the system is in C, it can never make a transition to T again, no matter how long we iterate.

In other words, the probability of making a transition from a closed state to a transient state is exactly zero.

When C consists of only one state, that state is called an absorbing state.
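One possible way to separate the two sets numerically is sketched below. It treats a state as transient when it can reach some state from which there is no way back, and as closed otherwise. It also flags a state as absorbing when its self-transition probability is 1. The function names and the three-state matrix are assumptions of this note, not part of the lecture.

```python
def reachable_from(P, start):
    """Set of states reachable from `start` (column j -> row i convention)."""
    m = len(P)
    seen = {start}
    stack = [start]
    while stack:
        j = stack.pop()
        for i in range(m):
            if P[i][j] > 0 and i not in seen:
                seen.add(i)
                stack.append(i)
    return seen


def classify_states(P, tol=1e-12):
    """Return (closed, transient, absorbing) lists of state indices."""
    m = len(P)
    reach = [reachable_from(P, j) for j in range(m)]
    closed, transient = [], []
    for j in range(m):
        # State j is closed if every state it can reach can also reach j.
        if all(j in reach[i] for i in reach[j]):
            closed.append(j)
        else:
            transient.append(j)
    absorbing = [j for j in closed if abs(P[j][j] - 1.0) <= tol]
    return closed, transient, absorbing


# Hypothetical chain: state 0 is absorbing, states 1 and 2 are transient.
P = [[1.0, 0.3, 0.0],
     [0.0, 0.2, 0.5],
     [0.0, 0.5, 0.5]]
closed, transient, absorbing = classify_states(P)
print(closed, transient, absorbing)
```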

(28)

Transition Matrix

of Reducible Markov Chains

The transition matrix P for a reducible Markov chain can be partitioned into the canonic form

P = [ C A ]
    [ 0 T ]

where the closed states are listed first and

C = square column stochastic matrix

A = rectangular nonnegative matrix

T = square column substochastic matrix

(29)

(30)

Composite

Reducible Markov Chains

In the general case, the reducible Markov chain could be composed of two or more sets of closed states.

In that case, the canonic form for the transition matrix P for a reducible Markov chain could be expanded into several subsets of noncommunicating closed states.

(31)

Identifying

Reducible Markov Chains

Theorem 5 Let P be the transition matrix of a Markov chain whose eigenvalue λ = 1 corresponds to an eigenvector s. Then this chain is reducible if and only if s has one or more zero elements.

(32)

Periodic Markov Chains

Periodic Markov chains are Markov chains whose distribution vector s(n) repeats its values at regular intervals of time and never settles down to an equilibrium value no matter how long we iterate.

(33)

Transient Behavior

Consider the abstract transition diagram, where the states of the Markov chain are divided into groups and allowed transitions occur only between adjacent groups.

(34)

Types of

periodic Markov Chains

Strongly periodic Markov chains - the distribution vector repeats its values with a period γ > 1. The state transition matrix satisfies the relation

P^γ = I

In a strongly periodic Markov chain, the probability of returning to the starting state after γ time steps is unity for all states of the system.

Weakly periodic Markov chains - the system shows periodic behavior only when n → ∞. The distribution vector repeats its values with a period γ > 1 only when n → ∞. The state transition matrix satisfies the relation

P^γ ≠ I

In a weakly periodic Markov chain, the probability of returning to the

(35)

Strongly periodic Markov chain

The transition matrix.

For a strongly periodic Markov chain with period γ:

p_ij(n + γ) = p_ij(n)

s(n + γ) = s(n)

s(n + γ) = P^γ s(n)

P^γ s(n) = s(n)

(P^γ − I) s(n) = 0

P^γ = I

where I is the unit matrix and γ > 1.

(36)

Example 4

The following transition matrix corresponds to a strongly periodic Markov chain. Estimate the period of the chain.

By performing repeated multiplications we see that P^2 = I. The period of this Markov chain is γ = 2.

The given transition matrix is also known as a circulant matrix, where adjacent rows or columns advance by one position.
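The repeated-multiplication procedure can be sketched in Python. The example's own matrix is not reproduced in this text, so the 2 × 2 and 3 × 3 matrices below are stand-ins chosen for illustration, and the function names and the search limit are assumptions.

```python
def mat_mul(A, B):
    """Product of two m x m matrices stored as lists of rows."""
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def find_period(P, max_period=100, tol=1e-12):
    """Smallest gamma with P^gamma = I, or None if none is found."""
    m = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    Q = [row[:] for row in P]  # Q holds P^gamma
    for gamma in range(1, max_period + 1):
        if all(abs(Q[i][j] - identity[i][j]) <= tol
               for i in range(m) for j in range(m)):
            return gamma
        Q = mat_mul(P, Q)
    return None


# Stand-in matrices (assumed for illustration, not the slide's matrix).
P_two = [[0.0, 1.0],
         [1.0, 0.0]]
P_three = [[0.0, 0.0, 1.0],
           [1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
print(find_period(P_two), find_period(P_three))
```

A return value of None within the search limit suggests the chain is not strongly periodic.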

(37)

The Transition Matrix

Determinant

Theorem 3 Let P be the transition matrix of a strongly periodic Markov chain. The determinant of P will be given by

∆ = ±1

The properties of the transition matrix P of a strongly periodic Markov chain:

1. The m × m transition matrix P is full rank: rank(P) = m.

2. The rows and columns of the transition matrix P are linearly independent.

3. The transition matrix P can never have an eigenvalue with |λ| < 1.

(38)

Transition Matrix

Diagonalization

Theorem 7 Let P be the transition matrix of a strongly periodic Markov chain with period γ > 1. Then P is diagonalizable.

(39)

Transition Matrix Elements

Theorem 9 Let P be the m ×m transition matrix of a Markov chain. The

Markov chain is strongly periodic if and only if the elements of P are all zeros except for m elements that have 1s arranged such that each column and each row contains only a single 1 entry in a unique

(40)

Canonic Form for P

A strongly periodic Markov chain will have its m × m transition matrix expressed in the canonic form

This matrix can be obtained by proper ordering of the states and will have a period γ = m.

(41)

Transition Diagram

Each set of periodic classes for a strongly periodic Markov chain consists of one state only, and the number of states equals the period of the

(42)

Composite Strongly Periodic

Markov Chains

A composite strongly periodic Markov chain can be expressed, through proper ordering of states, in the canonic form

where C_i is an m_i × m_i circulant matrix whose period is γ_i = m_i.

The period of the composite Markov chain is given by the equation
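The equation itself is not reproduced in this text. A sketch under the assumption that it is the least common multiple of the block periods: since P^γ = I requires every block to satisfy C_i^γ = I, γ must be a multiple of each γ_i = m_i, and the smallest such value is their least common multiple. The function name `composite_period` is hypothetical.

```python
from functools import reduce
from math import gcd


def composite_period(block_sizes):
    """Least common multiple of the block periods gamma_i = m_i."""
    return reduce(lambda a, b: a * b // gcd(a, b), block_sizes, 1)


# Hypothetical block sizes, used only for illustration.
print(composite_period([2, 3]))   # blocks of period 2 and 3
print(composite_period([2, 4]))   # blocks of period 2 and 4
```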

(43)

Transition diagram

(44)

Weakly Periodic

Markov Chains

To generalize the structure of a circulant matrix, we replace each “1” with a block matrix and obtain the canonic form for a weakly periodic Markov chain

where the block-diagonal matrices are square zero matrices and the nonzero matrices W_i could be rectangular, but the sum of each of their columns is unity since P is column stochastic.

(45)

Reducible

Periodic Markov Chains

A reducible periodic Markov chain is one in which the transition matrix can be partitioned into the canonic form

where

C = square column stochastic periodic matrix

A = rectangular nonnegative matrix

T = square column substochastic matrix

Some of the eigenvalues of the transition matrix will lie on the unit circle. The other eigenvalues will be inside the unit circle.

(46)

Identification

of Markov Chains

A fast and direct way to classify a Markov chain is to study its eigenvalues and the eigenvector corresponding to λ = 1.

Nonperiodic Markov chain

This is the case when only one eigenvalue is 1 and all other eigenvalues lie inside the unit circle:

|λ_i | < 1

For large values of the time index n →∞, all modes will decay except the

(47)

Identification

of Markov Chains

Strongly periodic Markov chain

This is the case when all the eigenvalues of the transition matrix lie on the unit circle:

where 1 ≤ i ≤ γ .
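To make the unit-circle statement concrete, the sketch below uses a single cyclic chain with γ states, where each state moves to the next with probability 1, and checks that the γ-th roots of unity are eigenvalues. The slide's own equation is not reproduced in this text, so the roots-of-unity form is stated as an assumption that holds for this cyclic example; the function names are hypothetical.

```python
import cmath


def shift_matrix(m):
    """Cyclic chain: state j always moves to state (j + 1) mod m."""
    return [[1.0 if i == (j + 1) % m else 0.0 for j in range(m)]
            for i in range(m)]


def eigen_residual(P, lam, v):
    """Largest component of |P v - lam v|; near zero for an eigenpair."""
    m = len(P)
    Pv = [sum(P[i][j] * v[j] for j in range(m)) for i in range(m)]
    return max(abs(Pv[i] - lam * v[i]) for i in range(m))


gamma = 4
P = shift_matrix(gamma)
residuals = []
for k in range(gamma):
    lam = cmath.exp(2j * cmath.pi * k / gamma)   # k-th root of unity, |lam| = 1
    v = [lam ** (-i) for i in range(gamma)]      # candidate eigenvector
    residuals.append(eigen_residual(P, lam, v))
print(max(residuals))
```

Each residual is at rounding-error level, so all γ eigenvalues of this chain have magnitude 1 and none of the modes decays.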

(48)

Identification

of Markov Chains

Weakly periodic Markov chain

This is the case when γ eigenvalues of the transition matrix lie on the unit circle, and the rest of the eigenvalues lie inside the unit circle. Thus we can write

|λ_i | = 1 when 1 ≤ i ≤ γ

|λ_i | < 1 when γ < i ≤ m

The eigenvalues that lie on the unit circle will be given by

(49)

Identification

of Markov Chains

Weakly periodic Markov chain

For large values of the time index n → ∞, some of the modes will decay but γ of them will not, and the distribution vector will never settle down to a stable value.