Markov processes
A Markov process is a random process where the value of the random variable at instant n depends only on its immediate past value at instant n −1.
In a Markov process the random variable represents the state of the system at a given instant n.
If the state space of a Markov process is discrete, the Markov process is called a Markov chain. In that case the states are labeled by the
integers 0, 1, 2, etc. A Markov chain stays in a particular state for a certain amount of time called the hold time.
Discrete-Time Markov Chains
In a discrete-time Markov chain the hold time assumes discrete values. Time is measured at specific instants:
t = T0, T1, T2, . . .
Most often the discrete time values are equally spaced:
t = nT
Memoryless Property
of Markov Chains
In a discrete-time Markov chain the present state of the system S(n) depends only on its immediate past state S(n − 1).
The probability that the Markov chain is in state si at time n is a function of its past state sj at time n − 1 only:
p [S(n) = si ] = f (sj )
for all i ∈S and j ∈S, where S is the set of all possible states of the system.
Transition from a state to the next state is determined by a transition probability that depends only on the present state.
Markov Chain
Transition Matrix
The conditional probability pij (n) is the probability that the system is in state i at time step n given that its state at time n − 1 was state j:
pij (n) = p [S(n) = i | S(n − 1) = j ]
If the transition probability is independent of the time step index n we have a
homogeneous Markov chain
pij = p [S(n) = i | S(n − 1) = j ]
The probability of finding the system in state i at the nth step is
si (n) = p [S(n) = i ]
Assume the number of possible states to be m and the indices i and j lie in the range 1 ≤ i ≤ m and 1 ≤ j ≤ m. Then
s(n) = P s(n − 1)
where P is the state transition matrix of dimension m × m
and s(n) is the distribution vector (or state vector), whose components are the probabilities of the system being in each state at time step n:
s(n) = [ s1(n) s2(n) · · · sm(n) ]^t
The component si (n) of the distribution vector s(n) at time n indicates the probability of finding our system in state si at that time. Because s describes probabilities of all possible m states, we must have
s1(n) + s2(n) + · · · + sm(n) = 1
The columns of a Markov chain transition matrix represent the present state, while the rows represent the next state.
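As a minimal sketch of one update s(n) = P s(n − 1) under this column convention, the Python fragment below uses a hypothetical two-state matrix. The numerical values and the function name step are illustrative assumptions, not taken from the slides.

```python
def step(P, s):
    """Return P s, where P[i][j] is the probability of moving from state j to state i."""
    m = len(s)
    return [sum(P[i][j] * s[j] for j in range(m)) for i in range(m)]

# Hypothetical two-state chain: column j lists where present state j goes next.
P = [[0.9, 0.5],
     [0.1, 0.5]]

s0 = [1.0, 0.0]      # start in the first state with certainty
s1 = step(P, s0)     # equals the first column of P
s2 = step(P, s1)     # distribution after two steps
print(s1, s2)
```

Starting from a state known with certainty, the first update simply reads off the matching column of P.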
Example 1
Assume an on–off data source that generates equal-length packets with probability a per time step. The channel introduces errors in the transmitted packets such that the probability that a packet is received in error is e.
Markov Matrices
The definition of the transition matrix P results in a matrix with peculiar properties:
1. The number of rows equals the number of columns. Thus P is a square matrix.
2. All the elements of P are real numbers. Thus P is a real matrix.
3. 0 ≤ pij ≤ 1 for all values of i and j. Thus P is a nonnegative matrix.
4. The sum of each column is exactly 1.
5. The magnitudes of all eigenvalues obey the condition |λi | ≤ 1. Since λ = 1 is always an eigenvalue, the spectral radius of P equals 1.
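Properties 1 to 4 can be checked mechanically. The sketch below is one possible check. The function name is_column_stochastic and the tolerance are assumptions made for illustration, and the eigenvalue condition in property 5 is not tested here.

```python
def is_column_stochastic(P, tol=1e-12):
    """Check that P is square, has entries in [0, 1], and has columns summing to 1."""
    m = len(P)
    if any(len(row) != m for row in P):
        return False                      # not square
    if any(not (0.0 <= P[i][j] <= 1.0) for i in range(m) for j in range(m)):
        return False                      # an entry is not a probability
    return all(abs(sum(P[i][j] for i in range(m)) - 1.0) <= tol for j in range(m))

good = [[0.9, 0.5],
        [0.1, 0.5]]
bad = [[0.9, 0.5],
       [0.2, 0.5]]    # first column sums to 1.1
print(is_column_stochastic(good), is_column_stochastic(bad))
```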
The transition matrix is square, real, and nonnegative, and each of its columns sums to 1. Such a matrix is termed a column stochastic matrix or Markov matrix.
Theorem 1 Let P be any m × m column stochastic matrix. Then P has 1 as an eigenvalue.
Queuing systems
Queuing systems are a special type of Markov chain in which customers arrive and line up to be serviced by servers.
A queue is characterized by
the number of arriving customers at a given time step,
the number of servers,
the size of the waiting area for customers,
the number of customers that can leave in one time step.
The Diagonals of P
Eigenvalues and
Eigenvectors of P
Theorem 3 Given a column stochastic matrix P and the eigenvector x
corresponding to the eigenvalue λ = 1, the sum of the elements of x is nonzero and could be taken as unity, i.e. σ(x) = 1.
Theorem 4 Given a column stochastic matrix P and an eigenvector x corresponding to an eigenvalue λ ≠ 1, the sum of the elements of x is exactly zero, i.e. σ(x) = 0.
Constructing
the State Transition Matrix P
1. Verify that the system under study displays the Markov property.
2. All possible states of the system are identified and labeled.
3. All possible transitions between the states are either drawn on the state transition diagram, or the corresponding elements of the state transition matrix are identified.
4. The probability of every transition in the state diagram is obtained.
5. The transition matrix is constructed.
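Steps 3 to 5 can be sketched in code. The helper below is a hypothetical illustration: the name build_transition_matrix, the 0-based state labels, and the example probabilities are all assumptions. It places each transition probability in column "present state" and row "next state", then verifies that every column sums to 1.

```python
def build_transition_matrix(m, transitions):
    """transitions maps (present_state, next_state) to a probability; states are 0..m-1."""
    P = [[0.0] * m for _ in range(m)]
    for (present, nxt), prob in transitions.items():
        P[nxt][present] = prob            # column = present state, row = next state
    for j in range(m):
        total = sum(P[i][j] for i in range(m))
        if abs(total - 1.0) > 1e-12:
            raise ValueError(f"column {j} sums to {total}, not 1")
    return P

# Hypothetical two-state diagram.
P = build_transition_matrix(2, {(0, 0): 0.7, (0, 1): 0.3,
                                (1, 0): 0.4, (1, 1): 0.6})
print(P)
```

The column-sum check catches a state diagram in which some outgoing transition was forgotten.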
Transient Behavior
We can write the distribution vector at time step n = 1 as
s(1) = P s(0)
and at time step n = 2 as
s(2) = P s(1) = P [P s(0)] = P^2 s(0)
The distribution vector at step n is therefore
s(n) = P^n s(0)
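The recursion can be unrolled in a few lines. This is a minimal sketch that applies P to s(0) a total of n times. The matrix is a hypothetical two-state example and the function names are assumptions.

```python
def step(P, s):
    """One update s(n) = P s(n - 1)."""
    m = len(s)
    return [sum(P[i][j] * s[j] for j in range(m)) for i in range(m)]

def distribution_at(P, s0, n):
    """Apply P to s0 n times, which equals P^n s0."""
    s = list(s0)
    for _ in range(n):
        s = step(P, s)
    return s

P = [[0.9, 0.5],      # hypothetical column stochastic matrix
     [0.1, 0.5]]
s0 = [1.0, 0.0]
s_two = distribution_at(P, s0, 2)
print(s_two)
```

Iterating on the vector costs m^2 multiplications per step, which is cheaper than forming P^n explicitly when only one starting vector is of interest.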
Example 2
A computer memory system is composed of very fast on-chip cache, fast on-board RAM, and slow hard disk. When the computer is accessing a block from one of these memory systems, the next block required could come from any of the three available memory systems.
This is modeled as a Markov chain with the state of the system representing the memory from which the current block came:
state s1 corresponds to the cache,
state s2 corresponds to the RAM,
state s3 corresponds to the hard disk.
Find the probability that after three consecutive block accesses the system will read a block from the cache.
The starting distribution vector is s(0) = [ 1 0 0 ]^t
The distribution vector after three iterations is s(3) = P^3 s(0), and its first component s1(3) is the required probability.
Properties of P^n
Lemma 1 Given a column stochastic matrix P, then P^n, for n ≥ 0, is also column stochastic.
Lemma 2 The state vector s(n) at instant n is given by
s(n) = P^n s(0)
This vector must be a distribution vector for all values of n ≥ 0.
P^n remains a column stochastic matrix.
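A nonzero element in P can increase or decrease in P^n. It cannot become zero when every diagonal element of P is nonzero, but it can in a periodic chain (for instance when P^2 = I).
A zero element in P could remain zero or increase in P^n but can never become negative.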
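The following sketch computes a matrix power by repeated multiplication and checks two of the statements above on a hypothetical matrix: the columns of the power still sum to 1, and a zero element of P has grown into a positive one. The function names and the matrix values are assumptions.

```python
def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def mat_pow(P, n):
    """Return P^n for n >= 0 by repeated multiplication."""
    m = len(P)
    result = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

P = [[0.0, 0.5],      # hypothetical matrix with a zero in position (0, 0)
     [1.0, 0.5]]
P2 = mat_pow(P, 2)
column_sums = [sum(P2[i][j] for i in range(2)) for j in range(2)]
print(P2, column_sums)
```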
Finding s(n)
Alternative techniques for obtaining an expression for s(n) or Pn include the following:
1. Repeated multiplications to get P^n.
2. Expanding the initial distribution vector s(0).
3. Diagonalizing the matrix P.
Renaming the States
Renaming or relabeling the states amounts to exchanging the rows and columns of the transition matrix.
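The exchange of states i and j is achieved by pre- and post-multiplying the transition matrix by the elementary exchange matrix E, which is the identity matrix with rows i and j exchanged:
P' = E P E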
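A small sketch of the relabeling, assuming the exchange is written as E P E with E the identity matrix whose rows a and b are swapped. Rather than forming E, the code applies the equivalent index permutation to the rows and columns of P. The function name and the matrix values are hypothetical.

```python
def exchange_states(P, a, b):
    """Return E P E, where E is the identity with rows a and b exchanged."""
    m = len(P)
    perm = list(range(m))
    perm[a], perm[b] = perm[b], perm[a]
    # Element (i, j) of E P E is element (perm[i], perm[j]) of P.
    return [[P[perm[i]][perm[j]] for j in range(m)] for i in range(m)]

P = [[0.1, 0.2, 0.3],     # hypothetical three-state matrix
     [0.4, 0.5, 0.6],
     [0.5, 0.3, 0.1]]
Q = exchange_states(P, 0, 2)
print(Q)
```

Because E is its own inverse, exchanging the same pair of states twice restores the original matrix.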
Markov Chains at Equilibrium
A homogeneous Markov chain is a Markov chain in which the transition probabilities are not a function of time t or n, for the continuous-time or discrete-time cases, respectively.
At steady state as n → ∞ the distribution vector s settles down to a
unique value and satisfies the equation
P s = s
In that case s is an eigenvector for P with corresponding eigenvalue
λ = 1.
Finding Steady-State
Distribution Vector s
We can use one of the following approaches for finding the steady-state distribution vector s.
1. Repeated multiplication of P to obtain P^n for high values of n.
2. Eigenvector corresponding to eigenvalue λ = 1 for P.
3. Difference equations.
4. Z-transform (generating functions).
5. Direct numerical techniques for solving a system of linear equations.
6. Iterative numerical techniques for solving a system of linear equations.
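As a hedged illustration of the iterative route, the sketch below applies P repeatedly to a starting vector until successive vectors agree. This is a variant of approach 1 that iterates on s rather than forming P^n. The function name, tolerance, and example matrix are assumptions, and the loop is only expected to converge for a nonperiodic chain.

```python
def steady_state(P, tol=1e-12, max_steps=10_000):
    """Iterate s <- P s from the uniform vector until the change falls below tol."""
    m = len(P)
    s = [1.0 / m] * m
    for _ in range(max_steps):
        nxt = [sum(P[i][j] * s[j] for j in range(m)) for i in range(m)]
        if max(abs(nxt[i] - s[i]) for i in range(m)) < tol:
            return nxt
        s = nxt
    raise RuntimeError("no convergence within max_steps")

P = [[0.9, 0.5],      # hypothetical two-state chain
     [0.1, 0.5]]
s = steady_state(P)
print(s)
```

For this two-state matrix the result can be compared with the exact solution of P s = s, which is s = [5/6, 1/6].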
Balance Equations
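In steady state the probability of finding ourselves in state si is given by the balance equation
si = Σ_j pij sj
where the sum runs over all states j. From the definition of transition probability, each column of P sums to one, Σ_j pji = 1, so we can write
si Σ_j pji = Σ_j pij sj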
The LHS represents all the probabilities of flowing out of state i.
The RHS represents all the probabilities of flowing into state i.
The above equation describes the flow balance for state i.
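The flow balance can be verified numerically. The sketch assumes the balance equation has the form si Σ_j pji = Σ_j pij sj, with both sums running over all states, and uses a hypothetical two-state chain whose steady-state vector [5/6, 1/6] solves P s = s.

```python
def flow_out(P, s, i):
    """Left-hand side: probability flowing out of state i (column i of P times s[i])."""
    return s[i] * sum(P[j][i] for j in range(len(P)))

def flow_in(P, s, i):
    """Right-hand side: probability flowing into state i (row i of P times s)."""
    return sum(P[i][j] * s[j] for j in range(len(P)))

P = [[0.9, 0.5],          # hypothetical two-state chain
     [0.1, 0.5]]
s = [5.0 / 6.0, 1.0 / 6.0]   # satisfies P s = s for this matrix

balances = [(flow_out(P, s, i), flow_in(P, s, i)) for i in range(2)]
print(balances)
```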
Reducible Markov Chains
Reducible Markov chains describe systems that have particular states such that once we visit one of those states, we cannot visit other states.
If, starting at any state, we are able to reach any other state either directly, in one step, or indirectly, through one or more intermediate states, the Markov chain is termed an irreducible Markov chain.
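This reachability definition translates directly into a graph search. The sketch below treats every nonzero pij as an edge from state j to state i and declares the chain irreducible when every state reaches all others. The function names and both example matrices are assumptions.

```python
def reachable_from(P, start):
    """Set of states reachable from start in zero or more steps."""
    m = len(P)
    seen = {start}
    stack = [start]
    while stack:
        j = stack.pop()
        for i in range(m):
            if P[i][j] > 0 and i not in seen:   # transition j -> i is possible
                seen.add(i)
                stack.append(i)
    return seen

def is_irreducible(P):
    m = len(P)
    return all(len(reachable_from(P, j)) == m for j in range(m))

irreducible = [[0.9, 0.5],
               [0.1, 0.5]]
reducible = [[1.0, 0.5],     # the first state is never left once entered
             [0.0, 0.5]]
print(is_irreducible(irreducible), is_irreducible(reducible))
```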
Closed and Transient States
The states of a reducible Markov chain are divided into two sets: closed states (C) and transient states (T).
When the system is in T , it can make a transition to either T or C. However, once our system is in C, it can never make a transition to T
again no matter how long we iterate.
In other words, the probability of making a transition from a closed state to a transient state is exactly zero.
When C consists of only one state, then that state is called an absorbing
state.
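The two notions above can be tested directly on a transition matrix. In this sketch a set of states is treated as closed when no probability leaks from it to any state outside it, and a state is absorbing when its diagonal element equals 1. The function names, tolerance, and matrix are hypothetical.

```python
def is_closed(P, states, tol=1e-12):
    """True if no transition leads from a state in `states` to a state outside it."""
    m = len(P)
    inside = set(states)
    return all(P[i][j] <= tol
               for j in inside
               for i in range(m) if i not in inside)

def absorbing_states(P, tol=1e-12):
    """States j with pjj = 1, so the chain never leaves them."""
    return [j for j in range(len(P)) if abs(P[j][j] - 1.0) <= tol]

P = [[1.0, 0.5],     # hypothetical chain: state 0 is closed, state 1 is transient
     [0.0, 0.5]]
print(is_closed(P, {0}), is_closed(P, {1}), absorbing_states(P))
```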
Transition Matrix
of Reducible Markov Chains
The transition matrix P for a reducible Markov chain could be partitioned into the canonic form
P = [ C  A ]
    [ 0  T ]
where
C = square column stochastic matrix
A = rectangular nonnegative matrix
T = square column substochastic matrix
Composite
Reducible Markov Chains
In the general case, the reducible Markov chain could be composed of two or more sets of closed states.
In that case, the canonic form for the transition matrix P for a reducible Markov chain could be expanded into several subsets of
noncommunicating closed states
Identifying
Reducible Markov Chains
Theorem 5 Let P be the transition matrix of a Markov chain whose
eigenvalue λ = 1 corresponds to an eigenvector s. Then this chain is reducible if and only if s has one or more zero elements.
Periodic Markov Chains
Periodic Markov chains are Markov chains whose distribution vector s(n) repeats its values at regular intervals of time and never settles down to an equilibrium value no matter how long we iterate.
Transient Behavior
Consider the abstract transition diagram, where the states of the Markov chain are divided into groups and allowed transitions occur only
between adjacent groups.
Types of
periodic Markov Chains
Strongly periodic Markov chains - the distribution vector repeats its values with a period γ > 1. The state transition matrix satisfies the relation
P^γ = I
In a strongly periodic Markov chain, the probability of returning to the starting state after γ time steps is unity for all states of the system.
Weakly periodic Markov chains - the system shows periodic behavior only when n → ∞. The distribution vector repeats its values with a period γ > 1 only when n → ∞. The state transition matrix satisfies the relation
P^γ ≠ I
In a weakly periodic Markov chain, the probability of returning to the starting state after γ time steps is less than unity for some or all states of the system.
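A small hypothetical example may help separate the two cases. In the three-state matrix below, transitions alternate between the group {state 0} and the group {states 1, 2}. Its square is not the identity, yet the distribution vector ends up alternating between two values. In this tiny example the repetition sets in after a single step rather than only in the limit. The matrix and function names are assumptions.

```python
def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def step(P, s):
    m = len(s)
    return [sum(P[i][j] * s[j] for j in range(m)) for i in range(m)]

P = [[0.0, 1.0, 1.0],     # hypothetical weakly periodic chain with period 2
     [0.5, 0.0, 0.0],
     [0.5, 0.0, 0.0]]

P2 = mat_mul(P, P)        # [[1, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]], not I
s0 = [0.0, 1.0, 0.0]
s1 = step(P, s0)          # [1, 0, 0]
s2 = step(P, s1)          # [0, 0.5, 0.5], which differs from s0
s3 = step(P, s2)          # [1, 0, 0] again
print(P2, s1, s2, s3)
```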
Strongly periodic Markov chain
The transition matrix.
For a strongly periodic Markov chain with period γ
pij (n + γ) = pij (n)
s(n + γ) = s(n)
s(n + γ) = P^γ s(n)
P^γ s(n) = s(n)
(P^γ − I) s(n) = 0
P^γ = I
where I is the unit matrix and γ > 1.
Example 4
The following transition matrix corresponds to a strongly periodic Markov chain. Estimate the period of the chain
By performing repeated multiplications we see that P^2 = I. The period of this Markov chain is γ = 2.
The given transition matrix is also known as a circulant matrix where the adjacent rows or columns advance by one position.
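The repeated-multiplication procedure can be sketched as a search for the smallest power of P that equals the identity. The matrix of this example did not survive in the text, so the two matrices below are hypothetical stand-ins: a two-state swap and a three-state cyclic shift. The function name and the search limit are assumptions.

```python
def mat_mul(A, B):
    m = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]

def find_period(P, max_period=1000, tol=1e-12):
    """Return the smallest gamma >= 1 with P^gamma = I, or None if none is found."""
    m = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    power = [row[:] for row in P]
    for gamma in range(1, max_period + 1):
        if all(abs(power[i][j] - identity[i][j]) <= tol
               for i in range(m) for j in range(m)):
            return gamma
        power = mat_mul(power, P)
    return None

swap = [[0.0, 1.0],
        [1.0, 0.0]]
cycle3 = [[0.0, 0.0, 1.0],
          [1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
print(find_period(swap), find_period(cycle3))
```

A return value of None within the search limit suggests the chain is not strongly periodic.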
The Transition Matrix
Determinant
Theorem 3 Let P be the transition matrix of a strongly periodic Markov chain. The determinant of P will be given by
∆ = ±1
The properties of the transition matrix P of a strongly periodic Markov chain:
1. The m × m transition matrix P is full rank
rank(P) = m.
2. The rows and columns of the transition matrix P are linearly independent.
3. Every eigenvalue of P has unit magnitude, so a value with |λ| < 1 (in particular λ = 0) can never be an eigenvalue of the transition matrix P.
Transition Matrix
Diagonalization
Theorem 7 Let P be the transition matrix of a strongly periodic Markov chain with period γ > 1. Then P is diagonalizable.
Transition Matrix Elements
Theorem 9 Let P be the m × m transition matrix of a Markov chain. The Markov chain is strongly periodic if and only if the elements of P are all zeros except for m elements that have 1s arranged such that each column and each row contains only a single 1 entry in a unique location.
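The structural condition in this theorem describes a permutation matrix, which is straightforward to test. The sketch below is one possible check. The function name and the example matrices are assumptions.

```python
def is_permutation_matrix(P):
    """True if every entry is 0 or 1 and each row and each column holds exactly one 1."""
    m = len(P)
    if any(len(row) != m for row in P):
        return False
    entries_ok = all(x in (0, 1) for row in P for x in row)
    rows_ok = all(sum(row) == 1 for row in P)
    cols_ok = all(sum(P[i][j] for i in range(m)) == 1 for j in range(m))
    return entries_ok and rows_ok and cols_ok

cycle3 = [[0, 0, 1],
          [1, 0, 0],
          [0, 1, 0]]
not_periodic = [[0.9, 0.5],
                [0.1, 0.5]]
print(is_permutation_matrix(cycle3), is_permutation_matrix(not_periodic))
```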
Canonic Form for P
A strongly periodic Markov chain will have its m × m transition matrix
expressed in the canonic form
This matrix can be obtained by proper ordering of the states and will have a period γ = m.
Transition Diagram
Each set of periodic classes for a strongly periodic Markov chain consists of one state only, and the number of states equals the period of the Markov chain.
Composite Strongly Periodic
Markov Chains
A composite strongly periodic Markov chain can be expressed, through proper ordering of states, in the block-diagonal canonic form
P = diag(C1, C2, . . .)
where Ci is an mi × mi circulant matrix whose period is γi = mi .
The period of the composite Markov chain is the least common multiple of the individual periods:
γ = lcm(γ1, γ2, . . .)
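Since the circulant blocks do not communicate, the whole chain returns to its starting configuration only when every block does, which happens at the least common multiple of the block periods. A minimal sketch of that computation follows. The function name and the example block sizes are assumptions.

```python
from functools import reduce
from math import gcd

def composite_period(periods):
    """Least common multiple of the periods of the individual circulant blocks."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)

# Hypothetical composite chains described by their block periods.
print(composite_period([2, 3]))   # blocks of size 2 and 3
print(composite_period([2, 4]))   # blocks of size 2 and 4
```

Note that the result can exceed the number of states in any single block, but it is never larger than the product of the block periods.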
Transition diagram
Weakly Periodic
Markov Chains
To generalize the structure of a circulant matrix, we replace each “1” with a block matrix and obtain the canonic form for a weakly periodic
Markov chain
where the block-diagonal matrices are square zero matrices and the
nonzero matrices Wi could be rectangular but the sum of each of their columns is unity since P is column stochastic.
Reducible
Periodic Markov Chains
A reducible periodic Markov chain is one in which the transition matrix can be partitioned into the canonic form
P = [ C  A ]
    [ 0  T ]
where
C = square column stochastic periodic matrix
A = rectangular nonnegative matrix
T = square column substochastic matrix
Some of the eigenvalues of the transition matrix will lie on the unit circle. The other eigenvalues will be inside the unit circle.
Identification
of Markov Chains
A fast and direct way to classify a Markov chain is to simply study its eigenvalues and eigenvector corresponding to λ = 1.
Nonperiodic Markov chain
This is the case when only one eigenvalue is 1 and all other eigenvalues lie inside the unit circle:
|λi | < 1
For large values of the time index n → ∞, all modes will decay except the one corresponding to λ = 1, which gives the steady-state distribution vector.
Strongly periodic Markov chain
This is the case when all the eigenvalues of the transition matrix lie on the unit circle:
|λi | = 1
where 1 ≤ i ≤ γ .
Weakly periodic Markov chain
This is the case when γ eigenvalues of the transition matrix lie on the unit circle, and the rest of the eigenvalues lie inside the unit circle. Thus we can write
|λi | = 1 when 1 ≤ i ≤ γ
|λi | < 1 when γ < i ≤ m
The eigenvalues that lie on the unit circle will be given by the γ-th roots of unity
λi = exp(j 2π (i − 1)/γ) for 1 ≤ i ≤ γ, where j = √(−1) here.
For large values of the time index n →∞, some of the modes will decay
but γ of them will not, and the distribution vector will never settle down to a stable value.