Capacity Limits of MIMO Channels


Tutorial ─ MIMO Communications with Applications to (B)3G and 4G Systems

Markku Juntti

1. Introduction

2. Review of information theory

3. Fixed MIMO channels

4. Fading MIMO channels

5. Summary and Conclusions

References


1. Introduction

• The use of multiple antennas can provide gain due to

– antenna gain

• more receive antennas ⇒ more power is collected

– interference gain

• interference nulling by beamforming (array gain)

• interference averaging (to zero) due to independent observations

– diversity gain against fading

• receive diversity

• transmit diversity.

• An information-theoretic model of the multi-input multi-output (MIMO) channel is considered.

MIMO Channel Model

[Figure: MIMO channel model. The transmitted signals $x_1(n), x_2(n), \ldots, x_{N_T}(n)$ reach the received signals $y_1(n), y_2(n), \ldots, y_{N_R}(n)$ through the channel gains $h_{1,1}(n), \ldots, h_{N_T,N_R}(n)$.]

• Assume $N_T$ transmit and $N_R$ receive antennae

– called an $N_T \times N_R$ MIMO system.

• Fading radio channels are modeled as frequency-flat:

– fixed

– time-varying

– known at the transmitter, at the receiver, or at both

• perfect channel state information (CSI)

– a priori unknown.


2. Review of Information Theory

• Information theory (IT) has its origins in analyzing the limits of communications.

• Information theory answers two fundamental questions in communication theory:

– What is the ultimate data compression rate?

• Answer: entropy.

– What is the ultimate data transmission rate?

• Answer: channel capacity.


Basic Concepts

• Assume a discrete-valued random variable (RV) $X$ with probability mass function $p(x)$.

• The average information or entropy of RV $X$:

$$H(X) = -\sum_x p(x) \log p(x) = -\mathrm{E}[\log p(X)] = \mathrm{E}\left[\log \frac{1}{p(X)}\right].$$

• Joint entropy of RVs $X$ and $Y$:

$$H(X,Y) = -\sum_x \sum_y p(x,y) \log p(x,y) = -\mathrm{E}\{\log p(X,Y)\}.$$

• Conditional entropy of RV $Y$ given $X$:

$$H(Y|X) = \sum_x p(x) H(Y|X=x) = -\sum_x \sum_y p(x,y) \log p(y|x) = -\mathrm{E}\{\log p(Y|X)\}.$$

⇒ Chain rule:

$$H(X,Y) = H(X) + H(Y|X).$$
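The entropy definitions and the chain rule can be checked numerically. The following is a minimal sketch in Python; the joint pmf `joint` and all function names are hypothetical and chosen only for illustration, and logarithms are taken to base 2 so that the results are in bits.

```python
import math

def entropy(pmf):
    """Entropy in bits of a pmf given as a list of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def joint_entropy(joint):
    """Joint entropy H(X,Y) in bits, with joint[x][y] = p(x, y)."""
    return entropy([p for row in joint for p in row])

def conditional_entropy(joint):
    """Conditional entropy H(Y|X) = sum_x p(x) H(Y | X = x) in bits."""
    total = 0.0
    for row in joint:
        px = sum(row)
        if px > 0:
            total += px * entropy([p / px for p in row])
    return total

# Hypothetical joint pmf p(x, y) for binary X (rows) and binary Y (columns).
joint = [[0.5, 0.25],
         [0.0, 0.25]]
px = [sum(row) for row in joint]          # marginal pmf of X

h_x = entropy(px)
h_xy = joint_entropy(joint)
h_y_given_x = conditional_entropy(joint)

# Chain rule: H(X,Y) = H(X) + H(Y|X)
print(h_xy, h_x + h_y_given_x)
```

For this pmf both printed values agree up to rounding, as the chain rule requires.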

Mutual Information

• Mutual information is the relative entropy between the joint distribution and the product distribution:

$$I(X;Y) = \sum_x \sum_y p(x,y) \log \frac{p(x,y)}{p(x)p(y)} = \mathrm{E}\left\{\log \frac{p(X,Y)}{p(X)p(Y)}\right\}.$$

• It measures the information that one random variable (say, $X$) contains about the other ($Y$):

$$I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X,Y) = I(Y;X).$$

– If $X$ and $Y$ are independent: $I(X;Y) = 0$ (also “only if”).

– If $Y = X$: $I(X;X) = H(X)$.

• For continuous RVs, entropy is replaced by differential entropy.
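The two special cases listed for mutual information can be verified directly from the defining sum. This is a small sketch under assumed inputs: the pmfs `independent` and `identical` are made up for the illustration, and base-2 logarithms are used.

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits, with joint[x][y] = p(x, y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log2(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

# Independent X and Y: p(x, y) = p(x) p(y) with p(x) = (0.3, 0.7), p(y) = (0.6, 0.4).
independent = [[0.3 * 0.6, 0.3 * 0.4],
               [0.7 * 0.6, 0.7 * 0.4]]

# Y = X: all probability mass sits on the diagonal.
identical = [[0.3, 0.0],
             [0.0, 0.7]]

h_x = -(0.3 * math.log2(0.3) + 0.7 * math.log2(0.7))   # H(X) for p(x) = (0.3, 0.7)

i_independent = mutual_information(independent)   # close to 0
i_identical = mutual_information(identical)       # close to H(X)
```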

Gaussian RV’s

• For multivariate, real-valued Gaussian RVs $X_1, X_2, \ldots, X_n$ with mean vector $\boldsymbol{\mu}$ and covariance matrix $\mathbf{K}$, the differential entropy is

$$h(X_1, X_2, \ldots, X_n) = \tfrac{1}{2} \log\left[(2\pi e)^n \det(\mathbf{K})\right].$$

• The Gaussian distribution maximizes the entropy over all distributions with the same covariance:

$$h(X_1, X_2, \ldots, X_n) \le \tfrac{1}{2} \log\left[(2\pi e)^n \det(\mathbf{K})\right]$$

for any RVs $X_1, X_2, \ldots, X_n$ with covariance matrix $\mathbf{K}$, with equality if and only if they are Gaussian.
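As a sanity check of the maximum-entropy property, the sketch below evaluates the Gaussian differential entropy in bits and compares it with a uniform RV of the same variance. The function name `gaussian_entropy_bits` and the uniform comparison are assumptions made for this illustration; the differential entropy of a uniform RV on $[-a, a]$ is $\log_2(2a)$ bits.

```python
import math
import numpy as np

def gaussian_entropy_bits(K):
    """Differential entropy 0.5 * log2((2*pi*e)^n * det(K)) of N(mu, K), in bits."""
    K = np.atleast_2d(np.asarray(K, dtype=float))
    n = K.shape[0]
    sign, logdet = np.linalg.slogdet(K)       # natural logarithm of det(K)
    if sign <= 0:
        raise ValueError("K must be positive definite")
    return 0.5 * (n * math.log(2 * math.pi * math.e) + logdet) / math.log(2)

# Scalar Gaussian with unit variance.
h_gauss = gaussian_entropy_bits([[1.0]])

# Uniform RV on [-a, a] has variance a^2 / 3, so a = sqrt(3) gives unit variance.
a = math.sqrt(3.0)
h_uniform = math.log2(2 * a)

print(f"Gaussian: {h_gauss:.3f} bits, uniform: {h_uniform:.3f} bits")
```

With equal variance the Gaussian value is the larger of the two, in line with the inequality above.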

Channel Capacity

[Figure: Information theoretic model of a communication system. Message $W$ → Encoder → $X^n$ → Channel $p(y|x)$ → $Y^n$ → Decoder → estimate of message $\hat{W}$.]

• Channel capacity:

$$C = \max_{p(x)} I(X;Y).$$

• Code rate $R$ is achievable if there exists a sequence of $(2^{nR}, n)$ codes so that the maximal probability of error vanishes:

$$P_{e,\max} \to 0 \quad \text{as } n \to \infty.$$

Gaussian Channel

[Figure: The Gaussian channel. Input $X_i$, additive noise $Z_i \sim \mathcal{N}(0, \sigma_N^2)$, output $Y_i = X_i + Z_i$.]

• Channel capacity:

$$C = \max_{p(x):\, \mathrm{E}(X^2) \le \sigma_S^2} I(X;Y) = \tfrac{1}{2} \log_2(1 + \gamma), \qquad \gamma = \frac{\sigma_S^2}{\sigma_N^2}.$$

• Capacity per time unit ($2W$ samples per second):

$$C = W \log_2\left(1 + \frac{P}{N_0 W}\right).$$
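The two capacity expressions are linked by the sampling rate: $2W$ channel uses per second, each carrying $\tfrac{1}{2}\log_2(1+\gamma)$ bits with $\gamma = P/(N_0 W)$. A brief sketch follows; the numerical values of `power`, `n0` and `bandwidth` are arbitrary assumptions.

```python
import math

def awgn_capacity_per_use(snr):
    """Capacity of the real Gaussian channel in bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)

def awgn_capacity_per_second(power, n0, bandwidth):
    """Capacity in bits per second for bandwidth W and noise density N0."""
    return bandwidth * math.log2(1.0 + power / (n0 * bandwidth))

# Assumed example values: P = 1 mW, N0 = 1e-9 W/Hz, W = 1 MHz (SNR = 1, i.e. 0 dB).
power, n0, bandwidth = 1e-3, 1e-9, 1e6
snr = power / (n0 * bandwidth)

c_per_second = awgn_capacity_per_second(power, n0, bandwidth)
c_from_samples = 2 * bandwidth * awgn_capacity_per_use(snr)   # 2W uses per second
```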

Parallel Gaussian Channels

[Figure: Parallel Gaussian channels. For $i = 1, 2, \ldots, k$: input $X_i$, additive noise $Z_i \sim \mathcal{N}(0, \sigma_{N,i}^2)$, output $Y_i$.]

• Capacity:

$$C = \sum_{i=1}^{k} \tfrac{1}{2} \log_2\left(1 + \frac{\sigma_{S,i}^2}{\sigma_{N,i}^2}\right) = \sum_{i=1}^{k} \tfrac{1}{2} \log_2(1 + \gamma_i).$$

• Optimal transmission:

$$\mathbf{X} \sim \mathcal{N}\left(\mathbf{0}, \operatorname{diag}\left[\sigma_{S,1}^2, \sigma_{S,2}^2, \ldots, \sigma_{S,k}^2\right]\right)$$

⇒ water-filling.
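Water-filling can be sketched in a few lines. The version below assumes the standard form of the solution, $\sigma_{S,i}^2 = \max(0, \mu - \sigma_{N,i}^2)$ with the water level $\mu$ set so that the powers add up to the total budget; the function names and the example noise variances are hypothetical.

```python
import math

def water_filling(noise_vars, total_power):
    """Return (powers, mu) with powers[i] = max(0, mu - noise_vars[i])."""
    order = sorted(range(len(noise_vars)), key=lambda i: noise_vars[i])
    powers = [0.0] * len(noise_vars)
    mu = 0.0
    # Try to use the m best channels, dropping the worst one until the
    # water level lies above the noise floor of every active channel.
    for m in range(len(order), 0, -1):
        active = order[:m]
        mu = (total_power + sum(noise_vars[i] for i in active)) / m
        if mu > noise_vars[active[-1]]:
            for i in active:
                powers[i] = mu - noise_vars[i]
            break
    return powers, mu

def capacity_bits(noise_vars, powers):
    """Sum of 0.5 * log2(1 + power / noise) over the parallel channels."""
    return sum(0.5 * math.log2(1.0 + p / n) for p, n in zip(powers, noise_vars))

noise_vars = [1.0, 2.0, 5.0]
powers, mu = water_filling(noise_vars, total_power=3.0)

c_waterfilling = capacity_bits(noise_vars, powers)
c_equal_power = capacity_bits(noise_vars, [1.0, 1.0, 1.0])
```

In this example the noisiest channel receives no power, and the water-filling allocation gives a higher rate than splitting the same budget equally.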

3. Fixed MIMO Channels

[Figure: MIMO channel model with transmitted signals $x_i(n)$, channel gains $h_{ij}(n)$, and received signals $y_j(n)$.]

• Signal $x_i(n)$ is transmitted at time interval $n$ from antenna $i$ ($i = 1, 2, \ldots, N_T$).

• Signal $y_j(n)$ is received at time interval $n$ at antenna $j$ ($j = 1, 2, \ldots, N_R$):

$$y_j(n) = \sum_{i=1}^{N_T} h_{ij}(n)\, x_i(n) + \eta_j(n),$$

where $h_{ij}(n)$ is the complex channel gain with

$$\mathrm{E}\left(|h_{ij}(n)|^2\right) = 1.$$


Matrix Formulation of MIMO Channel Model

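• The signal received at all antennas:

$$\mathbf{y}(n) = \mathbf{H}(n)\,\mathbf{x}(n) + \boldsymbol{\eta}(n),$$

where

$$\mathbf{x}(n) = \left[x_1(n)\ \ x_2(n)\ \ \cdots\ \ x_{N_T}(n)\right]^{\mathrm{T}} \in \mathbb{C}^{N_T},$$

$$\mathbf{y}(n) = \left[y_1(n)\ \ y_2(n)\ \ \cdots\ \ y_{N_R}(n)\right]^{\mathrm{T}} \in \mathbb{C}^{N_R},$$

$$\mathbf{H}(n) = \begin{bmatrix} h_{1,1}(n) & h_{2,1}(n) & \cdots & h_{N_T,1}(n) \\ h_{1,2}(n) & h_{2,2}(n) & \cdots & h_{N_T,2}(n) \\ \vdots & \vdots & \ddots & \vdots \\ h_{1,N_R}(n) & h_{2,N_R}(n) & \cdots & h_{N_T,N_R}(n) \end{bmatrix} \in \mathbb{C}^{N_R \times N_T}.$$
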

Noise Model and Power Constraint

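• The noise vector

$$\boldsymbol{\eta}(n) = \left[\eta_1(n)\ \ \eta_2(n)\ \ \cdots\ \ \eta_{N_R}(n)\right]^{\mathrm{T}} \in \mathbb{C}^{N_R}$$

satisfies

$$\boldsymbol{\eta}(n) \sim \mathcal{CN}\left(\mathbf{0}, \sigma_N^2 \mathbf{I}\right).$$

• The transmitted signal satisfies the average power constraint:

$$\mathrm{E}\left(\mathbf{x}^{\mathrm{H}}(n)\,\mathbf{x}(n)\right) = \sum_{i=1}^{N_T} \mathrm{E}\left(|x_i(n)|^2\right) = \sum_{i=1}^{N_T} \sigma_{S,i}^2 \le \sigma_S^2.$$
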
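The matrix model, the noise model and the power constraint can be put together in a short simulation of one channel use. This is a sketch only: the i.i.d. complex Gaussian (Rayleigh) channel gains, the equal power split across antennas, and all variable names are assumptions made for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_t, n_r = 3, 2                      # N_T transmit and N_R receive antennae
sigma_s2, sigma_n2 = 1.0, 0.1        # total signal power and noise variance

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(n_r, n_t)                          # H[j, i] = h_ij, with E|h_ij|^2 = 1
x = np.sqrt(sigma_s2 / n_t) * crandn(n_t)     # E(x^H x) = sigma_s2
eta = np.sqrt(sigma_n2) * crandn(n_r)         # eta ~ CN(0, sigma_n2 * I)

y = H @ x + eta                               # matrix form of the model

# Element-wise form: y_j = sum_i h_ij x_i + eta_j
y_sum = np.array([sum(H[j, i] * x[i] for i in range(n_t)) + eta[j]
                  for j in range(n_r)])
```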

Singular Value Decomposition

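• The MIMO model is a special case of parallel Gaussian channels.

• The channel transfer matrix has the singular value decomposition (SVD):

$$\mathbf{H} = \mathbf{U} \boldsymbol{\Lambda}^{1/2} \mathbf{V}^{\mathrm{H}},$$

where

$$\mathbf{U} \in \mathbb{C}^{N_R \times N_R}, \qquad \mathbf{V} \in \mathbb{C}^{N_T \times N_T}$$

are unitary matrices, and

$$\boldsymbol{\Lambda}^{1/2} \in \mathbb{R}^{N_R \times N_T}$$

is a “diagonal” matrix of the singular values of $\mathbf{H}$.
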

Equivalent Channel Model

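• Let

$$\tilde{\mathbf{x}}(n) = \mathbf{V}^{\mathrm{H}} \mathbf{x}(n), \qquad \tilde{\mathbf{y}}(n) = \mathbf{U}^{\mathrm{H}} \mathbf{y}(n), \qquad \tilde{\boldsymbol{\eta}}(n) = \mathbf{U}^{\mathrm{H}} \boldsymbol{\eta}(n).$$

• Since $\mathbf{U}$ and $\mathbf{V}$ are unitary:

$$\mathrm{E}\left(\tilde{\mathbf{x}}^{\mathrm{H}}(n)\,\tilde{\mathbf{x}}(n)\right) \le \sigma_S^2, \qquad \tilde{\boldsymbol{\eta}}(n) \sim \mathcal{CN}\left(\mathbf{0}, \sigma_N^2 \mathbf{I}\right).$$

⇒ Equivalent channel model

$$\tilde{\mathbf{y}}(n) = \boldsymbol{\Lambda}^{1/2}\, \tilde{\mathbf{x}}(n) + \tilde{\boldsymbol{\eta}}(n),$$

where $\boldsymbol{\Lambda}^{1/2}$ is a “diagonal” matrix of size $N_R \times N_T$.

⇒ Independent parallel Gaussian channels.

⇒ Capacity achieved with Gaussian input and by water-filling.
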
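The equivalent model can be checked numerically with `numpy.linalg.svd`, which returns $\mathbf{U}$, the singular values, and $\mathbf{V}^{\mathrm{H}}$. The channel, signal and noise below are random placeholders chosen only for this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
n_t, n_r = 3, 2

# Placeholder channel, transmitted vector and noise vector.
H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
x = rng.standard_normal(n_t) + 1j * rng.standard_normal(n_t)
eta = 0.1 * (rng.standard_normal(n_r) + 1j * rng.standard_normal(n_r))
y = H @ x + eta

# Full SVD: U is N_R x N_R, Vh = V^H is N_T x N_T, s holds the singular values.
U, s, Vh = np.linalg.svd(H)
Lam_half = np.zeros((n_r, n_t))
Lam_half[:len(s), :len(s)] = np.diag(s)       # "diagonal" N_R x N_T matrix

x_t = Vh @ x                   # x~ = V^H x
y_t = U.conj().T @ y           # y~ = U^H y
eta_t = U.conj().T @ eta       # eta~ = U^H eta

# y~ = Lambda^(1/2) x~ + eta~ : each output depends on at most one input.
residual = np.max(np.abs(y_t - (Lam_half @ x_t + eta_t)))
```

Because $\mathbf{V}$ is unitary, `x_t` has the same norm as `x`, which is why the power constraint carries over unchanged.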

Derivation of Channel Capacity

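• The rank of matrix $\mathbf{H}$ is $\operatorname{rank}(\mathbf{H}) \le \min(N_R, N_T)$.

⇒ The number of positive singular values is $\operatorname{rank}(\mathbf{H})$.

⇒ The capacity of the MIMO AWGN channel:

$$C = \sum_{i=1}^{\operatorname{rank}(\mathbf{H})} \log_2\left(1 + \frac{\lambda_i \sigma_{S,i}^2}{\sigma_N^2}\right) = \sum_{i=1}^{\operatorname{rank}(\mathbf{H})} \log_2(1 + \lambda_i \gamma_i), \qquad \gamma_i = \frac{\sigma_{S,i}^2}{\sigma_N^2},$$

where $\lambda_i$ is the $i$th positive diagonal element of $\boldsymbol{\Lambda}$ (the square of the $i$th singular value of $\mathbf{H}$) and the signal powers are solved via water-filling

$$\sigma_{S,i}^2 = \max\left\{0,\ \mu - \frac{\sigma_N^2}{\lambda_i}\right\}, \qquad i = 1, 2, \ldots, \operatorname{rank}(\mathbf{H}),$$

and $\mu$ is chosen so that the power constraint is satisfied, or

$$\sum_{i=1}^{\operatorname{rank}(\mathbf{H})} \sigma_{S,i}^2 \le \sigma_S^2.$$
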
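The capacity expression and the water-filling rule can be combined into one routine. The following is a minimal sketch that assumes the power constraint is met with equality; the function name, the tolerance used to decide which singular values count as positive, and the example channel are all hypothetical.

```python
import numpy as np

def mimo_waterfilling_capacity(H, sigma_s2, sigma_n2):
    """Return (capacity in bits, powers, mu) for a fixed, known channel H."""
    lam = np.linalg.svd(H, compute_uv=False) ** 2    # lambda_i, in descending order
    lam = lam[lam > 1e-12]                           # keep the rank(H) positive values
    floors = sigma_n2 / lam                          # sigma_N^2 / lambda_i, ascending
    mu = 0.0
    # Drop the weakest eigenmode until the water level exceeds every active floor.
    for m in range(len(floors), 0, -1):
        mu = (sigma_s2 + floors[:m].sum()) / m
        if mu > floors[m - 1]:
            break
    powers = np.maximum(0.0, mu - floors)            # sigma_{S,i}^2
    capacity = float(np.sum(np.log2(1.0 + lam * powers / sigma_n2)))
    return capacity, powers, mu

# Example channel with eigenmode gains lambda = (4, 1).
H = np.diag([2.0, 1.0])
capacity, powers, mu = mimo_waterfilling_capacity(H, sigma_s2=1.0, sigma_n2=1.0)

# Reference point: the same total power split equally over both eigenmodes.
equal_power = float(np.sum(np.log2(1.0 + np.array([4.0, 1.0]) * 0.5)))
```

Here the stronger eigenmode receives most of the power (0.875 of 1.0), and the resulting rate exceeds the equal-power reference.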

MIMO Channel Capacity for Full–Rank Channel Matrix

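• No CSI at the transmitter (and full-rank $\mathbf{H}$):

$$C = \log_2 \det\left[\mathbf{I}_{N_R} + \frac{\gamma}{N_T} \mathbf{H}\mathbf{H}^{\mathrm{H}}\right].$$

• CSI at the transmitter (and full-rank $\mathbf{H}$):

$$C = \max_{\mathbf{Q}} \log_2 \det\left[\mathbf{I}_{N_R} + \frac{\gamma}{N_T} \mathbf{H}\mathbf{Q}\mathbf{H}^{\mathrm{H}}\right],$$

where $\mathbf{Q}$ is the covariance matrix of the input vector $\mathbf{x}$ satisfying the power constraint $\operatorname{tr}(\mathbf{Q}) \le \sigma_S^2$.

– No CSI at the transmitter ⇒ $\mathbf{Q} = \mathbf{I}$.
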
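The log-determinant formula for the case without transmitter CSI agrees with a sum over the eigenmodes of $\mathbf{H}\mathbf{H}^{\mathrm{H}}$, which ties it back to the parallel-channel view. The sketch below checks this for one randomly drawn channel; the channel itself and the 10 dB SNR are assumed example values.

```python
import numpy as np

def capacity_no_csit(H, snr):
    """log2 det(I + (snr / N_T) H H^H) in bits per channel use."""
    n_r, n_t = H.shape
    M = np.eye(n_r) + (snr / n_t) * (H @ H.conj().T)
    _, logdet = np.linalg.slogdet(M)      # natural logarithm of det(M)
    return float(logdet / np.log(2))

rng = np.random.default_rng(2)
n_t, n_r = 4, 3
H = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
snr = 10.0                                # 10 dB

c_logdet = capacity_no_csit(H, snr)

# Same quantity via the squared singular values lambda_i of H.
lam = np.linalg.svd(H, compute_uv=False) ** 2
c_modes = float(np.sum(np.log2(1.0 + (snr / n_t) * lam)))
```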

4. Fading MIMO Channels

• The channels are usually assumed to be ergodic: fading is fast enough, and visits all of its realizations so many times, that

– the sample average equals the theoretical mean

– the sample covariance equals the theoretical covariance.

[Figure: ergodic (a long observation time) versus non-ergodic (a short observation time).]


Fading Channel Model with Perfect Receiver CSI

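• The effective channel output: the actual channel output $\mathbf{y}$ and the channel realization $\mathbf{H}$.

[Figure: the input $\mathbf{x}$ (IN) passes through the channel $\mathbf{H}$ (convolution) to give the output $\mathbf{y}$ (OUT).]

$$I(\mathbf{x}; \mathbf{y}, \mathbf{H}) = I(\mathbf{x}; \mathbf{H}) + I(\mathbf{x}; \mathbf{y} \,|\, \mathbf{H}) = I(\mathbf{x}; \mathbf{y} \,|\, \mathbf{H}),$$

where $I(\mathbf{x}; \mathbf{H}) = 0$ and $I(\mathbf{x}; \mathbf{y} \,|\, \mathbf{H})$ is conditioned on the channel realization.

• Assuming that the channel is memoryless (independent channel state for each transmission), the capacity equals the mean of the mutual information:

$$C = \mathrm{E}_{\mathbf{H}}\left\{\log_2 \det\left[\mathbf{I}_{N_R} + \frac{\gamma}{N_T} \mathbf{H}\mathbf{H}^{\mathrm{H}}\right]\right\}.$$
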

Capacity Evaluation

• The evaluation of the fading MIMO channel capacity is complicated:

– Wishart distribution ⇒ Laguerre polynomials [Telatar 1999]

– bounds [Foschini & Gans 1998]

– Monte Carlo computer simulations

– random matrix theory ⇒ mutual information tends to Gaussian

• under development.

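Of the approaches listed, Monte Carlo simulation is the simplest to sketch. The example below averages $\log_2\det[\mathbf{I} + (\gamma/N)\mathbf{H}\mathbf{H}^{\mathrm{H}}]$ over random $N \times N$ channels; the i.i.d. Rayleigh fading model, the number of trials, and the function name are assumptions of this illustration, not details taken from the tutorial.

```python
import numpy as np

def ergodic_capacity(n, snr_db, trials=2000, seed=0):
    """Monte Carlo estimate of E_H{log2 det(I + (snr/n) H H^H)} for an n x n channel."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    # Assumed fading model: i.i.d. CN(0, 1) entries, so that E|h_ij|^2 = 1.
    H = (rng.standard_normal((trials, n, n))
         + 1j * rng.standard_normal((trials, n, n))) / np.sqrt(2)
    M = np.eye(n) + (snr / n) * (H @ H.conj().transpose(0, 2, 1))
    _, logdet = np.linalg.slogdet(M)      # natural logarithm, one value per trial
    return float(np.mean(logdet) / np.log(2))

capacities = {n: ergodic_capacity(n, snr_db=10.0) for n in (1, 2, 4, 8)}
for n, c in capacities.items():
    print(f"N = {n}: about {c:.2f} bits per symbol ({c / n:.2f} per antenna)")
```

Under these assumptions the estimate roughly doubles each time the number of antennae doubles, so the capacity per antenna stays nearly constant.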

Example: N × N MIMO System

[Figure, first panel: R-CSI fading channel with $N_R = N_T$. Capacity [bits per symbol] on a logarithmic axis versus SNR [dB] from 0 to 20 dB, for 1, 2, 4, 8, 16, and 32 antennae.]

[Figure, second panel: R-CSI fading channel with $N_R = N_T$. Capacity [bits per symbol] versus the number of antennae, for SNR = 0 dB, 10 dB, and 20 dB.]

⇒ The capacity curves are shifted upwards by introducing more antennae.

⇒ The capacity increases linearly vs. the number of antennae.


Non-Ergodic Channels

• The channels are not always ergodic: fading can be so slow that the channel undergoes only a few realizations.

⇒ The random process becomes non-ergodic.

[Figure: ergodic versus non-ergodic fading.]


Example

[Figure: the input (IN) passes through a random switch that selects either an AWGN channel supporting 1 bit / use or an AWGN channel supporting 2 bits / use, and then reaches the output (OUT).]

• Select one of the channels with equal probability, and then keep it fixed.

⇒ Average mutual information is 1.5 bits / channel use.

• However, with probability 0.5 this rate is not supported.

⇒ The achievable rate ≤ 1 bit / channel use.

⇒ Channel capacity ≠ the average maximum mutual information.

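The arithmetic of this example is easy to reproduce. In the sketch below the helper `failure_probability` is a hypothetical name for the probability that the channel picked by the switch cannot carry a given rate.

```python
# The two AWGN channels of the example and the switch probabilities.
capacities = [1.0, 2.0]        # bits per channel use
probabilities = [0.5, 0.5]

# Average mutual information over the random switch position.
average_mi = sum(p * c for p, c in zip(probabilities, capacities))

def failure_probability(rate):
    """Probability that the selected (then fixed) channel cannot support `rate`."""
    return sum(p for p, c in zip(probabilities, capacities) if c < rate)

print(average_mi)                  # 1.5
print(failure_probability(1.5))    # 0.5
print(failure_probability(1.0))    # 0
```

Only rates up to 1 bit per channel use are supported with certainty, even though the average is 1.5 bits.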

Example: Random and Fixed Channel

• A simple example: generate a channel realization, and keep it fixed during the whole transmission.

⇒ There is a positive probability of an arbitrarily bad channel realization.

⇒ However small the rate, the channel realization may not be able to support it, regardless of the length of the code word.

⇒ The Shannon capacity of this non-ergodic channel is zero.

⇒ The Shannon capacity is again not equal to the average mutual information.


Outage Probability

• In non-ergodic channels, the capacity is measured by the probability of outage for a given rate $R$:

$$P_{\mathrm{out}}(R) = \inf_{\mathbf{Q}:\ \mathbf{Q} \ge 0,\ \operatorname{tr}(\mathbf{Q}) \le \sigma_S^2} \Pr\left[I(\mathbf{x}; \mathbf{y}) < R\right] = \inf_{\mathbf{Q}:\ \mathbf{Q} \ge 0,\ \operatorname{tr}(\mathbf{Q}) \le \sigma_S^2} \Pr\left\{\log_2 \det\left[\mathbf{I}_{N_R} + \frac{\gamma}{N_T} \mathbf{H}\mathbf{Q}\mathbf{H}^{\mathrm{H}}\right] < R\right\}.$$

– Often called capacity versus outage.

• The set-up is encountered in real-time applications with transmission delay constraints.

• A similar approach is also applicable to delay-constrained communications in ergodic channels.

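An outage probability can be estimated by counting how often a random channel realization fails to support the rate $R$. The sketch below fixes $\mathbf{Q} = \mathbf{I}$ rather than taking the infimum over $\mathbf{Q}$, and it assumes i.i.d. Rayleigh fading; both choices, like the function name, are simplifications made for this illustration.

```python
import numpy as np

def outage_probability(rate, n_t, n_r, snr_db, trials=5000, seed=0):
    """Estimate Pr{log2 det(I + (snr/N_T) H H^H) < rate} with Q = I."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    # Assumed fading model: i.i.d. CN(0, 1) entries, one realization per trial.
    H = (rng.standard_normal((trials, n_r, n_t))
         + 1j * rng.standard_normal((trials, n_r, n_t))) / np.sqrt(2)
    M = np.eye(n_r) + (snr / n_t) * (H @ H.conj().transpose(0, 2, 1))
    _, logdet = np.linalg.slogdet(M)
    mutual_info = logdet / np.log(2)          # bits per channel use, per realization
    return float(np.mean(mutual_info < rate))

# Outage of a 2 x 2 system at 10 dB for a few target rates.
rates = [0.0, 2.0, 4.0, 6.0, 8.0]
p_out = [outage_probability(r, n_t=2, n_r=2, snr_db=10.0) for r in rates]
```

Plotting `p_out` against `rates` gives a capacity-versus-outage curve for this assumed setting.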

5. Summary and Conclusions

• AWGN MIMO channels are an extension of parallel Gaussian channels:

$$C = \log_2 \det\left[\mathbf{I}_{N_R} + \frac{\gamma}{N_T} \mathbf{H}\mathbf{Q}\mathbf{H}^{\mathrm{H}}\right].$$

– Another example of parallel channels: channels on different frequencies.

• Introducing both multiple transmit and receive antennae is equivalent to an increase in bandwidth.

• The linear capacity increase becomes natural.


Fading AWGN MIMO Channel

• Ergodic channels:

– Channel experiences all its states several times.

– No delay constraints and/or fast fading.

– Capacity equals the average mutual information:

$$C = \mathrm{E}_{\mathbf{H}}\left\{\log_2 \det\left[\mathbf{I}_{N_R} + \frac{\gamma}{N_T} \mathbf{H}\mathbf{H}^{\mathrm{H}}\right]\right\}.$$

– Capacity increases linearly with $N_R = N_T$.

• Non-ergodic channels:

– Capacity does not equal the average mutual information.

– Capacity versus outage probability.


Research Challenges

• Capacity of selective channels

– time-selective

– frequency-selective

with no or imperfect channel state information in the transmitter and the receiver.

⇒ Optimal signal structures (coding and modulation) for real use with issues like

– amount of training vs. non-coherent detection

– transceiver complexity constraints

– limited bandwidth of a non-ideal feedback channel.


References

1. T. M. Cover & J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 1991. ISBN: 0-471-06259-6.

2. E. Telatar, Capacity of multi-antenna Gaussian channels. European Transactions on Telecommunications, vol. 10, no. 6, pp. 585-595, Nov.-Dec. 1999.

3. G. J. Foschini & M. J. Gans, On limits of wireless communications in a fading environment when using multiple antennas. Wireless Personal Communications, vol. 6, pp. 311-335, Mar. 1998.

4. T. L. Marzetta & B. M. Hochwald, Capacity of a mobile multiple-antenna communication link in Rayleigh flat fading. IEEE Transactions on Information Theory, vol. 45, no. 1, pp. 139-157, Jan. 1999.

5. I. E. Telatar & D. N. C. Tse, Capacity and mutual information of wideband multipath fading channels. IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1384-1400, July 2000.

6. M. Medard, The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel. IEEE Transactions on Information Theory, vol. 46, no. 3, pp. 933-945, May 2000.

7. M. Medard & R. G. Gallager, Bandwidth scaling for fading multipath channels. IEEE Transactions on Information Theory, vol. 48, no. 4, pp. 840-852, April 2002.

8. V. G. Subramanian & B. Hajek, Broad-band fading channels: signal burstiness and capacity. IEEE Transactions on Information Theory, vol. 48, no. 4, pp. 809-827, April 2002.