
2 Introduction to complex random vectors and processes

2.3 Probability distributions and densities

Rather than defining a complex random variable from first principles (where we would start with a probability measure on a sample space), we simply define a complex random variable $x\colon \Omega \to \mathbb{C}^n$ as $x = u + jv$, where $u\colon \Omega \to \mathbb{R}^n$ and $v\colon \Omega \to \mathbb{R}^n$ are a pair of real random variables. This pair $(u, v)$ has the joint probability distribution

$$P(u_0, v_0) = \operatorname{Prob}(u \le u_0,\ v \le v_0) \tag{2.38}$$

and joint probability density function (pdf)

$$p(u, v) = \frac{\partial}{\partial u}\frac{\partial}{\partial v} P(u, v). \tag{2.39}$$

We will allow the use of Dirac delta functions in the pdf. When we write $P(x)$ or $p(x)$, we shall define this to mean

$$P(x) = P(u + jv) \triangleq P(u, v), \tag{2.40}$$

$$p(x) = p(u + jv) \triangleq p(u, v). \tag{2.41}$$

Thus, the probability distribution of a complex random vector is interpreted as the $2n$-dimensional joint distribution of its real and imaginary parts. The probability of $x$ taking a value in the region $A = \{u_1 < u \le u_2;\ v_1 < v \le v_2\}$ is thus

$$\operatorname{Prob}(x \in A) = \int_{v_1}^{v_2}\int_{u_1}^{u_2} p(x)\,du\,dv. \tag{2.42}$$
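As a rough numerical illustration of (2.42), the sketch below assumes a scalar ($n = 1$) proper Gaussian with unit variance, so that $p(x) = e^{-|x|^2}/\pi$ and the real and imaginary parts are independent with variance 1/2 each. The function names are illustrative only, and the closed form is valid only under this independence assumption.

```python
import math

def pdf(x: complex) -> float:
    """Assumed density: proper scalar complex Gaussian with unit variance."""
    return math.exp(-abs(x) ** 2) / math.pi

def prob_rectangle(u1, u2, v1, v2, steps=200):
    """Midpoint-rule approximation of the double integral in (2.42)."""
    du = (u2 - u1) / steps
    dv = (v2 - v1) / steps
    total = 0.0
    for i in range(steps):
        u = u1 + (i + 0.5) * du
        for k in range(steps):
            v = v1 + (k + 0.5) * dv
            total += pdf(complex(u, v))
    return total * du * dv

def prob_rectangle_exact(u1, u2, v1, v2):
    """Closed form, valid here because u and v are independent N(0, 1/2)."""
    def cdf(t):
        # cdf of a real N(0, 1/2) variable
        return 0.5 * (1.0 + math.erf(t))
    return (cdf(u2) - cdf(u1)) * (cdf(v2) - cdf(v1))

approx = prob_rectangle(-1.0, 1.0, 0.0, 2.0)
exact = prob_rectangle_exact(-1.0, 1.0, 0.0, 2.0)
print(approx, exact)
```

The two numbers should agree to several decimal places; the point is only that $p(x)$ is integrated over the real and imaginary parts, not over $x$ as a complex variable.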

For a function $g\colon D \to \mathbb{C}^n$ whose domain $D$ includes the range of $x$, the expectation operator is defined accordingly as

$$E\{g(x)\} = E\{\operatorname{Re}[g(x)]\} + jE\{\operatorname{Im}[g(x)]\} = \int_{\mathbb{R}^{2n}} g(u + jv)\,p(u + jv)\,du\,dv. \tag{2.43}$$

In many cases, expressing $P(u, v)$ or $p(u, v)$ in terms of $x$ requires the use of the complex conjugate $x^*$. This has prompted many researchers to write $P(x, x^*)$ and $p(x, x^*)$, which raises the question of whether these are now the joint distribution and density for $x$ and $x^*$. This question is actually ill-posed since distributions and densities of complex random vectors are always interpreted in terms of (2.40) and (2.41); whether we write this as $p(x)$ or $p(x, x^*)$ makes no difference. Nevertheless, the notation $p(x, x^*)$ does seem to carry potential for confusion since $x$ perfectly determines $x^*$, and vice versa. It is not possible to assign densities to $x$ and $x^*$ independently.

The advantage of expressing a pdf in terms of complex x lies not in the fact that Prob(x∈ A) becomes easier to evaluate – that is obviously not the case. However, direct calculations of Prob(x∈ A) via (2.42) are rare. In most practical cases, e.g., maximum-likelihood or minimum mean-squared error estimation, we can work directly with p(x) since it contains all relevant information, conveniently parameterized in terms of the statistical properties of complex x.


We will now take a look at two important complex distributions: the multivariate Gaussian distribution and its generalization, the multivariate elliptical distribution. We will be particularly interested in expressing these pdfs in terms of covariance and complementary covariance matrices.

2.3.1 Complex Gaussian distribution

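In order to derive the general complex multivariate Gaussian pdf (proper or improper), we begin with the Gaussian pdf of the composite vector of real and imaginary parts $z = [u^T, v^T]^T\colon \Omega \to \mathbb{R}^{2n}$:

$$p(z) = \frac{1}{(2\pi)^{2n/2}\det^{1/2} R_{zz}} \exp\left\{-\tfrac{1}{2}(z - \mu_z)^T R_{zz}^{-1}(z - \mu_z)\right\}. \tag{2.44}$$

With $\underline{x} = Tz$, we are now in a position to state the following.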

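Result 2.4. The general pdf of a complex Gaussian random vector $x\colon \Omega \to \mathbb{C}^n$ is

$$p(x) = \frac{1}{\pi^n \det^{1/2}\underline{R}_{xx}} \exp\left\{-\tfrac{1}{2}(\underline{x} - \underline{\mu}_x)^H \underline{R}_{xx}^{-1}(\underline{x} - \underline{\mu}_x)\right\}. \tag{2.48}$$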

This pdf algebraically depends on $\underline{x}$, i.e., $x$ and $x^*$, but is interpreted as the joint pdf of $u$ and $v$. It may be used for proper or improper $x$. In the past, the term “complex Gaussian distribution” often implicitly assumed propriety. Therefore, some researchers call an improper complex Gaussian random vector “generalized complex Gaussian.”⁴ The simplification that occurs in the proper case, where $\tilde{R}_{xx} = 0$ and $\underline{R}_{xx}$ is block-diagonal, is obvious and leads to the following classical result.

Result 2.5. The pdf of a complex proper Gaussian random vector $x\colon \Omega \to \mathbb{C}^n$ is

$$p(x) = \frac{1}{\pi^n \det R_{xx}} \exp\left\{-(x - \mu_x)^H R_{xx}^{-1}(x - \mu_x)\right\}. \tag{2.49}$$
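The claim that (2.48) is simply the joint pdf of $u$ and $v$ written in complex form can be checked numerically. The following is a minimal sketch using NumPy. It assumes the usual block relations between the real covariance of $[u^T, v^T]^T$ and the pair $(R_{xx}, \tilde{R}_{xx})$, and all function and variable names are illustrative.

```python
import numpy as np

def complex_gaussian_pdf(x, mu, R, Rt):
    """Evaluate (2.48) from covariance R and complementary covariance Rt."""
    n = x.shape[0]
    R_aug = np.block([[R, Rt], [Rt.conj(), R.conj()]])  # augmented covariance
    d = x - mu
    d_aug = np.concatenate([d, d.conj()])                # augmented vector
    quad = np.real(d_aug.conj() @ np.linalg.solve(R_aug, d_aug))
    det = np.real(np.linalg.det(R_aug))
    return float(np.exp(-0.5 * quad) / (np.pi ** n * np.sqrt(det)))

def real_gaussian_pdf(z, mu_z, R_zz):
    """Ordinary real Gaussian pdf of the composite vector z = [u; v]."""
    d = z - mu_z
    quad = d @ np.linalg.solve(R_zz, d)
    norm = np.sqrt((2 * np.pi) ** z.shape[0] * np.linalg.det(R_zz))
    return float(np.exp(-0.5 * quad) / norm)

rng = np.random.default_rng(0)
n = 2
A = rng.standard_normal((2 * n, 2 * n))
R_zz = A @ A.T + np.eye(2 * n)            # some valid real covariance of [u; v]
Ruu, Ruv = R_zz[:n, :n], R_zz[:n, n:]
Rvu, Rvv = R_zz[n:, :n], R_zz[n:, n:]
R = Ruu + Rvv + 1j * (Rvu - Ruv)          # E[x x^H] for zero-mean x = u + jv
Rt = Ruu - Rvv + 1j * (Rvu + Ruv)         # E[x x^T]

z = rng.standard_normal(2 * n)            # an arbitrary evaluation point
x = z[:n] + 1j * z[n:]
p_complex = complex_gaussian_pdf(x, np.zeros(n), R, Rt)
p_real = real_gaussian_pdf(z, np.zeros(2 * n), R_zz)
print(p_complex, p_real)
```

Both evaluations should return the same density value up to rounding, for proper and improper $x$ alike.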

Let's go back to the general Gaussian pdf in Result 2.4. As in (A1.38) of Appendix 1, $\underline{R}_{xx}^{-1}$ may be factored as in (2.50). In this factorization, $W = \tilde{R}_{xx} R_{xx}^{-*}$ produces the linear minimum mean-squared error (LMMSE) estimate of $x$ from $x^*$ as

$$\hat{x} = W(x - \mu_x)^* + \mu_x, \tag{2.51}$$

and $\operatorname{tr} P = E\|x - \hat{x}\|^2$ is the corresponding LMMSE. From (A1.3) we find $\det \underline{R}_{xx} = \det R_{xx}^* \det P = \det R_{xx} \det P$, and, using (2.50), we may then factor the improper pdf $p(x)$ as

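$$p(x) = \frac{1}{\pi^n} \times \frac{1}{\det^{1/2} R_{xx}} \exp\left\{-\tfrac{1}{2}(x - \mu_x)^H R_{xx}^{-1}(x - \mu_x)\right\} \times \frac{1}{\det^{1/2} P} \exp\left\{-\tfrac{1}{2}(x - \hat{x})^H P^{-1}(x - \hat{x})\right\}. \tag{2.52}$$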

This expresses the improper Gaussian pdf $p(x)$ in terms of two factors: the first factor involves only $x$, its mean $\mu_x$, and its covariance matrix $R_{xx}$; and the second factor involves only the prediction error $x - \hat{x}$ and its covariance matrix $P$. These two factors are “almost” proper Gaussian pdfs, albeit with incorrect normalization constants and a factor of 1/2 in the quadratic form.
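The two-factor form (2.52) can be compared against the augmented form (2.48) numerically. The sketch below is hedged in two ways: it builds a hypothetical improper model $x = As + Bs^*$ from a proper unit-variance $s$ purely to obtain a valid pair $(R_{xx}, \tilde{R}_{xx})$, and it takes $P$ to be the Schur complement $R_{xx} - \tilde{R}_{xx} R_{xx}^{-*} \tilde{R}_{xx}^*$, i.e., the error covariance of the LMMSE estimate (2.51). That is an assumption about the factorization (2.50), which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2
# Hypothetical improper model: x = A s + B conj(s), with s proper and E[s s^H] = I.
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = 0.3 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
R = A @ A.conj().T + B @ B.conj().T     # covariance E[x x^H]
Rt = A @ B.T + B @ A.T                  # complementary covariance E[x x^T]
mu = np.array([0.5 - 0.2j, -1.0 + 0.3j])
x = np.array([0.1 + 0.4j, -0.6 - 0.7j])  # arbitrary evaluation point

W = Rt @ np.linalg.inv(R.conj())        # W = Rt R^{-*}
P = R - W @ Rt.conj()                   # assumed error covariance (Schur complement)
d = x - mu
x_hat = W @ d.conj() + mu               # LMMSE estimate of x from conj(x), as in (2.51)
e = x - x_hat

def quad(v, M):
    """Real-valued Hermitian form v^H M^{-1} v."""
    return np.real(v.conj() @ np.linalg.solve(M, v))

def rdet(M):
    """Determinant of a Hermitian matrix, returned as a real number."""
    return np.real(np.linalg.det(M))

# General pdf (2.48) in augmented form.
R_aug = np.block([[R, Rt], [Rt.conj(), R.conj()]])
d_aug = np.concatenate([d, d.conj()])
p_general = np.exp(-0.5 * quad(d_aug, R_aug)) / (np.pi ** n * np.sqrt(rdet(R_aug)))

# Factored pdf (2.52).
factor_x = np.exp(-0.5 * quad(d, R)) / np.sqrt(rdet(R))
factor_e = np.exp(-0.5 * quad(e, P)) / np.sqrt(rdet(P))
p_factored = factor_x * factor_e / np.pi ** n
print(p_general, p_factored)
```

Setting `B` to zero makes $x$ proper, in which case `W` vanishes and the two factors coincide.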

In Section 1.6.1, we found that the real bivariate Gaussian pdf $p(u, v)$ in (1.48) does indeed factor into a Gaussian pdf for the prediction error $u - \hat{u}$ and a Gaussian pdf for $v$. Importantly, in the real case, the error $u - \hat{u}$ and $v$ are independent. The difference in the complex case is that, although $x - \hat{x}$ and $x^*$ are uncorrelated, they cannot be independent because $x^*$ perfectly determines $x$ (through complex conjugation). If $x$ is proper, then $\hat{x} = \mu_x$, so that $W = 0$ and $P = R_{xx}$, and the two factors in (2.52) are identical. This makes the factor of 1/2 in the quadratic form disappear.

By employing the Woodbury identity (cf. (A1.43) in Appendix 1)

$$P^{-1} = R_{xx}^{-1} + W^T P^{-*} W^* \tag{2.53}$$

and

$$\det P = \det R_{xx} \det(I - WW^*), \tag{2.54}$$

we may find the following alternative expressions for $p(x)$:

$$p(x) = \frac{\det^{1/2}(I - WW^*)}{\pi^n \det P} \exp\left\{-(x - \mu_x)^H P^{-1}(x - \mu_x) + \operatorname{Re}\left[(x - \mu_x)^T P^{-*} W^* (x - \mu_x)\right]\right\}, \tag{2.55}$$

$$p(x) = \frac{1}{\pi^n (\det R_{xx} \det P)^{1/2}} \exp\left\{-(x - \mu_x)^H R_{xx}^{-1}(x - \mu_x)\right\} \times \exp\left\{-(x - \mu_x)^H W^T P^{-*} W^* (x - \mu_x) + \operatorname{Re}\left[(x - \mu_x)^T P^{-*} W^* (x - \mu_x)\right]\right\}. \tag{2.56}$$

Since the complex Gaussian pdf is simply a convenient way of expressing the joint pdf of real and imaginary parts, many results valid for the real case translate straightforwardly to the complex case. In particular, a linear or widely linear transformation of a Gaussian random vector (proper or improper) is again Gaussian (proper or improper). We note, however, that a widely linear transformation of a proper Gaussian will generally produce an improper Gaussian, and a widely linear transformation of an improper Gaussian may produce a proper Gaussian.
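The last remark can be made concrete in the scalar case. The sketch below assumes a zero-mean scalar $x$ with variance $R$ and complementary variance $\tilde{R}$, and a widely linear map $y = ax + bx^*$. The moment formulas in the code follow from expanding $E|y|^2$ and $E\,y^2$; the function name and the numbers are illustrative.

```python
def widely_linear_moments(a, b, R, Rt):
    """Variance and complementary variance of y = a*x + b*conj(x), zero-mean scalar x."""
    a, b, Rt = complex(a), complex(b), complex(Rt)
    Ry = (abs(a) ** 2 + abs(b) ** 2) * R + 2 * (a * b.conjugate() * Rt).real
    Rty = a * a * Rt + b * b * Rt.conjugate() + 2 * a * b * R
    return Ry, Rty

# Proper input (Rt = 0): the output is improper whenever a*b != 0.
Ry1, Rty1 = widely_linear_moments(1.0, 0.5, 1.0, 0.0)

# Improper input (Rt = 0.6): the particular choice b = -1/3 makes the output proper.
Ry2, Rty2 = widely_linear_moments(1.0, -1.0 / 3.0, 1.0, 0.6)
print(Ry1, Rty1, Ry2, Rty2)
```

In the second call, $b$ was chosen as a root of $b^2\tilde{R} + 2bR + \tilde{R} = 0$, which is what cancels the complementary variance of $y$.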

2.3.2 Conditional complex Gaussian distribution

If two real random vectors $z = [u^T, v^T]^T\colon \Omega \to \mathbb{R}^{2n}$ and $w = [a^T, b^T]^T\colon \Omega \to \mathbb{R}^{2m}$ are jointly Gaussian, then the conditional density for $z$ given $w$ is Gaussian,

$$p(z|w) = \frac{1}{(2\pi)^{2n/2}\det^{1/2} R_{zz|w}} \exp\left\{-\tfrac{1}{2}(z - \mu_{z|w})^T R_{zz|w}^{-1}(z - \mu_{z|w})\right\} \tag{2.57}$$

with conditional mean vector

$$\mu_{z|w} = \mu_z + R_{zw} R_{ww}^{-1}(w - \mu_w) \tag{2.58}$$

and conditional covariance matrix

$$R_{zz|w} = R_{zz} - R_{zw} R_{ww}^{-1} R_{zw}^T. \tag{2.59}$$

This result easily generalizes to the complex case. Let $x = u + jv\colon \Omega \to \mathbb{C}^n$ and $y = a + jb\colon \Omega \to \mathbb{C}^m$, and $\underline{y} = Tw$. Then the augmented conditional mean vector is

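$$\underline{\mu}_{x|y} = \begin{bmatrix} \mu_{x|y} \\ \mu_{x|y}^* \end{bmatrix} = T\mu_{z|w} = T\mu_z + (T R_{zw} T^H)(T^{-H} R_{ww}^{-1} T^{-1})\,T(w - \mu_w) = \underline{\mu}_x + \underline{R}_{xy}\underline{R}_{yy}^{-1}(\underline{y} - \underline{\mu}_y). \tag{2.60}$$

The augmented conditional covariance matrix is

$$\underline{R}_{xx|y} = T R_{zz|w} T^H = T R_{zz} T^H - (T R_{zw} T^H)(T^{-H} R_{ww}^{-1} T^{-1})(T R_{zw}^T T^H) = \underline{R}_{xx} - \underline{R}_{xy}\underline{R}_{yy}^{-1}\underline{R}_{xy}^H. \tag{2.61}$$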

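Therefore, the conditional pdf takes the general form

$$p(x|y) = \frac{1}{\pi^n \det^{1/2}\underline{R}_{xx|y}} \exp\left\{-\tfrac{1}{2}(\underline{x} - \underline{\mu}_{x|y})^H \underline{R}_{xx|y}^{-1}(\underline{x} - \underline{\mu}_{x|y})\right\}. \tag{2.62}$$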

Using the matrix inversion lemma for $\underline{R}_{xx|y}^{-1}$, it is possible to derive an expression that explicitly shows the dependence of $p(x|y)$ on $y$ and $y^*$. However, we shall postpone this until our discussion of widely linear estimation in Section 5.4.
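The step from (2.59) to (2.61) can be sanity-checked numerically. This is only a sketch: it assumes the real-to-complex map $T = \begin{bmatrix} I & jI \\ I & -jI \end{bmatrix}$ (sized $2n \times 2n$ for $x$ and $2m \times 2m$ for $y$), which is not restated in this section, and uses an arbitrary randomly generated joint covariance.

```python
import numpy as np

def T(k):
    """Assumed real-to-complex map: [x; conj(x)] = T(k) @ [u; v] for x of dimension k."""
    I = np.eye(k)
    return np.block([[I, 1j * I], [I, -1j * I]])

rng = np.random.default_rng(2)
n, m = 2, 1
G = rng.standard_normal((2 * (n + m), 2 * (n + m)))
C = G @ G.T + np.eye(2 * (n + m))        # joint real covariance of [z; w]
Rzz, Rzw, Rww = C[:2 * n, :2 * n], C[:2 * n, 2 * n:], C[2 * n:, 2 * n:]

# Real conditional covariance (2.59).
Rzz_w = Rzz - Rzw @ np.linalg.solve(Rww, Rzw.T)

# Augmented complex covariances obtained from the real ones.
Tn, Tm = T(n), T(m)
Rxx = Tn @ Rzz @ Tn.conj().T
Rxy = Tn @ Rzw @ Tm.conj().T
Ryy = Tm @ Rww @ Tm.conj().T

lhs = Tn @ Rzz_w @ Tn.conj().T                         # T R_{zz|w} T^H
rhs = Rxx - Rxy @ np.linalg.solve(Ryy, Rxy.conj().T)   # right-hand side of (2.61)
print(np.max(np.abs(lhs - rhs)))
```

The printed maximum deviation should be at rounding level, reflecting that the complex conditional covariance is just the real one mapped through $T$.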

Definition 2.2. Two complex random vectors $x$ and $y$ are called jointly proper if the composite vector $[x^T, y^T]^T$ is proper. This means they must be individually proper, $\tilde{R}_{xx} = 0$ and $\tilde{R}_{yy} = 0$, and also cross-proper, $\tilde{R}_{xy} = 0$.

If $x$ and $y$ are jointly proper, the conditional Gaussian density for $x$ given $y$ is

$$p(x|y) = \frac{1}{\pi^n \det R_{xx|y}} \exp\left\{-(x - \mu_{x|y})^H R_{xx|y}^{-1}(x - \mu_{x|y})\right\} \tag{2.63}$$

with mean

$$\mu_{x|y} = \mu_x + R_{xy} R_{yy}^{-1}(y - \mu_y) \tag{2.64}$$

and covariance matrix

$$R_{xx|y} = R_{xx} - R_{xy} R_{yy}^{-1} R_{xy}^H. \tag{2.65}$$

2.3.3 Scalar complex Gaussian distribution

The scalar complex Gaussian distribution is important enough to revisit in detail.

Consider a zero-mean scalar Gaussian random variable $x = u + jv$ with variance $R_{xx} = E|x|^2$ and complementary variance $\tilde{R}_{xx} = E\,x^2 = \rho R_{xx}$ with $|\rho| < 1$. The complex correlation coefficient $\rho$ between $x$ and $x^*$ is a measure for the degree of impropriety of $x$. From Result 2.4, the pdf of $x$ is

$$p(x) = \frac{1}{\pi R_{xx}\sqrt{1 - |\rho|^2}} \exp\left\{-\frac{|x|^2 - \operatorname{Re}(\rho x^{*2})}{R_{xx}(1 - |\rho|^2)}\right\}. \tag{2.66}$$
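A quick consistency check of (2.66): it should coincide with the real bivariate Gaussian pdf of $(u, v)$. The sketch below assumes the second-order relations $R_{uu} = R_{xx}(1 + \operatorname{Re}\rho)/2$, $R_{vv} = R_{xx}(1 - \operatorname{Re}\rho)/2$, and $R_{uv} = R_{xx}\operatorname{Im}\rho/2$, which are equivalent to (2.68) and (2.69). The function names and test values are illustrative.

```python
import cmath
import math

def scalar_pdf(x, Rxx, rho):
    """Zero-mean scalar complex Gaussian pdf as in (2.66); requires |rho| < 1."""
    s = 1.0 - abs(rho) ** 2
    quad = (abs(x) ** 2 - (rho * x.conjugate() ** 2).real) / (Rxx * s)
    return math.exp(-quad) / (math.pi * Rxx * math.sqrt(s))

def bivariate_pdf(u, v, Ruu, Rvv, Ruv):
    """Zero-mean real bivariate Gaussian pdf of (u, v)."""
    det = Ruu * Rvv - Ruv ** 2
    quad = (Rvv * u * u - 2.0 * Ruv * u * v + Ruu * v * v) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

Rxx = 2.0
rho = 0.5 * cmath.exp(1j * math.pi / 3)   # illustrative complex correlation coefficient
x = 0.7 - 0.4j                            # arbitrary evaluation point

Ruu = Rxx * (1.0 + rho.real) / 2.0
Rvv = Rxx * (1.0 - rho.real) / 2.0
Ruv = Rxx * rho.imag / 2.0

p_complex = scalar_pdf(x, Rxx, rho)
p_real = bivariate_pdf(x.real, x.imag, Ruu, Rvv, Ruv)
print(p_complex, p_real)
```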

Let $R_{uu}$ and $R_{vv}$ be the variances of the real part $u$ and imaginary part $v$, and $R_{uv}$ their cross-covariance. The correlation coefficient between $u$ and $v$ is

$$\rho_{uv} = \frac{R_{uv}}{\sqrt{R_{uu}}\sqrt{R_{vv}}}. \tag{2.67}$$

From (2.21) and (2.22) we know that

$$R_{uu} + R_{vv} = R_{xx}, \tag{2.68}$$

$$R_{uu} - R_{vv} + 2j\sqrt{R_{uu}}\sqrt{R_{vv}}\,\rho_{uv} = \rho R_{xx}. \tag{2.69}$$

So the complementary variance $\rho R_{xx}$ carries information about the variance mismatch $R_{uu} - R_{vv}$ in its real part and about the correlation between $u$ and $v$ in its imaginary part. There are now four different cases.

1. If $u$ and $v$ have identical variances, $R_{uu} = R_{vv} = R_{xx}/2$, and are independent, $\rho_{uv} = 0$, then $x$ is proper, i.e., $\rho = 0$. Its pdf is

$$p(x) = \frac{1}{\pi R_{xx}} e^{-|x|^2/R_{xx}}. \tag{2.70}$$

2. If $u$ and $v$ have different variances, $R_{uu} \neq R_{vv}$, but $u$ and $v$ are still independent, $\rho_{uv} = 0$, then $\rho$ is real, $\rho = (R_{uu} - R_{vv})/R_{xx}$, and $x$ is improper.

3. If $u$ and $v$ have identical variances, $R_{uu} = R_{vv} = R_{xx}/2$, but $u$ and $v$ are correlated, $\rho_{uv} \neq 0$, then $\rho$ is purely imaginary, $\rho = j\rho_{uv}$, and $x$ is improper.

4. We can combine these two possible sources of impropriety so that $u$ and $v$ have different variances, $R_{uu} \neq R_{vv}$, and are correlated, $\rho_{uv} \neq 0$. Then $\rho$ is generally complex.
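The four cases follow directly from (2.68) and (2.69), which the short sketch below evaluates for one illustrative set of real second-order moments per case. The function name and the numerical values are assumptions chosen for the example.

```python
import math

def rho_from_real_moments(Ruu, Rvv, rho_uv):
    """Complex correlation coefficient rho obtained from (2.68) and (2.69)."""
    Rxx = Ruu + Rvv                                              # (2.68)
    Rt = Ruu - Rvv + 2j * math.sqrt(Ruu) * math.sqrt(Rvv) * rho_uv  # (2.69)
    return Rt / Rxx

case1 = rho_from_real_moments(0.5, 0.5, 0.0)   # equal variances, uncorrelated
case2 = rho_from_real_moments(0.8, 0.2, 0.0)   # unequal variances, uncorrelated
case3 = rho_from_real_moments(0.5, 0.5, 0.4)   # equal variances, correlated
case4 = rho_from_real_moments(0.8, 0.2, 0.5)   # unequal variances, correlated
print(case1, case2, case3, case4)
```

Here `case2` comes out real, `case3` purely imaginary and equal to $j\rho_{uv}$, and `case4` has both a real and an imaginary part.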

With $x = re^{j\theta}$ and $\rho = |\rho|e^{j\psi}$, we see that the pdf $p(x)$ is constant on the contour (or level curve) $r^2[1 - |\rho|\cos(2\theta - \psi)] = K^2$. This contour is an ellipse, and $r$ is maximum when $\cos(2\theta - \psi)$ is maximum. This establishes that the ellipse orientation (the angle between the $u$-axis and the major ellipse axis) is $\theta = \psi/2$, which is half the angle of the complex correlation coefficient $\rho = |\rho|e^{j\psi}$. It is also not difficult to show (see Ollila (2008)) that $|\rho|$ is the square of the ellipse eccentricity. This is compelling evidence for the usefulness of the complex description. The real description, in terms of $R_{uu}$, $R_{vv}$, and the correlation coefficient $\rho_{uv}$ between $u$ and $v$, is not nearly as insightful.

Figure 2.1 Probability-density contours of complex Gaussian random variables with different ρ.
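The orientation result can be checked with a small grid search. On a circle of fixed radius, the density (2.66) is largest in the direction in which the elliptical contours extend farthest, i.e., along the major axis. The sketch below scans angles in $[0, \pi)$ and reports the maximizing angle; the function names and grid resolution are arbitrary choices.

```python
import cmath
import math

def density_exponent(x, rho, Rxx=1.0):
    """Exponent of the scalar pdf (2.66); a larger value means a larger density."""
    return -(abs(x) ** 2 - (rho * x.conjugate() ** 2).real) / (Rxx * (1.0 - abs(rho) ** 2))

def major_axis_angle(rho, steps=3600):
    """Angle in [0, pi) at which the density on the unit circle is largest."""
    best = max(
        range(steps),
        key=lambda k: density_exponent(cmath.exp(1j * math.pi * k / steps), rho),
    )
    return math.pi * best / steps

psi = 2.0 * math.pi / 3.0
angle = major_axis_angle(0.6 * cmath.exp(1j * psi))
print(math.degrees(angle), math.degrees(psi / 2.0))
```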

Example 2.3. Figure 2.1 shows contours of constant probability density for cases 1–4 listed above. In plot (a), we see the proper case with $\rho = 0$, which exhibits circular contour lines. All remaining plots are improper, with elliptical contour lines. We can make two observations. First, increasing the degree of impropriety of the signal by increasing $|\rho|$ leads to ellipses with greater eccentricity. Secondly, the angle of the ellipse orientation is half the angle of $\rho$, as proved above.

In plots (b) and (c), we have case 2: $u$ and $v$ have different variances but are still independent. In this situation, the ellipse orientation is either 0° or 90°, depending on whether $u$ or $v$ has greater variance. Plots (d) and (e) show case 3: $u$ and $v$ have the same variance but are now correlated. In this situation, the ellipse orientation is either 45° or 135°. The general case, case 4, is depicted in plot (f). Now the ellipse can have an arbitrary orientation $\psi/2$, which is controlled by the angle of $\rho = |\rho|e^{j\psi}$.


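With $u = r\cos\theta$, $v = r\sin\theta$, $du\,dv = r\,dr\,d\theta$, it is possible to change variables and obtain the pdf for the polar coordinates $(r, \theta)$

$$p_{r\theta}(r, \theta) = \frac{r}{\pi R_{xx}\sqrt{1 - |\rho|^2}} \exp\left\{-\frac{r^2[1 - |\rho|\cos(2\theta - \psi)]}{R_{xx}(1 - |\rho|^2)}\right\}. \tag{2.71}$$

Integrating over $\theta$ yields the marginal pdf for $r$,

$$p_r(r) = \frac{2r}{R_{xx}\sqrt{1 - |\rho|^2}} \exp\left\{-\frac{r^2}{R_{xx}(1 - |\rho|^2)}\right\} I_0\!\left(\frac{r^2|\rho|}{R_{xx}(1 - |\rho|^2)}\right), \tag{2.72}$$

where $I_0$ is the modified Bessel function of the first kind of order 0:

$$I_0(z) = \frac{1}{\pi}\int_0^{\pi} e^{z\cos\theta}\,d\theta. \tag{2.73}$$

This pdf is invariant with respect to $\psi$. It is plotted in Fig. 1.9 in Section 1.6 for several values of $|\rho|$. For $\rho = 0$, it is the Rayleigh pdf

$$p_r(r) = \frac{2r}{R_{xx}} e^{-r^2/R_{xx}}. \tag{2.74}$$

This suggests that we call $p_r(r)$ in (2.72) the improper Rayleigh pdf.⁵ Integrating $p_{r\theta}(r, \theta)$ over $r$ yields the marginal pdf for $\theta$:

$$p_\theta(\theta) = \frac{\sqrt{1 - |\rho|^2}}{2\pi[1 - |\rho|\cos(2\theta - \psi)]}. \tag{2.75}$$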
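The angle marginal can be cross-checked by numerical integration. The sketch below assumes the joint polar density $r\,p(re^{j\theta})$ obtained from (2.66) and the closed form $\sqrt{1 - |\rho|^2}\,/\,(2\pi[1 - |\rho|\cos(2\theta - \psi)])$ that results from integrating it over $r$. The helper names, the truncation of the $r$-integral at 40, and the parameter values are arbitrary choices for illustration.

```python
import math

def joint_polar_pdf(r, theta, Rxx, rho_abs, psi):
    """Joint pdf of (r, theta): r times the scalar pdf (2.66) at x = r*exp(j*theta)."""
    s = 1.0 - rho_abs ** 2
    c = (1.0 - rho_abs * math.cos(2.0 * theta - psi)) / (Rxx * s)
    return r * math.exp(-c * r * r) / (math.pi * Rxx * math.sqrt(s))

def angle_pdf(theta, rho_abs, psi):
    """Assumed closed-form marginal pdf of the angle theta."""
    return math.sqrt(1.0 - rho_abs ** 2) / (
        2.0 * math.pi * (1.0 - rho_abs * math.cos(2.0 * theta - psi))
    )

def integrate(f, a, b, steps=20000):
    """Plain midpoint rule on [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

Rxx, rho_abs, psi, theta0 = 1.5, 0.7, 1.0, 0.3
numeric = integrate(lambda r: joint_polar_pdf(r, theta0, Rxx, rho_abs, psi), 0.0, 40.0)
closed = angle_pdf(theta0, rho_abs, psi)
total = integrate(lambda t: angle_pdf(t, rho_abs, psi), 0.0, 2.0 * math.pi, steps=2000)
print(numeric, closed, total)
```

Note that the angle marginal does not depend on $R_{xx}$, and that it peaks at $\theta = \psi/2$ and $\theta = \psi/2 - \pi$, the two directions of the major axis.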

If $|\rho| = 1$, $x$ is a singular random variable because the support of the pdf $p(x)$ collapses to a line in the complex plane and the pdf (2.66) must be expressed using a Dirac δ-function. This case is called maximally improper (terms used by other researchers are rectilinear and strict-sense noncircular). If $x$ is maximally improper, we can express it as $x = ae^{j\psi/2} = a\cos(\psi/2) + ja\sin(\psi/2)$, where $a$ is a real Gaussian random variable with zero mean and variance $R_{xx}$. Hence, the radius-squared $r^2$ of $x/\sqrt{R_{xx}}$ is $\chi^2$-distributed with one degree of freedom, and the angle $\theta$ takes on values $\psi/2$ and $\psi/2 - \pi$, each with probability equal to 1/2.

2.3.4 Complex elliptical distribution

A generalization of the Gaussian distribution, which has found some interesting applications in communications, is the family of elliptical distributions. We could proceed as in the Gaussian case by starting with the pdf of an elliptical distribution for a composite real random vector $z$ and then deriving an expression in terms of complex $x$. Instead, we will directly modify the improper Gaussian pdf (2.48) by replacing the exponential function with a nonnegative function $g\colon [0, \infty) \to [0, \infty)$, called the pdf generator, that satisfies $\int_0^\infty t^{n-1} g(t)\,dt < \infty$. This necessitates two changes: First, since we do not yet know the second-order moments of $x$, the matrix used in the expression for the pdf may no longer be the augmented covariance matrix of $x$. Hence, instead of $\underline{R}_{xx}$, we use the augmented generating matrix

$$\underline{H}_{xx} = \begin{bmatrix} H_{xx} & \tilde{H}_{xx} \\ \tilde{H}_{xx}^* & H_{xx}^* \end{bmatrix}$$

to denote an arbitrary augmented positive definite matrix of size $2n \times 2n$. Secondly, we need to introduce a normalizing constant $c_n$ to ensure that $p(x)$ is a valid pdf that integrates to 1.

We now state the general form of the complex elliptical pdf, which is a straightforward generalization of the real elliptical pdf, due to Ollila and Koivunen (2004).

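Definition 2.3. The pdf of a complex elliptical random vector $x\colon \Omega \to \mathbb{C}^n$ is

$$p(x) = \frac{c_n}{\det^{1/2}\underline{H}_{xx}}\, g\!\left((\underline{x} - \underline{\mu}_x)^H \underline{H}_{xx}^{-1}(\underline{x} - \underline{\mu}_x)\right). \tag{2.76}$$

The normalizing constant $c_n$ is given by

$$c_n = \frac{2^{n-1}(n - 1)!}{\pi^n \int_0^\infty t^{2n-1} g(t^2)\,dt}. \tag{2.77}$$

For the Gaussian generator $g(t) = \exp(-t/2)$, the integral equals $2^{n-1}(n - 1)!$, so that $c_n = 1/\pi^n$, in agreement with (2.48).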

If the mean exists, the parameter $\underline{\mu}_x$ in the pdf (2.76) is the augmented mean of $x$. This indicates that the mean is independent of the choice of pdf generator. However, there are distributions for which some or all moments are undefined. For instance, none of the moments exist for the Cauchy distribution, which belongs to the family of elliptical distributions. In this case, $\underline{\mu}_x$ should be treated simply as a parameter of the pdf but not its augmented mean.

Since the complex elliptical pdf (2.76) contains the same quadratic form as the complex Gaussian pdf (2.48), we can obtain straightforward analogs of (2.55) and (2.56), which are expressions in terms of $x$ and $x^*$. We can also write down the expression for the pdf of an elliptical random vector with zero complementary generating matrix.

Result 2.6. The pdf of a complex elliptical random vector $x\colon \Omega \to \mathbb{C}^n$ with $\tilde{H}_{xx} = 0$ is

$$p(x) = \frac{c_n}{\det H_{xx}}\, g\!\left(2(x - \mu_x)^H H_{xx}^{-1}(x - \mu_x)\right), \tag{2.78}$$

with normalizing constant $c_n$ given by (2.77).

A good overview of real and complex elliptical distributions is given by Fang et al. (1990). However, Fang et al. consider only complex elliptical distributions with zero complementary generating matrix. Thus, Ollila and Koivunen (2004) refer to the general complex elliptical pdf in Definition 2.3 as a “generalized complex elliptical” pdf.

The family of complex elliptical distributions contains some important subclasses of distributions.

• The complex multivariate Gaussian distribution, for pdf generator $g(t) = \exp(-t/2)$.

• The complex multivariate t-distribution, with pdf given by

$$p(x) = \frac{2^n\,\Gamma(n + k/2)}{(\pi k)^n\,\Gamma(k/2)\,\det^{1/2}\underline{H}_{xx}} \left[1 + k^{-1}(\underline{x} - \underline{\mu}_x)^H \underline{H}_{xx}^{-1}(\underline{x} - \underline{\mu}_x)\right]^{-n-k/2}, \tag{2.79}$$

where $k$ is an integer. We note that the Gamma function satisfies $\Gamma(n) = (n - 1)!$ if $n$ is a positive integer.

• The complex multivariate Cauchy distribution, which is a special case of the complex multivariate t-distribution with $k = 1$. Its pdf is

$$p(x) = \frac{2^n\,\Gamma(n + 1/2)}{\pi^{n+1/2}\,\det^{1/2}\underline{H}_{xx}} \left[1 + (\underline{x} - \underline{\mu}_x)^H \underline{H}_{xx}^{-1}(\underline{x} - \underline{\mu}_x)\right]^{-n-1/2}. \tag{2.80}$$

None of the moments of the Cauchy distribution exist.

Figure 2.2 Probability-density contours of complex Cauchy random variables with different ρ.

Example 2.4. Similarly to Example 2.3, consider a scalar complex Cauchy pdf with $\mu_x = 0$ (which is the median but not the mean, since the mean does not exist) and augmented generator matrix

Just as for the scalar complex Gaussian pdf, the scalar complex Cauchy pdf is constant on elliptical contours whose major axis is oriented at the angle $\psi/2$, half the angle of $\rho = |\rho|e^{j\psi}$. Figure 2.2 shows contours of constant probability density for three Cauchy random variables. Plots (a), (b), and (c) in this figure should be compared with plots (a), (d), and (f) in Fig. 2.1, respectively, for the Gaussian case. The main difference is that Cauchy random variables have much heavier tails than Gaussian random variables. From the expression (2.82), it is straightforward to write down the joint pdf in polar coordinates $(r, \theta)$, and not quite as straightforward to integrate with respect to $r$ or $\theta$ to obtain the marginal pdfs.
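To give a feel for how much heavier the Cauchy tails are, the sketch below compares $\operatorname{Prob}(|x| > r)$ for a scalar proper Gaussian, (2.49) with $R_{xx} = 1$, and a scalar Cauchy, (2.80) with $H_{xx} = 1$ and zero complementary generating matrix. These parameter choices are assumptions made for simplicity (circular contours), and the closed-form tails $e^{-r^2}$ and $(1 + 2r^2)^{-1/2}$ quoted in the comments were derived for this special case only.

```python
import math

def gaussian_pdf(r):
    """(2.49) with n = 1, R_xx = 1, as a function of r = |x|."""
    return math.exp(-r * r) / math.pi

def cauchy_pdf(r):
    """(2.80) with n = 1, H_xx = 1, zero complementary generating matrix."""
    c = 2.0 * math.gamma(1.5) / math.pi ** 1.5
    return c * (1.0 + 2.0 * r * r) ** -1.5

def tail(pdf, r, steps=20000):
    """Prob(|x| > r), computed as 1 minus a midpoint-rule integral over the disc."""
    h = r / steps
    inside = h * sum(
        2.0 * math.pi * (i + 0.5) * h * pdf((i + 0.5) * h) for i in range(steps)
    )
    return 1.0 - inside

r0 = 3.0
gauss_tail = tail(gaussian_pdf, r0)     # closed form for this case: exp(-r0**2)
cauchy_tail = tail(cauchy_pdf, r0)      # closed form for this case: (1 + 2*r0**2)**-0.5
print(gauss_tail, cauchy_tail)
```

At $r = 3$ the Gaussian tail probability is of order $10^{-4}$, while the Cauchy tail probability is still above 0.2.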


Complex elliptical distributions have a number of desirable properties.

• The mean of $x$ (if it exists) is independent of the choice of pdf generator $g$.

• The augmented covariance matrix of $x$ (if it exists) is proportional to the augmented generating matrix $\underline{H}_{xx}$. The proportionality factor depends on $g$, and is most easily determined using the characteristic function of $x$. Therefore, if the second-order moments exist, the pdf (2.76) is the pdf of a complex improper elliptical random vector, and the pdf (2.78) for $\tilde{H}_{xx} = 0$ is the pdf of a complex proper elliptical random vector with $\tilde{R}_{xx} = 0$. This is due to Result 2.8, to be discussed in Section 2.5.

• All marginal distributions are also elliptical. If $y\colon \Omega \to \mathbb{C}^m$ contains components of $x\colon \Omega \to \mathbb{C}^n$, $m < n$, then $y$ is elliptical with the same $g$ as $x$, and $\underline{\mu}_y$ contains the corresponding components of $\underline{\mu}_x$. The augmented generating matrix of $y$, $\underline{H}_{yy}$, is the sub-matrix of $\underline{H}_{xx}$ that corresponds to the components that $y$ extracts from $x$.

• Let $\underline{y} = \underline{M}\,\underline{x} + \underline{b}$, where $\underline{M}$ is a given $2m \times 2n$ augmented matrix and $\underline{b}$ is a given $2m \times 1$ augmented vector. Then $y$ is elliptical with the same $g$ as $x$ and $\underline{\mu}_y = \underline{M}\,\underline{\mu}_x + \underline{b}$ and $\underline{H}_{yy} = \underline{M}\,\underline{H}_{xx}\underline{M}^H$.

• If $x\colon \Omega \to \mathbb{C}^n$ and $y\colon \Omega \to \mathbb{C}^m$ are jointly elliptically distributed with pdf generator $g$, then the conditional distribution of $x$ given $y$ is also elliptical with

$$\underline{\mu}_{x|y} = \underline{\mu}_x + \underline{H}_{xy}\underline{H}_{yy}^{-1}(\underline{y} - \underline{\mu}_y), \tag{2.83}$$

which is analogous to the Gaussian case (2.60), and augmented conditional generating matrix

$$\underline{H}_{xx|y} = \underline{H}_{xx} - \underline{H}_{xy}\underline{H}_{yy}^{-1}\underline{H}_{xy}^H, \tag{2.84}$$

which is analogous to the Gaussian case (2.61). However, the conditional distribution will in general have a different pdf generator than $x$ and $y$.

In document Statistical Signal Processing of Complex-Valued Data.pdf (Page 60-69)