Linear Algebra & Properties of the Covariance Matrix
October 3rd, 2012
Estimation of r̄ and C
Let $r_{n1}, r_{n2}, \dots, r_{nT}$ be the historical return rates on the $n$th asset,
$$r_n = \begin{pmatrix} r_{n1} & r_{n2} & \dots & r_{nT} \end{pmatrix}, \qquad n = 1, 2, \dots, N.$$
Estimation of r̄ and C
The expected return is approximated by
$$\hat{\bar r}_n = \frac{1}{T}\sum_{t=1}^{T} r_{nt},$$
and the covariance is approximated by
$$\hat c_{mn} = \frac{1}{T-1}\sum_{t=1}^{T}\left(r_{nt} - \hat{\bar r}_n\right)\left(r_{mt} - \hat{\bar r}_m\right),$$
or in matrix/vector notation
$$\hat C = \frac{1}{T-1}\sum_{t=1}^{T}\left(r_t - \hat{\bar r}\right)\left(r_t - \hat{\bar r}\right)^\top, \qquad \text{an outer product.}$$
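A minimal numpy sketch of these two estimators; the simulated return array `R`, its dimensions, and the RNG seed are made-up example values, not data from the lecture.

```python
import numpy as np

# Assumed example data: N = 4 assets observed over T = 250 periods.
rng = np.random.default_rng(0)
N, T = 4, 250
R = rng.normal(0.001, 0.02, size=(N, T))   # R[n, t] = r_{nt}

r_hat = R.mean(axis=1)                      # estimated expected returns, shape (N,)

# Sample covariance with the 1/(T-1) normalization, built from outer products.
D = R - r_hat[:, None]                      # deviations r_{nt} - r_hat_n
C_hat = (D @ D.T) / (T - 1)                 # estimated covariance, shape (N, N)

# np.cov uses the same 1/(T-1) convention, so the two should agree.
assert np.allclose(C_hat, np.cov(R))
```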
Importance of C
The expected return $\bar r$ is often estimated exogenously (i.e. not statistically estimated). But
statistical estimation of $C$ is an important topic, is used in practice,
and is the backbone of many finance strategies.
A major part of Markowitz theory is the assumption that $C$, being a covariance matrix, must be symmetric positive definite (SPD).
Real Matrices/Vectors
Let's focus on real matrices, and only use complex numbers if necessary. A vector $x \in \mathbb{R}^N$ and a matrix $A \in \mathbb{R}^{M\times N}$ are
$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}
\qquad\text{and}\qquad
A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1N} \\ a_{21} & a_{22} & \dots & a_{2N} \\ \vdots & & \ddots & \vdots \\ a_{M1} & a_{M2} & \dots & a_{MN} \end{pmatrix}.$$
Matrix/Vector Multiplication
The matrix A times a vector x is a new vector
$$Ax = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1N} \\ a_{21} & a_{22} & \dots & a_{2N} \\ \vdots & & \ddots & \vdots \\ a_{M1} & a_{M2} & \dots & a_{MN} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}
= \begin{pmatrix} a_{11}x_1 + a_{12}x_2 + \dots + a_{1N}x_N \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2N}x_N \\ \vdots \\ a_{M1}x_1 + a_{M2}x_2 + \dots + a_{MN}x_N \end{pmatrix} \in \mathbb{R}^M.$$
Matrix/matrix multiplication is an extension of this.
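A small numpy check of the component-wise formula above against the built-in product (the matrix `A` and vector `x` are arbitrary example values):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])          # A is 2x3, so Ax lives in R^2
x = np.array([1.0, 0.0, -1.0])

# Built-in matrix/vector product versus the explicit component formula.
y_builtin = A @ x
y_manual = np.array([sum(A[i, j] * x[j] for j in range(A.shape[1]))
                     for i in range(A.shape[0])])
assert np.allclose(y_builtin, y_manual)
```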
Matrix/Vector transpose
Their transposes are
$$x^\top = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{pmatrix}^{\!\top} = (x_1, x_2, \dots, x_N)$$
and
$$A^\top = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1N} \\ a_{21} & a_{22} & \dots & a_{2N} \\ \vdots & & \ddots & \vdots \\ a_{M1} & a_{M2} & \dots & a_{MN} \end{pmatrix}^{\!\top}
= \begin{pmatrix} a_{11} & a_{21} & \dots & a_{M1} \\ a_{12} & a_{22} & \dots & a_{M2} \\ \vdots & & \ddots & \vdots \\ a_{1N} & a_{2N} & \dots & a_{MN} \end{pmatrix}.$$
Note that $(Ax)^\top = x^\top A^\top$.
Vector Inner/outer Product
The inner product of a vector $x \in \mathbb{R}^N$ with another vector $y \in \mathbb{R}^N$ is
$$x^\top y = y^\top x = x_1 y_1 + x_2 y_2 + \dots + x_N y_N.$$
There is also the outer product,
$$xy^\top = \begin{pmatrix} x_1 y_1 & x_1 y_2 & \dots & x_1 y_N \\ x_2 y_1 & x_2 y_2 & \dots & x_2 y_N \\ \vdots & & \ddots & \vdots \\ x_N y_1 & x_N y_2 & \dots & x_N y_N \end{pmatrix} \neq yx^\top.$$
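The same two products in numpy, with arbitrary example vectors:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

inner = x @ y                 # scalar: 1*4 + 2*5 + 3*6 = 32
outer = np.outer(x, y)        # 3x3 matrix with entries x_i * y_j

assert np.isclose(inner, y @ x)                 # inner product is symmetric
assert not np.allclose(outer, np.outer(y, x))   # outer product is not
```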
Matrix/Vector Norm
The norm on the vector space $\mathbb{R}^N$ is $\|\cdot\|$, and for any $x \in \mathbb{R}^N$ we have
$$\|x\| = \sqrt{x^\top x}.$$
Properties of the norm:
$\|x\| \ge 0$ for any $x$, and $\|x\| = 0$ iff $x = 0$,
$\|x + y\| \le \|x\| + \|y\|$ (triangle inequality),
$|x^\top y| \le \|x\|\,\|y\|$ with equality iff $x = ay$ for some $a \in \mathbb{R}$ (Cauchy-Schwarz),
$\|x + y\|^2 = \|x\|^2 + \|y\|^2$ if $x^\top y = 0$ (Pythagorean theorem).
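A quick numerical check of these properties on randomly drawn example vectors (the projection step used for the Pythagorean check is my own construction, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
x, y = rng.normal(size=5), rng.normal(size=5)

norm = lambda v: np.sqrt(v @ v)                      # ||v|| = sqrt(v^T v)

assert norm(x + y) <= norm(x) + norm(y)              # triangle inequality
assert abs(x @ y) <= norm(x) * norm(y)               # Cauchy-Schwarz

# Pythagorean theorem: make y orthogonal to x, then check the norms.
y_perp = y - (x @ y) / (x @ x) * x                   # project out the x component
assert np.isclose(x @ y_perp, 0.0)
assert np.isclose(norm(x + y_perp)**2, norm(x)**2 + norm(y_perp)**2)
```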
Linear Independence & Orthogonality
Two vectors $x, y$ are linearly independent if there is no scalar constant $a \in \mathbb{R}$ s.t. $y = ax$. These two vectors are orthogonal if
$$x^\top y = 0.$$
Clearly, orthogonality $\Rightarrow$ linear independence (for nonzero vectors).
Span
A set of vectors $x_1, x_2, \dots, x_n$ is said to span a subspace $V \subset \mathbb{R}^N$ if for any $y \in V$ there are constants $a_1, a_2, \dots, a_n$ such that
$$y = a_1 x_1 + a_2 x_2 + \dots + a_n x_n.$$
We write $\operatorname{span}(x_1, \dots, x_n) = V$. In particular, a set $x_1, x_2, \dots, x_N$ will span $\mathbb{R}^N$ iff they are linearly independent:
$$\operatorname{span}(x_1, x_2, \dots, x_N) = \mathbb{R}^N \;\Leftrightarrow\; \text{no } x_i \text{ is a linear combination of the others}.$$
If so, then $x_1, x_2, \dots, x_N$ are a basis for $\mathbb{R}^N$.
Orthogonal Basis
Definition
A basis of $\mathbb{R}^N$ is orthogonal if all its elements are mutually orthogonal:
$$\operatorname{span}(x_1, x_2, \dots, x_N) = \mathbb{R}^N \quad\text{and}\quad x_i^\top x_j = 0 \;\; \forall\, i \neq j.$$
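A small numpy sketch: the columns of `X` below are an example orthogonal basis of $\mathbb{R}^3$; full rank certifies linear independence (hence spanning), and a diagonal Gram matrix certifies orthogonality:

```python
import numpy as np

# Columns (1,1,1), (1,-1,0), (1,1,-2) are mutually orthogonal (example values).
X = np.array([[1.0,  1.0,  1.0],
              [1.0, -1.0,  1.0],
              [1.0,  0.0, -2.0]])

# Full rank <=> the columns are linearly independent <=> they span R^3.
assert np.linalg.matrix_rank(X) == 3

# Pairwise orthogonality: X^T X should be diagonal.
G = X.T @ X
assert np.allclose(G - np.diag(np.diag(G)), 0.0)
```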
Invertibility
Definition
A square matrix $A \in \mathbb{R}^{N\times N}$ is invertible if for any $b \in \mathbb{R}^N$ there exists $x$ s.t.
$$Ax = b.$$
If so, then $x = A^{-1}b$.
$A$ is invertible if any of the following hold:
$A$ has rows that are linearly independent,
$A$ has columns that are linearly independent,
$\det(A) \neq 0$,
there is no non-zero vector $x$ s.t. $Ax = 0$.
In fact, these 4 statements are equivalent.
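These equivalences can be checked numerically; the matrix `A` and right-hand side `b` below are arbitrary example values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])            # example invertible matrix
b = np.array([1.0, 2.0])

assert np.linalg.matrix_rank(A) == A.shape[0]   # independent rows/columns
assert not np.isclose(np.linalg.det(A), 0.0)    # non-zero determinant

x = np.linalg.solve(A, b)                       # x = A^{-1} b without forming A^{-1}
assert np.allclose(A @ x, b)
```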
Eigenvalues
Let $A$ be an $N \times N$ matrix. A scalar $\lambda \in \mathbb{C}$ is an eigenvalue of $A$ if
$$Ax = \lambda x$$
for some non-zero vector $x = \operatorname{re}(x) + \sqrt{-1}\,\operatorname{im}(x) \in \mathbb{C}^N$. If so, $x$ is an eigenvector.
By the 4th statement on the previous slide, $A^{-1}$ exists $\Leftrightarrow$ zero is not an eigenvalue.
Eigenvalue Diagonalization
Let $A$ be an $N \times N$ matrix with $N$ linearly independent eigenvectors. Then there is an $N \times N$ matrix $X$ and an $N \times N$ diagonal matrix $\Lambda$ such that
$$A = X \Lambda X^{-1},$$
where $\Lambda_{ii}$ is an eigenvalue of $A$, and the columns of $X = [x_1, x_2, \dots, x_N]$ are eigenvectors,
$$Ax_i = \Lambda_{ii} x_i.$$
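A numpy sketch of the diagonalization, using an arbitrary example matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])                    # example matrix with a full eigenbasis

eigvals, X = np.linalg.eig(A)                 # columns of X are eigenvectors
Lam = np.diag(eigvals)

# A = X Lambda X^{-1}
assert np.allclose(A, X @ Lam @ np.linalg.inv(X))

# Each column satisfies A x_i = lambda_i x_i.
for i in range(A.shape[0]):
    assert np.allclose(A @ X[:, i], eigvals[i] * X[:, i])
```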
Real Symmetric Matrices
A matrix is symmetric if $A^\top = A$.
Proposition
A symmetric matrix has real eigenvalues.
pf: If $Ax = \lambda x$ then
$$\lambda^2\|x\|^2 = \lambda^2 x^* x = x^* A^2 x = x^* A^\top A x = \|Ax\|^2,$$
which is real and nonnegative (here $x^* = \operatorname{re}(x)^\top - \sqrt{-1}\,\operatorname{im}(x)^\top$, i.e. the conjugate transpose). Hence $\lambda^2 \ge 0$, so $\lambda$ cannot be imaginary or complex.
Real Symmetric Matrices
Proposition
Let $A = A^\top$ be a real matrix. Then $A$'s eigenvectors form an orthogonal basis of $\mathbb{R}^N$, and the diagonalization is simplified:
$$A = X \Lambda X^\top, \quad\text{i.e. } X^{-1} = X^\top$$
(with the eigenvectors normalized to unit length).
Basic idea of proof: Suppose that $Ax = \lambda x$ and $Ay = \alpha y$ with $\alpha \neq \lambda$. Then
$$\lambda x^\top y = x^\top A^\top y = x^\top A y = \alpha x^\top y,$$
and so it must be that $x^\top y = 0$.
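A numpy check of the proposition using `np.linalg.eigh`, the routine specialized for symmetric matrices; the symmetrized random matrix is an example construction:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.normal(size=(4, 4))
A = (B + B.T) / 2                      # symmetrize an arbitrary matrix

eigvals, X = np.linalg.eigh(A)         # eigh is specialized for symmetric matrices

assert np.all(np.isreal(eigvals))                      # real eigenvalues
assert np.allclose(X.T @ X, np.eye(4))                 # orthonormal eigenvectors: X^{-1} = X^T
assert np.allclose(A, X @ np.diag(eigvals) @ X.T)      # A = X Lambda X^T
```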
Skew Symmetric Matrices
A matrix is skew symmetric if $A^\top = -A$.
Proposition
A skew symmetric matrix has purely imaginary eigenvalues.
pf: If $Ax = \lambda x$ then
$$\lambda^2\|x\|^2 = \lambda^2 x^* x = x^* A^2 x = -x^* A^\top A x = -\|Ax\|^2 \le 0.$$
Hence $\lambda^2 \le 0$, and so $\lambda$ must be purely imaginary (or zero).
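A quick check on a randomly generated example: skew-symmetrizing a matrix and confirming its eigenvalues have (numerically) zero real part:

```python
import numpy as np

rng = np.random.default_rng(3)
B = rng.normal(size=(4, 4))
A = (B - B.T) / 2                      # skew-symmetric: A^T = -A

eigvals = np.linalg.eigvals(A)
# All eigenvalues should be purely imaginary (real parts ~ 0).
assert np.allclose(eigvals.real, 0.0)
```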
Positive Definiteness
Definition
A matrix $A \in \mathbb{R}^{N\times N}$ is positive definite iff
$$x^\top A x > 0 \quad \forall x \in \mathbb{R}^N \setminus \{0\}.$$
It is only positive semi-definite iff
$$x^\top A x \ge 0 \quad \forall x \in \mathbb{R}^N.$$
Positive Definiteness ⇒ Symmetric
Proposition
If $A \in \mathbb{R}^{N\times N}$ is positive definite, then $A^\top = A$, i.e. it is symmetric positive definite (SPD).
pf: For any $x \in \mathbb{R}^N$ we have
$$0 < x^\top A x = \frac{1}{2}\Big( x^\top \underbrace{(A + A^\top)}_{\text{sym.}}\, x \;+\; x^\top \underbrace{(A - A^\top)}_{\text{skew-sym.}}\, x \Big),$$
but if $A - A^\top \neq 0$ then there exists an eigenvector $x$ s.t. $(A - A^\top)x = \alpha x$ where $\alpha \in \mathbb{C} \setminus \mathbb{R}$, which contradicts the inequality.
Eigenvalue of an SPD Matrix
Proposition
If A is SPD, then it has all positive eigenvalues.
pf: By definition, for any $y \in \mathbb{R}^N \setminus \{0\}$ it holds that $y^\top A y > 0$. In particular, for any $\lambda$ and $x \neq 0$ s.t. $Ax = \lambda x$, we have
$$\lambda\|x\|^2 = x^\top A x > 0,$$
and so $\lambda > 0$.
Consequence: for $A$ SPD there is no non-zero $x$ s.t. $Ax = 0$, and so $A^{-1}$ exists.
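A numpy sketch on an example SPD matrix, built as $BB^\top$ plus a small multiple of the identity (my own construction): all eigenvalues are positive, the Cholesky factorization exists, and linear systems can be solved:

```python
import numpy as np

rng = np.random.default_rng(4)
B = rng.normal(size=(5, 3))
A = B @ B.T + 0.1 * np.eye(5)          # SPD by construction

eigvals = np.linalg.eigvalsh(A)
assert np.all(eigvals > 0)             # all eigenvalues positive

L = np.linalg.cholesky(A)              # Cholesky succeeds only for SPD matrices
assert np.allclose(L @ L.T, A)

x = np.linalg.solve(A, np.ones(5))     # A is invertible, so this solve is well-posed
assert np.allclose(A @ x, np.ones(5))
```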
So Can/How Do We Invert C ?
The invertibility of the covariance matrix is very important:
we’ve seen a need for it in the efficient frontiers and Markowitz problem,
if it's not invertible then there are redundant assets,
we'll need it to be invertible when we discuss multivariate Gaussian distributions.
If there is no risk-free asset, our covariance matrix is practically invertible by definition:
$$0 < \operatorname{var}\Big(\sum_i w_i r_i\Big) = \sum_{i,j} w_i w_j C_{ij} = w^\top C w,$$
where $w \in \mathbb{R}^N \setminus \{0\}$ is any allocation vector ($w$ need not sum to 1).
So How Do We Know $\hat C$ is a Covariance Matrix?
Recall, $\hat C = \frac{1}{T-1}\sum_{t=1}^{T} (r_t - \hat{\bar r})(r_t - \hat{\bar r})^\top$. Each summand is symmetric,
$$(r_t - \hat{\bar r})(r_t - \hat{\bar r})^\top \ \text{is symmetric},$$
and a sum of symmetric matrices is symmetric. But each summand is not invertible, because there exists a vector $y$ s.t. $y^\top(r_t - \hat{\bar r}) = 0$, which means that zero is an eigenvalue:
$$(r_t - \hat{\bar r})\underbrace{(r_t - \hat{\bar r})^\top y}_{=0} = 0.$$
So How Do We Know $\hat C$ is a Covariance Matrix?
If
$$\operatorname{span}(r_1 - \hat{\bar r},\, r_2 - \hat{\bar r},\, \dots,\, r_T - \hat{\bar r}) = \mathbb{R}^N,$$
then $\hat C$ is invertible. So we need $T \ge N$.
In order for $\hat C$ to be close to $C$ we will need $T \gg N$.
$\hat C$ is also SPD:
$$y^\top \hat C y = y^\top \left( \frac{1}{T-1}\sum_{t=1}^{T}(r_t - \hat{\bar r})(r_t - \hat{\bar r})^\top \right) y
= \frac{1}{T-1}\sum_{t=1}^{T} y^\top(r_t - \hat{\bar r})(r_t - \hat{\bar r})^\top y
= \frac{1}{T-1}\sum_{t=1}^{T} \big(y^\top(r_t - \hat{\bar r})\big)^2 > 0,$$
for all $y \neq 0$, provided the spanning condition above holds (otherwise $\hat C$ is only positive semi-definite).
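A toy simulation illustrating the $T$ versus $N$ point: with simulated returns (made-up data), $\hat C$ is singular when $T < N$ and SPD when $T \gg N$:

```python
import numpy as np

rng = np.random.default_rng(5)
N = 10

def sample_cov(T):
    """Sample covariance C_hat from T observations of N toy assets."""
    R = rng.normal(size=(N, T))
    D = R - R.mean(axis=1, keepdims=True)
    return D @ D.T / (T - 1)

C_small = sample_cov(T=5)      # T < N: rank at most T - 1, so singular
C_large = sample_cov(T=500)    # T >> N: full rank with probability one

assert np.linalg.matrix_rank(C_small) < N
assert np.linalg.matrix_rank(C_large) == N
assert np.all(np.linalg.eigvalsh(C_large) > 0)   # SPD when the deviations span R^N
```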
A Simple Covariance Matrix
$$C = (1-\rho)I + \rho U,$$
where $I$ is the identity and $U_{ij} = 1$ for all $i, j$. Note that
$$U\mathbf{1} = N\mathbf{1} \qquad\text{and}\qquad Ux = 0 \ \text{if}\ x^\top\mathbf{1} = 0, \qquad \text{where } \mathbf{1} = (1, 1, \dots, 1)^\top.$$
Hence, for any $x$ we have
$$Cx = (1-\rho)Ix + \rho Ux = (1-\rho)x + \rho(\mathbf{1}^\top x)\mathbf{1},$$
which means $x$ is an eigenvector iff $\mathbf{1}^\top x = 0$ or $x = a\mathbf{1}$. The eigenvalues of $C$ are $1-\rho+N\rho$ and $1-\rho$, and the eigenvectors are $\mathbf{1}$ and the $N-1$ vectors orthogonal to $\mathbf{1}$, respectively.
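A numpy check of these eigenvalues for example values of $N$ and $\rho$:

```python
import numpy as np

N, rho = 5, 0.3
U = np.ones((N, N))
C = (1 - rho) * np.eye(N) + rho * U

eigvals = np.sort(np.linalg.eigvalsh(C))
expected = np.sort([1 - rho + N * rho] + [1 - rho] * (N - 1))
assert np.allclose(eigvals, expected)

# The vector of ones is an eigenvector with eigenvalue 1 - rho + N*rho.
ones = np.ones(N)
assert np.allclose(C @ ones, (1 - rho + N * rho) * ones)
```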
Positive Definite (Complex Hermitian Matrices)
Let $C$ be a square matrix, i.e. $C \in \mathbb{C}^{N\times N}$. $C$ is positive definite if
$$z^* C z > 0 \quad \forall z \in \mathbb{C}^N \setminus \{0\},$$
where $\mathbb{C}^N$ is the space of $N$-dimensional complex vectors, and $z^*$ is the conjugate transpose,
$$z^* = (z_r + i z_i)^* = z_r^\top - i z_i^\top.$$
$C$ is positive semi-definite if
$$z^* C z \ge 0 \quad \forall z \in \mathbb{C}^N \setminus \{0\}.$$
If $C$ is positive definite and real, then it must also be symmetric, as in the real case above.
Gershgorin Circles
Proposition
All eigenvalues of the matrix $A$ lie in one of the Gershgorin circles,
$$|\lambda - A_{ii}| \le \sum_{j \neq i} |A_{ij}| \quad\text{for at least one } i \le N,$$
where $\lambda$ is an eigenvalue of $A$.
pf: Let $\lambda$ be an eigenvalue and let $x$ be an eigenvector. Choose $i$ so that $|x_i| = \max_j |x_j|$. We have $\sum_j A_{ij}x_j = \lambda x_i$. W.L.O.G. we can take $x_i = 1$ (so that $|x_j| \le 1$ for all $j$), and then $\sum_{j \neq i} A_{ij}x_j = \lambda - A_{ii}$. Taking absolute values yields the result,
$$|\lambda - A_{ii}| = \Big| \sum_{j \neq i} A_{ij}x_j \Big| \le \sum_{j \neq i} |A_{ij}| \cdot |x_j| \le \sum_{j \neq i} |A_{ij}|.$$
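A numpy sketch verifying the proposition on a randomly drawn example matrix:

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(5, 5))

centers = np.diag(A)                                   # A_ii
radii = np.abs(A).sum(axis=1) - np.abs(centers)        # sum over j != i of |A_ij|

# Every eigenvalue must lie in at least one Gershgorin disc.
for lam in np.linalg.eigvals(A):
    assert np.any(np.abs(lam - centers) <= radii + 1e-12)
```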