III Change of bases - I. Eigenvectors and eigenvalues

As in the discussion of Section I.B, if V and W are vector spaces of dimensions m and n

respectively, then there is a 1–1 correspondence between linear transformations T : V _→ W and

m_×n matrices defined as follows: Let _A = _{a1, _{· · ·} ,an_} and _B = _{b1, _{· · ·} ,bm_} be ordered bases forV andW respectively, and define them_×nmatrix ofT with respect to_Aand_Bby using the following equations to define its entriesci,j:

T(aj) = m

i=1

ci,jbi

These equations are defined for 1_≤j_≤n, and the matrix itself will frequently be denoted by [T]B_A

in these notes. If we choose different ordered bases, then we obtain different 1–1 correspondences. The main purpose of this section is to begin consideration of the following questions:

(1) How are the different matrices related if one changes bases? In particular, what can one say about situations whereV =W and _A=_B?

(2) Suppose we are in a situation where V = W and _A = _B, and let P and Q represent the same linear transformation with respect to different ordered bases. What sorts of properties doP and Qhave in common?

(3) In the same type of situation, to what extent is it possible to choose an ordered basis _A so that the matrix[T]A

A takes a particularly simple form?

We shall give a complete answer to the first question and partial answers to the other two. In Section IV.4 of the notes we shall answer the last two questions more generally.

Default convention. Unless specifically stated otherwise, ifT is a linear transformation fromV

to itself and_Ais an ordered basis forV, then we shall refer to [T]A

A asthe matrix of T with respect to _A.

III.A : Review topics (Fraleigh and Beauregard, _§§2.3, 5.2)

Aside from the setting developed above, there are two main points to review. One involves the computations on pages 305–307 of the text for diagonalizable matrices. These involved a diagonalizable matrix A such that the columns of C formed a basis of eigenvectors for A; The eigenvalues of these columns were given by the diagonal entries of some diagonal matrixD. Under those conditions it was shown that A C was equal to C D, or equivalently D = C−1_{A C}_{. This} discussion was followed by the definition ofsimilarityfor matrices. Specifically, two square matrices

AandB are said to be similar if there is an invertible matrixP such thatP A P−1_{; in the situation} considered in Section 5.2, one can take P to be the inverse of C.

III.1 : Similarity of matrices (Fraleigh and Beauregard, _§§7.1, 7.2)

We shall now consider a general version of the question discussed in the preceding review. Let V be an n-dimensional vector space, let T : V _→ V be a linear transformation, and let U = _{u1, _{· · ·} ,un_} and _W = _{w1, _{· · ·} ,wn_} be ordered bases for V. Suppose that A is the matrix of T with respect to the ordered basis _U and B is the matrix of T with respect to the ordered basis_W. We want to determine the relationship between Aand B, and we shall do so by writing everything out fairly explicitly.

LetP be the matrix which gives the coefficients of the vectors in_W when the latter are written as linear combinations of the vectors in_U. Specifically, the entries ofP are defined by the equations

wi = n

k=1

pk,iuk

where 1 _≤ i_≤ n. We may then write T(wj) as a linear combination of the vectors in U by first writing this vector as a linear combination of the vectors in_W usingBand then usingP to express everything as a linear combination of vectors in_U:

T(wj) = n X i=1 bi,jwi = X i,k bi,jpk,iuk

On the other hand, we may also write T(wj) as a linear combination of the vectors in U by first writing this wj as a linear combination of the vectors in _U by means of P and then using A to evaluate this linear combination:

T(wj) = T n X `=1 p`,ju` ! = X i,` p`,jai,`ui

The coefficients of the vectorsui in both expressions must be equal. But the coefficient in the first expression is the (i, j) entry of P B while the coefficient in the second is the (i, j) entry of A P. Since iand j are arbitrary this establishes the following basic result:

CHANGE OF BASIS FORMULA. In the notation described above, we have P B=A P, or equivalently

B = P−1A P .

COROLLARY. Two square matrices are similar if and only if they represent the same linear transformation with respect to different ordered bases.

Proof. We have already shown the ( _⇐= ) implication. To see the ( =_⇒ ) implication, suppose that we have three matrices A, B, P satisfying B =P−1_{A P} _{, let}_U _{be the standard basis of unit} vectors, and let_W be the ordered basis whose vectors are the columns ofP. Under these conditions the preceding calculations show that the matrixCof _LAwith respect to the ordered basis_W must satisfy P C =A P, which in turn implies that C=P−1_{A P} ₌_B_.

EXAMPLE. Suppose that _U is the standard ordered basis of unit vectors for R2 _and _W _is given by the ordered set of vectors w1 = (1,1) andw2= (1,2). We take A to be the matrix

1 1

−1 3

and use the information above to compute the matrix [_LA]W in this case. The first step is to find the matrix P, which expresses the vectors in _W in terms of the vectors in _U. Since_U is just the standard ordered basis for R2_{, the linear combinations for the vectors in} _W _{are given by their} coordinates and we have

P = 1 1 1 2 .

The next step is to compute the inverse for P, and it turns out to be given by

P−1 = 2 ₋1 −1 1 .

We may then substitute these matrices into the formula [T]W = B = P−1A P

and if we carry out all the necessary matrix calculations we obtain the answer [T]W = B = 2 1 0 2 .

If we wish to check the correctness of this computation, we can do so by computing the vectors

Awi, and if we carry these computations out we find that 2w1 is equal toAw1 and w1+ 2w1 is equal to Aw2, which is exactly what our computation for [T]W =B claims should be the case.

III.2 : Invariants of similarity (Fraleigh and Beauregard, _§§7.1, 7.2)

By the results of the previous section, the classification of all linear transformations from Rn to itself is the same as the classification of all n_×nmatrices up to similarity. In order to study a problem of this sort, it is necessary to find a list of matrix properties such that two similar matrices have the same properties but two dissimilar matrices do not.

By one of the exercises in Section 5.2 of the text, the relation of similarity for matrices has the following three properties:

• Every matrix Ais similar to itself.

• IfAis similar to B, thenB is similar to A.

• IfAis similar to B and B is similar to C, thenA is similar to C. We shall repeatedly use these properties in our discussion.

A natural way to approach the similarity classification problem is to start with relatively weak properties that similar matrices share and to find increasingly stronger ones. The following is perhaps one of the simplest:

INVARIANCE OF DETERMINANTS. Similar matrices have equal determinants.

This is true because the determinant of a product is the product of the determinants, and since the product of scalars is commutative it follows that the determinant satisfies detC1C2= detC2C1 for all square matrices C2 and C1. If we apply this to B =P−1A P we obtain

detB = det P−1(A P)C = det (A P)P−1 = detA I = detA

and hence similar matrices have the same determinant. In fact, one can push the same methods further.

INVARIANCE OF CHARACTERISTIC POLYNOMIALS. Similar matrices have the same characteristic polynomials.

This is a similar computation whereB is replaced by the matrix of polynomials

B₋tI = P−1(A₋tI)P

and Areplaced by the matrix of polynomials A₋tI.

Eigenvalues and eigenvectors provide further common characteristics of similar matrices. INVARIANCE OF EIGENSPACE DIMENSIONS.If A and B are similar matrices and λ

is an arbitrary scalar, then the dimensions of the spaces of eigenvectors forAandB with eigenvalue

λare equal.

Note that the set of eigenvectors corresponds to the solutions of, say,A₋λ I and thus defines a subspace. — This assertion is true because if B =P−1_{A P} _and _W _{is the space of eigenvectors} forA with eigenvalueλ, then_L−1_P (W) is the space of eigenvectors forB with the same eigenvalue.

Another way of stating the preceding result is that for every eigenvalue λ the ranks of the matricesA₋λ I and B₋λ I are the same. Although it is possible to construct examples of 3_×3

matrices for which all these ranks are equal even though the matrices themselves are not similar (one of the additional exercises for this section gives a method for producing such examples), a refinement of this characterization provides complete information on the similarity type of this matrix. Since similar matrices have the same characteristic polynomial, let us assume we are dealing with a class of matrices for which this polynomial is the same. The following basic fact about polynomials will be helpful in understanding the similarity conditions we shall present: FINITENESS OF NUMBER OF FACTORS. If p(t) is a polynomial of degree n > 0 (in one variable), then there are only finitely many monic polynomials that are factors of p.

Recall that a polynomial ismonic if it has the formtd₊_q₍_t_{) where either}_q _{= 0 or the degree} of q is strictly less thand.

Given a polynomialf(t) with real coefficients and ann_×nmatrixA, one can form an associated matrix f(A) by making the substitution t =A, and for each such polynomial one can define the rank ofA. One then has the following result:

SIMILARITY CLASSIFICATION PRINCIPLE. Let A and B be two matrices with the same characteristic polynomial f(t). Then A and B are similar if and only if for every monic factor g(t) of f(t) the ranks ofg(A) andg(B) are equal.

Proving the equality of ranks if the matrices are similar is not difficult (see the exercises), but proving the converse is just a little beyond the scope of this course (although the theorem behind this result is a fundamental topic in graduate level algebra courses). The filesratforms._∗ contain further information.

The preceding result on similarity classification addresses only the first two of the three questions that were raised at the beginning of this unit; in particular, it does not address the question of whether one can find a matrix of a particularly simple or useful type that is similar to a given matrix. In fact, such nice forms exist provided we broaden our perspective to allow scalars that are more general than real numbers, and we shall ideal with such questions and their applications to matrix similarity in the next unit of the course.

In document I. Eigenvectors and eigenvalues (Page 37-42)