Linear Mappings and Their Matrices
3.3 The Inverse of a Linear Mapping
Given a linear mapping S : Rn −→ Rm, does it have an inverse? That is, is there a mapping T : Rm −→ Rn such that
S ◦ T = idm and T ◦ S = idn?
If so, what is T?
The symmetry of the previous display shows that if T is an inverse of S then S is an inverse of T in turn. Also, the inverse T, if it exists, must be unique, for if T′ : Rm −→ Rn also inverts S then
T′ = T′ ◦ idm = T′ ◦ (S ◦ T) = (T′ ◦ S) ◦ T = idn ◦ T = T.
Thus T can unambiguously be denoted S−1. In fact, this argument has shown a little bit more than claimed: If T′ inverts S from the left and T inverts S from the right then T′ = T . On the other hand, the argument does not show that if T inverts S from the left then T also inverts S from the right—this is not true.
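The last remark can be illustrated numerically. The following Python sketch uses matrices to stand in for two mappings; the helper `matmul` and the particular mappings S and T are choices made here for illustration and do not come from the text. Here T inverts S from the left but not from the right.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

# Illustrative mappings (not from the text):
# S : R^1 -> R^2 sends x to (x, 0); T : R^2 -> R^1 sends (y1, y2) to y1.
S = [[1],
     [0]]        # 2-by-1 matrix of S
T = [[1, 0]]     # 1-by-2 matrix of T

left = matmul(T, S)    # matrix of T o S, a mapping of R^1
right = matmul(S, T)   # matrix of S o T, a mapping of R^2

print(left)    # [[1]], the identity on R^1
print(right)   # [[1, 0], [0, 0]], not the identity on R^2
```

Note that this S maps between spaces of different dimensions, which is consistent with the later conclusion of this section that only square matrices can be invertible.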
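If the inverse T exists then it too is linear. To see this, note that the elementwise description of S and T being inverses of one another is that every y ∈ Rm takes the form y = S(x) for some x ∈ Rn, every x ∈ Rn takes the form x = T(y) for some y ∈ Rm, and for all such x and y,
y = S(x) ⇐⇒ x = T(y).
Now take any y1, y2 ∈ Rm and write y1 = S(x1), y2 = S(x2) with x1, x2 ∈ Rn. Since S is linear, y1 + y2 = S(x1) + S(x2) = S(x1 + x2), and therefore
T(y1 + y2) = x1 + x2 = T(y1) + T(y2).
Thus T satisfies (3.1). The argument that T satisfies (3.2) is similar.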
Since matrices are more explicit than linear mappings, we replace the question at the beginning of this section with its matrix counterpart: Given a matrix A ∈ Mm,n(R), does it have an inverse matrix, a matrix B ∈ Mn,m(R) such that
AB = Im and BA = In?
As above, if the inverse exists then it is unique, and so it can be denoted A−1. The first observation to make is that if the equation Ax = 0m has a nonzero solution x ∈ Rn then A has no inverse. Indeed, also A0n = 0m, so an inverse A−1 would have to take 0m both to x and to 0n, which is impossible.
And so we are led to a subordinate question: When does the matrix equation Ax = 0m have nonzero solutions x ∈ Rn?
For example, let A be the 5-by-6 matrix
A =
Left multiplication by certain special matrices will simplify the matrix A.
Definition 3.3.1 (Elementary Matrices). There are three kinds of elementary matrices. For any i, j ∈ {1, · · · , m} (i ≠ j) and any a ∈ R, the m-by-m (i; j, a) recombine matrix is
\[
R_{i;j,a} =
\begin{bmatrix}
1 &        &        &        &   \\
  & \ddots &        & a      &   \\
  &        & \ddots &        &   \\
  &        &        & \ddots &   \\
  &        &        &        & 1
\end{bmatrix}.
\]
(Here the a sits in the (i, j)th position, the diagonal entries are 1 and all other entries are 0. The a is above the diagonal as shown only when i < j, otherwise it is below.)
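For any i ∈ {1, · · · , m} and any nonzero a ∈ R, the m-by-m (i, a) scale matrix is
\[
S_{i,a} =
\begin{bmatrix}
1 &        &   &        &   \\
  & \ddots &   &        &   \\
  &        & a &        &   \\
  &        &   & \ddots &   \\
  &        &   &        & 1
\end{bmatrix}.
\]
(Here the a sits in the ith diagonal position, all other diagonal entries are 1 and all other entries are 0.)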
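For any i, j ∈ {1, · · · , m} (i ≠ j), the m-by-m (i; j) transposition matrix is
\[
T_{i;j} =
\begin{bmatrix}
1 &        &        &        &        &        &   \\
  & \ddots &        &        &        &        &   \\
  &        & 0      & \cdots & 1      &        &   \\
  &        & \vdots & \ddots & \vdots &        &   \\
  &        & 1      & \cdots & 0      &        &   \\
  &        &        &        &        & \ddots &   \\
  &        &        &        &        &        & 1
\end{bmatrix}.
\]
(Here the diagonal entries are 1 except the ith and jth, the (i, j)th and (j, i)th entries are 1, all other entries are 0.)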
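The verbal descriptions in Definition 3.3.1 translate directly into code. The following is a minimal Python sketch; the function names `recombine`, `scale`, and `transposition`, and the convention of passing 1-based indices as in the text, are choices made here rather than anything fixed by the definition.

```python
def identity(m):
    """The m-by-m identity matrix as a list of rows."""
    return [[1 if r == c else 0 for c in range(m)] for r in range(m)]

def recombine(m, i, j, a):
    """R_{i;j,a}: the identity with a in the (i, j)th position (i != j)."""
    if i == j:
        raise ValueError("recombine requires i != j")
    R = identity(m)
    R[i - 1][j - 1] = a
    return R

def scale(m, i, a):
    """S_{i,a}: the identity with nonzero a in the ith diagonal position."""
    if a == 0:
        raise ValueError("scale requires a nonzero a")
    S = identity(m)
    S[i - 1][i - 1] = a
    return S

def transposition(m, i, j):
    """T_{i;j}: the identity with its ith and jth rows exchanged (i != j)."""
    if i == j:
        raise ValueError("transposition requires i != j")
    T = identity(m)
    T[i - 1], T[j - 1] = T[j - 1], T[i - 1]
    return T

print(recombine(3, 1, 2, 5))    # [[1, 5, 0], [0, 1, 0], [0, 0, 1]]
print(scale(3, 2, 7))           # [[1, 0, 0], [0, 7, 0], [0, 0, 1]]
print(transposition(3, 1, 3))   # [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
```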
The plan is to study the equation Ax = 0m by using these elementary matrices to reduce A to a nicer matrix E and then solve the equation Ex = 0m instead. Thus we are developing an algorithm rather than a formula. The next proposition describes the effect that the elementary matrices produce by left multiplication.
Proposition 3.3.2 (Effects of the Elementary Matrices). Let M be an m-by-n matrix; call its rows rk. Then
(1) The m-by-n matrix Ri;j,aM has the same rows as M except that its ith row is ri + arj;
(2) The m-by-n matrix Si,aM has the same rows as M except that its ith row is ari;
(3) The m-by-n matrix Ti;jM has the same rows as M except that its ith row is rj and its jth row is ri.
Proof. (1) As observed immediately after Definition 3.2.4, each row of Ri;j,aM equals the corresponding row of Ri;j,a times M. For any row index k ≠ i, the only nonzero entry of the kth row of Ri;j,a is a 1 in the kth position, so the product of that row and M simply picks off the kth row of M. Similarly, the ith row of Ri;j,a has a 1 in the ith position and an a in the jth, so the row times M equals the ith row of M plus a times the jth row of M.
(2) and (3) are similar, left as exercise 3.3.2. ⊓⊔
To get a better sense of why the statements in the proposition are true, it may be helpful to do the calculations explicitly with some moderately sized matrices. But then, the point of the proposition is that once one believes it, left multiplication by elementary matrices no longer requires actual calculation.
Instead, one simply carries out the appropriate row operations. For example,
\[
R_{1;2,3}
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
=
\begin{bmatrix} 13 & 17 & 21 \\ 4 & 5 & 6 \end{bmatrix},
\]
because R1;2,3 adds 3 times the second row to the first. The slogan here is:
Elementary matrix TIMES is row operation ON.
Thus we use the elementary matrices to reason about this material, but for hand calculation we simply carry out the row operations.
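The slogan can be checked on the example above. The following Python sketch computes the product both ways, once by actually multiplying by the elementary matrix and once by carrying out the row operation directly; the helper names `matmul`, `recombine`, and `row_recombine` are hypothetical names introduced here.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def recombine(m, i, j, a):
    """R_{i;j,a}: the m-by-m identity with a in the (i, j)th position."""
    R = [[1 if r == c else 0 for c in range(m)] for r in range(m)]
    R[i - 1][j - 1] = a
    return R

def row_recombine(M, i, j, a):
    """Return a copy of M with a times row j added to row i (1-based)."""
    out = [row[:] for row in M]
    out[i - 1] = [p + a * q for p, q in zip(M[i - 1], M[j - 1])]
    return out

M = [[1, 2, 3],
     [4, 5, 6]]

by_matrix = matmul(recombine(2, 1, 2, 3), M)   # elementary matrix TIMES M
by_row_op = row_recombine(M, 1, 2, 3)          # row operation ON M

print(by_matrix)   # [[13, 17, 21], [4, 5, 6]]
print(by_row_op)   # [[13, 17, 21], [4, 5, 6]]
```

The row operation touches only one row, whereas the matrix product recomputes every entry, which is why hand calculation uses the row operation.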
The next result is that performing row operations on A doesn’t change the set of solutions x to the equation Ax = 0m.
Lemma 3.3.3 (Invertibility of Products of the Elementary Matrices). Products of elementary matrices are invertible. More specifically:
(1) The elementary matrices are invertible by other elementary matrices. Specifically,
\[
(R_{i;j,a})^{-1} = R_{i;j,-a}, \qquad (S_{i,a})^{-1} = S_{i,a^{-1}}, \qquad (T_{i;j})^{-1} = T_{i;j}.
\]
(2) If the m-by-m matrices M and N are invertible by M−1 and N−1, then the product matrix MN is invertible by N−1M−1. (Note the order reversal.)
(3) Any product of elementary matrices is invertible by another such product, specifically the product of the inverses of the original matrices, but taken in reverse order.
Proof. (1) To prove that Ri;j,−aRi;j,a = Im, note that Ri;j,a is the identity matrix Im with a times its jth row added to its ith row, and multiplying this from the left by Ri;j,−a subtracts back off a times the jth row, restoring Im. The proof that Ri;j,aRi;j,−a = Im is either done similarly or by citing the proof just given with a replaced by −a. The rest of (1) is similar.
(2) Compute:
\[
(MN)(N^{-1}M^{-1}) = M(NN^{-1})M^{-1} = M I_m M^{-1} = MM^{-1} = I_m.
\]
Similarly, (N−1M−1)(MN) = Im.
(3) This is immediate from (1) and (2). ⊓⊔
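A numerical spot check of Lemma 3.3.3 is sketched below in Python. The specific 3-by-3 matrices and the helper names are assumptions made for illustration, and `Fraction` is used so that the scale matrix inverse is exact. A check of a few instances is of course not a proof.

```python
from fractions import Fraction

def identity(m):
    return [[1 if r == c else 0 for c in range(m)] for r in range(m)]

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def recombine(m, i, j, a):
    R = identity(m)
    R[i - 1][j - 1] = a
    return R

def scale(m, i, a):
    S = identity(m)
    S[i - 1][i - 1] = a
    return S

def transposition(m, i, j):
    T = identity(m)
    T[i - 1], T[j - 1] = T[j - 1], T[i - 1]
    return T

m = 3
I = identity(m)
R, R_inv = recombine(m, 1, 3, 4), recombine(m, 1, 3, -4)
S, S_inv = scale(m, 2, Fraction(5)), scale(m, 2, Fraction(1, 5))
T = transposition(m, 2, 3)          # T is its own inverse

# Part (1): each elementary matrix is inverted by another one, on both sides.
assert matmul(R_inv, R) == I and matmul(R, R_inv) == I
assert matmul(S_inv, S) == I and matmul(S, S_inv) == I
assert matmul(T, T) == I

# Part (3): P = R S T is inverted by T^{-1} S^{-1} R^{-1}, in reverse order.
P = matmul(matmul(R, S), T)
P_inv = matmul(matmul(T, S_inv), R_inv)
assert matmul(P, P_inv) == I and matmul(P_inv, P) == I

# Keeping the original order of the factors fails for this example.
wrong = matmul(matmul(R_inv, S_inv), T)
print(matmul(P, wrong) == I)   # False
```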
Proposition 3.3.4 (Persistence of Solution). Let A be an m-by-n matrix and let P be a product of m-by-m elementary matrices. Then the equations
Ax = 0m and (P A)x = 0m
are satisfied by the same vectors x in Rn.
Proof. Suppose that the vector x ∈ Rn satisfies the left equation, Ax = 0m. Then
\[
(PA)x = P(Ax) = P0_m = 0_m.
\]
Conversely, suppose that x satisfies (PA)x = 0m. Lemma 3.3.3 says that P has an inverse P−1, so
\[
Ax = I_m Ax = (P^{-1}P)Ax = P^{-1}(PA)x = P^{-1}0_m = 0_m.
\]
⊓⊔
The machinery is in place to solve the equation Ax = 05 where, as before,
A =
Scale A’s fourth row by −1/2; transpose A’s first and fourth rows:
T1;4S4,−1/2A =
Note that B has a 1 as the leftmost entry of its first row. Recombine various multiples of the first row with the other rows to put 0’s beneath the leading 1 of the first row:
Recombine various multiples of the second row with the others to put 0’s above and below its leftmost nonzero entry; scale the second row to make its leading nonzero entry a 1:
Transpose the third and fifth rows; put 0’s above and below the leading 1 in the third row:
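Matrix E is a prime example of a so-called echelon matrix. (The term will be defined precisely in a moment.) Its virtue is that the equation Ex = 05 is now easy to solve. This equation expands out to
\[
Ex =
\begin{bmatrix}
x_1 + 2x_3 + 3x_4 + 5x_6 \\
x_2 + 7x_3 + 11x_4 + 13x_6 \\
x_5 + 17x_6 \\
0 \\
0
\end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Matching the components in the last equality gives
\[
\begin{aligned}
x_1 &= -2x_3 - 3x_4 - 5x_6 \\
x_2 &= -7x_3 - 11x_4 - 13x_6 \\
x_5 &= -17x_6.
\end{aligned}
\]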
Thus, x3, x4 and x6 are free variables that may take any values we wish, but then x1, x2 and x5 are determined from these equations. For example, setting x3 = −5, x4 = 3, x6 = 2 gives the solution x = (−9, −24, −5, 3, −34, 2).
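The example solution can be confirmed mechanically. In the Python sketch below, the matrix `E` is read off from the three equations for x1, x2 and x5, padded with two zero rows on the assumption that E is 5-by-6 like A; the function names are hypothetical.

```python
# E as determined by the three equations, with two zero rows (assumed 5-by-6).
E = [[1, 0, 2, 3, 0, 5],
     [0, 1, 7, 11, 0, 13],
     [0, 0, 0, 0, 1, 17],
     [0, 0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0, 0]]

def solution_from_free(x3, x4, x6):
    """Build x from the free variables using the three equations."""
    x1 = -2 * x3 - 3 * x4 - 5 * x6
    x2 = -7 * x3 - 11 * x4 - 13 * x6
    x5 = -17 * x6
    return [x1, x2, x3, x4, x5, x6]

def apply(M, x):
    """Compute the matrix-by-vector product Mx."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

x = solution_from_free(-5, 3, 2)
print(x)             # [-9, -24, -5, 3, -34, 2]
print(apply(E, x))   # [0, 0, 0, 0, 0]
```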
Definition 3.3.5 (Echelon Matrix). A matrix E is called echelon if it has the form
Here the ∗’s are arbitrary entries and all entries below the stairway are 0.
Thus each row’s first nonzero entry is a 1, each row’s leading 1 is farther right than that of the row above it, each leading 1 has a column of 0’s above it, and any rows of 0’s are at the bottom.
Note that the identity matrix I is a special case of an echelon matrix.
The algorithm for reducing any matrix A to echelon form by row operations should be fairly clear from the previous example. The interested reader may want to codify it more formally, perhaps in the form of a computer program.
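One possible codification is the following Python sketch. The function name `row_reduce`, the use of exact `Fraction` arithmetic, the left-to-right column sweep, and the sample matrix `M` are all choices made here for illustration; the text does not prescribe a particular program.

```python
from fractions import Fraction

def row_reduce(A):
    """Return the echelon form of A (a list of rows) using row operations."""
    E = [[Fraction(entry) for entry in row] for row in A]
    m, n = len(E), len(E[0])
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        # Look at or below pivot_row for a nonzero entry in this column.
        source = next((r for r in range(pivot_row, m) if E[r][col] != 0), None)
        if source is None:
            continue                      # no leading 1 in this column
        # Transpose: bring the chosen row up to the pivot position.
        E[pivot_row], E[source] = E[source], E[pivot_row]
        # Scale: make the leading entry a 1.
        lead = E[pivot_row][col]
        E[pivot_row] = [entry / lead for entry in E[pivot_row]]
        # Recombine: put 0's above and below the leading 1.
        for r in range(m):
            if r != pivot_row and E[r][col] != 0:
                factor = E[r][col]
                E[r] = [p - factor * q for p, q in zip(E[r], E[pivot_row])]
        pivot_row += 1
    return E

# A small sample matrix (hypothetical, not the matrix A of the text).
M = [[0, 2, 4],
     [1, 1, 1],
     [2, 4, 6]]
for row in row_reduce(M):
    print(" ".join(str(entry) for entry in row))
# 1 0 -1
# 0 1 2
# 0 0 0
```

Exact rational arithmetic avoids the rounding issues that floating point would introduce when testing entries against 0.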
Although different sequences of row operations may reduce A to echelon form, the resulting echelon matrix E will always be the same. This result can be proved by induction on the number of columns of A, and its proof is in many linear algebra books.
Theorem 3.3.6 (Matrices Reduce to Echelon Form). Every matrix A row reduces to a unique echelon matrix E.
In an echelon matrix E, the columns with leading 1’s are called new columns, and all others are old columns. The recipe for solving the equation Ex = 0m is then
1. Freely choose the entries in x that correspond to the old columns of E.
2. Then each nonzero row of E will determine the entry of x corresponding to its leading 1 (which sits in a new column). This entry will be a linear combination of the free entries to its right.
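The two-step recipe can be sketched in Python as follows. The function name `solve_echelon`, the 0-based column indices, and the small echelon matrix used to try it out are assumptions introduced here; the function also assumes its input really is echelon and does not check this.

```python
def solve_echelon(E, free_values):
    """Solve Ex = 0 for an echelon matrix E given as a list of rows.

    free_values maps each old-column index (0-based) to its chosen value.
    """
    n = len(E[0])
    # Locate the new columns: the column of the leading 1 of each nonzero row.
    leading = {}
    for row_index, row in enumerate(E):
        for col, entry in enumerate(row):
            if entry != 0:
                leading[col] = row_index
                break
    old = [c for c in range(n) if c not in leading]
    if set(free_values) != set(old):
        raise ValueError("give exactly one value per old column: %s" % old)
    # Step 1: freely choose the entries of x for the old columns.
    x = [0] * n
    for c in old:
        x[c] = free_values[c]
    # Step 2: each nonzero row determines the entry for its leading 1.
    for col, row_index in leading.items():
        row = E[row_index]
        x[col] = -sum(row[c] * x[c] for c in old)
    return x

# A small echelon matrix (hypothetical): new columns 0 and 2, old columns 1 and 3.
E = [[1, 2, 0, 3],
     [0, 0, 1, 4],
     [0, 0, 0, 0]]
print(solve_echelon(E, {1: 1, 3: -1}))   # [1, 1, 4, -1]
```

Step 2 may sum over all old columns because, in an echelon matrix, a row's entries to the left of its leading 1 are 0 and its entries in the other new columns are 0 as well.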
Let’s return to the problem of determining whether A ∈ Mm,n(R) is invertible. The idea was to see if the equation Ax = 0m has any nonzero solutions x, in which case A is not invertible. Equivalently, we may check whether Ex = 0m has nonzero solutions, where E is the echelon matrix to which A row reduces. The recipe for solving Ex = 0m shows that there are nonzero solutions unless all of the columns are new.
If A ∈ Mm,n(R) has more columns than rows then its echelon matrix E must have old columns. Indeed, each new column comes from the leading 1 in a distinct row, so
#(new columns of E) ≤ #(rows of E) < #(columns of E),
showing that not all the columns are new. Thus A is not invertible when m < n. On the other hand, if A ∈ Mm,n(R) has more rows than columns and it has an inverse matrix A−1 ∈ Mn,m(R), then A−1 in turn has inverse A, but this is impossible since A−1 has more columns than rows. Thus A is also not invertible when m > n.
The remaining case is that A is square. The only square echelon matrix with all new columns is I, the identity matrix (exercise 3.3.10). Thus, unless A’s echelon matrix is I, A is not invertible. On the other hand, if A’s echelon matrix is I, then P A = I for some product P of elementary matrices. Multiply from the left by P−1 to get A = P−1; this is invertible by P , giving A−1 = P . Summarizing,
Theorem 3.3.7 (Invertibility and Echelon Form for Matrices). A non-square matrix A is never invertible. A square matrix A is invertible if and only if its echelon form is the identity matrix.
When A is square, the discussion above gives an algorithm that simultaneously checks whether it is invertible and finds its inverse when it is.
Proposition 3.3.8 (Matrix Inversion Algorithm). Given A ∈ Mn(R), set up the matrix
B = [A | In]
in Mn,2n(R). Carry out row operations on this matrix to reduce the left side to echelon form. If the left side reduces to In then A is invertible and the right side is A−1. If the left side doesn’t reduce to In then A is not invertible.
The algorithm works because if B is left multiplied by a product P of elementary matrices, the result is
PB = [PA | P],
so the left side PA is In exactly when the right side P is A−1. Since arithmetic by hand is so error-prone a process, one always should confirm one’s answer from the matrix inversion algorithm by checking that the claimed inverse really works.
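The matrix inversion algorithm can be sketched in Python as follows. The function name `invert`, the convention of returning `None` for a non-invertible matrix, and the two sample matrices are assumptions made here for illustration, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

def invert(A):
    """Return the inverse of the square matrix A, or None if there is none."""
    n = len(A)
    # Set up B = [A | I_n] with exact rational entries.
    B = [[Fraction(e) for e in row] + [Fraction(int(r == c)) for c in range(n)]
         for r, row in enumerate(A)]
    for col in range(n):
        source = next((r for r in range(col, n) if B[r][col] != 0), None)
        if source is None:
            return None          # the left side cannot reduce to I_n
        B[col], B[source] = B[source], B[col]          # transpose
        lead = B[col][col]
        B[col] = [e / lead for e in B[col]]            # scale
        for r in range(n):                             # recombine
            if r != col and B[r][col] != 0:
                factor = B[r][col]
                B[r] = [p - factor * q for p, q in zip(B[r], B[col])]
    return [row[n:] for row in B]                      # the right side

# Sample matrices (hypothetical).
A = [[1, -1, 0],
     [0, 1, -1],
     [0, 0, 1]]
A_inv = invert(A)
print([[int(e) for e in row] for row in A_inv])   # [[1, 1, 1], [0, 1, 1], [0, 0, 1]]

# Confirm the answer by multiplying in both orders.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(A, A_inv) == I3 and matmul(A_inv, A) == I3)   # True

print(invert([[1, 2], [2, 4]]))   # None
```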
We now have an algorithmic answer to the question at the beginning of the section.
Theorem 3.3.9 (Echelon Criterion for Invertibility). The linear mapping S : Rn −→ Rm is invertible only when m = n and its matrix A has echelon matrix In, in which case its inverse S−1 is the linear mapping with matrix A−1.
Exercises
3.3.1. Write down the following 3-by-3 elementary matrices and their inverses:
R3;2,π, S3,3, T3;2, T2;3.
3.3.2. Finish the proof of Proposition 3.3.2.
3.3.3. Let
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{bmatrix}.
\]
Evaluate the following products without actually multiplying matrices: R3;2,πA, S3,3A, T3;2A, T2;3A.
3.3.4. Finish the proof of Lemma 3.3.3, part (1).
3.3.5. What is the effect of right multiplying the m-by-n matrix M by an n-by-n matrix Ri;j,a? By Si,a? By Ti;j?
3.3.6. Recall the transpose of a matrix M (cf. exercise 3.2.4), denoted M^T. Prove:
\[
R_{i;j,a}^T = R_{j;i,a}, \qquad S_{i,a}^T = S_{i,a}, \qquad T_{i;j}^T = T_{i;j}.
\]
Use these results and the formula (AB)^T = B^T A^T to redo the previous problem.
3.3.7. Are the following matrices echelon? For each matrix M, solve the equation Mx = 0.
3.3.8. For each matrix A solve the equation Ax = 0.
3.3.10. Prove by induction that the only square echelon matrix with all new columns is the identity matrix.
3.3.11. Are the following matrices invertible? Find the inverse when possible, and then check your answer.
3.3.12. The matrix A is called lower triangular if aij = 0 whenever i < j.
If A is a lower triangular square matrix with all diagonal entries equal to 1, show that A is invertible and A−1 takes the same form.
3.3.13. This exercise refers back to the Gram–Schmidt exercise in chapter 2.
That exercise expresses the relation between the vectors {x′j} and the vectors {xj} formally as x′ = Ax where x′ is a column vector whose entries are the vectors x′1, · · · , x′n, x is the corresponding column vector of xj’s, and A is an n-by-n lower triangular matrix.
Show that each xj has the form
xj = a′j1 x′1 + a′j2 x′2 + · · · + a′j,j−1 x′j−1 + x′j,
and thus any linear combination of the original {xj} is also a linear combination of the new {x′j}.