Linear Mappings and Their Matrices
3.2 Operations on Matrices
3.1.15. A mapping f : Rn −→ Rm is called affine if it has the form f (x) = T (x) + b where T ∈ L(Rn, Rm) and b ∈ Rm. State precisely and prove: the composition of affine mappings is affine.
3.1.16. Let T : Rn −→ Rm be a linear mapping. Note that since T is continuous and since the absolute value function on Rm is continuous, the composite function
|T | : Rn −→ R is continuous.
(a) Let S = {x ∈ Rn : |x| = 1}. Explain why S is a compact subset of Rn. Explain why it follows that |T | takes a maximum value c on S.
(b) Show that |T (x)| ≤ c|x| for all x ∈ Rn. This result is the Linear Magnification Boundedness Lemma. We will use it in chapter 4.
3.1.17. Let T : Rn −→ Rm be a linear mapping.
(a) Explain why the set D = {x ∈ Rn : |x| = 1} is compact.
(b) Use part (a) of this exercise and part (b) of the preceding exercise to explain why therefore the set {|T (x)| : x ∈ D} has a maximum. This maximum is called the norm of T and is denoted ‖T‖.
(c) Explain why ‖T‖ is the smallest value K that satisfies the condition from part (b) of the preceding exercise, |T (x)| ≤ K|x| for all x ∈ Rn.
(d) Show that for any S, T ∈ L(Rn, Rm) and any a ∈ R, ‖S + T‖ ≤ ‖S‖ + ‖T‖ and ‖aT‖ = |a| ‖T‖.
Define a distance function
d : L(Rn, Rm) × L(Rn, Rm) −→ R, d(S, T ) = ‖T − S‖.
Show that this function satisfies the distance properties of Theorem 2.2.8.
(e) Show that for any S ∈ L(Rn, Rm) and any T ∈ L(Rp, Rn), ‖ST‖ ≤ ‖S‖ ‖T‖.
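As a numerical illustration of the norm in exercise 3.1.17 (not a proof, and not a substitute for the compactness argument), the following sketch samples unit vectors in R2 and records the largest value of |T (x)|. The matrix, the helper names apply, length, and estimate_norm, and the sample count are all assumptions made for this example.

```python
import math

def apply(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def length(v):
    """Euclidean length |v|."""
    return math.sqrt(sum(c * c for c in v))

def estimate_norm(A, samples=3600):
    """Approximate max |A x| over the unit circle in R^2 by sampling angles."""
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        best = max(best, length(apply(A, [math.cos(t), math.sin(t)])))
    return best

# Hypothetical example: T stretches the first axis by 3 and fixes the second.
A = [[3.0, 0.0], [0.0, 1.0]]
c = estimate_norm(A)

# Spot-check the bound |T(x)| <= c|x| on one vector off the unit circle.
x = [2.0, -5.0]
bound_holds = length(apply(A, x)) <= c * length(x) + 1e-12
```

Sampling only gives a lower estimate of the maximum in general; here the maximizing vector e1 happens to be one of the sample points.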
3.2 Operations on Matrices
Having described abstract objects, the linear mappings T ∈ L(Rn, Rm), with explicit ones, the matrices A ∈ Mm,n(R) with (i, j)th entry aij = Ti(ej), we naturally want to study linear mappings via their matrices. The first step is to develop rules for matrix manipulation corresponding to operations on mappings. Thus if
S, T : Rn −→ Rm are linear mappings having matrices
A, B ∈ Mm,n(R),
and if a is a real number, then the matrices for the linear mappings S + T : Rn −→ Rm and aS : Rn −→ Rm naturally should be denoted
A + B ∈ Mm,n(R) and aA ∈ Mm,n(R).
So “+” and “·” (or juxtaposition) are about to acquire new meanings yet again,
+ : Mm,n(R) × Mm,n(R) −→ Mm,n(R) and
· : R × Mm,n(R) −→ Mm,n(R).
To define the sum, fix j between 1 and n. Then
the jth column of A + B = (S + T )(ej)
= S(ej) + T (ej)
= the sum of the jth columns of A and B.
And since vector addition is simply coordinatewise scalar addition, it follows that for any i between 1 and m and any j between 1 and n, the (i, j)th entry of A + B is the sum of the (i, j)th entries of A and B. (One can reach the same conclusion in a different way by thinking about rows rather than columns.) Thus the definition for matrix addition must be
Definition 3.2.1 (Matrix Addition).
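If A = [aij]m×n and B = [bij]m×n then A + B = [aij + bij]m×n.
For example,
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
+
\begin{bmatrix} -1 & 0 \\ 2 & 1 \end{bmatrix}
=
\begin{bmatrix} 0 & 2 \\ 5 & 5 \end{bmatrix}.
\]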
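A minimal sketch of Definition 3.2.1, assuming matrices are stored as Python lists of rows. The names mat_add and column are illustrative choices. The example reuses the two 2-by-2 matrices above and also checks the column description that motivated the definition.

```python
def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape (lists of rows)."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def column(M, j):
    """The jth column of M (0-indexed) as a list."""
    return [row[j] for row in M]

A = [[1, 2], [3, 4]]
B = [[-1, 0], [2, 1]]
C = mat_add(A, B)

# Each column of A + B is the sum of the corresponding columns of A and B.
columns_agree = all(
    column(C, j) == [a + b for a, b in zip(column(A, j), column(B, j))]
    for j in range(2)
)
```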
A similar argument shows that the appropriate definition to make for scalar multiplication of matrices is
Definition 3.2.2 (Scalar-by-Matrix Multiplication).
If α ∈ R and A = [aij]m×n then αA = [αaij]m×n. For example,
\[
2 \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
=
\begin{bmatrix} 2 & 4 \\ 6 & 8 \end{bmatrix}.
\]
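Definition 3.2.2 can be sketched the same way. The helper names scalar_mul and mat_vec and the test vector x are assumptions for this illustration. The check at the end compares (αA)x with α(Ax), which is the property the definition is designed to mirror.

```python
def scalar_mul(alpha, A):
    """Multiply every entry of the matrix A (a list of rows) by alpha."""
    return [[alpha * a for a in row] for row in A]

def mat_vec(A, x):
    """Matrix-by-vector product: inner product of each row of A with x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[1, 2], [3, 4]]
doubled = scalar_mul(2, A)

# The matrix of the mapping 2S should act as 2 times the action of S.
x = [5, -1]
lhs = mat_vec(doubled, x)             # (2A)x
rhs = [2 * y for y in mat_vec(A, x)]  # 2(Ax)
```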
The zero matrix 0m,n ∈ Mm,n(R), corresponding to the zero mapping in L(Rn, Rm), is the obvious one, with all entries 0. The operations in Mm,n(R) precisely mirror those in L(Rn, Rm), so
Proposition 3.2.3 (Mm,n(R) Forms a Vector Space). The set Mm,n(R) of m-by-n matrices forms a vector space over R.
The remaining important operation on linear mappings is composition. As shown in exercise 3.1.13, if
S : Rn −→ Rm and T : Rp −→ Rn are linear then their composition
S ◦ T : Rp −→ Rm
is linear as well. Suppose that S and T respectively have matrices A ∈ Mm,n(R) and B ∈ Mn,p(R).
Then the composition S ◦ T has a matrix in Mm,p(R) that is naturally defined as the matrix-by-matrix product
AB ∈ Mm,p(R),
the order of multiplication being chosen for consistency with the composition.
Under this specification,
(A times B)’s jth column = (S ◦ T )(ej)
= S(T (ej))
= A times (B’s jth column).
And A times (B’s jth column) is a matrix-by-vector multiplication, which we know how to carry out: the result is a column vector whose ith entry for i = 1,· · · , m is the inner product of the ith row of A and the jth column of B. In sum, the rule for matrix-by-matrix multiplication is as follows.
Definition 3.2.4 (Matrix Multiplication). Given two matrices, A ∈ Mm,n(R) and B ∈ Mn,p(R),
such that A has as many columns as B has rows, their product, AB ∈ Mm,p(R),
has for its (i, j)th entry (for any i ∈ {1, · · · , m} and j ∈ {1, · · · , p}) the inner product of the ith row of A and the jth column of B. In symbols,
(AB)ij = ⟨ith row of A, jth column of B⟩,
or, at the level of individual entries,
If A = [aij]m×n and B = [bij]n×p then
\[
AB = \Bigl[\, \sum_{k=1}^{n} a_{ik} b_{kj} \Bigr]_{m \times p}.
\]
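The row-by-column rule of Definition 3.2.4 translates almost word for word into code. This is a minimal sketch with matrices as lists of rows. The names inner and mat_mul and the two sample matrices are assumptions for illustration.

```python
def inner(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def mat_mul(A, B):
    """Product of an m-by-n matrix A and an n-by-p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("A must have as many columns as B has rows")
    cols = list(zip(*B))  # the columns of B
    # The (i, j)th entry is <ith row of A, jth column of B>.
    return [[inner(row, col) for col in cols] for row in A]

A = [[1, 2, 3],
     [4, 5, 6]]   # 2-by-3
B = [[1, 0],
     [0, 1],
     [2, -1]]     # 3-by-2
AB = mat_mul(A, B)  # 2-by-2
```

The shape check enforces the requirement that A have as many columns as B has rows; without it, zip would silently truncate the longer vector.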
Inevitably, matrix-by-matrix multiplication subsumes matrix-by-vector multiplication, viewing vectors as one-column matrices. Also, once we have the definition of matrix-by-matrix multiplication, we can observe that in complement to the already-established rule that for any j ∈ {1, · · · , p},
(A times B)’s jth column equals A times (B’s jth column),
also, for any i ∈ {1, · · · , m},
ith row of (A times B) equals (ith row of A) times B.
Indeed, both quantities in the previous display are the 1-by-p vector whose jth entry is the inner product of the ith row of A and the jth column of B.
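The column rule and the row rule can be spot-checked numerically. In this sketch a column is treated as a one-column matrix and a row as a one-row matrix. The helper names and the sample matrices are assumptions for the example.

```python
def mat_mul(A, B):
    """Product of A (m-by-n) and B (n-by-p), both stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def column_as_matrix(M, j):
    """The jth column of M (0-indexed) as a one-column matrix."""
    return [[row[j]] for row in M]

A = [[1, 2, 3],
     [4, 5, 6]]   # m = 2, n = 3
B = [[1, 0],
     [0, 1],
     [2, -1]]     # n = 3, p = 2
AB = mat_mul(A, B)

# jth column of AB equals A times the jth column of B, for each of the p columns.
cols_ok = all(
    column_as_matrix(AB, j) == mat_mul(A, column_as_matrix(B, j))
    for j in range(len(B[0]))
)

# ith row of AB equals the ith row of A times B, for each of the m rows.
rows_ok = all([AB[i]] == mat_mul([A[i]], B) for i in range(len(A)))
```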
For example, consider the matrices
Some products among these (verify!) are
AB =
Matrix multiplication is not commutative. Indeed, when the product AB is defined, the product BA may not be, or it may be but have different dimensions from AB; cf. EF and FE above. Even when A and B are both n-by-n, so that AB and BA are likewise n-by-n, the products need not agree. For example,
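To make the last point concrete, here is a small check with two hypothetical 2-by-2 matrices, chosen for this sketch rather than taken from the text, whose products in the two orders differ. A second pair, with hypothetical names P and Q, shows the two products having different dimensions.

```python
def mat_mul(A, B):
    """Product of A (m-by-n) and B (n-by-p), both stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def shape(M):
    """(number of rows, number of columns) of M."""
    return (len(M), len(M[0]))

# Square case: both products are 2-by-2 but they disagree.
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]
AB = mat_mul(A, B)
BA = mat_mul(B, A)

# Rectangular case: both products exist but have different dimensions.
P = [[1, 2, 3], [4, 5, 6]]    # 2-by-3
Q = [[1, 0], [0, 1], [1, 1]]  # 3-by-2
PQ_shape = shape(mat_mul(P, Q))
QP_shape = shape(mat_mul(Q, P))
```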
Of particular interest is the matrix associated to the identity mapping, id : Rn−→ Rn, id(x) = x.
Naturally, this matrix is called the identity matrix; it is written In. Since idi(ej) = δij,
\[
I_n = [\delta_{ij}]_{n \times n} =
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}.
\]
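A short sketch of the identity matrix built directly from the Kronecker delta. The names delta, identity, and mat_vec are assumptions. The final check confirms that the matrix acts on a sample vector as the identity mapping does.

```python
def delta(i, j):
    """Kronecker delta: 1 if i == j, else 0."""
    return 1 if i == j else 0

def identity(n):
    """The n-by-n identity matrix [delta_ij] as a list of rows."""
    return [[delta(i, j) for j in range(n)] for i in range(n)]

def mat_vec(A, x):
    """Matrix-by-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

I3 = identity(3)
x = [7, -2, 5]
fixed = mat_vec(I3, x)  # id(x) = x
```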
Although matrix multiplication fails to commute, it does have the following properties.
Proposition 3.2.5 (Properties of Matrix Multiplication). Matrix multiplication is associative,
A(BC) = (AB)C for A ∈ Mm,n(R), B ∈ Mn,p(R), C ∈ Mp,q(R).
Matrix multiplication distributes over matrix addition,
A(B + C) = AB + AC for A ∈ Mm,n(R), B, C ∈ Mn,p(R),
(A + B)C = AC + BC for A, B ∈ Mm,n(R), C ∈ Mn,p(R).
Scalar multiplication passes through matrix multiplication,
α(AB) = (αA)B = A(αB) for α ∈ R, A ∈ Mm,n(R), B ∈ Mn,p(R).
The identity matrix is a multiplicative identity,
ImA = A = AIn for A ∈ Mm,n(R).
Proof. The right way to show these is intrinsic, by remembering that addition, scalar multiplication, and multiplication of matrices precisely mirror addition, scalar multiplication, and composition of mappings. For example, if A, B, C are the matrices of the linear mappings S ∈ L(Rn, Rm), T ∈ L(Rp, Rn), and U ∈ L(Rq, Rp), then (AB)C and A(BC) are the matrices of (S ◦ T ) ◦ U and S ◦ (T ◦ U). But these two mappings are the same since the composition of mappings (mappings in general, not only linear mappings) is associative. To verify the associativity, we cite the definition of four different binary compositions to show that the ternary composition is independent of parentheses, as follows. For any x ∈ Rq,
((S ◦ T ) ◦ U)(x) = (S ◦ T )(U(x)) by definition of R ◦ U where R = S ◦ T
= S(T (U(x))) by definition of S ◦ T
= S((T ◦ U)(x)) by definition of T ◦ U
= (S ◦ (T ◦ U))(x) by definition of S ◦ V where V = T ◦ U.
So indeed ((S ◦ T ) ◦ U) = (S ◦ (T ◦ U)), and consequently (AB)C = A(BC).
Alternatively, one can verify the equalities elementwise by manipulating sums. Adopting the notation Mij for the (i, j)th entry of a matrix M ,
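\[
\begin{aligned}
(A(BC))_{ij} &= \sum_{k=1}^{n} A_{ik} (BC)_{kj}
= \sum_{k=1}^{n} A_{ik} \sum_{\ell=1}^{p} B_{k\ell} C_{\ell j}
= \sum_{k=1}^{n} \sum_{\ell=1}^{p} A_{ik} B_{k\ell} C_{\ell j} \\
&= \sum_{\ell=1}^{p} \sum_{k=1}^{n} A_{ik} B_{k\ell} C_{\ell j}
= \sum_{\ell=1}^{p} (AB)_{i\ell} C_{\ell j}
= ((AB)C)_{ij}.
\end{aligned}
\]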
The steps here are not explained in detail because the author finds this method as grim as it is gratuitous: the coordinates work because they must, but their presence only clutters the argument. The other equalities are similar. □

Composing mappings is most interesting when all the mappings in question take a set S back to the same set S, for the set of such mappings is closed under composition. In particular, L(Rn, Rn) is closed under composition. The corresponding statement about matrices is that Mn(R) is closed under multiplication.
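The identities of Proposition 3.2.5 can also be spot-checked on small integer matrices, where the arithmetic is exact. This is only a sanity check on one set of hypothetical matrices, not a proof. The helper names and the sample matrices are assumptions for this sketch.

```python
def mat_mul(A, B):
    """Product of A (m-by-n) and B (n-by-p), both stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(alpha, A):
    """Multiply every entry of A by alpha."""
    return [[alpha * a for a in row] for row in A]

def identity(n):
    """The n-by-n identity matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 0], [0, 1, -1]]     # 2-by-3
B = [[1, 0], [2, 1], [0, 3]]    # 3-by-2
B2 = [[0, 1], [1, 0], [2, 2]]   # 3-by-2
C = [[1, -1], [4, 2]]           # 2-by-2

assoc = mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C)
distrib = mat_mul(A, mat_add(B, B2)) == mat_add(mat_mul(A, B), mat_mul(A, B2))
scalars = (scalar_mul(3, mat_mul(A, B))
           == mat_mul(scalar_mul(3, A), B)
           == mat_mul(A, scalar_mul(3, B)))
ident = mat_mul(identity(2), A) == A == mat_mul(A, identity(3))
```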
Exercises
3.2.1. Justify Definition 3.2.2 of scalar multiplication of matrices.
3.2.2. Carry out the matrix multiplications
a b
3.2.4. (If you have not yet worked exercise 3.1.14 then do so before working this exercise.) Let A = [aij] ∈ Mm,n(R) be the matrix of S ∈ L(Rn, Rm). Its transpose A^T ∈ Mn,m(R) is the matrix of the transpose mapping S^T. Since S and S^T act respectively as multiplication by A and A^T, the characterizing property of S^T from exercise 3.1.14 gives
⟨x, A^T y⟩ = ⟨Ax, y⟩ for all x ∈ Rn and y ∈ Rm.
Make specific choices of x and y to show that the transpose A^T ∈ Mn,m(R) is obtained by flipping A about its Northwest–Southeast diagonal; that is, show that the (i, j)th entry of A^T is aji. It follows that the rows of A^T are the columns of A and the columns of A^T are the rows of A.
(Similarly, let B ∈ Mn,p(R) be the matrix of T ∈ L(Rp, Rn), so that B^T is the matrix of T^T. Since matrix multiplication is compatible with linear mapping composition, we know immediately from exercise 3.1.14(b), with no reference to the concrete description of the matrix transposes A^T and B^T in terms of the original matrices A and B, that the transpose of the product is the product of the transposes in reverse order,
(AB)^T = B^T A^T for all A ∈ Mm,n(R) and B ∈ Mn,p(R).
That is, by characterizing the transpose mapping in exercise 3.1.14, we easily derived the construction of the transpose matrix here and obtained the formula for the product of transpose matrices with no reference to their construction.)
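A numerical spot-check of the two transpose facts in this exercise, taking as given the entry description stated above, namely that the (i, j)th entry of the transpose is aji. The helper names and the sample matrices and vectors are assumptions for this sketch. It illustrates the statements on one example and does not replace the requested argument.

```python
def transpose(A):
    """Flip A about its main diagonal: rows become columns."""
    return [list(col) for col in zip(*A)]

def inner(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def mat_vec(A, x):
    """Matrix-by-vector product."""
    return [inner(row, x) for row in A]

def mat_mul(A, B):
    """Product of A (m-by-n) and B (n-by-p), both stored as lists of rows."""
    cols = list(zip(*B))
    return [[inner(row, col) for col in cols] for row in A]

A = [[1, 2, 3], [4, 5, 6]]     # m = 2, n = 3
B = [[1, 0], [0, 1], [2, -1]]  # n = 3, p = 2
x = [1, -2, 0]                 # vector in R^3
y = [3, 5]                     # vector in R^2

# Characterizing property: <x, A^T y> = <Ax, y>.
left = inner(x, mat_vec(transpose(A), y))
right = inner(mat_vec(A, x), y)

# Transpose of a product is the product of the transposes in reverse order.
product_ok = transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
```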
(This exercise may entail double subscripts.)
3.2.6. For any matrix A ∈ Mm,n(R) and column vector a ∈ Rm define the affine mapping (cf. exercise 3.1.15)
AffA,a : Rn −→ Rm
by the rule AffA,a(x) = Ax + a for all x ∈ Rn, viewing x as a column vector.
(a) Explain why every affine mapping from Rn to Rm takes this form.
(b) Given such A and a, define the matrix A′ ∈ Mm+1,n+1(R) to be
Thus, affine mappings, like linear mappings, behave as matrix-by-vector multiplications but where the vectors are the usual input and output vectors augmented with an extra “1” at the bottom.
(c) If the affine mapping AffB,b : Rp −→ Rn determined by B ∈ Mn,p(R) … multiplication is compatible with composition of affine mappings.
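The idea of exercise 3.2.6 can be sketched numerically under one explicit assumption: that A′ is the block matrix with A and a side by side in its top m rows and (0, · · · , 0, 1) as its bottom row. This is the form consistent with the description of vectors "augmented with an extra 1 at the bottom". The names augment and aff and all sample data are hypothetical.

```python
def mat_vec(A, x):
    """Matrix-by-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def mat_mul(A, B):
    """Product of A (m-by-n) and B (n-by-p), both stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def augment(A, a):
    """Assumed block form [[A, a], [0 ... 0, 1]] of size (m+1)-by-(n+1)."""
    n = len(A[0])
    return [row + [ai] for row, ai in zip(A, a)] + [[0] * n + [1]]

def aff(A, a, x):
    """The affine mapping x -> Ax + a."""
    return [y + ai for y, ai in zip(mat_vec(A, x), a)]

A = [[1, 2], [0, 1], [3, 0]]  # 3-by-2, so Aff_{A,a} maps R^2 to R^3
a = [1, -1, 2]
B = [[2, 0], [1, 1]]          # 2-by-2, so Aff_{B,b} maps R^2 to R^2
b = [0, 5]
x = [4, -3]

# One mapping: A' times (x, 1) should be (Aff_{A,a}(x), 1).
single_ok = mat_vec(augment(A, a), x + [1]) == aff(A, a, x) + [1]

# Composition: A'B' times (x, 1) should be (Aff_{A,a}(Aff_{B,b}(x)), 1).
composed = aff(A, a, aff(B, b, x))
product_ok = mat_vec(mat_mul(augment(A, a), augment(B, b)), x + [1]) == composed + [1]
```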
3.2.7. The exponential of any square matrix A is the infinite matrix sum
\[
e^{A} = I + A + \frac{1}{2!} A^{2} + \frac{1}{3!} A^{3} + \cdots .
\]
Compute the exponentials of the following matrices:
A = [λ], A =
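For checking hand computations, the series can be approximated by a partial sum. The sketch below is a numerical aid only. The function name mat_exp, the number of terms, and the two sample inputs (a 1-by-1 matrix and a nilpotent 2-by-2 matrix chosen for this illustration) are assumptions, and a truncated series is not a robust general-purpose method.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def mat_exp(A, terms=30):
    """Partial sum I + A + A^2/2! + ... with the given number of terms."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # current term A^k / k!, starting at k = 0
    for k in range(1, terms):
        # A^k / k! = (A^(k-1) / (k-1)!) * A / k
        term = [[entry / k for entry in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

# 1-by-1 case: the matrix exponential reduces to the scalar exponential.
lam = 0.5
exp_scalar = mat_exp([[lam]])

# Hypothetical nilpotent example: N^2 = 0, so the series stops after I + N.
N = [[0.0, 1.0], [0.0, 0.0]]
exp_N = mat_exp(N)
```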