Linear Mappings and Their Matrices
Proposition 3.7.2 (Classical Adjoint Identity). Let $n \ge 2$ be an integer, let $A \in M_n(\mathbb{R})$ be an n-by-n matrix, and let $A^{\mathrm{adj}}$ be its classical adjoint. Then
\[ A A^{\mathrm{adj}} = \det(A) I_n. \]
In particular, if $A$ is invertible then
\[ A^{-1} = \frac{1}{\det(A)} A^{\mathrm{adj}}. \]
The idea of the proof is that the inner product of the $i$th row of $A$ and the $i$th column of $A^{\mathrm{adj}}$ gives precisely the formula for $\det(A)$, while for $i \ne j$ the inner product of the $i$th row of $A$ and the $j$th column of $A^{\mathrm{adj}}$ gives the formula for the determinant of a matrix having the $i$th row of $A$ as two of its rows, and such a determinant is zero. The argument is purely formal but notationally tedious, and so we omit it.
In the 2-by-2 case the proposition gives us a slogan:
To invert a 2-by-2 matrix, exchange the diagonal elements, negate the off-diagonal elements, and divide by the determinant.
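The slogan can be tried out directly. The following Python sketch is only an illustration; the function name `inverse_2x2` and the sample matrix are assumptions made here, not notation from the text. Exact fractions are used so that no rounding enters.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Invert [[a, b], [c, d]] by the slogan: exchange the diagonal
    elements, negate the off-diagonal elements, divide by the determinant."""
    det = Fraction(a) * d - Fraction(b) * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det],
            [-c / det, a / det]]

# Sample matrix [[1, 2], [3, 4]], whose determinant is 1*4 - 2*3 = -2.
inv = inverse_2x2(1, 2, 3, 4)
print(inv)
```

Multiplying the sample matrix by the result should return the 2-by-2 identity matrix, which is a quick way to check the slogan by hand.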
Again, for n > 2 the explicit formula for the inverse is rarely of calculational use. We care about it for the following reason.
Corollary 3.7.3. Let $A \in M_n(\mathbb{R})$ be an invertible n-by-n matrix. Then each entry of the inverse matrix $A^{-1}$ is a continuous function of the entries of $A$.
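Proof. Specifically, the $(i, j)$th entry of $A^{-1}$ is
\[ (A^{-1})_{i,j} = (-1)^{i+j} \det(A^{j,i}) / \det(A), \]
where $A^{j,i}$ denotes the matrix obtained from $A$ by deleting its $j$th row and its $i$th column. This entry is a rational function (ratio of polynomials) of the entries of $A$. As such it varies continuously in the entries of $A$ so long as $A$ remains invertible. □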
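The entry formula can be checked numerically for a small matrix. The sketch below is an illustration under stated assumptions: the helper names `minor`, `det`, `classical_adjoint`, and `matmul` are hypothetical, the determinant is computed by cofactor expansion along the first row (suitable only for small $n$), and the sample matrix is chosen arbitrarily.

```python
def minor(A, i, j):
    """Return A with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j))
               for j in range(len(A)))

def classical_adjoint(A):
    """Entry (i, j) is (-1)^(i+j) times det of A with row j, column i deleted."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 0, 1],
     [1, 3, 0],
     [0, 1, 4]]              # sample matrix, det(A) = 25
product = matmul(A, classical_adjoint(A))
print(det(A), product)       # product should be det(A) times the identity
```

Dividing each entry of `classical_adjoint(A)` by `det(A)` then gives the inverse, each entry visibly a ratio of polynomials in the entries of `A`.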
Exercise
3.7.1. Verify at least one diagonal entry and at least one off-diagonal entry in the formula $A A^{\mathrm{adj}} = \det(A) I_n$ for $n = 3$.
3.8 Geometry of the Determinant: Volume
Consider a linear mapping from n-space back to n-space,
\[ T : \mathbb{R}^n \longrightarrow \mathbb{R}^n. \]
This section discusses two ideas:
• The mapping $T$ magnifies volume by a constant factor. (Here volume is a pan-dimensional term that in particular means length when $n = 1$, area when $n = 2$, and the usual notion of volume when $n = 3$.) That is, there is some number $t \ge 0$ such that if one takes a set,
\[ E \subset \mathbb{R}^n, \]
and passes it through the mapping to get another set,
\[ TE \subset \mathbb{R}^n, \]
then the set’s volume is multiplied by $t$,
\[ \operatorname{vol} TE = t \cdot \operatorname{vol} E. \]
The magnification factor $t$ depends on $T$ but is independent of the set $E$.
• Furthermore, if the matrix of $T$ is $A$ then the magnification factor associated to $T$ is
\[ t = |\det A|. \]
That is, the absolute value of $\det A$ has a geometric interpretation as the factor by which $T$ magnifies volume.
(The geometric interpretation of the sign of det A will be discussed in the next section.)
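Both ideas can be previewed in two dimensions, where volume means area. The following Python sketch is an illustration only: the matrix, the triangle, and the helper names `shoelace_area` and `apply` are assumptions chosen here, and the area is computed by the standard shoelace formula for polygons rather than by anything developed in this section.

```python
def shoelace_area(points):
    """Area of a simple polygon whose vertices are listed in order."""
    n = len(points)
    twice = sum(points[i][0] * points[(i + 1) % n][1]
                - points[(i + 1) % n][0] * points[i][1]
                for i in range(n))
    return abs(twice) / 2

def apply(A, p):
    """Multiply the 2-by-2 matrix A by the point p viewed as a column vector."""
    (a, b), (c, d) = A
    x, y = p
    return (a * x + b * y, c * x + d * y)

A = [[2, 1],
     [1, 3]]                               # det A = 2*3 - 1*1 = 5
triangle = [(0, 0), (4, 0), (1, 2)]        # a set E of area 4
image = [apply(A, p) for p in triangle]    # the set TE, again a triangle

print(shoelace_area(triangle), shoelace_area(image))
```

Replacing the triangle by any other polygon should leave the ratio of the two areas unchanged at $|\det A| = 5$, which is the independence of $E$ claimed above.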
An obstacle to pursuing these ideas is that we don’t have a theory of volume in $\mathbb{R}^n$ readily at hand. In fact, volume presents real difficulties. For instance, any notion of volume that has sensible properties cannot apply to all sets; so either volume behaves unreasonably or some sets don’t have well defined volumes at all. Here we have been tacitly assuming that volume does behave well and that the sets $E$ under consideration do have volumes.
This section will investigate volume informally by considering how it ought to behave, stating assumptions as they arise and arriving only at a partial description. The resulting arguments will be heuristic, and the skeptical reader will see gaps in the reasoning. Volume will be discussed further in chapter 6, but a full treatment of the subject (properly called measure) is beyond the range of this text.
The standard basis vectors $e_1, \dots, e_n$ in $\mathbb{R}^n$ span the unit box,
\[ B = \{\alpha_1 e_1 + \cdots + \alpha_n e_n : 0 \le \alpha_1 \le 1, \dots, 0 \le \alpha_n \le 1\}. \]
Thus box means interval when $n = 1$, rectangle when $n = 2$, and the usual notion of box when $n = 3$. Let $p$ be a point in $\mathbb{R}^n$, let $a_1, \dots, a_n$ be positive real numbers, and let $B'$ denote the box spanned by the vectors $a_1 e_1, \dots, a_n e_n$ and translated by $p$,
\[ B' = \{\alpha_1 a_1 e_1 + \cdots + \alpha_n a_n e_n + p : 0 \le \alpha_1 \le 1, \dots, 0 \le \alpha_n \le 1\}. \]
(See figure 3.11. The figures of this section are set in two dimensions, but the ideas are general and hence so are the figure captions.) A face of a box is the set of its points such that some particular $\alpha_i$ is held fixed at 0 or at 1 while the others vary. A box in $\mathbb{R}^n$ has $2n$ faces.
A natural definition is that the unit box has unit volume,
\[ \operatorname{vol} B = 1. \]
We assume that volume is unchanged by translation. Also, we assume that box volume is finitely additive, meaning that given finitely many boxes $B_1, \dots, B_M$ that are disjoint except possibly for shared faces or shared subsets of faces, the volume of their union is the sum of their volumes,
\[ \operatorname{vol} \bigcup_{i=1}^{M} B_i = \sum_{i=1}^{M} \operatorname{vol} B_i. \tag{3.7} \]
And we assume that scaling any spanning vector of a box affects the box’s volume continuously in the scaling factor. It follows that scaling any spanning vector of a box by a real number $a$ magnifies the volume by $|a|$. To see this, first note that scaling a spanning vector by an integer $\ell$ creates $|\ell|$ abutting translated copies of the original box, and so the desired result follows in this case from finite additivity. A similar argument applies to scaling a spanning vector by a reciprocal integer $1/m$ ($m \ne 0$), since the original box is now $|m|$ copies of the scaled one. These two special cases show that the result holds for scaling a spanning vector by any rational number $r = \ell/m$. Finally, the continuity assumption extends the result from the rational numbers to the real numbers, since every real number is approached by a sequence of rational numbers.

Since the volume of the unit box is normalized to 1, since volume is unchanged by translation, and since scaling any spanning vector of a box by $a$ magnifies its volume by $|a|$, the volume of the general box is (recalling that $a_1, \dots, a_n$ are assumed to be positive)
\[ \operatorname{vol} B' = a_1 \cdots a_n. \]
Figure 3.11. Scaling and translating the unit box
A subset of $\mathbb{R}^n$ that is well approximated by boxes plausibly has a volume. To be more specific, a subset $E$ of $\mathbb{R}^n$ is well approximated by boxes if for any $\varepsilon > 0$ there exist boxes $B_1, \dots, B_N, B_{N+1}, \dots, B_M$ that are disjoint except possibly for shared faces, such that $E$ is contained between a partial union of the boxes and the full union,
\[ \bigcup_{i=1}^{N} B_i \subset E \subset \bigcup_{i=1}^{M} B_i, \tag{3.8} \]
and such that the boxes that complete the partial union to the full union have a small sum of volumes,
\[ \sum_{i=N+1}^{M} \operatorname{vol} B_i < \varepsilon. \tag{3.9} \]
(See figure 3.12, where $E$ is an elliptical region, the boxes $B_1$ through $B_N$ that it contains are dark, and the remaining boxes $B_{N+1}$ through $B_M$ are light.) To see that $E$ should have a volume, note that the first containment of (3.8) says that a number at most big enough to serve as $\operatorname{vol} E$ (a lower bound) is $L = \operatorname{vol} \bigcup_{i=1}^{N} B_i$, and the second containment says that a number at least big enough (an upper bound) is $U = \operatorname{vol} \bigcup_{i=1}^{M} B_i$. By the finite additivity condition (3.7), the lower and upper bounds are $L = \sum_{i=1}^{N} \operatorname{vol} B_i$ and $U = \sum_{i=1}^{M} \operatorname{vol} B_i$. Thus they are close to each other by (3.9),
\[ U - L = \sum_{i=N+1}^{M} \operatorname{vol} B_i < \varepsilon. \]
Since $\varepsilon$ is arbitrarily small, the bounds should be squeezing down on a unique value that is the actual volume of $E$, and so indeed $E$ should have a volume.
For now this is only a plausibility argument, but it is essentially the idea of integration and it will be quantified in chapter 6.
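The squeezing of the lower and upper bounds can be watched numerically. The following Python sketch is an illustration under assumptions made here, not part of the text’s development: the region $E$ is taken to be the ellipse $(x/a)^2 + (y/b)^2 \le 1$ with positive integer semi-axes, the boxes are grid squares of side $1/n$, and the function name `box_bounds` is hypothetical. A square is counted in the lower bound when its corner farthest from the origin lies in the ellipse, and in the upper bound when its point nearest the origin does.

```python
import math

def box_bounds(a, b, n):
    """Return (L, U): total area of grid squares of side 1/n lying inside,
    respectively meeting, the ellipse (x/a)^2 + (y/b)^2 <= 1.
    Assumes a, b, n are positive integers."""
    def inside(i, j):
        # Exact integer test: is the grid point (i/n, j/n) in the ellipse?
        return b * b * i * i + a * a * j * j <= a * a * b * b * n * n

    def nearest(k):
        # Smallest absolute value of a coordinate in [k, k + 1], in grid units.
        return 0 if k <= 0 <= k + 1 else min(abs(k), abs(k + 1))

    def farthest(k):
        # Largest absolute value of a coordinate in [k, k + 1], in grid units.
        return max(abs(k), abs(k + 1))

    inner = outer = 0
    for i in range(-a * n, a * n):
        for j in range(-b * n, b * n):
            if inside(farthest(i), farthest(j)):
                inner += 1      # the whole square lies in the ellipse
            if inside(nearest(i), nearest(j)):
                outer += 1      # the square meets the ellipse
    return inner / n**2, outer / n**2

for n in (10, 40):
    L, U = box_bounds(2, 1, n)
    print(n, L, U, U - L, 2 * math.pi)
```

For the ellipse with semi-axes 2 and 1 the two bounds should bracket the known area $2\pi$, and the gap $U - L$ should shrink as the squares get finer.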
Figure 3.12. Inner and outer approximation of E by boxes
Any $n$ vectors $v_1, \dots, v_n$ in $\mathbb{R}^n$ span an $n$-dimensional parallelepiped
\[ P(v_1, \dots, v_n) = \{\alpha_1 v_1 + \cdots + \alpha_n v_n : 0 \le \alpha_1 \le 1, \dots, 0 \le \alpha_n \le 1\}, \]
abbreviated to $P$ when the vectors are firmly fixed. Again the terminology is pan-dimensional, meaning in particular interval, parallelogram, and parallelepiped in the usual sense for $n = 1, 2, 3$. We will also consider translations of parallelepipeds away from the origin by offset vectors $p$,
\[ P' = P + p = \{v + p : v \in P\}. \]
(See figure 3.13.) A face of a parallelepiped is the set of its points such that some particular $\alpha_i$ is held fixed at 0 or at 1 while the others vary. A parallelepiped in $\mathbb{R}^n$ has $2n$ faces. Boxes are special cases of parallelepipeds. The methods of chapter 6 will show that parallelepipeds are well approximated by boxes, and so they have well defined volumes. We assume that parallelepiped volume is finitely additive, and we assume that any finite union of parallelepipeds each having volume zero again has volume zero.
Figure 3.13. Parallelepipeds
To begin the argument that the linear mapping $T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ magnifies volume by a constant factor, we pass the unit box $B$ and the scaled translated box $B'$ from earlier in the section through $T$. The image of $B$ under $T$ is a parallelepiped $TB$ spanned by $T(e_1), \dots, T(e_n)$, and the image of $B'$ is a parallelepiped $TB'$ spanned by $T(a_1 e_1), \dots, T(a_n e_n)$ and translated by $T(p)$. (See figure 3.14.) Since $T(a_1 e_1) = a_1 T(e_1), \dots, T(a_n e_n) = a_n T(e_n)$, it follows that scaling the sides of $TB$ by $a_1, \dots, a_n$ and then translating the scaled parallelepiped by $T(p)$ gives $TB'$. As for boxes, scaling any spanning vector of a parallelepiped by a real number $a$ magnifies the volume by $|a|$, and so we have
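\[ \operatorname{vol} TB' = a_1 \cdots a_n \cdot \operatorname{vol} TB. \]
But recall that also
\[ \operatorname{vol} B' = a_1 \cdots a_n. \]
The two displays combine to give
\[ \frac{\operatorname{vol} TB'}{\operatorname{vol} B'} = \operatorname{vol} TB. \]
That is, the volume of the $T$-image of a box divided by the volume of the box is constant, regardless of the box’s location or side lengths, the constant being the volume of $TB$, the $T$-image of the unit box $B$. Call this constant magnification factor $t$. Thus,
\[ \operatorname{vol} TB' = t \cdot \operatorname{vol} B'. \tag{3.10} \]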
Figure 3.14. Linear image of the unit box and of a scaled translated box
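We need one last preliminary result about volume. Again let $E$ be a subset of $\mathbb{R}^n$ that is well approximated by boxes. Fix a linear mapping $T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$. Very similarly to the argument for $E$, the set $TE$ also should have a volume, because it is well approximated by parallelepipeds. Indeed, the set containments (3.8) are preserved under the linear mapping $T$,
\[ T \bigcup_{i=1}^{N} B_i \subset TE \subset T \bigcup_{i=1}^{M} B_i. \]
In general, the image of a union is the union of the images, so this rewrites as
\[ \bigcup_{i=1}^{N} TB_i \subset TE \subset \bigcup_{i=1}^{M} TB_i. \]
(See figure 3.15.) As before, numbers at most big enough and at least big enough for the volume of $TE$ are
\[ L = \operatorname{vol} \bigcup_{i=1}^{N} TB_i = \sum_{i=1}^{N} \operatorname{vol} TB_i, \qquad U = \operatorname{vol} \bigcup_{i=1}^{M} TB_i = \sum_{i=1}^{M} \operatorname{vol} TB_i. \]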
The only new wrinkle is that citing the finite additivity of parallelepiped volume here assumes that the parallelepipeds $TB_i$ either inherit from the original boxes $B_i$ the property of being disjoint except possibly for shared faces, or they all have volume zero. The assumption is valid because if $T$ is invertible then the inheritance holds, while if $T$ is not invertible then we will see later in this section that the $TB_i$ have volume zero as desired. With this point established, let $t$ be the factor by which $T$ magnifies box-volume. The previous display and (3.10) combine to show that the difference of the bounds is
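\[ U - L = \sum_{i=N+1}^{M} \operatorname{vol} TB_i = \sum_{i=N+1}^{M} t \cdot \operatorname{vol} B_i = t \cdot \sum_{i=N+1}^{M} \operatorname{vol} B_i \le t\varepsilon. \]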
The inequality is strict if $t > 0$, and it collapses to $U - L = 0$ if $t = 0$. In either case, since $\varepsilon$ is arbitrarily small, the argument that $TE$ should have a volume is the same as for $E$.
Figure 3.15. Inner and outer approximation of T E by parallelepipeds
To complete the argument that the linear mapping $T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ magnifies volume by a constant factor, we argue that for any subset $E$ of $\mathbb{R}^n$ that is well approximated by boxes, $\operatorname{vol} TE$ is $t$ times the volume of $E$. Let $V = \operatorname{vol} \bigcup_{i=1}^{N} B_i$. Then $E$ is contained between a set of volume $V$ and a set of volume less than $V + \varepsilon$ (again see figure 3.12, where $V$ is the shaded area and $V + \varepsilon$ is the total area), and $TE$ is contained between a set of volume $tV$ and a set of volume at most $t(V + \varepsilon)$ (again see figure 3.15, where $tV$ is the shaded area and $t(V + \varepsilon)$ is the total area). Thus the volumes $\operatorname{vol} E$ and $\operatorname{vol} TE$ satisfy the condition
\[ \frac{tV}{V + \varepsilon} \le \frac{\operatorname{vol} TE}{\operatorname{vol} E} \le \frac{t(V + \varepsilon)}{V}. \]
Since ε can be arbitrarily small, the left and right quantities in the display can be arbitrarily close to t, and so the only possible value for the quantity in the middle (which is independent of ε) is t. Thus we have the desired equality announced at the beginning of the section,
vol TE = t · vol E.
In sum, subject to various assumptions about volume, T magnifies the volumes of all boxes and of all figures that are well approximated by boxes by the same factor, which we have denoted t.
Now we investigate the magnification factor $t$ associated to the linear mapping $T$, with the goal of showing that it is $|\det A|$ where $A$ is the matrix of $T$. As a first observation, if the linear mappings $S, T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ magnify volume by $s$ and $t$ respectively, then their composition $S \circ T$ magnifies volume by $st$. In other words, the magnification of linear mappings is multiplicative. Also, recall that the mapping $T$ is simply multiplication by the matrix $A$. Since any matrix is a product of elementary matrices times an echelon matrix, we only need to study the magnification of multiplying by such matrices. Temporarily let $n = 2$.
The 2-by-2 recombine matrices take the form
\[ R = \begin{bmatrix} 1 & a \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad R' = \begin{bmatrix} 1 & 0 \\ a & 1 \end{bmatrix} \]
with $a \in \mathbb{R}$. The standard basis vectors $e_1$ and $e_2$ are taken by $R$ to its columns, $e_1$ and $a e_1 + e_2$. Thus $R$ acts geometrically as a shear by $a$ in the $e_1$-direction, magnifying volume by 1. (See figure 3.16.) Note that $1 = |\det R|$ as desired. The geometry of $R'$ is left as an exercise.
Figure 3.16. Shear
The scale matrices are
\[ S = \begin{bmatrix} a & 0 \\ 0 & 1 \end{bmatrix} \quad\text{and}\quad S' = \begin{bmatrix} 1 & 0 \\ 0 & a \end{bmatrix}. \]
The standard basis gets taken by $S$ to $a e_1$ and $e_2$, so $S$ acts geometrically as a scale in the $e_1$-direction, magnifying volume by $|a|$; this is $|\det S|$, again as desired. (See figure 3.17.) The situation for $S'$ is similar.
Figure 3.17. Scale
The transposition matrix is
\[ T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}. \]
It exchanges $e_1$ and $e_2$, acting as a reflection through the diagonal, magnifying volume by 1. (See figure 3.18.) Since $\det T = -1$, the magnification factor is the absolute value of the determinant.
Figure 3.18. Reflection
Finally, the identity matrix $E = I$ has no effect, magnifying volume by 1, and any other echelon matrix $E$ has bottom row $(0, 0)$ and hence squashes $e_1$ and $e_2$ to vectors whose last component is 0, magnifying volume by 0. (See figure 3.19.) The magnification factor is $|\det E|$ in both cases.
Figure 3.19. Squash
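The four 2-by-2 cases can be tabulated with a short computation. The Python sketch below is an illustration under assumptions made here: the value $a = 3$, the particular squash matrix, and the helper names `image_area`, `det2`, and `matmul2` are all hypothetical choices. The area of the image of the unit square is computed by the shoelace formula applied to the images of its corners, and the last lines check multiplicativity on one product.

```python
def image_area(M):
    """Area of the image of the unit square under the 2-by-2 matrix M,
    by the shoelace formula applied to the images of the four corners."""
    (a, b), (c, d) = M
    # Images of 0, e1, e1 + e2, e2, in order around the square.
    corners = [(0, 0), (a, c), (a + b, c + d), (b, d)]
    twice = sum(x0 * y1 - x1 * y0
                for (x0, y0), (x1, y1) in zip(corners, corners[1:] + corners[:1]))
    return abs(twice) / 2

def det2(M):
    (a, b), (c, d) = M
    return a * d - b * c

def matmul2(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a_val = 3   # sample value of the parameter a
examples = {
    "recombine R": [[1, a_val], [0, 1]],
    "scale S": [[a_val, 0], [0, 1]],
    "transposition T": [[0, 1], [1, 0]],
    "squash E": [[1, 2], [0, 0]],   # an echelon matrix with bottom row (0, 0)
}
for name, M in examples.items():
    print(name, image_area(M), abs(det2(M)))

# Multiplicativity: the product R S T magnifies area by 1 * |a| * 1.
P = matmul2(examples["recombine R"],
            matmul2(examples["scale S"], examples["transposition T"]))
print("product", image_area(P))
```

In each row the computed area should agree with the absolute value of the determinant, matching the shear, scale, reflection, and squash pictures.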
The discussion for scale matrices, transposition matrices, and echelon matrices generalizes effortlessly from 2 to $n$ dimensions, but generalizing the discussion for recombine matrices $R_{i;j,a}$ takes a small argument. Since transposition matrices have no effect on volume, we may multiply $R_{i;j,a}$ from the left and from the right by various transposition matrices to obtain $R_{1;2,a}$ and study it instead. Multiplication by $R_{1;2,a}$ preserves all of the standard basis vectors except $e_2$, which is taken to $a e_1 + e_2$ as before. The resulting parallelepiped $P(e_1, a e_1 + e_2, e_3, \dots, e_n)$ consists of the parallelogram shown in the right side of figure 3.16, extended one unit in each of the remaining $n - 2$ orthogonal directions of $\mathbb{R}^n$. The $n$-dimensional volume of the parallelepiped is its base (the area of the parallelogram, 1) times its height (the $(n-2)$-dimensional volume of the unit box over each point of the parallelogram, again 1). That is, the n-by-n recombine matrix still magnifies volume by 1, the absolute value of its determinant, as desired. The base times height property of volume is yet another assumption invoked here, but it is a consequence of a theorem to be proved in chapter 6, Fubini’s Theorem. Summarizing,
Theorem 3.8.1 (Geometry of Linear Mappings). Any linear mapping $T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is the composition of a possible squash followed by shears, scales and reflections. If the matrix of $T$ is $A$ then $T$ magnifies volume by $|\det A|$.
Proof. The matrix $A$ of $T$ is a product of elementary matrices and an echelon matrix. The elementary matrices act as shears, scales and reflections, and if the echelon matrix is not the identity then it acts as a squash. This proves the first statement. Each elementary or echelon matrix magnifies volume by the absolute value of its determinant. The second statement follows since magnification and $|\det|$ are both multiplicative. □

The work of this section has given a geometric interpretation of the magnitude of $\det A$: it is the magnification factor of multiplication by $A$. If the columns of $A$ are denoted $c_1, \dots, c_n$ then $A e_j = c_j$ for $j = 1, \dots, n$, so that even more explicitly $|\det A|$ is the volume of the parallelepiped spanned by the columns of $A$. For instance, to find the volume of the three-dimensional parallelepiped spanned by the vectors $(1, 2, 3)$, $(2, 3, 4)$, and $(3, 5, 8)$, compute that
\[ \left| \det \begin{bmatrix} 1 & 2 & 3 \\ 2 & 3 & 5 \\ 3 & 4 & 8 \end{bmatrix} \right| = 1. \]
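The computation can be reproduced in a few lines. This Python sketch is only a check of the arithmetic; the helper name `det3` is hypothetical, and it expands the determinant along the first row after placing the three vectors as the columns of the matrix.

```python
def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

vectors = [(1, 2, 3), (2, 3, 4), (3, 5, 8)]
# The vectors are the columns of A, so transpose the list of vectors.
A = [list(row) for row in zip(*vectors)]

# Expansion: 1*(24 - 20) - 2*(16 - 15) + 3*(8 - 9) = 4 - 2 - 3 = -1.
volume = abs(det3(A))
print(A, det3(A), volume)
```

The determinant itself is $-1$, so the absolute value is what gives the volume 1.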
Exercises
3.8.1. (a) The section states that the image of a union is the union of the images. More specifically, let $A$ and $B$ be any sets, let $f : A \longrightarrow B$ be any mapping, and let $A_1, \dots, A_N$ be any subsets of $A$. Show that
\[ f\left( \bigcup_{i=1}^{N} A_i \right) = \bigcup_{i=1}^{N} f(A_i). \]
(This exercise is purely set-theoretic, making no reference to our working environment of $\mathbb{R}^n$.)
(b) Consider a two-point set $A = \{a_1, a_2\}$ where $a_1 \ne a_2$, a one-point set $B = \{b\}$, and the only possible mapping $f : A \longrightarrow B$, given by $f(a_1) = f(a_2) = b$. Let $A_1 = \{a_1\}$ and $A_2 = \{a_2\}$, subsets of $A$. What is the intersection $A_1 \cap A_2$? What is the image of the intersection, $f(A_1 \cap A_2)$? What are the images $f(A_1)$ and $f(A_2)$? What is the intersection of the images, $f(A_1) \cap f(A_2)$? Is the image of an intersection in general the intersection of the images?
3.8.2. Describe the geometric effect of multiplying by the matrices $R'$ and $S'$ in the section. Describe the effect of multiplying by $R$ and $S$ if $a < 0$.