Section 1.2 Row Reduction
that (column i) does not contain a pivot. When these arbitrary values are assigned, the other
1.3 THE MATRIX TRANSPOSE
In the discussion of the previous section, we chose to work with rows in order to apply the results to systems of linear equations. One may also perform column operations to simplify a matrix, and it is evident that similar results will be obtained.
^Elements of a set are said to be distinct if no two of them are equal.
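Rows and columns are interchanged by the transpose operation on matrices. The transpose of an $m \times n$ matrix $A$ is the $n \times m$ matrix $A^t$ obtained by reflecting about the diagonal: $A^t = (b_{ij})$, where $b_{ij} = a_{ji}$. For instance,
$$\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}^t = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 1 & 2 & 3 \end{bmatrix}^t = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}.$$
Here are the rules for computing with the transpose:
(1.3.1) $(AB)^t = B^t A^t, \quad (A + B)^t = A^t + B^t, \quad (cA)^t = cA^t, \quad (A^t)^t = A.$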
Using the first of these formulas, we can deduce facts about right multiplication from the corresponding facts about left multiplication. The elementary matrices (1.2.4) act by right multiplication $AE$ as the following elementary column operations:
(1.3.2) "with $a$ in the $i, j$ position, add $a \cdot$(column $i$) to (column $j$)";
"interchange (column $i$) and (column $j$)";
"multiply (column $i$) by a nonzero scalar $c$."
Note that in the first of these operations, the indices $i, j$ are the reverse of those in (1.2.5a).
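The first rule of (1.3.1) and the first column operation of (1.3.2) can be spot-checked numerically. The following is a minimal sketch in plain Python; the helper names `transpose`, `matmul`, and `elementary_add`, the 0-based indices, and the sample matrices are illustrative assumptions, not notation from the text.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(A, B):
    """Product of an m x n matrix A and an n x p matrix B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def elementary_add(n, i, j, a):
    """n x n identity matrix with the extra entry a in position (i, j), i != j.

    Indices are 0-based here (an assumption of this sketch).
    """
    E = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    E[i][j] = a
    return E


A = [[1, 2], [3, 4]]
B = [[0, 1], [5, 2]]

# First rule of (1.3.1): (AB)^t = B^t A^t.
lhs = transpose(matmul(A, B))
rhs = matmul(transpose(B), transpose(A))

# Right multiplication by E, with a = 10 in position (0, 1),
# adds 10 * (column 0) to (column 1) of A.
E = elementary_add(2, 0, 1, 10)
AE = matmul(A, E)

print(lhs == rhs)
print(AE)
```

Here `AE` differs from `A` only in its second column, which matches the reversal of indices noted above: the same matrix acting on the left would add a multiple of (row 1) to (row 0).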
1.4 DETERMINANTS
Every square matrix $A$ has a number associated to it called its determinant, and denoted by $\det A$. We define the determinant and derive some of its properties here.
The determinant of a $1 \times 1$ matrix is equal to its single entry
(1.4.1) $\det\,[a] = a,$
and the determinant of a $2 \times 2$ matrix is given by the formula
(1.4.2) $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc.$
The determinant of a $2 \times 2$ matrix $A$ has a geometric interpretation. Left multiplication by $A$ maps the space $\mathbb{R}^2$ of real two-dimensional column vectors to itself, and the area of the parallelogram that forms the image of the unit square via this map is the absolute value of the determinant of $A$. The determinant is positive or negative, according to whether the orientation of the square is preserved or reversed by the operation. Moreover, $\det A = 0$ if and only if the parallelogram degenerates to a line segment or a point, which happens when the columns of the matrix are proportional.
The map defined by the matrix
$$\begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix}$$
is shown in the figure.
[Figure: the image of the unit square under left multiplication by this matrix.] The shaded region is the image of the unit square under the map. Its area is 10.
This geometric interpretation extends to higher dimensions. Left multiplication by a $3 \times 3$ real matrix $A$ maps the space $\mathbb{R}^3$ of three-dimensional column vectors to itself, and the absolute value of its determinant is the volume of the image of the unit cube.
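The area interpretation can be checked directly for the matrix with rows (3, 2) and (1, 4) from the figure. The sketch below maps the corners of the unit square and measures the image with the shoelace formula for signed polygon area. The shoelace formula and the helper names `det2`, `apply`, and `signed_area` are assumptions of this sketch; the text itself does not use them.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix, formula (1.4.2): ad - bc."""
    (a, b), (c, d) = M
    return a * d - b * c


def apply(M, v):
    """Left-multiply the column vector v = (x, y) by the 2 x 2 matrix M."""
    x, y = v
    return (M[0][0] * x + M[0][1] * y, M[1][0] * x + M[1][1] * y)


def signed_area(poly):
    """Shoelace formula: positive for counterclockwise vertex order."""
    total = 0
    for k in range(len(poly)):
        x0, y0 = poly[k]
        x1, y1 = poly[(k + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return total / 2


# Corners of the unit square, listed counterclockwise.
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]

A = [[3, 2], [1, 4]]
image = [apply(A, v) for v in unit_square]
print(det2(A), signed_area(image))          # both equal 10

# Swapping the columns reverses orientation, so the sign flips.
A_swapped = [[2, 3], [4, 1]]
image_swapped = [apply(A_swapped, v) for v in unit_square]
print(det2(A_swapped), signed_area(image_swapped))
```

The signed area is used instead of the absolute area so that the orientation statement can be seen as well: the image of a counterclockwise square is traversed clockwise exactly when the determinant is negative.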
The set of all real $n \times n$ matrices forms a space of dimension $n^2$ that we denote by $\mathbb{R}^{n \times n}$. We regard the determinant of $n \times n$ matrices as a function from this space to the real numbers:
$\det : \mathbb{R}^{n \times n} \to \mathbb{R}.$
The determinant of an $n \times n$ matrix is a function of its $n^2$ entries. There is one such function for each positive integer $n$. Unfortunately, there are many formulas for these determinants, and all of them are complicated when $n$ is large. Not only are the formulas complicated, but it may not be easy to show directly that two of them define the same function.
We use the following strategy: We choose one of the formulas, and take it as our definition of the determinant. In that way we are talking about a particular function. We show that our chosen function is the only one having certain special properties. Then, to show that another formula defines the same determinant function, one needs only to check those properties for the other function. This is often not too difficult.
We use a formula that computes the determinant of an $n \times n$ matrix in terms of certain $(n-1) \times (n-1)$ determinants by a process called expansion by minors. The determinants of submatrices of a matrix are called minors. Expansion by minors allows us to give a recursive definition of the determinant.
The word recursive means that the definition of the determinant for $n \times n$ matrices makes use of the determinant for $(n-1) \times (n-1)$ matrices. Since we have defined the determinant for $1 \times 1$ matrices, we will be able to use our recursive definition to compute $2 \times 2$ determinants, then knowing this, to compute $3 \times 3$ determinants, and so on.
Let $A$ be an $n \times n$ matrix and let $A_{ij}$ denote the $(n-1) \times (n-1)$ submatrix obtained by crossing out the $i$th row and the $j$th column of $A$:
(1.4.4) [Diagram: the matrix $A$ with row $i$ and column $j$ crossed out] $= A_{ij}$.
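For example, if
$$A = \begin{bmatrix} 1 & 0 & 3 \\ 2 & 1 & 2 \\ 0 & 5 & 1 \end{bmatrix}, \quad\text{then}\quad A_{21} = \begin{bmatrix} 0 & 3 \\ 5 & 1 \end{bmatrix}.$$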
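• Expansion by minors on the first column is the formula
(1.4.5) $\det A = a_{11} \det A_{11} - a_{21} \det A_{21} + \cdots \pm a_{n1} \det A_{n1}.$
The signs alternate, beginning with $+$.
It is useful to write this expansion in summation notation:
(1.4.6) $\det A = \sum_{\nu} \pm\, a_{\nu 1} \det A_{\nu 1}.$
The alternating sign can be written as $(-1)^{\nu+1}$. It will appear again. We take this formula, together with (1.4.1), as a recursive definition of the determinant.
For $1 \times 1$ and $2 \times 2$ matrices, this formula agrees with (1.4.1) and (1.4.2). The determinant of the $3 \times 3$ matrix $A$ shown above is
$$\det A = 1 \cdot \det \begin{bmatrix} 1 & 2 \\ 5 & 1 \end{bmatrix} - 2 \cdot \det \begin{bmatrix} 0 & 3 \\ 5 & 1 \end{bmatrix} + 0 \cdot \det \begin{bmatrix} 0 & 3 \\ 1 & 2 \end{bmatrix} = 1 \cdot (-9) - 2 \cdot (-15) + 0 = 21.$$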
Expansions by minors on other columns and on rows, which we define in Section 1.6, are among the other formulas for the determinant.
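The recursive definition translates almost word for word into code. The following is a minimal sketch of expansion by minors on the first column; the function names `minor` and `det` and the list-of-rows representation are assumptions of this sketch, and indices are 0-based, so the sign $(-1)^{\nu+1}$ for a 1-based $\nu$ becomes `(-1) ** v` for a 0-based `v`.

```python
def minor(A, i, j):
    """Submatrix of A obtained by crossing out row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]


def det(A):
    """Determinant by expansion by minors on the first column.

    The base case is the 1 x 1 formula (1.4.1); each n x n determinant
    is computed from (n-1) x (n-1) determinants.
    """
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for v in range(n):
        sign = (-1) ** v            # +, -, +, ... beginning with +
        total += sign * A[v][0] * det(minor(A, v, 0))
    return total


A = [[1, 0, 3],
     [2, 1, 2],
     [0, 5, 1]]

print(minor(A, 1, 0))   # the submatrix called A_21 in the text
print(det(A))
```

The recursion makes roughly $n!$ multiplications, so this is suitable for checking small examples by hand rather than for computing large determinants.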
It is important to know the many special properties satisfied by determinants. We present some of these properties here, deferring proofs to the end of the section. Because we want to apply the discussion to other formulas, the properties will be stated for an unspecified function $\delta$.
Theorem 1.4.7 Uniqueness of the Determinant. There is a unique function $\delta$ on the space of $n \times n$ matrices with the properties below, namely the determinant (1.4.5).
(i) With $I$ denoting the identity matrix, $\delta(I) = 1$.
(ii) $\delta$ is linear in the rows of the matrix $A$.
(iii) If two adjacent rows of a matrix $A$ are equal, then $\delta(A) = 0$.
The statement that $\delta$ is linear in the rows of a matrix means this: Let $A_i$ denote the $i$th row of a matrix $A$. Let $A$, $B$, $D$ be three matrices, all of whose entries are equal, except for those in the rows indexed by $k$. Suppose furthermore that $D_k = cA_k + c'B_k$ for some scalars $c$ and $c'$. Then $\delta(D) = c\,\delta(A) + c'\,\delta(B)$:
(1.4.8) $\delta \begin{bmatrix} \vdots \\ cA_i + c'B_i \\ \vdots \end{bmatrix} = c\,\delta \begin{bmatrix} \vdots \\ A_i \\ \vdots \end{bmatrix} + c'\,\delta \begin{bmatrix} \vdots \\ B_i \\ \vdots \end{bmatrix}.$
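This allows us to operate on one row at a time, the other rows being left fixed. For example, since $[\,0\ 2\ 3\,] = 2\,[\,0\ 1\ 0\,] + 3\,[\,0\ 0\ 1\,]$,
$$\delta \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix} = 2\,\delta \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} + 3\,\delta \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{bmatrix} = 2 \cdot 1 + 3 \cdot 0 = 2.$$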
Perhaps the most important property of the determinant is its compatibility with matrix multiplication.
Theorem 1.4.9 Multiplicative Property of the Determinant. For any $n \times n$ matrices $A$ and $B$, $\det(AB) = (\det A)(\det B)$.
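Linearity in the rows and the multiplicative property can both be spot-checked on small integer matrices. In the sketch below, `det` is the first-column expansion by minors, and the names `det`, `matmul`, and the sample matrix `B` are assumptions made for illustration. A numerical check on one pair of matrices is of course not a proof.

```python
def det(A):
    """Determinant by expansion by minors on the first column."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for v, row in enumerate(A):
        sub = [r[1:] for k, r in enumerate(A) if k != v]  # cross out row v, column 0
        total += (-1) ** v * row[0] * det(sub)
    return total


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# Linearity in the second row: [0 2 3] = 2*[0 1 0] + 3*[0 0 1].
D = [[1, 0, 0], [0, 2, 3], [0, 0, 1]]
D1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
D2 = [[1, 0, 0], [0, 0, 1], [0, 0, 1]]
print(det(D), 2 * det(D1) + 3 * det(D2))

# Multiplicative property on one sample pair.
A = [[1, 0, 3], [2, 1, 2], [0, 5, 1]]
B = [[2, 1, 0], [0, 1, 4], [1, 0, 1]]   # hypothetical second matrix
print(det(matmul(A, B)), det(A) * det(B))
```

Integer entries are used so that both sides are computed exactly and can be compared with `==`.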
The next theorem gives additional properties that are implied by those listed in (1.4.7).
Theorem 1.4.10 Let $\delta$ be a function on $n \times n$ matrices that has the properties (1.4.7)(i,ii,iii). Then
(a) If $A'$ is obtained from $A$ by adding a multiple of (row $j$) of $A$ to (row $i$) and $i \neq j$, then $\delta(A') = \delta(A)$.
(b) If $A'$ is obtained by interchanging (row $i$) and (row $j$) of $A$ and $i \neq j$, then $\delta(A') = -\delta(A)$.
(c) If $A'$ is obtained from $A$ by multiplying (row $i$) by a scalar $c$, then $\delta(A') = c\,\delta(A)$. If a row of a matrix $A$ is equal to zero, then $\delta(A) = 0$.
(d) If (row $i$) of $A$ is equal to a multiple of (row $j$) and $i \neq j$, then $\delta(A) = 0$.
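Taking $\delta$ to be the determinant, which Theorem 1.4.7 asserts has properties (i) to (iii), the four statements can be checked on an example. The sketch below applies each row operation to a sample matrix; the helper names `add_multiple`, `swap`, and `scale`, the 0-based row indices, and the chosen scalars are assumptions of this sketch.

```python
def det(A):
    """Determinant by expansion by minors on the first column."""
    if len(A) == 1:
        return A[0][0]
    total = 0
    for v, row in enumerate(A):
        sub = [r[1:] for k, r in enumerate(A) if k != v]
        total += (-1) ** v * row[0] * det(sub)
    return total


def add_multiple(A, i, j, c):
    """Copy of A with c * (row j) added to (row i), i != j."""
    M = [row[:] for row in A]
    M[i] = [x + c * y for x, y in zip(M[i], M[j])]
    return M


def swap(A, i, j):
    """Copy of A with (row i) and (row j) interchanged."""
    M = [row[:] for row in A]
    M[i], M[j] = M[j], M[i]
    return M


def scale(A, i, c):
    """Copy of A with (row i) multiplied by the scalar c."""
    M = [row[:] for row in A]
    M[i] = [c * x for x in M[i]]
    return M


A = [[1, 0, 3], [2, 1, 2], [0, 5, 1]]

print(det(A))                          # reference value
print(det(add_multiple(A, 0, 2, 7)))   # (a): unchanged
print(det(swap(A, 0, 2)))              # (b): sign changes
print(det(scale(A, 1, 5)))             # (c): multiplied by 5
print(det(scale(A, 1, 0)))             # (c): a zero row gives 0

# (d): replace (row 2) by 3 * (row 0), so one row is a multiple of another.
proportional = [A[0], A[1], [3 * x for x in A[0]]]
print(det(proportional))
```

Rows 0 and 2 are deliberately non-adjacent in the (a), (b), and (d) checks, since the theorem is stated for an arbitrary pair of distinct indices.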
We now proceed to prove the three theorems stated above, in reverse order. The fact that there are quite a few points to be examined makes the proofs lengthy. This can’t be helped.
Proof of Theorem 1.4.10. The first assertion of (c) is a part of linearity in rows (1.4.7)(ii).
The second assertion of (c) follows, because a row that is zero can be multiplied by 0 without changing the matrix, and it multiplies $\delta(A)$ by 0.
Next, we verify properties (a),(b),(d) when $i$ and $j$ are adjacent indices, say $j = i + 1$. To simplify our display, we represent the matrices schematically, denoting the rows in question by $R$ = (row $i$) and $S$ = (row $j$), and suppressing notation for the other rows. So $\begin{bmatrix} R \\ S \end{bmatrix}$ denotes our given matrix $A$. Then by linearity in the $i$th row,
(1.4.11) $\delta \begin{bmatrix} R + cS \\ S \end{bmatrix} = \delta \begin{bmatrix} R \\ S \end{bmatrix} + c\,\delta \begin{bmatrix} S \\ S \end{bmatrix}.$
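The first term on the right side is $\delta(A)$, and the second is zero (1.4.7). This proves (a) for adjacent indices. To verify (b) for adjacent indices, we use (a) repeatedly. Denoting the rows by $R$ and $S$ as before:
(1.4.12) $\delta \begin{bmatrix} R \\ S \end{bmatrix} = \delta \begin{bmatrix} R - S \\ S \end{bmatrix} = \delta \begin{bmatrix} R - S \\ S + (R - S) \end{bmatrix} = \delta \begin{bmatrix} R - S \\ R \end{bmatrix} = \delta \begin{bmatrix} -S \\ R \end{bmatrix} = -\delta \begin{bmatrix} S \\ R \end{bmatrix}.$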
Finally, (d) for adjacent indices follows from (c) and (1.4.7)(iii).
To complete the proof, we verify (a),(b),(d) for an arbitrary pair of distinct indices.
Suppose that (row $i$) is a multiple of (row $j$). We switch adjacent rows a few times to obtain a matrix $A'$ in which the two rows in question are adjacent. Then (d) for adjacent rows tells us that $\delta(A') = 0$, and (b) for adjacent rows tells us that $\delta(A') = \pm\delta(A)$. So $\delta(A) = 0$, and this proves (d). At this point, the proofs that we have given for (a) and (b) in the case of adjacent indices carry over to an arbitrary pair of indices. □
The rules (1.4.10)(a),(b),(c) show how multiplication by an elementary matrix affects $\delta$, and they lead to the next corollary.
Canceling $\delta(E)$, we see that the multiplicative property is true for $A$ and $B$ as well. This being so, induction shows that it suffices to prove the multiplicative property after row-reducing $A$. So we may suppose that $A$ is row reduced. Then $A$ is either the identity, or else its bottom row is zero. The property is obvious when $A = I$. If the bottom row of $A$ is zero, so is the bottom row of $AB$, and Theorem 1.4.10 shows that $\delta(A) = \delta(AB) = 0$. The property is true in this case as well. □
Proof of uniqueness of the determinant, Theorem 1.4.7. There are two parts. To prove uniqueness, we perform row reduction on a matrix $A$, say $A' = E_k \cdots E_1 A$. Corollary 1.4.13 tells us how to compute $\delta(A)$ from $\delta(A')$. If $A'$ is the identity, then $\delta(A') = 1$. Otherwise the bottom row of $A'$ is zero, and in that case Theorem 1.4.10 shows that $\delta(A') = 0$. This determines $\delta(A)$ in both cases.
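The uniqueness argument is effectively an algorithm: row-reduce the matrix while recording how each elementary row operation changes $\delta$, using only the rules of Theorem 1.4.10. The sketch below follows that idea under stated assumptions: the function name `det_by_row_reduction` and the bookkeeping variable `factor` are hypothetical, and exact arithmetic with `fractions.Fraction` is a choice made here to avoid rounding.

```python
from fractions import Fraction


def det_by_row_reduction(A):
    """Compute delta(A) using only the row-operation rules of Theorem 1.4.10.

    Invariant: delta(original A) == factor * delta(current M).
    """
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    factor = Fraction(1)
    for col in range(n):
        # Look for a nonzero entry in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            # The reduced matrix cannot be the identity, so its bottom
            # row is zero and delta vanishes.
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            factor = -factor                  # rule (b): interchange flips the sign
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        factor *= p                           # rule (c): dividing a row by p
        for r in range(n):
            if r != col and M[r][col] != 0:
                c = M[r][col]
                # rule (a): adding a multiple of a row changes nothing
                M[r] = [x - c * y for x, y in zip(M[r], M[col])]
    # M is now the identity, and delta(I) = 1 by property (i).
    return factor


A = [[1, 0, 3], [2, 1, 2], [0, 5, 1]]
print(det_by_row_reduction(A))
print(det_by_row_reduction([[0, 1], [1, 0]]))
print(det_by_row_reduction([[1, 2], [2, 4]]))
```

Because the computation uses nothing beyond properties (i) to (iii) and their consequences, any function with those properties must return the same values, which is the content of the uniqueness statement.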