and Corollary 1.4.13. Since we can write an invertible matrix as a product of elementary matrices, these properties determine the determinant of every invertible matrix. But there are many ways to write a given matrix as such a product. Without going through some steps as we have, it won’t be clear that two such products will give the same answer. It isn’t easy to make this idea work.
To complete the proof of Theorem 1.4.7, we must show that the determinant function (1.4.5) we have defined has the properties (1.4.7). This is done by induction on the size of the matrices. We note that the properties (1.4.7) are true when n = 1, in which case det [a] = a.
So we assume that they have been proved for determinants of $(n-1) \times (n-1)$ matrices.
Then all of the properties (1.4.7), (1.4.10), (1.4.13), and (1.4.9) are true for $(n-1) \times (n-1)$ matrices. We proceed to verify (1.4.7) for the function $\delta = \det$ defined by (1.4.5), and for $n \times n$ matrices. For reference, they are:
(i) With I denoting the identity matrix, det (I) = 1.
(ii) det is linear in the rows of the matrix A.
(iii) If two adjacent rows of a matrix A are equal, then det (A) = 0.
(i) If $A = I_n$, then $a_{11} = 1$ and $a_{\nu 1} = 0$ when $\nu > 1$. The expansion (1.4.5) reduces to $\det(A) = 1 \cdot \det(A_{11})$. Moreover, $A_{11} = I_{n-1}$, so by induction, $\det(A_{11}) = 1$ and $\det(I_n) = 1$.
(ii) To prove linearity in the rows, we return to the notation introduced in (1.4.8). We show linearity of each of the terms in the expansion (1.4.5), i.e., that
(1.4.14) $d_{\nu 1} \det(D_{\nu 1}) = c\, a_{\nu 1} \det(A_{\nu 1}) + c'\, b_{\nu 1} \det(B_{\nu 1})$
for every index $\nu$. Let k be as in (1.4.8).
Case 1: $\nu = k$. The row that we operate on has been deleted from the minors $A_{k1}$, $B_{k1}$, $D_{k1}$, so they are equal, and the values of det on them are equal too. On the other hand, $a_{k1}$, $b_{k1}$, $d_{k1}$ are the first entries of the rows $A_k$, $B_k$, $D_k$, respectively. So $d_{k1} = c\, a_{k1} + c'\, b_{k1}$, and (1.4.14) follows.
Case 2: $\nu \neq k$. If we let $A'_k$, $B'_k$, $D'_k$ denote the vectors obtained from the rows $A_k$, $B_k$, $D_k$, respectively, by dropping the first entry, then $A'_k$ is a row of the minor $A_{\nu 1}$, etc. Here $D'_k = c\, A'_k + c'\, B'_k$, and by induction on n, $\det(D_{\nu 1}) = c \det(A_{\nu 1}) + c' \det(B_{\nu 1})$. On the other hand, since $\nu \neq k$, the coefficients $a_{\nu 1}$, $b_{\nu 1}$, $d_{\nu 1}$ are equal. So (1.4.14) is true in this case as well.
(iii) Suppose that rows k and k + 1 of a matrix A are equal. Unless $\nu = k$ or $k + 1$, the minor $A_{\nu 1}$ has two rows equal, and its determinant is zero by induction. Therefore, at most two terms in (1.4.5) are different from zero. On the other hand, deleting either of the equal rows gives us the same matrix. So $a_{k1} = a_{k+1,1}$ and $A_{k1} = A_{k+1,1}$. Then
$\det(A) = \pm a_{k1} \det(A_{k1}) \mp a_{k+1,1} \det(A_{k+1,1}) = 0.$
This completes the proof of Theorem 1.4.7. □
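The three properties just verified can be spot-checked numerically. The sketch below assumes that (1.4.5) is the expansion by minors on the first column, with alternating signs starting from +, as the argument in (i) suggests. The helper names `minor` and `det` and the sample matrices are illustrative choices, not part of the text.

```python
from fractions import Fraction

def minor(a, row, col):
    """Matrix obtained from a by deleting one row and one column (0-based)."""
    return [r[:col] + r[col + 1:] for i, r in enumerate(a) if i != row]

def det(a):
    """Determinant by expansion by minors on the first column (assumed form of (1.4.5))."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for v in range(n):
        sign = -1 if v % 2 else 1  # 0-based row index: signs +, -, +, ...
        total += sign * a[v][0] * det(minor(a, v, 0))
    return total

# (i) the identity matrix
identity = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

# (ii) linearity in one row: A, B, D agree except in the middle row,
# where D's row is c * (row of A) + c2 * (row of B)
c, c2 = Fraction(2), Fraction(-3)
A = [[1, 2, 0], [3, 1, 4], [0, 5, 6]]
B = [[1, 2, 0], [7, -1, 2], [0, 5, 6]]
D = [row[:] for row in A]
D[1] = [c * x + c2 * y for x, y in zip(A[1], B[1])]

# (iii) two adjacent equal rows
E = [[1, 2, 3], [4, 5, 6], [4, 5, 6]]

print(det(identity))                       # 1
print(det(D) == c * det(A) + c2 * det(B))  # True
print(det(E))                              # 0
```

A check on a few matrices is of course not a proof. It only illustrates what the induction establishes for every n.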
Corollary 1.4.15
(a) A square matrix A is invertible if and only if its determinant is different from zero. If A is invertible, then $\det(A^{-1}) = (\det A)^{-1}$.
(b) The determinant of a matrix A is equal to the determinant of its transpose $A^t$.
(c) Properties (1.4.7) and (1.4.10) continue to hold if the word row is replaced by the word column throughout.
Proof. (a) If A is invertible, then it is a product of elementary matrices, say $A = E_1 \cdots E_r$ (1.2.16). Then $\det A = (\det E_1) \cdots (\det E_r)$. The determinants of elementary matrices are nonzero (1.4.13), so $\det A$ is nonzero too. If A is not invertible, there are elementary matrices $E_1, \ldots, E_r$ such that the bottom row of $A' = E_1 \cdots E_r A$ is zero (1.2.15). Then $\det A' = 0$, and $\det A = 0$ as well. If A is invertible, then $\det(A^{-1}) \det A = \det(A^{-1}A) = \det I = 1$, therefore $\det(A^{-1}) = (\det A)^{-1}$.
(b) It is easy to check that $\det E = \det E^t$ if E is an elementary matrix. If A is invertible, we write $A = E_1 \cdots E_r$ as before. Then $A^t = E_r^t \cdots E_1^t$, and by the multiplicative property, $\det A = \det A^t$. If A is not invertible, neither is $A^t$. Then both $\det A$ and $\det A^t$ are zero.
(c) This follows from (b). □
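Parts (a) and (b) of the corollary can be illustrated on small examples. The sketch below reuses a recursive first-column expansion as a stand-in for the determinant. The matrices `A`, `A_inv`, `S`, and `M` are hypothetical examples chosen for this illustration, and the inverse of `A` was worked out by hand.

```python
from fractions import Fraction

def det(a):
    """Determinant by expansion by minors on the first column (recursive sketch)."""
    if len(a) == 1:
        return a[0][0]
    total = 0
    for v, row in enumerate(a):
        sub = [r[1:] for i, r in enumerate(a) if i != v]  # delete row v and column 0
        total += (-1) ** v * row[0] * det(sub)
    return total

def transpose(a):
    """Interchange rows and columns."""
    return [list(col) for col in zip(*a)]

# (a) an invertible matrix and its inverse
A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
A_inv = [[Fraction(-2), Fraction(1)], [Fraction(3, 2), Fraction(-1, 2)]]
print(det(A), det(A_inv))        # -2 -1/2

# (a) a matrix that is not invertible: its third row is the sum of the first two
S = [[1, 2, 3], [4, 5, 6], [5, 7, 9]]
print(det(S))                    # 0

# (b) a matrix and its transpose
M = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]
print(det(M), det(transpose(M))) # 18 18
```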
1.5 PERMUTATIONS
A permutation of a set S is a bijective map p from S to itself:
(1.5.1) p : S → S.
The table
(1.5.2)
   i      1  2  3  4  5
   p(i)   3  5  4  1  2
exhibits a permutation p of the set {1, 2, 3, 4, 5} of five indices: p(1) = 3, etc. It is bijective because every index appears exactly once in the bottom row.
The set of all permutations of the indices {1, 2, . . . , n} is called the symmetric group, and is denoted by $S_n$. It will be discussed in Chapter 2.
The benefit of this definition of a permutation is that it permits composition of permutations to be defined as composition of functions. If q is another permutation, then doing first p then q means composing the functions: q ∘ p. The composition is called the product permutation, and will be denoted by qp.
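As an illustration of treating a permutation as a function, the sketch below stores the permutation p of table (1.5.2) as a Python dictionary and defines the product as composition of maps. The function names and the second permutation `q` (a transposition of 1 and 2) are hypothetical choices made for this example only.

```python
# The permutation p of table (1.5.2), stored as the map i -> p(i).
p = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2}

def is_permutation(f, indices):
    """A map of a finite set to itself is bijective exactly when its values
    run through the whole set, each one once."""
    return set(f) == set(indices) and sorted(f.values()) == sorted(indices)

def compose(q, p):
    """The product qp = q o p: first apply p, then q."""
    return {i: q[p[i]] for i in p}

def inverse(p):
    """The inverse map, which sends p(i) back to i."""
    return {v: k for k, v in p.items()}

# A second permutation, chosen only for illustration: it switches 1 and 2.
q = {1: 2, 2: 1, 3: 3, 4: 4, 5: 5}

print(is_permutation(p, range(1, 6)))  # True
print(compose(q, p))                   # {1: 3, 2: 5, 3: 4, 4: 2, 5: 1}
print(compose(p, q))                   # {1: 5, 2: 3, 3: 4, 4: 1, 5: 2}
```

The two products differ, so the order in which the maps are applied has to be tracked.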
Note: People sometimes like to think of a permutation of the indices 1, . . . , n as a list of the same indices in a different order, as in the bottom row of (1.5.2). This is not good for us. In mathematics one wants to keep track of what happens when one performs two or more permutations in succession. For instance, we may want to obtain a permutation by repeatedly switching pairs of indices. Then unless things are written carefully, keeping track of what has been done becomes a nightmare. □
The tabular form shown above is cumbersome. It is more common to use cycle notation.
To write a cycle notation for the permutation p shown above, we begin with an arbitrary index, say 3, and follow it along: p(3) = 4, p(4) = 1, and p(1) = 3. The string of three indices forms a cycle for the permutation, which is denoted by
(1.5.3) (341).
This notation is interpreted as follows: the index 3 is sent to 4, the index 4 is sent to 1, and the parenthesis at the end indicates that the index 1 is sent back to 3 at the front by the permutation: 3 → 4 → 1 → 3.
Because there are three indices, this is a 3-cycle.
Also, p(2) = 5 and p(5) = 2, so with the analogous notation, the two indices 2, 5 form a 2-cycle (25). 2-cycles are called transpositions.
The complete cycle notation for p is obtained by writing these cycles one after the other:
(1.5.4) p = (341) (25).
The permutation can be read off easily from this notation.
One slight complication is that the cycle notation isn’t unique, for two reasons. First, we might have started with an index different from 3. Thus
(341), (134) and (413)
are notations for the same 3-cycle. Second, the order in which the cycles are written doesn’t matter. Cycles made up of disjoint sets of indices can be written in any order. We might just as well write
p = (52)(134).
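The procedure of following an index until it returns to its starting point can be sketched in a few lines. The function name `cycles` and the choice to start each cycle at its smallest index are assumptions of this sketch. Because of that choice, the 3-cycle is printed as (1, 3, 4), which is another notation for (341).

```python
def cycles(p):
    """Decompose a permutation (a dict i -> p(i)) into disjoint cycles.
    Each cycle starts at its smallest index; 1-cycles are omitted."""
    seen, result = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:      # follow the index until the cycle closes up
            seen.add(i)
            cycle.append(i)
            i = p[i]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

p = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2}   # the permutation of table (1.5.2)
print(cycles(p))                      # [(1, 3, 4), (2, 5)]
```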
The indices (which are 1, 2, 3, 4, 5 here) may be grouped into cycles arbitrarily, and the result will be a cycle notation for some permutation. For example, (34)(2)(15) represents the permutation that switches two pairs of indices, while fixing 2. However, 1-cycles, the indices that are left fixed, are often omitted from the cycle notation. We might write this permutation as (34)(15). The 4-cycle
(1.5.5) q = (1452)
is interpreted as meaning that the missing index 3 is left fixed. Then in a cycle notation for a permutation, every index appears at most once. (Of course this convention assumes that the set of indices is known.) The one exception to this rule is for the identity permutation. We’d rather not use the empty symbol to denote this permutation, so we denote it by 1.
To compute the product permutation qp, with p and q as above, we follow the indices through the two permutations, but we must remember that qp means q ∘ p, “first do p, then q.” So since p sends 3 → 4 and q sends 4 → 5, qp sends 3 → 5. Unfortunately, we read cycles from left to right, but we have to run through the permutations from right to left, in a zig-zag fashion. This takes some getting used to, but in the end it is not difficult. The result in our case is a 3-cycle:
$qp = \overbrace{[(1452)]}^{\text{then this}} \circ \overbrace{[(341)(25)]}^{\text{first do this}} = (135),$
the missing indices 2 and 4 being left fixed. On the other hand,
pq = (234).
Composition of permutations is not a commutative operation.
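The two products can be recomputed mechanically. The sketch below is one way to do it, under the conventions of this section: a cycle sends each index to the next one and the last back to the first, missing indices are fixed, and qp means "first p, then q". The helper names `from_cycles`, `compose`, and `cycles` are hypothetical.

```python
def from_cycles(cycle_list, n):
    """Build the map i -> p(i) on {1, ..., n} from disjoint cycles.
    Indices that appear in no cycle are left fixed."""
    p = {i: i for i in range(1, n + 1)}
    for cycle in cycle_list:
        for pos, i in enumerate(cycle):
            p[i] = cycle[(pos + 1) % len(cycle)]  # last index wraps to the front
    return p

def compose(q, p):
    """The product qp = q o p: first apply p, then q."""
    return {i: q[p[i]] for i in p}

def cycles(p):
    """Disjoint cycles of p, each started at its smallest index, 1-cycles omitted."""
    seen, result = set(), []
    for start in sorted(p):
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = p[i]
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

p = from_cycles([(3, 4, 1), (2, 5)], 5)   # p = (341)(25)
q = from_cycles([(1, 4, 5, 2)], 5)        # q = (1452)

print(cycles(compose(q, p)))   # [(1, 3, 5)]
print(cycles(compose(p, q)))   # [(2, 3, 4)]
```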
There is a permutation matrix P associated to any permutation p. Left multiplication by this permutation matrix permutes the entries of a vector X using the permutation p.
For example, if there are three indices, the matrix P associated to the cyclic permutation p = (123) and its operation on a column vector are as follows:
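(1.5.6) $PX = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_3 \\ x_1 \\ x_2 \end{bmatrix}$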
Multiplication by P shifts the first entry of the vector X to the second position and so on.
It is essential to write the matrix of an arbitrary permutation down carefully, and to check that the matrix associated to a product pq of permutations is the product matrix PQ.
The matrix associated to a transposition (25) is an elementary matrix of the second type, the one that interchanges the two corresponding rows. This is easy to see. But for a general permutation, determining the matrix can be confusing.
To write a permutation matrix explicitly, it is best to use the $n \times n$ matrix units $e_{ij}$, the matrices with a single 1 in the i, j position that were defined before (1.1.21). The matrix associated to a permutation p of $S_n$ is
(1.5.7) $P = \sum_i e_{pi,i}.$
(In order to make the subscript as compact as possible, we have written pi for p(i).) This matrix acts on the vector $X = \sum_j e_j x_j$ as follows:
(1.5.8) $PX = \Big(\sum_i e_{pi,i}\Big)\Big(\sum_j e_j x_j\Big) = \sum_{i,j} e_{pi,i}\, e_j x_j = \sum_i e_{pi,i}\, e_i x_i = \sum_i e_{pi}\, x_i.$
This computation is made using formula (1.1.25). The terms $e_{pi,i}\, e_j$ in the double sum are zero when $i \neq j$.
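The sum of matrix units in (1.5.8) places a 1 in position (p(i), i) for each index i. A minimal sketch of that construction follows, together with the two checks recommended above: that it reproduces the matrix of (1.5.6), and that the matrix of a product pq is the product PQ. The helper names and the sample vector are hypothetical. Indices are 1-based in the permutations and 0-based in the Python lists.

```python
def perm_matrix(p, n):
    """Matrix with a 1 in position (p(i), i) for each i, zeros elsewhere."""
    P = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):
        P[p[i] - 1][i - 1] = 1
    return P

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def apply(P, x):
    """Product of the matrix P with the column vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in P]

def compose(q, p):
    """The product qp = q o p: first apply p, then q."""
    return {i: q[p[i]] for i in p}

# The cyclic permutation (123) of example (1.5.6)
c = {1: 2, 2: 3, 3: 1}
print(perm_matrix(c, 3))                      # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(apply(perm_matrix(c, 3), [10, 20, 30])) # [30, 10, 20]

# p = (341)(25) and q = (1452): the matrix of pq equals PQ
p = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2}
q = {1: 4, 2: 1, 3: 3, 4: 5, 5: 2}
P, Q = perm_matrix(p, 5), perm_matrix(q, 5)
print(perm_matrix(compose(p, q), 5) == matmul(P, Q))  # True
```

With this convention the entry in position i of X lands in position p(i) of PX, which is why the first entry moves to the second position in (1.5.6).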
To express the right side of (1.5.8) as a column vector, we have to reindex so that the standard basis vectors on the right are in the correct order, $e_1, \ldots, e_n$ rather than in the