Chapter 3 - From Gaussian Elimination to LU Factorization
Maggie Myers Robert A. van de Geijn The University of Texas at Austin
Practical Linear Algebra – Fall 2009
Gaussian Elimination - Take 1
Consider the system of linear equations
\[
\begin{aligned}
2x + 4y - 2z &= -10 \\
4x - 2y + 6z &= 20 \\
6x - 4y + 2z &= 18
\end{aligned}
\]
Notice that $x$, $y$, and $z$ are just variables, for which we can pick any name we want. To be consistent with the notation we introduced previously for naming components of vectors, we use the names $\chi_0$, $\chi_1$, and $\chi_2$ instead of $x$, $y$, and $z$, respectively:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
4\chi_0 - 2\chi_1 + 6\chi_2 &= 20 \\
6\chi_0 - 4\chi_1 + 2\chi_2 &= 18
\end{aligned}
\]
Solving this linear system relies on the fact that its solution does not change if
1. Equations are reordered (not actually used in this example); and/or
2. An equation in the system is modified by subtracting a multiple of another equation in the system from it; and/or
3. Both sides of an equation in the system are scaled by a nonzero scalar.
Example: Gaussian Elimination
The following steps are known as Gaussian elimination. They transform a system of linear equations to an equivalent upper triangular system of linear equations:
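Subtract $\lambda_{10} = (4/2) = 2$ times the first equation from the second equation:
Before:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
4\chi_0 - 2\chi_1 + 6\chi_2 &= 20 \\
6\chi_0 - 4\chi_1 + 2\chi_2 &= 18
\end{aligned}
\]
After:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
-10\chi_1 + 10\chi_2 &= 40 \\
6\chi_0 - 4\chi_1 + 2\chi_2 &= 18
\end{aligned}
\]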
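Subtract $\lambda_{20} = (6/2) = 3$ times the first equation from the third equation:
Before:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
-10\chi_1 + 10\chi_2 &= 40 \\
6\chi_0 - 4\chi_1 + 2\chi_2 &= 18
\end{aligned}
\]
After:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
-10\chi_1 + 10\chi_2 &= 40 \\
-16\chi_1 + 8\chi_2 &= 48
\end{aligned}
\]
Subtract $\lambda_{21} = ((-16)/(-10)) = 1.6$ times the second equation from the third equation:
Before:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
-10\chi_1 + 10\chi_2 &= 40 \\
-16\chi_1 + 8\chi_2 &= 48
\end{aligned}
\]
After:
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
-10\chi_1 + 10\chi_2 &= 40 \\
-8\chi_2 &= -16
\end{aligned}
\]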
This now leaves us with an upper triangular system of linear equations.
Multipliers
In the above Gaussian elimination procedure, $\lambda_{10}$, $\lambda_{20}$, and $\lambda_{21}$ are called the multipliers.
Back substitution
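\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
-10\chi_1 + 10\chi_2 &= 40 \\
-8\chi_2 &= -16
\end{aligned}
\]
Solve last equation: $\chi_2 = -16/(-8) = 2$.
Substitute $\chi_2 = 2$ into second equation and solve: $\chi_1 = (40 - 10(2))/(-10) = -2$.
Substitute $\chi_2 = 2$ and $\chi_1 = -2$ into first equation and solve: $\chi_0 = (-10 - (4(-2) + (-2)(2)))/2 = 1$.
Thus, the solution is the vector
\[
x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}.
\]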
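The back substitution steps above can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name `back_substitute` is made up here, the system is passed as a list of rows, and exact fractions are used so that the arithmetic matches the hand computation. It assumes every diagonal entry is nonzero.

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve U x = b for an upper triangular U with nonzero diagonal."""
    n = len(U)
    x = [Fraction(0)] * n
    # Work from the last equation up to the first.
    for i in range(n - 1, -1, -1):
        # Contribution of the unknowns that have already been computed.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / U[i][i]
    return x

# The upper triangular system obtained above.
U = [[2, 4, -2], [0, -10, 10], [0, 0, -8]]
b = [-10, 40, -16]
x = back_substitute(U, b)
```

With the system from the example, `x` holds the components $\chi_0 = 1$, $\chi_1 = -2$, $\chi_2 = 2$.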
Gaussian Elimination - Take 2
It becomes very cumbersome to always write the entire equation.
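The information is encoded in the coefficients in front of the $\chi_i$ variables, and the values to the right of the equal signs.
We could just let
\[
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
4 & -2 & 6 & 20 \\
6 & -4 & 2 & 18
\end{array}\right)
\]
represent
\[
\begin{aligned}
2\chi_0 + 4\chi_1 - 2\chi_2 &= -10 \\
4\chi_0 - 2\chi_1 + 6\chi_2 &= 20 \\
6\chi_0 - 4\chi_1 + 2\chi_2 &= 18
\end{aligned}
\]
Then Gaussian elimination can simply work with this array of numbers.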
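Initial system of equations:
\[
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
4 & -2 & 6 & 20 \\
6 & -4 & 2 & 18
\end{array}\right)
\]
Subtract $\lambda_{10} = (4/2) = 2$ times the first row from the second row:
\[
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
6 & -4 & 2 & 18
\end{array}\right)
\]
Subtract $\lambda_{20} = (6/2) = 3$ times the first row from the third row:
\[
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
0 & -16 & 8 & 48
\end{array}\right)
\]
Subtract $\lambda_{21} = ((-16)/(-10)) = 1.6$ times the second row from the third row:
\[
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
0 & 0 & -8 & -16
\end{array}\right)
\]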
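The row operations on the array of numbers can be sketched as follows. The function name `eliminate` and the dictionary of multipliers keyed by (row, column) are choices made for this illustration only. The sketch assumes every pivot is nonzero, since no rows are exchanged, and it uses exact fractions so that the multiplier 1.6 is represented as 8/5.

```python
from fractions import Fraction

def eliminate(aug):
    """Reduce an augmented array [A | b] to upper triangular form.

    Returns the reduced array and the multipliers, keyed by (row, column).
    Assumes every pivot is nonzero (no row exchanges are performed).
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    multipliers = {}
    for j in range(n - 1):            # column in which zeros are introduced
        for i in range(j + 1, n):     # rows below the diagonal
            lam = rows[i][j] / rows[j][j]
            multipliers[(i, j)] = lam
            # Subtract lam times row j from row i.
            rows[i] = [a - lam * p for a, p in zip(rows[i], rows[j])]
    return rows, multipliers

reduced, lam = eliminate([[2, 4, -2, -10],
                          [4, -2, 6, 20],
                          [6, -4, 2, 18]])
```

For the example, `reduced` is the final array shown above and `lam` holds the three multipliers 2, 3, and 8/5.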
Back substitution
\[
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
0 & 0 & -8 & -16
\end{array}\right)
\]
The last row is shorthand for $-8\chi_2 = -16$, which implies $\chi_2 = (-16)/(-8) = 2$.
The second row is shorthand for $-10\chi_1 + 10\chi_2 = 40$, which implies $-10\chi_1 + 10(2) = 40$ and hence $\chi_1 = (40 - 10(2))/(-10) = -2$.
The first row is shorthand for $2\chi_0 + 4\chi_1 - 2\chi_2 = -10$, which implies $2\chi_0 + 4(-2) - 2(2) = -10$ and hence $\chi_0 = (-10 - 4(-2) + 2(2))/(2) = 1$.
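Solution equals
\[
x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \chi_2 \end{pmatrix} = \begin{pmatrix} 1 \\ -2 \\ 2 \end{pmatrix}.
\]
Check the answer (by plugging $\chi_0 = 1$, $\chi_1 = -2$, and $\chi_2 = 2$ into the original system):
\[
\begin{aligned}
2(1) + 4(-2) - 2(2) &= -10 \quad \checkmark \\
4(1) - 2(-2) + 6(2) &= 20 \quad \checkmark \\
6(1) - 4(-2) + 2(2) &= 18 \quad \checkmark
\end{aligned}
\]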
Observations
The above discussion motivates storing only the coefficients of a linear system (the numbers to the left of the |) as a two-dimensional array and the numbers to the right as a one-dimensional array.
We recognize this two dimensional array as a matrix:
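$A \in \mathbb{R}^{m \times n}$ is the two-dimensional array of scalars
\[
A = \begin{pmatrix}
\alpha_{0,0} & \alpha_{0,1} & \cdots & \alpha_{0,n-1} \\
\alpha_{1,0} & \alpha_{1,1} & \cdots & \alpha_{1,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha_{m-1,0} & \alpha_{m-1,1} & \cdots & \alpha_{m-1,n-1}
\end{pmatrix},
\]
where $\alpha_{i,j} \in \mathbb{R}$ for $0 \le i < m$ and $0 \le j < n$.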
Observations (continued)
We similarly recognize that the one-dimensional array is a (column) vector $x \in \mathbb{R}^n$ where
\[
x = \begin{pmatrix} \chi_0 \\ \chi_1 \\ \vdots \\ \chi_{n-1} \end{pmatrix}.
\]
The length of the vector is n.
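Now, given $A \in \mathbb{R}^{m \times n}$ and vector $x \in \mathbb{R}^n$, the notation $Ax$ stands for
\[
\begin{pmatrix}
\alpha_{0,0}\chi_0 + \alpha_{0,1}\chi_1 + \cdots + \alpha_{0,n-1}\chi_{n-1} \\
\alpha_{1,0}\chi_0 + \alpha_{1,1}\chi_1 + \cdots + \alpha_{1,n-1}\chi_{n-1} \\
\vdots \\
\alpha_{m-1,0}\chi_0 + \alpha_{m-1,1}\chi_1 + \cdots + \alpha_{m-1,n-1}\chi_{n-1}
\end{pmatrix}
\]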
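The definition of $Ax$ translates directly into code. The sketch below is only illustrative: the name `matvec` and the list-of-rows representation of $A$ are assumptions made here, not part of the notes.

```python
def matvec(A, x):
    """Return A x, where A is a list of m rows of length n and x has length n."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("each row of A must have as many entries as x")
    # Component i of the result is alpha_{i,0} chi_0 + ... + alpha_{i,n-1} chi_{n-1}.
    return [sum(alpha * chi for alpha, chi in zip(row, x)) for row in A]

A = [[2, 4, -2],
     [4, -2, 6],
     [6, -4, 2]]
y = matvec(A, [1, -2, 2])
```

Applied to the coefficient matrix of the running example and the solution vector found earlier, this reproduces the right-hand side $(-10, 20, 18)$.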
Gaussian Elimination - Take 3
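Example
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 6 & -4 & 2 \end{pmatrix}.
\]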
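One way to read the example is that multiplying by this matrix replaces the second row by $-2$ times the first row plus the second row. The sketch below compares the two computations; the helper names `matmul` and `axpy` are chosen here for illustration and the matrices are stored as lists of rows.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def axpy(alpha, x, y):
    """Return alpha * x + y for vectors stored as lists."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

A = [[2, 4, -2],
     [4, -2, 6],
     [6, -4, 2]]
L0_hat = [[1, 0, 0],
          [-2, 1, 0],
          [0, 0, 1]]

product = matmul(L0_hat, A)
# Same result, computed by updating only the second row.
row_update = [A[0], axpy(-2, A[0], A[1]), A[2]]
```

Both `product` and `row_update` equal the matrix on the right-hand side of the example.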
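Exercise Compute
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 6 & -4 & 2 \end{pmatrix}.
\]
How can this be described as an axpy operation?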
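\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
4 & -2 & 6 & 20 \\
6 & -4 & 2 & 18
\end{array}\right)
=
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
6 & -4 & 2 & 18
\end{array}\right)
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
6 & -4 & 2 & 18
\end{array}\right)
=
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
0 & -16 & 8 & 48
\end{array}\right)
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
0 & -16 & 8 & 48
\end{array}\right)
=
\left(\begin{array}{rrr|r}
2 & 4 & -2 & -10 \\
0 & -10 & 10 & 40 \\
0 & 0 & -8 & -16
\end{array}\right)
\]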
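\[
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
\]
In the results below, each entry that has been zeroed is overwritten with the corresponding multiplier (shown in bold):
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ 6 & -4 & 2 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ \mathbf{3} & -16 & 8 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ \mathbf{3} & -16 & 8 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ \mathbf{3} & \mathbf{1.6} & -8 \end{pmatrix}
\]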
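\[
\begin{pmatrix} -10 \\ 20 \\ 18 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -10 \\ 20 \\ 18 \end{pmatrix}
=
\begin{pmatrix} -10 \\ 40 \\ 18 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -10 \\ 40 \\ 18 \end{pmatrix}
=
\begin{pmatrix} -10 \\ 40 \\ 48 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} -10 \\ 40 \\ 48 \end{pmatrix}
=
\begin{pmatrix} -10 \\ 40 \\ -16 \end{pmatrix}
\]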
Back substitution as before
Gaussian Elimination - Take 4
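Example
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & -16 & 8 \end{pmatrix}
\]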
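\[
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
\]
In the results below, each entry that has been zeroed is overwritten with the corresponding multiplier (shown in bold):
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ \mathbf{3} & -16 & 8 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ \mathbf{3} & -16 & 8 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ \mathbf{2} & -10 & 10 \\ \mathbf{3} & \mathbf{1.6} & -8 \end{pmatrix}
\]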
Forward substitution
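\[
\begin{pmatrix} -10 \\ 20 \\ 18 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -10 \\ 20 \\ 18 \end{pmatrix}
=
\begin{pmatrix} -10 \\ 40 \\ 48 \end{pmatrix}
\]
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} -10 \\ 40 \\ 48 \end{pmatrix}
=
\begin{pmatrix} -10 \\ 40 \\ -16 \end{pmatrix}
\]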
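Forward substitution can be sketched as applying the stored multipliers to the right-hand side in the same order in which they were used during elimination. The function name `forward_substitute` and the dictionary of multipliers keyed by (row, column) are assumptions of this illustration; exact fractions stand in for the decimal 1.6.

```python
from fractions import Fraction

def forward_substitute(multipliers, b):
    """Apply the multipliers to b, column by column, as in elimination."""
    b_hat = [Fraction(v) for v in b]
    n = len(b_hat)
    for j in range(n - 1):
        for i in range(j + 1, n):
            # Subtract lambda_{i,j} times component j from component i.
            b_hat[i] -= multipliers[(i, j)] * b_hat[j]
    return b_hat

lam = {(1, 0): 2, (2, 0): 3, (2, 1): Fraction(8, 5)}
b_hat = forward_substitute(lam, [-10, 20, 18])
```

For the example this yields $(-10, 40, -16)$, the right-hand side of the upper triangular system.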
Back substitution as before
Theorem
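Let $\hat L_j$ be a matrix that equals the identity, except that for $i > j$ the $(i, j)$ elements (the ones below the diagonal in the $j$th column) have been replaced with $-\lambda_{i,j}$:
\[
\hat L_j = \begin{pmatrix}
I_j & 0 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & 0 & \cdots & 0 \\
0 & -\lambda_{j+1,j} & 1 & 0 & \cdots & 0 \\
0 & -\lambda_{j+2,j} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & -\lambda_{m-1,j} & 0 & 0 & \cdots & 1
\end{pmatrix}.
\]
Then $\hat L_j A$ equals the matrix $A$ except that for $i > j$ the $i$th row is modified by subtracting $\lambda_{i,j}$ times the $j$th row from it.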
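Exercise Verify that
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & 0 & -8 \end{pmatrix}
\]
and
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -10 \\ 20 \\ 18 \end{pmatrix}
=
\begin{pmatrix} -10 \\ 40 \\ -16 \end{pmatrix}.
\]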
Gaussian Elimination - Take 4
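Example Consider
\[
\begin{pmatrix} 1 & 0 & 0 \\ -\lambda_{10} & 1 & 0 \\ -\lambda_{20} & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix}
2 & 4 & -2 \\
4 - \lambda_{10}(2) & -2 - \lambda_{10}(4) & 6 - \lambda_{10}(-2) \\
6 - \lambda_{20}(2) & -4 - \lambda_{20}(4) & 2 - \lambda_{20}(-2)
\end{pmatrix}
\]
How should $\lambda_{10}$ and $\lambda_{20}$ be chosen so that zeroes are introduced below the diagonal in the first column?
Examine $4 - \lambda_{10}(2)$ and $6 - \lambda_{20}(2)$.
$\lambda_{10} = 4/2 = 2$ and $\lambda_{20} = 6/2 = 3$ have the desired property.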
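Example
Alternatively, we can write this as
\[
\begin{pmatrix}
1 & \begin{pmatrix} 0 & 0 \end{pmatrix} \\
-\begin{pmatrix} \lambda_{10} \\ \lambda_{20} \end{pmatrix} & \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\end{pmatrix}
\begin{pmatrix}
2 & \begin{pmatrix} 4 & -2 \end{pmatrix} \\
\begin{pmatrix} 4 \\ 6 \end{pmatrix} & \begin{pmatrix} -2 & 6 \\ -4 & 2 \end{pmatrix}
\end{pmatrix}
=
\begin{pmatrix}
2 & \begin{pmatrix} 4 & -2 \end{pmatrix} \\
-\begin{pmatrix} \lambda_{10} \\ \lambda_{20} \end{pmatrix} 2 + \begin{pmatrix} 4 \\ 6 \end{pmatrix} &
-\begin{pmatrix} \lambda_{10} \\ \lambda_{20} \end{pmatrix} \begin{pmatrix} 4 & -2 \end{pmatrix} + \begin{pmatrix} -2 & 6 \\ -4 & 2 \end{pmatrix}
\end{pmatrix}
\]
To zero the elements below the diagonal in the first column:
\[
-\begin{pmatrix} \lambda_{10} \\ \lambda_{20} \end{pmatrix} 2 + \begin{pmatrix} 4 \\ 6 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
or, equivalently,
\[
\begin{pmatrix} \lambda_{10} \\ \lambda_{20} \end{pmatrix} = \begin{pmatrix} 4 \\ 6 \end{pmatrix} / 2 = \begin{pmatrix} 2 \\ 3 \end{pmatrix}.
\]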
Generalizing this insight
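Let $A^{(0)} \in \mathbb{R}^{n \times n}$ and let $\hat L^{(0)}$ be a Gauss transform. Partition
\[
A^{(0)} \to \begin{pmatrix} \alpha_{11}^{(0)} & a_{12}^{(0)T} \\ a_{21}^{(0)} & A_{22}^{(0)} \end{pmatrix},
\quad
\hat L^{(0)} \to \begin{pmatrix} 1 & 0 \\ -l_{21}^{(0)} & I \end{pmatrix}.
\]
Then
\[
\hat L^{(0)} A^{(0)} =
\begin{pmatrix} 1 & 0 \\ -l_{21}^{(0)} & I \end{pmatrix}
\begin{pmatrix} \alpha_{11}^{(0)} & a_{12}^{(0)T} \\ a_{21}^{(0)} & A_{22}^{(0)} \end{pmatrix}
=
\begin{pmatrix} \alpha_{11}^{(0)} & a_{12}^{(0)T} \\ a_{21}^{(0)} - l_{21}^{(0)} \alpha_{11}^{(0)} & A_{22}^{(0)} - l_{21}^{(0)} a_{12}^{(0)T} \end{pmatrix}.
\]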
Generalizing this insight (continued)
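\[
\hat L^{(0)} A^{(0)} =
\begin{pmatrix} \alpha_{11}^{(0)} & a_{12}^{(0)T} \\ a_{21}^{(0)} - l_{21}^{(0)} \alpha_{11}^{(0)} & A_{22}^{(0)} - l_{21}^{(0)} a_{12}^{(0)T} \end{pmatrix}.
\]
Choose $l_{21}^{(0)}$ so that $a_{21}^{(0)} - l_{21}^{(0)} \alpha_{11}^{(0)} = 0$: $l_{21}^{(0)} = a_{21}^{(0)} / \alpha_{11}^{(0)}$.
$A_{22}^{(0)} \to A_{22}^{(0)} - l_{21}^{(0)} a_{12}^{(0)T}$: this is a rank-1 update (ger).
Update
\[
A^{(1)} := \begin{pmatrix} 1 & 0 \\ -l_{21}^{(0)} & I \end{pmatrix}
\begin{pmatrix} \alpha_{11}^{(0)} & a_{12}^{(0)T} \\ a_{21}^{(0)} & A_{22}^{(0)} \end{pmatrix}
\]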
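A single step of this process, computing $l_{21}$ by scaling $a_{21}$ and then applying the rank-1 update to $A_{22}$, might look as follows. The function name `eliminate_first_column` and the list-of-rows storage are choices made for this sketch, which assumes $\alpha_{11} \neq 0$ and uses exact fractions.

```python
from fractions import Fraction

def eliminate_first_column(A):
    """One elimination step on a square matrix stored as a list of rows.

    Returns l21 = a21 / alpha11 and the updated block A22 - l21 a12^T.
    Assumes alpha11 is nonzero.
    """
    alpha11 = Fraction(A[0][0])
    a12t = A[0][1:]                       # rest of the first row
    a21 = [row[0] for row in A[1:]]       # rest of the first column
    A22 = [row[1:] for row in A[1:]]      # trailing submatrix
    l21 = [Fraction(v) / alpha11 for v in a21]
    # Rank-1 update: entry (i, j) of A22 loses l21[i] * a12t[j].
    A22_new = [[A22[i][j] - l21[i] * a12t[j] for j in range(len(a12t))]
               for i in range(len(l21))]
    return l21, A22_new

l21, A22_new = eliminate_first_column([[2, 4, -2],
                                       [4, -2, 6],
                                       [6, -4, 2]])
```

For the running example, `l21` holds the multipliers 2 and 3, and `A22_new` is the $2 \times 2$ block with rows $(-10, 10)$ and $(-16, 8)$.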
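Example Consider
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\lambda_{21} & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & -16 & 8 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & -16 - \lambda_{21}(-10) & 8 - \lambda_{21}(10) \end{pmatrix}
\]
How should $\lambda_{21}$ be chosen?
$-16 - \lambda_{21}(-10) = 0$, so that $\lambda_{21} = -16/(-10) = 1.6$ has the desired property.
Alternatively, we notice that, viewed as a vector,
\[
\begin{pmatrix} \lambda_{21} \end{pmatrix} = \begin{pmatrix} -16 \end{pmatrix} / (-10).
\]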
Moving on
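\[
A^{(1)} \to \begin{pmatrix}
A_{00}^{(1)} & a_{01}^{(1)} & A_{02}^{(1)} \\
0 & \alpha_{11}^{(1)} & a_{12}^{(1)T} \\
0 & a_{21}^{(1)} & A_{22}^{(1)}
\end{pmatrix},
\quad
\hat L^{(1)} \to \begin{pmatrix}
I_1 & 0 & 0 \\
0 & 1 & 0 \\
0 & -l_{21}^{(1)} & I
\end{pmatrix}.
\]
Then
\[
\begin{pmatrix}
I_1 & 0 & 0 \\
0 & 1 & 0 \\
0 & -l_{21}^{(1)} & I
\end{pmatrix}
\begin{pmatrix}
A_{00}^{(1)} & a_{01}^{(1)} & A_{02}^{(1)} \\
0 & \alpha_{11}^{(1)} & a_{12}^{(1)T} \\
0 & a_{21}^{(1)} & A_{22}^{(1)}
\end{pmatrix}
=
\begin{pmatrix}
A_{00}^{(1)} & a_{01}^{(1)} & A_{02}^{(1)} \\
0 & \alpha_{11}^{(1)} & a_{12}^{(1)T} \\
0 & a_{21}^{(1)} - l_{21}^{(1)} \alpha_{11}^{(1)} & A_{22}^{(1)} - l_{21}^{(1)} a_{12}^{(1)T}
\end{pmatrix}.
\]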
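Now,
Choose $l_{21}^{(1)}$ so that $a_{21}^{(1)} - l_{21}^{(1)} \alpha_{11}^{(1)} = 0$: $l_{21}^{(1)} = a_{21}^{(1)} / \alpha_{11}^{(1)}$.
$A_{22}^{(1)} \to A_{22}^{(1)} - l_{21}^{(1)} a_{12}^{(1)T}$: this is a rank-1 update (ger).
\[
A^{(2)} =
\begin{pmatrix}
I_1 & 0 & 0 \\
0 & 1 & 0 \\
0 & -l_{21}^{(1)} & I
\end{pmatrix}
\begin{pmatrix}
A_{00}^{(1)} & a_{01}^{(1)} & A_{02}^{(1)} \\
0 & \alpha_{11}^{(1)} & a_{12}^{(1)T} \\
0 & a_{21}^{(1)} & A_{22}^{(1)}
\end{pmatrix}
=
\begin{pmatrix}
A_{00}^{(1)} & a_{01}^{(1)} & A_{02}^{(1)} \\
0 & \alpha_{11}^{(1)} & a_{12}^{(1)T} \\
0 & 0 & A_{22}^{(2)}
\end{pmatrix}
\]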
More general yet
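\[
A^{(k)} \to \begin{pmatrix}
A_{00}^{(k)} & a_{01}^{(k)} & A_{02}^{(k)} \\
0 & \alpha_{11}^{(k)} & a_{12}^{(k)T} \\
0 & a_{21}^{(k)} & A_{22}^{(k)}
\end{pmatrix},
\quad
\hat L^{(k)} \to \begin{pmatrix}
I_k & 0 & 0 \\
0 & 1 & 0 \\
0 & -l_{21}^{(k)} & I
\end{pmatrix},
\]
where $A_{00}^{(k)}$ and $I_k$ are $k \times k$ matrices. Then
\[
\begin{pmatrix}
I_k & 0 & 0 \\
0 & 1 & 0 \\
0 & -l_{21}^{(k)} & I
\end{pmatrix}
\begin{pmatrix}
A_{00}^{(k)} & a_{01}^{(k)} & A_{02}^{(k)} \\
0 & \alpha_{11}^{(k)} & a_{12}^{(k)T} \\
0 & a_{21}^{(k)} & A_{22}^{(k)}
\end{pmatrix}
=
\begin{pmatrix}
A_{00}^{(k)} & a_{01}^{(k)} & A_{02}^{(k)} \\
0 & \alpha_{11}^{(k)} & a_{12}^{(k)T} \\
0 & a_{21}^{(k)} - l_{21}^{(k)} \alpha_{11}^{(k)} & A_{22}^{(k)} - l_{21}^{(k)} a_{12}^{(k)T}
\end{pmatrix}.
\]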
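Choose $l_{21}^{(k)}$ so that $a_{21}^{(k)} - l_{21}^{(k)} \alpha_{11}^{(k)} = 0$: $l_{21}^{(k)} = a_{21}^{(k)} / \alpha_{11}^{(k)}$.
$A_{22}^{(k)} \to A_{22}^{(k)} - l_{21}^{(k)} a_{12}^{(k)T}$:
\[
A^{(k+1)} =
\begin{pmatrix}
I_k & 0 & 0 \\
0 & 1 & 0 \\
0 & -l_{21}^{(k)} & I
\end{pmatrix}
\begin{pmatrix}
A_{00}^{(k)} & a_{01}^{(k)} & A_{02}^{(k)} \\
0 & \alpha_{11}^{(k)} & a_{12}^{(k)T} \\
0 & a_{21}^{(k)} & A_{22}^{(k)}
\end{pmatrix}
=
\begin{pmatrix}
A_{00}^{(k)} & a_{01}^{(k)} & A_{02}^{(k)} \\
0 & \alpha_{11}^{(k)} & a_{12}^{(k)T} \\
0 & 0 & A_{22}^{(k+1)}
\end{pmatrix}
\]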
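Algorithm: $A := \text{GE\_Take5}(A)$
Partition $A \to \begin{pmatrix} A_{TL} & A_{TR} \\ A_{BL} & A_{BR} \end{pmatrix}$ where $A_{TL}$ is $0 \times 0$
while $m(A_{TL}) < m(A)$ do
Repartition
\[
\begin{pmatrix} A_{TL} & A_{TR} \\ A_{BL} & A_{BR} \end{pmatrix}
\to
\begin{pmatrix}
A_{00} & a_{01} & A_{02} \\
a_{10}^T & \alpha_{11} & a_{12}^T \\
A_{20} & a_{21} & A_{22}
\end{pmatrix}
\quad \text{where } \alpha_{11} \text{ is } 1 \times 1
\]
$a_{21} := a_{21} / \alpha_{11}$ $(= l_{21})$
$A_{22} := A_{22} - a_{21} a_{12}^T$ $(= A_{22} - l_{21} a_{12}^T)$
Continue with
\[
\begin{pmatrix} A_{TL} & A_{TR} \\ A_{BL} & A_{BR} \end{pmatrix}
\leftarrow
\begin{pmatrix}
A_{00} & a_{01} & A_{02} \\
a_{10}^T & \alpha_{11} & a_{12}^T \\
A_{20} & a_{21} & A_{22}
\end{pmatrix}
\]
endwhile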
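The algorithm can be sketched with explicit indices instead of partitioned blocks. This is one possible rendering, not the notes' own code: the name `ge_take5` mirrors the algorithm's label, the matrix is a list of rows that is overwritten in place, and the sketch stops with an error when a zero pivot is met because it performs no row exchanges.

```python
from fractions import Fraction

def ge_take5(A):
    """Overwrite square A: U on and above the diagonal, multipliers below it."""
    n = len(A)
    for k in range(n):
        alpha11 = A[k][k]
        if alpha11 == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no row exchanges")
        for i in range(k + 1, n):
            A[i][k] = A[i][k] / alpha11          # a21 := a21 / alpha11 (= l21)
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]     # A22 := A22 - a21 a12^T
    return A

A = [[Fraction(v) for v in row]
     for row in [[2, 4, -2], [4, -2, 6], [6, -4, 2]]]
ge_take5(A)
```

After the call, `A` holds the upper triangular matrix in its upper part and the multipliers 2, 3, and 8/5 (that is, 1.6) below the diagonal.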
Insights
Now, if $A \in \mathbb{R}^{n \times n}$, then
\[
A^{(n)} = \hat L^{(n-1)} \cdots \hat L^{(1)} \hat L^{(0)} A = U,
\]
an upper triangular matrix. Also, to solve $Ax = b$, we note that
\[
Ux = (\hat L^{(n-1)} \cdots \hat L^{(1)} \hat L^{(0)} A) x = \underbrace{\hat L^{(n-1)} \cdots \hat L^{(1)} \hat L^{(0)} b}_{\hat b}.
\]
The right-hand side of this we recognize as forward substitution applied to vector $b$. We will later see that solving $Ux = \hat b$, where $U$ is upper triangular, is equivalent to back substitution.
We arrived at this point, "GE Take 5", step by step so that the reader, hopefully, now recognizes this as just Gaussian elimination.
The insights in this section are summarized in the algorithm, in which the original matrix $A$ is overwritten with the upper triangular matrix that results from Gaussian elimination and the strictly lower triangular elements are overwritten by the multipliers.
Gaussian Elimination - Take 6
Inverse of a Matrix
Let $A \in \mathbb{R}^{n \times n}$ and $B \in \mathbb{R}^{n \times n}$ have the property that $AB = BA = I$.
Then $B$ is said to be the inverse of matrix $A$ and is denoted by $A^{-1}$.
Later we will see that for square A and B it is always the case that
if AB = I then BA = I and that the inverse of a matrix is unique.
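Example Let
\[
\hat L = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad \text{and} \quad
L = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Then
\[
L \hat L =
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
This should be intuitively true:
$\hat L A$ subtracts two times the first row from the second row.
$LA$ adds two times the first row to the second row.
$L \hat L A = L(\hat L A) = A$. Why?
Two transformations that always undo each other are inverses of each other.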
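Exercise Compute
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix}
\]
and reason why this should be intuitively true.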
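Theorem If
\[
\hat L = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix}
\quad \text{then} \quad
L = \begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix}
\]
is its inverse: $L \hat L = \hat L L = I$.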
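Proof
\[
\hat L L =
\begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} & I \end{pmatrix}
\begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & l_{21} & I \end{pmatrix}
=
\begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -l_{21} + I l_{21} & I \end{pmatrix}
=
\begin{pmatrix} I_k & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & I \end{pmatrix}
= I.
\]
Similarly $L \hat L = I$. (Notice that when we use $I$ without indicating its dimensions, it has the dimensions that are required to fit the situation.)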
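The theorem can be checked numerically on a small instance. In the sketch below, the helper `gauss_transform` and the particular size, column index, and vector $l_{21}$ are made-up illustration values, not taken from the notes.

```python
def gauss_transform(n, k, l21, sign):
    """n x n identity with sign * l21 placed below the diagonal in column k."""
    T = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for offset, value in enumerate(l21):
        T[k + 1 + offset][k] = sign * value
    return T

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Hypothetical instance: n = 4, k = 1, l21 = (5, -7).
L_hat = gauss_transform(4, 1, [5, -7], -1)
L = gauss_transform(4, 1, [5, -7], +1)
identity = [[int(i == j) for j in range(4)] for i in range(4)]

left = matmul(L_hat, L)
right = matmul(L, L_hat)
```

Both `left` and `right` come out as the $4 \times 4$ identity, in line with the theorem.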
http://z.cs.utexas.edu/wiki/pla.wiki/
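Exercise Recall that
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1.6 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & 0 & -8 \end{pmatrix}.
\]
Show that
\[
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1.6 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & 0 & -8 \end{pmatrix}.
\]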
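Exercise Show that
\[
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1.6 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 1.6 & 1 \end{pmatrix}
\]
so that
\[
\begin{pmatrix} 2 & 4 & -2 \\ 4 & -2 & 6 \\ 6 & -4 & 2 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 3 & 1.6 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 4 & -2 \\ 0 & -10 & 10 \\ 0 & 0 & -8 \end{pmatrix}.
\]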
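A numerical check of both identities is sketched below. It does not replace the reasoning the exercise asks for. The helper `matmul` is written here for illustration, and 1.6 is represented exactly as the fraction 8/5.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

L0 = [[1, 0, 0], [2, 1, 0], [3, 0, 1]]
L1 = [[1, 0, 0], [0, 1, 0], [0, Fraction(8, 5), 1]]
U = [[2, 4, -2], [0, -10, 10], [0, 0, -8]]

L = matmul(L0, L1)     # multipliers end up in place below the diagonal
A = matmul(L, U)       # reproduces the original coefficient matrix
```

Here `L` has the multipliers 2, 3, and 8/5 below its unit diagonal, and `A` equals the coefficient matrix of the running example.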
Theorem
Let $\hat L^{(0)}, \ldots, \hat L^{(n-1)}$ be the sequence of Gauss transforms that transform an $n \times n$ matrix $A$ to an upper triangular matrix:
\[
\hat L^{(n-1)} \cdots \hat L^{(0)} A = U.
\]
Then
\[
A = L^{(0)} \cdots L^{(n-2)} L^{(n-1)} U,
\]
where $L^{(j)} = (\hat L^{(j)})^{-1}$, the inverse of $\hat L^{(j)}$.
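Proof If
\[
\hat L^{(n-1)} \hat L^{(n-2)} \cdots \hat L^{(0)} A = U,
\]
then
\[
\begin{aligned}
A &= \underbrace{L^{(0)} \cdots \underbrace{L^{(n-2)} \underbrace{L^{(n-1)} \hat L^{(n-1)}}_{I} \hat L^{(n-2)}}_{I} \cdots \hat L^{(0)}}_{I} A \\
&= L^{(0)} \cdots L^{(n-2)} L^{(n-1)} \underbrace{\hat L^{(n-1)} \hat L^{(n-2)} \cdots \hat L^{(0)} A}_{U} \\
&= L^{(0)} \cdots L^{(n-2)} L^{(n-1)} U.
\end{aligned}
\]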
Lemma
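Let $\hat L^{(0)}, \ldots, \hat L^{(n-1)}$ be the sequence of Gauss transforms that transforms a matrix $A$ into an upper triangular matrix $U$:
\[
\hat L^{(n-1)} \cdots \hat L^{(0)} A = U,
\]
and let $L^{(j)} = (\hat L^{(j)})^{-1}$. Then $\tilde L^{(k)} = L^{(0)} \cdots L^{(k-1)} L^{(k)}$ has the structure
\[
\tilde L^{(k)} = \begin{pmatrix} \tilde L_{TL}^{(k)} & 0 \\ \tilde L_{BL}^{(k)} & I \end{pmatrix},
\]
where $\tilde L_{TL}^{(k)}$ is a $(k+1) \times (k+1)$ unit lower triangular matrix.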
Proof
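Proof by induction on $k$.
Base case: $k = 0$.
\[
\tilde L^{(0)} = L^{(0)} = \begin{pmatrix} 1 & 0 \\ l_{21}^{(0)} & I \end{pmatrix}
\]
meets the desired criteria since $1$ is a trivial unit lower triangular matrix.
Inductive step: Assume $\tilde L^{(k)}$ meets the indicated criteria. We will show that then $\tilde L^{(k+1)}$ does too. Let
\[
\tilde L^{(k)} =
\begin{pmatrix} \tilde L_{TL}^{(k)} & 0 \\ \tilde L_{BL}^{(k)} & I \end{pmatrix}
=
\begin{pmatrix}
\tilde L_{00}^{(k)} & 0 & 0 \\
\tilde l_{10}^{(k)T} & 1 & 0 \\
\tilde L_{20}^{(k)} & 0 & I
\end{pmatrix}
\]
where $\tilde L_{TL}^{(k)}$ (and hence $\tilde L_{00}^{(k)}$) are unit lower triangular matrices of dimension $(k+1) \times (k+1)$. Then
\[
\begin{aligned}
\tilde L^{(k+1)} = \tilde L^{(k)} L^{(k+1)}
&=
\begin{pmatrix}
\tilde L_{00}^{(k)} & 0 & 0 \\
\tilde l_{10}^{(k)T} & 1 & 0 \\
\tilde L_{20}^{(k)} & 0 & I
\end{pmatrix}
\begin{pmatrix}
I_{k+1} & 0 & 0 \\
0 & 1 & 0 \\
0 & l_{21}^{(k+1)} & I
\end{pmatrix} \\
&=
\begin{pmatrix}
\tilde L_{00}^{(k)} & 0 & 0 \\
\tilde l_{10}^{(k)T} & 1 & 0 \\
\tilde L_{20}^{(k)} & l_{21}^{(k+1)} & I
\end{pmatrix}
=
\begin{pmatrix}
\begin{pmatrix} \tilde L_{00}^{(k)} & 0 \\ \tilde l_{10}^{(k)T} & 1 \end{pmatrix} & 0 \\
\begin{pmatrix} \tilde L_{20}^{(k)} & l_{21}^{(k+1)} \end{pmatrix} & I
\end{pmatrix} \\
&=
\begin{pmatrix} \tilde L_{TL}^{(k+1)} & 0 \\ \tilde L_{BL}^{(k+1)} & I \end{pmatrix},
\end{aligned}
\]
which has the desired structure.
By the Principle of Mathematical Induction the result holds for $\tilde L^{(j)}$, $0 \le j \le n - 1$.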
Corollary
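Under the conditions of the Lemma, $L = \tilde L^{(n-1)}$ is a unit lower triangular matrix the strictly lower triangular part of which is the sum of all the strictly lower triangular parts of $L^{(0)}, \ldots, L^{(n-1)}$:
\[
L = \tilde L^{(n-1)} =
\begin{pmatrix}
1 & 0 \\
l_{21}^{(0)} & \begin{pmatrix}
1 & 0 \\
l_{21}^{(1)} & \begin{pmatrix}
\ddots & \\
 & \begin{pmatrix} 1 & 0 \\ l_{21}^{(n-2)} & 1 \end{pmatrix}
\end{pmatrix}
\end{pmatrix}
\end{pmatrix}.
\]
(Note that $l_{21}^{(n-1)}$ is a vector of length zero, so that the last step of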
(n−1)