Linear Algebra
x2 = [0 1 0], x3 = [0 0 1] are guaranteed to be linearly independent. Let us find the vector space generated by this basis set. Consider an arbitrary vector x = [a b c]. This vector can be expressed as a linear combination of the basis set as x = ax1 + bx2 + cx3. Therefore, this basis set generates the vector space of all possible vectors with three real-valued elements.
If we think of a vector with three real-valued elements as corresponding to a point in three-dimensional space, where the elements of the vector are its Cartesian coordinates, the basis vectors generate all possible points in three-dimensional space. It is easy to see that the basis vectors correspond to the three coordinate axes. It should now be clear why we call the generated vectors a space and why the cardinality of the basis set is the dimensionality of this space.
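As a minimal sketch of this construction, the following Python fragment builds an arbitrary vector [a b c] as a linear combination of the basis set. It assumes that the first basis vector is x1 = [1 0 0], as implied by x = ax1 + bx2 + cx3; the helper name linear_combination and the sample values of a, b, and c are illustrative and do not come from the text.

```python
# Basis set from the text; x1 = [1 0 0] is assumed from x = a*x1 + b*x2 + c*x3.
x1 = [1, 0, 0]
x2 = [0, 1, 0]
x3 = [0, 0, 1]


def linear_combination(coeffs, vectors):
    """Return sum_i coeffs[i] * vectors[i], computed element by element."""
    dim = len(vectors[0])
    return [sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(dim)]


# An arbitrary vector [a b c]; these particular values are illustrative.
a, b, c = 2.5, -1.0, 4.0
x = linear_combination([a, b, c], [x1, x2, x3])
print(x)  # [2.5, -1.0, 4.0]
```

Because the weights a, b, and c reappear unchanged as the elements of x, any vector with three real-valued elements can be generated this way.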
3.4 Using Matrix Algebra to Solve Linear Equations
We now turn our attention to an important application of matrix algebra, which is to solve sets of linear equations.
3.4.1 Representation
Systems of linear equations are conveniently represented by matrices. Consider the set of linear equations:
$$
\begin{aligned}
3x + 2y + z &= 5 \\
-8x + y + 4z &= -2 \\
9x + 0.5y + 4z &= 0.9
\end{aligned}
$$
We can represent this set of equations by the matrix
$$
\begin{bmatrix}
3 & 2 & 1 & 5 \\
-8 & 1 & 4 & -2 \\
9 & 0.5 & 4 & 0.9
\end{bmatrix}
$$
where the position of a number in the matrix implicitly identifies it as either a coefficient of a variable or a value on the right-hand side. This representation can be used for any set of linear equations. If the rightmost column is 0, the system is said to be homogeneous. The submatrix corresponding to the left-hand side of the linear equations is called the coefficient matrix.
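One possible way to hold this representation in a program is sketched below, assuming that the matrix is stored as a list of rows. The names augmented, coefficients, rhs, and is_homogeneous are illustrative choices, not notation used in the text.

```python
# Each row holds the coefficients of x, y, z followed by the right-hand side.
augmented = [
    [3.0, 2.0, 1.0, 5.0],    # 3x + 2y + z = 5
    [-8.0, 1.0, 4.0, -2.0],  # -8x + y + 4z = -2
    [9.0, 0.5, 4.0, 0.9],    # 9x + 0.5y + 4z = 0.9
]

# The coefficient matrix is everything except the rightmost column.
coefficients = [row[:-1] for row in augmented]
rhs = [row[-1] for row in augmented]


def is_homogeneous(matrix):
    """A system is homogeneous when its rightmost column is all zeros."""
    return all(row[-1] == 0 for row in matrix)


print(coefficients)
print(rhs)
print(is_homogeneous(augmented))  # False
```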
3.4.2 Elementary Row Operations and Gaussian Elimination
Given a set of equations, certain simple operations allow us to generate new equations. For example, multiplying the left- and right-hand sides of any equation by a scalar generates a new equation. Moreover, we can add or subtract the left- and right-hand sides of any pair of equations to also generate new equations.
In our preceding example, the first two equations are 3x + 2y + z = 5 and –8x + y + 4z = –2. We can multiply the first equation by 3 to get the new equation 9x + 6y + 3z = 15.
We can also add the two equations to get a new equation (3 – 8)x + (2 + 1)y + (1 + 4)z = (5 – 2), which gives us the equation –5x + 3y + 5z = 3.
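These two operations can be sketched in a few lines of Python, assuming that each equation is stored as a list holding the coefficients of x, y, and z followed by the right-hand side. The helper names scale and add are illustrative.

```python
def scale(eq, k):
    """Multiply both sides of an equation by the scalar k."""
    return [k * v for v in eq]


def add(eq1, eq2):
    """Add the left- and right-hand sides of two equations."""
    return [u + v for u, v in zip(eq1, eq2)]


eq1 = [3, 2, 1, 5]    # 3x + 2y + z = 5
eq2 = [-8, 1, 4, -2]  # -8x + y + 4z = -2

tripled = scale(eq1, 3)
summed = add(eq1, eq2)
print(tripled)  # [9, 6, 3, 15]  i.e. 9x + 6y + 3z = 15
print(summed)   # [-5, 3, 5, 3]  i.e. -5x + 3y + 5z = 3
```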
We can also combine these operations. For example, we could multiply the second equation by 2 and subtract it from the first one like this:
$$
\begin{aligned}
(3 - (-16))x + (2 - 2)y + (1 - 8)z &= 5 - (-4) \\
19x - 7z &= 9
\end{aligned}
$$
In the resulting equation, the variable y has been eliminated (i.e., does not appear).
We can similarly multiply the third equation by 4 and subtract it from the first one to obtain another equation that also eliminates y. We now have two equations in two variables that we can trivially solve to obtain x and z. Putting their values back into any of the three equations allows us to find y.
This approach, in essence, is the well-known technique called Gaussian elimination. In this technique, we pick any one variable and use multiplications and additions on the set of equations to eliminate that variable from all but one equation. This transforms a system with n variables and m equations to a system with n – 1 variables and m – 1 equations. We can now recurse to obtain, in the end,¹ an equation with one variable, which solves the system for that variable. By substituting this value back into the reduced set of equations, we solve the system.
When using a matrix representation of the set of equations, the elementary operations of multiplying an equation by a scalar and of adding two equations correspond to two row operations. The first row operation multiplies all the elements of a row by a scalar, and the second row operation is the element-by-element addition of two rows. It is easy to see that these are exactly analogous to the operations in the previous paragraphs. The Gaussian technique uses these elementary row operations to manipulate the matrix representation of a set of linear equations so that one row looks like this: [0 0 ... 0 1 0 ... 0 a], allowing us to read off the value of that variable. We can use this to substitute for this variable in the other equations, so that we are left with a system of equations with one less unknown and, by recursion, to find the values of all the variables.
1. Assuming that the equations are self-consistent and have at least one solution. More on this later.
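The elimination of y described above can be traced with row operations on the matrix representation. The following is a sketch only: it uses Python's fractions module to keep the arithmetic exact, and the names scale, subtract, no_y_a, and no_y_b are illustrative rather than taken from the text.

```python
from fractions import Fraction as F


def scale(row, k):
    """Row operation 1: multiply every element of a row by a scalar."""
    return [k * v for v in row]


def subtract(r1, r2):
    """Row operation 2 (with a sign change): element-by-element r1 - r2."""
    return [u - v for u, v in zip(r1, r2)]


row1 = [F(3), F(2), F(1), F(5)]
row2 = [F(-8), F(1), F(4), F(-2)]
row3 = [F(9), F(1, 2), F(4), F(9, 10)]

# Eliminate y twice, as in the text.
no_y_a = subtract(row1, scale(row2, 2))  # 19x - 7z = 9
no_y_b = subtract(row1, scale(row3, 4))  # -33x - 15z = 7/5

# Two equations in x and z: eliminate z, then back-substitute.
factor = no_y_a[2] / no_y_b[2]
x_only = subtract(no_y_a, scale(no_y_b, factor))
x = x_only[3] / x_only[0]
z = (no_y_a[3] - no_y_a[0] * x) / no_y_a[2]
y = (row1[3] - row1[0] * x - row1[2] * z) / row1[1]

print(x, y, z)  # 313/1290 316/129 -809/1290
```

Exact fractions are used here only so that each intermediate row can be compared with the hand calculation; floating-point numbers would serve equally well for a numerical answer.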
EXAMPLE 3.8: GAUSSIAN ELIMINATION
Use row operations and Gaussian elimination to solve the system given by
$$
\begin{bmatrix}
3 & 2 & 1 & 5 \\
-8 & 1 & 4 & -2 \\
9 & 0.5 & 4 & 0.9
\end{bmatrix}
$$
Solution:
Subtract row 3 from row 2 to obtain
$$
\begin{bmatrix}
3 & 2 & 1 & 5 \\
-17 & 0.5 & 0 & -2.9 \\
9 & 0.5 & 4 & 0.9
\end{bmatrix}
$$
Then subtract 0.25 times row 3 from row 1 to obtain
$$
\begin{bmatrix}
0.75 & 1.875 & 0 & 4.775 \\
-17 & 0.5 & 0 & -2.9 \\
9 & 0.5 & 4 & 0.9
\end{bmatrix}
$$
Note that the first two rows represent a pair of equations in two unknowns. Multiply the second row by 1.875/0.5 = 3.75 and subtract from the first row to obtain
$$
\begin{bmatrix}
64.5 & 0 & 0 & 15.65 \\
-17 & 0.5 & 0 & -2.9 \\
9 & 0.5 & 4 & 0.9
\end{bmatrix}
$$
This allows us to read off x as 15.65/64.5 = 0.2426. Substituting this into row 2, we get –17x + 0.5y = –2.9, which we solve (carrying the unrounded value of x) to get y = 2.4496. Substituting these into the third row, we get 9 * 0.2426 + 0.5 * 2.4496 + 4z = 0.9, so that z = –0.6271. Checking against the first equation, 3 * 0.2426 + 2 * 2.4496 – 0.6271 = 4.9999, which is within rounding error of 5.
In practice, choosing which variable to eliminate first has important consequences. Choosing a variable unwisely may require us to maintain matrix elements to very high degrees of precision, which is costly. There is a considerable body of work on algorithms for carefully choosing the variables to eliminate, which are also called the pivots. Standard matrix packages, such as MATLAB, implement these algorithms.
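To illustrate how pivot selection can be automated, here is a minimal sketch of Gaussian elimination with partial pivoting, one common strategy in which the row with the largest-magnitude entry in the current column is chosen as the pivot row. It is not a description of what MATLAB or any other package does internally, and it eliminates the variables in a different order from the hand calculation above. The function name gaussian_solve and the tolerance 1e-12 are assumptions made for this sketch.

```python
def gaussian_solve(augmented):
    """Solve a square system given as an n x (n + 1) augmented matrix.

    Uses elimination with partial pivoting and raises ValueError when no
    usable pivot exists, that is, when there is no unique solution.
    """
    a = [list(map(float, row)) for row in augmented]  # work on a copy
    n = len(a)
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("no unique solution")
        a[col], a[pivot] = a[pivot], a[col]
        # Eliminate this variable from every row below the pivot row.
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    # Back-substitution, from the last variable to the first.
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(a[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (a[i][n] - known) / a[i][i]
    return solution


x, y, z = gaussian_solve([[3, 2, 1, 5], [-8, 1, 4, -2], [9, 0.5, 4, 0.9]])
print(round(x, 4), round(y, 4), round(z, 4))  # 0.2426 2.4496 -0.6271
```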
3.4.3 Rank
So far, we have assumed that a set of linear equations always has a consistent solution. This is not always the case. A set of equations has no solution or has an infinite number of solutions if it is either overdetermined or underdetermined, respectively. A system is overdetermined if the same variable assumes inconsistent values. For example, a trivial overdetermined system is the set of equations x = 1 and x = 2. Gaussian elimination will fail for such systems.
A system is underdetermined if it admits more than one answer. A trivial instance of an underdetermined system is the system of linear equations x + y = 1, because we can choose an infinite number of values of x and y that satisfy this equation. Gaussian elimination on such a system results in some set of variables expressed as linear combinations of the independent variables. Each assignment of values to the independent variables will result in finding a consistent solution to the system.
Given a system of m linear equations using n variables, the system is underdetermined if m < n. If m is at least as large as n, the system may or may not be underdetermined, depending on whether some equations are repeated. Specifically, we define an equation as being linearly dependent on a set of other equations if it can be expressed as a linear combination of the other equations: The vector corresponding to this equation is a linear combination of the vectors corresponding to the other equations. If one equation in a system of linear equations is linearly dependent on the others, we can reduce the equation to the equation 0 = 0 by a suitable combination of multiplications and additions. Thus, this equation does not give us any additional information and can be removed from the system without changing the solution.
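The reduction of a linearly dependent equation to 0 = 0 can be checked with a short sketch. The third equation below is constructed for illustration as twice the first equation of our earlier example plus the second; it does not appear in the text, and the helper name combine is likewise illustrative.

```python
def combine(rows, weights):
    """Return the linear combination sum_i weights[i] * rows[i]."""
    width = len(rows[0])
    return [sum(w * row[k] for w, row in zip(weights, rows)) for k in range(width)]


eq1 = [3, 2, 1, 5]    # 3x + 2y + z = 5
eq2 = [-8, 1, 4, -2]  # -8x + y + 4z = -2

# A hypothetical dependent equation: eq3 = 2 * eq1 + eq2.
eq3 = combine([eq1, eq2], [2, 1])
print(eq3)  # [-2, 5, 6, 8]  i.e. -2x + 5y + 6z = 8

# Undoing the same combination reduces eq3 to 0 = 0.
residual = combine([eq3, eq1, eq2], [1, -2, -1])
print(residual)  # [0, 0, 0, 0]
```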
If, of the m equations in a system, k can be expressed as a linear combination of the other m – k equations, we really have only m – k equations to work with. This value is called the rank of the system, denoted r. If r < n, the system is underdetermined. If r = n, there is only one solution to the system. If r > n, the system is overdetermined and therefore inconsistent. Note that the rank of a matrix is the same as the cardinality of the basis set of the corresponding set of row vectors.
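The rank can be computed by row reduction, counting the rows that do not reduce to all zeros. The sketch below assumes that the rank is taken over the rows of the full matrix representation, including the right-hand-side column, and it simply applies the three criteria as stated above; it does not separately test for inconsistent systems whose rank does not exceed n. The function names rank and classify are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Number of linearly independent rows, found by row reduction."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [u - factor * v for u, v in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def classify(rows):
    """Apply the criteria from the text; n is the number of variables."""
    n = len(rows[0]) - 1
    r = rank(rows)
    if r < n:
        return "underdetermined"
    if r == n:
        return "one solution"
    return "overdetermined"


example = [[3, 2, 1, 5], [-8, 1, 4, -2],
           [9, Fraction(1, 2), 4, Fraction(9, 10)]]
print(classify(example))           # one solution
print(classify([[1, 1, 1]]))       # x + y = 1: underdetermined
print(classify([[1, 1], [1, 2]]))  # x = 1 and x = 2: overdetermined
```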