1
Sets and Set Notation.
Definition 1(Naive Definition of a Set). Asetis any collection of objects, called theelementsof that set. We will most often name sets using capital letters, likeA,B,X,Y, etc., while the elements of a set will usually be given lower-case letters, likex,y,z,v, etc.
Two setsX andY are calledequalifX andY consist of exactly the same elements. In this case we writeX =Y.
Example 1(Examples of Sets). (1) LetX be the collection of allintegers greater than or equal to 5 and strictly less than 10. ThenX is a set, and we may write:
X ={5,6,7,8,9}
The above notation is an example of a set being described explicitly, i.e. just by listing out all of its elements. Theset brackets {· · ·} indicate that we are talking about a set and not a number, sequence, or other mathematical object.
(2) Let E be the set of all evennatural numbers. We may write:
E={0,2,4,6,8, ...}
This is an example of an explicity described set with infinitely many elements. The ellipsis (...) in the above notation is used somewhat informally, but in this case its meaning, that we should “continue counting forever,” is clear from the context.
(3) Let Y be the collection of all real numbers greater than or equal to 5 and strictly less than 10. Recalling notation from previous math courses, we may write:
Y = [5,10)
This is an example of usinginterval notationto describe a set. Note that the setY obviously consists of infinitely many elements, but that there is no obvious way to write down the elements ofY explicitly like we did the setE in Example (2). Even though [5,10) is a set, we don’t need to use the set brackets in this case, as interval notation has a well-established meaning which we have used in many other math courses.
(4) Now and for the remainder of the course, let the symbol ∅ denote the empty set, that is, the unique set which consists of no elements. Written explicitly,∅={ }.
(5) Now and for the remainder of the course, let the symbolNdenote the set of all natural numbers,
i.e. N={0,1,2,3, ...}.
(6) Now and for the remainder of the course, let the symbol Rdenote the set of all real numbers.
We may think ofRgeometrically as being the collection of all the points on the number line.
(7) Let R2 denote the set of allordered pairsof real numbers. That is, let R2 be the set which
consists of all pairs (x, y) wherexandy are both real numbers. We may think ofR2
geometri-cally as the set of all points on the Cartesian coordinate plane.
If (x, y) is an element ofR2, it will often be convenient for us to write the pair as thecolumn vector
x y
. For our purposes the two notations will be interchangeable. It is important to
note here that the order matters when we talk about pairs, so in general we have
x y
6=
y x
.
(8) Let R3 be the set of allordered triplesof real numbers, i.e.
R3 is the set of all triples (x, y, z)
such thatx,y, and z are all real numbers. R3 may be visualized geometrically as the set of all points in 3-dimensional Euclidean coordinate space. We will also write elements (x, y, z) of R3
be using the column vector notation
x y z
.
(9) Lastly and most generally, let n≥1 be any natural number. We will let Rn be the set of all
ordered n-tuples of real numbers, i.e. the set of all n-tuples (x1, x2, ..., xn) for which each coordinatexi, 1≤i≤n, is a real number. We will also use the column vector notation
x1
x2
... xn
in this context.
Definition 2 (Set Notation). IfAis a set andxis an element ofA, then we write:
x∈A.
IfB is a set such that every element ofB is an element ofA(i.e. ifx∈B thenx∈A), then we call
B a subsetofAand we write:
B⊆A.
In order to distinguish particular subsets we wish to talk about, we will frequently use set-builder notation, which for convenience we will describe informally using examples, rather than give a formal definition. For an example, suppose we wish to formally describe the setE of all even positive integers (See Example 1 (2)). Then we may write
E={x∈N:xis evenly divisible by 2}.
The above notation should be read asThe set of allxinNsuch thatxis evenly divisible by2, which
clearly and precisely defines our setE. For another example, we could write
Y ={x∈R: 5≤x <10},
which readsThe set of allxinRsuch that5is less than or equal toxandxis strictly less than10. The
student should easily verify thatY = [5,10) from Example 1 (3). In general, given a setAand a precise mathematical sentenceP(x) about a variablex, the set-builder notation should be read as follows.
{ x∈A : P(x)}
2
Vector Spaces and Subspaces.
Definition 3. A(real) vector spaceis a nonempty setV, whose elements are calledvectors, together with an operation +, calledaddition, and an operation·, calledscalar multiplication, which satisfy the following ten axioms:
Addition Axioms.
(1) If~u∈V and~v∈V, then~u+~v∈V. (Closure under addition.)
(2) ~u+~v=~v+~ufor all~u, ~v∈V. (Commutative property of addition.)
(3) (~u+~v) +w~ =~u+ (~v+w~) for all~u, ~v, ~w∈V. (Associative property of addition.)
(4) There exists a vector~0∈V which satisfies~u+~0 =~ufor all~u∈V. (Existence of an additive identity.)
(5) For every~u∈V, there exists a vector−~u∈V such that~u+ (−~u) =~0. (Existence of additive inverses.)
Scalar multiplication axioms. (6) If~u∈V andc∈R, then c·~u∈V.
(Closure under scalar multiplication.)
(7) c·(~u+~v) =c·~u+c·~vfor allc∈R,~u, ~v∈V.
(First distributive property of multiplication over addition.)
(8) (c+d)·~u=c·~u+d·~ufor allc, d∈R,~u∈V.
(Second distributive property of multiplication over addition.)
(9) c·(d·~u) = (c·d)·~ufor allc, d∈R,~u∈V.
(Associative property of scalar multiplication.)
(10) 1·~u=~ufor every~u∈V.
We use the “arrow above” notation to help differentiate vectors (~u,~v, etc.), which may or may not be real numbers, from scalars, which are always real numbers. When no confusion will arise, we will often drop the· symbol in scalar multiplcation and simply writec~uinstead ofc·u~,c(~u+~v) instead of
c·(~u+~v), etc.
Example 2. LetV be an arbitrary vector space. (1) Prove that~0 +~u=~ufor every~u∈V.
(2) Prove that the zero vector~0 is unique. That is, prove that if w~ ∈ V has the property that
~
u+w~ =~ufor every u~∈V, then we must havew~ =~0.
(3) Prove that for every ~u∈V, the additive inverse−~uis unique. That is, prove that ifw~ ∈V has the property that~u+w~ =~0, then we must havew~ =−~u.
Proof. (1) By Axiom (2), the commutativity of addition, we have~0 +~u=~u+~0. Hence by Axiom
(2) Suppose w~ ∈V has the property that~u+w~ =~ufor every~u∈V. Then in particular, we have
~0 +w~ =~0. But~0 +w~ =w~ by part (1) above; sow~ =~0 +w~ =~0.
(3) Let~u∈V, and supposew~ ∈V has the property that~u+w~ =~0. Let−~ube the additive inverse of ~u guaranteed by Axiom (5). Adding −~uto both sides of the equality above, and applying Axioms (2) and (3) (commutativity and associativity), we get
−~u+ (~u+w~) =−~u+~0 (−~u+~u) +w~ =−~u
(~u+ (−~u)) +w~ =−~u ~0 +w~ =−~u.
Now by part (1) above, we havew~ =~0 +w~ =−~u.
Example 3(Examples of Vector Spaces). (1) The real number lineRis a vector space, where both + and· are interpreted in the usual way. In this case the axioms are the familiar properties of real numbers which we learn in elementary school.
(2) Consider the planeR2=
x y
:x, y∈R
. Define an operation + onR2 by the rule
x y
+
z w
=
x+z y+w
for allx, y, z, w∈R,
and a scalar multiplication·by the rule
c·
x y
=
cx cy
for every c∈R,x, y∈R.
ThenR2becomes a vector space. (Verify that each of the axioms holds.)
(3) Consider the following purely geometric description. Let V be the set of all arrows in two-dimensional space, with two arrows being regarded as equal if they have the same length and point in the same direction. Define an addition + onV as follows: if~uand~v are two arrows in
V, then lay them end-to-end, so the base of~v lies at the tip of~u. Then define~u+~v to be the arrow which shares its base with ~uand its tip with~v. (A picture helps here.) Define a scalar multiplication·by lettingc·~ube the arrow which points in the same direction as~u, but whose length isctimes the length of~u. IsV a vector space? (What is the relationship ofV withR2?)
(4) In general ifn∈N,n≥1, thenRnis a vector space, where the addition and scalar multiplication arecoordinate-wisea la part (2) above.
(5) Letn∈N, and letPndenote the set of all polynomials of degree at mostn. That is,Pn consists of all polynomials of the form
p(x) =a0+a1x+a2x2+...+anxn
where the coefficients a0, a1, ..., an are real numbers, and x is an abstract variable. Define + and · as follows: Suppose c ∈ R, and p, q ∈ Pn, so p(x) = a0 +a1x+...+anxn and
q(x) =b0+b1x+...+bnxn for some coefficientsa0, ..., an, b0, ..., bn∈R. Then
and
(cp)(x) =cp(x) =ca0+ca1x+...+canxn.
ThenPn is a vector space.
Definition 4. LetV be a vector space, and let W ⊆ V. If W is also a vector space using the same operations + and· inherited fromV, then we callW avector subspace, or justsubspace, ofV. Example 4. For each of the following, prove or disprove your answer.
(1) Let V =
x y
∈R2:x≥0, y≥0
, so V is the first quadrant of the Cartesian plane. IsV a
vector subspace ofR2?
(2) Let V be the set of all points on the graph ofy= 5x. IsV a vector subspace ofR2? (3) Let V be the set of all points on the graph ofy= 5x+ 1. IsV a vector subspace ofR2? (4) IsRa vector subspace ofR2?
(5) Is{~0} a vector subspace ofR2? (This space is called thetrivial space.)
Proof. (2) We claim that the set V of all points on the graph of y= 5xis indeed a subspace ofR2.
To prove this, observe that sinceV is a subset ofR2with the same addition and scalar
multipli-cation operations, it is immediate that Axioms (2), (3), (7), (8), (9), and (10) hold (Verify this in your brain!) So we need only check Axioms (1), (4), (5), and (6).
If x1 y1 , x2 y2
∈V, then by the definition ofV we havey1= 5x1andy2= 5x2. It follows
thaty1+y2= 5x1+5x2= 5(x1+x2). Hence the sum
x1+x2
y1+y2
satisfies the defining condition
ofV, and so
x1+x2
y1+y2
= x1 y1 + x2 y2
∈V. SoV is closed under addition and Axiom (1) is satisfied.
Check that the additive identity
0 0
is inV, so Axiom (3) is satisfied.
If
x1
y1
∈ V and c ∈ R, then c
x1 y1 = cx1 cy1
∈ V since cy1 = c(5x1) = 5(cx1), so
V is closed under scalar multiplication. Hence Axiom (6) is satisfied. Moreover, if we take
c =−1 in the previous equalities, we see that each vector
x1
y1
∈V has an additive inverse
−x1 −y1
∈V. So Axiom (5) is satisfied. ThusV meets all the criteria to be a vector space, as
we claimed.
(3) On the other hand, if we take V to be the set of all points on the graph of y= 5x+ 1, thenV
isnot a vector subspace ofR2. To see this, it suffices to check, for instance, thatV fails Axiom
(1), i.e. V is not closed under addition.
To show thatV is not closed under addition, it suffices to exhibit two vectors inV whose sum is not inV. So let~u=
0 1
and let~v=
1 6
. Both~uand~vare inV. (Why?) But their sum
~ u+~v =
1 7
isnot a solution of the equation y = 5x+ 1, and hence not inV. So V fails to satisfy Axiom (1), and cannot be a vector space.
Fact 1. Let V be a vector space, and let W ⊆ V. Then W, together with the addition and scalar multiplication inherited fromV, satisfies Axioms (2), (3), (7), (8), (9), (10) in the definition of a vector space.
Theorem 1. Let V be a vector space and letW ⊆V. Then W is a subspace of V if and only if the following three properties hold:
(1) ~0∈W.
(2) W is closed under addition. That is, for each~u, ~v∈W, we have~u+~v∈W.
(3) W is closed under scalar multiplication. That is, for eachc∈Randu~ ∈W, we have c~u∈W.
Proof. One direction of the proof is trivial: ifW is a vector subspace ofV, thenW satisfies the three conditions above because it satisfies Axioms (4), (1), and (6) respectively in the definition of a vector space.
Conversely, supposeW ⊆V, andW satisfies the three conditions above. The three conditions imply that W satisfies Axioms (4), (1), and (6), respectively, while our previous fact implies that W also satisfies Axioms (2), (3), (7), (8), (9), and (10). So the only axiom left to check is (5).
To see that (5) is satisfied, let ~u ∈ W. Since W ⊆ V, the vector ~u is in V, and hence has an additive inverse−~u∈V. We must show that in fact−~u∈W. Note that since W is closed under scalar multiplication, the vector −1·~uis in W. But on our homework problem #1(d), we show that in fact −1·~u=−~u. So−~u=−1·~u∈V as we hoped, and the proof is complete. Example 5. What do the vector subspaces ofR2 look like? What aboutR3?
3
Linear Combinations and Spanning Sets.
Definition 5. LetV be a vector space. Let v~1, ~v2, ..., ~vn ∈V, and letc1, c2, ..., cn ∈R. We say that a
vector~udefined by
~u=c1v~1+c2v~2+...+cnv~n
is alinear combination of the vectorsv~1, ..., ~vn with weightsc1, ..., cn. Notice that since addition is associative in any vector spaceV, we may omit parentheses from the sum above. Also notice that the zero vector~0 is always a linear combination of any collection of vectorsv~1, ..., ~vn, since we may always takec1=c2=...=cn = 0.
Example 6. Letv~1=
−1 1
and letv~2=
2 1
.
(1) Which points in R2 are linear combinations ofv
1 andv2, usinginteger weights? (2) Which points in R2 are linear combinations ofv
1 andv2, using any weights?
Example 7. Leta~1=
1 −2 −5
anda~2=
2 5 6
be vectors inR3.
(1) Let~b=
7 4 −3
. May~bbe written as a linear combination ofa~1 anda~2? (2) Let~b=
6 3 −5
. May~bbe written as a linear combination ofa~1 anda~2?
Partial Solution. (1) We wish to answer the question: Do there exist real numbers c1 and c2 for which
Written differently, are therec1, c2∈Rfor which
c1·
1 −2 −5
+c2·
2 5 6
=
c1+ 2c2 −2c1+ 5c2 −5c1+ 6c2
=
7 4 −3
?
Recall that vectors inR3are equal if and only if each pair of entries are equal. So to identify a c1andc2which make the above equality true, we wish to solve the following system of equations in two variables:
c1+ 2c2= 7 −2c1+ 5c2= 4 −5c1+ 6c2=−3
This can be done manually using elementary algebra techniques. We should getc1= 3 and
c2= 2. Since a solution exists,~b= 3a~1+ 2a~2is indeed a linear combination of a~1 anda~2.
Definition 6. Let V be a vector space and v~1, ..., ~vn ∈ V. We denote the set of all possible linear combinations ofv~1, ..., ~vn by Span{v~1, ..., ~vn}, and we call this set the subset ofV spannedbyv~1, ..., ~vn. We also call Span{v~1, ..., ~vn}the subset ofV generatedbyv~1, ..., ~vn.
Example 8. (1) Find Span
3 −1
in R2.
(2) Find Span
1 0 0
,
0 0 1
inR3.
Theorem 2. Let V be a vector space and let W ⊆ V. Then W is a subspace of V if and only if W
is closed under linear combinations, i.e., for every v~1, ..., ~vn ∈ W and every c1, ..., cn ∈ R, we have c1v~1+...+cnv~n∈W.
Corollary 1. LetV be a vector space andv~1, ..., ~vn∈V. Then Span{v~1, ..., ~vn} is a subspace ofV.
4
Matrix Row Reductions and Echelon Forms.
As our previous few examples should indicate, our techniques for solving systems of linear equations are extremely relevant to our understanding of linear combinations and spanning sets. The student should recall the basics of solving systems from a previous math course, but in this section we will develop a new notationally convenient method for finding solutions to such systems.
Definition 7. Recall that alinear equationinnvariables is an equation that can be written in the form
a1x1+a2x2+...+anxn=b
where a1, ..., an, b ∈ R and x1, ..., xn are variables. A system of linear equations is any finite collec-tion of linear equacollec-tions involving the same variablesx1, ..., xn. Asolutionto the system is an n-tuple (s1, s2, ..., sn)∈Rn, that makes each equation in the system true if we substitutes1, ..., sn forx1, ..., xn respectively.
The set of all possible solutions to a system is called the solution set. Two systems are called equivalent if they have the same solution set. It is only possible for a system of linear equations to have no solutions, exactly one solution, or infinitely many solutions. (Geometrically one may think of “parallel lines,” “intersecting lines,” and “coincident lines,” respectively.
A system of linear equations is calledconsistentif it has at least one solution; if the system’s solution set is∅, then the system isinconsistent.
Example 9. Solve the following system of three equations in three variables.
x1 − 2x2 + x3 = 0 2x2 − 8x2 = 8 −4x1 + 5x2 + 9x3 = −9
Solution. Our goal here will be to solve the system using the elimination technique, but with a stripped down notation which preserves only the necessary information and makes computations by hand go quite a bit faster.
First let us introduce a little terminology. Given the system we are working with, the matrix of coefficientsis the 3x3 matrix below.
1 −2 1
0 2 −8
−4 5 9
Theaugmented matrixof the system is the 3x4 matrix below.
1 −2 1 0
0 2 −8 8
−4 5 9 −9
Note that the straight black line between the third and fourth columns in the matrix above is not necessary. Its sole purpose is to remind us where the “equals sign” was in our original system, and may be omitted if the student prefers.
To solve our system of equations, we will perform operations on the augmented matrix which “encode” the process of elimination. For instance, if we were using elimination on the given system, we might add 4 times the first equation to the third equation, in order to eliminate thex1variable from the third equation. Let’s do the analogous operation to our matrix instead.
Add4 times the top row to the bottom row:
1 −2 1 0
0 2 −8 8
−4 5 9 −9
7→
1 −2 1 0
0 2 −8 8
0 −3 13 −9
Notice that the new matrix we obtained above can also be interpreted as the augmented matrix of a system of linear equations, which is the new system we would have obtained by just using the elim-ination technique. In particular, the first augmented matrix and the new augmented matrix represent systems which areequivalentto one another. (However, the two matrices are obviously not equalto one another, which is why we will stick to the arrow7→notation instead of using equals signs!)
Now we will continue the process, replacing our augmented matrix with a new, simpler augmented matrix, in such a way that the linear systems the matrices represent are all equivalent to one another.
Multiply the second row by 1 2:
1 −2 1 0
0 2 −8 8
0 −3 13 −9
7→
1 −2 1 0
0 1 −4 4
0 −3 13 −9
Add3 times the second row to the third row:
1 −2 1 0
0 1 −4 4
0 −3 13 −9
7→
1 −2 1 0
0 1 −4 4
0 0 1 3
Add4 times the third row to the second row:
1 −2 1 0
0 1 −4 4
0 0 1 3
7→
1 −2 1 0
0 1 0 16
0 0 1 3
Add−1times the third row to the first row:
1 −2 1 0
0 1 0 16
0 0 1 3
7→
1 −2 0 −3
0 1 0 16
0 0 1 3
Add2 times the second row to the first row:
1 −2 0 −3
0 1 0 16
0 0 1 3
7→
1 0 0 29 0 1 0 16
0 0 1 3
Now let’s stop and examine what we’ve done. The last augmented matrix above represents the following system:
x1 = 29
x2 = 16
x3 = 3
So the system is solved, and has a unique solution (29,16,3). Definition 8. Given an augmented matrix, the threeelementary row operationsare the following.
(1) (Replacement) Replace one row by the sum of itself and a multiple of another row. (2) (Interchange) Interchange two rows.
(3) (Scaling) Multiply all entries in a row by a nonzero constant.
These are the “legal moves” when solving systems of equations using augmented matrices. Two matrices are called row equivalentif one can be transformed into the other using a finite sequence of elementary row operations.
Fact 2. If two augmented matrices are row equivalent, then the two linear systems they represent are equivalent, i.e. they have the same solution set.
Definition 9. A matrix is said to be inechelon form, orrow echelon form, if it has the following three properties:
(1) All rows which have nonzero entries are above all rows which have only zero entries.
(2) Each leading nonzero entry of a row is in a column to the right of the leading nonzero entry of the row above it.
A matrix is said to be in reduced echelon form or reduced row echelon form if it is in echelon form, and also satisfies the following two properties:
(4) Each nonzero leading entry is 1.
(5) Each leading 1 is the only nonzero entry in its column.
Every matrix may be transformed, via elementary row operations, into a matrix in reduced row echelon form. This process is calledrow reduction.
Example 10. Use augmented matrices and row reductions to find solution sets for the following systems of equations.
(1)
2x1 − 6x3 = −8
x2 + 2x3 = 3 3x1 + 6x2 − 2x3 = −4
(2)
x1 − 5x2 + 4x3 = −3 2x1 − 7x2 + 3x3 = −2 −2x1 + x2 + 7x3 = −1
5
Functions and Function Notation
Definition 10. LetX and Y be sets. Afunctionf from X to Y is just a rule by which we associate to each element x∈X, an elementf(x)∈Y. The input set X is called the domain off, while the set Y of possible outputs is called the codomainoff. To denote the domain and codomain when we define a function, we write
f :X→Y.
We regard the termsmap,mapping, andtransformationas all being synonymous with function. Example 11. (1) For the familiar quadratic functionf(x) =x2, we would writef :
R→R, since f takes real numbers for input and returns real numbers for output. Notice that the codomain
Rhere is distinct from the range, i.e. the set of all actual outputs of the function, which the
reader should know is just the set [0,∞). This is an ok part of the notation.
(2) On the other hand, for the familiar hyperbolic functionf(x) = 1x, we would NOT writef :R→ R; this is because 0∈Rbut 0 is not part of the domain off.
6
Linear Transformations
Definition 11. LetV andW be vector spaces. A function T :V →W is a linear transformationif (1) T(~v+w~) =T(v~) +T(w~) for every~v, ~w∈V.
(2) T(c~v) =cT(~v) for every~v∈V andc∈R.
Note that the transformations which are linear are those whichrespect the addition and scalar multiplication operationsof the vector spaces involved.
Example 12. Are the following maps linear transformations? (1) T :R→R,T(x) =
x
3x
.
(2) T :R→R2,T(x) =
x x2
.
(3) T :R2→
R2, T(~v) =A~v, whereA=
1 2
0 −1
Fact 3. Everymatrix transformation T :Rn→Rm is a linear transformation.
Fact 4. If T :V →W is a linear transformation, thenT preserves linear combinations. In other words, for every collection of vectorsv~1, ..., ~vn∈V and every choice of weightsc1, ..., cn∈R, we have
T(c1v~1+...+cnv~n) =c1T(v~1) +...+cnT(v~n).
Definition 12. LetV =Rn be n-dimensional Euclidean space. Define vectorse~1, ..., ~en∈V by:
~ e1=
1 0 0 ... 0 0
;e~2=
0 1 0 ... 0 0
; ...;e~n=
0 0 0 ... 0 1 .
These vectorse~1, ..., ~en are called thestandard basisforRn. Note that any vector~v=
v1 v2 ... vn may
be written as a linear combination of the standard basis vectors in an obvious way: ~v=v1e~1+v2e~2+
...+vne~n.
Example 13. SupposeT :R2→
R3is a linear transformation such thatT(e~1) =
5 −7 2
andT(e~2) =
−3 8 0
. Find an explicit formula forT.
Theorem 3. LetT :Rn →
Rm(n, m∈N)be a linear transformation. Then there exists a uniquem×n
matrixA such thatT(~v) =A~v for every~v∈Rn. In fact, we have
A=
T(e~1) T(e~2) ... T(e~n)
.
Proof. Notice that for any~v=
v1 v2 ... vn
∈Rn, we have~v=v1e~1+...+vne~n. Since T is a linear
trans-formation, it respects linear combinations, and hence
T(~v) =v1T(e~1) +...+vnT(e~n)
=
T(e~1) T(e~2) ... T(e~n) v1 v2 ... vn
=A~v,
whereAis as we defined in the statement of the theorem.
Example 14. LetT :R2→
R2be the linear transformation which scales up every vector by a factor of
3, i.e. T(~v) = 3~v for every~v∈R2. Find a matrix representation forT.
Example 15. LetT :R2→R2 be the linear transformation which rotates the plane about the origin
7
The Null Space of a Linear Transformation
Definition 13. LetV andW be vector spaces, and let T :V →W be a linear transformation. Define the following set:
NulT ={~v∈V :T(~v) =~0}. Then this set NulT is called thenull space orkernelof the mapT.
Example 16. LetT :R3→R3be the matrix transformationT(~v) =A~v, whereA=
1 −1 5
2 0 7
−3 −5 −3
.
(1) Let ~u=
−7 3 2
. Is~u∈NulT?
(2) Let w~ =
−5 −5 0
. Isw~ ∈NulT?
Theorem 4. LetV, W be vector spaces andT :V →W a linear transformation. ThenNulT is a vector subspace of V.
Proof. We will show that NulT is closed under taking linear combinations, and hence the result will follow from Theorem 2.
Letv~1, ..., ~vn ∈NulT andc1, ..., cn ∈Rbe arbitrary. We must show that c1v~1+...+cnv~n ∈NulT, i.e. thatT(c1v~1+...+cnv~n) =~0. To see this, simply observe that sinceT is a linear transformation, we have
T(c1v~1+...+cnv~n) =c1T(v~1+...+cnT(v~n).
But sincev~1, ..., ~vn∈NulT, we haveT(v~1) =...=T(v~n) =~0. So in fact we have
T(c1v~1+...+cnv~n) =c1~0 +...+cn~0 =~0.
It follows thatc1v~1+...+cnv~n∈NulT, and NulT is closed under linear combinations. This completes
the proof.
Example 17. For the following matrix transformations, give an explicit description of NulT by finding a spanning set.
(1) T :R4→R2, T
x1 x2 x3 x4 =
1 2 4 0
0 1 3 −2
x1 x2 x3 x4
for everyx1, x2, x3, x4∈R.
(2) T :R5→
R3, T
x1 x2 x3 x4 x5 =
1 −4 0 2 0
0 0 1 −5 0
0 0 0 0 2
x1 x2 x3 x4 x5
for everyx1, x2, x3, x4, x5∈R.
Definition 14. LetV, W be sets. A mapping T :V →W is calledone-to-oneif the following holds: for every~v, ~w∈V, if~v6=w~, thenT(~v)6=T(w~).
Equivalently,T is one-to-one if the following holds: for every~v, ~w∈V, ifT(~v) =T(w~), then~v=w~). A map is one-to-one only if it sends distinct elements in V to distinct elements in W, i.e. no two elements inV are mapped to the same place inW.
Example 18. Are the following linear transformations one-to-one? (1) T :R→R2,T(x) =
x
3x
.
(2) T :R2→R2, T
x1
x2
=
1 2 2 4
x1
x2
.
Theorem 5. A linear transformationT :V →W is one-to-one if and only ifNulT ={~0}.
Proof. (⇒) First supposeT is one-to-one, and let~v∈NulT. We will show~v=~0. To see this, note that
T(~v) =~0 since~v∈NulT. But we also have~0∈NulT since NulT is a subspace ofV, soT(~0) =~0 =T(~v). SinceT is one-to-one, we must have~0 =~v, and hence NulT is the trivial subspace NulT ={~0}.
(⇐) Conversely, supposeT is not one-to-one; we will show NulT is non-trivial. Since T is not one-to-one, there exist two distinct vectors~v, ~w∈V,~v 6=w~, such thatT(~v) =T(w~). Set~u=~v−w~. Since
~v6=w~ and additive inverses are unique, we have~u6=~0, and we also have
T(~u) =T(~v−w~) =T(~v)−T(w~) =T(~v)−T(~v) =~0.
So~u∈NulT. Since~uis not the zero vector, NulT 6={~0}. Definition 15. LetV, W be sets. A mapping T : V → W is called onto if the following holds: for everyw~ ∈W, there exists a vector~v∈V such thatT(~v) =w~.
Example 19. Are the following linear transformations onto? (1) T :R→R2,T(x) =
x
3x
.
(2) T :R2→R,T
x1
x2
=x2−x1.
(3) T :R2→R2, T
x1
x2
=
x2 2x1+x2
.
Definition 16. LetV, W be vector spaces andT :V →W a linear transformation. IfT is both one-to-one and onto, thenT is called anisomorphism. In this case the domainV and codomainW are called isomorphic as vector spaces, or just isomorphic. It means that V and W are indistinguishable from one another in terms of their vector space structure.
Example 20. Prove that the mapping T :R2 →
R2, where T rotates the plane by a fixed angle θ, is
an isomorphism.
Example 21. Let W be the graph of the line y = 3x, a vector subspace of R2. Prove that R is
isomorphic toW.
8
The Range Space of a Linear Transformation and the Column Space of a
Matrix
Definition 17. LetV, W be vector spaces andT :V →W a linear transformation. Define the following set:
Then RanT is called the range ofT. (This definition should coincide with the student’s knowledge of the range of a function from previous courses.)
Theorem 6. Let V, W be vector spaces and T : V → W a linear transformation. Then RanT is a vector subspace of W.
Proof. Again we will show that RanT is closed under linear combinations, and appeal to Theorem 2.
To that end, letw~1, ..., ~wn∈RanT and letc1, ..., cn∈Rall be arbitrary. We must showc1w~1+...+
cnw~n ∈RanT. To see this, note that since w~1, ..., ~wn ∈RanT, there exist vectors v~1, ..., ~vn ∈ V such thatT(v~1) =w~1, ..., T(v~n) =w~n. Set
~
u=c1v~1+...+cnv~n.
SinceV is a vector space,V is closed under taking linear combinations and hence ~u∈V. Moreover, we have
T(~u) =T(c1v~1+...+cnv~n) =c1T(v~1) +...+cnT(v~n) =c1w~1+...+cnw~n.
So the vectorc1w~1+...+cnw~n is the image of~uunderT, and hencec1w~1+...+cnw~n∈RanT. This shows RanT is closed under linear combinations and hence a vector subspace ofW. Theorem 7. Let V, W be vector spaces andT :V →W a linear transformation. ThenT is onto if and only ifRanT =W.
Proof. Obvious if you think about it!
Corollary 2. Let V, W be vector spaces and T :V →W a linear transformation. Then the following statements are all equivalent:
(1) T is an isomorphism.
(2) T is one-to-one and onto.
(3) NulT ={~0} andRanT =W.
Definition 18. Letm, n∈N, and letAbe anm×nmatrix (soAinduces a linear transformation from Rn intoRm). Write
A=
w~1 ... w~n
,
where each ofw~1, ..., ~wn is a column vector inRm.
Define the following set:
ColA= Span{w~1, ..., ~wn}.
Then ColAis called thecolumn spaceof the matrixA. ColAis exactly the set of all vectors inW
which may be written as a linear combination of the columns of the matrixA. Of course ColAis also a vector subspace ofRm, by Corollary 1.
Example 22. LetA=
0 1 2 1 0 0
.
(1) Determine whether or not the vector
3 5 0
is in ColA.
(2) What does ColAlook like inR3?
Example 23. LetW =
6a−b a+b
−7a
:a, b∈R
. Find a matrix Aso thatW = ColA.
Theorem 8. Let T : Rn → Rm (n, m ∈ N) be a linear transformation. Let A be the m×n matrix
representation ofA guaranteed by Theorem 3. ThenRanT = ColA. Proof. Recall from Theorem 3 that the matrixAis given by
A=
T(e~1) T(e~2) ... T(e~n) ,
wheree~1, ..., ~en are the standard basis vectors inRn. So its columns are justT(e~1), ..., T(e~n).
The first thing we will show is that RanT ⊆ColA. So supposew~ ∈RanT. Then there exists some vector ~v ∈ Rn for which T(~v) = w~. Write ~v =
v1
... vn
= v1e~1+...+vne~n for some real numbers
v1, ..., vn∈R. Then, sinceT is a linear transformation, we have
~
w=T(~v)
=T(v1e~1+...+vne~n) =v1T(e~1+...+vnT(e~n).
So the above equality displaysw~ as a linear combination of the columns ofA, with weightsv1, ..., vn. This provesw~ ∈ColA. Sincew~ was taken arbitrarily out of RanT, we must have RanT ⊆ColA.
The next thing we will show is that ColA⊆RanT. So letw~ ∈ColA. Then w~ can be written as a linear combination of the columns ofA, so there exist some weightsc1, ..., cn∈Rfor which
~
w=c1T(e~1) +...+cnT(e~n).
Set~v=c1e~1+...+cne~n. So~v∈Rn, and we claim thatT(~v) =w~. To see this, just compute (again
using the fact thatT is linear):
T(~v) =T(c1e~1+...cne~n) =c1T(e~1) +...+cnT(e~n) =w.~
This shows w~ ∈ RanT, and again since w~ was taken arbitrarily out of ColA, we have shown that ColA⊆RanT.
Since RanT ⊆ColAand ColA⊆RanT, we must have RanT = ColA. This completes the proof. Example 24. Let T : R2 →
R2 be the linear transformation defined by T(e~1) = −2e~1 + 4e~2 and
T(e~2) =−e~1+ 2e~2. Letw~ =
2 1
. Is w~ ∈RanT?
9
Linear Independence and Bases.
Definition 19. LetV be a vector space and let v~1, ..., ~vn∈V. Consider the equation:
c1v~1+...+cnv~n=~0
wherec1, ..., cn are interpreted as real variables. Notice thatc1=...=cn = 0 is always a solution to the equation, which is called thetrivial solution, but that there may be others depending on our choice ofv~1, ..., ~vn.
If there exists a nonzero solution (c1, ..., cn) to the equation, i.e. a solution where ck 6= 0 for at least one ck (1≤k≤n) then the vectorsv~1, ..., ~vn are calledlinearly dependent. Otherwise if the trivial solution is the only solution, then the vectorsv~1, ..., ~vn are called linearly independent.
Example 25. Letv~1=
1 2 3
, v~2=
4 5 6
, andv~3=
2 1 0
.
(1) Are the vectorsv~1,v~2,v~3linearly independent?
(2) If possible, find adependence relation, i.e. a non-trivial linear combination ofv~1, ~v2, ~v3which sums to~0.
Fact 5. Letv~1, ..., ~vn be column vectors inRm. Define anm×nmatrixA by:
A=
v~1 ... v~n
,
and define a linear transformation T : Rn →
Rm by the rule T(~v) = A~v for every~v ∈ Rn. Then the
following statements are all equivalent:
(1) The vectorsv~1, ..., ~vn are linearly independent.
(2) NulT ={~0}.
(3) T is one-to-one.
Example 26. Letv~1=
2 1
andv~2=
4 −1
. Arev~1andv~2linearly independent?
Fact 6. LetV be a vector space andv~1, ..., ~vn∈V. Thenv~1, ..., ~vn are linearly dependent if and only if there exists some k∈ {1, ..., n}such thatv~k ∈Span{v~1, ..., ~vk−1, ~vk+1, ..., ~vn}, i.e. v~k can be written as a linear combination of the other vectors.
Example 27. Letv~1=
2 1
∈R2.
(2) Describe all pairs of vectorsv~2, ~v3∈R2for whichv~1, ~v2, ~v2 are linearly independent.
Example 28. Letv~1=
0 1 3
andv~2=
0 3 −1
. Describe all vectorsv~3∈R3 for whichv~1, ~v2, ~v3 are linearly independent.
Definition 20. LetV be a vector space and letb~1, ..., ~bn∈V. The set{b~1, ..., ~bn} ⊆V is called abasis forV if
(1) b~1, ..., ~bn are linearly independent, and
(2) V = Span{b~1, ..., ~bn}.
Example 29. Letv~1=
3 0 −6
,v~2=
−4 1 7
, andv~3=
−2 1 5
. Is{v~1, ~v2, ~v3}a basis forR3?
Example 30. Letv~1=
2 3
andv~2=
5 0
.
(1) Is{v~1, ~v2} a basis forR2?
(2) What if v~2=
−10 −15
?
Fact 7. The standard basis{e~1, ..., ~en} is a basis for Rn.
Fact 8. The set{1, x, x2, ..., xn} is a basis for Pn.
Fact 9. LetV be any vector space and letB={b~1, ..., ~bn} be a basis forV.
(1) If you remove any one element b~k from B, then the resulting set no longer spans V. This is becauseb~k is linearly independent from the other vectors inB, and hence cannot be written as a linear combination of them by Fact 6. In this sense a basis is a “minimal spanning set.” (2) If you add any one element~v, not already in B, to B, then the result set is no longer linearly
independent. This is because B spans V, and hence there are no vectors in V which are inde-pendent from those in B by Fact 6. In this sense a basis is a “maximal linearly independent set.”
Theorem 9 (Unique Representation Theorem). Let V be a vector space and let B ={b~1, ..., ~bn} be a basis for V. Then every vector ~v ∈V may be written as a linear combination~v=c1b~1+...+cnb~n in one and only one way, i.e. ~v has a unique representation with respect to B.
Proof. Since{b~1, ..., ~bn} is a basis forV, in particular the set spansV, so we have~v ∈Span{b~1, ..., ~bn}. It follows that there exist some scalarsc1, ..., cn ∈Rn for which
~
v=c1b~1+...+cnb~n.
So we need only check that the representation above is unique. To that end, supposed1, ..., dn∈Ris
another set of scalars for which
~v=d1b~1+...+dnb~n.
We will show that in factd1=c1, ...,dn=cn, and hence there is really only one choice of scalars to begin with. To see this, observe the following equalities:
(d1−c1)b~1+...+ (dn−cn)b~n= [d1b~1+...+dnb~n]−[c1b~1+...+cnb~n] =~v−~v
=~0.
So we have written~0 as a linear combination of b~1, ..., ~bn. But the collection is a basis and hence linearly independent, so the coefficients must all be zero, i.e. d1−c1=...=dn−cn = 0. The conclusion
now follows.
10
Dimension
Theorem 10 (Steinitz Exchange Lemma). Let V be a vector space and letB ={b~1, ..., ~bn} be a basis forV. Let~v∈V. By Theorem 9 there are unique scalars c1, ..., cn for which
~
v=c1b~1+...+cnb~n.
If there isk∈ {1, ..., n}for which ck 6= 0, then exchanging~v forb~k yields another basis for the space, i.e. the collection{b~1, ..., ~bk−1, ~v, ~bk+1, ..., ~bn} is a basis for V.
Proof. Call the new collection ˆB ={b~1, ..., ~bk−1, ~v, ~bk+1, ..., ~bn}. We must show that ˆB is both linearly independent and that it spansV.
First we check that ˆBis linearly independent. Consider the equation
d1b~1+...+dk−1bk~−1+dk~v+dk+1bk+1~ +...+dnb~n=~0.
Let’s substitute the representation for~v in the above:
d1b~1+...+dk−1bk~−1+dk(c1b~1+...+cnb~n) +dk+1bk+1~ +...+dnb~n=~0.
Now collecting like terms we get:
(d1+dkck)b~1+....+ (dk−1+dkck−1)bk~−1+dkckb~k+ (dk+1+dkck+1)bk+1~ +...+ (dn+dkcn)b~n=~0.
Now sinceb~1, ..., ~bn are linearly independent, all the coefficients in the above equation must be zero. In particular, we havedkck = 0. Butck 6= 0 by our hypothesis, so we may divide through byck and get
dk= 0. Now substituting 0 back in fordk we have:
d1b~1+...dk−1bk~−1+dk+1bk+1~ +...+dnb~n.
Now using the linear independence ofb~1, ..., ~bn one more time, we conclude thatd1=...dk−1=dk=
dk+1=...=dn = 0. So the only solution is the trivial solution, and hence ˆBis linearly independent. It remains only to check thatV = Span ˆB. To see this, letw~ ∈V be arbitrary. By Theorem 9, write
~
was a linear combination of the vectors inB:
~
for some scalarsa1, ..., an ∈R. Now notice that since~v=c1b~1+...cnb~n and sinceck6= 0, we may solve for the vectorb~k as follows:
~ bk =−cc1
k ~
b1−...−ckc−1
k ~ bk−1+c1
k~v−
ck+1
ck ~
bk+1−...−ccn
k ~ bn.
Note that for the above to make sense, it is crucial that ck 6= 0. Now, if we substitute the above expression for bk in the equation w~ = a1b~1+...+anb~n and then collect like terms, we will see w~ written as a linear combination of the vectorsb~1, ..., ~bk−1, ~bk+1, ..., ~bnand the vector~v. This implies that
~
w∈ Span ˆB. Since w~ was arbitrary in V, we have V = Span ˆB. So ˆB is indeed a basis for V and the
theorem is proved.
Corollary 3. LetV be a vector space which has a finite basis. Then every basis ofV is the same size. Proof. Let B = {b~1, ..., ~bn} be a basis of minimal size. Let D be any other basis for V, consisting of arbitrarily many elements ofV. We will show that in factDhas exactlynelements.
Let d~1 ∈D be arbitrary. Sinced~1 ∈ V = Span{b~1, ..., ~bn}, it is possible to write d~1 as a nontrivial linear combination d~1 =c1b~1+...+cnb~n, i.e. ck 6= 0 for at least one k∈ {1, .., n}. By reordering the terms of the basisB if necessary, we may assume without loss of generality thatc16= 0. Define a new setB1={d~1, ~b2, ..., ~bn}; by the Steinitz Exchange Lemma, B1 is also a basis forV.
Now we go one step further. SinceB was a basis of minimal size, we know Dhas at least nmany elements. So let d~2 ∈ D be distinct from d~1. We know d~2 ∈ V = SpanB1, so there exist constants
c1, ..., cn for whichd~2=c1d~1+c2b~2+...+cnb~n. Notice that we must haveck6= 0 for somek∈ {2, ..., n}, since ifc1 were the only nonzero coefficient, thend∈ would be a scalar multiple ofd∞, which is not the
case since they are linearly independent. By reordering the termsb~2, ..., ~bn if necessary, we may assume without loss of generality thatc26= 0. DefineB2={d~1, ~d2, ~b3, ..., ~b4}. By the Steinitz Exchange Lemma, B2 is also a basis forV.
Continue this process. At each stepk∈ {1, ..., n}, we will have obtained a new basisBk={d~1, ..., ~dk, ~bk+1, ..., ~bn} forV. SinceDhas at leastnmany elements, there is adk+1~ ∈V which is not equal to any ofd~1, ..., ~dk.
Writedk+1~ =c1d~1+...+ckd~k+ck+1bk+1~ +...+cnb~n for some constantsc1, ..., cn, and observe that one of the constantsck+1, ..., cn must be nonzero sincedk+1 is independent fromd1, ..., dk. Then reorder the termsbk+1, ..., bnand use the Steinitz Exchange Lemma to replacebk+1withdk+1to obtain a new basis.
After finitely many steps we end up with the basisBn={d~1, ..., ~dn} ⊆ D. SinceBn spansV, it is not possible forDto contain any more elements, since nothing inV is linearly independent from the vectors
inBn. So in fact D=Bn and the proof is complete.
Definition 21. LetV be a vector space, and supposeV has a finite basisB={b~1, ..., ~bn}consisting of
nvectors. Then we say that thedimensionofV isn, or thatV isn-dimensional. We also write that dimV =n. Notice that every finite-dimensional spaceV has a unique dimension by Corollary 3.
IfV has no finite basis, we say thatV isinfinite-dimensional. Example 31. Determine the dimension of the following vector spaces.
(1) R.
(2) R2.
(3) R3.
(5) Pn.
(6) The trivial space{~0}.
(7) Let V be the set of all infinite sequences (a1, a2, a3, ...) of real numbers which eventualy end in an infinite string of 0’s. (Verify that V is a vector space using coordinate-wise addition and scalar multiplication.)
(8) The space P of all polynomials. (Verify that P is a vector space using standard polynomial addition and scalar multiplication.)
(9) Let V be the set of all continuous functionsf :R→R. (Verify thatV is a vector space using
function addition and scalar multiplication.
(10) Any line through the origin inR3.
(11) Any line through the origin inR2.
(12) Any plane through the origin inR3.
Corollary 4. Suppose V is a vector space with dimV = n. Then no linearly independent set in V
consists of more than nelements.
Proof. This follows from the proof of Corollary 3, since when consideringDwe only used the fact that
Dwas linearly independent, not spanning.
Corollary 5. Let V be finite-dimensional vector space. Any linearly independent set in V may be expanded to make a basis forV.
Proof. If a linearly independent set{d~1, ..., ~dn} already spansV then it is already a basis. If it doesn’t span, then we can find a linearly independent vector dn+1~ ∈/ Span{d~1, ..., ~dn} by Fact 6 and consider {d~1, ..., ~dn, ~dn+1}. If it spansV, then we’re done. Otherwise, continue adding vectors. By the previous
corollary, we know the process stops in finitely many steps.
Corollary 6. LetV be a finite-dimensional vector space. Any spanning set inV may be shrunk to make a basis for V.
Proof. LetS be a spanning set forV, i.e. V = SpanS. Without loss of generality we may assume that
~0∈/ S, for if~0 is inS, then we can toss it out and the set S will still spanV.
IfS is empty thenV = SpanS ={~0} andS is already a basis, and we are done.
IfS is not empty, then chooses~1∈S. IfV = Span{s~1}, we are done and{s~1}is a basis. Otherwise, observe that if every vector in S were in Span{s~1}, then we would have SpanS = Span{s~1}; since S
spansV and{s~1}does not, this is impossible. So there is somes~2∈S withs~2∈/ Span{s~1}, i.e. s~1 and
~
s2 are linearly independent.
If{s~1, ~s2} spansV, we are done. Otherwise we can find a third linearly independent vector s~3, and
so on. The process stops in finitely many steps.
Corollary 7. Let V be an n-dimensional vector space. Then a set of n vectors in V is linearly inde-pendent if and only if it spansV.
11
Coordinate Systems
Definition 22. LetV be a vector space andB={b~1, ..., ~bn}be a basis forV. Let~v∈v. By the Unique Representation Theorem, there are unique weights c1, ..., cn ∈ Rfor which ~v =c1b~1+...+cnb~n. Call
these constantsc1, ..., cnthecoordinates ofxrelative toB. Define thecoordinate representation of xrelative to Bto be the vector
[~v]B=
c1
... cn
∈Rn.
Example 32. Consider a basis{b~1, ~b2} forR2, whereb~1=
1 0
andb~2=
1 2
.
(1) Suppose a vector~v∈R2hasB-coordinate representation~v=
−2 3
. Find~v.
(2) Let w~ =
−1 8
. Find [w~]B.
Theorem 11. Let V be a vector space andB={b~1, ..., ~bn} be a basis forV. Define a mappingT :V →
Rn by T(~v) = [~v]B. ThenT is an isomorphism.
Proof. (T is one-to-one.) Letv~1, ~v2∈V be such that [v~1]B = [v~2]B. Thenv~1and v~2 have the same co-ordinates relative toB. Now the Unique Representation Theorem impies thatv~1=v~2. SoTis one-to-one.
(T is onto.) For every vector
c1
... cn
∈ Rn, there corresponds a vector~v =c1b~1+...+cnb~n ∈ V.
ObviouslyT(~v) =
c1
... cn
, so the mapT is onto.
(T is linear.) Let v~1, ~v2 ∈ V be arbitrary, and let k ∈ R be an arbitrary scalar. Use the Unique
Representation Theorem to find their unique coordinatesv~1=c1b~1+...+cnb~nandv~2=d1b~1+...+dnb~n. Now just check that the linearity conditions hold:
T(v~1+v~2) =T((c1b~1+...+cnb~n) + (d1b~1+...+dnb~n)) =T((c1+d1)b~1+...+ (cn+dn)b~n)
=
c1+d1
... cn+dn
=
c1
... cn
+
d1
... dn
=T(v~1) +T(v~2); and
T(k ~v1) =T(kc1b~1+...+kcnb~n)
=
kc1
... kcn
=k
c1
... cn
=kT(v~1).
Corollary 8. Any twon-dimensional vector spaces are isomorphic to one another (and to Rn).
Recall that isomorphisms are exactly the maps which perfectly preserve vector space structure. In other words, supposeV is a vector space andT :Rn is an isomorphism. Thenv~1, .., ~vn are independent inV if and only if T(v~1), ..., T(v~n) are independent inRn; the set{v~1, ..., ~vn}spansV if and only if the set {T(v~1), ..., T(v~n} spans Rn; a linear combinationc1v~1+...+cnv~n is equal tow~ in V if and only if
c1T(v~1) +...+cnT(v~n) is equal toT(w~) in W, etc., etc.
So the previous theorem and corollary imply that all problems in an n-dimensional vector spaceV
may be effectively translated into problems in Rn, which we generally know how to handle via matrix
computations.
Example 33. Determine whether or not the vectors 1 + 2x2, 4 +x+ 5x2, and 3 + 2x are linearly independent inP2.
Example 34. Determine whether or not 1 +x+x2∈Span{1 + 5x,−4x+ 2x2} in
P2.
Example 35. Define a mapT :R3→P2 byT
a b c
= (a+b)x2+cfor every a, b, c∈R.
(1) IsT a linear transformation? (2) IsT one-to-one?
Example 36. Letv~1=
3 6 2
andv~2 =
−1 0 1
, and let B={v~1, ~v2}. The vectors inB are linearly
independent, soBis a basis forV = Span{v~1, ~v2}. Letw~ =
3 12
7
. Determine ifw~ is inV, and if it is,
find [w~]B.
Fact 10 (Change of Basis: Non-Standard into Standard). Let ~v ∈Rn and let B={b~1, ..., ~bn} be any basis for Rn. Then there exists a unique matrixPB with the property that
~v=PB[~v]B.
We call PB the change-of-coordinates matrix fromB to the standard basis in Rn. Moreover, we
have
PB=
b~1 ... b~n
.
Example 37. Let b~1 =
1 −1
3
, b~2 =
2 0 8
, and b~3 =
2 −2
4
. Then b~1, ~b2, ~b3 are three linearly independent vectors and henceB={b~1, ~b2, ~b3}forms a basis for the three-dimensional spaceR3. Suppose
[~v]B=
1 −2
3
. Find~v.
The previous fact gives us an easy tool for converting a vector inRnfrom its non-standard coordinate
representation [~v]B into its standard representation~v. What about going the opposite direction, i.e.
converting from standard coordinates to non-standard? The idea will not be hard: since we can write
~v=PB[~v]B using aB-to-standard matrixPB, we would like to be able to write [~v]B=PB−1~vand regard
the inverse matrix PB−1 as a change-of-basis from standard coordinates to B-coordinates. In order to justify this idea carefully, we first need to recall a few facts about inverting matrices.
12
Matrix Inversion
Definition 23. For everyn∈N, there is a uniquen×nmatrixI with the property that AI=IA=A
for everyn×nmatrixA. This matrix is given by
I=
e~1 ... e~n
wheree~1, ..., ~en are the standard basis vectors forRn.
Let A be an n×n matrix. If there exists a matrix C for which AC = CA= I, then we say A is invertibleand we callC theinverseofA. We denoteA−1=C, so
AA−1=A−1A=I. Fact 11. Let A=
a b c d
. If ad−bc6= 0, thenA is invertible and
A−1= 1
ad−bc
d −b
−c a
.
If ad−bc= 0, thenA is not invertible.
Example 38. Find the inverse ofA=
3 4 5 6
and verify that it is an inverse.
Example 39. Give an example of a matrix which is not invertible.
Theorem 12. If A is an invertible n×n matrix, then for each w~ ∈Rn, there is a unique~v ∈
Rn for
whichA~v=w~.
Proof. Set~v=A−1w~. ThenA~v =AA−1w~ =I ~w=w~. If~uwere another vector with the property that
A~u=w~ =A~v, then we would have A−1A~u=A−1A~v and henceI~u=I~v. So~u=~v and this vector is
unique.
Fact 12. LetA be ann×nmatrix. If Ais row equivalent toI, then[A|I]is row equivalent to[I|A−1]. OtherwiseA is not invertible.
Example 40. Find the inverse ofA=
0 1 2
1 0 3
4 −3 8
, if it exists.
Example 41. FindA−1 if possible. (1) A=
0 3 −1
1 0 1
1 −1 0
(2) A=
1 −2 −1
−1 5 6
5 −4 5
13
The Connection Between Isomorphisms and Invertible Matrices
Definition 24. Let V be a vector space. Define the identity map with respect to V, denoted idV : V → V, to be the function which leaves V unchanged, i.e. idV(v~) =~v for every~v ∈ V. It is easy to verify mentally that idV is a one-to-one and onto linear transformation, i.e. an isomorphism.
Now letW be another vector space and letT :V →W be an isomorphism. Then there is a unique inverse mapT−1:W →V, with the property that
T−1◦T = idV andT◦T−1= idW,
or
T−1(T(~v)) =~v for every~v∈V andT(T−1(w~)) =w~ for every w~ ∈W.
Example 42. Let T : R3 →
P2 be defined by T
a b c
= ax2 +bx+c. We proved T is an
isomorphism on the homework. Find an expression forT−1.
Theorem 13. LetT :Rn →Rn be an isomorphism and letAbe its uniquen×nmatrix representation
as guaranteed by 3. ThenAis invertible and the matrix representation of T−1 isA−1.
Proof. First notice that ifT is an isomorphism, it is one-to-one and hence the columns ofAare linearly independent by Fact 5. This means that A may be row reduced to the n×n identity matrix I, and henceAis invertible by Fact 12.
Notice that the uniquen×nmatrix representation of the identity map idRnmust be then×nidentity
matrixI. It is sufficient to notice that idRn(e~k) =e~k =I ~ekfor every standard basis vectore~k, 1≤k≤n.
Now let~v∈Rn be arbitrary. Set~u=A−1~v. Notice that
T(~u) =A~u=A(A−1~v) = (AA−1)~v=I~v=~v.
Taking T−1 of both sides above, we getT−1(~v) = ~u=A−1~v. SoT−1 and A1 send~v to the same place for every~v, i.e. A−1 is a matrix representation forT−1. Since this matrix representation must be
unique, we are done.
Example 43. Let T : Rn →Rn be the linear transformation defined by the rulesT(e~1) =
3 5
and
T(e~2) =
4 6
.
(1) Prove that T is an isomorphism. (2) Find an explicit formula for T−1.
Solution. (1)T has matrix representationA=
3 4 5 6
by Theorem 3. Ais invertible by Fact 11 since 3·6−4·56= 0. SoT is invertible, i.e. T is an isomorphism.
(2) By Fact 11, we haveA−1=−1 2
6 −4 −5 3
. Then by Theorem 13, we haveT(~v) =−1 2
6 −4 −5 3
~v
for every~v∈R2.
14
More Change of Basis
Fact 13(Change of Basis: Standard into Non-Standard). LetB={b~1, ..., ~bn}be a basis forRn. LetPB
be the uniquen×nfor which
for every ~v ∈Rn, so PB is the change-of-coordinates matrix from B to the standard basis. Then PB is
always invertible since its columns are linearly independent, and we call the inversePB−1 the change-of-coordinates matrix from the standard basis to B. This matrix has the property that
[~v]B=PB−1~v
for every~v∈Rn.
Example 44. LetB=
2 1
,
−1 1
. Find a matrixAsuch that [~v]B=A~vfor every~v∈R2.
Example 45. LetB=
1 −3
,
−2 4
andC =
−7 9
,
−5 7
. Find a matrixA for which
[~v]C =A[v~]B for every~v∈R2.
Fact 14. Let B={b~1, ..., ~bn} andC={c~1, ..., ~cn}be two bases forRn. Then there exists a uniquen×n
matrix, which we denote P
C←B, with the property that
[~v]C = P C←B[~v]B
for every~v∈Rn. We call P
C←B the change-of-coordinates matrix fromB toC.
Notice that
P
C←B=P −1
C ·PB.
We may also write
P
C←B=
a~1 ... a~n
.
wherea~1= [b~1]C, ..., a~n = [b~n]C.
Fact 15. Let B andC be any two bases forRn. Then P
C←Bis invertible, and
P
C←B
−1
= P
B←C.
Example 46. LetBandC be as in the previous example. (1) Find P
C←B.
(2) Find P
B←C.
Example 47. Suppose C = {c~1, ~c2} is a basis for R2, and B = {b~1, ~b2} where b~1 = 4c~1+c~2 and
~
b2=−6c~1+c~2. Suppose [~v]B=
3 1
. Find [~v]C.
15
Rank and Nullity
Definition 25. Let V and W be vector spaces and T : V →W a linear transformation. Define the rankofT, denoted rankT, by
rankT = dim RanT.
Intuitively we think of the nullity of a linear transformationT as being the amount of “information lost” by the map, and the rank is the amount of “information preserved,” as the next example may help illustrate.
Example 48. Compute rankT and dim NulT for the following maps. (1) T :R3→R2, T
x y z
=
y z
.
(2) T :R3→R3, T
x y z
=
x+y
2x+ 2y
0
.
(3) T :R→R2,T(x) =
−2x x
.
(4) T :R5→R3, T
x1
x2
x3
x4
x5
=
0 0 0
.
Theorem 14(Rank-Nullity Theorem). LetV, W be finite-dimensional vector spaces and letT :V →W
be a linear transformation. Then
rankT+ dim NulT = dimV.
Proof. Suppose dim NulT = k and dimV = n. Obviously k ≤ n since NulT is a subspace of
V. Let {b~1, ..., ~bk}} be a basis for NulT. By Corollary 5 we may expand this set to form a basis {b~1, ..., ~bk, ~bk+1, ..., ~bn} for the larger space V. We claim that the set {T(bk+1~ ), ..., T(b~n)} comprises a basis for RanT, and hence rankT = dim RanT =n−k.
First we must show thatT(bk+1~ ), ..., T(b~n) are linearly independent. To see this, consider the equation
ck+1T(bk+1) +~ ...+cnT(b~n) =~0.
SinceT is a linear transformation, we may rewrite the above as
T(ck+1bk+1~ +...+cnb~n) =~0.
It follows that the linear combinationck+1bk+1~ +...+cnb~n is in the null space NulT. This means we can write it as a linear combination of the elements of our basis{b~1, ..., ~bk} for NulT:
ck+1bk+1~ +...+cnb~n=c1b~1+...+ckb~k
for somec1, ..., ck∈R. Moving all terms to one side of the equation, we get:
c1b~1+...+ckb~k−ck+1bk+1~ −...−cnb~n=~0.
In other words we have written~0 as a linear combination of the basis elements{b~1, ..., ~bn}forV. Since these are linearly independent, all the coefficients must be 0, i.e. c1 =... =ck =ck+1 =... =cn = 0. This shows thatT(bk+1)~ , ..., T(b~n) are linearly independent.
It remains only to show that {T(bk+1)~ , ..., T(b~n)} forms a spanning set for RanT. To that end, let
~
w∈RanT, so there exists a~v∈V withT(~v) =w~. Write~vas a linear combination of the basis elements forV:
~v=c1b~1+...+ckb~k+ck+1bk+1~ +...+cnb~n.
Notice thatT(b~1) =...=T(b~k) =~0 sinceb~1, ..., ~bk ∈NulT. So, applying the mapT to both sides of the equation above, we get:
~
w=T(~v)
=T(c1b~1+...+ckb~k+ck+1bk+1~ +...+cnb~n)
=c1T(b~1) +...+ckT(b~k) +ck+1T(bk+1~ ) +...+cnT(b~n) =c1~0 +...+ck~0 +ck+1T(bk+1) +~ ...+cnT(b~n)
=ck+1T(bk+1) +~ ...+cnT(b~n).
So we have written w~ as a linear combination of the elements of {T(bk+1)~ , ..., T(b~n)}, and since w~
was arbitrary, we see that they span RanT. So {T(bk+1)~ , ..., T(b~n)} is a basis for RanT and hence
rankT =n−k. The conclusion of the theorem now follows.
Theorem 15. Let V and W both be n-dimensional vector spaces, and let T : V → W be a linear transformation. Then the following statements are all equivalent.
(1) T is an isomorphism. (2) T is one-to-one and onto. (3) T is invertible.
(4) T is one-to-one. (5) T is onto. (6) NulT ={~0}. (7) dim NulT= 0. (8) RanT =W. (9) rankT =n.
Furthermore, if V =W =Rn andA is the uniquen×nmatrix representation ofT, then the following
statements are all equivalent, and equivalent to the statements above. (10) A is invertible.
(11) A is row-equivalent to then×nidentity matrixI. (12) The equationA~x=~0 has only the trivial solution. (13) The columns of Aform a basis for Rn.
(14) The columns of Aare linearly independent. (15) The columns of Aspan Rn.
(16) ColA=Rn. (17) dim ColA=n.
Proof. We first check that (5) ⇔ (8) ⇔ (9). We have (5) ⇔ (8) by Theorem 7. If (8) holds and RanT =W, then rankT = dim RanT = dimW =n, so (9) holds. Conversely if (9) holds, then RanT
admits a basisBconsisting ofnvectors. The vectors inBare linearly independent inW and hence they spanW by Corollary 7. SoW = SpanB= RanTand hence (8) holds; so (5), (8), and (9) are equivalent. We also check that (4)⇔(6)⇔(7). (4)⇔(6) follows from Theorem 5 and (6)⇔(7) is obvious from the definition of dimension.
By the Rank-Nullity Theorem 14, we have (7)⇔(9). So all of statements (4)−(9) are equivalent. Statements (1)−(3) are obviously equivalent, and obviously imply (4)−(9). If any of statements (4)−(9) hold, then all of them do- in particular,T will be one-to-one and onto, so (1)−(3) will hold. This con-cludes the proof of the first half of the theorem.
Now suppose we have V = W = Rn and A is the n×n matrix representation of T. We have (1)⇒(10) by Theorem 13. If (10) holds andAis invertible, then we may define a mapT−1:
Rn→Rn
byT−1(~v) =A−1~vfor every~v∈
Rn; it is easy to check that thisT−1is truly the inverse ofT, and hence
(3) holds. So (10) holds if and only if (1)−(9) holds.
We have (10)⇔(11) by Fact 12. (11) holds if and only ifAis row-equivalent toI, if and only if the augmented matrix [A |~0] is row-equivalent to [I |~0], if and only if the equationA~x=~0 has only one
solution, i.e. (12) holds. So (11)⇔(12). So (1)−(12) are all equivalent.
Since RanT = ColA by Theorem 8, we have (16) ⇔ (8) and (17) ⇔ (9). So (1)−(12) are each equivalent to (16) and (17).
We have (15)⇔(16) by the definition of ColA, and we have (13)⇔(14)⇔(15) by Corollary 7. So
(1)−(17) are all equivalent as promised.
16
Determinants and the LaPlace Cofactor Expansion
Theorem 15 says that linear transformations are actually isomorphisms whenever their matrix repre-sentations are invertible. When is a givenn×nmatrix invertible? We have easy ways to check ifn= 1 orn= 2, but how can one tell if, say, a 52×52 matrix is invertible? We hope to develop a computable formula for checking the invertibility of an arbitrary square matrix, which we will call the determinant of that matrix.
There are many different ways to introduce and define the notion of a determinant for a general
n×nmatrixA. In the interest of brevity, we will restrict our attention here to two major approaches. The first method of computing a determinant is given implicitly in the upcoming Definition 28. This definition best conveys the intuition that the determinant of a matrix characterizes its invertibility or non-invertibility, and is computationally easiest for large matrices. The second method for computing determinants is the LaPlace expansion or cofactor expansion, which is probably computationally easier for small matrices, and which will prove very useful in the next section on eigenvalues and eigenvectors.
Definition 26. (1) Let A= [a] be a 1×1 matrix. Thedeterminant ofA, denoted detA, is the number detA=a.
(2) LetA=
a b c d
be a 2×2 matrix. ThedeterminantofA, also denoted detA, is the number detA=ad−bc.
Notice that in both cases above, the matrix Ais invertible if and only if detA6= 0. This statement is obvious for (1), and for (2) it follows from Fact 11, which the reader may be asked to prove on the homework.
We wish to generalize this idea for n×n matrices A of arbitrary sizen, i.e. we hope to find some computable number detA with the property that A is invertible if and only if detA 6= 0. Of course
A is invertible if and only if it is row-equivalent to the n×n identity matrix I, so there must be a strong relationship between the determinant detAand the elementary row operations we use to reduce a matrix. Building on this vague observation, in the next example we investigate the effect of the row operations on detAfor 2×2 matricesA.
Example 49. LetA=
a b c d