Massively Multivariable
Open Online Calculus
Jim Fowler and Steve Gubkin
EXPERIMENTAL DRAFT
1
An n-dimensional space
We package lists of numbers as “vectors.”
In this course we will be studying calculus of many variables. That means that instead of just seeing how one quantity depends on another, we will see how two quantities could affect a third, or how five inputs might cause changes in three outputs. The very first step of this journey is to give a convenient mathematical framework for talking about lists of numbers. To that end we define:
Definition 1 $\mathbb{R}^n$ is the set of all ordered lists containing n real numbers. That is,
$\mathbb{R}^n = \{(x_1, x_2, \ldots, x_n) : x_1, x_2, \ldots, x_n \in \mathbb{R}\}$.
The number n is called the dimension of $\mathbb{R}^n$, and $\mathbb{R}^n$ is called n-dimensional space. When speaking aloud, it is also acceptable to say “are en.” We call the elements of $\mathbb{R}^n$ points or n-tuples.
Example 2 $\mathbb{R}^1$ is just the set of all real numbers, which is often visualized by the number line, which is 1-dimensional.
[Figure: a number line with ticks at −1, 0, 1, 2, 3.]
Example 3 $\mathbb{R}^2$ is the set of all pairs of real numbers, like (2, 5) or (1.54, π). This can be visualized by the coordinate plane, which is 2-dimensional.
[Figure: the coordinate plane with the point (1, 2) marked.]
Example 4 $\mathbb{R}^3$ is the set of all triples of real numbers. It can be visualized as 3-dimensional space, with three coordinate axes.
Question 5 If (3, 2, e, 1.4) ∈ $\mathbb{R}^n$, what is n?
Solution
Hint: n-dimensional space consists of ordered n-tuples of numbers. How many coordinates does (3, 2, e, 1.4) have?
Hint: n = 4
n = 4
Question 7 Which point is farthest from the point (0, 0)?
Solution
Hint: [Figure: the points (0, 1), (1, 1), (−2, 3), and (1, 4) plotted in the plane.]
(a) (0, 1)
(b) (1, 1)
(c) (−2, 3)
(d) (1, 4) X
It becomes quite difficult to visualize high dimensional spaces. You can sometimes visualize a higher dimensional object by having 3 spatial dimensions and one color dimension, or 3 spatial dimensions and one time dimension to get a movie. Sometimes you can project a higher dimensional object into a lower dimensional space. If you have the time, you should watch the excellent film Dimensions, which
2
Vector spaces
Vector spaces are where vectors live.
It will be convenient for us to equip $\mathbb{R}^n$ with two algebraic operations: “vector addition” and “scalar multiplication” (to be defined soon). This additional structure will transform $\mathbb{R}^n$ from a mere set into a “vector space.” To distinguish between $\mathbb{R}^n$ as a set and $\mathbb{R}^n$ as a vector space, we write elements of the set $\mathbb{R}^n$ as ordered lists, such as
$\mathbf{p} = (x_1, x_2, x_3, \ldots, x_n)$,
but elements of the vector space $\mathbb{R}^n$ will be written as vertically oriented lists flanked by square brackets, like this:
$\vec{v} = \begin{bmatrix}x_1\\x_2\\x_3\\\vdots\\x_n\end{bmatrix}$
We will try to stick to the convention that bold letters like $\mathbf{p}$ represent points, while letters with little arrows above them (like $\vec{v}$) represent vectors.
Unfortunately (like practically everybody else in the world), we use the same symbol $\mathbb{R}^n$ to refer to both the vector space $\mathbb{R}^n$ and the underlying set of points $\mathbb{R}^n$.
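Vector addition is defined as follows:
$\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} + \begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix} = \begin{bmatrix}x_1 + y_1\\x_2 + y_2\\\vdots\\x_n + y_n\end{bmatrix}$
Warning 1 You cannot add vectors in $\mathbb{R}^n$ and $\mathbb{R}^m$ unless n = m.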
An element of $\mathbb{R}$ is a number, but it is also called a “scalar” in this context, and vectors can be multiplied by scalars as follows:
$c\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} = \begin{bmatrix}cx_1\\cx_2\\\vdots\\cx_n\end{bmatrix}$
Warning 2 We have not yet defined a notion of multiplication for vectors. You might think it is reasonable to define
$\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}\begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix} = \begin{bmatrix}x_1 y_1\\x_2 y_2\\\vdots\\x_n y_n\end{bmatrix}$,
but actually this operation is not especially useful, and will never be utilized in this course. We will have a notion of “vector multiplication” called the dot product, but that is not the (faulty) definition above.
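Question 3 What is $\begin{bmatrix}1\\2\\3\end{bmatrix} + \begin{bmatrix}3\\-2\\4\end{bmatrix}$?
Solution
Hint: $\begin{bmatrix}1\\2\\3\end{bmatrix} + \begin{bmatrix}3\\-2\\4\end{bmatrix} = \begin{bmatrix}1 + 3\\2 + (-2)\\3 + 4\end{bmatrix} = \begin{bmatrix}4\\0\\7\end{bmatrix}$
Question 4 What is $3\begin{bmatrix}3\\-2\\4\end{bmatrix}$?
Solution
Hint: $3\begin{bmatrix}3\\-2\\4\end{bmatrix} = \begin{bmatrix}3(3)\\3(-2)\\3(4)\end{bmatrix} = \begin{bmatrix}9\\-6\\12\end{bmatrix}$
Question 5 If $\vec{v}_1 = \begin{bmatrix}3\\-2\end{bmatrix}$, $\vec{v}_2 = \begin{bmatrix}1\\5\end{bmatrix}$, and $\vec{v}_3 = \begin{bmatrix}1\\1\end{bmatrix}$, can you find $a, b \in \mathbb{R}$ so that $a\vec{v}_1 + b\vec{v}_2 = \vec{v}_3$?
Solution
Hint:
$\begin{cases}3a + b = 1\\-2a + 5b = 1\end{cases}$
$\begin{cases}15a + 5b = 5\\-2a + 5b = 1\end{cases}$
$\begin{cases}17a = 4\\-2a + 5b = 1\end{cases}$
$\begin{cases}a = \frac{4}{17}\\-2\left(\frac{4}{17}\right) + 5b = 1\end{cases}$
$\begin{cases}a = \frac{4}{17}\\b = \frac{5}{17}\end{cases}$
a = 4/17
Solution
b = 5/17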
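The answer to Question 5 can be double-checked with a few lines of Python. This is only a sketch: the helper name `solve_2x2` is made up for illustration, and it uses Cramer's rule rather than the elimination steps shown in the hint, with exact fractions so that no rounding creeps in.

```python
from fractions import Fraction

def solve_2x2(v1, v2, target):
    """Solve a*v1 + b*v2 = target for (a, b) using Cramer's rule.

    v1, v2 and target are 2-entry lists; solve_2x2 is a hypothetical helper,
    not something defined in the course materials.
    """
    det = v1[0] * v2[1] - v2[0] * v1[1]
    if det == 0:
        raise ValueError("v1 and v2 are parallel; there is no unique solution")
    a = Fraction(target[0] * v2[1] - v2[0] * target[1], det)
    b = Fraction(v1[0] * target[1] - target[0] * v1[1], det)
    return a, b

# The vectors from Question 5
a, b = solve_2x2([3, -2], [1, 5], [1, 1])
print(a, b)
```

Plugging the result back in, 3a + b and −2a + 5b should both come out to exactly 1.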
3
Geometry
Vectors can be viewed geometrically.
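Graphically, we depict a vector $\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}$ in $\mathbb{R}^n$ as an arrow whose base is at the origin and whose head is at the point $(x_1, x_2, \ldots, x_n)$. For example, in $\mathbb{R}^2$ we would depict the vector $\vec{v} = \begin{bmatrix}3\\4\end{bmatrix}$ as follows
[Figure: the vector $\vec{v}$ drawn as an arrow from the origin.]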
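[Figure: a vector $\vec{w}$ drawn in the plane.]
Solution
Hint: Consider whether the x and y coordinates are positive or negative.
(a) $\begin{bmatrix}-4\\2\end{bmatrix}$ X
(b) $\begin{bmatrix}3\\-3\end{bmatrix}$
(c) $\begin{bmatrix}-4\\-2\end{bmatrix}$
(d) $\begin{bmatrix}4\\2\end{bmatrix}$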
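Question 2 On a sheet of paper, draw the vector $\vec{v} = \begin{bmatrix}3\\1\end{bmatrix}$. Click the hint to see if you got it right.
Hint: [Figure: the vector $\vec{v}$.]
Question 3 $\vec{v}_1$ and $\vec{v}_2$ are drawn below. Redraw them on a sheet of paper, and also draw their sum $\vec{v}_1 + \vec{v}_2$. Click the hint to see if you got it right.
[Figure: the vectors $\vec{v}_1$ and $\vec{v}_2$.]
Hint: [Figure: the vectors $\vec{v}_1$, $\vec{v}_2$, and $\vec{v}_1 + \vec{v}_2$.]
Question 4 $\vec{v}$ is drawn below. Redraw it on a sheet of paper, and also draw $3\vec{v}$. Click the hint to see if you got it right.
[Figure: the vector $\vec{v}$.]
Hint: [Figure: the vectors $\vec{v}$ and $3\vec{v}$.]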
You may have noticed that you can sum vectors graphically by forming a parallelogram.
You also may have noticed that multiplying a vector by a scalar leaves the vector pointing in the same direction but “scales” its length. That is the reason we call real numbers “scalars” when they are coefficients of vectors: it is to remind us that they act geometrically by scaling the vector.
4
Span
Vectors can be combined; all those combinations form the “span.”
Definition 1 We say that a vector $\vec{w}$ is a linear combination of the vectors $\vec{v}_1, \vec{v}_2, \vec{v}_3, \ldots, \vec{v}_k$ if there are scalars $a_1, a_2, \ldots, a_k$ so that $\vec{w} = a_1\vec{v}_1 + a_2\vec{v}_2 + \cdots + a_k\vec{v}_k$.
Definition 2 The span of a set of vectors $\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k \in \mathbb{R}^n$ is the set of all linear combinations of the vectors. Symbolically, $\mathrm{span}(\vec{v}_1, \vec{v}_2, \ldots, \vec{v}_k) = \{a_1\vec{v}_1 + a_2\vec{v}_2 + \cdots + a_k\vec{v}_k : a_1, a_2, \ldots, a_k \in \mathbb{R}\}$.
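Example 3 The span of $\begin{bmatrix}1\\0\\0\end{bmatrix}, \begin{bmatrix}0\\1\\0\end{bmatrix}$ is all vectors of the form $\begin{bmatrix}x\\y\\0\end{bmatrix}$ for some $x, y \in \mathbb{R}$.
Example 4 $\begin{bmatrix}8\\13\end{bmatrix}$ is in the span of $\begin{bmatrix}2\\3\end{bmatrix}$ and $\begin{bmatrix}4\\7\end{bmatrix}$ because $2\begin{bmatrix}2\\3\end{bmatrix} + \begin{bmatrix}4\\7\end{bmatrix} = \begin{bmatrix}8\\13\end{bmatrix}$.
Question 5 Is $\begin{bmatrix}3\\4\\2\end{bmatrix}$ in the span of $\begin{bmatrix}1\\2\\0\end{bmatrix}$ and $\begin{bmatrix}3\\-3\\0\end{bmatrix}$?
Solution
Hint: The linear combinations of $\begin{bmatrix}1\\2\\0\end{bmatrix}$ and $\begin{bmatrix}3\\-3\\0\end{bmatrix}$ are all the vectors of the form $a\begin{bmatrix}1\\2\\0\end{bmatrix} + b\begin{bmatrix}3\\-3\\0\end{bmatrix}$ for scalars $a, b \in \mathbb{R}$. Could $\begin{bmatrix}3\\4\\2\end{bmatrix}$ be written in such a form?
Hint: No, because the last coordinate of all of these vectors is 0. In fact, graphically, the span of these two vectors is just the entire xy-plane, and $\begin{bmatrix}3\\4\\2\end{bmatrix}$ lives off of that plane.
(a) Yes, it is in the span of those two vectors.
(b) No, it is not in the span of those two vectors. X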
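Linear combinations are easy to experiment with in Python. The sketch below stores vectors as plain lists; the function name `linear_combination` is an illustrative choice, not part of the course code. It recomputes Example 4 and tries one sample combination from Question 5.

```python
def linear_combination(scalars, vectors):
    """Return a1*v1 + a2*v2 + ... + ak*vk, with each vector stored as a list."""
    n = len(vectors[0])
    result = [0] * n
    for a, v in zip(scalars, vectors):
        for i in range(n):
            result[i] += a * v[i]
    return result

# Example 4: 2*(2, 3) + 1*(4, 7)
example_4 = linear_combination([2, 1], [[2, 3], [4, 7]])
print(example_4)

# Question 5: one sample combination of (1, 2, 0) and (3, -3, 0);
# the last coordinate stays 0 no matter which scalars are chosen
sample = linear_combination([5, -2], [[1, 2, 0], [3, -3, 0]])
print(sample)
```

Trying a handful of scalars does not prove anything about the whole span, but it is a quick way to see the pattern that the hint describes.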
5
Functions
A function relates inputs and outputs.
Definition 1 A function f from a set A to a set B is an assignment of exactly one element of B to each element of A. If a is an element of A, we write f(a) for the element of B which is assigned to a by f.
We call A the domain of f, and B the codomain of f. We will also commonly write f : A → B, which we read out loud as “f from A to B” or “f maps A to B.”
Example 2 Let W = {yes, no} and A = {Dog, Cat, Walrus}. Let f : A → W be the function which assigns to each animal in A the answer to the question “Is this animal commonly a pet?” Then f(Dog) = yes, f(Cat) = yes, and f(Walrus) = no.
In this case, A is the domain, and W is the codomain.
In these activities, we mostly study functions from $\mathbb{R}^n$ to $\mathbb{R}^m$.
Question 3 Let $g : \mathbb{R}^1 \to \mathbb{R}^2$ be defined by g(θ) = (cos(θ), sin(θ)).
Solution
Warning 4 In everything that follows, cos and sin are in terms of radians.
What is g(π/6)? Give your answer as a vertical column of numbers.
Hint: g(π/6) = (cos(π/6), sin(π/6))
Hint: If you remember your trig facts, this is (√3/2, 1/2). Format this as $\begin{bmatrix}\sqrt{3}/2\\1/2\end{bmatrix}$ for this question.
Can you imagine what would happen to the point g(θ) as θ moved from 0 to 2π?
Question 5 Let $h : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by h(x, y) = (x, −y).
What is h(2, 1)? Format your answer as a vertical column of numbers.
Solution
Hint: [Figure: the points (2, 1) and h(2, 1) plotted in the plane.]
Hint: h takes any point (x, y) to its reflection in the x-axis.
Hint: Format your answer as $\begin{bmatrix}2\\-1\end{bmatrix}$
Try to understand this function graphically. How does it transform the plane? The hint reveals the answer to this question.
Question 6 Let $f : \mathbb{R}^4 \to \mathbb{R}^2$ be defined by $f((x_1, x_2, x_3, x_4)) = (x_1 x_2 + x_3, x_4^2 + x_1)$.
Solution
Hint: $f(3, 4, 1, 9) = (3 \cdot 4 + 1, 9^2 + 3) = (13, 84)$.
Hint: Format this as $\begin{bmatrix}13\\84\end{bmatrix}$
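The three functions from Questions 3, 5, and 6 can be written directly in Python, which makes it easy to check the hinted values. This is a minimal sketch in which points are returned as tuples; that representation is an assumption made here for convenience.

```python
import math

def g(theta):
    """g : R^1 -> R^2, g(theta) = (cos(theta), sin(theta)), with theta in radians."""
    return (math.cos(theta), math.sin(theta))

def h(x, y):
    """h : R^2 -> R^2, reflection in the x-axis."""
    return (x, -y)

def f(x1, x2, x3, x4):
    """f : R^4 -> R^2, f(x1, x2, x3, x4) = (x1*x2 + x3, x4**2 + x1)."""
    return (x1 * x2 + x3, x4 ** 2 + x1)

print(g(math.pi / 6))
print(h(2, 1))
print(f(3, 4, 1, 9))
```

Because `g` uses floating point, its output will only agree with (√3/2, 1/2) up to rounding error.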
6
Composition
One way to build new functions is via “composition.”
Practically the most important thing you can do with functions is to compose them.
Definition 1 Let f : A → B and g : B → C. Then there is another function (g ∘ f) : A → C defined by (g ∘ f)(a) = g(f(a)) for each a ∈ A. It is called the composition of g with f.
Warning 2 The composition is only defined if the codomain of f is the domain of g.
Question 3 Let A = {cat, dog}, B = {(2, 3), (5, 6), (7, 8)}, C = $\mathbb{R}$. Let f be defined by f(cat) = (2, 3) and f(dog) = (7, 8). Let g be defined by the rule g((x, y)) = x + y.
Solution
Hint: First, (g ∘ f)(cat) = g(f(cat)).
Hint: Then note that f(cat) = (2, 3).
Hint: So this is g((2, 3)) = 2 + 3 = 5.
(g ∘ f)(cat) = 5
Question 4 Let $h : \mathbb{R}^2 \to \mathbb{R}^3$ be defined by $h(x, y) = (x^2, xy, y)$, and let $\omega : \mathbb{R}^3 \to \mathbb{R}^2$ be defined by $\omega(x, y, z) = (xyz, z)$.
Solution
Hint: $(\omega \circ h)(x, y) = \omega(h(x, y)) = \omega(x^2, xy, y) = ((x^2)(xy)(y), y) = (x^3 y^2, y)$
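Composition itself can be modeled as a Python function that takes two functions and returns a third. The sketch below uses the maps h and ω from Question 4; the helper name `compose` and the choice to pass points as tuples are assumptions made for illustration.

```python
def compose(g, f):
    """Return the function g ∘ f, defined by (g ∘ f)(a) = g(f(a))."""
    def g_after_f(a):
        return g(f(a))
    return g_after_f

def h(point):
    """h : R^2 -> R^3, h(x, y) = (x^2, xy, y)."""
    x, y = point
    return (x ** 2, x * y, y)

def omega(point):
    """omega : R^3 -> R^2, omega(x, y, z) = (xyz, z)."""
    x, y, z = point
    return (x * y * z, z)

omega_after_h = compose(omega, h)
print(omega_after_h((2, 3)))
```

For the input (2, 3) the formula from the hint gives (2³ · 3², 3), which is a handy value to compare against.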
7
Higher-order functions
Sometimes functions act on functions.
Functions from $\mathbb{R}^n \to \mathbb{R}^m$ are not the only useful kind of function. While such functions are our primary object of study in this multivariable calculus class, it will often be helpful to think about “functions of functions.” The next examples might seem a bit peculiar, but later on in the course these kinds of mappings will become very important.
Question 1 Let $C[0,1]$ be the set of all continuous functions from [0, 1] to $\mathbb{R}$. Define $I : C[0,1] \to \mathbb{R}$ by $I(f) = \int_0^1 f(x)\,dx$.
Solution
Hint: $I(g) = \int_0^1 g(x)\,dx = \int_0^1 x^2\,dx = \left.\tfrac{1}{3}x^3\right|_0^1 = \tfrac{1}{3}(1 - 0) = \tfrac{1}{3}$.
If $g(x) = x^2$, then I(g) = 1/3
Question 2 Let $C^\infty(\mathbb{R})$ be the set of all infinitely differentiable (“smooth”) functions on $\mathbb{R}$. Define $Q : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ by $Q(f)(x) = f(0) + f'(0)x + \frac{f''(0)}{2}x^2$.
Solution
Hint: $f''(x) = -\cos(x)$, so $f''(0) = -\cos(0) = -1$
f''(0) = -1
Hint: So $Q(f)(x) = 1 - \frac{x^2}{2}$
If f(x) = cos(x), then $Q(f)(x) = 1 - x^2/2$
This is an example of a function which eats a function and spits out another function. In particular, this takes a function and returns the second order Maclaurin polynomial of that function.
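Question 6 Define $\mathrm{dot}_n : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$ by $\mathrm{dot}_n((x_1, x_2, \ldots, x_n), (y_1, y_2, \ldots, y_n)) = x_1 y_1 + x_2 y_2 + x_3 y_3 + \cdots + x_n y_n$.
Solution
Hint: $\mathrm{dot}_3((2, 4, 5), (0, 1, 4)) = 2(0) + 4(1) + 5(4) = 24$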
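Two of the maps above translate naturally into Python. The sketch below is only illustrative: `dot` follows the definition of dot_n directly, while `I` approximates the integral with a midpoint Riemann sum (the number of steps is an arbitrary choice made here), so it returns an approximation rather than the exact value.

```python
def dot(v, w):
    """dot_n : R^n x R^n -> R, the sum of the coordinatewise products."""
    if len(v) != len(w):
        raise ValueError("both inputs must have the same number of coordinates")
    return sum(x * y for x, y in zip(v, w))

def I(f, steps=100000):
    """Approximate the integral of f over [0, 1] with a midpoint Riemann sum."""
    width = 1.0 / steps
    return sum(f((k + 0.5) * width) for k in range(steps)) * width

print(dot((2, 4, 5), (0, 1, 4)))
print(I(lambda x: x ** 2))
```

Note that `I` takes a function as its input, which is exactly the “function of a function” idea from Question 1.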
8
Currying
Higher-order functions provide a different perspective on functions that take many inputs.
Definition 1 Let A and B be two sets. The product A × B of the two sets is the set of all ordered pairs A × B = {(a, b) : a ∈ A and b ∈ B}.
Example 2 If A = {1, 2, Wolf} and B = {4, 5}, then A × B = {(1, 4), (1, 5), (2, 4), (2, 5), (Wolf, 4), (Wolf, 5)}.
Example 3 We write $\mathbb{R}^2$ for pairs of real numbers, but we could have written $\mathbb{R} \times \mathbb{R}$ instead.
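Question 4 Let $\mathrm{Func}(\mathbb{R}, \mathbb{R})$ be the set of all functions from $\mathbb{R}$ to $\mathbb{R}$. Define $\mathrm{Eval} : \mathbb{R} \times \mathrm{Func}(\mathbb{R}, \mathbb{R}) \to \mathbb{R}$ by Eval(x, f) = f(x).
Solution
Hint: Eval(−3, g) = g(−3) = |−3| = 3
If g(x) = |x|, then Eval(−3, g) = 3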
Question 5 Let Func(A, B) be the set of all functions from A to B for any two sets A and B.
Let $\mathrm{Curry} : \mathrm{Func}(\mathbb{R}^2, \mathbb{R}) \to \mathrm{Func}(\mathbb{R}, \mathrm{Func}(\mathbb{R}, \mathbb{R}))$ be defined by Curry(f)(x)(y) = f(x, y).
Let $h : \mathbb{R}^2 \to \mathbb{R}$ be defined by $h(x, y) = x^2 + xy$.
Solution
Hint: $G(3) = \mathrm{Curry}(h)(2)(3) = h(2, 3) = 2^2 + 2(3) = 10$
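Eval and Curry are both straightforward to model in Python, since Python functions can take and return other functions. This is a minimal sketch; the lowercase names `evaluate` and `curry`, and reading G as Curry(h)(2) (as the hint suggests), are assumptions made here.

```python
def evaluate(x, f):
    """Eval(x, f) = f(x)."""
    return f(x)

def curry(f):
    """Turn a function of (x, y) into a function of x that returns a function of y."""
    def takes_x(x):
        def takes_y(y):
            return f(x, y)
        return takes_y
    return takes_x

def h(x, y):
    return x ** 2 + x * y

# G is the one-variable function y -> h(2, y)
G = curry(h)(2)
print(G(3))
print(evaluate(-3, abs))
```

The point of currying is that `curry(h)(2)` is itself an ordinary one-variable function, so it can be passed to anything that expects a function from R to R, such as `evaluate`.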
9
Python
Python provides a playground for multivariable functions.
We can use Python to experiment a bit with multivariable functions.
Question 1
Solution Model the function $f(x) = x^2$ as a Python function.
Warning 2 Python does not use ^ for exponentiation; it denotes this by **
Hint: Try using return x**2
Python
def f(x):
    return  # your code here

def validator():
    return (f(4) == 16) and (f(-5) == 25)
Solution Model the function $g(x) = \begin{cases}-1 & \text{if } x \le 0\\1 & \text{if } x > 0\end{cases}$ as a Python function.
Hint: Try using an if
Python
def g(x):
    # your code here
    return  # the value of g(x)

def validator():
    return (g(0) == -1) and (g(-17) == -1) and (g(25) == 1)
Solution Model the function $h(x, y) = \begin{cases}x/(1 + y) & \text{if } y \ne -1\\0 & \text{if } y = -1\end{cases}$ as a Python function.
Python
def h(x, y):
    # your code here
    return  # the value of h(x,y)

def validator():
10
Higher-order python
One nice feature of Python is that we can play with functions which act on functions.
Question 1 Here is an example of a higher-order function horizontal_shift. It takes a function f of one variable and a horizontal shift H, and returns the function whose graph is the same as f, only shifted horizontally by H units.
Solution Find a function f so that horizontal_shift(f,2) is the squaring function.
Python
def horizontal_shift(f, H):
    # first we define a new function shifted_f which is the appropriate shift of f
    def shifted_f(x):
        return f(x - H)
    # then we return that function
    return shifted_f

def f(x):
    return  # a function so that horizontal_shift(f,2) is the squaring function

def validator():
    return (f(1) == 9) and (f(0) == 4) and (f(-3) == 1)
Solution Write a function forward_difference which takes a function $f : \mathbb{R} \to \mathbb{R}$ and returns another real-valued function defined by forward_difference(f)(x) = f(x + 1) − f(x).
Python
def forward_difference(f):
    # Your code here

def validator():
    def f(x):
        return x**2
    def g(x):
        return x**3
11
Calculus
We can do some calculus with Python, too.
Let’s try doing some single-variable calculus with a bit of Python.
Let epsilon be a small, but positive number. Suppose $f : \mathbb{R} \to \mathbb{R}$ has been coded as a Python function f which takes a real number and returns a real number. Seeing as
$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$,
can you find a Python function which approximates $f'(x)$?
Given a Python function f which takes a real number and returns a real number, we can approximate $f'(x)$ by using epsilon. Write a Python function derivative which takes a function f and returns an approximation to its derivative.
Solution
Hint: To approximate this, use (f(x+epsilon) - f(x))/epsilon.
Python
epsilon = 0.0001
def derivative(f):
    def df(x):
        return (f(blah blah) - f(blah blah)) / blah blah
    return df

def validator():
    df = derivative(lambda x: 1+x**2+x**3)
    if abs(df(2) - 16) > 0.01:
        return False
    df = derivative(lambda x: (1+x)**4)
    if abs(df(-2.642) - -17.708405152) > 0.01:
        return False
    return True
This is great! In the future, we’ll review this activity, and then extend it to a multivariable setting.
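To get a feel for why a small epsilon works, it can help to watch the difference quotient from the limit definition as h shrinks. The sketch below is an illustration only: the names `difference_quotient` and `square` are made up here, and the test function x² at x = 3 (where the exact derivative is 6) is an arbitrary choice.

```python
def difference_quotient(f, x, h):
    """The quotient (f(x + h) - f(x)) / h from the limit definition of f'(x)."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x ** 2

# The exact derivative of x**2 at x = 3 is 6; the quotients should approach it.
for h in [0.1, 0.01, 0.001, 0.0001]:
    print(h, difference_quotient(square, 3.0, h))
```

Making h extremely small is not always better in floating point arithmetic, because subtracting two nearly equal numbers loses precision; a moderate value such as 0.0001 is a reasonable compromise for experiments like these.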
12
Linear maps
Linear maps respect addition and scalar multiplication.
We begin by defining linear maps.
Definition 1 A function $L : \mathbb{R}^n \to \mathbb{R}^m$ is called a linear map if it “respects addition and scalar multiplication.”
Symbolically, for a map to be linear, we must have that $L(\vec{v} + \vec{w}) = L(\vec{v}) + L(\vec{w})$ for all $\vec{v}, \vec{w} \in \mathbb{R}^n$ and also $L(a\vec{v}) = aL(\vec{v})$ for all $a \in \mathbb{R}$ and $\vec{v} \in \mathbb{R}^n$.
Definition 2 Linear Algebra is the branch of mathematics concerning vector spaces and linear mappings between such spaces.
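Question 3 Which of the following functions are linear?
Solution
Hint: For a function to be linear, it must respect scalar multiplication. Let’s see how $f\left(5\begin{bmatrix}1\\1\end{bmatrix}\right)$ compares to $5f\begin{bmatrix}1\\1\end{bmatrix}$, and also how $h\left(5\begin{bmatrix}1\\1\end{bmatrix}\right)$ compares to $5h\begin{bmatrix}1\\1\end{bmatrix}$.
Question 4 What is $f\left(5\begin{bmatrix}1\\1\end{bmatrix}\right)$?
Solution
Hint: Remember f is defined by $f\begin{bmatrix}x\\y\end{bmatrix} = x + 2y$, so $f\left(5\begin{bmatrix}1\\1\end{bmatrix}\right) = f\begin{bmatrix}5\\5\end{bmatrix} = 5 + 2(5) = 15$
15
What is $f\begin{bmatrix}1\\1\end{bmatrix}$?
Solution
Hint: Remember f is defined by $f\begin{bmatrix}x\\y\end{bmatrix} = x + 2y$, so $f\begin{bmatrix}1\\1\end{bmatrix} = 1 + 2(1) = 3$
3
Is $f\left(5\begin{bmatrix}1\\1\end{bmatrix}\right) = 5f\begin{bmatrix}1\\1\end{bmatrix}$?
Solution
What is $h\begin{bmatrix}1\\1\end{bmatrix}$?
Hint: Remember h is defined by $h\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}17\\x\end{bmatrix}$, so $h\begin{bmatrix}1\\1\end{bmatrix} = \begin{bmatrix}17\\1\end{bmatrix}$
Is $h\left(5\begin{bmatrix}1\\1\end{bmatrix}\right) = 5h\begin{bmatrix}1\\1\end{bmatrix}$?
Solution
(a) Yes
(b) No X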
Great! So h is not linear: by looking at this particular example, we can see that h does not always respect scalar multiplication.
Since we know one of the two functions is linear, we can already answer the question: the answer is f. To be thorough, let’s check that f really is linear.
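First we check that f really does respect scalar multiplication: Let $a \in \mathbb{R}$ be an arbitrary scalar and $\begin{bmatrix}x\\y\end{bmatrix} \in \mathbb{R}^2$ be an arbitrary vector. Then
$f\left(a\begin{bmatrix}x\\y\end{bmatrix}\right) = f\begin{bmatrix}ax\\ay\end{bmatrix} = ax + 2ay = a(x + 2y) = af\begin{bmatrix}x\\y\end{bmatrix}$
Now we check that f really does respect vector addition: Let $\begin{bmatrix}x_1\\y_1\end{bmatrix}$ and $\begin{bmatrix}x_2\\y_2\end{bmatrix}$ be arbitrary vectors in $\mathbb{R}^2$. Then
$f\left(\begin{bmatrix}x_1\\y_1\end{bmatrix} + \begin{bmatrix}x_2\\y_2\end{bmatrix}\right) = f\begin{bmatrix}x_1 + x_2\\y_1 + y_2\end{bmatrix} = (x_1 + x_2) + 2(y_1 + y_2) = x_1 + x_2 + 2y_1 + 2y_2 = (x_1 + 2y_1) + (x_2 + 2y_2) = f\begin{bmatrix}x_1\\y_1\end{bmatrix} + f\begin{bmatrix}x_2\\y_2\end{bmatrix}$
This proves that f is linear!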
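(a) $f : \mathbb{R}^2 \to \mathbb{R}^1$ defined by $f\begin{bmatrix}x\\y\end{bmatrix} = x + 2y$ X
(b) $h : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $h\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}17\\x\end{bmatrix}$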
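The spot checks used in this question can be automated. The sketch below stores vectors as lists and tests the two linearity conditions on particular inputs; the helper names (`scale`, `add`, `respects_scaling`, `respects_addition`) are hypothetical. Keep in mind that a failed check proves a map is not linear, while a passed check is only evidence, not a proof.

```python
def f(v):
    """f : R^2 -> R^1, f(x, y) = x + 2y (output stored as a one-entry list)."""
    x, y = v
    return [x + 2 * y]

def h(v):
    """h : R^2 -> R^2, h(x, y) = (17, x)."""
    x, y = v
    return [17, x]

def scale(a, v):
    return [a * x for x in v]

def add(v, w):
    return [x + y for x, y in zip(v, w)]

def respects_scaling(L, a, v):
    """Check L(a*v) == a*L(v) for one particular scalar and vector."""
    return L(scale(a, v)) == scale(a, L(v))

def respects_addition(L, v, w):
    """Check L(v + w) == L(v) + L(w) for one particular pair of vectors."""
    return L(add(v, w)) == add(L(v), L(w))

print(respects_scaling(f, 5, [1, 1]))
print(respects_scaling(h, 5, [1, 1]))
print(respects_addition(f, [1, 2], [3, 4]))
```

The second check fails because h sends 5·(1, 1) to (17, 5), while 5·h(1, 1) is (85, 5).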
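What about these two functions? Which of them is a linear map?
Solution
Hint: For a function to be linear, it must respect vector addition. Let’s see how h(5 + 2) compares to h(5) + h(2), and also how $g\left(\begin{bmatrix}2\\3\\1\end{bmatrix} + \begin{bmatrix}1\\4\\5\end{bmatrix}\right)$ compares to $g\begin{bmatrix}2\\3\\1\end{bmatrix} + g\begin{bmatrix}1\\4\\5\end{bmatrix}$.
Question 5 What is h(5 + 2)?
Solution
Hint: Remember h is defined by $h(x) = \begin{bmatrix}x\\x\\x\\4x\end{bmatrix}$, so $h(5 + 2) = h(7) = \begin{bmatrix}7\\7\\7\\28\end{bmatrix}$
What is h(5) + h(2)?
Solution
Hint: Remember h is defined by $h(x) = \begin{bmatrix}x\\x\\x\\4x\end{bmatrix}$, so $h(5) + h(2) = \begin{bmatrix}5\\5\\5\\20\end{bmatrix} + \begin{bmatrix}2\\2\\2\\8\end{bmatrix} = \begin{bmatrix}7\\7\\7\\28\end{bmatrix}$
Is h(5 + 2) = h(5) + h(2)?
Solution
(a) Yes X
(b) No
Great! So h has a chance of being linear, since it is respecting vector addition in this case. What about g?
What is $g\left(\begin{bmatrix}2\\3\\1\end{bmatrix} + \begin{bmatrix}1\\4\\5\end{bmatrix}\right)$?
Solution
Hint: Remember g is defined by $g\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}x\\xy\end{bmatrix}$, so $g\left(\begin{bmatrix}2\\3\\1\end{bmatrix} + \begin{bmatrix}1\\4\\5\end{bmatrix}\right) = g\begin{bmatrix}3\\7\\6\end{bmatrix} = \begin{bmatrix}3\\3(7)\end{bmatrix} = \begin{bmatrix}3\\21\end{bmatrix}$
What is $g\begin{bmatrix}2\\3\\1\end{bmatrix} + g\begin{bmatrix}1\\4\\5\end{bmatrix}$?
Solution
Is $g\left(\begin{bmatrix}2\\3\\1\end{bmatrix} + \begin{bmatrix}1\\4\\5\end{bmatrix}\right) = g\begin{bmatrix}2\\3\\1\end{bmatrix} + g\begin{bmatrix}1\\4\\5\end{bmatrix}$?
(a) Yes
(b) No X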
Great! So g is not linear: by looking at this particular example, we can see that g does not always respect vector addition.
Since we know one of the two functions is linear, we can already answer the question: the answer is h. To be thorough, let’s check that h really is linear.
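First we check that h really does respect scalar multiplication:
Let $a \in \mathbb{R}$ be an arbitrary scalar and $x \in \mathbb{R}$ be an arbitrary vector. Then
$h(ax) = \begin{bmatrix}ax\\ax\\ax\\4ax\end{bmatrix} = a\begin{bmatrix}x\\x\\x\\4x\end{bmatrix} = ah(x)$
Now we check that h really does respect vector addition: Let x and y be arbitrary vectors in $\mathbb{R}^1$. Then
$h(x + y) = \begin{bmatrix}x + y\\x + y\\x + y\\4(x + y)\end{bmatrix} = \begin{bmatrix}x + y\\x + y\\x + y\\4x + 4y\end{bmatrix} = \begin{bmatrix}x\\x\\x\\4x\end{bmatrix} + \begin{bmatrix}y\\y\\y\\4y\end{bmatrix} = h(x) + h(y)$
This proves that h is linear!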
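(a) $g : \mathbb{R}^3 \to \mathbb{R}^2$ defined by $g\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}x\\xy\end{bmatrix}$
(b) $h : \mathbb{R} \to \mathbb{R}^4$ defined by $h(x) = \begin{bmatrix}x\\x\\x\\4x\end{bmatrix}$ X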
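And finally, which of the following functions are linear?
Solution
Hint: For a function to be linear, it must respect scalar multiplication. Let’s see how $A\left(2\begin{bmatrix}2\\3\end{bmatrix}\right)$ compares to $2A\begin{bmatrix}2\\3\end{bmatrix}$, and also how $G\left(2\begin{bmatrix}1\\2\\3\\4\end{bmatrix}\right)$ compares to $2G\begin{bmatrix}1\\2\\3\\4\end{bmatrix}$.
Question 6 What is $A\left(2\begin{bmatrix}2\\3\end{bmatrix}\right)$?
Solution
Hint: Remember A is defined by $A\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$, so $A\left(2\begin{bmatrix}2\\3\end{bmatrix}\right) = A\begin{bmatrix}4\\6\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$
What is $2A\begin{bmatrix}2\\3\end{bmatrix}$?
Solution
Hint: Remember A is defined by $A\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$, so $2A\begin{bmatrix}2\\3\end{bmatrix} = 2\begin{bmatrix}0\\0\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$
Is $A\left(2\begin{bmatrix}2\\3\end{bmatrix}\right) = 2A\begin{bmatrix}2\\3\end{bmatrix}$?
Solution
(a) Yes X
(b) No
Great! So A has a chance of being linear, since it is respecting scalar multiplication in this case. What about G?
What is $G\left(2\begin{bmatrix}1\\2\\3\\4\end{bmatrix}\right)$?
Solution
Hint: Remember G is defined by $G\begin{bmatrix}x\\y\\z\\t\end{bmatrix} = \begin{bmatrix}e^{x+y}\\x + z\\\sin(x + t)\end{bmatrix}$, so $G\left(2\begin{bmatrix}1\\2\\3\\4\end{bmatrix}\right) = G\begin{bmatrix}2\\4\\6\\8\end{bmatrix} = \begin{bmatrix}e^{2+4}\\2 + 6\\\sin(2 + 8)\end{bmatrix} = \begin{bmatrix}e^6\\8\\\sin(10)\end{bmatrix}$
What is $2G\begin{bmatrix}1\\2\\3\\4\end{bmatrix}$?
Solution
Hint: Remember G is defined by $G\begin{bmatrix}x\\y\\z\\t\end{bmatrix} = \begin{bmatrix}e^{x+y}\\x + z\\\sin(x + t)\end{bmatrix}$, so $2G\begin{bmatrix}1\\2\\3\\4\end{bmatrix} = 2\begin{bmatrix}e^{1+2}\\1 + 3\\\sin(1 + 4)\end{bmatrix} = 2\begin{bmatrix}e^3\\4\\\sin(5)\end{bmatrix} = \begin{bmatrix}2e^3\\8\\2\sin(5)\end{bmatrix}$
Is $G\left(2\begin{bmatrix}1\\2\\3\\4\end{bmatrix}\right) = 2G\begin{bmatrix}1\\2\\3\\4\end{bmatrix}$?
Solution
(a) Yes
(b) No X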
Great! So G is not linear: by looking at this particular example, we can see that G does not always respect scalar multiplication.
Since we know one of the two functions is linear, we can already answer the question: the answer is A. To be thorough, let’s check that A really is linear.
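First we check that A really does respect scalar multiplication: Let $c \in \mathbb{R}$ be an arbitrary scalar and $\begin{bmatrix}x\\y\end{bmatrix} \in \mathbb{R}^2$ be an arbitrary vector. Then
$A\left(c\begin{bmatrix}x\\y\end{bmatrix}\right) = A\begin{bmatrix}cx\\cy\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} = c\begin{bmatrix}0\\0\end{bmatrix} = cA\begin{bmatrix}x\\y\end{bmatrix}$
Now we check that A really does respect vector addition: Let $\begin{bmatrix}x_1\\y_1\end{bmatrix}$ and $\begin{bmatrix}x_2\\y_2\end{bmatrix}$ be arbitrary vectors in $\mathbb{R}^2$. Then
$A\left(\begin{bmatrix}x_1\\y_1\end{bmatrix} + \begin{bmatrix}x_2\\y_2\end{bmatrix}\right) = A\begin{bmatrix}x_1 + x_2\\y_1 + y_2\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix} + \begin{bmatrix}0\\0\end{bmatrix} = A\begin{bmatrix}x_1\\y_1\end{bmatrix} + A\begin{bmatrix}x_2\\y_2\end{bmatrix}$
This proves that A is linear!
(a) $G : \mathbb{R}^4 \to \mathbb{R}^3$ defined by $G\begin{bmatrix}x\\y\\z\\t\end{bmatrix} = \begin{bmatrix}e^{x+y}\\x + z\\\sin(x + t)\end{bmatrix}$
(b) $A : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $A\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$ X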
Warning 7 Note that the function which sends every vector to the zero vector is linear.
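Question 8 Let $L : \mathbb{R}^3 \to \mathbb{R}^2$ be a linear function. Suppose $L\begin{bmatrix}1\\0\\0\end{bmatrix} = \begin{bmatrix}3\\4\end{bmatrix}$, $L\begin{bmatrix}0\\1\\0\end{bmatrix} = \begin{bmatrix}-2\\0\end{bmatrix}$, and $L\begin{bmatrix}0\\0\\1\end{bmatrix} = \begin{bmatrix}1\\-1\end{bmatrix}$.
Let $\vec{v} = L\begin{bmatrix}4\\-1\\2\end{bmatrix}$. What is $\vec{v}$?
Solution
Hint: The only thing we know about linear maps is that they respect scalar multiplication and vector addition. So we need to somehow rewrite the vector $\begin{bmatrix}4\\-1\\2\end{bmatrix}$ in terms of the vectors $\begin{bmatrix}1\\0\\0\end{bmatrix}$, $\begin{bmatrix}0\\1\\0\end{bmatrix}$ and $\begin{bmatrix}0\\0\\1\end{bmatrix}$, scalar multiplication, and vector addition, to exploit what we know about L.
Question 9 Can you rewrite $\begin{bmatrix}4\\-1\\2\end{bmatrix}$ in the form $a\begin{bmatrix}1\\0\\0\end{bmatrix} + b\begin{bmatrix}0\\1\\0\end{bmatrix} + c\begin{bmatrix}0\\0\\1\end{bmatrix}$?
Solution
Hint: Consider the coefficient on $\begin{bmatrix}1\\0\\0\end{bmatrix}$.
Hint: In this case, a = 4.
Hint: Moreover, b = −1.
Hint: Finally, c = 2.
a = 4
Solution
b = -1
Solution
c = 2
Now using the linearity of L, we can see that
$L\begin{bmatrix}4\\-1\\2\end{bmatrix} = L\left(4\begin{bmatrix}1\\0\\0\end{bmatrix} + (-1)\begin{bmatrix}0\\1\\0\end{bmatrix} + 2\begin{bmatrix}0\\0\\1\end{bmatrix}\right) = 4L\begin{bmatrix}1\\0\\0\end{bmatrix} + (-1)L\begin{bmatrix}0\\1\\0\end{bmatrix} + 2L\begin{bmatrix}0\\0\\1\end{bmatrix}$
Can you finish off the computation?
Hint: $L\begin{bmatrix}4\\-1\\2\end{bmatrix} = 4L\begin{bmatrix}1\\0\\0\end{bmatrix} + (-1)L\begin{bmatrix}0\\1\\0\end{bmatrix} + 2L\begin{bmatrix}0\\0\\1\end{bmatrix} = 4\begin{bmatrix}3\\4\end{bmatrix} + (-1)\begin{bmatrix}-2\\0\end{bmatrix} + 2\begin{bmatrix}1\\-1\end{bmatrix} = \begin{bmatrix}12\\16\end{bmatrix} + \begin{bmatrix}2\\0\end{bmatrix} + \begin{bmatrix}2\\-2\end{bmatrix} = \begin{bmatrix}16\\14\end{bmatrix}$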
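Can you generalize this?
Solution
Hint: The only thing we know about linear maps is that they respect scalar multiplication and vector addition. So we need to somehow rewrite the vector $\begin{bmatrix}x\\y\\z\end{bmatrix}$ in terms of the vectors $\begin{bmatrix}1\\0\\0\end{bmatrix}$, $\begin{bmatrix}0\\1\\0\end{bmatrix}$ and $\begin{bmatrix}0\\0\\1\end{bmatrix}$, scalar multiplication, and vector addition, to exploit what we know about L.
Question 10 Can you rewrite $\begin{bmatrix}x\\y\\z\end{bmatrix}$ in the form $a\begin{bmatrix}1\\0\\0\end{bmatrix} + b\begin{bmatrix}0\\1\\0\end{bmatrix} + c\begin{bmatrix}0\\0\\1\end{bmatrix}$?
Solution
Hint: $\begin{bmatrix}x\\y\\z\end{bmatrix} = x\begin{bmatrix}1\\0\\0\end{bmatrix} + y\begin{bmatrix}0\\1\\0\end{bmatrix} + z\begin{bmatrix}0\\0\\1\end{bmatrix}$
a = x
Solution
b = y
Solution
c = z
Hint: Now using the linearity of L, we can see that
$L\begin{bmatrix}x\\y\\z\end{bmatrix} = L\left(x\begin{bmatrix}1\\0\\0\end{bmatrix} + y\begin{bmatrix}0\\1\\0\end{bmatrix} + z\begin{bmatrix}0\\0\\1\end{bmatrix}\right) = xL\begin{bmatrix}1\\0\\0\end{bmatrix} + yL\begin{bmatrix}0\\1\\0\end{bmatrix} + zL\begin{bmatrix}0\\0\\1\end{bmatrix}$
Can you finish off the computation?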
As you have already discovered, a linear map $L : \mathbb{R}^n \to \mathbb{R}^m$ is fully determined by its action on the “standard basis vectors” $\vec{e}_1 = \begin{bmatrix}1\\0\\0\\\vdots\\0\end{bmatrix}$, $\vec{e}_2 = \begin{bmatrix}0\\1\\0\\\vdots\\0\end{bmatrix}$, and so on, until we reach $\vec{e}_n = \begin{bmatrix}0\\0\\\vdots\\0\\1\end{bmatrix}$.
Argue convincingly that if $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map and you know $L(\vec{e}_i)$ for i = 1, 2, 3, ..., n, then you could figure out $L(\vec{v})$ for any $\vec{v} \in \mathbb{R}^n$.
I want to determine what L does to any vector $\vec{v} = \begin{bmatrix}x_1\\x_2\\x_3\\\vdots\\x_n\end{bmatrix} \in \mathbb{R}^n$. I can rewrite $\vec{v}$ as $x_1\vec{e}_1 + x_2\vec{e}_2 + x_3\vec{e}_3 + \cdots + x_n\vec{e}_n$. By the linearity of L, $L(\vec{v}) = x_1L(\vec{e}_1) + x_2L(\vec{e}_2) + x_3L(\vec{e}_3) + \cdots + x_nL(\vec{e}_n)$.
Since I already know the value of $L(\vec{e}_i)$ for all i = 1, 2, 3, ..., n, this allows me to compute $L(\vec{v})$. So L is completely determined once I know what it does to each of the standard basis vectors.
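The argument above is really a recipe, and it can be sketched in Python: given the images of the standard basis vectors, compute L of any vector as the corresponding linear combination. The function name `apply_linear_map` and the list-of-lists storage are assumptions made for this illustration; the data comes from Question 8.

```python
def apply_linear_map(basis_images, v):
    """Compute L(v) = x1*L(e1) + ... + xn*L(en).

    basis_images[i] holds the image of the (i+1)-st standard basis vector,
    and v = [x1, ..., xn] is the input vector.
    """
    m = len(basis_images[0])
    result = [0] * m
    for x, image in zip(v, basis_images):
        for i in range(m):
            result[i] += x * image[i]
    return result

# Question 8: L(e1) = (3, 4), L(e2) = (-2, 0), L(e3) = (1, -1)
images = [[3, 4], [-2, 0], [1, -1]]
v = apply_linear_map(images, [4, -1, 2])
print(v)
```

Nothing about L is used here beyond its values on the three basis vectors, which is exactly the point of the exercise.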
13
Matrices
Matrices are a way to represent linear maps.
To make writing a linear map a little less cumbersome, we will develop a compact notation for linear maps using our previous observation that a linear map is determined by its action on the standard basis vectors.
Definition 1 An m × n matrix is an array of numbers which has m rows and n columns. The numbers in a matrix are called entries.
When A is a matrix, we write $A = (a_{i,j})$, meaning that $a_{i,j}$ is the entry in the ith row and jth column of the matrix. Note: We start counting with 1, not 0. So the upper left-hand entry of the matrix is $a_{1,1}$.
Question 2 The matrix $A = \begin{bmatrix}1 & -1\\2 & 4\\3 & -5\end{bmatrix}$ is an n × m matrix.
Solution
Hint: Note that this is n × m whereas the definition above used m × n.
Hint: n is the number of rows, and m is the number of columns
Hint: n = 3 and m = 2
In this case, n is 3.
Solution
And m is 2.
Remember, we write $a_{i,j}$ for the entry in the ith row and jth column of the matrix.
Solution
Hint: $a_{3,2}$ is the entry in the 3rd row and the 2nd column.
Hint: $B = \begin{bmatrix}2 & 3 & 4 & 5\\3 & 4 & 5 & 6\\4 & 5 & 6 & 7\end{bmatrix}$
What is B?
Definition 4 To each linear map $L : \mathbb{R}^n \to \mathbb{R}^m$ we associate an m × n matrix $A_L$ called the matrix of the linear map with respect to the standard coordinates. It is defined by setting $a_{i,j}$ to be the ith component of $L(\vec{e}_j)$. In other words, the jth column of the matrix $A_L$ is the vector $L(\vec{e}_j)$.
Going the other way, we likewise associate to each m × n matrix M a linear map $L_M : \mathbb{R}^n \to \mathbb{R}^m$ by requiring that $L_M(\vec{e}_j)$ be the jth column of the matrix M.
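Question 5 The linear map $L : \mathbb{R}^2 \to \mathbb{R}^3$ satisfies $L\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}3\\-5\\2\end{bmatrix}$ and $L\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}-1\\1\\1\end{bmatrix}$. What is the matrix of L?
Solution
Hint: Remember that, by definition, the first column of this matrix should be $L\begin{bmatrix}1\\0\end{bmatrix}$ and the second column should be $L\begin{bmatrix}0\\1\end{bmatrix}$.
Hint: The matrix of L is $\begin{bmatrix}3 & -1\\-5 & 1\\2 & 1\end{bmatrix}$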
Let’s do another example.
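Question 6 Suppose L is a linear map represented by the matrix $A = \begin{bmatrix}1 & -1\\2 & 4\\3 & -5\end{bmatrix}$.
Solution
Hint: A should have one column for each basis vector of the domain.
Hint: A has 2 columns, so the dimension of the domain is 2.
The dimension of the domain of L is 2.
Solution
Hint: Since the columns are of length 3, that means L is spitting out vectors of length 3.
Hint: The codomain of L is $\mathbb{R}^3$, which is 3-dimensional.
The dimension of the codomain of L is 3.
Suppose $\vec{v} = L\begin{bmatrix}0\\1\end{bmatrix}$. What is $\vec{v}$?
Solution
Hint: Remember that, by definition, the ith column of A is $L(\vec{e}_i)$.
Hint: So, by definition, $L\begin{bmatrix}0\\1\end{bmatrix}$ is the second column of the matrix A.
Hint: So $L\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}-1\\4\\-5\end{bmatrix}$
Suppose $\vec{w} = L\begin{bmatrix}4\\5\end{bmatrix}$. What is $\vec{w}$?
Solution
Hint: By definition of the matrix associated to a linear map, we know that $L\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\2\\3\end{bmatrix}$ and $L\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}-1\\4\\-5\end{bmatrix}$.
Hint: Can you rewrite $\begin{bmatrix}4\\5\end{bmatrix}$ in terms of $\begin{bmatrix}1\\0\end{bmatrix}$ and $\begin{bmatrix}0\\1\end{bmatrix}$ so that you can use the linearity of L to compute $L\begin{bmatrix}4\\5\end{bmatrix}$?
Hint: $L\begin{bmatrix}4\\5\end{bmatrix} = L\left(4\begin{bmatrix}1\\0\end{bmatrix} + 5\begin{bmatrix}0\\1\end{bmatrix}\right)$
Hint: $L\begin{bmatrix}4\\5\end{bmatrix} = L\left(4\begin{bmatrix}1\\0\end{bmatrix} + 5\begin{bmatrix}0\\1\end{bmatrix}\right) = 4L\begin{bmatrix}1\\0\end{bmatrix} + 5L\begin{bmatrix}0\\1\end{bmatrix} = 4\begin{bmatrix}1\\2\\3\end{bmatrix} + 5\begin{bmatrix}-1\\4\\-5\end{bmatrix} = \begin{bmatrix}4\\8\\12\end{bmatrix} + \begin{bmatrix}-5\\20\\-25\end{bmatrix} = \begin{bmatrix}-1\\28\\-13\end{bmatrix}$
What is $L\begin{bmatrix}x\\y\end{bmatrix}$?
Solution
Hint: By definition of the matrix associated to a linear map, we know that $L\begin{bmatrix}1\\0\end{bmatrix} = \begin{bmatrix}1\\2\\3\end{bmatrix}$ and $L\begin{bmatrix}0\\1\end{bmatrix} = \begin{bmatrix}-1\\4\\-5\end{bmatrix}$.
Hint: Can you rewrite $\begin{bmatrix}x\\y\end{bmatrix}$ in terms of $\begin{bmatrix}1\\0\end{bmatrix}$ and $\begin{bmatrix}0\\1\end{bmatrix}$ so that you can use the linearity of L to compute $L\begin{bmatrix}x\\y\end{bmatrix}$?
Hint: $L\begin{bmatrix}x\\y\end{bmatrix} = L\left(x\begin{bmatrix}1\\0\end{bmatrix} + y\begin{bmatrix}0\\1\end{bmatrix}\right)$
Hint: $L\begin{bmatrix}x\\y\end{bmatrix} = L\left(x\begin{bmatrix}1\\0\end{bmatrix} + y\begin{bmatrix}0\\1\end{bmatrix}\right) = xL\begin{bmatrix}1\\0\end{bmatrix} + yL\begin{bmatrix}0\\1\end{bmatrix} = x\begin{bmatrix}1\\2\\3\end{bmatrix} + y\begin{bmatrix}-1\\4\\-5\end{bmatrix} = \begin{bmatrix}x\\2x\\3x\end{bmatrix} + \begin{bmatrix}-y\\4y\\-5y\end{bmatrix} = \begin{bmatrix}x - y\\2x + 4y\\3x - 5y\end{bmatrix}$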
As an antidote to the abstraction, let’s take a look at a simplistic “real world” example.
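Question 7 In the local barter economy, there is an exchange where you can
• trade 1 spoon for 2 apples and 1 orange,
• trade 1 knife for 2 oranges, and
• trade 1 fork for 3 apples and 4 oranges.
Model this as a linear map $L : \mathbb{R}^3 \to \mathbb{R}^2$, where the coordinates on $\mathbb{R}^3$ are $\begin{bmatrix}\text{spoons}\\\text{knives}\\\text{forks}\end{bmatrix}$ and the coordinates on $\mathbb{R}^2$ are $\begin{bmatrix}\text{apples}\\\text{oranges}\end{bmatrix}$.
What is the matrix of the linear map L?
Solution
Hint: Remember the matrix of a linear map is defined by the fact that the kth column of the matrix is the image of the kth standard basis vector.
Hint: $\begin{bmatrix}1\\0\\0\end{bmatrix}$ represents one spoon in the domain. Its image under this linear map is 2 apples and 1 orange, which is represented by the vector $\begin{bmatrix}2\\1\end{bmatrix}$ in the codomain. So the first column of the matrix should be $\begin{bmatrix}2\\1\end{bmatrix}$.
Hint: The full matrix is $\begin{bmatrix}2 & 0 & 3\\1 & 2 & 4\end{bmatrix}$
Solution
Hint: $L\begin{bmatrix}3\\0\\4\end{bmatrix} = L\left(3\begin{bmatrix}1\\0\\0\end{bmatrix} + 4\begin{bmatrix}0\\0\\1\end{bmatrix}\right) = 3L\begin{bmatrix}1\\0\\0\end{bmatrix} + 4L\begin{bmatrix}0\\0\\1\end{bmatrix} = 3\begin{bmatrix}2\\1\end{bmatrix} + 4\begin{bmatrix}3\\4\end{bmatrix} = \begin{bmatrix}6\\3\end{bmatrix} + \begin{bmatrix}12\\16\end{bmatrix} = \begin{bmatrix}18\\19\end{bmatrix}$
So you would be able to get 18 apples and 19 oranges.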
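The barter computation can be reproduced in Python by storing the matrix as a list of rows. This is a sketch under one assumption worth flagging: `mat_vec` (a name chosen here) uses the row-times-vector shortcut rather than the column-combination reasoning in the hint; the two give the same answer.

```python
def mat_vec(matrix, v):
    """Multiply a matrix (stored as a list of rows) by a vector (a list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

# Rows are (apples, oranges); columns are (spoons, knives, forks).
barter = [[2, 0, 3],
          [1, 2, 4]]

# Trading in 3 spoons and 4 forks
apples, oranges = mat_vec(barter, [3, 0, 4])
print(apples, oranges)
```

Feeding in a standard basis vector such as [0, 1, 0] returns the corresponding column of the matrix, which matches how the matrix was built in the first place.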
Prove the following statement: if $S : \mathbb{R}^n \to \mathbb{R}^m$ and $T : \mathbb{R}^n \to \mathbb{R}^m$ are both linear maps, then the map $(S + T) : \mathbb{R}^n \to \mathbb{R}^m$ defined by $(S + T)(\vec{v}) = S(\vec{v}) + T(\vec{v})$ is also linear.
We need to check that (S + T) respects both scalar multiplication and vector addition.
Scalar multiplication:
Choose an arbitrary scalar $c \in \mathbb{R}$ and an arbitrary vector $\vec{v} \in \mathbb{R}^n$. Then
$(S + T)(c\vec{v}) = S(c\vec{v}) + T(c\vec{v})$ by definition of (S + T)
$= cS(\vec{v}) + cT(\vec{v})$ by the linearity of S and T
$= c(S(\vec{v}) + T(\vec{v}))$ by the distributivity of scalar multiplication over addition in $\mathbb{R}^m$
$= c(S + T)(\vec{v})$ by definition of (S + T)
Vector addition: Choose two arbitrary vectors $\vec{v}$ and $\vec{w}$ in $\mathbb{R}^n$. Then
$(S + T)(\vec{v} + \vec{w}) = S(\vec{v} + \vec{w}) + T(\vec{v} + \vec{w})$ by definition of S + T
$= S(\vec{v}) + S(\vec{w}) + T(\vec{v}) + T(\vec{w})$ by the linearity of S and T
$= S(\vec{v}) + T(\vec{v}) + S(\vec{w}) + T(\vec{w})$ by the commutativity of vector addition in $\mathbb{R}^m$
$= (S + T)(\vec{v}) + (S + T)(\vec{w})$ by the definition of S + T.
Prove that if $T : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map and $c \in \mathbb{R}$ is a scalar, then the map $cT : \mathbb{R}^n \to \mathbb{R}^m$, defined by $(cT)(\vec{v}) = cT(\vec{v})$, is also a linear map.
We need to check that cT respects both scalar multiplication and vector addition.
Scalar multiplication:
Choose an arbitrary scalar $a \in \mathbb{R}$ and an arbitrary vector $\vec{v} \in \mathbb{R}^n$. Then
$(cT)(a\vec{v}) = cT(a\vec{v})$
$= acT(\vec{v})$
$= a(cT)(\vec{v})$
Vector addition: Choose two arbitrary vectors $\vec{v}$ and $\vec{w}$ in $\mathbb{R}^n$. Then
$(cT)(\vec{v} + \vec{w}) = cT(\vec{v} + \vec{w})$
$= c(T(\vec{v}) + T(\vec{w}))$
$= cT(\vec{v}) + cT(\vec{w})$
$= (cT)(\vec{v}) + (cT)(\vec{w})$
Observation 8 The last two exercises show that we have a nice way to both add linear maps and multiply linear maps by scalars. So linear maps themselves “feel” a bit like vectors. You do not have to worry about this now, but we will see that the linear maps from $\mathbb{R}^n \to \mathbb{R}^m$ form an “abstract vector space.” Much of the
14
Composition
The composition of linear maps can be computed with matrices.
Prove that if $S : \mathbb{R}^n \to \mathbb{R}^m$ is a linear map, and $T : \mathbb{R}^m \to \mathbb{R}^k$ is a linear map, then the composite function $T \circ S : \mathbb{R}^n \to \mathbb{R}^k$ is also linear.
We need to show that T ∘ S respects scalar multiplication and vector addition:
Scalar multiplication: For every scalar $a \in \mathbb{R}$ and every vector $\vec{v} \in \mathbb{R}^n$, we have:
$(T \circ S)(a\vec{v}) = T(S(a\vec{v}))$
$= T(aS(\vec{v}))$ because S respects scalar multiplication
$= aT(S(\vec{v}))$ because T respects scalar multiplication
$= a(T \circ S)(\vec{v})$
Vector addition: For every two vectors $\vec{v}, \vec{w} \in \mathbb{R}^n$, we have:
$(T \circ S)(\vec{v} + \vec{w}) = T(S(\vec{v} + \vec{w}))$
$= T(S(\vec{v}) + S(\vec{w}))$ because S respects vector addition
$= T(S(\vec{v})) + T(S(\vec{w}))$ because T respects vector addition
$= (T \circ S)(\vec{v}) + (T \circ S)(\vec{w})$
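Question 1 Suppose the matrix of S is $M_S = \begin{bmatrix}2 & 0 & -1\\-1 & 1 & 1\end{bmatrix}$ and the matrix of T is $M_T = \begin{bmatrix}-1 & -1\\0 & 2\\-1 & 1\end{bmatrix}$.
What is the matrix of S ∘ T?
Solution
Hint: Remember that the matrix for S ∘ T will have columns given by $(S \circ T)\begin{bmatrix}1\\0\end{bmatrix}$ and $(S \circ T)\begin{bmatrix}0\\1\end{bmatrix}$
Question 2 What is $(S \circ T)\begin{bmatrix}1\\0\end{bmatrix}$?
Solution
Question 3 What is $(S \circ T)\begin{bmatrix}0\\1\end{bmatrix}$?
Solution
Hint: $(S \circ T)\begin{bmatrix}0\\1\end{bmatrix} = S\left(T\begin{bmatrix}0\\1\end{bmatrix}\right)$
$= S\begin{bmatrix}-1\\2\\1\end{bmatrix}$ because by definition, $T\begin{bmatrix}0\\1\end{bmatrix}$ is the second column of the matrix of T
$= -1S\begin{bmatrix}1\\0\\0\end{bmatrix} + 2S\begin{bmatrix}0\\1\\0\end{bmatrix} + S\begin{bmatrix}0\\0\\1\end{bmatrix}$ by the linearity of S
$= -1\begin{bmatrix}2\\-1\end{bmatrix} + 2\begin{bmatrix}0\\1\end{bmatrix} + \begin{bmatrix}-1\\1\end{bmatrix}$ because ???
$= \begin{bmatrix}-3\\4\end{bmatrix}$
Hint: The matrix of (S ∘ T) is $\begin{bmatrix}-1 & -3\\0 & 4\end{bmatrix}$
Solution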
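Hint: Remember that the matrix for T ∘ S will have columns given by $(T \circ S)\begin{bmatrix}1\\0\\0\end{bmatrix}$, $(T \circ S)\begin{bmatrix}0\\1\\0\end{bmatrix}$ and $(T \circ S)\begin{bmatrix}0\\0\\1\end{bmatrix}$
Question 4 What is $(T \circ S)\begin{bmatrix}1\\0\\0\end{bmatrix}$?
Solution
Hint: $(T \circ S)\begin{bmatrix}1\\0\\0\end{bmatrix} = T\left(S\begin{bmatrix}1\\0\\0\end{bmatrix}\right)$
$= T\begin{bmatrix}2\\-1\end{bmatrix}$ because by definition, $S\begin{bmatrix}1\\0\\0\end{bmatrix}$ is the first column of the matrix of S
$= 2T\begin{bmatrix}1\\0\end{bmatrix} + (-1)T\begin{bmatrix}0\\1\end{bmatrix}$ by the linearity of T
$= 2\begin{bmatrix}-1\\0\\-1\end{bmatrix} + (-1)\begin{bmatrix}-1\\2\\1\end{bmatrix}$ because ???
$= \begin{bmatrix}-1\\-2\\-3\end{bmatrix}$
Question 5 What is $(T \circ S)\begin{bmatrix}0\\1\\0\end{bmatrix}$?
Solution
Hint: $(T \circ S)\begin{bmatrix}0\\1\\0\end{bmatrix} = T\left(S\begin{bmatrix}0\\1\\0\end{bmatrix}\right)$
$= T\begin{bmatrix}0\\1\end{bmatrix}$ because by definition, $S\begin{bmatrix}0\\1\\0\end{bmatrix}$ is the second column of the matrix of S
$= \begin{bmatrix}-1\\2\\1\end{bmatrix}$ we got lucky: by definition $T\begin{bmatrix}0\\1\end{bmatrix}$ is the second column of the matrix of T
Question 6 What is $(T \circ S)\begin{bmatrix}0\\0\\1\end{bmatrix}$?
Solution
Hint: $(T \circ S)\begin{bmatrix}0\\0\\1\end{bmatrix} = T\left(S\begin{bmatrix}0\\0\\1\end{bmatrix}\right)$
$= T\begin{bmatrix}-1\\1\end{bmatrix}$ because by definition, $S\begin{bmatrix}0\\0\\1\end{bmatrix}$ is the third column of the matrix of S
$= -1T\begin{bmatrix}1\\0\end{bmatrix} + T\begin{bmatrix}0\\1\end{bmatrix}$ by the linearity of T
$= -1\begin{bmatrix}-1\\0\\-1\end{bmatrix} + \begin{bmatrix}-1\\2\\1\end{bmatrix}$ because ???
$= \begin{bmatrix}0\\2\\2\end{bmatrix}$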
Definition 7 If M is an m × n matrix and N is a k × m matrix, then the product NM of the matrices is defined as the matrix of the composition of the linear maps defined by M and N.
In other words, NM is the matrix of $L_N \circ L_M$.
Warning 8 You may have seen another definition for matrix multiplication in the past. That definition could be seen as a shortcut for how to compute the product, but it is usually presented devoid of mathematical meaning.
Hopefully our definition seems properly motivated: matrix multiplication is just what you do to compose linear maps. We suggest working out the problems here using our definition: you will develop your own efficient shortcuts in time.
You have already multiplied two matrices above, even though you didn’t know it. Take some time now to get a whole lot of practice. You do not need us to prompt you: invent your own matrices and try to multiply them, on paper. What condition is needed on the rows and columns of the two matrices for matrix multiplication to even make sense? You can check your work using a computer algebra system, like SAGE¹, or you can use a free web-hosted app like Reshih². Use our definition, and think through it each time. Try to get faster and more efficient. Eventually you should be able to do this quite rapidly.
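For checking paper computations, the definition of the product can be turned into a short Python sketch. Here `mat_mul(N, M)` builds the matrix of $L_N \circ L_M$ one column at a time: column j is $L_N$ applied to column j of M. The function names and the list-of-rows storage are assumptions made for illustration, and `mat_vec` uses the usual row-times-vector shortcut to apply a matrix to a vector.

```python
def mat_vec(matrix, v):
    """Apply the linear map of a matrix (a list of rows) to a vector (a list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def mat_mul(N, M):
    """Matrix of L_N ∘ L_M: column j is L_N applied to column j of M."""
    columns = []
    for j in range(len(M[0])):
        col_j = [row[j] for row in M]       # L_M(e_j)
        columns.append(mat_vec(N, col_j))   # L_N(L_M(e_j))
    # turn the list of columns back into a list of rows
    return [list(row) for row in zip(*columns)]

# The matrices of S and T from Question 1
M_S = [[2, 0, -1], [-1, 1, 1]]
M_T = [[-1, -1], [0, 2], [-1, 1]]

print(mat_mul(M_S, M_T))  # matrix of S ∘ T
print(mat_mul(M_T, M_S))  # matrix of T ∘ S
```

The code assumes the number of columns of N equals the number of rows of M; it does not check this, so mismatched shapes will silently give a wrong answer instead of an error.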
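Question 9 Suppose $B = \begin{bmatrix}1 & 2\\3 & 4\end{bmatrix}$. Find a 2 × 2 matrix A so that $AB \ne BA$. Play around! Can you find more than one?
Solution
Hint: There is no systematic way to answer this question: you just have to play around, and see what you discover!
Question 10 What is $\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix}\begin{bmatrix}1 & 0\\0 & 0\end{bmatrix}$?
Solution
Hint: $\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix}\begin{bmatrix}1 & 0\\0 & 0\end{bmatrix} = \begin{bmatrix}1 & 0\\3 & 0\end{bmatrix}$
Question 11 What is $\begin{bmatrix}1 & 0\\0 & 0\end{bmatrix}\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix}$?
Solution
Hint: $\begin{bmatrix}1 & 0\\0 & 0\end{bmatrix}\begin{bmatrix}1 & 2\\3 & 4\end{bmatrix} = \begin{bmatrix}1 & 2\\0 & 0\end{bmatrix}$
A matrix that doesn’t commute with B is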
1 http://www.sagemath.org/
2
Question 12 Solution
Hint: Try some simple matrices. Maybe limit yourself to 2 × 2 matrices?
Hint: One simple linear map which would work is $L\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}y\\0\end{bmatrix}$. Applying this twice to any vector would give you the zero vector. This linear map is great for cooking up counterexamples to all sorts of naive things you might think about matrices! See this Mathoverflow answer³ (you will understand more and more of these terms as the course progresses).
Question 13 What is the matrix of the example linear map L? Hint: The matrix of L is $\begin{bmatrix}0 & 1\\0 & 0\end{bmatrix}$
Find A ≠ 0 with AA = 0. (Note: such a matrix is called “nilpotent”.)
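Question 14 If $A = \begin{bmatrix}2 & 8\\3 & 12\end{bmatrix}$, find $\vec{v} \neq \vec{0}$ with $A\vec{v} = \vec{0}$. Solution Hint: Let $\vec{v} = \begin{bmatrix}x\\y\end{bmatrix}$, and solve a system of equations
Hint: $A(\vec{v}) = \vec{0}$, so $\begin{bmatrix}2 & 8\\3 & 12\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$, that is, $\begin{bmatrix}2x + 8y\\3x + 12y\end{bmatrix} = \begin{bmatrix}0\\0\end{bmatrix}$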
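Hint: Let $\vec{v} = \begin{bmatrix}x\\y\end{bmatrix}$ and solve a system of equations.
Hint: $A\vec{v} = \begin{bmatrix}0\\8\end{bmatrix}$, so $\begin{bmatrix}1 & 3\\2 & 4\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}0\\8\end{bmatrix}$, that is, $\begin{bmatrix}x + 3y\\2x + 4y\end{bmatrix} = \begin{bmatrix}0\\8\end{bmatrix}$
Hint: $\begin{cases}x + 3y = 0\\2x + 4y = 8\end{cases}$ $\begin{cases}x + 3y = 0\\x + 2y = 4\end{cases}$ $\begin{cases}x + 3y = 0\\y = -4\end{cases}$ $\begin{cases}x = 12\\y = -4\end{cases}$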
In the last two exercises, you found that solving matrix equations is equivalent to solving systems of linear equations.
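As a small sanity check of that equivalence, the sketch below applies the matrix from the hints to the solution x = 12, y = −4 and recovers the right-hand side of the system. The list-of-rows storage and the helper name apply are assumptions made for this illustration.

```python
def apply(A, v):
    """Each row of A is one equation: the row dotted with v gives its left-hand side."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 3],
     [2, 4]]
v = [12, -4]        # the solution found by elimination in the hints
print(apply(A, v))  # [0, 8]
```

Reading the output entry by entry gives x + 3y = 0 and 2x + 4y = 8, the two equations of the system.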
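Question 16 Rewrite $\begin{cases}4x + 7y + z = 3\\-x + 8y - z = 2\end{cases}$ as $A\begin{bmatrix}x\\y\\z\end{bmatrix} = \begin{bmatrix}3\\2\end{bmatrix}$. Solution Hint: $A = \begin{bmatrix}4 & 7 & 1\\-1 & 8 & -1\end{bmatrix}$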
15
Python
Build up some linear algebra in Python.
Exercise 1 We will store a vector as a list. So the vector $\begin{bmatrix}1\\2\\3\end{bmatrix}$ will be stored as [1,2,3]. Let’s try to write some Python code for working with lists as if they were vectors.
Solution
Hint: This was discussed on StackOverflow: http://stackoverflow.com/questions/14050824/add-sum-of-values-of-two-lists-into-new-list
Write a “vector add” function. Your function may assume that the two vectors have the same number of entries.
Python
    # write a function vector_sum(v,w) which takes two vectors v and w,
    # and returns the sum v + w.
    #
    # For example, vector_sum([1,2], [4,1]) equals [5,3]
    #

    def vector_sum(v,w):
        # your code here
        return # the sum v+w

    def validator():
        # It would be better to try more cases
        if vector_sum([-5,23],[10,2])[0] != 5:
            return False
        if vector_sum([1,5,6],[2,3,6])[1] != 8:
            return False
        return True
Solution
    def validator():
        # It would be better to try more cases
        if scale_vector(-3,[2,3,10])[1] != -9:
            return False
        if scale_vector(10,[4,3,2,1])[2] != 20:
            return False
        return True
Let’s write a dot product function. Solution
Python
    # Write a function dot_product(v,w) which takes two vectors v and w,
    # and returns the dot product of v and w.
    #
    # For example, dot_product([1,2],[0,3]) is 6.

    def dot_product(v,w):
        # your code here
        return # the dot product "v dot w"

    def validator():
        if dot_product([1,2],[-3,5]) != 7:
            return False
        if dot_product([0,4,2],[2,3,-7]) != -2:
            return False
        return True
And we will store a matrix as a list of lists. For example the list [[1,3,5],[2,4,6]] will represent the matrix $\begin{bmatrix}1 & 3 & 5\\2 & 4 & 6\end{bmatrix}$.
Note that there are two different conventions that we could have chosen: the innermost lists could be the rows, or the columns. There are good reasons to have chosen the opposite convention: after all, when thinking of a matrix as a linear map, we should be paying attention to the columns, since the ith column tells us what the corresponding linear map does when applied to $\vec{e}_i$.
Nevertheless, the innermost lists are rows in our chosen representation. This way, to talk about the entry $m_{ij}$, we write m[i][j]. Had we made the other choice, the $m_{ij}$ entry would have been accessed by writing j and i in the other order.
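A short sketch of this convention in practice follows. One caveat worth stating as an assumption: Python lists count from 0, while the subscripts in the text are taken here to count from 1, so the hypothetical helpers entry and column below shift the indices by one.

```python
M = [[1, 3, 5],
     [2, 4, 6]]

def entry(M, i, j):
    """Entry in row i, column j, counting from 1 as in the mathematical notation."""
    return M[i - 1][j - 1]

def column(M, j):
    """The j-th column (counting from 1), which is what the linear map does to e_j."""
    return [row[j - 1] for row in M]

print(entry(M, 2, 3))  # 6
print(column(M, 2))    # [3, 4]
```

Rows are cheap to read off in this representation (M[0] is the first row), while a column has to be gathered from every row.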
This is also the same convention used by the computer algebra system Sage.
Exercise 2 Write a “matrix multiplication” function.
Python
    # write a function multiply(A,B) which takes two matrices A and B stored in the above format,
    # and returns the matrix of their product

    def multiply(A,B):
        # your code here
        return # the product AB

    def validator():
        # It would be better to try more cases
        a = [[-2, 0], [-2, -3], [-1, 3]]
        b = [[-3, 2, -1, -2], [3, 2, 1, 3]]
        result = multiply(a,b)
        if (len(result) != 3):
            return False
        if (len(result[0]) != 4):
            return False
        if (result[2][1] != 4):
            return False
        return True
Fantastic!
Next, let’s think more about how matrices and linear maps are related. Solution
Hint:
Warning 3 This is a function whose output is a function.
Hint: Try using lambda.
Write a function matrix_to_function which takes a matrix $M_L$ representing the linear map L, and returns a Python function. The returned Python function should take a vector $\vec{v}$ and send it to $L(\vec{v})$.
Python
    # For example, if M = [[1,2],[3,4]], then matrix_to_function(M)([0,1]) should be [2,4]
Solution Now let’s go the other way. Write a function function_to_matrix which takes a Python function f—assumed to be a linear map from R2 to R2—and returns the 2 × 2 matrix representing that linear map.
Python
    # For example if you had defined
    #
    # def L(v):
    #     return [2*v[0]+3*v[1], -4*v[0]]
    #
    # Then function_to_matrix(L) is [[2,3],[-4,0]]

    # You may assume that L takes [x,y] to another list with two entries
    # and you may assume that L is linear

    def function_to_matrix(L):
        # your code here
        return # the matrix

    def validator():
        M = function_to_matrix( lambda v: [3*v[0]+5*v[1], -2*v[0] + 4*v[1]] )
        if (M[0][0] != 3):
            return False
        M = function_to_matrix( lambda v: [2*v[0]-3*v[1], -7*v[0] - 5*v[1]] )
        if (M[1][0] != -7):
            return False
        M = function_to_matrix( lambda v: [v[0]+7*v[1], 3*v[0] - 2*v[1]] )
        if (M[1][1] != -2):
            return False
        return True
Great work! If you like, you can try to compute function_to_matrix(matrix_to_function(M)). You should get back M .
16
An inner product space
The dot product provides a way to compute lengths and angles.
In order to do geometry in Rn, we will want to be able to compute the length of a vector, and the angle between two vectors. Miraculously, a single operation will allow us to compute both quantities.
17
Covectors
A covector eats vectors and provides numbers.
Definition 1 A covector on Rn is a linear map Rn → R. As a matrix, it is a single row of length n.
Example 2 $\begin{bmatrix}2 & -1 & 3\end{bmatrix}$ is the matrix of a covector on R3.
Question 3 Solution Hint: $\begin{bmatrix}2 & -1 & 3\end{bmatrix}\begin{bmatrix}3\\5\\7\end{bmatrix} = 2(3) + (-1)(5) + 3(7) = 22$
Now we can do this a bit more abstractly.
Hint: $\begin{bmatrix}x & y & z\end{bmatrix}\begin{bmatrix}a\\b\\c\end{bmatrix} = ax + by + cz$
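There is a natural way to turn a vector into a covector, or a covector into a vector: just turn the matrix 90° one direction or the other!
Definition 4 We define the transpose of a vector $\vec{v} = \begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}$ to be the covector $\vec{v}^\top$ with matrix $\begin{bmatrix}x_1 & x_2 & \cdots & x_n\end{bmatrix}$.
Similarly we define the transpose of a covector $\omega$ with matrix $\begin{bmatrix}x_1 & x_2 & \cdots & x_n\end{bmatrix}$ to be the vector $\omega^\top$ with matrix $\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix}$.
Question 5 Suppose $\vec{v} = \begin{bmatrix}1\\4\\3\end{bmatrix}$. What is $(\vec{v}^\top)^\top$? Solution
(a) $(\vec{v}^\top)^\top = \begin{bmatrix}1\\4\\3\end{bmatrix}$ X
(b) $(\vec{v}^\top)^\top = \begin{bmatrix}1 & 4 & 3\end{bmatrix}$
Indeed, $(\vec{v}^\top)^\top = \vec{v}$ and $(\omega^\top)^\top = \omega$ for any vector $\vec{v}$ and covector $\omega$.
Let $v = \begin{bmatrix}5\\3\\1\end{bmatrix}$ and $w = \begin{bmatrix}2\\-2\\7\end{bmatrix}$.
What is $v^\top(w)$? Solution Hint: $v^\top(w) = \begin{bmatrix}5 & 3 & 1\end{bmatrix}\begin{bmatrix}2\\-2\\7\end{bmatrix} = 5(2) + 3(-2) + 1(7) = 11$
What is $wv^\top$? Solution Hint: $w(v^\top) = \begin{bmatrix}2\\-2\\7\end{bmatrix}\begin{bmatrix}5 & 3 & 1\end{bmatrix} = \begin{bmatrix}10 & 6 & 2\\-10 & -6 & -2\\35 & 21 & 7\end{bmatrix}$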
18
Dot product
The standard inner product is the dot product.
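Definition 1 Given two vectors $\vec{v}, \vec{w}$ ∈ Rn, we define their standard inner product $\langle\vec{v}, \vec{w}\rangle$ by $\langle\vec{v}, \vec{w}\rangle = \vec{v}^\top(\vec{w})$ ∈ R. We sometimes use the notation $\vec{v} \cdot \vec{w}$ for $\langle\vec{v}, \vec{w}\rangle$, and call the operation the dot product.
Warning 2 Note that $\vec{v}^\top(\vec{w}) \neq \vec{w}(\vec{v}^\top)$: one is a number, while the other is an n × n matrix.
Question 3 Make sure for yourself, by using the definition, that $\begin{bmatrix}x_1\\x_2\\\vdots\\x_n\end{bmatrix} \cdot \begin{bmatrix}y_1\\y_2\\\vdots\\y_n\end{bmatrix} = x_1y_1 + x_2y_2 + x_3y_3 + \cdots + x_ny_n$.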
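The difference between the number and the n × n matrix in the warning can be seen in a few lines of code. This is only an illustrative sketch: the function names inner and outer are hypothetical, vectors are stored as plain lists, and the sample vectors are the ones used in the covector section.

```python
def inner(v, w):
    """v^T w: a covector applied to a vector, which is a single number."""
    return sum(a * b for a, b in zip(v, w))

def outer(w, v):
    """w v^T: a column times a row, which is a full matrix (list of rows)."""
    return [[a * b for b in v] for a in w]

v = [5, 3, 1]
w = [2, -2, 7]
print(inner(v, w))  # 11
print(outer(w, v))  # [[10, 6, 2], [-10, -6, -2], [35, 21, 7]]
```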
Prove the following facts about the dot product, where $\vec{u}, \vec{v}, \vec{w}$ ∈ Rn and a ∈ R:
(a) $\vec{v} \cdot \vec{w} = \vec{w} \cdot \vec{v}$ (The dot product is commutative)
(b) $(\vec{u} + \vec{v}) \cdot \vec{w} = \vec{u} \cdot \vec{w} + \vec{v} \cdot \vec{w}$ and $(a\vec{v}) \cdot \vec{w} = a(\vec{v} \cdot \vec{w})$ (The dot product is linear in the first argument)
(c) $\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$ and $\vec{v} \cdot (a\vec{w}) = a(\vec{v} \cdot \vec{w})$ (The dot product is linear in the second argument)
(d) $\vec{v} \cdot \vec{v} \geq 0$ (We say that the dot product is “positive definite”)
(e) if $\vec{v} \cdot \vec{z} = 0$ for all $\vec{z}$ ∈ Rn, then $\vec{v} = \vec{0}$ (The dot product is nondegenerate)
1. $\vec{v} \cdot \vec{w} = v_1w_1 + v_2w_2 + \cdots + v_nw_n = w_1v_1 + w_2v_2 + \cdots + w_nv_n = \vec{w} \cdot \vec{v}$, so the dot product is commutative.
(skipping item 2 for now)
3. $\vec{u} \cdot (\vec{v} + \vec{w}) = \vec{u}^\top(\vec{v} + \vec{w})$ by definition
$= \vec{u}^\top(\vec{v}) + \vec{u}^\top(\vec{w})$ since $\vec{u}^\top$ : Rn → R is linear
$= \vec{u} \cdot \vec{v} + \vec{u} \cdot \vec{w}$ by definition
and
$\vec{u} \cdot (a\vec{w}) = \vec{u}^\top(a\vec{w})$ by definition
$= a\,\vec{u}^\top(\vec{w})$ since $\vec{u}^\top$ : Rn → R is linear
$= a\,\vec{u} \cdot \vec{w}$ by definition
2. follows from 3 and 1
4. $\vec{v} \cdot \vec{v} = v_1^2 + v_2^2 + v_3^2 + \cdots + v_n^2$, and the square of a real number is nonnegative, so the sum of these squares is also nonnegative.
5. is perhaps the trickiest fact to prove. Observe that if $\vec{v} \cdot \vec{z} = 0$ for every $\vec{z}$ ∈ Rn, then this formula is true in particular for $\vec{z} = \vec{e}_j$. But $\vec{v} \cdot \vec{e}_j = v_j$. Thus, by dotting with all of the standard basis vectors, we see that every coordinate of $\vec{v}$ must be 0. Thus $\vec{v}$ is the zero vector.
The fact that the dot product is linear in two separate vector variables means that it is an example of a “bilinear form”. We will make a careful study of bilinear forms later in this course: it will turn out that the second derivative of a multivariable function gives a bilinear form at each point.
So far, the inner product feels like it belongs to the realm of pure algebra. In the next few exercises, we will start to see some hints of its geometric meaning.
Question 4 Let $\vec{v} = \begin{bmatrix}5\\1\end{bmatrix}$. Solution Hint: $\langle\vec{v}, \vec{v}\rangle = 5^2 + 1^2 = 26$
Let’s think about this a bit more abstractly. Set $\vec{v} = \begin{bmatrix}x\\y\end{bmatrix}$. Solution Hint: $\langle\vec{v}, \vec{v}\rangle = x^2 + y^2$
Notice that the length of the line segment from (0, 0) to (x, y) is $\sqrt{x^2 + y^2}$ by the Pythagorean theorem.
19
Length
The inner product provides a way to measure the length of a vector.
You should have discovered that v · v is the square of the length of the vector v when viewed as an arrow based at the origin. So far, you have only shown this in the 2-dimensional case. See if you can do it in three dimensions.
Show that the length of the line segment from (0, 0, 0) to (x, y, z) is $\sqrt{\vec{v} \cdot \vec{v}}$, where $\vec{v} = \begin{bmatrix}x\\y\\z\end{bmatrix}$.
Until now, you may not have seen a treatment of length in higher dimensions. Generalizing the results above, we define:
Definition 1 The length of a vector $\vec{v}$ ∈ Rn is defined by $|\vec{v}| = \sqrt{\vec{v} \cdot \vec{v}}$.
Question 2 Solution The length of the vector $\begin{bmatrix}6\\2\\3\\1\end{bmatrix}$ is $\sqrt{6^2 + 2^2 + 3^2 + 1^2}$
Question 3 Solution
Hint: By the Pythagorean theorem, we can see that the distance is $\sqrt{(5 - 2)^2 + (9 - 3)^2}$
Hint: We could also view this as the length of the vector $\begin{bmatrix}3\\6\end{bmatrix}$ which “points” from (2, 3) to (5, 9).
The distance between the points (2, 3) and (5, 9) is $\sqrt{3^2 + 6^2}$
Definition 4 The distance between two points p and q in Rn is defined to be the length of the “displacement” vector $\vec{p} - \vec{q}$.
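The two definitions translate almost word for word into code. The following is a minimal sketch with vectors and points stored as lists; the names length and distance are chosen here for illustration and are not part of the course's exercises.

```python
import math

def length(v):
    """|v| = sqrt(v . v)"""
    return math.sqrt(sum(x * x for x in v))

def distance(p, q):
    """Length of the displacement vector p - q."""
    return length([a - b for a, b in zip(p, q)])

print(length([6, 2, 3, 1]))                      # sqrt(50), about 7.07
print(distance([2, 3], [5, 9]))                  # sqrt(45), about 6.71
print(distance([2, 7, 3, 1], [5, 6, 9, 8]))      # sqrt(95), about 9.75
```

Note that distance(p, q) and distance(q, p) agree, since the displacement only changes sign and every coordinate gets squared.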
Question 5 Solution
Hint: The displacement vector between these points is $\begin{bmatrix}5 - 2\\6 - 7\\9 - 3\\8 - 1\end{bmatrix} = \begin{bmatrix}3\\-1\\6\\7\end{bmatrix}$
Hint: The length of the displacement vector is $\sqrt{3^2 + (-1)^2 + 6^2 + 7^2}$
The distance between the points (2, 7, 3, 1) and (5, 6, 9, 8) is $\sqrt{3^2 + (-1)^2 + 6^2 + 7^2}$
Question 6 Write an equation for the sphere centered at (0, 0, 0, 0) in R4 of radius r using the coordinates x, y, z, w on R4.
Hint: For a point p = (x, y, z, w) to be on the sphere of radius r centered at (0, 0, 0, 0), the distance from p to the origin must be r
Hint: $r = \sqrt{x^2 + y^2 + z^2 + w^2}$
Hint: $x^2 + y^2 + z^2 + w^2 = r^2$
Question 7 Write an inequality stating that the point (x, y, z, w) is more than 4 units away from the point (2, 3, 1, 9)
Solution
Hint: The distance between the point (x, y, z, w) and (2, 3, 1, 9) is $\sqrt{(x - 2)^2 + (y - 3)^2 + (z - 1)^2 + (w - 9)^2}$.
Hint: So we need $\sqrt{(x - 2)^2 + (y - 3)^2 + (z - 1)^2 + (w - 9)^2} > 4$
$\sqrt{(x - 2)^2 + (y - 3)^2 + (z - 1)^2 + (w - 9)^2} > 4$
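Prove that $|a\vec{v}| = |a||\vec{v}|$ for every a ∈ R.
Warning 8 These two uses of | · | are distinct: |a| means the absolute value of a, and $|\vec{v}|$ is the length of $\vec{v}$.
$|a\vec{v}| = \sqrt{\langle a\vec{v}, a\vec{v}\rangle}$ by definition
$= \sqrt{a^2\langle\vec{v}, \vec{v}\rangle}$ by the linearity of the inner product in each slot
$= \sqrt{a^2}\sqrt{\langle\vec{v}, \vec{v}\rangle}$
$= |a||\vec{v}|$ since $\sqrt{a^2} = |a|$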
20
Angles
Dot products can be used to compute angles.
Question 1 Give a vector of length 1 which points in the same direction as $\vec{v} = \begin{bmatrix}1\\2\end{bmatrix}$ (i.e. is a positive multiple of $\vec{v}$). Solution
Hint: Remember that you just argued that $|a\vec{v}| = |a||\vec{v}|$ for any a ∈ R. What positive a could you choose to make $|a||\vec{v}| = 1$?
Hint: We need to take $a = \frac{1}{|\vec{v}|}$
Hint: The length of $\vec{v}$ is $\sqrt{1^2 + 2^2} = \sqrt{5}$
Hint: The vector $\begin{bmatrix}\frac{1}{\sqrt{5}}\\\frac{2}{\sqrt{5}}\end{bmatrix}$ points in the same direction as $\vec{v}$, but has length 1.
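Scaling by 1/|v| works for any nonzero vector, not just this one. Here is a small sketch of that recipe; the function name normalize is a hypothetical choice, and the zero vector is rejected because it has no direction to preserve.

```python
import math

def normalize(v):
    """Return the positive multiple of v that has length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("the zero vector cannot be scaled to length 1")
    return [x / norm for x in v]

u = normalize([1, 2])
print(u)  # approximately [0.447, 0.894], i.e. [1/sqrt(5), 2/sqrt(5)]
```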
Now that we understand the relationship between the inner product and length of vectors, we will attempt to establish a connection between the inner product and the angle between two vectors.
Do you remember the law of cosines? It states the following:
Theorem 2 If a triangle has side lengths a, b, and c, then $c^2 = a^2 + b^2 - 2ab\cos(\theta)$, where θ is the angle opposite the side with length c.
Prove the law of cosines. You may want to read the lovely proof at mathproofs¹. You can find a beautiful proof here².
We can rephrase this in terms of vectors, since geometrically if $\vec{v}$ and $\vec{w}$ are vectors, the third side of the triangle is the vector $\vec{w} - \vec{v}$.
Theorem 3 For any two vectors v, w ∈ Rn, $|w - v|^2 = |w|^2 + |v|^2 - 2|v||w|\cos(\theta)$, where θ is the angle between v and w.
(For you sticklers, this is really being taken as the definition of the angle between two vectors in arbitrary dimension.)
Rewrite the theorem above by using our definition of length in terms of the dot product. Performing some algebra you should obtain a nice expression for v · w in terms of |v|, |w|, and cos(θ).
1 http://mathproofs.blogspot.com/2006/06/law-of-cosines.html
2
$|w - v|^2 = |v|^2 + |w|^2 - 2|v||w|\cos(\theta)$
$\langle w - v, w - v\rangle = |v|^2 + |w|^2 - 2|v||w|\cos(\theta)$
$\langle w, w - v\rangle - \langle v, w - v\rangle = |v|^2 + |w|^2 - 2|v||w|\cos(\theta)$ by the linearity of the inner product in the first slot
$\langle w, w\rangle - \langle w, v\rangle - \langle v, w\rangle + \langle v, v\rangle = |v|^2 + |w|^2 - 2|v||w|\cos(\theta)$ by the linearity of the inner product in the second slot
$|w|^2 - 2\langle v, w\rangle + |v|^2 = |v|^2 + |w|^2 - 2|v||w|\cos(\theta)$
$\langle v, w\rangle = |v||w|\cos(\theta)$
You should have discovered the following theorem:
Theorem 4 For any two vectors v, w ∈ Rn, v · w = |v||w| cos(θ). In words, the dot product of two vectors is the product of the lengths of the two vectors, times the cosine of the angle between them.
This gives an almost totally geometric picture of the dot product: Given two vectors $\vec{v}$ and $\vec{w}$, $|\vec{v}\cos(\theta)|$ can be viewed as the length of the projection of $\vec{v}$ onto the line containing $\vec{w}$. So $|\vec{v}||\vec{w}|\cos(\theta)$ is the “length of the projection of $\vec{v}$ in the direction of $\vec{w}$ times the length of $\vec{w}$”.
As mentioned above, this theorem is really being used to define the angle between two vectors. This is not quite rigorous: how do we even know that $\frac{v \cdot w}{|v||w|}$ is even between −1 and 1, so that it could be the cosine of an angle? This is clear from the “Euclidean Geometry” perspective, but not as clear from the “Cartesian Geometry” perspective. To make sure that everything is okay, we prove the “Cauchy-Schwarz” theorem which reconciles these two worlds.
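Solving Theorem 4 for θ gives a recipe for computing angles in any dimension. The sketch below is one way to write it, under two assumptions: both vectors are nonzero, and the quotient is clamped to [−1, 1] only to guard against floating-point rounding (the mathematics guarantees it already lies in that range). The function name angle is hypothetical.

```python
import math

def angle(v, w):
    """Angle between nonzero vectors v and w, from v . w = |v||w| cos(theta)."""
    dot = sum(a * b for a, b in zip(v, w))
    norm_v = math.sqrt(sum(a * a for a in v))
    norm_w = math.sqrt(sum(b * b for b in w))
    cosine = dot / (norm_v * norm_w)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding just outside [-1, 1]
    return math.acos(cosine)

print(angle([1, 0], [0, 1]))        # pi/2, about 1.5708
print(angle([2, 3, 1], [1, 1, 1]))  # arccos(6 / sqrt(42))
```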
21
Cauchy-Schwarz
The Cauchy-Schwarz inequality relates the inner product and the norm of the two vectors.
This is the Cauchy-Schwarz inequality.
Theorem 1 |v · w| ≤ |v||w| for any two vectors v, w ∈ Rn
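Proof If $\vec{v}$ or $\vec{w}$ is the zero vector, the result is trivial. So assume $\vec{v} \neq \vec{0}$ and $\vec{w} \neq \vec{0}$. Start by noting that $\langle v - w, v - w\rangle \geq 0$. Expanding this out, we have:
$\langle v, v\rangle - 2\langle v, w\rangle + \langle w, w\rangle \geq 0$
$2\langle v, w\rangle \leq \langle v, v\rangle + \langle w, w\rangle$
Now, if $\vec{v}$ and $\vec{w}$ are unit vectors, this says that
$2\langle\vec{v}, \vec{w}\rangle \leq 2$
$\langle\vec{v}, \vec{w}\rangle \leq 1$
Now to prove the result for any pair of nonzero vectors, simply scale them to make them unit vectors:
$\left\langle\frac{1}{|\vec{v}|}\vec{v}, \frac{1}{|\vec{w}|}\vec{w}\right\rangle \leq 1$
$\langle v, w\rangle \leq |v||w|$
We are not quite done with the proof, because we have not proven that $v \cdot w \geq -|v||w|$. Following the same basic outline, try to prove the other half of this inequality below.
Start by noting that $\langle v + w, v + w\rangle \geq 0$. Expanding this out, we have:
$\langle v, v\rangle + 2\langle v, w\rangle + \langle w, w\rangle \geq 0$
$2\langle v, w\rangle \geq -\langle v, v\rangle - \langle w, w\rangle$
Now, if $\vec{v}$ and $\vec{w}$ are unit vectors, this says that
$2\langle\vec{v}, \vec{w}\rangle \geq -2$
$\langle\vec{v}, \vec{w}\rangle \geq -1$
Now to prove the result for any pair of nonzero vectors, simply scale them to make them unit vectors:
$\left\langle\frac{1}{|\vec{v}|}\vec{v}, \frac{1}{|\vec{w}|}\vec{w}\right\rangle \geq -1$
$\langle v, w\rangle \geq -|v||w|$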
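A numerical experiment is no substitute for the proof, but it can be reassuring to watch the inequality hold on many random vectors. The following sketch does that; the helper names, the fixed random seed, the ranges of the random entries, and the small tolerance (to absorb floating-point rounding) are all choices made for this illustration.

```python
import math
import random

def dot(v, w):
    return sum(a * b for a, b in zip(v, w))

def length(v):
    return math.sqrt(dot(v, v))

def cauchy_schwarz_holds(v, w, tol=1e-9):
    """Check |v . w| <= |v||w|, allowing a small floating-point tolerance."""
    return abs(dot(v, w)) <= length(v) * length(w) + tol

random.seed(0)  # fixed seed so the experiment is repeatable
results = []
for _ in range(1000):
    n = random.randint(1, 6)  # try several dimensions
    v = [random.uniform(-10, 10) for _ in range(n)]
    w = [random.uniform(-10, 10) for _ in range(n)]
    results.append(cauchy_schwarz_holds(v, w))
print(all(results))  # True
```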
In the next question, we ask you to fill in the details of an alternative proof which, while a little harder than the one above, is at least as beautiful.
Question 2 Start by noting that $\langle v - w, v - w\rangle \geq 0$. Expanding this out, we have:
$\langle v, v\rangle - 2\langle v, w\rangle + \langle w, w\rangle \geq 0$
$2\langle v, w\rangle \leq \langle v, v\rangle + \langle w, w\rangle$
Now notice that the left hand side is unaffected by scaling v by a scalar λ and w by $\frac{1}{\lambda}$, but the right hand side is! This allows us to breathe new life into the inequality: we know that for every scalar λ ∈ (0, ∞)
$2\langle v, w\rangle \leq \lambda^2|v|^2 + \frac{1}{\lambda^2}|w|^2$
This is somewhat miraculous: we have a stronger inequality than the one we started with “for free.”
This new inequality is strongest when the right hand side (RHS) is minimized. As it stands the RHS is just a function of one real variable λ.
Solution
Hint: We can minimize the right hand side using single variable calculus.
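Hint: Let $f(\lambda) = \lambda^2|v|^2 + \frac{1}{\lambda^2}|w|^2$. Then $f'(\lambda) = 2\lambda|v|^2 - \frac{2|w|^2}{\lambda^3}$
The minimum must occur where $f'$ vanishes
Hint: $f'(\lambda) = 0$, so $2\lambda|v|^2 - \frac{2|w|^2}{\lambda^3} = 0$, so $\lambda^4|v|^2 = |w|^2$, so $\lambda = \sqrt{\frac{|w|}{|v|}}$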
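The critical point found in the hints can be checked numerically. In the sketch below the lengths |v| = 3 and |w| = 5 are made-up sample values, and f is the function of λ from the hint; the code evaluates f at the claimed minimizer and compares it against a grid of other positive values of λ.

```python
import math

def f(lam, norm_v, norm_w):
    """Right-hand side lam^2 |v|^2 + |w|^2 / lam^2 as a function of lam > 0."""
    return lam ** 2 * norm_v ** 2 + norm_w ** 2 / lam ** 2

norm_v, norm_w = 3.0, 5.0              # hypothetical lengths |v| and |w|
lam_star = math.sqrt(norm_w / norm_v)  # critical point from the hint
best = f(lam_star, norm_v, norm_w)
print(best)  # 2|v||w|, which is 30 here up to rounding

# No other positive lam on this grid does better than lam_star.
grid = [0.1 * k for k in range(1, 100)]
print(all(f(lam, norm_v, norm_w) >= best - 1e-9 for lam in grid))  # True
```

Substituting the minimizer by hand gives |v||w| + |v||w|, which is why the printed value is twice the product of the two lengths.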
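Hint: $\begin{bmatrix}2\\3\\1\end{bmatrix} \cdot \begin{bmatrix}1\\1\\1\end{bmatrix} = 2(1) + 3(1) + 1(1) = 6$
Hint: $|\vec{v}| = \sqrt{\vec{v} \cdot \vec{v}} = \sqrt{14}$
Hint: $|\vec{w}| = \sqrt{\vec{w} \cdot \vec{w}} = \sqrt{3}$
Hint: Thus, $6 = \sqrt{14}\sqrt{3}\cos(\theta)$
Hint: Therefore, $\theta = \arccos\left(\frac{6}{\sqrt{42}}\right)$
The angle between the vectors $\vec{v} = \begin{bmatrix}2\\3\\1\end{bmatrix}$ and $\vec{w} = \begin{bmatrix}1\\1\\1\end{bmatrix}$ is arccos(6/(sqrt(14)*sqrt(3)))
This problem probably would have stumped you before you started this activity!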
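Question 4 Find a vector which is perpendicular to $\vec{w} = \begin{bmatrix}2\\3\\1\end{bmatrix}$. Solution
Hint: For $\vec{v}$ to be perpendicular to (2, 3, 1), we would need that the angle between $\vec{v}$ and $\vec{w}$ is $\frac{\pi}{2}$ (or $-\frac{\pi}{2}$). In either case $\vec{v} \cdot \vec{w} = |\vec{v}||\vec{w}|\cos\left(\pm\frac{\pi}{2}\right) = 0$. So we need to find a vector for which $\vec{v} \cdot \vec{w} = 0$
Hint: Let $\vec{v} = \begin{bmatrix}x\\y\\z\end{bmatrix}$. Then $\vec{v} \cdot \vec{w} = 0$ means $\begin{bmatrix}x\\y\\z\end{bmatrix} \cdot \begin{bmatrix}2\\3\\1\end{bmatrix} = 0$, that is, $2x + 3y + z = 0$
Hint: There are a whole lot of choices for x, y, and z that fit these criteria (in fact there is an entire plane of vectors perpendicular to $\vec{w}$)
Hint: $\begin{bmatrix}0\\1\\-3\end{bmatrix}$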
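Question 5 Find a vector $\vec{u}$ which is perpendicular to both $\vec{v} = \begin{bmatrix}2\\3\\1\end{bmatrix}$ and $\vec{w} = \begin{bmatrix}5\\9\\2\end{bmatrix}$. Solution
Hint: We need both $\vec{u} \cdot \vec{v} = 0$ and $\vec{u} \cdot \vec{w} = 0$
Hint: Letting $\vec{u} = \begin{bmatrix}x\\y\\z\end{bmatrix}$, we have the conditions $\begin{cases}2x + 3y + z = 0\\5x + 9y + 2z = 0\end{cases}$
Hint: $\begin{cases}4x + 6y + 2z = 0\\5x + 9y + 2z = 0\end{cases}$ $\begin{cases}x + 3y = 0\\5x + 9y + 2z = 0\end{cases}$
Hint: Picking whatever you like for x, you should be able to find the other values now. Try x = 3.
Hint: $\begin{bmatrix}3\\-1\\-3\end{bmatrix}$ works.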
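The elimination in the hints can be followed mechanically: once x is chosen, x + 3y = 0 determines y, and 2x + 3y + z = 0 then determines z. The sketch below does exactly that and checks both dot products. The function name perpendicular_to_both is hypothetical, and exact fractions are used (an implementation choice) so that the dot products come out as exactly zero.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def perpendicular_to_both(x):
    """Follow the hints: x + 3y = 0 fixes y, then 2x + 3y + z = 0 fixes z."""
    x = Fraction(x)
    y = -x / 3
    z = -2 * x - 3 * y
    return [x, y, z]

u = perpendicular_to_both(3)
print([int(c) for c in u])                   # [3, -1, -3]
print(dot(u, [2, 3, 1]), dot(u, [5, 9, 2]))  # 0 0
```

Any other choice of x gives a multiple of the same vector, since the two conditions cut out a line through the origin.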
Prove the “Triangle inequality”: For any two vectors $\vec{v}, \vec{w}$ ∈ Rn, $|\vec{v} + \vec{w}| \leq |\vec{v}| + |\vec{w}|$. Draw a picture. Why is this called the triangle inequality?
The inequality is equivalent to $|\vec{v} + \vec{w}|^2 \leq (|\vec{v}| + |\vec{w}|)^2$, which is easier to handle because it does not involve square roots.
22
Multiplying matrices using dot products
There is a quick way to multiply matrices using dot products
Question 1 Let $M = \begin{bmatrix}2 & 3\\4 & 5\\1 & 2\end{bmatrix}$, and $\vec{e}_2 = \begin{bmatrix}0\\1\\0\end{bmatrix}$. Solution Hint: $\vec{e}_2^\top M = \begin{bmatrix}0 & 1 & 0\end{bmatrix}\begin{bmatrix}2 & 3\\4 & 5\\1 & 2\end{bmatrix} = \begin{bmatrix}4 & 5\end{bmatrix}$
Did you notice how multiplying by $\vec{e}_2^\top$ on the left selected the 2nd row of M?
Prove that if M is an m × n matrix and $\vec{e}_j$ ∈ Rm is the jth standard basis vector of Rm, then $\vec{e}_j^\top M$ is the jth row of M. We know that $\vec{w} = \vec{e}_j^\top M$ is a covector (row) just by looking at dimensions. What is the ith entry of this row? Well, we can only figure that out by applying the map to the basis vectors. $\vec{e}_j^\top M\vec{e}_i$ is the dot product of $\vec{e}_j$ with the ith column of M. But that just selects the jth element of that column. So the ith element of $\vec{w}$ is the jth element of the ith column of M. This just says that $\vec{w}$ is the jth row of M. (Whew.)
Now we can use this observation to great effect. If M is an m × n matrix, $\vec{e}_j$ is the jth standard basis vector of Rm and $\vec{b}_k$ is the kth standard basis vector of Rn, then we can select $M_{j,k}$ by performing the operation $\vec{e}_j^\top M\vec{b}_k$. This is so important we will label it as a theorem:
Theorem 2 If M is an m × n matrix, $\vec{e}_j$ is the jth standard basis vector of Rm and $\vec{b}_k$ is the kth standard basis vector of Rn, then $M_{j,k} = \vec{e}_j^\top M\vec{b}_k$.
Proof The proof is simply that $M\vec{b}_k$ is by definition the kth column of the matrix, and by our observation above $\vec{e}_j^\top M\vec{b}_k$ must be the jth row of that column vector, which consists of the single number $M_{j,k}$.
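Theorem 2 can be tried out directly: hit the matrix with a basis vector on the right to pull out a column, then dot with a basis vector on the left to pull out one entry of that column. This is only a sketch; the helper names are hypothetical, matrices are lists of rows, and the indices j and k count from 1 as in the theorem.

```python
def apply(M, v):
    """The column vector M v, with M stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def basis(n, k):
    """The k-th standard basis vector of R^n, counting k from 1."""
    return [1 if i == k - 1 else 0 for i in range(n)]

def entry_by_probing(M, j, k):
    """Compute M_{j,k} as e_j^T M b_k."""
    m, n = len(M), len(M[0])
    column_k = apply(M, basis(n, k))                          # M b_k is the k-th column
    return sum(a * b for a, b in zip(basis(m, j), column_k))  # e_j^T picks its j-th entry

M = [[4, 1, -2],
     [3, 1, 0]]
print(entry_by_probing(M, 2, 1))  # 3
```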
Question 3 Let $M = \begin{bmatrix}4 & 1 & -2\\3 & 1 & 0\end{bmatrix}$. Solution
Hint: By the above theorem, it will be the entry in the 2nd row and the 1st column of M
Hint: $\begin{bmatrix}0 & 1\end{bmatrix} M \begin{bmatrix}1\\0\\0\end{bmatrix} = 3$
The philosophical import of this theorem is that we can probe the inner structure of any matrix with simple row and column vectors to find out every component of the matrix. What happens when we apply this insight to a product of matrices?
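Question 4 Let $A = \begin{bmatrix}-1 & 1\\2 & 2\\3 & 0\end{bmatrix}$ and B = []. Let C = AB.
Solution
Hint: By the theorem above, $C_{2,3} = \begin{bmatrix}0 & 1 & 0\end{bmatrix} C \begin{bmatrix}0\\0\\1\\0\end{bmatrix}$
Hint: So $C_{2,3} = \begin{bmatrix}0 & 1 & 0\end{bmatrix} AB \begin{bmatrix}0\\0\\1\\0\end{bmatrix}$
Hint: But $\begin{bmatrix}0 & 1 & 0\end{bmatrix} A$ is the 2nd row of A, and $B\begin{bmatrix}0\\0\\1\\0\end{bmatrix}$ is the 3rd column of B
Hint: So $\begin{bmatrix}0 & 1 & 0\end{bmatrix} A = \begin{bmatrix}2 & 2\end{bmatrix}$ and $B\begin{bmatrix}0\\0\\1\\0\end{bmatrix} = \begin{bmatrix}1\\9\end{bmatrix}$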
Theorem 5 Let A and B be composable matrices. Let C = AB. Then $C_{i,j}$ is the product of the ith row of A with the jth column of B.
Prove this theorem. We can prove this by combining the other two theorems in this section. $C_{i,j} = \vec{e}_i^\top C\vec{e}_j$ by the second theorem. But C = AB, so we have $C_{i,j} = \vec{e}_i^\top AB\vec{e}_j$. By the first theorem $\vec{e}_i^\top A$ is the ith row of A, and by our definition of matrix multiplication, $B\vec{e}_j$ is the jth column of B. So $C_{i,j}$ is the product of the ith row of A with the jth column of B.
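Theorem 5 turns directly into a row-times-column procedure. The sketch below is one possible way to write it, assuming matrices stored as lists of rows; the function name multiply_by_dot_products is chosen for this illustration, and the sample matrices are the ones from the validator in the Python section.

```python
def multiply_by_dot_products(A, B):
    """C[i][j] is the dot product of row i of A with column j of B."""
    columns_of_B = [list(col) for col in zip(*B)]  # regroup B's entries by column
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_B]
            for row in A]

a = [[-2, 0], [-2, -3], [-1, 3]]
b = [[-3, 2, -1, -2], [3, 2, 1, 3]]
c = multiply_by_dot_products(a, b)
print(len(c), len(c[0]))  # 3 4
print(c[2][1])            # 4
```

The shapes explain the composability condition: each row of A must have as many entries as each column of B, or the dot products do not make sense.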
Now try multiplying some matrices of your choosing using this method. This is likely the definition of matrix multiplication you learned in high school (or the same thing defined by some messy formula with a Σ). Do you prefer this method? Or do you prefer whatever method you came up with on your own earlier? Maybe they are the same!
Another note: it is interesting that we are feeding two vectors ei and ej into
the matrix and getting out a number somehow. In week 4 we will learn that we are treading in deep water here: this is the very tip of the iceberg of bilinear forms, which are a kind of 2-tensor.
23
Limits
Limits are the difference between analysis and algebra
Limits are the backbone of calculus. Multivariable calculus is no different. In this section we will deal with limits on an intuitive level.
We will postpone the rigorous ε-δ analysis to the next section.
Definition 1 Let f : Rn → Rm and let p ∈ Rn. We say that $\lim_{x \to p} f(x) = L$ for some L ∈ Rm if, as x “gets arbitrarily close to” p, the points f(x) “get arbitrarily close to” L.
Definition 2 A function f : Rn → Rm is said to be continuous at a point p ∈ Rn if $\lim_{x \to p} f(x) = f(p)$.
Most functions defined by formulas are continuous where they are defined. For example, the function $f(x, y) = (\cos(xy + y^2), e^{\sin(x) + y} + y^2)$ is continuous because each component function is a string of composites of continuous functions. $f(x, y) = (xy, \cos(x)/(x + y))$ is continuous everywhere it is defined (it is not defined on the line y = −x, because the denominator of the second component function vanishes there). This is basically because all of the functions we have names for, like cos(x), sin(x), $e^x$, polynomials, and rational functions, are continuous, so if you can write down a function as a “single formula” it is probably continuous. The problematic points are basically just zeros of denominators, like our example above. Piecewise defined functions can also be problematic:
Argue intuitively that the function f : R2 → R defined by $f(x, y) = \begin{cases}0 & \text{if } x < y\\1 & \text{if } x \geq y\end{cases}$ is continuous at every point off the line y = x, and is discontinuous at every point on the line y = x.
For any point p which is not on the line y = x, there is a little neighborhood of p where f is a constant function (0 on the side where x < y, 1 on the side where x > y), which is known to be continuous. So f is continuous at p. For any point p on the line y = x, we get a different limit if we approach p along the line y = x (we get 1), versus approaching p from the region where x < y (we get 0).
If we are confronted with a limit like $\lim_{(x,y) \to (0,0)} \frac{x^2 + xy}{x + y}$, this is actually a little bit interesting. The function is not continuous at (0, 0), because it is not even defined at (0, 0). What is more, the numerator and denominator are both approaching 0, which each “pull” the limit in opposite directions. (Dividing by smaller and smaller numbers would tend to make the value larger and larger, while multiplying by smaller and smaller numbers has the opposite effect.) There are essentially two ways to work with this:
• show that it does not have a limit by finding two different ways of approaching (0, 0) which give different limiting values, or
• show that it does have a limit by rewriting the expression algebraically as a continuous function, and just plug in to get the value of the limit.
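Both strategies can be explored numerically before committing to an argument. The sketch below samples two functions from this section along two different paths into (0, 0): the piecewise function, where the paths disagree, and the rational expression, where they agree. Sampling a few paths is only evidence, not a proof, and the function names and the choice of paths are assumptions made for this illustration.

```python
def step(x, y):
    """The piecewise function: 0 if x < y, 1 if x >= y."""
    return 0 if x < y else 1

def g(x, y):
    """(x^2 + xy) / (x + y), undefined where x + y = 0."""
    return (x * x + x * y) / (x + y)

ts = [10.0 ** -k for k in range(1, 8)]  # parameter values shrinking toward 0

# Two ways of approaching (0, 0) for the piecewise function.
along_diagonal = [step(t, t) for t in ts]  # path y = x
along_y_axis = [step(0.0, t) for t in ts]  # path x = 0, y > 0
print(along_diagonal[-1], along_y_axis[-1])  # 1 0

# Two ways of approaching (0, 0) for the rational expression.
along_x_axis = [g(t, 0.0) for t in ts]   # path y = 0
along_line = [g(t, 2 * t) for t in ts]   # path y = 2x
print(along_x_axis[-1], along_line[-1])  # both are tiny, consistent with a limit of 0
```

Different limiting values along two paths rule a limit out; agreement along a few paths merely suggests that an algebraic simplification is worth looking for.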
Question 4 Consider $\lim_{(x,y) \to (0,0)} \frac{x^2 + xy}{x + y}$. Do you think the limit exists? Solution
Hint: This limit does exist, because it can be rewritten as a continuous function.
(a) Yes X (b) No
Solution Hint: $\lim_{(x,y) \to (0,0)} \frac{x^2 + xy}{x + y} = \lim_{(x,y) \to (0,0)} \frac{x(x + y)}{x + y}$
Hint: $\lim_{(x,y) \to (0,0)} \frac{x(x + y)}{x + y} = \lim_{(x,y) \to (0,0)} x = 0$
$\lim_{(x,y) \to (0,0)} \frac{x^2 + xy}{x + y} = 0$
Question 5 Consider $\lim_{(x,y) \to (3,3)} \frac{x^2 - 9}{xy - 3y}$. Do you think the limit exists? Solution
Hint: This limit does exist, because it can be rewritten as a continuous function.
(a) Yes X (b) No
Solution Hint: $\lim_{(x,y) \to (3,3)} \frac{x^2 - 9}{xy - 3y} = \lim_{(x,y) \to (3,3)} \frac{(x - 3)(x + 3)}{y(x - 3)}$