Functions of more than one variable
2. Partial Derivatives
The derivative f′(x) of a function of one variable, y = f(x), measures a rate of change: if we increase x by a small amount ∆x then y = f(x) also changes by a small amount ∆y. The ratio between these two changes is approximately the derivative: f′(x) ≈ ∆y/∆x.
For a function z = f(x, y) of two variables there is a similar concept: if we change x and/or y by a small amount then z will also change by a small amount, and there are formulas relating the changes ∆x, ∆y and ∆z. Because there are many different ways in which we can change x and y there are a few different formulas. We will encounter the following versions of “the derivative of f(x, y)”:
▶ Change only one of the variables but not the other: this leads to the so-called partial derivatives.
▶ Simultaneously vary both x and y: the resulting change turns out to be the sum of the changes we would get if we were to vary only x or only y, respectively. This will follow from the chain rule, and the resulting formula is called the total derivative.
We begin with the partial derivatives.
2.1. Definition of Partial Derivatives. If z = f(x, y) is a function of two variables then the partial derivatives of f with respect to x and with respect to y are
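(51) \frac{\partial f}{\partial x}(x, y) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x}
and
(52) \frac{\partial f}{\partial y}(x, y) = \lim_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}.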
The following more convenient notation is used very often (because it’s so much shorter):
(53) f_x(x, y) = \frac{\partial f}{\partial x}(x, y), \qquad f_y(x, y) = \frac{\partial f}{\partial y}(x, y).
When we are in a hurry we can also drop the “(x, y)” from our notation for derivatives and just write fx and fy.
When we define the partial derivatives at some point (x, y), we assume that the function is defined on some sufficiently small disc centered at that point (x, y).
Figure 2. The partial derivatives of a function at some point (x, y) measure how fast the function f(x, y) changes if we move the point either horizontally (the x direction) or vertically (the y direction): ∂f/∂x is the rate of change of f in the horizontal direction, and ∂f/∂y is the rate of change of f in the vertical direction.
2.2. Partial derivatives of functions of three or more variables. If a function depends on three or more variables then one can define its partial derivatives in the same way as for functions of two variables. For instance, if w = f(x, y, z) is a function of three variables, then its partial derivative with respect to x is defined to be
\frac{\partial f}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y, z) - f(x, y, z)}{\Delta x}.
The derivatives of f with respect to y and z have very similar definitions.
2.3. Examples. Computing partial derivatives is not harder than computing ordinary derivatives. To find the partial derivative of a function with respect to x we just pretend all other variables are constants and differentiate. Or, in other words, we could think of the partial derivative of f(x, y) with respect to x as the ordinary derivative of the function f in which we have frozen the variable y at some particular value.
For instance, the partial derivatives of the function f(x, y, z) = x² sin πy + z of three variables x, y, and z, are
f_x = 2x \sin \pi y, \qquad f_y = \pi x^2 \cos \pi y, \qquad f_z = 1.
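The "freeze the other variables" recipe can be checked numerically. The following is a minimal sketch, not part of the original notes: it compares the three formulas above with the difference quotient from the definition of the partial derivative, using a small but nonzero step. The sample point, the step size h, and the helper name `partial` are assumptions chosen for illustration.

```python
import math

def f(x, y, z):
    # The example function f(x, y, z) = x^2 sin(pi y) + z from the text.
    return x**2 * math.sin(math.pi * y) + z

def partial(func, point, index, h=1e-6):
    """Difference quotient in coordinate `index`; all other coordinates stay frozen.

    Hypothetical helper: it mimics the limit in the definition with a small
    fixed step h instead of actually taking the limit.
    """
    shifted = list(point)
    shifted[index] += h
    return (func(*shifted) - func(*point)) / h

point = (1.5, 0.25, 2.0)  # arbitrary sample point (an assumption)
x, y, z = point

# Formulas for fx, fy, fz stated in the text.
exact = (
    2 * x * math.sin(math.pi * y),
    math.pi * x**2 * math.cos(math.pi * y),
    1.0,
)
numeric = tuple(partial(f, point, i) for i in range(3))

for name, n, e in zip(("fx", "fy", "fz"), numeric, exact):
    print(f"{name}: difference quotient {n:.6f}, formula {e:.6f}")
```

The two columns should agree to several decimal places; the small remaining gap comes from using a finite step instead of the limit.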
3. Problems
1. For each of the following functions sketch the graph (use a graphing program, if necessary) and decide if you think the function has a limit as (x, y) approaches (0, 0).
(a) f(x, y) = \frac{xy}{x^2 + y^2}
(b) g(x, y) = \frac{1}{x^2 + y^2}
(c) h(x, y) = \frac{x}{x^2 + y^2}
(d) p(x, y) = \frac{x}{\sqrt{x^2 + y^2}}
(e) q(x, y) = \frac{x^2}{\sqrt{x^2 + y^2}}
2. Find the partial derivatives of the following functions:
(a) f(x, y) = x^2 y^3 - x^3 y^2
(b) f(x, y) = \cos(x^2 y) + y^3 •
(c) f(x, y) = \frac{xy}{x^2 + y} •
(d) f(x, t) = (x + t)^4
(e) f(x, t) = (x - t)^4
(f) f(x, t) = \sin \omega t \, \cos \frac{2\pi x}{L}
(g) f(x, y) = e^{x^2 + y^2} •
3. Let r be the radius in polar coordinates, as defined in §4 of Chapter III.
(a) Compute the partial derivatives of r.
(b) Show that the partial derivatives of r can be written as
4. Let θ be the polar angle function, defined in §4.2 of Chapter III.
(a) In the left half plane the function θ is defined by
θ(x, y) = arctan(y/x).
Use this expression to find its partial derivatives, ∂θ/∂x and ∂θ/∂y. •
(b) Check that the angle function also satisfies
x sin θ = y cos θ
at all points in the plane. Use implicit differentiation to find the partial derivatives ∂θ/∂x and ∂θ/∂y.
5. Let f(x, y) = the distance from (x, y) to the origin. Find a formula for f, and compute fx, fy, and √(fx² + fy²). (Hint: compare this problem with problem 3.3.) •
6. Suppose f(t) and g(t) are single variable differentiable functions. Find ∂z/∂x and ∂z/∂y for each of the following two variable functions.
(a) z = f(x)g(y) •
(b) z = f(xy) •
(c) z = f(x/y) •
7. Let f be the distance to the square Q function from problem 5.13. Find the partial derivatives fx and fy of f. (You will need your answer to problem 5.13, in particular the description of f as a “piecewise defined function”.)
4. The linear approximation to a function
4.1. The Chain Rule and friends. When we compute the partial derivative of a function with respect to a variable x we pretend all other variables are constants, and just differentiate with respect to x, just as we would in first semester calculus. There is therefore no need to state a product rule or quotient rule, because these are exactly the same as for functions of one variable. The chain rule on the other hand is different: there is a chain rule for functions of several variables, but it has more terms than the chain rule from one-variable calculus. There are several related topics that fit together in a discussion of the chain rule, namely Linear Approximation, Tangent Planes to a Graph, and The Total Derivative. We will go through these one at a time in the next few sections.
4.2. The linear approximation formula. The key to the chain rule is the linear approximation formula. This formula tells us approximately how much a function z = f(x, y) of two variables changes if both variables are subjected to a small change.
More precisely, if we have a function z = f(x, y), and we know its value f(x0, y0) at some point (x0, y0), then how much does the function value change if x is increased from x0 to x0 + ∆x, and if y is similarly increased from y0 to y0 + ∆y?
Figure 3. Computation of the linear approximation (54). We can change (x0, y0) to (x0 + ∆x, y0 + ∆y) in two steps: first keep y fixed and increase x by ∆x, then keep x fixed and increase y by ∆y. To express the change in function values in terms of derivatives, we can use the Mean Value Theorem. We get two intermediate points: one at x = x̃, namely (x̃, y0), for the increase in f when x changes, and one at y = ỹ, namely (x0 + ∆x, ỹ), for the increase in f when y changes.
The basic idea in the computation of the change in f(x, y) is to go from (x0, y0) to (x0 + ∆x, y0 + ∆y) in two steps:
(54) \Delta f = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0)
        = \underbrace{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0 + \Delta x, y_0)}_{\text{only } y \text{ changes}} + \underbrace{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}_{\text{only } x \text{ changes}}
We have written the total change in f as the sum of two changes, one of them caused by the change in x, and the other due to the change in y. See Figure 3.
In the second difference only x changes while y remains the same, so we can use the one variable Mean Value Theorem to conclude that there is some number x̃ between x0 and x0 + ∆x with
\frac{f(x_0 + \Delta x, y_0) - f(x_0, y_0)}{\Delta x} = f_x(\tilde{x}, y_0),
i.e.
(55) f(x_0 + \Delta x, y_0) - f(x_0, y_0) = f_x(\tilde{x}, y_0) \cdot \Delta x.
Likewise, in the difference in (54) where only y changes we can use the Mean Value Theorem to conclude that there is some ỹ between y0 and y0 + ∆y such that
\frac{f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0 + \Delta x, y_0)}{\Delta y} = f_y(x_0 + \Delta x, \tilde{y}),
and hence
(56) f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0 + \Delta x, y_0) = f_y(x_0 + \Delta x, \tilde{y}) \cdot \Delta y.
If we now combine (55) and (56) with (54) then we get
\Delta f = f_x(\tilde{x}, y_0) \cdot \Delta x + f_y(x_0 + \Delta x, \tilde{y}) \cdot \Delta y.
This equation is exactly true, i.e. we have not made any approximations, and we have not ignored any kind of “error terms.” However, the equation does contain the numbers x̃ and ỹ, which are provided by the Mean Value Theorem, and of which we therefore do not know anything besides the fact that x̃ lies between x0 and x0 + ∆x, and ỹ lies between y0 and y0 + ∆y. We can get rid of this uncertainty by settling for an approximation for ∆f instead of the exact expression we have just found. To do this we assume that ∆x and ∆y are “small.” Then, since x̃ lies between x0 and x0 + ∆x, we know that x̃ ≈ x0, so, if the function fx is continuous, then it seems reasonable to assume that
(57) f_x(\tilde{x}, y_0) \approx f_x(x_0, y_0).
Similarly, since x0 + ∆x ≈ x0 and ỹ ≈ y0, we will assume that
(58) f_y(x_0 + \Delta x, \tilde{y}) \approx f_y(x_0, y_0).
Substituting this in the exact expression for ∆f we find
(59) \Delta f \approx f_x(x_0, y_0) \Delta x + f_y(x_0, y_0) \Delta y.
Keeping in mind that ∆f = f(x0 + ∆x, y0 + ∆y) − f(x0, y0), we conclude
(60) f(x_0 + \Delta x, y_0 + \Delta y) \approx f(x_0, y_0) + f_x(x_0, y_0) \Delta x + f_y(x_0, y_0) \Delta y.
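The linear approximation formula (60) is often written using Leibniz-style notation for the derivatives, where one writes ∂f/∂x for fx, and ∂f/∂y for fy. In this notation the approximation formula takes these forms:
f(x_0 + \Delta x, y_0 + \Delta y) \approx f(x_0, y_0) + \frac{\partial f}{\partial x}(x_0, y_0) \cdot \Delta x + \frac{\partial f}{\partial y}(x_0, y_0) \cdot \Delta y,
or, shorter,
(61) \Delta f \approx \frac{\partial f}{\partial x} \Delta x + \frac{\partial f}{\partial y} \Delta y.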
The approximation (60) can also be written without ∆x and ∆y by a change of notation. To do this we introduce
(62) x = x_0 + \Delta x \quad \text{and} \quad y = y_0 + \Delta y,
and interpret (60) as a formula that tells us approximately what the function value at (x, y) is, provided (x, y) is close enough to (x0, y0). Written in terms of x and y, (60) says
(63) f(x, y) \approx f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).
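To see formula (63) at work, here is a small sketch that evaluates its right-hand side for one concrete function. The function f(x, y) = x²y³, the base point (1, 2) and the nearby point (1.01, 2.01) are assumptions chosen only for illustration; they do not come from the notes.

```python
def f(x, y):
    # Hypothetical sample function, chosen because its partials are easy to write down.
    return x**2 * y**3

def fx(x, y):
    # Partial derivative with respect to x (y treated as a constant).
    return 2 * x * y**3

def fy(x, y):
    # Partial derivative with respect to y (x treated as a constant).
    return 3 * x**2 * y**2

def linear_approx(x, y, x0, y0):
    """Right-hand side of the linear approximation formula (63)."""
    return f(x0, y0) + fx(x0, y0) * (x - x0) + fy(x0, y0) * (y - y0)

x0, y0 = 1.0, 2.0     # base point (an assumption)
x, y = 1.01, 2.01     # nearby point (an assumption)

exact = f(x, y)
approx = linear_approx(x, y, x0, y0)
print(f"exact value      : {exact:.6f}")
print(f"linear estimate  : {approx:.6f}")
print(f"difference       : {exact - approx:.6f}")
```

With these numbers the linear estimate is 8 + 16·0.01 + 12·0.01 = 8.28, and the exact value differs from it only in the third decimal place.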
4.3. Linear approximation – infinitesimal version. We expect the approximation in (61) to improve as we decrease ∆x and ∆y (and we will try to make this statement more precise in the next section, §4.4). We could then say, as is commonly done, that there is an exact equation when ∆x and ∆y are “infinitely small,” and write this equation as
(64) df = \frac{\partial f}{\partial x} dx + \frac{\partial f}{\partial y} dy.
The meaning of this equation is that infinitesimally small changes in x and y, of magnitudes dx and dy, respectively, lead to an infinitesimally small change in f of magnitude df, and that df, dx, and dy are related by (64). Even though it is very difficult to make sense of the “infinitely small” quantities dx, dy, df in (64), this notation is widely used, because the make-believe it entails allows one to ignore the more awkward error terms that we will now discuss.
4.4. The linear approximation formula with error term. In our computation of the change ∆f of the function we approximated fx(x̃, y0) by fx(x0, y0), and fy(x0 + ∆x, ỹ) by fy(x0, y0). As a result our linear approximation formula (60) is not an exact equation, but only says that one thing is “approximately equal” to another.
We can make this a bit more precise by including error terms, i.e. by saying that there are small numbers ex and ey such that
f_x(\tilde{x}, y_0) = f_x(x_0, y_0) + e_x, \quad \text{and} \quad f_y(x_0 + \Delta x, \tilde{y}) = f_y(x_0, y_0) + e_y.
Here ex and ey depend on ∆x and ∆y, and as both ∆x and ∆y go to zero, the errors ex and ey will also go to zero.
Putting this in (54) we get the linear approximation formula with error terms:
(65) f(x_0 + \Delta x, y_0 + \Delta y) = \underbrace{f(x_0, y_0) + f_x(x_0, y_0) \Delta x + f_y(x_0, y_0) \Delta y}_{\text{linear approximation}} + \underbrace{e_x \Delta x + e_y \Delta y}_{\text{error}}
in which ex and ey depend on ∆x, ∆y, and satisfy
\lim_{\Delta x, \Delta y \to 0} e_x = \lim_{\Delta x, \Delta y \to 0} e_y = 0.
If we ignore the “error term” then we recover the linear approximation formula (60). Our more precise linear approximation formula (65) tells us that the error in (60) (difference between left and right hand sides) is given by ex∆x + ey∆y, and that this error is “small” compared to ∆x and ∆y. We could write this as
Error in the approximation = e_x \Delta x + e_y \Delta y = o(\Delta x) + o(\Delta y).
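The statement that the error is small compared to ∆x and ∆y can be illustrated numerically. The sketch below divides the error in the linear approximation by |∆x| + |∆y| and watches the ratio shrink as the steps shrink. The function f(x, y) = eˣ sin y, the base point, and the choice ∆x = ∆y are all assumptions made for the illustration.

```python
import math

def f(x, y):
    # Hypothetical sample function with continuous partial derivatives.
    return math.exp(x) * math.sin(y)

def fx(x, y):
    return math.exp(x) * math.sin(y)

def fy(x, y):
    return math.exp(x) * math.cos(y)

def relative_error(x0, y0, dx, dy):
    """|error in the linear approximation| divided by (|dx| + |dy|)."""
    linear = f(x0, y0) + fx(x0, y0) * dx + fy(x0, y0) * dy
    error = f(x0 + dx, y0 + dy) - linear
    return abs(error) / (abs(dx) + abs(dy))

x0, y0 = 0.5, 1.0                    # base point (an assumption)
steps = (1e-1, 1e-2, 1e-3, 1e-4)     # shrinking values of dx = dy
ratios = [relative_error(x0, y0, s, s) for s in steps]

for s, r in zip(steps, ratios):
    print(f"dx = dy = {s:g}: error / (|dx| + |dy|) = {r:.3e}")
```

The error itself goes to zero, and so does the error divided by the size of the step, which is what the o(∆x) + o(∆y) notation expresses.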
5. The tangent plane to a graph
5.1. The tangent plane. For a function z = f(x, y) and a point (x0, y0) the linear approximation (63) gives us an approximation for the function f at any other point (x, y) near (x0, y0). It says
z \approx f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).
If we replace “≈” by equality, then we get a new function of (x, y):
(66) z = f(x_0, y_0) + f_x(x_0, y_0)(x - x_0) + f_y(x_0, y_0)(y - y_0).
Keeping in mind that f(x0, y0), fx(x0, y0), and fy(x0, y0) are constants, while only (x, y) are variables here, we see that this is the equation for a plane which we call the tangent plane to the graph of f at the point (x0, y0, f(x0, y0)).
5.2. Example: tangent plane to the saddle surface at the origin. Find the equation for the tangent plane to the saddle surface z = xy at the origin.
Solution: The saddle surface is the graph of the function f(x, y) = xy whose partial derivatives are fx(x, y) = y and fy(x, y) = x. To find the tangent plane at x0 = 0, y0 = 0, we compute the partial derivatives,
f_x(x, y) = \frac{\partial (xy)}{\partial x} = y, so at (x0, y0) = (0, 0) we have fx(0, 0) = 0,
and
f_y(x, y) = \frac{\partial (xy)}{\partial y} = x, so at (x0, y0) = (0, 0) we have fy(0, 0) = 0.
Figure 4. Top: The graph of the linear approximation of f (graph of f itself is not shown – see the bottom figure). If we increase x by ∆x, then f will increase by approximately fx∆x, and if we increase y by ∆y, then f increases by approximately fy∆y. If we increase x and y by ∆x and ∆y at the same time, then f increases by roughly fx∆x + fy∆y. The vertical dotted line behind the parallelogram represents this increase in f.
Bottom: The graph of a function, and of its tangent plane at some point (x0, y0, z0). The tangent plane is the graph of the linear approximation to f.
Moreover, we also have f(x0, y0) = f(0, 0) = 0, so that the equation for the tangent plane is
z = 0 + 0 \cdot (x - 0) + 0 \cdot (y - 0) = 0, i.e.,
z = 0.
The tangent plane at the origin is just the xy-plane.
5.3. Example: another tangent plane to the saddle surface. Find the equation for the tangent plane to the saddle surface z = xy at the point (2, 1, 2). Where does this plane intersect the coordinate axes?
Solution: This is almost the same problem as before. The only difference is that we are trying to find the tangent plane at a point other than the origin. To get the tangent plane at the point (x0, y0) = (2, 1) we compute the derivatives
f_x(x, y) = y \implies f_x(2, 1) = 1,
Figure 5. The graph of z = xy and the tangent plane at the origin.
and
f_y(x, y) = x \implies f_y(2, 1) = 2.
The equation for the tangent plane is therefore
(67) z = x_0 y_0 + y_0 (x - x_0) + x_0 (y - y_0)
       = 2 + 1 \cdot (x - 2) + 2 \cdot (y - 1)
       = -2 + x + 2y.
The intersections with the x, y and z axes are, respectively, (2, 0, 0), (0, 1, 0), and (0, 0, −2).
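The computation in this example follows a fixed recipe, which can be sketched in a few lines of code. This is an illustration only: the helper name `tangent_plane` and the choice to return the plane as coefficients (a, b, c) of z = a + bx + cy are assumptions, not notation from the notes.

```python
def f(x, y):
    # The saddle surface z = xy.
    return x * y

def fx(x, y):
    # Partial derivative of xy with respect to x.
    return y

def fy(x, y):
    # Partial derivative of xy with respect to y.
    return x

def tangent_plane(x0, y0):
    """Coefficients (a, b, c) of the tangent plane written as z = a + b*x + c*y.

    Obtained by expanding z = f(x0, y0) + fx(x0, y0)(x - x0) + fy(x0, y0)(y - y0).
    """
    b = fx(x0, y0)
    c = fy(x0, y0)
    a = f(x0, y0) - b * x0 - c * y0
    return a, b, c

a, b, c = tangent_plane(2.0, 1.0)
print(f"tangent plane: z = {a} + {b}*x + {c}*y")

# Intersections with the coordinate axes (this step needs b and c to be nonzero).
x_intercept = -a / b   # set y = 0 and z = 0
y_intercept = -a / c   # set x = 0 and z = 0
z_intercept = a        # set x = 0 and y = 0
print(f"axis intersections: ({x_intercept}, 0, 0), (0, {y_intercept}, 0), (0, 0, {z_intercept})")
```

For the point (2, 1) this reproduces z = −2 + x + 2y and the three intersection points found above. At the origin the same function returns (0, 0, 0), i.e. the plane z = 0, for which the intercept step does not apply.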
5.4. Example: tangent plane to a sphere. The point (x0, y0, z0) lies on the upper half of the sphere with radius 4 centered at the origin. Find an equation for the tangent plane to the sphere at that point, if x0 = 1 and y0 = 3.
Solution: The equation for the sphere is x² + y² + z² = 4² = 16, so the upper half is the graph of the function
f(x, y) = \sqrt{16 - x^2 - y^2}.
The z coordinate of the given point is therefore z0 = √(16 − 1² − 3²) = √6. The partial derivatives of f at (x0, y0) = (1, 3) are
\frac{\partial f}{\partial x} = \frac{-x_0}{\sqrt{16 - x_0^2 - y_0^2}} = -\frac{1}{\sqrt{6}}, \qquad \frac{\partial f}{\partial y} = \frac{-y_0}{\sqrt{16 - x_0^2 - y_0^2}} = -\frac{3}{\sqrt{6}}.
The equation for the tangent plane is then
z = \sqrt{6} - \frac{1}{\sqrt{6}}(x - 1) - \frac{3}{\sqrt{6}}(y - 3)
  = \frac{16}{\sqrt{6}} - \frac{x}{\sqrt{6}} - \frac{3y}{\sqrt{6}}.
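As a sanity check on the arithmetic, the following sketch recomputes the ingredients of this example and compares the tangent plane with the sphere at a nearby point. The nearby point (1.01, 3.01) is an arbitrary choice made for the illustration.

```python
import math

def f(x, y):
    # Upper half of the sphere of radius 4 centered at the origin.
    return math.sqrt(16 - x**2 - y**2)

x0, y0 = 1.0, 3.0
z0 = f(x0, y0)        # should equal sqrt(6)

# Partial derivatives of f at (x0, y0); note sqrt(16 - x0^2 - y0^2) = z0.
fx0 = -x0 / z0
fy0 = -y0 / z0

def plane(x, y):
    """Height of the tangent plane at (x0, y0, z0) above the point (x, y)."""
    return z0 + fx0 * (x - x0) + fy0 * (y - y0)

# The constant term of the plane is its height above the origin of the xy-plane.
constant = plane(0.0, 0.0)
print(f"z0 = {z0:.6f}, fx = {fx0:.6f}, fy = {fy0:.6f}, constant term = {constant:.6f}")

# Near the point of tangency the plane and the sphere should nearly agree.
x, y = 1.01, 3.01     # nearby point (an assumption)
print(f"sphere: {f(x, y):.6f}, tangent plane: {plane(x, y):.6f}")
```

The constant term should match 16/√6 from the simplified equation, and the two heights at the nearby point should differ only slightly.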
6. The Two Variable Chain Rule
6.1. The chain rule. Given two functions x = x(t), y = y(t) of one variable, and a function z = f(x, y) of two variables, then what is the derivative of the function
g(t) = f(x(t), y(t))?
We can find a general formula for g′(t) by using the linear approximation (§4) in the following way.
To find g′(t0) for some t0, we must compute
\frac{g(t_0 + \Delta t) - g(t_0)}{\Delta t}
and let ∆t → 0.
If t increases by an amount ∆t from t0 to t0 + ∆t, then x and y will also change. We write ∆x and ∆y for the changes in x and y, i.e.
\Delta x = x(t_0 + \Delta t) - x_0, \qquad \Delta y = y(t_0 + \Delta t) - y_0,
where x0 = x(t0) and y0 = y(t0). The resulting change in g is thus
\Delta g = g(t_0 + \Delta t) - g(t_0)
         = f(x(t_0 + \Delta t), y(t_0 + \Delta t)) - f(x(t_0), y(t_0))
         = f(x_0 + \Delta x, y_0 + \Delta y) - f(x_0, y_0).
By the linear approximation formula (65) one then has
\frac{\Delta g}{\Delta t} = f_x(x_0, y_0) \frac{\Delta x}{\Delta t} + f_y(x_0, y_0) \frac{\Delta y}{\Delta t} + e_x \frac{\Delta x}{\Delta t} + e_y \frac{\Delta y}{\Delta t}.
As we let ∆t → 0 the quotients ∆x/∆t and ∆y/∆t converge to x′(t0) and y′(t0), while the errors ex and ey converge to zero, so we get the two-variable chain rule:
(68) \frac{d f(x(t), y(t))}{dt} = f_x(x_0, y_0) \cdot x'(t_0) + f_y(x_0, y_0) \cdot y'(t_0).
The chain rule is often also written as
(69) \frac{df}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}.
This form becomes easy to remember if we interpret the first term as “the change in f caused by the change in x” and the second term as “the change in f caused by the change in y.”
In the way (69) is written a number of details are swept under the rug: the two derivatives dx/dt and dy/dt are ordinary (Math 221) derivatives of the two functions x(t) and y(t); the two partial derivatives ∂f/∂x and ∂f/∂y are the partial derivatives of f in which one has substituted x(t) and y(t). A more correct way of writing the equation would be
(70) \frac{d f(x(t), y(t))}{dt} = \frac{\partial f}{\partial x}(x(t), y(t)) \cdot x'(t) + \frac{\partial f}{\partial y}(x(t), y(t)) \cdot y'(t).
Many people find (69) easier on the eyes, so that is what we will usually write.
6.2. The difference between d and ∂. Compare (69) with the linear approximation formula (64) with infinitesimally small quantities. Equation (69) is just (64) in which one has divided both sides by dt. In contrast to equation (64), which contains the strange “infinitely small quantities” dx, dy, df, equation (69) contains the derivatives dx/dt, etc., which are well-defined.
Note that we have a breakdown of Leibniz’s notation: if we ignore the distinction between “d” and “∂”, and just cancel dx and ∂x, and also dy and ∂y on the right, then we get
\frac{df}{dt} = \frac{df}{dt} + \frac{df}{dt},
which doesn’t make a lot of sense. The moral: don’t cancel dx against ∂x!
6.3. An example. Suppose x(t) = cos ωt and y(t) = sin ωt, so that \vec{x}(t) = x(t)\vec{e}_1 + y(t)\vec{e}_2 traces out the unit circle.
How fast does S(t) = 2x(t) + 3y(t) change along this motion? In other words, what can we say about dS/dt?
The quantity S(t) is the composition of a function of two variables with the functions x(t) and y(t), i.e. it is the result of substituting x(t) and y(t) in the function f(x, y) = 2x + 3y.
Answer 1 – without using the chain rule. We can simply compute S(t) = 2 cos ωt + 3 sin ωt and differentiate:
(71) \frac{dS}{dt} = -2\omega \sin \omega t + 3\omega \cos \omega t.
Note that we did not use our new two-variable chain rule here. This answer shows that the point of the two-variable chain rule is not to compute d f(x(t), y(t))/dt in situations where we have formulas for the functions f(x, y), x(t), and y(t). In such a situation we can always substitute x(t) and y(t) in the function f(x, y), after which we get a function S(t) = f(x(t), y(t)) of one variable. We learned how to differentiate those in our first calculus course.
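Answer 2 – using the chain rule. The quantity we want to differentiate is S(t) = f(x(t), y(t)), where
f(x, y) = 2x + 3y, and x(t) = cos ωt, y(t) = sin ωt.
The chain rule tells us that
(72) \frac{dS}{dt} = \frac{\partial f}{\partial x} \frac{dx}{dt} + \frac{\partial f}{\partial y} \frac{dy}{dt}.
Here the first term stands for the change in S that is caused by the change in x. To compute it we first find
\frac{\partial f}{\partial x} = \frac{\partial (2x + 3y)}{\partial x} = 2, \quad \text{so that} \quad \frac{\partial f}{\partial x} \frac{dx}{dt} = 2 \cdot \frac{dx}{dt}.
Similarly, the second term in (72) represents the change in S(t) due to the fact that y is changing:
\frac{\partial f}{\partial y} \frac{dy}{dt} = 3 \cdot \frac{dy}{dt}.
To get the rate of change of S we add both the x and y contributions to this rate of change, which leads us to
(73) \frac{dS}{dt} = 2 \cdot \frac{dx}{dt} + 3 \cdot \frac{dy}{dt}.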
So far we have not used what we know about x(t) and y(t). This expression we have just derived for dS/dt is true no matter which x(t), y(t) we are given. In our case we have
x(t) = \cos \omega t \implies \frac{dx}{dt} = -\omega \sin \omega t, \qquad y(t) = \sin \omega t \implies \frac{dy}{dt} = +\omega \cos \omega t.
Substitute this in (73):
\frac{dS}{dt} = -2\omega \sin \omega t + 3\omega \cos \omega t, as before.
The moral: In this example the answer using the chain rule was longer, much more verbose, and perhaps more complicated than the straightforward computation that led to our first answer (71). Indeed, if the derivative of S is all we want then our first computation is the most efficient way of getting dS/dt. However, the computation using the chain rule did give us some useful intermediate results, such as the general expression (73) for dS/dt. This expression remains valid if we change the path (x(t), y(t)) and can therefore be useful in situations where, for example, we are allowed to choose the path and we would like to choose a path for which dS/dt has some prescribed value (e.g. suppose we want to keep S constant, how do we choose the path?)
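Both answers can be compared numerically. The sketch below evaluates the chain-rule expression (73) along the circular path and checks it against a difference quotient of S(t) computed directly. The value ω = 2, the sample times, and the step size h are assumptions chosen only for illustration.

```python
import math

OMEGA = 2.0  # hypothetical angular velocity, chosen for the illustration

def x(t):
    return math.cos(OMEGA * t)

def y(t):
    return math.sin(OMEGA * t)

def dx_dt(t):
    return -OMEGA * math.sin(OMEGA * t)

def dy_dt(t):
    return OMEGA * math.cos(OMEGA * t)

def dS_dt_chain(t):
    # Formula (73): dS/dt = 2 dx/dt + 3 dy/dt, valid for any path (x(t), y(t)).
    return 2 * dx_dt(t) + 3 * dy_dt(t)

def S(t):
    # Substitute the path into f(x, y) = 2x + 3y.
    return 2 * x(t) + 3 * y(t)

def dS_dt_numeric(t, h=1e-6):
    # Symmetric difference quotient of the one-variable function S(t).
    return (S(t + h) - S(t - h)) / (2 * h)

for t in (0.0, 0.3, 1.7):
    print(f"t = {t}: chain rule {dS_dt_chain(t):.6f}, difference quotient {dS_dt_numeric(t):.6f}")
```

Only `dx_dt` and `dy_dt` know about the circle; `dS_dt_chain` would work unchanged for any other path, which is the point made in the moral above.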
6.4. Another example. Suppose the temperature at the point (x, y) in the plane is given by T(x, y), and suppose that an ant is walking along the parametrized curve
x(t) = R \cos \omega t, \qquad y(t) = R \sin \omega t.
Thus the ant is walking on a circle with radius R, and with angular velocity ω.
How fast is the temperature of the ant changing? I.e. compute dT/dt.
Here we are not given an explicit formula for the function T(x, y), so we cannot substitute x(t) and y(t) in T and differentiate using only our first semester calculus skills. The approach in Answer 1 of our previous example does not apply here; we must use the chain rule.
In §6.1 we have seen several equivalent ways of writing the chain rule. Let us look at two of these and consider the meaning of the terms that arise.
The short form (69) of the chain rule tells us that
\frac{dT}{dt} = \frac{\partial T}{\partial x} \frac{dx}{dt} + \frac{\partial T}{\partial y} \frac{dy}{dt}.
The T on the left stands for T(x(t), y(t)), which we can interpret as the temperature at the point (x(t), y(t)). That point is the location of the ant at time t, so the T on the left is the temperature the ant feels at time t. This is a function of t. In mathematical terms it is the result of substituting (composing) the functions x(t) and y(t) in the function T = T(x, y).
The two T’s on the right appear in partial derivatives. Here ∂T/∂x stands for the partial derivative of the function T = T(x, y) with respect to the variable x. One can compute this without knowing the ant’s path (x(t), y(t)). Similarly, ∂T/∂y is the partial derivative of T with respect to y. The partial derivatives ∂T/∂x and ∂T/∂y themselves are again functions of x and y. After computing these partials they are meant to be evaluated at the point (x(t), y(t)).
Figure 6. Ant walking in a region of varying temperature.
This leads us to the more verbose version (70) of the chain rule, which tells us
\frac{dT(x(t), y(t))}{dt} = \frac{\partial T}{\partial x}(x(t), y(t)) \cdot x'(t) + \frac{\partial T}{\partial y}(x(t), y(t)) \cdot y'(t).
At this point the only additional information we have is about the ant’s motion, namely, x(t) = R cos ωt and y(t) = R sin ωt. We can compute the derivatives of x(t) and y(t), which gives us the velocity of the ant in the x and y directions:
x'(t) = -\omega R \sin \omega t, \qquad y'(t) = \omega R \cos \omega t.
If we substitute everything we know in the chain rule we find that the rate at which the ant’s temperature changes is
\frac{dT}{dt} = -\frac{\partial T}{\partial x}(R \cos \omega t, R \sin \omega t) \cdot \omega R \sin \omega t + \frac{\partial T}{\partial y}(R \cos \omega t, R \sin \omega t) \cdot \omega R \cos \omega t.
To make the equation more readable one can leave out the (R cos ωt, R sin ωt), which results in
\frac{dT}{dt} = -\omega R \sin \omega t \frac{\partial T}{\partial x} + \omega R \cos \omega t \frac{\partial T}{\partial y}.
The disadvantage of this shorter version is that the reader has to figure out where we intended to evaluate the two partial derivatives ∂T/∂x and ∂T/∂y.
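The text deliberately leaves T(x, y) unspecified. To watch the formula produce numbers, the sketch below plugs in a made-up temperature field; the field T(x, y) = 60 + 3x − 0.5y², the values of R and ω, and the sample times are all hypothetical and serve only to illustrate where each partial derivative is evaluated.

```python
import math

R, OMEGA = 2.0, 0.5  # hypothetical radius and angular velocity

def T(x, y):
    # Hypothetical temperature field; the notes do not specify T.
    return 60.0 + 3.0 * x - 0.5 * y**2

def T_x(x, y):
    # Partial derivative of the hypothetical T with respect to x.
    return 3.0

def T_y(x, y):
    # Partial derivative of the hypothetical T with respect to y.
    return -y

def position(t):
    # The ant's location at time t.
    return R * math.cos(OMEGA * t), R * math.sin(OMEGA * t)

def velocity(t):
    # x'(t) and y'(t) for the circular path.
    return -OMEGA * R * math.sin(OMEGA * t), OMEGA * R * math.cos(OMEGA * t)

def dT_dt_chain(t):
    """Chain rule: partials evaluated at the ant's position, times its velocity."""
    x, y = position(t)
    vx, vy = velocity(t)
    return T_x(x, y) * vx + T_y(x, y) * vy

def dT_dt_numeric(t, h=1e-6):
    # Symmetric difference quotient of the temperature the ant feels.
    return (T(*position(t + h)) - T(*position(t - h))) / (2 * h)

for t in (0.0, 1.0, 2.5):
    print(f"t = {t}: chain rule {dT_dt_chain(t):.6f}, difference quotient {dT_dt_numeric(t):.6f}")
```

Note that `T_x` and `T_y` are written without any reference to the ant; they are evaluated at the ant's position only inside `dT_dt_chain`, which mirrors the difference between the short and the verbose form of the chain rule.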
7. Problems