LEARNING OBJECTIVES FOR THIS CHAPTER

(1)

6

Woman teaching geometry, from a fourteenth-century edition of Euclid’s geometry book.

Inner Product Spaces

In making the definition of a vector space, we generalized the linear structure (addition and scalar multiplication) ofR2andR3. We ignored other important

features, such as the notions of length and angle. These ideas are embedded in the concept we now investigate, inner products.

Our standing assumptions are as follows:

6.1 Notation F,V

FdenotesRorC.

V denotes a vector space overF.

LEARNING OBJECTIVES FOR THIS CHAPTER Cauchy–Schwarz Inequality

Gram–Schmidt Procedure

linear functionals on inner product spaces calculating minimum distance to a subspace

(2)

6.A

Inner Products and Norms

Inner Products

x

Hx , x₁ ₂L

The length of this vectorxis

p

x12Cx22.

To motivate the concept of inner prod-uct, think of vectors inR2 and R3 as

arrows with initial point at the origin. The length of a vector x inR2 orR3

is called the norm ofx, denoted _kx_k. Thus for xD.x1; x2/2R2, we have kx_{k D}px12Cx22.

Similarly, ifx _D.x1; x2; x3/2R3,

thenkxk Dpx12Cx22Cx32.

Even though we cannot draw pictures in higher dimensions, the gener-alization to Rnis obvious: we define the norm ofx _D.x1; : : : ; xn/ 2 Rn

by

kx_{k D}px12C Cxn2:

The norm is not linear onRn. To inject linearity into the discussion, we

introduce the dot product.

6.2 Definition dot product

Forx; y 2Rn, thedot productofxandy, denotedxy, is defined by xy _Dx1y1C Cxnyn;

wherex_D.x1; : : : ; xn/andyD.y1; : : : ; yn/.

If we think of vectors as points in-stead of arrows, thenkxkshould

be interpreted as the distance from the origin to the pointx.

Note that the dot product of two vec-tors inRnis a number, not a vector.

Ob-viouslyxx _{D k}x_k2 for all x ₂ Rn.

The dot product onRnhas the

follow-ing properties: xx 0for allx ₂Rn;

xx _D0if and only ifx _D0;

fory ₂Rnfixed, the map fromRntoRthat sendsx₂Rntoxyis linear;

(3)

An inner product is a generalization of the dot product. At this point you may be tempted to guess that an inner product is defined by abstracting the properties of the dot product discussed in the last paragraph. For real vector spaces, that guess is correct. However, so that we can make a definition that will be useful for both real and complex vector spaces, we need to examine the complex case before making the definition.

Recall that ifDaCbi, wherea; b2R, then

the absolute value of, denoted_j_j, is defined by_j_{j D}pa2_C_b2_; the complex conjugate of, denotedN, is defined byN _Da bi;

jj2 DN.

See Chapter 4 for the definitions and the basic properties of the absolute value and complex conjugate.

Forz_D.z1; : : : ; zn/2Cn, we define the norm ofzby kzk D

q

jz1j2C C jznj2:

The absolute values are needed because we wantkz_k to be a nonnegative number. Note that

kz_k2 _Dz1z1C Cznzn:

We want to think of kz_k2 as the inner product ofz with itself, as we did in Rn. The equation above thus suggests that the inner product of

wD.w1; : : : ;wn/2Cnwithzshould equal

w1z1C Cwnzn:

If the roles of thewandz were interchanged, the expression above would be replaced with its complex conjugate. In other words, we should expect that the inner product ofwwithzequals the complex conjugate of the inner product ofz withw. With that motivation, we are now ready to define an inner product onV, which may be a real or a complex vector space.

Two comments about the notation used in the next definition:

Ifis a complex number, then the notation0means thatis real and nonnegative.

We use the common notationhu;v_i, with angle brackets denoting an inner product. Some people use parentheses instead, but then .u;v/ becomes ambiguous because it could denote either an ordered pair or an inner product.

(4)

6.3 Definition inner product

Aninner productonV is a function that takes each ordered pair.u;v/of elements ofV to a number_hu;v_{i 2}Fand has the following properties: positivity

hv;v_i0for allv₂V;

definiteness

hv;v_{i D}0if and only ifv_D0;

additivity in first slot

hu_Cv;w_{i D h}u;w_{i C h}v;w_ifor allu;v;w₂V;

homogeneity in first slot

hu;v_{i D}_hu;v_ifor all₂Fand allu;v₂V;

conjugate symmetry

hu;v_{i D h}v; u_ifor allu;v₂V. Although most mathematicians de-fine an inner product as above, many physicists use a definition that requires homogeneity in the second slot instead of the first slot.

Every real number equals its com-plex conjugate. Thus if we are dealing with a real vector space, then in the last condition above we can dispense with the complex conjugate and simply state thathu;v_{i D h}v; u_ifor allv;w₂V. 6.4 Example inner products

(a) TheEuclidean inner productonFnis defined by

h.w1; : : : ;wn/; .z1; : : : ; zn/i Dw1z1C Cwnzn:

(b) Ifc1; : : : ; cnare positive numbers, then an inner product can be defined

onFnby

h.w1; : : : ;wn/; .z1; : : : ; zn/i Dc1w1z1C Ccnwnzn:

(c) An inner product can be defined on the vector space of continuous real-valued functions on the intervalŒ 1; 1by

hf; g_{i D} Z 1

1

f .x/g.x/ dx: (d) An inner product can be defined onP.R/by

hp; qi D Z 1

0

(5)

6.5 Definition inner product space

Aninner product spaceis a vector spaceV along with an inner product onV.

The most important example of an inner product space isFn with the

Euclidean inner product given by part (a) of the last example. WhenFnis

referred to as an inner product space, you should assume that the inner product is the Euclidean inner product unless explicitly told otherwise.

So that we do not have to keep repeating the hypothesis thatV is an inner product space, for the rest of this chapter we make the following assumption:

6.6 Notation V

For the rest of this chapter,V denotes an inner product space overF.

Note the slight abuse of language here. An inner product space is a vector space along with an inner product on that vector space. When we say that a vector spaceV is an inner product space, we are also thinking that an inner product onV is lurking nearby or is obvious from the context (or is the Euclidean inner product if the vector space isFn).

6.7 Basic properties of an inner product

(a) For each fixedu2V, the function that takesvtohv; uiis a linear map fromV toF.

(b) h0; u_{i D}0for everyu₂V. (c) hu; 0i D0for everyu2V.

(d) hu;v_Cw_{i D h}u;v_{i C h}u;w_ifor allu;v;w₂V. (e) hu; v_{i D N}_hu;v_ifor all₂Fandu;v₂V. Proof

(a) Part (a) follows from the conditions of additivity in the first slot and homogeneity in the first slot in the definition of an inner product. (b) Part (b) follows from part (a) and the result that every linear map takes

(6)

(c) Part (c) follows from part (a) and the conjugate symmetry property in the definition of an inner product.

(d) Supposeu;v;w₂V. Then

hu;vCwi D hvCw; ui D hv; u_{i C h}w; u_i

D hv; u_{i C h}w; u_i

D hu;v_{i C h}u;w_i: (e) Suppose2Fandu;v2V. Then

hu; v_{i D h}v; u_i D_hv; u_i D N_hv; u_i D N_hu;v_i; as desired. Norms

Our motivation for defining inner products came initially from the norms of vectors onR2 and R3. Now we see that each inner product determines a

norm.

6.8 Definition norm,_kv_k

Forv2V, thenormofv, denotedkvk, is defined by kvk Dphv;v_i:

6.9 Example norms

(a) If.z1; : : : ; zn/2Fn(with the Euclidean inner product), then k.z1; : : : ; zn/k D

q

jz1j2C C jznj2:

(b) In the vector space of continuous real-valued functions onŒ 1; 1[with inner product given as in part (c) of 6.4], we have

kf_{k D} s Z 1 1 f .x/2 dx:

(7)

6.10 Basic properties of the norm

Supposev2V.

(a) kvk D0if and only ifv_D0. (b) kvk D jj kvkfor all2F.

Proof

(a) The desired result holds becausehv;v_{i D}0if and only ifv_D0. (b) Suppose₂F. Then

kv_k2_{D h}v; v_i

D_hv; v_i

DN_hv;v_i

D j_j2_kv_k2: Taking square roots now gives the desired equality.

The proof above of part (b) illustrates a general principle: working with norms squared is usually easier than working directly with norms.

Now we come to a crucial definition.

6.11 Definition orthogonal

Two vectorsu;v2V are calledorthogonalifhu;vi D0.

In the definition above, the order of the vectors does not matter, because hu;v_{i D} 0 if and only if_hv; u_{i D} 0. Instead of saying that uand v are orthogonal, sometimes we say thatuis orthogonal tov.

Exercise 13 asks you to prove that ifu;vare nonzero vectors inR2, then hu;vi D kukkvkcos;

whereis the angle betweenuandv(thinking ofuandvas arrows with initial point at the origin). Thus two vectors inR2are orthogonal (with respect to the

usual Euclidean inner product) if and only if the cosine of the angle between them is0, which happens if and only if the vectors are perpendicular in the usual sense of plane geometry. Thus you can think of the wordorthogonalas a fancy word meaningperpendicular.

(8)

We begin our study of orthogonality with an easy result.

6.12 Orthogonality and0

(a) 0is orthogonal to every vector inV.

(b) 0is the only vector inV that is orthogonal to itself. Proof

(a) Part (b) of 6.7 states thath0; u_{i D}0for everyu₂V.

(b) Ifv2V and_hv;v_{i D}0, thenv_D0(by definition of inner product). The wordorthogonalcomes from

the Greek word orthogonios,

which means right-angled.

For the special caseV _D R2, the

next theorem is over 2,500 years old. Of course, the proof below is not the original proof.

6.13 Pythagorean Theorem

Supposeuandvare orthogonal vectors inV. Then

kuCvk2D kuk2C kvk2: Proof We have ku_Cv_k2 _{D h}u_Cv; u_Cv_i D hu; u_{i C h}u;v_{i C h}v; u_{i C h}v;v_i D ku_k2_{C k}v_k2; as desired.

The proof given above of the Pythagorean Theorem shows that the conclusion holds if and only if hu;vi C hv; ui, which equals

2Rehu;vi, is0. Thus the converse

of the Pythagorean Theorem holds in real inner product spaces.

Supposeu;v ₂V, withv _¤0. We would like to writeuas a scalar multiple ofvplus a vectorworthogonal tov, as suggested in the next picture.

(9)

w u 0 cv v An orthogonal decomposition.

To discover how to writeuas a scalar multiple ofvplus a vector orthogonal tov, letc 2Fdenote a scalar. Then

u_Dcv_C.u cv/:

Thus we need to choosecso thatvis orthogonal to.u cv/. In other words, we want

0_{D h}u cv;v_{i D h}u;v_i c_kv_k2:

The equation above shows that we should choosecto behu;vi=kvk2. Making this choice ofc, we can write

uD hu;vi kvk2vC u hu;vi kvk2v :

As you should verify, the equation above writesuas a scalar multiple ofv plus a vector orthogonal tov. In other words, we have proved the following result.

6.14 An orthogonal decomposition

Supposeu;v₂V, withv_¤0. Setc _D hu;vi

kvk2 andwDu

hu;vi

kvk2v. Then hw;v_{i D}0 and u_Dcv_Cw:

French mathematician Augustin-Louis Cauchy(1789–1857)proved 6.17(a)in 1821. German mathe-matician Hermann Schwarz(1843– 1921)proved 6.17(b)in 1886. The orthogonal decomposition 6.14

will be used in the proof of the Cauchy– Schwarz Inequality, which is our next result and is one of the most important inequalities in mathematics.

(10)

6.15 Cauchy–Schwarz Inequality

Supposeu;v2V. Then

jhu;v_{ij k}u_{k k}v_k:

This inequality is an equality if and only if one ofu;vis a scalar multiple of the other.

Proof Ifv_D0, then both sides of the desired inequality equal0. Thus we can assume thatv¤0. Consider the orthogonal decomposition

u_D hu;vi

kvk2vCw

given by 6.14, wherewis orthogonal tov. By the Pythagorean Theorem, ku_k2_D hu;v_i kvk2v 2 C kwk2 D jhu;vij 2 kvk2 C kwk 2 jhu;vij 2 kvk2 : 6.16

Multiplying both sides of this inequality bykvk2and then taking square roots gives the desired inequality.

Looking at the proof in the paragraph above, note that the Cauchy–Schwarz Inequality is an equality if and only if 6.16 is an equality. Obviously this happens if and only ifw D0. Butw _D0if and only ifuis a multiple ofv (see 6.14). Thus the Cauchy–Schwarz Inequality is an equality if and only if uis a scalar multiple ofvorvis a scalar multiple ofu(or both; the phrasing has been chosen to cover cases in which eitheruorvequals0).

6.17 Example examples of the Cauchy–Schwarz Inequality (a) Ifx1; : : : ; xn; y1; : : : ; yn2R, then

jx1y1C Cxnynj2.x12C Cxn2/.y12C Cyn2/:

(b) Iff; gare continuous real-valued functions onŒ 1; 1, then ˇ ˇ ˇ Z 1 1 f .x/g.x/ dxˇˇ ˇ 2 Z 1 1 f .x/2 dx Z 1 1 g.x/2 dx:

(11)

u+v v u

The next result, called the Triangle Inequality, has the geometric interpreta-tion that the length of each side of a tri-angle is less than the sum of the lengths of the other two sides.

Note that the Triangle Inequality im-plies that the shortest path between two points is a line segment.

6.18 Triangle Inequality

Supposeu;v2V. Then

ku_Cv_{k k}u_{k C k}v_k:

This inequality is an equality if and only if one ofu;vis a nonnegative multiple of the other.

Proof We have ku_Cv_k2 _{D h}u_Cv; u_Cv_i D hu; ui C hv;vi C hu;vi C hv; ui D hu; ui C hv;vi C hu;vi C hu;vi D ku_k2_{C k}v_k2_C2Re_hu;v_i ku_k2_{C k}v_k2_C2_jhu;v_ij 6.19 ku_k2_{C k}v_k2_C2_ku_{k k}v_k 6.20 D._ku_{k C k}v_k/2;

where 6.20 follows from the Cauchy–Schwarz Inequality (6.15). Taking square roots of both sides of the inequality above gives the desired inequality.

The proof above shows that the Triangle Inequality is an equality if and only if we have equality in 6.19 and 6.20. Thus we have equality in the Triangle Inequality if and only if

6.21 _hu;v_{i D k}u_kkv_k:

If one of u;v is a nonnegative multiple of the other, then 6.21 holds, as you should verify. Conversely, suppose 6.21 holds. Then the condition for equality in the Cauchy–Schwarz Inequality (6.15) implies that one ofu;vis a scalar multiple of the other. Clearly 6.21 forces the scalar in question to be nonnegative, as desired.

(12)

The next result is called the parallelogram equality because of its geometric interpretation: in every parallelogram, the sum of the squares of the lengths of the diagonals equals the sum of the squares of the lengths of the four sides.

u+v u-v

u

v v

The parallelogram equality.

6.22 Parallelogram Equality Supposeu;v₂V. Then ku_Cv_k2_{C k}u v_k2_D2._ku_k2_{C k}v_k2/: Proof We have ku_Cv_k2_{C k}u v_k2 _{D h}u_Cv; u_Cv_{i C h}u v; u v_i D ku_k2_{C k}v_k2_{C h}u;v_{i C h}v; u_i C ku_k2_{C k}v_k2 _hu;v_i _hv; u_i D2.kuk2C kvk2/; as desired.

Law professor Richard Friedman presenting a case before the U.S. Supreme Court in 2010:

Mr. Friedman: I think that issue is entirely orthogonal to the issue here because the Commonwealth is acknowledging—

Chief Justice Roberts: I’m sorry. Entirely what?

Mr. Friedman: Orthogonal. Right angle. Unrelated. Irrelevant.

Chief Justice Roberts: Oh.

Justice Scalia: What was that adjective? I liked that.

Mr. Friedman: Orthogonal.

Chief Justice Roberts: Orthogonal.

Mr. Friedman: Right, right.

Justice Scalia: Orthogonal, ooh. (Laughter.)

(13)

EXERCISES

6.A

1 Show that the function that takes .x1; x2/; .y1; y2/ 2 R2 R2 to jx1y1j C jx2y2jis not an inner product onR2.

2 Show that the function that takes .x1; x2; x3/; .y1; y2; y3/2R3R3

tox1y1Cx3y3is not an inner product onR3.

3 SupposeF_DRandV _{¤ f}0_g. Replace the positivity condition (which states thathv;v_i0for allv₂V) in the definition of an inner product (6.3) with the condition thathv;vi> 0for somev2V. Show that this change in the definition does not change the set of functions fromV V toRthat are inner products onV.

4 SupposeV is a real inner product space.

(a) Show thathu_Cv; u v_{i D k}u_k2 _kv_k2for everyu;v₂V. (b) Show that ifu;v₂V have the same norm, thenu_Cvis orthogonal

tou v.

(c) Use part (b) to show that the diagonals of a rhombus are perpen-dicular to each other.

5 SupposeV is finite-dimensional andT 2L.V /is such thatkTvk kvk for everyv2V. Prove thatT p2I is invertible.

6 Supposeu;v2V. Prove thathu;vi D0if and only if

ku_{k k}u_Cav_k for alla₂F.

7 Supposeu;v₂V. Prove that_kau_Cbv_{k D k}bu_Cav_kfor alla; b₂R

if and only ifku_{k D k}v_k.

8 Supposeu;v₂V and_ku_{k D k}v_{k D}1and_hu;v_{i D}1. Prove thatu_Dv.

9 Supposeu;v₂V and_ku_k1and_kv_k1. Prove that q

1 _ku_k2q₁ _k_v_k2₁ _jh_u;_v_ij_:

10 Find vectorsu;v ₂ R2 such thatuis a scalar multiple of.1; 3/, v is orthogonal to.1; 3/, and.1; 2/_Du_Cv.

(14)

11 Prove that 16.a_Cb_Cc_Cd / ₁ aC 1 b C 1 c C 1 d

for all positive numbersa; b; c; d.

12 Prove that

.x1C Cxn/2n.x12C Cxn2/

for all positive integersnand all real numbersx1; : : : ; xn. 13 Supposeu;vare nonzero vectors inR2. Prove that

hu;v_{i D k}u_kkv_kcos;

whereis the angle betweenuandv(thinking ofuandvas arrows with initial point at the origin).

Hint:Draw the triangle formed byu,v, andu v; then use the law of cosines.

14 The angle between two vectors (thought of as arrows with initial point at

the origin) inR2orR3can be defined geometrically. However, geometry

is not as clear inRnfor n > 3. Thus the angle between two nonzero vectorsx; y 2Rnis defined to be

arccos hx; yi kx_kky_k;

where the motivation for this definition comes from the previous exercise. Explain why the Cauchy–Schwarz Inequality is needed to show that this definition makes sense.

15 Prove that _Xn jD1 ajbj 2 n X jD1 j aj2 _Xn jD1 bj2 j

for all real numbersa1; : : : ; anandb1; : : : ; bn. 16 Supposeu;v₂V are such that

ku_{k D}3; _ku_Cv_{k D}4; _ku v_{k D}6: What number doeskvkequal?

(15)

17 Prove or disprove: there is an inner product onR2such that the associated

norm is given by

k.x; y/_{k D}max_fx; y_g for all.x; y/2R2.

18 Supposep > 0. Prove that there is an inner product onR2such that the

associated norm is given by

k.x; y/k D.jxjp C jyjp/1=p for all.x; y/₂R2if and only ifp_D2.

19 SupposeV is a real inner product space. Prove that

hu;v_{i D} kuCvk

2

ku v_k2 4

for allu;v₂V.

20 SupposeV is a complex inner product space. Prove that

hu;v_{i D} kuCvk

2

ku vk2C kuCivk2i ku ivk2i 4

for allu;v₂V.

21 A norm on a vector space U is a function k k W U ! Œ0;1/ such that ku_{k D} 0 if and only ifu _D 0, _k˛u_{k D j}˛_jku_k for all ˛ ₂ F

and allu ₂ U, and_ku_Cv_{k k}u_{k C k}v_k for all u;v ₂ U. Prove that a norm satisfying the parallelogram equality comes from an inner product (in other words, show that ifk kis a norm onU satisfying the parallelogram equality, then there is an inner producth; _ionU such thatkuk D hu; ui1=2for allu2U).

22 Show that the square of an average is less than or equal to the average

of the squares. More precisely, show that ifa1; : : : ; an 2 R, then the

square of the average ofa1; : : : ; anis less than or equal to the average

ofa12; : : : ; an2.

23 SupposeV1; : : : ; Vmare inner product spaces. Show that the equation h.u1; : : : ; um/; .v1; : : : ;vm/i D hu1;v1i C C hum;vmi

defines an inner product onV1 Vm.

[In the expression above on the right,hu1;v1idenotes the inner product

onV1, . . . ,hum;vmidenotes the inner product onVm. Each of the spaces

V1; : : : ; Vmmay have a different inner product, even though the same

(16)

24 SupposeS 2L.V /is an injective operator onV. Defineh;i1by hu;v_i1D hS u; Svi

foru;v₂V. Show that_h;_i1is an inner product onV.

25 SupposeS ₂L.V /is not injective. Define_h;_i1as in the exercise above.

Explain whyh;_i1is not an inner product onV.

26 Supposef; gare differentiable functions fromRtoRn.

(a) Show that

hf .t /; g.t /_i0_{D h}f0.t /; g.t /_{i C h}f .t /; g0.t /_i:

(b) Supposec > 0and _kf .t /_{k D} c for everyt ₂ R. Show that hf0.t /; f .t /_{i D}0for everyt ₂R.

(c) Interpret the result in part (b) geometrically in terms of the tangent vector to a curve lying on a sphere inRncentered at the origin.

[For the exercise above, a functionf _W R!Rnis called differentiable

if there exist differentiable functionsf1; : : : ; fnfromRtoRsuch that

f .t /_D f1.t /; : : : ; fn.t /for eacht 2R. Furthermore, for eacht2R,

the derivativef0.t /₂Rnis defined byf0.t /_D f10.t /; : : : ; fn0.t /.] 27 Supposeu;v;w₂V. Prove that

kw 12.uCv/k 2 D kw uk 2 C kw vk2 2 ku vk2 4 :

28 SupposeC is a subset ofV with the property that u;v 2 C implies

1

2.uCv/ 2C. Letw2 V. Show that there is at most one point inC

that is closest tow. In other words, show that there is at most oneu₂C such that

kw u_{k k}w v_k for allv₂C.

Hint:Use the previous exercise.

29 Foru;v₂V, defined.u;v/_{D k}u v_k. (a) Show thatd is a metric onV.

(b) Show that ifV is finite-dimensional, thend is a complete metric onV (meaning that every Cauchy sequence converges).

(c) Show that every finite-dimensional subspace of V is a closed subset ofV (with respect to the metricd).

(17)

30 Fix a positive integern. TheLaplacianp of a twice differentiable functionponRnis the function onRndefined by

p_D @ 2_p @x₁2 C C @2p @x2 n : The functionpis calledharmonicifp_D0.

Apolynomial onRnis a linear combination of functions of the

formx1m1 xnmn, wherem1; : : : ; mnare nonnegative integers.

Supposeqis a polynomial onRn. Prove that there exists a harmonic

polynomialp on Rn such that p.x/ _D q.x/for every x ₂ Rn with kx_{k D}1.

[The only fact about harmonic functions that you need for this exercise is that ifpis a harmonic function onRnandp.x/_D0for allx ₂Rn

withkxk D1, thenp D0.]

Hint:A reasonable guess is that the desired harmonic polynomialpis of the formq_C.1 _kx_k2/r for some polynomialr. Prove that there is a polynomialronRnsuch thatqC.1 kxk2/ris harmonic by defining an operatorT on a suitable vector space by

T r D .1 kxk2/r

and then showing thatT is injective and hence surjective.

31 Use inner products to prove Apollonius’s Identity: In a triangle with

sides of lengtha,b, andc, letd be the length of the line segment from the midpoint of the side of lengthcto the opposite vertex. Then

a2_Cb2_D 1₂c2_C2d2:

c

(18)

6.B

Orthonormal Bases

6.23 Definition orthonormal

A list of vectors is calledorthonormalif each vector in the list has norm1and is orthogonal to all the other vectors in the list.

In other words, a liste1; : : : ; emof vectors inV is orthonormal if hej; eki D

(

1 ifj Dk, 0 ifj _¤k.

6.24 Example orthonormal lists

(a) The standard basis inFnis an orthonormal list.

(b) _p1 3; 1 p 3; 1 p 3 ; p1 2; 1 p 2; 0 is an orthonormal list inF3. (c) _p1 3; 1 p 3; 1 p 3 ; p1 2; 1 p 2; 0 ; p1 6; 1 p 6; 2 p 6 is an orthonormal list inF3.

Orthonormal lists are particularly easy to work with, as illustrated by the next result.

6.25 The norm of an orthonormal linear combination

Ife1; : : : ; emis an orthonormal list of vectors inV, then ka1e1C Camemk2D ja1j2C C jamj2

for alla1; : : : ; am2F.

Proof Because eachej has norm1, this follows easily from repeated

appli-cations of the Pythagorean Theorem (6.13).

The result above has the following important corollary.

6.26 An orthonormal list is linearly independent

(19)

Proof Suppose e1; : : : ; em is an orthonormal list of vectors in V and

a1; : : : ; am2Fare such that

a1e1C CamemD0:

Thenja1j2C C jamj2D0(by 6.25), which means that all theaj’s are0.

Thuse1; : : : ; emis linearly independent. 6.27 Definition orthonormal basis

Anorthonormal basisofV is an orthonormal list of vectors inV that is also a basis ofV.

For example, the standard basis is an orthonormal basis ofFn.

6.28 An orthonormal list of the right length is an orthonormal basis

Every orthonormal list of vectors inV with length dimV is an orthonormal basis ofV.

Proof By 6.26, any such list must be linearly independent; because it has the right length, it is a basis—see 2.39.

6.29 Example Show that

1 2; 1 2; 1 2; 1 2 ; 1₂;1₂; 1₂; 1₂ ; 1₂; 1₂; 1₂;1₂ ; 1₂;1₂; 1₂;1₂ is an orthonormal basis ofF4. Solution We have 1₂;1₂;1₂;1₂ D q 1 2 2 C 12 2 C 12 2 C 12 2 D1: Similarly, the other three vectors in the list above also have norm1.

We have ˝ 1 2; 1 2; 1 2; 1 2 ; 1₂;1₂; 1₂; 1₂˛ D 12 1 2C 1 2 1 2 C 1 2 1 2 C12 1 2 D0: Similarly, the inner product of any two distinct vectors in the list above also equals0.

Thus the list above is orthonormal. Because we have an orthonormal list of length four in the four-dimensional vector spaceF4, this list is an orthonormal

(20)

In general, given a basise1; : : : ; enofV and a vectorv2V, we know that

there is some choice of scalarsa1; : : : ; an2Fsuch that

vDa1e1C Canen:

The importance of orthonormal bases stems mainly from the next result.

Computing the numbersa1; : : : ; anthat

satisfy the equation above can be diffi-cult for an arbitrary basis of V. The next result shows, however, that this is easy for an orthonormal basis—just take aj D hv; eji.

6.30 Writing a vector as linear combination of orthonormal basis

Supposee1; : : : ; enis an orthonormal basis ofV andv2V. Then

vD hv; e1ie1C C hv; enien

and

kvk2D jhv; e1ij2C C jhv; enij2:

Proof Becausee1; : : : ; enis a basis ofV, there exist scalarsa1; : : : ; ansuch

that

vDa1e1C Canen:

Becausee1; : : : ; enis orthonormal, taking the inner product of both sides of

this equation withej giveshv; eji Daj. Thus the first equation in 6.30 holds.

The second equation in 6.30 follows immediately from the first equation and 6.25.

Now that we understand the usefulness of orthonormal bases, how do we go about finding them? For example, doesPm.R/, with inner product given

by integration onŒ 1; 1[see 6.4(c)], have an orthonormal basis? The next result will lead to answers to these questions.

Danish mathematician Jørgen Gram (1850–1916) and German mathematician Erhard Schmidt (1876–1959) popularized this algo-rithm that constructs orthonormal lists.

The algorithm used in the next proof is called theGram–Schmidt Procedure. It gives a method for turning a linearly independent list into an orthonormal list with the same span as the original list.

(21)

6.31 Gram–Schmidt Procedure

Supposev1; : : : ;vm is a linearly independent list of vectors in V. Let

e1Dv1=kv1k. Forj D2; : : : ; m, defineej inductively by

ej D vj hvj

; e1ie1 hvj; ej 1iej 1 kvj hvj; e1ie1 hvj; ej 1iej 1k

: Thene1; : : : ; emis an orthonormal list of vectors inV such that

span.v1; : : : ;vj/Dspan.e1; : : : ; ej/

forj _D1; : : : ; m.

Proof We will show by induction onj that the desired conclusion holds. To get started withj _D1, note that span.v1/Dspan.e1/becausev1is a positive

multiple ofe1.

Suppose1 < j < mand we have verified that

6.32 span.v1; : : : ;vj 1/Dspan.e1; : : : ; ej 1/:

Note thatvj … span.v1; : : : ;vj 1/(becausev1; : : : ;vmis linearly

indepen-dent). Thusvj …span.e1; : : : ; ej 1/. Hence we are not dividing by0in the

definition ofej given in 6.31. Dividing a vector by its norm produces a new

vector with norm1; thus_kejk D1.

Let1k < j. Then hej; eki D v j hvj; e1ie1 hvj; ej 1iej 1 kvj hvj; e1ie1 hvj; ej 1iej 1k ; ek D hvj; eki hvj; eki kvj hvj; e1ie1 hvj; ej 1iej 1k D0:

Thuse1; : : : ; ej is an orthonormal list.

From the definition ofej given in 6.31, we see thatvj 2span.e1; : : : ; ej/.

Combining this information with 6.32 shows that span.v1; : : : ;vj/span.e1; : : : ; ej/:

Both lists above are linearly independent (thev’s by hypothesis, thee’s by orthonormality and 6.26). Thus both subspaces above have dimensionj, and hence they are equal, completing the proof.

(22)

6.33 Example Find an orthonormal basis ofP2.R/, where the inner

prod-uct is given byhp; q_{i D}R1

1p.x/q.x/ dx.

Solution We will apply the Gram–Schmidt Procedure (6.31) to the basis 1; x; x2.

To get started, with this inner product we have k1k2D

Z 1

1

12dxD2:

Thusk1_{k D}p2, and hencee1 D

q

1 2.

Now the numerator in the expression fore2is

x _hx; e1ie1 Dx Z 1 1 x q 1 2dx q 1 2 Dx: We have kx_k2_D Z 1 1 x2dx_D 2₃: Thuskx_{k D} q 2 3, and hencee2D q 3 2x.

Now the numerator in the expression fore3is

x2 _hx2; e1ie1 hx2; e2ie2 Dx2 Z 1 1 x2 q 1 2dx q 1 2 Z 1 1 x2 q 3 2x dx q 3 2x Dx2 1₃: We have kx2 1₃_k2 _D Z 1 1 x4 2₃x2_C1₉ dx_D ₄₅8: Thuskx2 1₃_{k D} q 8 45, and hencee3D q 45 8 x2 1 3 . Thus _q 1 2; q 3 2x; q 45 8 x 2 1 3

is an orthonormal list of length3inP2.R/. Hence this orthonormal list is an

(23)

Now we can answer the question about the existence of orthonormal bases.

6.34 Existence of orthonormal basis

Every finite-dimensional inner product space has an orthonormal basis. Proof SupposeV is finite-dimensional. Choose a basis of V. Apply the Gram–Schmidt Procedure (6.31) to it, producing an orthonormal list with length dimV. By 6.28, this orthonormal list is an orthonormal basis ofV.

Sometimes we need to know not only that an orthonormal basis exists, but also that every orthonormal list can be extended to an orthonormal basis. In the next corollary, the Gram–Schmidt Procedure shows that such an extension is always possible.

6.35 Orthonormal list extends to orthonormal basis

SupposeV is finite-dimensional. Then every orthonormal list of vectors inV can be extended to an orthonormal basis ofV.

Proof Suppose e1; : : : ; em is an orthonormal list of vectors in V. Then

e1; : : : ; emis linearly independent (by 6.26). Hence this list can be extended to

a basise1; : : : ; em;v1; : : : ;vnofV (see 2.33). Now apply the Gram–Schmidt

Procedure (6.31) toe1; : : : ; em;v1; : : : ;vn, producing an orthonormal list 6.36 e1; : : : ; em; f1; : : : ; fnI

here the formula given by the Gram–Schmidt Procedure leaves the firstm vectors unchanged because they are already orthonormal. The list above is an orthonormal basis ofV by 6.28.

Recall that a matrix is called upper triangular if all entries below the diagonal equal0. In other words, an upper-triangular matrix looks like this:

0 B @ : :: 0 1 C A;

where the0in the matrix above indicates that all entries below the diagonal equal0, and asterisks are used to denote entries on and above the diagonal.

(24)

In the last chapter we showed that ifV is a finite-dimensional complex vector space, then for each operator on V there is a basis with respect to which the matrix of the operator is upper triangular (see 5.27). Now that we are dealing with inner product spaces, we would like to know whether there exists anorthonormalbasis with respect to which we have an upper-triangular matrix.

The next result shows that the existence of a basis with respect to which T has an upper-triangular matrix implies the existence of an orthonormal basis with this property. This result is true on both real and complex vector spaces (although on a real vector space, the hypothesis holds only for some operators).

6.37 Upper-triangular matrix with respect to orthonormal basis

SupposeT ₂L.V /. IfT has an upper-triangular matrix with respect to some basis ofV, then T has an upper-triangular matrix with respect to some orthonormal basis ofV.

Proof SupposeT has an upper-triangular matrix with respect to some basis

v1; : : : ;vn ofV. Thus span.v1; : : : ;vj/is invariant underT for eachj D

1; : : : ; n(see 5.26).

Apply the Gram–Schmidt Procedure tov1; : : : ;vn, producing an

orthonor-mal basise1; : : : ; enofV. Because

span.e1; : : : ; ej/Dspan.v1; : : : ;vj/

for eachj (see 6.31), we conclude that span.e1; : : : ; ej/is invariant underT

for eachj _D1; : : : ; n. Thus, by 5.26,T has an upper-triangular matrix with respect to the orthonormal basise1; : : : ; en.

German mathematician Issai Schur

(1875–1941) published the first proof of the next result in 1909.

The next result is an important appli-cation of the result above.

6.38 Schur’s Theorem

SupposeV is a finite-dimensional complex vector space andT ₂L.V /. Then T has an upper-triangular matrix with respect to some orthonormal basis ofV.

Proof Recall thatT has an upper-triangular matrix with respect to some basis ofV (see 5.27). Now apply 6.37.

(25)

Linear Functionals on Inner Product Spaces

Because linear maps into the scalar fieldFplay a special role, we defined a

special name for them in Section 3.F. That definition is repeated below in case you skipped Section 3.F.

6.39 Definition linear functional

Alinear functionalonV is a linear map fromV toF. In other words, a

linear functional is an element ofL.V;F/.

6.40 Example The function'W F3!Fdefined by

'.z1; z2; z3/D2z1 5z2Cz3

is a linear functional onF3. We could write this linear functional in the form

'.z/_{D h}z; u_i for everyz2F3, whereuD.2; 5; 1/.

6.41 Example The function'W P2.R/!Rdefined by

'.p/D Z 1

1

p.t / cos. t / dt

is a linear functional onP2.R/(here the inner product onP2.R/is

multi-plication followed by integration onŒ 1; 1; see 6.33). It is not obvious that there existsu2P2.R/such that

'.p/_{D h}p; u_i

for everyp ₂P2.R/[we cannot takeu.t /Dcos. t /because that function

is not an element ofP2.R/].

The next result is named in honor of Hungarian mathematician Frigyes Riesz (1880–1956), who proved several results early in the twen-tieth century that look very much like the result below.

If u ₂ V, then the map that sends

v tohv; u_i is a linear functional onV. The next result shows that every linear functional on V is of this form. Ex-ample 6.41 above illustrates the power of the next result because for the linear functional in that example, there is no obvious candidate foru.

(26)

6.42 Riesz Representation Theorem

SupposeV is finite-dimensional and'is a linear functional onV. Then there is a unique vectoru2V such that

'.v/_{D h}v; u_i for everyv2V.

Proof First we show there exists a vectoru₂V such that'.v/_{D h}v; u_ifor everyv2V. Lete1; : : : ; enbe an orthonormal basis ofV. Then

'.v/_D'._hv; e1ie1C C hv; enien/ D hv; e1i'.e1/C C hv; eni'.en/ D hv; '.e1/e1C C'.en/eni

for everyv2V, where the first equality comes from 6.30. Thus setting

6.43 u_D'.e1/e1C C'.en/en;

we have'.v/_{D h}v; u_ifor everyv₂V, as desired.

Now we prove that only one vector u ₂ V has the desired behavior. Supposeu1; u2 2V are such that

'.v/_{D h}v; u1i D hv; u2i

for everyv2V. Then

0D hv; u1i hv; u2i D hv; u1 u2i

for everyv2V. TakingvDu1 u2shows thatu1 u2 D0. In other words,

u1Du2, completing the proof of the uniqueness part of the result.

6.44 Example Findu2P2.R/such that

Z 1 1 p.t / cos. t /dt D Z 1 1 p.t /u.t / dt for everyp ₂P2.R/.

(27)

Solution Let'.p/_D R1

1p.t / cos. t /

dt. Applying formula 6.43 from the proof above, and using the orthonormal basis from Example 6.33, we have

u.x/_D Z 1 1 q 1 2 cos. t / dtq1₂_C Z 1 1 q 3 2t cos. t / dtq3₂x C Z 1 1 q 45 8 t 2 1 3 cos. t / dtq45₈ x2 1₃ : A bit of calculus shows that

u.x/_D ₂452 x

2 1

3

:

SupposeV is finite-dimensional and' a linear functional onV. Then 6.43 gives a formula for the vectoruthat satisfies '.v/ _{D h}v; u_ifor all v ₂ V. Specifically, we have

uD'.e1/e1C C'.en/en:

The right side of the equation above seems to depend on the orthonormal basis e1; : : : ; enas well as on'. However, 6.42 tells us that uis uniquely

determined by'. Thus the right side of the equation above is the same regardless of which orthonormal basise1; : : : ; enofV is chosen.

EXERCISES

6.B

1 (a) Suppose ₂ R. Show that .cos;sin /; . sin;cos / and .cos;sin /; .sin; cos /are orthonormal bases ofR2.

(b) Show that each orthonormal basis ofR2is of the form given by

one of the two possibilities of part (a).

2 Supposee1; : : : ; emis an orthonormal list of vectors inV. Letv 2 V.

Prove that

kvk2D jhv; e1ij2C C jhv; emij2

if and only ifv2span.e1; : : : ; em/.

3 SupposeT ₂ L.R3/ has an upper-triangular matrix with respect to the basis.1; 0; 0/, (1, 1, 1),.1; 1; 2/. Find an orthonormal basis ofR3

(use the usual inner product on R3) with respect to which T has an upper-triangular matrix.

(28)

4 Supposenis a positive integer. Prove that 1 p 2; cosx p ; cos2x p ; : : : ; cosnx p ; sinx p ; sin2x p ; : : : ; sinnx p is an orthonormal list of vectors inC Œ ; , the vector space of contin-uous real-valued functions onŒ ; with inner product

hf; g_{i D} Z

f .x/g.x/ dx:

[The orthonormal list above is often used for modeling periodic phenom-ena such as tides.]

5 OnP2.R/, consider the inner product given by hp; q_{i D}

Z 1

0

p.x/q.x/ dx:

Apply the Gram–Schmidt Procedure to the basis1; x; x2to produce an orthonormal basis ofP2.R/.

6 Find an orthonormal basis ofP2.R/(with inner product as in Exercise 5)

such that the differentiation operator (the operator that takesp top0) onP2.R/has an upper-triangular matrix with respect to this basis. 7 Find a polynomialq₂P2.R/such that

p 1₂ D Z 1 0 p.x/q.x/ dx for everyp ₂P2.R/.

8 Find a polynomialq₂P2.R/such that

Z 1 0 p.x/.cosx/ dx_D Z 1 0 p.x/q.x/ dx for everyp ₂P2.R/.

9 What happens if the Gram–Schmidt Procedure is applied to a list of

(29)

10 SupposeV is a real inner product space andv1; : : : ;vmis a linearly

inde-pendent list of vectors inV. Prove that there exist exactly2morthonormal listse1; : : : ; emof vectors inV such that

span.v1; : : : ;vj/Dspan.e1; : : : ; ej/

for allj _{2 f}1; : : : ; m_g.

11 Suppose_h;_i1andh;i2are inner products onV such thathv;wi1 D0

if and only ifhv;w_i2 D0. Prove that there is a positive numbercsuch

thathv;wi1Dchv;wi2for everyv;w2V.

12 SupposeV is finite-dimensional and_h;_i1,h;i2are inner products on

V with corresponding norms_{k k}1andk k2. Prove that there exists a

positive numbercsuch that

kvk1ckvk2

for everyv2V.

13 Supposev1; : : : ;vmis a linearly independent list inV. Show that there

existsw2V such that_hw;vji> 0for allj 2 f1; : : : ; mg.

14 Supposee1; : : : ; en is an orthonormal basis of V and v1; : : : ;vn are

vectors inV such that

kej vjk<

1

p

n for eachj. Prove thatv1; : : : ;vnis a basis ofV.

15 SupposeCR.Œ 1; 1/is the vector space of continuous real-valued func-tions on the intervalŒ 1; 1with inner product given by

hf; gi D Z 1

1

f .x/g.x/ dx

forf; g 2 CR.Œ 1; 1/. Let' be the linear functional onCR.Œ 1; 1/ defined by'.f /_Df .0/. Show that there does not existg₂CR.Œ 1; 1/ such that

'.f /_{D h}f; g_i for everyf ₂CR.Œ 1; 1/.

[The exercise above shows that the Riesz Representation Theorem(6.42)

does not hold on infinite-dimensional vector spaces without additional hypotheses onV and'.]

(30)

16 SupposeF DC,V is finite-dimensional,T 2L.V /, all the eigenvalues ofT have absolute value less than1, and > 0. Prove that there exists a positive integermsuch that_kTmv_k_kv_kfor everyv₂V.

17 Foru₂V, letˆudenote the linear functional onV defined by .ˆu/.v/_{D h}v; u_i

forv2V.

(a) Show that ifFDR, thenˆis a linear map fromV toV0. (Recall from Section 3.F thatV0DL.V;F/and thatV0is called the dual space ofV.)

(b) Show that ifF DCandV _{¤ f}0_g, thenˆis not a linear map. (c) Show thatˆis injective.

(d) SupposeF_DRandV is finite-dimensional. Use parts (a) and (c) and a dimension-counting argument (but without using 6.42) to show thatˆis an isomorphism fromV ontoV0.

[Part(d)gives an alternative proof of the Riesz Representation Theorem

(6.42)whenF DR. Part(d)also gives a natural isomorphism(meaning

that it does not depend on a choice of basis)from a finite-dimensional real inner product space onto its dual space.]

(31)

6.C

Orthogonal Complements and

Minimization Problems

Orthogonal Complements

6.45 Definition orthogonal complement,U?

IfU is a subset ofV, then theorthogonal complementofU, denotedU?, is the set of all vectors inV that are orthogonal to every vector inU:

U? _{D f}v₂V _{W h}v; u_{i D}0for everyu₂U_g:

For example, ifU is a line inR3, then U? is the plane containing the origin that is perpendicular toU. IfU is a plane inR3, thenU? is the line containing the origin that is perpendicular toU.

6.46 Basic properties of orthogonal complement

(a) IfU is a subset ofV, thenU?is a subspace ofV. (b) f0_g? _DV.

(c) V?_{D f}0_g.

(d) IfU is a subset ofV, thenU _\U?_f0_g.

(e) IfU andW are subsets ofV andU W, thenW?U?. Proof

(a) SupposeU is a subset ofV. Thenh0; ui D0for everyu 2 U; thus 0₂U?.

Supposev;w₂U?. Ifu₂U, then

hvCw; u_{i D h}v; u_{i C h}w; u_{i D}0_C0_D0:

ThusvCw2U?. In other words,U?is closed under addition. Similarly, suppose₂Fandv₂U?. Ifu₂U, then

hv; u_{i D}_hv; u_{i D}0_D0:

Thusv₂U?. In other words,U?is closed under scalar multiplica-tion. ThusU?is a subspace ofV.

(32)

(b) Supposev2 V. Thenhv; 0i D0, which implies thatv 2 f0g?. Thus f0_g? _DV.

(c) Supposev 2 V?. Then _hv;v_{i D}0, which implies thatv _D0. Thus V?_{D f}0_g.

(d) SupposeU is a subset ofV andv₂U _\U?. Then_hv;v_{i D}0, which implies thatvD0. ThusU _\U?_f0_g.

(e) SupposeU andW are subsets ofV andU W. Supposev ₂ W?. Thenhv; u_{i D}0for everyu₂ W, which implies that_hv; u_{i D}0for everyu2U. Hencev2U?. ThusW? U?.

Recall that ifU; W are subspaces ofV, thenV is the direct sum ofU and W (writtenV _DU _˚W) if each element ofV can be written in exactly one way as a vector inU plus a vector inW (see 1.40).

The next result shows that every finite-dimensional subspace ofV leads to a natural direct sum decomposition ofV.

6.47 Direct sum of a subspace and its orthogonal complement

SupposeU is a finite-dimensional subspace ofV. Then V _DU _˚U?:

Proof First we will show that

6.48 V _DU _CU?:

To do this, suppose v 2 V. Let e1; : : : ; em be an orthonormal basis ofU.

Obviously 6.49 v_{D h}v; e1ie1C C hv; emiem „ ƒ‚ … u Cv hv; e1ie1 hv; emiem „ ƒ‚ … w : Letuandwbe defined as in the equation above. Clearlyu 2 U. Because e1; : : : ; emis an orthonormal list, for eachj D1; : : : ; mwe have

hw; eji D hv; eji hv; eji D0:

Thus wis orthogonal to every vector in span.e1; : : : ; em/. In other words,

w 2 U?. Thus we have writtenv _D u_Cw, whereu ₂ U andw ₂ U?, completing the proof of 6.48.

From 6.46(d), we know thatU\U?D f0g. Along with 6.48, this implies thatV _DU _˚U?(see 1.45).

(33)

Now we can see how to compute dimU?from dimU.

6.50 Dimension of the orthogonal complement

SupposeV is finite-dimensional andU is a subspace ofV. Then dimU? _DdimV dimU:

Proof The formula for dimU?follows immediately from 6.47 and 3.78. The next result is an important consequence of 6.47.

6.51 The orthogonal complement of the orthogonal complement

SupposeU is a finite-dimensional subspace ofV. Then U _D.U?/?:

Proof First we will show that

6.52 U .U?/?:

To do this, suppose u ₂ U. Then _hu;v_{i D} 0 for every v ₂ U? (by the definition ofU?). Becauseuis orthogonal to every vector inU?, we have u₂.U?/?, completing the proof of 6.52.

To prove the inclusion in the other direction, supposev 2 .U?/?. By 6.47, we can write v D u_Cw, where u ₂ U and w ₂ U?. We have

v u _D w ₂ U?. Becausev ₂ .U?/? andu ₂ .U?/? (from 6.52), we havev u₂.U?/?. Thusv u₂U?_\.U?/?, which implies thatv u is orthogonal to itself, which implies thatv u _D 0, which implies that

vDu, which implies thatv₂U. Thus.U?/? U, which along with 6.52 completes the proof.

We now define an operatorPU for each finite-dimensional subspace ofV. 6.53 Definition orthogonal projection,PU

Suppose U is a finite-dimensional subspace of V. The orthogonal projectionofV ontoU is the operatorPU 2 L.V /defined as follows:

(34)

The direct sum decompositionV DU ˚U?given by 6.47 shows that eachv2V can be uniquely written in the formv_Du_Cwwithu₂U and

w2U?. ThusPUvis well defined.

6.54 Example Supposex₂V withx_¤0andU _Dspan.x/. Show that

PUvD hv

; xi kx_k2x

for everyv2V.

Solution Supposev₂V. Then

vD hv; xi kx_k2xC v hv; xi kx_k2x ;

where the first term on the right is in span.x/(and thus inU) and the second term on the right is orthogonal tox(and thus is inU?/. ThusPUvequals the

first term on the right, as desired.

6.55 Properties of the orthogonal projectionPU

SupposeU is a finite-dimensional subspace ofV andv₂V. Then (a) PU 2L.V /; (b) PUuDufor everyu2U; (c) PUwD0for everyw2U?; (d) rangePU DU; (e) nullPU DU?; (f) v PUv2U?; (g) PU2DPU; (h) kPUvk kvk;

(i) for every orthonormal basise1; : : : ; emofU,

(35)

Proof

(a) To show thatPU is a linear map onV, supposev1;v2 2V. Write

v1 Du1Cw1 and v2 Du2Cw2

withu1; u22U andw1;w2 2U?. ThusPUv1Du1andPUv2Du2.

Now

v1Cv2 D.u1Cu2/C.w1Cw2/;

whereu1Cu22U andw1Cw2 2U?. Thus

PU.v1Cv2/Du1Cu2DPUv1CPUv2:

Similarly, suppose₂ F. The equationv Du_Cwwithu₂U and

w 2 U? implies that v D uCwwithu 2 U and w 2 U?. ThusPU.v/DuDPUv.

HencePU is a linear map fromV toV.

(b) Supposeu2U. We can writeuDuC0, whereu2U and02U?. ThusPUuDu.

(c) Supposew2U?. We can writewD0Cw, where02U andw2U?. ThusPUwD0.

(d) The definition ofPU implies that rangePU U. Part (b) implies that

U rangePU. Thus rangePU DU.

(e) Part (c) implies thatU?nullPU. To prove the inclusion in the other

direction, note that ifv2nullPU then the decomposition given by 6.47

must bevD0_Cv, where0₂U andv₂U?. Thus nullPU U?.

(f) IfvDu_Cwwithu₂U andw₂U?, then

v PUvDv uDw2U?:

(g) IfvDu_Cwwithu₂U andw₂U?, then

.PU2/vDPU.PUv/DPUuDuDPUv:

(h) IfvDu_Cwwithu₂U andw₂U?, then

kPUvk2 D kuk2 kuk2C kwk2D kvk2;

where the last equality comes from the Pythagorean Theorem. (i) The formula forPUvfollows from equation 6.49 in the proof of 6.47.

(36)

Minimization Problems

The remarkable simplicity of the so-lution to this minimization problem has led to many important applica-tions of inner product spaces out-side of pure mathematics.

The following problem often arises: given a subspace U ofV and a point

v 2 V, find a point u 2 U such that

kv u_k is as small as possible. The next proposition shows that this mini-mization problem is solved by taking u_DPUv.

6.56 Minimizing the distance to a subspace

SupposeU is a finite-dimensional subspace ofV,v2V, andu2U. Then

kv PUvk kv uk:

Furthermore, the inequality above is an equality if and only ifu_DPUv.

Proof We have

kv PUvk2 kv PUvk2C kPUv uk2 6.57

D k.v PUv/C.PUv u/k2 D kv u_k2;

where the first line above holds because 0 _kPUv uk2, the second

line above comes from the Pythagorean Theorem [which applies because

v PUv2U?by 6.55(f), andPUv u2U], and the third line above holds

by simple algebra. Taking square roots gives the desired inequality.

Our inequality above is an equality if and only if 6.57 is an equality, which happens if and only ifkPUv uk D0, which happens if and only if

u_DPUv.

0 v

P v_U U

(37)

The last result is often combined with the formula 6.55(i) to compute explicit solutions to minimization problems.

6.58 Example Find a polynomialuwith real coefficients and degree at most5that approximates sinxas well as possible on the intervalŒ ; , in

the sense that _Z

j

sinx u.x/j2dx

is as small as possible. Compare this result to the Taylor series approximation. Solution LetCRŒ ; denote the real inner product space of continuous real-valued functions onŒ ; with inner product

6.59 _hf; g_{i D}

Z

f .x/g.x/ dx:

Letv2CRŒ ; be the function defined byv.x/Dsinx. LetU denote the subspace ofCRŒ ; consisting of the polynomials with real coefficients and degree at most5. Our problem can now be reformulated as follows:

Findu₂U such that_kv u_kis as small as possible.

A computer that can perform inte-grations is useful here.

To compute the solution to our ap-proximation problem, first apply the Gram–Schmidt Procedure (using the

in-ner product given by 6.59) to the basis1; x; x2; x3; x4; x5 ofU, producing an orthonormal basise1; e2; e3; e4; e5; e6ofU. Then, again using the inner

product given by 6.59, computePUvusing 6.55(i) (withmD6). Doing this

computation shows thatPUvis the functionudefined by 6.60 u.x/D0:987862x 0:155271x3C0:00564312x5;

where the’s that appear in the exact answer have been replaced with a good decimal approximation.

By 6.56, the polynomialuabove is the best approximation to sinx on Œ ; using polynomials of degree at most5(here “best approximation” means in the sense of minimizingR

jsinx u.x/j

2_dx_{). To see how good}

this approximation is, the next figure shows the graphs of both sinxand our approximationu.x/given by 6.60 over the intervalŒ ; .

(38)

-3 3

-1 1

Graphs onŒ ; of sinx(blue)and

its approximationu.x/(red)given by 6.60.

Our approximation 6.60 is so accurate that the two graphs are almost identical—our eyes may see only one graph! Here the blue graph is placed almost exactly over the red graph. If you are viewing this on an electronic device, try enlarging the picture above, especially near3or 3, to see a small gap between the two graphs.

Another well-known approximation to sinxby a polynomial of degree5 is given by the Taylor polynomial

6.61 x x

3

3Š C x5

5Š:

To see how good this approximation is, the next picture shows the graphs of both sinxand the Taylor polynomial 6.61 over the intervalŒ ; .

-3 3

-1 1

Graphs onŒ ; of sinx(blue)and the Taylor polynomial 6.61(red). The Taylor polynomial is an excellent approximation to sinxforxnear0. But the picture above shows that forjxj> 2, the Taylor polynomial is not so accurate, especially compared to 6.60. For example, takingx _D 3, our approximation 6.60 estimates sin3with an error of about0:001, but the Taylor series 6.61 estimates sin3with an error of about0:4. Thus atxD3, the error in the Taylor series is hundreds of times larger than the error given by 6.60. Linear algebra has helped us discover an approximation to sinxthat improves upon what we learned in calculus!

(39)

EXERCISES

6.C

1 Supposev1; : : : ;vm2V. Prove that

fv1; : : : ;vmg? D span.v1; : : : ;vm/?:

2 SupposeU is a finite-dimensional subspace ofV. Prove thatU?_{D f}0_g if and only ifU DV.

[Exercise 14(a)shows that the result above is not true without the hy-pothesis thatU is finite-dimensional.]

3 SupposeU is a subspace ofV with basisu1; : : : ; umand

u1; : : : ; um;w1; : : : ;wn

is a basis ofV. Prove that if the Gram–Schmidt Procedure is applied to the basis of V above, producing a list e1; : : : ; em; f1; : : : ; fn, then

e1; : : : ; emis an orthonormal basis ofU andf1; : : : ; fnis an

orthonor-mal basis ofU?.

4 SupposeU is the subspace ofR4defined by

U _Dspan .1; 2; 3; 4/; . 5; 4; 3; 2/ :

Find an orthonormal basis ofU and an orthonormal basis ofU?.

5 SupposeV is finite-dimensional andU is a subspace ofV. Show that P_U? DI PU, whereI is the identity operator onV.

6 SupposeU andW are finite-dimensional subspaces ofV. Prove that PUPW D0if and only ifhu;wi D0for allu2U and allw2W. 7 SupposeV is finite-dimensional andP ₂L.V /is such thatP2 _DP and

every vector in nullP is orthogonal to every vector in rangeP. Prove that there exists a subspaceU ofV such thatP _DPU.

8 SupposeV is finite-dimensional andP ₂ L.V /is such thatP2 _DP and

kPvk kvk

for everyv 2 V. Prove that there exists a subspaceU ofV such that P _DPU.

9 SupposeT ₂L.V /andU is a finite-dimensional subspace ofV. Prove thatU is invariant underT if and only ifPUTPU DTPU.

(40)

10 Suppose V is finite-dimensional, T 2 L.V /, and U is a subspace of V. Prove thatU and U? are both invariant under T if and only ifPUT DTPU.

11 InR4, let

U _Dspan .1; 1; 0; 0/; .1; 1; 1; 2/ :

Findu₂U such that_ku .1; 2; 3; 4/_kis as small as possible.

12 Findp₂P3.R/such thatp.0/D0,p0.0/D0, and

Z 1

0 j

2_C3x p.x/_j2dx is as small as possible.

13 Findp₂P5.R/that makes

Z

jsin

x p.x/_j2dx as small as possible.

[The polynomial 6.60 is an excellent approximation to the answer to this exercise, but here you are asked to find the exact solution, which involves powers of. A computer that can perform symbolic integration will be

useful.]

14 SupposeCR.Œ 1; 1/is the vector space of continuous real-valued func-tions on the intervalŒ 1; 1with inner product given by

hf; g_{i D} Z 1

1

f .x/g.x/ dx

forf; g 2 CR.Œ 1; 1/. LetU be the subspace ofCR.Œ 1; 1/defined by

U _{D f}f ₂CR.Œ 1; 1/Wf .0/D0g: (a) Show thatU? _{D f}0_g.

(b) Show that 6.47 and 6.51 do not hold without the finite-dimensional hypothesis.