Gelfand - Lectures on Linear Algebra

(1)

LECTURES ON

LINEAR ALGEBRA

L M. GELTAND

Academy of Sciences, Moscow, U.S.S.R.

Translated from the Revised Second Russian Edition by A. SHENITZER

Adelphi College, Garden City, New York

INTERSCIENCE PUBLISHERS, INC., NEW YORK

(2)

LIBRARY OF CONGRESS CATALOG CARD NUMBER 61-8630 SECOND PRINTING 1963

(3)

The second edition differs from the first in two ways. Some of the material was substantially revised and new material was added. The

major additions include two appendices at the end of the book dealing with computational methods in linear algebra and the theory of pertur-bations, a section on extremal properties of eigenvalues, and a section

on polynomial matrices ( §§ 17 and 21). As for major revisions, the

chapter dealing with the Jordan canonical form of a linear transforma-tion was entirely rewritten and Chapter IV was reworked. Minor changes and additions were also made. The new text was written in

colla-boration with Z. Ja. Shapiro.

I wish to thank A. G. Kurosh for making available his lecture notes

on tensor algebra. I am grateful to S. V. Fomin for a number of valuable comments Finally, my thanks go to M. L. Tzeitlin for assistance in the preparation of the manuscript and for a number of suggestions.

September 1950 I. GELtAND

Translator's note: Professor Gel'fand asked that the two appendices

(4)

(5)

This book is based on a course in linear algebra taught by the author in the department of mechanics and mathematics of the Moscow State

University and at the Byelorussian State University.

S. V. Fomin participated to a considerable extent in the writing of this book. Without his help this book could not have been written.

T h e author wishes to thank Assistant Professor A. E. Turetski of the Byelorussian State University, who made available to him notes of the lectures given by the author in 1945, and to D. A. Raikov, who carefully

read the manuscript and made a number of valuable comments. The material in fine print is not utilized in the main part of the text and may be omitted in a first perfunctory reading.

January 1948 I. GEL'FAND

(6)

(7)

Page

Preface to the second edition

Preface to the first edition vii

I. n-Dimensional Spaces. Linear and Bilinear Forms

ti-Dimensional vector spaces

Euclidean space 14

Orthogonal basis. Isomorphism of Euclidean spaces 21

Bilinear and quadratic forms 34

Reduction of a quadratic form to a sum of squares 42 Reduction of a quadratic form by means of a triangular

trans-formation 46

The law of inertia 55

Complex n-dimensional space 60

II. Linear Transformations 70

Linear transformations. Operations on linear transformations . 70

Invariant subspaces. Eigenvalues and eigenvectors of a linear

transformation 81

The adjoint of a linear transformation 90

Self-adjoint (Hermitian) transformations. Simultaneous

reduc-tion of a pair of quadratic forms to a sum of squares 97

Unitary transformations 103

Commutative linear transformations. Normal transformations . 107 Decomposition of a linear transformation into a product of a unitary and self-adjoint transformation

Linear transformations on a rea/ Euclidean space 114

Extremal properties of eigenvalues 126

III. The Canonical Form of an Arbitrary Linear Transformation . 132

The canonical form of a linear transformation 132

Reduction to canonical form 137

Elementary divisors 142

Polynomial matrices 149

IV. Introduction to Tensors 164

The dual space 164

(8)

(9)

n-Dimensional Spaces. Linear and

Bilinear Forms

§ 1. a-Dimensional vector spaces

1. Definition of a vector space. We frequently come across objects which are added and multiplied by numbers. Thus

In geometry objects of this nature are vectors in three dimensional space, i.e., directed segments. Two directed segments

are said to define the same vector if and only if it is possible to translate one of them into the other. It istherefore convenient to

measure off all such directed segments beginning with one common

point which we shall call the origin. As is well known the sum of

two vectors x and y is, by definition, the diagonal of the

parallelo-gram with sides x and y. Thedefinition of multiplication by (real) numbers is equally well known.

In

algebra we come

across systems of

n numbers

x = (ei, E2, , e) (e.g., rows of a matrix, the set of coefficients of a linear form, etc.). Addition and multiplication of n-tuples by

numbers are usually defined as follows: bythe sum of the n-tuples x = (E1, E2,

, ¿) and y =(,n2,

, ?In) we mean the n-tuple

x + y =

+ ij, E2 + n2,

,

+ n). By the product of the

number A and the n-tuple x = (ei , e2, e) we mean the n-tuple ix= (241, )42, , AE ).

In analysis we define the operations of addition of functions and multiplication of functions by numbers. In the sequel we shall consider all continuous functions defined on some interval fa, b].

In the examples just given the operationsof addition and

multi-plication by numbers are applied to entirelydissimilar objects. To investigate all examples of this nature from a unified point of view we introduce the concept of a vector space.

DEFINITION 1. A set R of elements x, y, z, is said to be a

vector space over a field F if:

(10)

With every two elements x and y in R there is associated an

element z in R which is called the sum of the elements x and y. The sum of the elements x and y is denoted by x + y.

With every element x in R andevery numeer A belongingtua c

field F there is associated an element Ax in R. Ai( is referredto as the product of x by A.

The above operations must satisfy the following requirements

(axioms):

I.

1. x+y=y+x

_{(commutativity)}

(x + y) 4- z = x

(y z) _{(associativity)} R contains an element 0 such that x = x for all x in

R. 0 is referred to as the zero element.

For every x in R there exists (in R) an element denoted by

x with the property x ( x) = O.

II.

1. Ix= x

2. (ßx) = ß(x).

III. 1. (cc

+ fi)x =

+ fix

2. ct(x

y) =

cor.

It is not an oversight on our part thatwe have not specified how

elements of R are to be added and multiplied by numbers. Any

definitions of these operations are acceptable as long

_{as the}

axioms listed above are satisfied. Whenever this is the case we are

dealing with an instance of a vector space.

We leave it to the reader to verify that the examples 1, 2, 3

above are indeed examples of vector spaces.

Let us give a few more examples of vector spaces.

4. The set of all polynomials of degree not exceeding some natural number n constitutes a vector space if addition of

polyno-mials and multiplication of polynopolyno-mials by numbers are defined in

the usual manner.

We observe that under the usual operations of addition and multiplication by numbers the set of polynomials of degree n does not form a vector space since the sum of two polynomials of degree n may turn out to be a polynomial of degree smaller than_{n. Thus}

(r

t)

( r

t) =

(11)

of the matrices I laikl I and we take the matrix Hai, + b1,11.

As the product of the number X and the matrix 11aikl we take the matrix 112a1tt It is easy to see that the above set R is now a

vector space.

It is natural to call the elements of a vector space vectors. The

fact that this term was used in Example I should not confuse the

reader. The geometric considerations associated with this word will help us clarify and even predict a number of results.

If the numbers X, y, involved in the definition of a vector space are real, then the space is referred to as a real vector space. If the numbers 2, it, are taken from the field of complex numbers,

then the space is referred to as a complex vector space.

More generally it may be assumed that A, 1.4, , are elements of an

arbitrary field K. Then R is called a vector space over the field K. Many

concepts and theorems dealt with in the sequel and, in particular, the

contents of this section apply to vector spaces over arbitraryfields.

How-ever, in chapter I we shall ordinarily assume that R is a real vector space. 2. The dimensionality of a vector space. 1/Ve nowdefine the notions of linear dependence and independence of vectors which are of fundamental importance in all that follows.

DEFINITION 2. Let R be a vector space. W e shall say that the vectors x, y, z, , IT are linearly dependent if there exist numbers

cc, /3, y, 9, not all equal to zero such that

(1)

yz +

+ Ov = O.

Vectors which are not linearly dependent are said to be linearly independent. In other words,

a set of vectors x, y, z, , v is said to be linearly independent if the equality

GO( + /3y + yz + + Ov = 0

implies that =- p y 0 = O.

Let the vectors x, y, z, , IT be linearly dependent, i.e., let x, y, z, , IT be connected by a relation of the form (1) with at

least one of the coefficients, a, say, unequal to zero. Then

fty

yz Ov.

(12)

(/57X) = A, = tu,

', (01x) =

we have

(2) x /1.y pZ + + Of.

Whenever a vector x is expressible through vectors y, z,

, y

in the form (2) we say that x is a linear combination of the vectors

y, z, , y.

Thus, if the vectors x, y, z, , y are linearly dependent then at

least one of them is a linear combination of the others. We leave it to the reader to prove that the converse is also true, i.e., that if one of a set of vectors is a linear combination of the remaining vectors then

the vectors of the set are linearly dependent.

EXERCISES. 1. Show that if one of the vectors x, y, z, y is the zero vector then these vectors are linearly dependent.

2. Show that if the vectors x, y, z, are linearly dependent and u, v, are arbitrary vectors then the vectors x, y, z, , u, y, are linearly

dependent.

We now introduce the concept of dimension of a vector space.

Any two vectors on a line are proportional, i.e., linearly depend-ent. In the plane we can find two linearly independent vectors

but any three vectors are linearly dependent. If R is the set of vectors in three-dimensional space, then it is possible to find three linearly independent vectors but any four vectors are linearly

dependent.

As we see the maximal number of linearly independent vectors

on a straight line, in the plane, and in three-dimensional space coincides with what is called in geometry the dimensionality of the

line, plane, and space, respectively. It is therefore natural to make the following general definition.

DEF/NITION 3. A vector space R is said to be n-dimensional if it

contains n linearly independent vectors and if any n 1 vectors in R are linearly dependent.

If R is a vector space which contains an arbitrarily large number of linearly independent vectors, then R is said to be

infinite-dimensional.

Infinite-dimensional spaces will not be studied in this book. Ve shall now compute the dimensionality of each of the vector

(13)

As we have already indicated, the space R of Example 1 contains three linearly independent vectors and any four vectors in it are linearly dependent. ConsequentlyR is three-dimensional. Let R denote the space whose elements are n-tuples of real

numbers.

This space contains n linearly independent vectors For instance,

the vectors

are easily seen to belinearly independent. On the other hand, any m vectors in R, tre > n, are linearly dependent. Indeed, let

Yi(nii, n12,

Y2 = (V21, n22, , n2n), Yin = (nml, nm2, , nmn)

be ni vectors and let ni > n. The number of linearly independent

rows in the matrix

17ml, 717n2, n tnn

cannot exceed n (the number of columns). Since m > n, our ni

rows are linearly dependent. But this implies the linear dependence

of the vectors y1, y2, , Ynt Thus the dimension of R is n.

Let R be the space of continuous functions. Let N be any

natural number. Then the functions: f1(t) 1, f2(t) = t, fN(t) = tN-1 form a set of linearly independent vectors (the proof

of this statement is left to the reader). It follows that our space contains an arbitrarily large number of linearly independent functions or, briefly, R is infinite-dimensional.

Let R be the space of polynomials of degree n 1. In

this space the n polynomials 1, t, , t"-1 arelinearly independent. It can be shown that any ni elements of R, ni > n, are linearly

dependent. Hence R is n-dimensional.

[

nu., n12, Thn

n21, n22y 172n

xi -= (1, 0, , 0), x, = (0, 1, , 0),

(14)

5. We leave it to the reader to prove that the space of n x n matrices [a2kH is n2-dimensional.

3. Basis and coordinates in n-dimensional space

DEFINITION 4. Any set of n linearly independent vectors

e1, e2, , en of an n-dimensional vector space R is called a basis of R.

Thus, for instance, in the case of the space considered in Example

1 any-three vectors which are not coplanar form a basis.

By definition of the term "n-dimensional vector space" such a space contains n linearly independent vectors, i.e., it contains a

basis.

THEOREM 1. Every vector x belonging to an n-dimensional vector

space R can be uniquely represented as a linear combination of basis vectors.

Proof: Let e1, e2, , e be a basis in R. Let x be an arbitrary

vector in R. The set x, e1, e2, e contains n + 1 vectors. It

follows from the definition of an n-dimensional vector space that these vectors are linearly dependent, i.e., that there exist n I numbers « al, , ex not all zero such that

(3) ao X + 1e1 +

cte = O.

Obviously «0 O. Otherwise (3) would imply the linear

depend-ence of the vectors e1, e2, e. Using (3) we have

cc °c

x =

2 e2 m en.

cco 22o «0

This proves that every x e R is indeed a linear combination of the vectors el, e2, , en.

To prove uniqueness of the representation of x in terms of the

basis vectors we assume that

x = $1e1 E2e2 +

+ e

and

X = E',e, r2e2 +

+ ete.

Subtracting one equation from the other we obtain

(15)

Since e1, e2,

,

en are linearly independent, it follows that

Et'= e2

e'2 =

= en = e'n =

i.e.,

= eta ea = E12,

,

=

$'7t

This proves uniqueness of the representation.

DEFINITION 5. if el, e2, , e form a basis in an n-dimensional

space and

(4)

x =

+ e2e2 +

+

then the numbers $1, E2, , E are called the coordinates of the vector x relative to the basis el, e2,

e.

Theorem 1 states that given a basis el, e2, , e of a vector space R every vector X E R has a unique set of coordinates.

If the coordinates of x relative to the basis e1, e2,

,

en are

E2,

,

En and the coordinates of y relative to the same basis

are ni, n2,

a, i.e., if

x

+ 2e2 +

+ Ene

Y =

+ /72; +

+

then

x + Y = (El + ni)ei + (e2 712)e2 +

+

+ n)e,

i.e., the coordinates of x + y are E, ni, e2 772, , En +

Similarly the vector /lx has as coordinates the numbers 141,

2E2, , 2En

Thus the coordinates of the sum of two vectors are the sums of the appropriate coordinates of the summands, and the coordinates of the

product of a vector by a scalar are the products of the coordinates

of that vector by the scalar in question.

It is clear that the zero vector is the only vector all of whose

coordinates are zero.

EXAMPLES. 1. In the case of three-dimensional space our defini-tion of the coordinates of a vector coincides with the definidefini-tion of

the coordinates of a vector in a (not necessarily Cartesian) coor-dinate system.

2. Let R be the space of n-tuples of numbers. Let us choose as basis the vectors

(16)

et = (1, 1, 1, 1),

e, = (0, I, 1, , 1), en = (0, 0, 0, 1),

and then compute the coordinates n

, n of the vector

x = (E1, $2, , e) relative to the basis el, e,, , e. By definition

x

=

n2e2 '

+ nen;

i.e., (E1, E, en) n/(1, 1, , 1) 272(0, 1, , 1) Ti(0, 0, , 1)

=

+ n2,

ni +

+

+ n).

The numbers (ni, .72, , n) must satisfy the relations = El,

171 + 172 E2

Then

x = (Ei, 52, En)

Ei(1, o, o) 2(o, 1,

, o) +

+ $(o, o, , 1)

=

e2e2+

+Ee,,.

It follows that in the space R of n-tuples ($1, $2, ,$) the numbers

+

172 +

+ n. = E.

Consequently,

ni " Si,

n2 " S2 -- $1, ' nn " $n

$fl-1-Let us now consider a basis for R in which the connection

be-tween the coordinates of a vector x = E2,

, en) and the

numbers et, E2, , E which define the vector is particularly simple. Thus, let

= (1,

e, = (0,

= (0, 0, 1, 0,

,

0), 0), 1).

(17)

may be viewed as the coordinates of the vector

,

EXERCISE. Show that in an arbitrary basis

e, = (all, 6122,

,

e2 (a, a22,

a,),

en

(a, a", -

, a,)

the coordinates n,, /2' 17 of a vector x ($1., e2, , en) are linear combinations of the numbers E1, E2, '

,

Let R be the vector space of polynomials of degree n 1. A very simple basis in this space is the basis whose elements are

the vectors el = 1, e, = t,

, e =

t"--1-. It is easy to see that the

coordinates of the polynomial P(t) = a0r-1 a1t"-2 + in this basis are the coefficients a_, an_2, , ao.

Let us now select another basis for R:

e' = 1,

e'2 = t a,

e's = (t

a)2,

,

e' = (t

a)"--1. Expanding P (t) in powers of (t a) we find that

P (t) = P (a) + P' (a)(t

a) +

+ [P(nl) (a)I (n-1)!](t a)n_'.

Thus the coordinates of P(t) in this basis are

P(a), P' (a), , [PR-1)(a)/ (n

1)1

Isomorphism of n-dimensional vector spaces. In the examples considered above some of the spaces are identical with others when it comes to the properties we have investigated so far. One instance of this type is supplied by the ordinary three-dimensional space R considered in Example / and the space R' whose elements are

triples of real numbers. Indeed, once a basis has been selected in

R we can associate with a vector in R its coordinates relative to that basis; i.e., we can associate with a vector in R a vector in R'.

When vectors are added their coordinates are added. When a vector is multiplied by a scalar all of its coordinates are multiplied

by that scalar. This implies a parallelism between the geometric

properties of R and appropriate properties of R'.

We shall now formulate precisely the notion of "sameness" or of

"isomorphism" of vector spaces. ei,

X =(,

,e) relative to the basis

(18)

DEFINITION 6. Two vector spaces R and RI, are said to be

iso-morphic if it is possible to establish a one-to-one correspondence

X 4-4 x' between the elements x e R and x' e R' such that if x 4> x' and y y', then

the vector which this correspondence associates with x + y is X' + y',

the vector which this correspondence associates with Ax is Ax'.

There arises the question as to which vector spaces are iso-morphic and which are not.

Two vector spaces of different dimensions are certainly not

iso-morphic.

Indeed, let us assume that R and R' are isomorphic. If x, y,

are vectors in R and x', y', are their counterparts in R' then

in view of conditions I and 2 of the definition of isomorphism

the equation Ax

yy +

= 0 is equivalent to the equation Ax'

py' +

_{= O. Hence the counterparts in R' of linearly} independent vectors in R are also linearly independent and con-versely. Therefore the maximal number of linearly independent

vectors in R is the same as the maximal number of linearly independent vectors in R'. This is the same as saying that the

dimensions of R and R' are the same. It follows that two spaces of different dimensions cannot be isomorphic.

THEOREM 2. All vector spaces of dimension n are isomorphic.

Proof: Let R and R' be two n-dimensional vector spaces. Let e2, , en be a basis in R and let e'2, , e' be a basis in R'. We shall associate with the vector

(5)

x =

e2e2 +

+ ee

the vector

x' E2e'2 +

+

i.e., a linear combination of the vectors e', with the same

coeffi-cients as in (5).

This correspondence is one-to-one. Indeed, every vector x e R has a unique representation of the form (5). This means that the

E, are uniquely determined by the vector x. But then x' is likewise

uniquely determined by x. By the same token every x' e R'

(19)

It should now be obvious that if x 4* x' and y e> y', then

x + y 4> x' + y' and 2x 4> Ax'. This completes the proof of the

isomorphism of the spaces R and R'.

In § 3 we shall have another opportunity to explore the concept

of isomorphism.

5. Subspaces of a vector space

DEFINITION 7. A subset R', of a vector space R is called a subspace

of R if it forms a vector space under the operations of addition and

scalar multiplication introduced in R.

In other words, a set R' of vectors x, y, in R is called a

subspace of R if x e R', y e R' implies x

y e R', x e R'.

EXAMPLES. 1. The zero or null element of R forms a subspace of R.

The whole space R forms a subspace of R.

The null space and the whole space are usually referred to as improper subspaces. We now give a few examples of non-trivial

subspaces.

Let R be the ordinary three-dimensional space. Consider

any plane in R going through the origin. The totality R' of vectors in that plane form a subspace of R.

In the vector space of n-tuples of numbers all vectors

x = (E1, E2, , En) for which ei = 0 form a subspace. More generally, all vectors x (E1, E2,

, E) such that

a,E1 a2E2 + + anen = 0,

where al, a2,

, an are arbitrary but

fixed numbers, form a subspace.

The totality of polynomials of degree n form a subspace of the vector space of all continuous functions.

It is clear that every subspace R' of a vector space R must

con-tain the zero element of R.

Since a subspace of a vector space is a vector space in its own

right we can speak of a basis of a subspace as well as of its dimen-sionality. It is clear that the dimension of an arbitrary subspaceof a vector space does not exceed the dimension of thatvector space.

EXERCISE. Show that if the dimension of a subspace R' of a vector space

(20)

A general method for constructing subspaces of a vector space R

is implied by the observation that if e, f, g,

are a (finite

OY infinite) set of vectors belonging to R, then the set R' of all

(finite) linear combinations of the vectors e, f, g, forms a subspace R' of R. The subspace R' is referred to as the subspace

generated by the vectors e, f, g, . This subspace is the smallest subspace of R containing the vectors e, f, g,

The subspace R' generated by the linearly independent vectors e1, e2, e, is k-dimensional and the vectors e1, e2, , eh form a basis of R' Indeed, R' contains k linearly independent vectors

(i.e., the vectors e1, e2, eh).

On the other hand, let x1,

x2, , x, be 1 vectors in R' and let 1 > k. If

x1= $11e1 + $ 12; + + elkek X2 enel. 4- $ 22; +

+ E2e,

xi E12e2 + ' + etkek,

then the I rows in the matrix

$12, P Elk $21, $22, , $2k

Etl., $22, E lk

must be linearly dependent. But this implies (cf. Example 2, page 5) the linear dependence of the vec ors x, x2, , xi. Thus

the maximal number of linearly independent vectors in R', i.e., the dimension of R', is hand the vectors e, e2, , e7, form a basis

in R'.

EXERCISE. Show that every n-dimensional vector space contains

subspaces of dimension /, / 1, 2, , n.

If we ignore null spaces, then the simplest vector spaces are one-dimensional vector spaces. A basis of such a space is a single

vector el O. Thus a one-dimensional vector space consists of all vectors me1, where a is an arbitrary scalar.

Consider the set of vectors of the form x xo °Lei, where xo

and e, 0 are fixed vectors and ranges over all scalars. It is

natural to call this set of vectors

by analogy with three-dimensional space a line in the vector space R.

(21)

Similarly, all vectors of the form /el ße2, where el and e2 are fixed linearly independent vectors and a and fi are arbitrary numbers form a two-dimensional vector space. The set of vectors

of the form

X =

xe

ße2,

where xo is a fixed vector, is called a (two-dimensional) plane. EXERCISES. 1. Show that in the vector space of n-tuples (ei, ep, '

of real numbers the set of vectors satisfying the relation

6/1E1 a2E, + + ane,

----(a1, a,, , a, arefixed numbers not all of which are zero) form a subspace

of dimension n 1.

Show that if two subspaces R, and 112 of a vector space R have only

the null vector in common then the sum of their dimensions does not exceed

the dimension of R.

Show that the dimension of the subspace generated by the vectors

e, f, g, is equal to the maximal number of linearly independent vectors among the vectors e, f, g,

6. Transformation of coordinates under change of basis. Let

e2, , en and e'1, e'2, , e' be two bases of an n-dimensional vector space. Further, let the connection between them be given

by the equations

e', = ae

a21e2 +

+ ae,

(6)

e'2 = tine'

a22 e2 +

+ an,en,

= ame,

a211e2 +

+ ae.

The determinant of the matrix d in (6) is different from zero

(otherwise the vectors e'1, e'o, e' would be linearly

depend-ent).

Let ei be the coordinates of a vector x in the first basis and

its coordinates in the second basis. Then

x = Ele

$2e2 +

$e = E'le',

+

Replacing the e', with the appropriate expressions from (6) we get

x e1e1 $2e2 +

+ ee

e'l(ae

ae2 +

ae)

±a2e)

ane2

(22)

en

aniVi + a,V, +

+ anE'n.

Thus the coordinates of the vector x in the first basis are express-ed through its coordinates in the second basis by means of the matrix st which is the transpose of .21.

To rephrase our result we solve the system (7) for ¿'i, , . Then

buE1 + 1)12E2 + + binen, ei2 = bn + b22$2 +

+ b2Jn,

e'n = b11

b2$2 + bnnen

where the b are the elements of the inverse of the matrix st.

Thus, the coordinates of a vector are transformed by means of a

matrix ri which is the inverse of the transpose of the matrix at in (6)

which determines the change of basis.

§ 2. Euclidean space

1. Definition of Euclidean space. In the preceding section a

vector space was defined as a collection of elements (vectors) for which there are defined the operations of addition and multipli-cation by scalars.

By means of these operations it is possible to define in a vector space the concepts of line, plane, dimension, parallelism of lines, etc. However, many concepts of so-called Euclidean geometry cannot be forniulated in terms of addition and multiplication by scalars. Instances of such concepts are: length of a vector, angles between vectors, the inner product of vectors. The simplest way of introducing these concepts is the following.

We take as our fundamental concept the concept of an inner product of vectors. We define this concept axiomatically. Using the inner product operation in addition to the operations of addi-Since the e, are linearly independent, the coefficients of the e, on

both sides of the above equation must be the same. Hence

auri + an

+

rn

(7) E2 = an VI

(23)

tion and multiplication by scalars we shall find it possible to

devel-op all of Euclidean geometry.

DEFINITION 1. If with every pair of vectors x, y in a real vector space R there is associated a real number (x, y) suchthat

(x, y) = (y, x),

(2x, y) = 2(x, y), (A real) (x, + x2, y) =] (x1, y] + (x2, y],

(x, x) 0 and (x, x) = 0 tt and only if x 0, then we say that an inner product is defined in R.

A vector space in which an inner product satisfying conditions 1 through 4 has been defined is referred to as a Euclidean space.

EXAMPLES. 1. Let us consider the (three-dimensional) space R

of vectors studied in elementary solid geometry (cf. Example

/,

§ 1). Let us define the inner product of two vectors in this space as the product of their lengths by the cosine of the angle between

them. We leave it to the reader to verify the fact that the opera-tion just defined satisfies condiopera-tions 1 through 4 above.

Consider the space R of n-tuples of real numbers. Let

x = (et,

23 , En) and Y = (n2, n2, , th,) be in R. In addition

to the definitions of addition

x + Y = (ei

nt, ez + n2,

en + n)

and multiplication by scalars

Ax (25,, AE,, , Aen)

with which we are already familiar from Example 2, § 1, we define

the inner product of x and y as

(x, Y) = eint + 5m +

+

it is again easy to check that properties 1 through 4 are satisfied

by (x, y) as defined.

Without changing the definitions of addition and

multipli-cation by scalars in Example 2 above we shall define the inner

product of two vectors in the space of Example 2 in a different and

more general manner.

(24)

+ an]. ennl an2 En n2 + ann ennn

We can verify directly the fact that this definition satisfies Axioms 2 and 3 for an inner product regardless of the nature of the

real matrix raj/cll. For Axiom 1 to hold, that is, for (x, y) to be

symmetric relative to x and y, it is necessary and sufficient that

a= a,

i.e., that IctO be symmetric.

Axiom 4 requires that the expression (x, x) ctikeie,

i, k=1

be non-negative fore very choice of the n numbers el, e2, , en

and that it vanish only if E, = E, =

= E,, = O.

The homogeneous polynomial or, as it is frequently called,

quadratic form in (3) is said to be positive definite if it takes on non-negative values only and if it vanishes only when all the Ei are zero. Thus for Axiom 4 to hold the quadratic form (3) must be positive definite.

In summary, for (1) to define an inner product the matrix 11(211

must be symmetric and the quadratic form associated with Ila11 must be positive definite.

If we take as the matrix fIctO the unit matrix, i.e., if we put

a = 1 and a, = O (i

k), then the inner product (x, y) deEned

by (1) takes the form

(x, y) = I eini

and the result is the Euclidean space of Example 2.

EXERCISE. Show that the matrix

(0 1)

1 0

cannot be used to define an inner product (the corresponding quadratic

form is not positive definite), and that the matrix (1

1\

1 21

can be used to define an inner product satisfying the axioms I through 4. (x, Y) = a11C1n1 + a12C1n2 + + am el /7

(25)

In the sequel (§ 6) we shall give simple criteria for a quadratic

form to be positive definite.

Let the elements of a vector space be all the continuous functions on an interval [a, b]. We define the inner product of two such functions as the integral of their product

(f, g) =fa f(t)g(t) dt.

It is easy to check that the Axioms 1 through 4 are satisfied.

Let R be the space of polynomials of degree n 1.

We define the inner product of two polynomials as in Example 4

(P, Q) = P (t)Q(t) dt.

L

2. Length of a vector. Angle between two vectors. We shall now make use of the concept of an inner product to define the length of a vector and the angle between two vectors.

DEFINITION 2. By the length of a vector x in Euclidean space we mean the number

(4) (x, x).

We shall denote the length of a vector x by the symbol N.

It is quite natural to require that the definitions of length of a vector, of the angle between two vectors and of the inner product

of two vectors imply the usual relation which connects these quantities. In other words, it is natural to require that the inner

product of two vectors be equal to the product of the lengths of

these vectors times the cosine of the angle between them. This

dictates the following definition of the concept of angle between two vectors.

DEFINITION 3. By the angle between two vectors x and y we mean

the number arc cos (x, y) lx1 1311 i.e., we put (x, y) cos 9) 13C1 1371 (5)

(26)

The vectors x and y are said to be orthogonal if (x, y) = O. The angle between two non-zero orthogonal vectors is clearly .7i/ 2. The concepts just introduced permit us to extend a number of

theorems of elementary geometry to Euclidean spaces.

The following is an example of such extension. If x and y are

orthogonal vectors, then it is natural to regard x + y as the

diagonal of a rectangle with sides x and y. We shall show that

Y12 = 1x12 1Y121

i.e., that the square of the length of the diagonal of a rectangle is equal

to the sum of the squares of the lengths of its two non-parallel sides

(the theorem of Pythagoras).

Proof: By definition of length of a vector

lx + 3/12 (x Y, x Y).

In view of the distributivity property of inner products (Axiom 3),

(X + y, x + y) = (x, x)

_{(x, Y) 4- (Y, x)} _{(Y, y).}

Since x and y are supposed orthogonal, (x, y) = (y, x) = O. Thus

1x + y12 = (x, x) _{(Y, Y)} 13112,

which is what we set out to prove.

This theorem can be easily generalized to read: if x, y, z,

-are pairwise orthogonal, then

Ex + y + z +

12 = ixr2 )7I2 1z12 +

'

3. _{The Schwarz inequality. In para. 2. we defined the angle} _w

between two vectors x and y by means of the relation (x, y)

cos 99

1x1 1Y1

If _{is to be always computable from this relation we must show}

that

I We could have axiomatized the notions of length of a vector and angle between two vectors rather than the notion of inner product. However, this course would have resulted in a more complicated system of axioms than that associated with the notion of an inner product.

(27)

y)1/1x1 IYI 1.

(X, Y)

1 <

< 1

IXI IYI

or, equivalently, that

(x, y)2

< I

Ix12 13112 which, in turn, is the same as

(6) (x, y)2 (x, x)(y, y).

Inequality (6) is known as the Schwarz inequality.

Thus, before we can correctly define the angle between two vectors by

means of the relation (5) we must prove the Schwarz inequality. 2 To prove the Schwarz inequality we consider the vector x ty where t is any real number. In view of Axiom 4 for inner products,

(x ty, x ty) 0;

i.e., for any t,

12(y , y) 2t(x, y) + (x, x) O.

This inequality implies that the polynomial cannot have two

dis-tinct real roots. Consequently, the discriminant of the equation

t2(y, y)

2t(x, y) + (x, x) =

cannot be positive; i.e.,

(x, y)2 (x, x)(y, y) _CO, which is what we wished to prove.

EXERCISE.

Prove that a

necessary and sufficient condition for

(x, y), (x, x)(y, y) is the linear dependence of the vectors x and y. EXAMPLES. We have proved the validity of (6) for an axiomatically defined Euclidean space. It is now appropriate to interpret this inequality

in the various concrete Euclidean spaces in para. 1.

1. In the case of Example 1, inequality (6) tells us nothing new. (cf. the remark preceding the proof of the Schwarz inequality.)

2 Note, however that in para. 1, Example /, of this section there is no need to prove this inequality. Namely, in vector analysis the inner product of two vectors is defined in such a way that the quantity (x, y)/1x1 1y1 is the cosine of a previously determined angle between the vectors. Conse-quently, 1(x,

(28)

2 In Example 2 the inner product was defined as

(x, y) = E t=1 It follows that

(X, X) E Et2, (y. y) = E i)(8,

2=1 2--1

and inequality (6) becomes

Y n )( n

ei M 5- ( Ei2 tif2).

i=1 i=1 i=1

In Example 3 the inner product was defined as

(1) _Y) _aikeink, i,n=1 where and (3) E aikeiek >. k=1

for any choice of the ¿i. Hence (6) implies that

il the numbers an satisfy conditions (2) and (3), then the following inequality

holds:

2 ( n

n

( E ao,$((o) E aikeik)( E 6111115112)

k=1 k=1 i, 2=1

EXERCISE. Show that if the numbers an satisfy conditions (2) and (3), anakk. (Hint: Assign suitable values to the numbers Ei, En, , En and /72, , 71 in the inequality just derived.)

In Example 4 the inner product was defined by means of the integral

fb

1(1)g (t) dt. Hence (6) takes the form

0. af(t)g(t) dt))2 fba [f(t)J' dt [g(t)p dt.

This inequality plays an important role in many problems of analysis. We now give an example of an inequality which is a consequence

of the Schwarz inequality.

If x and y are two vectors in a Euclidean space R then

(29)

1Yr-Proof:

lx y12 = (x + y, x -1- y) = (x, x) 2(x, y) + (y, y). Since 2(x, y) 213E1 it follows that

1x±y12 = (x+y, x+y) (x, x)+21x1 lyi+ (y, Y) = fix1+IYI)2, i.e., 1x + yl x1 SI, which is the desired conclusion.

EXERCISE. Interpret inequality (7) in each of the concrete Euclidean spaces considered in the beginning of thissection.

In geometry the distance between two points x and y (note the use of the same symbol to denote avectordrawn from the

origin-and a point, the tip of that vector) is defined as the length of the vector x y. In the general case of an n-dimensional Euclidean space we define the distance between x and y by the relation

d lx yl.

§ 3. Orthogonal basis. Isomorphism of Euclidean spaces

I. Orthogonal basis. In § 1 we introduced the notion of a basis (coordinate system) of a vector space. In a vector space there is

no reason to prefer one basis to another. 3 Not so in Euclidean spaces. Here there is every reason to prefer so-called orthogonal

bases to all other bases. Orthogonal bases play the same role in Euclidean spaces which rectangular coord nate systems play in analytic geometry.

DEFINITION 1. The non-zero vectors el, e2,

e of an

n-dimensional Euclidean vector space are said to form an orthogonal

basis if they are pairwise orthogonal, and an orthonormalbasis i f , in addition, each has unit length. Briefly, the vectors e,,e2,

,e

form an orthonormal basis

3 Careful reading of the proof of the isomorphism of vector spaces given in § 1 will show that in addition to proving the theorem we alsoshowed that it is possible to construct an isomorphism of two n-dimensional vector spaces which takes a specified basis in one of these spaces into a specified basis in

the other space. In particular, if e,, e,, e and e'5, e',, are two bases in R, then there exists an isomorphic mappingof R onto itself which

(30)

f 1 if

i = k

(ei, ek) =

10 if i k.

For this definition to be correct we must prove that the vectors

ei, e,,

en of the definition actually form a basis, i.e., are

linearly independent.

Thus, let

A2e2 +

+ Ae = O.

We wish to show that (2) implies Ai = 2.2

= A = O. To

this end we multiply both sides of (2) by el (i.e., form the inner

product of each side of (2) with ei). The result is 21(e1, el) + A2(e1, e2) + + 2(e1, en)

=

O.

Now, the definition of an orthogonal basis implies that

ei) 0 0, (e1, ek) = 0

for k é L

Hence A = O.

Likewise, multiplying (2) by e, we find that

A2 = 0, etc. This proves that el, e2, , e are linearly independ-ent.

We shall make use of the so-called orthogonalization procedure to prove the existence of orthogonal bases. This procedure leads

from any basis f, 2, , f to an orthogonal basis el, e2,

, e.

THEOREM 1. _{Every n-dimensional Euclidean space contains}

orthogonal bases.

Proof: By definition of an n-dimensional vector space (§ 1,

para. 2) such a space contains a basis f1, f2,

, f. We put

= f1. Next we put e, = f,

e1, where a is chosen so that

(e2, el) = 0; i.e., (f, e1, ei) = O. This means that (f,, e1)/(e1, e1).

Suppose that we have already constructed non-zero pairwise orthogonal vectors el, e2,

_",

To construct ek we put

e, =A1e11+ ' + Ak-iet

(31)

(ek, e1) = (fk 21e2-1 ' '

+

= 0,

(ek, e2) _{= (fk} Aiek.-t. -I-

+ A2e, e2)

= 0,

(fk, e2-1) 21(ek_1, e) = O.

It follows that

2-1 = (f/c, e1)/(e1, el),

=

(fk, e2)/(e2, e2), '

(f2,

So far we have not made use of the linear independence of the

vectors f1, fa, , but we shall make use of this fact presently

to prove that e, O. The vector ek is a linear combination of the vectors ek, e2,

, e, f2. But e, can be written as a

linear combination of the vector f_, and the vectors e, e2,

,

Similar statements hold for e22, e2_2, , ek. It follows that

ek = alfk ci2f2 +

+

2-1 fk.

In view of the linear independence of the vectors f1, f2, fk we may conclude on the basis of eq. (5) that ek O.

Just as ek, e2, , ek_k and fk were used to construct e, so

ek, e2,

, e, and fk+, can be used to construct e,, etc.

By continuing the process described above we obtain n non-zero,

pairwise orthogonal vectors ek, e2,

, en, i.e., an orthogonal

basis. This proves our theorem.

It is clear that the vectors

e'k = ek/lekl (k = 1, 2, , n)

form an orthonormal basis.

EXAMPLES OF ORTHOGONALIZATION. 1. Let R be the three-dimensional

space with which we are familiar from elementary geometry. Let fi,f,, f, be three linearly independent vectors in R. Put e, = f,. Next select a ..;ctor e, perpendicular to e, and lying in the plane determined by e, = f,

(e2, e,) = (f2

_{A1e2-1 +} '

+

e2-1) O. Since the vectors el, e2, , e, are pairwise orthogonal, the latter equalities become:

(fk, el) + 22-1(e1, et) = 0, (fk, 02) 22-2(e2, e2) 0,

(32)

and f2. Finally, choose e, perpendicular to ei ande, (i.e., perpendicular to the previously constructed plane).

Let R be the three-dimensional vector space of polynomials of degree not exceeding two. We define the inner product of two vectors in this space by the integral

fi P(t)Q (t) dt.

The vectors 1, t, 12 form a basis in R. We shall now orthogonalize this basis. We put e, = 1. Next we put e, = t I. Since

O (t+ I, I) = f (t dt = 2a,

it follows that a = 0, i.e., e, = t. Finally we put e, = 12 + 131 y 1.

The orthogonality requirements imply ß 0 and y = 1/3, i.e., e, = t2

1/3. Thus 1, t, P 1/3 is an orthogonal basis in R. By dividing each basis

vector by its length we obtain an orthonormal basis for R.

Let R be the space of polynomials of degree not exceeding n 1. We define the inner product of two vectors in this space as in the preceding example.

We select as basis the vectors 1, t, -,t^-2. As in Example 2 the process of orthogonalization leads to the sequence of polynomials

1, t, 12 1/3, (3/5)1,

.

Apart from multiplicative constants these polynomials coincide with the

Legendre polynomials

1 dk (12 1)k

2k k! dtk

The Legendre polynomials form an orthogonal, but not orthonormal basis in R. Multiplying each Legendre polynomial by a suitable constant we obtain an orthonormal basis in R. We shall denote the kth element of this basis by Pk(t).

Let e1, e2, , en be an orthonormal basis of a Euclidean space

R. If

x =

52e, + + enen,

y = ?he,

n2e2 +

+77e0,

then

(x,Y)=

$2e2

+ ee,

n2e2 +

+ ne).

Since

ek) = {01 _1ff

(33)

EXAMPLES. 1. Let Po(t), P,(1), P o(t) be the normed Legendre polynomials of degree 0, 1, , n. Further, let Q (t) be an arbitrary

polyno-it follows that

(x, y) El% + $27)2 + + Enfl.

Thus, the inner product of two vectors relative to an orthonormal basis

is equal to the sum of the products of the corresponding coordinates of

these vectors (cf. Example 2, § 2).

EXERCISES 1. Show that if f., f., f is an arbit ary basis, then

(x, y) = E nikEink, Cle=1

where aik = aki and ei e2, " en and ni, n2, ?I are the coordinates of x and y respectively.

2. Show that if in some basis f1, f'2,

, f

(x, Y) = 071 + + + Enn.

for every x =

" '

Ent. and Y = nifi + + n,,f, then this

basis is orthonormal.

We shall now find the coordinates of a vector x relative to an orthonormal basis el, e2,

, e.

Let

x = eie,

E2e2

ene.

Multiplying both sides of this equation by el we get

(x, e1) = el) + $2(e2 el) +

+ e(en

,

el) =

and, similarly,

(7) e, = (x, e2),

,

= (x, e).

Thus the kth coordinate of a vector relative to an orthonormal basis is

the inner product of this vector and the kth basis vector.

It is natural to call the inner product of a vector x and a vector e of length 1 the projection of x on e. The result just proved may be

states as follows: The coordinates of a vector relative to an orthonor-mal basis are the projections of this vector on the basis vectors. This is the exact analog of a statement with which we are familiar from

analytic geometry, except that there we speak of projections on the coordinate axes rather than on the basis vectors.

(34)

rnial of degree n. We shall represent Q (t) as a linear combination of the Legendre polynomials. To this end we note that all polynomials of degree

n form an n-dimensional vector space with orthonormal basis Po(t), , P(t). Hence every polynomial Q(t) of degree n can be

rep-resented in the forra

Q (t) = P(t) + c1P1(1) +

+ cP(t).

It follows from (7) that

c, I Q(t)P,(1) dt.

2 Consider the system of functions

(8) 1, cos t, sin t, cos 2t, sin 2t, cos nt, sin nt, on the interval (0, 2n). A linear combination

P(t) = (a012) + a, cos t + b, sin t + al cos 2t + + 6,, sin nt

of these functions is called a trigonometric polynomial of degree n. The totality of trigonometric polynomials of degree n form a (2n + 1) -dimen-sional space R,. We define an inner product in R1 by the usual integral

2n

(P, Q) = fo P(t)Q(t) dt.

It is easy to see that the system (8) is an orthogonal basis Indeed r 2r

cos kt cos It dt = 0 if k 1,

Jo

rn

_{sin kt cos It dt = 0,}

f.o 2r sin kt sin lt dt = 0, if k I. Since 27, .f: sin% kt dt = cos% kt dt n o and .{0 227 ldt = 2n, it follows that the functions

(8') 1/1/2a, (l/ n) cos t, (l/ n) sin t, (l/Vn) cos nt, (l/Vn) sin nt

are an orthonormal basis for R1.

2. Perpendicular from a point to a subspace. The shortest distance from a point to a subspace. (This paragraph may be left out in a

first reading.)

DEFINITION 2. Let R, be a subspace of a Euclidean space R. We shall say that a vector h e R is orthogonal to the subspace R, if it is orthogonal to every vector x e RI.

(35)

If h is orthogonal to the vectors e, e,,

e then it is also

orthogonal to any linear combination of these vectors. Indeed,

(h, et) = 0 (1= 1, 2, , m)

implies that for any numbers 2,, 22,

,

(h, 21e1 + 22; + + 2.,e,) = O.

Hence, for a vector h to be orthogonal to an m-dimensional sub-space of R it is sufficient that it be orthogonal to ni linearly

inde-pendent vectors in It1, i.e., to a basis of 12.1.

Let R, be an m-dimensional subspace of a (finite or infinite dimensional) Euclidean space R and let f be a vector not belonging

to lt1. We pose the problem of dropping a perpendicular from the

point f to 121, i.e., of finding a vector f, in R1 such that the vector

h f f is orthogonal to R1. The vector f, is called the orthogonal

projection of f on the subspace R1. We shall see in the sequel that this problem has always a unique solution. Right now we shall show that, just as in Euclidean geometry, 1111 is the shortest dis-tance from f to RI. In other words, we shall show that if f, e R1

and f1 f, then.

If -

> If - 41.

Indeed, as a difference of two vectors in RI, the vector fo f1

belongs to R, and is therefore orthogonal to h = f

f. By the

theorem of Pythagoras

If

412 + 14 f1I2 = If f0 + 4 f1I2 = If

-so that

If -

> If

We shall now show how one can actually compute the orthogo-nal projection f, of f on the subspace R1 (i.e., how to drop a

perpendicular from f on 141). Let el, e,, e, be a basis of R1.

As a vector in 121, f, must be of the form

f, =

c2e2 +

c,e,.

To find the c, we note that f f, must be orthogonal to III, i.e.,

(f _{fo, ej = O (k = 1, 2,} _{, in), or,}

(36)

Replacing f, by the expression in (9) we obtain a system of m equations for the c,

c,(ei, e1) c2(e2, e1)

+

+ c,(e,,,, e1)

= (f, e1) (k = 1, 2, m). We first consider the frequent case when the vectors e1, e2,

,

e,,, are orthonormal. In this case the problem can be solved with ease. Indeed, in such a basis the system (11) goes over into the

system

ci = (f, e,).

Since it is always possible to select an orthonormal basis in an m-dimensional subspace, we have proved that for every vector f

there exists a unique orthogonal projection f, on the subspace

We shall now show that for an arbitrary basis el, e2, , en, the system (11) must also have a unique solution. Indeed, in view of

the established existence and uniqueness of the vector f0, this

vector has uniquely determined coordinates el, c2, , cm with

respect to the basis el, e2, , e,. Since the c, satisfy the system (11), this system has a unique solution.

Thus, the coordinates c, of the orthogonal projection fo of the vector f

on the subspace 111 are determined from the system (12) or from the system (11) according as the c, are the coordinates of to relative to an orthonormal basis of R1 or a non-orthonormal basis of

A system of m linear equations in in unknowns can have a unique solution only if its determinant is different from zero.

It follows that the determinant of the system (11) (el, (e2, e1) ' (era, ei)

(ex, ;) (e2, e2)

(e

(ex, e,) (e2, e,)

(e e,)

must be different from zero. This determinant is known as the

Gramm determinant of the vectors e1, e2, e,.

EXAMPLES. 1. The method of least squares. Let y be a linear function of x1, x2, , x,; i.e., let

(37)

where the c, are fixed unknown coefficients. Frequently the c, are determined experimentally. To this end one carries out a number

of measurements of ael, x2,

, x, and y. Let x, x2,,

X, y,

denote the results of the kth measurement. One could try to determine the coefficients c1, c2, , cm from the system of

equa-tions

X11C1 X21C2 + + XnaCm = y1, Xl2C1 + X22 C2 + + Xm2 = Y2,

xincl x2c2 +

+ xmc, = y .

However usually the number n of measurements exceeds the

number m of unknowns and the results of the measurements are never free from error. Thus, the system (13) isusually incompati-ble and can be solved only approximately. There arises the proincompati-blem of determining t1, c2, , cm so that the left sides of the equations

in (13) are as "close" as possible to the corresponding right sides. As

a measure

of "closeness" we take the so-called mean

deviation of the left sides of the equations from the corresponding free terms, i.e., the quantity

E (X1kC1 X2nC2 + + XinkCk Ykr

k=1

The problem of minimizing the mean deviation can be solved directly. However, its solution can be immediately obtained from

the results just presented.

Indeed, let us consider the n-dimensional Euclidean space of

n-tuples and the

following vectors:

e, = (x11, x,

, x1), e2 = (X21 X22, 4 4, X2n),

,

= (x,, x,2,

, x), and

f =_- (y1, y2, y) in that space. The right sides of (13) are the components of the vector f and the left sides, of the vector

ciel c2e2 +

+

Consequently, (14) represents the square of the distance from f to clel c2e2 + + c,e,n and the problem of minimizing the

mean deviation is equivalent to the problem

of choosing ni numbers e1, e2, , Cm so as to minimize the distance from f to

(38)

the vectors el, e2, , e, (supposed linearly independent), then

our problem is the problem of finding the projection of f on RI. As we have seen (cf. formula (11)), the numbers c1, c2, , which solve this problem are found from the system of equations

(e1, ei)c, (e2, e1)c2 +

+ (e, el)a, = (f, e1),

(15) (e1, e2)c1 (e2, e2)c2 + ' (e e2)c,,, = (f, e2),

I

x,c2 k

(e1, en)ci (e2, em)c, +

+ (e, en)c, = (f, e,),

where

(f, ex) = xxiYs, ek) = xoxk,,

I =1 1-1

The system of equations (15) is referred to as the system of normal equations. The method of approximate solution of the system (13) which we have just described is known as the method of least squares.

EXERCISE. Use the method of least squares to solve the system of equations

2c 3 3c 4

4c = 5.

Solution: el (2, 3, 4), f = (3, 4, 5). In this case the normal system

consists of the single equation

(e1, enc = (ee f), 29c = 38; c = 38/29.

When the system (13) consists of n equat ons in one unknown

xic

(13') x2c = y2,

xc

the (least squares) solution is

(x, y) _k=1

xox

c

(39)

In this case the geometric significance of c is that of the slope of a line through the origin which is "as close as possible" to the points (x1, y1), (x2, y2),

, (x, y).

2. Approximation of functions by means of trigonometric polynomials. Let (t) be a continuous function on the interval [0, 2n]. It is frequently

necessary to find a trigonometric polynomial P(t) of given degree v,Mich differs from f(t) by as little as possible. We shall measure the proximity of

1(t) and P (t) by means of the integral u(t) - 13(1)i2 dl.

Thus, we are to find among all trigonometric polynom als of degree n,

P(t) = (a012) + a, cos t b, sin t + + an cos nt b sin nt,

that polynomial for which the mean deviation from f (t) is a minimum. Let us consider the space R of continuous functions on the interval

10, 2:r] in which the inner product is defined, as usual, by means of the

integral

(I, g) = 21,f(t)g(t) dt. Then the length of a vector f(t) in R is given by

=

6atr

EN)? dt.

Consequently, the mean deviation (16) is simply the square of the distance from j(t) to P(t). The trigonometric polynomials (17) form a subspace R, of R of dimension 2n + 1. Our problem is to find that vector of R, which is

closest to fit), and this problem is solved by dropping a perpendicular from f(t) to R1.

Since the functions

1 cos t sin t eo e, e, , V2,7 ' Nhz 1 Or cos nt sin nt e,_, ;

e,

Ahr

form an orthonormal basis in R, (cf. para. 1, Example 2), the required element P(t) of R, is 2n P(t) = E ce1, k=0 where ck = (t, el), or

(40)

I 27 1 J.2" e,,....1c,

= --

f(t)dt; tik., f(t) cos kt dt; Vart o Vn o 1 27 C2k - , 1(t) sin kt dt. ' 7(/0

ThuS, for the mean deviation of the trigonometric polynomial

a, n

P(t) := * Ea,. cos kt + b sin kt

2 k=1

from f(t) to be a minimum the coefficients a and bk must have the values

127

1 2

=7

a,

5

fit) dt; a =-

5

1(1) cos kt dt; X _o _x _o

127 b =

5

f(t) sin kt dt.

"Jo

The numbers a,. and bk defined above are called the Fourier coefficients of

the function fit).

3. Isomorphism of Euclidean spaces. We have investigated a

number of examples of n-dimensional Euclidean spaces. In each of them the word "vector" had a different meaning. Thus in § 2,

Example 2, "vector" stood for an n-tuple of real numbers, in § 2, Example 5, it stood for a polynomial, etc.

The question arises which of these spaces are fundamentally different and which of them differ only in externals. To be more specific:

DEFINITION 2. Two Euclidean spaces R and R', are said to be isomorphic if it is possible to establish a one-to-one correspondence

x 4> x' (x e R, x' e R') such that

I. If X 4> X' and y 4> y', then x + y X' + y', i.e., if OW

correspondence associates with X E R the vector X' E R' and with

y e R the vector y' e R', then it associates with the sum x + y the

sum x' y'.

If x x', then Axe> Ax'.

If x 4> x' and y 4--> y', then (x, y) = (x', y'); i.e., the inner

products of corresponding pairs of vectors are to have the same value.

We observe that if in some n-dimensional Euclidean space R a

theorem stated in terms of addition, scalar multiplication and inner multiplication of vectors has been proved, then the same

(41)

theorem is valid in every Euclidean space R', isomorphic to the space R. Indeed, if we replaced vectors from R appearing in the statement and in the proof of the theorem by corresponding

vec-tors from R', then, in view of the properties 1, 2, 3 of the definition

of isomorphism, all arguments would remain unaffected. The following theorem settles the problem of isomorphism of

different Euclidean vector spaces.

THEOREM 2. All Euclidean spaces of dimension n are isomorphic.

We shall show that all n-dimensional Euclidean spaces are

isomorphic to a selected "standard" Euclidean space of dimension

n. This will prove our theorem.

As our standard n-dimensional space R' we shall take the space of Example 2, § 2, in which a vector is an n-tuple of real numbers

and in which the inner product of two vectors x' = (E1, ¿2,

, e)

and y' = (7? n

_a

, nn) is defined to be (x', = Eft?' + $2n2 +

+

Now let R be any n-dimensional Euclidean space. Let el,

e2, , en be an orthonormal basis in R (we showed earlier that every Euclidean space contains such a basis). We associate with

the vector

x =

e2e2 +

+ ene

in R the vector

= (81, 82, , 8n)

in R'.

We now show that this correspondence is an isomorphism. The one-to-one nature of this correspondence is obvious.

Conditions 1 and 2 are also immediately seen to hold. It remains

to prove that our correspondence satisfies condition 3 of the defini-tion of isomorphism, i.e., that the inner products of corresponding pairs of vectors have the same value. Clearly,

(x, Y) = $1ni + $2n2 + + Ennn,

because of the assumed orthonormality of the e,. On the other

hand, the definition of inner multiplication in R' states that

(42)

Thus

(x', y') = (x, Y):

i.e., the inner products of corresponding pairs of vectors have

indeed the same value.

This completes the proof of our theorem.

EXERCISE. Prove this theorem by a method analogous to that used in

para. 4, § 1.

The following is an interesting consequence of the isomorphism

theorem. Any "geometric" assertion (i.e., an assertion stated in

terms of addition, inner multiplication and multiplication of vectors by scalars) pertaining to two or three vectors is true if it is true in elementary geometry of three space. Indeed, the vectors in

question span a subspace of dimension at most three. This sub-space is isomorphic to ordinary three sub-space (or a subsub-space of it), and it therefore suffices to verify the assertion in the latter space.

In particular the Schwarz inequality a geometric theorem about a pair of vectors is true in any vector space because it is

true in elementary geometry. We thus have a new proof of the Schwarz inequality. Again, inequality (7) of § 2

Ix + yl ixl

is stated and proved in every textbook of elementary geometry as the proposition that the length of the diagonal of a parallelogram

does not exceed the sum of the lengths of its two non-parallel sides,

and is therefore valid in every Euclidean space. To illustrate, the inequality,

(f(t) g(t))2 dt _{VSa [}

bb

f (t)]2 dt VSa [g(t)]2 dt,

which expresses inequality (7), § 2, in the space of continuous func-tions on [a, b], is a direct consequence, via the isomorphism

theo-rem, of the proposition of elementary geometry just mentioned.

§ 4. Bilinear and quadratic forms

In this section we shall investigate the simplest real valued

functions defined on vector spaces.

(43)

1. Linear functions. Linear functions are the simplest functions defined on vector spaces.

DEFINITION I. A linear function (linear form) f is said to be

defined on a vector space if with every vector x there is associated a number f(x) so that the following conditions hold:

_fix Y) f(x) +AY),

!(Aa) = 1f (x).

Let et, e2, , en be a basis in an n-dimensional vector space.

Since every vector x can be represented in the form

x =

enen,

the properties of a linear function imply that

f(x) =

f&ie, + E2e2 +

+ ene)

eifie,)

¿j(e2) +

+ enf(e.)-Thus, if et, e2, , en is a basis of an n-dimensional vector space R, x a vector whose coordinates in the given basis are E1, $2,

, E,

and f a linear function defined on R, then

(1)

f(x) = aiel

a252+ -in amen,

where f(e) =

= 1, 2, , n).

The definition of a linear function given above coincides with

the definition of a linear function familiar from algebra. What

must be remembered, however, is the dependence of the a, on the choice of a basis. The exact nature of this dependence is easily

explained.

Thus let et, e2,

, e and e'1, e'2,

, e'n be two bases in R.

Let the e', be expressed in terms of the basis vectors et, e2,

, e

by means of the equations

e', = acne, + ac21e2 + + ocnIenr

e'2 acne, + 122e2 + + (Xn2en,

e'

2,2,e2 +

+ 2,e.

Further, let

(44)

relative to the basis el, e2, , en, and

f(x) = a'le', + a'2E12 +

+ a'e'

relative to the basis e'1, e'2,

e'.

Since a, = f(e,) and a' k = f(e'), it follows that

ai = f(xikei

cc2ke2 + + C(nk en) = Xlki(ei) ac22f(e2)

+

+ akf(e.) = ctik

+ c(2k az +

+ anka.

This shows that the coefficients of a linear form transform under a

change of basis like the basis vectors (or, as it is sometimes said, cogrediently).

2. Bilinear forms. In what follows an important role is played by

bilinear and quadratic forms (functions).

DEFINITION 2. A (x; y) is said to be a bilinear function (bilinear form) of the vectors x and y if

for any fixed y, A (x; y) is a linear function of x,

for any fixed x, A (x; y) is a linear function of y.

In other words, noting the definition of a linear function, conditions 1 and 2 above state that

A (xi + x2; y) = A (x,; y) + A (x2; y), A (Ax; y) = (x; y),

A (x; + y2) = A (x; yi) + A (x; yz), A (x; pty) = yA(x; y).

EXAMPLES. 1. Consider the n-dimensional space of n-tuples of

real numbers. Let x = ($1, bt2,

, E), y =

n2, , nk), and

define

A (x; y) = a12e072 ' '

+

(2) + an 27/1 anE27/2 ' ' a2ne2nn

anlennl an2enn2 + + annennn

A (x; y) is a bilinear function. Indeed, if we keep y fixed, i.e., if we regard ni , n2, n as constants, ae ink depends linearly

i, k1

on the $,; A (x; y) is a linear function of x =- (E1, ey, en). Again, if $1, $2,

E are kept constant, A (x; y) is a linear

(45)

2. Let K(s, t) be a (fixed) continuous function of two variables s, t. Let R be the space of continuous functions f(t). If we put

b b

A (f; g)

=

I

s

K(s, t)f(s)g(t) ds dt,

then A (f; g) is a bilinear function of the vectors f and g. Indeed,

the first part of condition 1 of the definition of a bilinear form means, in this case, that the integral of a sum is the sum of the

integrals and the second part of condition 1 that the constant A.

may be removed from under the integral sign. Conditions 2 have analogous meaning.

If K(s, t) 1, then

b

A (f; g) =

jib

f(s)g(t) ds dt = f(s) ds g(t) dt,

i.e., A( f; g) is the product of the linearfunctions!: f(s) ds and

b

Ja g(t) dt.

EXERCISE. Show that if 1(x) and g(y) are linear functions, then their product 1(x) g(y) is a bilinear function.

DEFINITION 3. A bilinear function (bill ear form) is called

symmetric 21

A (x; y) = A (y x)

for arbitrary vectors x and y.

In Example / above the bilinear form A (x; y) defined by (2) is

symmetric if and only if aik= aid for all i and k.

The inner product (x, y) in a Euclidean space is an example of a symmetric bilinear form.

Indeed, Axioms I, 2, 3 in the definition of an inner product (§ 2) say that the inner product is a symmetric, bilinear form. 3. The matrix of a bilinear form. We defined a bilinear form

axiomatically. Now let el, e2, en be a basis in n-dimensional space. We shall express the bilinear form A (x; y) using the coordinates ei, e2, e of x and the coordinates ni, n,,

, ri of

y relative to the basis e1, e2,

e. Thus,

A (x; y) = A (ei ei $2e2 + + e en; /he, )72e2 +

+ 71e).

In view of the properties 1 and 2 of bilinear forms

(46)

A (x; y) = I A (ei; k=1

or, if we denote the constants A (ei; e1) by a,,

A (x; y) = ai,e

To sum up: Every bilinear form in n-dimensional space can be written as

A (x: Y) = azkink, i, 1=1

where X -r- $1e1 +

4- e, y = ?he,. +

_{+ tine, and}

a, = A (ei; ek).

The matrix a/ = is called the matrix of the bilinear form

A (x; y) relative to the basis el, e2, , en.

Thus given a basis e1, e2, , e the form A (x; y) is determined

by its matrix at= Ha11.

EXAMPLE. Let R be the three-dimensional vector space of triples (El. EP 59) of real numbers. We define a bilinear form in R by means of the equation

A (x; y) = El% + 2eon + 3eana.

Let us choose as a basis of R the vectors

e, = (1, 1, 1); e, = (1, 1, 1);

e, = (1, 1, 1),

6

0 4

= 0 6

d.

4 2

6

It follows that if the coordinates of x and y relative to the basis_{e,, e2, e,} are denoted by 5',, 5',, and n'2, TA, respectively, then

A (x; y) = 65'oy, 4E', n'a 6E', + _{+ 2} _{3.17' +}

and compute the matrix .91 of the bilinear form A (x; y). Nlaking use of (4) we find that: an 1 1 + 2 1 1 + 3 1 1 = 6,

a = an

1 1 + 2 1 1 + 3 1 (-1) = 0,

a = 11 + 2

1 1 + 3 (-1) (-1) = 6,

a= a

1

1 + 2 U (-1) + 3

1

(-1) = 4,

a = a= 1

1 1- 2 1

(-1) + 3 (-1)(-1) = 2,

a

1 - 1 + 2 (-1) (-1) + 3

(-1) (I) = 6,

(47)

(5)

4. Transformation of the matrix of a bilinear form under a change of basis. Let el, e2,

e and f, f2,

f be two bases of an

n-dimensional vector space. Let the connection between these

bases be described by the relations

= cue'

c21e2 d-

+ ce,

c22e2 +

+

f2 = cue].

In = crne + c2e2 +

+

which state that the coordinates of the vector fn relat ve to the basis el., e2, , e are cm, c22,

- , c. The matrix

1

[clici2

w C21C22 . C2n

cn2c2 cn

is referred to as the matrix of transition from the basis e,, e2,

to the basis f1, f2,

, f.

Let si = I laid I be the matrix of a bilinear form A (x;y) relative to the basis e1, e2, , e and .4 = I ibikl 1, the matrix of that form relative to the basis f1, f2, L. Our problem consists in finding the matrix I IbI I given the matrix

I kJ.

By definition [eq. (4)] b = A (f,,, fe), i.e., bp,is the value of our

bilinear form for x f, y = fe. To find this value we make use of (3) where in place of the ei and ni we put the coordinates of fp

and fe relative to the basis e1, e2,

, e, i.e., the numbers

c, ca,

, en and c, c,

, c,.

It follows that

(6) by, = A (fp; fe) =

acic..

k=-1

We shall now express our result in matrix form. To this end we put e1,, = c' The e',,1 are, of course, the elements of the

transpose W' of W. Now b becomes 4

4 As is well known, the element c of a matrix 55' which is the product of two matrices at = 11a0,11

cik E ai,zbk. c/-1

Using this definition twice one can show that i = d.Vt, then

= E abafic Aft.

a.ß=1