LECTURES ON
LINEAR ALGEBRA
L M. GELTAND
Academy of Sciences, Moscow, U.S.S.R.
Translated from the Revised Second Russian Edition by A. SHENITZER
Adelphi College, Garden City, New York
INTERSCIENCE PUBLISHERS, INC., NEW YORK
ALL RIGHTS RESERVED
LIBRARY OF CONGRESS CATALOG CARD NUMBER 61-8630 SECOND PRINTING 1963
The second edition differs from the first in two ways. Some of the material was substantially revised and new material was added. The
major additions include two appendices at the end of the book dealing with computational methods in linear algebra and the theory of pertur-bations, a section on extremal properties of eigenvalues, and a section
on polynomial matrices ( §§ 17 and 21). As for major revisions, the
chapter dealing with the Jordan canonical form of a linear transforma-tion was entirely rewritten and Chapter IV was reworked. Minor changes and additions were also made. The new text was written in
colla-boration with Z. Ja. Shapiro.
I wish to thank A. G. Kurosh for making available his lecture notes
on tensor algebra. I am grateful to S. V. Fomin for a number of valuable comments Finally, my thanks go to M. L. Tzeitlin for assistance in the preparation of the manuscript and for a number of suggestions.
September 1950 I. GELtAND
Translator's note: Professor Gel'fand asked that the two appendices
This book is based on a course in linear algebra taught by the author in the department of mechanics and mathematics of the Moscow State
University and at the Byelorussian State University.
S. V. Fomin participated to a considerable extent in the writing of this book. Without his help this book could not have been written.
T h e author wishes to thank Assistant Professor A. E. Turetski of the Byelorussian State University, who made available to him notes of the lectures given by the author in 1945, and to D. A. Raikov, who carefully
read the manuscript and made a number of valuable comments. The material in fine print is not utilized in the main part of the text and may be omitted in a first perfunctory reading.
January 1948 I. GEL'FAND
Page
Preface to the second edition
Preface to the first edition vii
I. n-Dimensional Spaces. Linear and Bilinear Forms
ti-Dimensional vector spaces
Euclidean space 14
Orthogonal basis. Isomorphism of Euclidean spaces 21
Bilinear and quadratic forms 34
Reduction of a quadratic form to a sum of squares 42 Reduction of a quadratic form by means of a triangular
trans-formation 46
The law of inertia 55
Complex n-dimensional space 60
II. Linear Transformations 70
Linear transformations. Operations on linear transformations . 70
Invariant subspaces. Eigenvalues and eigenvectors of a linear
transformation 81
The adjoint of a linear transformation 90
Self-adjoint (Hermitian) transformations. Simultaneous
reduc-tion of a pair of quadratic forms to a sum of squares 97
Unitary transformations 103
Commutative linear transformations. Normal transformations . 107 Decomposition of a linear transformation into a product of a unitary and self-adjoint transformation
Linear transformations on a rea/ Euclidean space 114
Extremal properties of eigenvalues 126
III. The Canonical Form of an Arbitrary Linear Transformation . 132
The canonical form of a linear transformation 132
Reduction to canonical form 137
Elementary divisors 142
Polynomial matrices 149
IV. Introduction to Tensors 164
The dual space 164
n-Dimensional Spaces. Linear and
Bilinear Forms
§ 1. a-Dimensional vector spaces
1. Definition of a vector space. We frequently come across objects which are added and multiplied by numbers. Thus
In geometry objects of this nature are vectors in three dimensional space, i.e., directed segments. Two directed segments
are said to define the same vector if and only if it is possible to translate one of them into the other. It istherefore convenient to
measure off all such directed segments beginning with one common
point which we shall call the origin. As is well known the sum of
two vectors x and y is, by definition, the diagonal of the
parallelo-gram with sides x and y. Thedefinition of multiplication by (real) numbers is equally well known.
In
algebra we come
across systems ofn numbers
x = (ei, E2, , e) (e.g., rows of a matrix, the set of coefficients of a linear form, etc.). Addition and multiplication of n-tuples bynumbers are usually defined as follows: bythe sum of the n-tuples x = (E1, E2,
, ¿) and y =(,n2,
, ?In) we mean the n-tuplex + y =
+ ij, E2 + n2,,
+ n). By the product of the
number A and the n-tuple x = (ei , e2, e) we mean the n-tuple ix= (241, )42, , AE ).In analysis we define the operations of addition of functions and multiplication of functions by numbers. In the sequel we shall consider all continuous functions defined on some interval fa, b].
In the examples just given the operationsof addition and
multi-plication by numbers are applied to entirelydissimilar objects. To investigate all examples of this nature from a unified point of view we introduce the concept of a vector space.
DEFINITION 1. A set R of elements x, y, z, is said to be a
vector space over a field F if:
With every two elements x and y in R there is associated an
element z in R which is called the sum of the elements x and y. The sum of the elements x and y is denoted by x + y.
With every element x in R andevery numeer A belongingtua c
field F there is associated an element Ax in R. Ai( is referredto as the product of x by A.
The above operations must satisfy the following requirements
(axioms):
I.
1. x+y=y+x
(commutativity)(x + y) 4- z = x
(y z) (associativity) R contains an element 0 such that x = x for all x inR. 0 is referred to as the zero element.
For every x in R there exists (in R) an element denoted by
x with the property x ( x) = O.
II.
1. Ix= x
2. (ßx) = ß(x).
III. 1. (cc
+ fi)x =
+ fix2. ct(x
y) =
cor.It is not an oversight on our part thatwe have not specified how
elements of R are to be added and multiplied by numbers. Any
definitions of these operations are acceptable as long
as the
axioms listed above are satisfied. Whenever this is the case we aredealing with an instance of a vector space.
We leave it to the reader to verify that the examples 1, 2, 3
above are indeed examples of vector spaces.
Let us give a few more examples of vector spaces.
4. The set of all polynomials of degree not exceeding some natural number n constitutes a vector space if addition of
polyno-mials and multiplication of polynopolyno-mials by numbers are defined in
the usual manner.
We observe that under the usual operations of addition and multiplication by numbers the set of polynomials of degree n does not form a vector space since the sum of two polynomials of degree n may turn out to be a polynomial of degree smaller thann. Thus
(r
t)( r
t) =
of the matrices I laikl I and we take the matrix Hai, + b1,11.
As the product of the number X and the matrix 11aikl we take the matrix 112a1tt It is easy to see that the above set R is now a
vector space.
It is natural to call the elements of a vector space vectors. The
fact that this term was used in Example I should not confuse the
reader. The geometric considerations associated with this word will help us clarify and even predict a number of results.
If the numbers X, y, involved in the definition of a vector space are real, then the space is referred to as a real vector space. If the numbers 2, it, are taken from the field of complex numbers,
then the space is referred to as a complex vector space.
More generally it may be assumed that A, 1.4, , are elements of an
arbitrary field K. Then R is called a vector space over the field K. Many
concepts and theorems dealt with in the sequel and, in particular, the
contents of this section apply to vector spaces over arbitraryfields.
How-ever, in chapter I we shall ordinarily assume that R is a real vector space. 2. The dimensionality of a vector space. 1/Ve nowdefine the notions of linear dependence and independence of vectors which are of fundamental importance in all that follows.
DEFINITION 2. Let R be a vector space. W e shall say that the vectors x, y, z, , IT are linearly dependent if there exist numbers
cc, /3, y, 9, not all equal to zero such that
(1)
yz +
+ Ov = O.Vectors which are not linearly dependent are said to be linearly independent. In other words,
a set of vectors x, y, z, , v is said to be linearly independent if the equality
GO( + /3y + yz + + Ov = 0
implies that =- p y 0 = O.
Let the vectors x, y, z, , IT be linearly dependent, i.e., let x, y, z, , IT be connected by a relation of the form (1) with at
least one of the coefficients, a, say, unequal to zero. Then
fty
yz Ov.(/57X) = A, = tu,
', (01x) =
we have
(2) x /1.y pZ + + Of.
Whenever a vector x is expressible through vectors y, z,
, y
in the form (2) we say that x is a linear combination of the vectors
y, z, , y.
Thus, if the vectors x, y, z, , y are linearly dependent then at
least one of them is a linear combination of the others. We leave it to the reader to prove that the converse is also true, i.e., that if one of a set of vectors is a linear combination of the remaining vectors then
the vectors of the set are linearly dependent.
EXERCISES. 1. Show that if one of the vectors x, y, z, y is the zero vector then these vectors are linearly dependent.
2. Show that if the vectors x, y, z, are linearly dependent and u, v, are arbitrary vectors then the vectors x, y, z, , u, y, are linearly
dependent.
We now introduce the concept of dimension of a vector space.
Any two vectors on a line are proportional, i.e., linearly depend-ent. In the plane we can find two linearly independent vectors
but any three vectors are linearly dependent. If R is the set of vectors in three-dimensional space, then it is possible to find three linearly independent vectors but any four vectors are linearly
dependent.
As we see the maximal number of linearly independent vectors
on a straight line, in the plane, and in three-dimensional space coincides with what is called in geometry the dimensionality of the
line, plane, and space, respectively. It is therefore natural to make the following general definition.
DEF/NITION 3. A vector space R is said to be n-dimensional if it
contains n linearly independent vectors and if any n 1 vectors in R are linearly dependent.
If R is a vector space which contains an arbitrarily large number of linearly independent vectors, then R is said to be
infinite-dimensional.
Infinite-dimensional spaces will not be studied in this book. Ve shall now compute the dimensionality of each of the vector
As we have already indicated, the space R of Example 1 contains three linearly independent vectors and any four vectors in it are linearly dependent. ConsequentlyR is three-dimensional. Let R denote the space whose elements are n-tuples of real
numbers.
This space contains n linearly independent vectors For instance,
the vectors
are easily seen to belinearly independent. On the other hand, any m vectors in R, tre > n, are linearly dependent. Indeed, let
Yi(nii, n12,
Y2 = (V21, n22, , n2n), Yin = (nml, nm2, , nmn)
be ni vectors and let ni > n. The number of linearly independent
rows in the matrix
17ml, 717n2, n tnn
cannot exceed n (the number of columns). Since m > n, our ni
rows are linearly dependent. But this implies the linear dependence
of the vectors y1, y2, , Ynt Thus the dimension of R is n.
Let R be the space of continuous functions. Let N be any
natural number. Then the functions: f1(t) 1, f2(t) = t, fN(t) = tN-1 form a set of linearly independent vectors (the proof
of this statement is left to the reader). It follows that our space contains an arbitrarily large number of linearly independent functions or, briefly, R is infinite-dimensional.
Let R be the space of polynomials of degree n 1. In
this space the n polynomials 1, t, , t"-1 arelinearly independent. It can be shown that any ni elements of R, ni > n, are linearly
dependent. Hence R is n-dimensional.
[
nu., n12, Thnn21, n22y 172n
xi -= (1, 0, , 0), x, = (0, 1, , 0),
5. We leave it to the reader to prove that the space of n x n matrices [a2kH is n2-dimensional.
3. Basis and coordinates in n-dimensional space
DEFINITION 4. Any set of n linearly independent vectors
e1, e2, , en of an n-dimensional vector space R is called a basis of R.
Thus, for instance, in the case of the space considered in Example
1 any-three vectors which are not coplanar form a basis.
By definition of the term "n-dimensional vector space" such a space contains n linearly independent vectors, i.e., it contains a
basis.
THEOREM 1. Every vector x belonging to an n-dimensional vector
space R can be uniquely represented as a linear combination of basis vectors.
Proof: Let e1, e2, , e be a basis in R. Let x be an arbitrary
vector in R. The set x, e1, e2, e contains n + 1 vectors. It
follows from the definition of an n-dimensional vector space that these vectors are linearly dependent, i.e., that there exist n I numbers « al, , ex not all zero such that
(3) ao X + 1e1 +
cte = O.
Obviously «0 O. Otherwise (3) would imply the linear
depend-ence of the vectors e1, e2, e. Using (3) we have
cc °c
x =
2 e2 m en.cco 22o «0
This proves that every x e R is indeed a linear combination of the vectors el, e2, , en.
To prove uniqueness of the representation of x in terms of the
basis vectors we assume that
x = $1e1 E2e2 +
+ e
and
X = E',e, r2e2 +
+ ete.
Subtracting one equation from the other we obtain
Since e1, e2,
,
en are linearly independent, it follows thatEt'= e2
e'2 == en = e'n =
i.e.,
= eta ea = E12,
,
=
$'7tThis proves uniqueness of the representation.
DEFINITION 5. if el, e2, , e form a basis in an n-dimensional
space and
(4)
x =
+ e2e2 ++
then the numbers $1, E2, , E are called the coordinates of the vector x relative to the basis el, e2,
e.
Theorem 1 states that given a basis el, e2, , e of a vector space R every vector X E R has a unique set of coordinates.
If the coordinates of x relative to the basis e1, e2,
,
en areE2,
,
En and the coordinates of y relative to the same basisare ni, n2,
a, i.e., if
x
+ 2e2 +
+ Ene
Y =
+ /72; +
+
then
x + Y = (El + ni)ei + (e2 712)e2 +
+
+ n)e,
i.e., the coordinates of x + y are E, ni, e2 772, , En +
Similarly the vector /lx has as coordinates the numbers 141,
2E2, , 2En
Thus the coordinates of the sum of two vectors are the sums of the appropriate coordinates of the summands, and the coordinates of the
product of a vector by a scalar are the products of the coordinates
of that vector by the scalar in question.
It is clear that the zero vector is the only vector all of whose
coordinates are zero.
EXAMPLES. 1. In the case of three-dimensional space our defini-tion of the coordinates of a vector coincides with the definidefini-tion of
the coordinates of a vector in a (not necessarily Cartesian) coor-dinate system.
2. Let R be the space of n-tuples of numbers. Let us choose as basis the vectors
et = (1, 1, 1, 1),
e, = (0, I, 1, , 1), en = (0, 0, 0, 1),
and then compute the coordinates n
, n of the vector
x = (E1, $2, , e) relative to the basis el, e,, , e. By definition
x
=
n2e2 '+ nen;
i.e., (E1, E, en) n/(1, 1, , 1) 272(0, 1, , 1) Ti(0, 0, , 1)=
+ n2,ni +
+
+ n).
The numbers (ni, .72, , n) must satisfy the relations = El,
171 + 172 E2
Then
x = (Ei, 52, En)
Ei(1, o, o) 2(o, 1,
, o) +
+ $(o, o, , 1)=
e2e2++Ee,,.
It follows that in the space R of n-tuples ($1, $2, ,$) the numbers
+
172 ++ n. = E.
Consequently,
ni " Si,
n2 " S2 -- $1, ' nn " $n$fl-1-Let us now consider a basis for R in which the connection
be-tween the coordinates of a vector x = E2,
, en) and the
numbers et, E2, , E which define the vector is particularly simple. Thus, let= (1,
e, = (0,
= (0, 0, 1, 0,,
,
0), 0), 1).may be viewed as the coordinates of the vector
,
EXERCISE. Show that in an arbitrary basis
e, = (all, 6122,
,
e2 (a, a22,
a,),
en
(a, a", -
, a,)
the coordinates n,, /2' 17 of a vector x ($1., e2, , en) are linear combinations of the numbers E1, E2, '
,
Let R be the vector space of polynomials of degree n 1. A very simple basis in this space is the basis whose elements are
the vectors el = 1, e, = t,
, e =
t"--1-. It is easy to see that thecoordinates of the polynomial P(t) = a0r-1 a1t"-2 + in this basis are the coefficients a_, an_2, , ao.
Let us now select another basis for R:
e' = 1,
e'2 = t a,e's = (t
a)2,,
e' = (t
a)"--1. Expanding P (t) in powers of (t a) we find thatP (t) = P (a) + P' (a)(t
a) +
+ [P(nl) (a)I (n-1)!](t a)n_'.
Thus the coordinates of P(t) in this basis areP(a), P' (a), , [PR-1)(a)/ (n
1)1
Isomorphism of n-dimensional vector spaces. In the examples considered above some of the spaces are identical with others when it comes to the properties we have investigated so far. One instance of this type is supplied by the ordinary three-dimensional space R considered in Example / and the space R' whose elements are
triples of real numbers. Indeed, once a basis has been selected in
R we can associate with a vector in R its coordinates relative to that basis; i.e., we can associate with a vector in R a vector in R'.
When vectors are added their coordinates are added. When a vector is multiplied by a scalar all of its coordinates are multiplied
by that scalar. This implies a parallelism between the geometric
properties of R and appropriate properties of R'.
We shall now formulate precisely the notion of "sameness" or of
"isomorphism" of vector spaces. ei,
X =(,
,e) relative to the basisDEFINITION 6. Two vector spaces R and RI, are said to be
iso-morphic if it is possible to establish a one-to-one correspondence
X 4-4 x' between the elements x e R and x' e R' such that if x 4> x' and y y', then
the vector which this correspondence associates with x + y is X' + y',
the vector which this correspondence associates with Ax is Ax'.
There arises the question as to which vector spaces are iso-morphic and which are not.
Two vector spaces of different dimensions are certainly not
iso-morphic.
Indeed, let us assume that R and R' are isomorphic. If x, y,
are vectors in R and x', y', are their counterparts in R' then
in view of conditions I and 2 of the definition of isomorphism
the equation Ax
yy +
= 0 is equivalent to the equation Ax'py' +
= O. Hence the counterparts in R' of linearly independent vectors in R are also linearly independent and con-versely. Therefore the maximal number of linearly independentvectors in R is the same as the maximal number of linearly independent vectors in R'. This is the same as saying that the
dimensions of R and R' are the same. It follows that two spaces of different dimensions cannot be isomorphic.
THEOREM 2. All vector spaces of dimension n are isomorphic.
Proof: Let R and R' be two n-dimensional vector spaces. Let e2, , en be a basis in R and let e'2, , e' be a basis in R'. We shall associate with the vector
(5)
x =
e2e2 ++ ee
the vector
x' E2e'2 +
+
i.e., a linear combination of the vectors e', with the same
coeffi-cients as in (5).
This correspondence is one-to-one. Indeed, every vector x e R has a unique representation of the form (5). This means that the
E, are uniquely determined by the vector x. But then x' is likewise
uniquely determined by x. By the same token every x' e R'
It should now be obvious that if x 4* x' and y e> y', then
x + y 4> x' + y' and 2x 4> Ax'. This completes the proof of the
isomorphism of the spaces R and R'.
In § 3 we shall have another opportunity to explore the concept
of isomorphism.
5. Subspaces of a vector space
DEFINITION 7. A subset R', of a vector space R is called a subspace
of R if it forms a vector space under the operations of addition and
scalar multiplication introduced in R.
In other words, a set R' of vectors x, y, in R is called a
subspace of R if x e R', y e R' implies x
y e R', x e R'.EXAMPLES. 1. The zero or null element of R forms a subspace of R.
The whole space R forms a subspace of R.
The null space and the whole space are usually referred to as improper subspaces. We now give a few examples of non-trivial
subspaces.
Let R be the ordinary three-dimensional space. Consider
any plane in R going through the origin. The totality R' of vectors in that plane form a subspace of R.
In the vector space of n-tuples of numbers all vectors
x = (E1, E2, , En) for which ei = 0 form a subspace. More generally, all vectors x (E1, E2,
, E) such that
a,E1 a2E2 + + anen = 0,
where al, a2,
, an are arbitrary but
fixed numbers, form a subspace.The totality of polynomials of degree n form a subspace of the vector space of all continuous functions.
It is clear that every subspace R' of a vector space R must
con-tain the zero element of R.
Since a subspace of a vector space is a vector space in its own
right we can speak of a basis of a subspace as well as of its dimen-sionality. It is clear that the dimension of an arbitrary subspaceof a vector space does not exceed the dimension of thatvector space.
EXERCISE. Show that if the dimension of a subspace R' of a vector space
A general method for constructing subspaces of a vector space R
is implied by the observation that if e, f, g,
are a (finiteOY infinite) set of vectors belonging to R, then the set R' of all
(finite) linear combinations of the vectors e, f, g, forms a subspace R' of R. The subspace R' is referred to as the subspace
generated by the vectors e, f, g, . This subspace is the smallest subspace of R containing the vectors e, f, g,
The subspace R' generated by the linearly independent vectors e1, e2, e, is k-dimensional and the vectors e1, e2, , eh form a basis of R' Indeed, R' contains k linearly independent vectors
(i.e., the vectors e1, e2, eh).
On the other hand, let x1,
x2, , x, be 1 vectors in R' and let 1 > k. If
x1= $11e1 + $ 12; + + elkek X2 enel. 4- $ 22; +
+ E2e,
xi E12e2 + ' + etkek,
then the I rows in the matrix
$12, P Elk $21, $22, , $2k
Etl., $22, E lk
must be linearly dependent. But this implies (cf. Example 2, page 5) the linear dependence of the vec ors x, x2, , xi. Thus
the maximal number of linearly independent vectors in R', i.e., the dimension of R', is hand the vectors e, e2, , e7, form a basis
in R'.
EXERCISE. Show that every n-dimensional vector space contains
subspaces of dimension /, / 1, 2, , n.
If we ignore null spaces, then the simplest vector spaces are one-dimensional vector spaces. A basis of such a space is a single
vector el O. Thus a one-dimensional vector space consists of all vectors me1, where a is an arbitrary scalar.
Consider the set of vectors of the form x xo °Lei, where xo
and e, 0 are fixed vectors and ranges over all scalars. It is
natural to call this set of vectors
by analogy with three-dimensional space a line in the vector space R.Similarly, all vectors of the form /el ße2, where el and e2 are fixed linearly independent vectors and a and fi are arbitrary numbers form a two-dimensional vector space. The set of vectors
of the form
X =
xe
ße2,where xo is a fixed vector, is called a (two-dimensional) plane. EXERCISES. 1. Show that in the vector space of n-tuples (ei, ep, '
of real numbers the set of vectors satisfying the relation
6/1E1 a2E, + + ane,
----(a1, a,, , a, arefixed numbers not all of which are zero) form a subspace
of dimension n 1.
Show that if two subspaces R, and 112 of a vector space R have only
the null vector in common then the sum of their dimensions does not exceed
the dimension of R.
Show that the dimension of the subspace generated by the vectors
e, f, g, is equal to the maximal number of linearly independent vectors among the vectors e, f, g,
6. Transformation of coordinates under change of basis. Let
e2, , en and e'1, e'2, , e' be two bases of an n-dimensional vector space. Further, let the connection between them be given
by the equations
e', = ae
a21e2 ++ ae,
(6)
e'2 = tine'
a22 e2 ++ an,en,
= ame,
a211e2 ++ ae.
The determinant of the matrix d in (6) is different from zero
(otherwise the vectors e'1, e'o, e' would be linearly
depend-ent).
Let ei be the coordinates of a vector x in the first basis and
its coordinates in the second basis. Then
x = Ele
$2e2 +$e = E'le',
+
Replacing the e', with the appropriate expressions from (6) we getx e1e1 $2e2 +
+ ee
e'l(ae
ae2 +
ae)
±a2e)
ane2
en
aniVi + a,V, +
+ anE'n.Thus the coordinates of the vector x in the first basis are express-ed through its coordinates in the second basis by means of the matrix st which is the transpose of .21.
To rephrase our result we solve the system (7) for ¿'i, , . Then
buE1 + 1)12E2 + + binen, ei2 = bn + b22$2 +
+ b2Jn,
e'n = b11
b2$2 + bnnenwhere the b are the elements of the inverse of the matrix st.
Thus, the coordinates of a vector are transformed by means of a
matrix ri which is the inverse of the transpose of the matrix at in (6)
which determines the change of basis.
§ 2. Euclidean space
1. Definition of Euclidean space. In the preceding section a
vector space was defined as a collection of elements (vectors) for which there are defined the operations of addition and multipli-cation by scalars.
By means of these operations it is possible to define in a vector space the concepts of line, plane, dimension, parallelism of lines, etc. However, many concepts of so-called Euclidean geometry cannot be forniulated in terms of addition and multiplication by scalars. Instances of such concepts are: length of a vector, angles between vectors, the inner product of vectors. The simplest way of introducing these concepts is the following.
We take as our fundamental concept the concept of an inner product of vectors. We define this concept axiomatically. Using the inner product operation in addition to the operations of addi-Since the e, are linearly independent, the coefficients of the e, on
both sides of the above equation must be the same. Hence
auri + an
+
+
rn
(7) E2 = an VI
tion and multiplication by scalars we shall find it possible to
devel-op all of Euclidean geometry.
DEFINITION 1. If with every pair of vectors x, y in a real vector space R there is associated a real number (x, y) suchthat
(x, y) = (y, x),
(2x, y) = 2(x, y), (A real) (x, + x2, y) =] (x1, y] + (x2, y],
(x, x) 0 and (x, x) = 0 tt and only if x 0, then we say that an inner product is defined in R.
A vector space in which an inner product satisfying conditions 1 through 4 has been defined is referred to as a Euclidean space.
EXAMPLES. 1. Let us consider the (three-dimensional) space R
of vectors studied in elementary solid geometry (cf. Example
/,
§ 1). Let us define the inner product of two vectors in this space as the product of their lengths by the cosine of the angle between
them. We leave it to the reader to verify the fact that the opera-tion just defined satisfies condiopera-tions 1 through 4 above.
Consider the space R of n-tuples of real numbers. Let
x = (et,
23 , En) and Y = (n2, n2, , th,) be in R. In additionto the definitions of addition
x + Y = (ei
nt, ez + n2,en + n)
and multiplication by scalars
Ax (25,, AE,, , Aen)
with which we are already familiar from Example 2, § 1, we define
the inner product of x and y as
(x, Y) = eint + 5m +
+
it is again easy to check that properties 1 through 4 are satisfied
by (x, y) as defined.
Without changing the definitions of addition and
multipli-cation by scalars in Example 2 above we shall define the inner
product of two vectors in the space of Example 2 in a different and
more general manner.
+ an]. ennl an2 En n2 + ann ennn
We can verify directly the fact that this definition satisfies Axioms 2 and 3 for an inner product regardless of the nature of the
real matrix raj/cll. For Axiom 1 to hold, that is, for (x, y) to be
symmetric relative to x and y, it is necessary and sufficient that
a= a,
i.e., that IctO be symmetric.
Axiom 4 requires that the expression (x, x) ctikeie,
i, k=1
be non-negative fore very choice of the n numbers el, e2, , en
and that it vanish only if E, = E, =
= E,, = O.The homogeneous polynomial or, as it is frequently called,
quadratic form in (3) is said to be positive definite if it takes on non-negative values only and if it vanishes only when all the Ei are zero. Thus for Axiom 4 to hold the quadratic form (3) must be positive definite.
In summary, for (1) to define an inner product the matrix 11(211
must be symmetric and the quadratic form associated with Ila11 must be positive definite.
If we take as the matrix fIctO the unit matrix, i.e., if we put
a = 1 and a, = O (i
k), then the inner product (x, y) deEnedby (1) takes the form
(x, y) = I eini
and the result is the Euclidean space of Example 2.
EXERCISE. Show that the matrix
(0 1)
1 0
cannot be used to define an inner product (the corresponding quadratic
form is not positive definite), and that the matrix (1
1\
1 21
can be used to define an inner product satisfying the axioms I through 4. (x, Y) = a11C1n1 + a12C1n2 + + am el /7
In the sequel (§ 6) we shall give simple criteria for a quadratic
form to be positive definite.
Let the elements of a vector space be all the continuous functions on an interval [a, b]. We define the inner product of two such functions as the integral of their product
(f, g) =fa f(t)g(t) dt.
It is easy to check that the Axioms 1 through 4 are satisfied.
Let R be the space of polynomials of degree n 1.
We define the inner product of two polynomials as in Example 4
(P, Q) = P (t)Q(t) dt.
L
2. Length of a vector. Angle between two vectors. We shall now make use of the concept of an inner product to define the length of a vector and the angle between two vectors.
DEFINITION 2. By the length of a vector x in Euclidean space we mean the number
(4) (x, x).
We shall denote the length of a vector x by the symbol N.
It is quite natural to require that the definitions of length of a vector, of the angle between two vectors and of the inner product
of two vectors imply the usual relation which connects these quantities. In other words, it is natural to require that the inner
product of two vectors be equal to the product of the lengths of
these vectors times the cosine of the angle between them. This
dictates the following definition of the concept of angle between two vectors.
DEFINITION 3. By the angle between two vectors x and y we mean
the number arc cos (x, y) lx1 1311 i.e., we put (x, y) cos 9) 13C1 1371 (5)
The vectors x and y are said to be orthogonal if (x, y) = O. The angle between two non-zero orthogonal vectors is clearly .7i/ 2. The concepts just introduced permit us to extend a number of
theorems of elementary geometry to Euclidean spaces.
The following is an example of such extension. If x and y are
orthogonal vectors, then it is natural to regard x + y as the
diagonal of a rectangle with sides x and y. We shall show that
Y12 = 1x12 1Y121
i.e., that the square of the length of the diagonal of a rectangle is equal
to the sum of the squares of the lengths of its two non-parallel sides
(the theorem of Pythagoras).
Proof: By definition of length of a vector
lx + 3/12 (x Y, x Y).
In view of the distributivity property of inner products (Axiom 3),
(X + y, x + y) = (x, x)
(x, Y) 4- (Y, x) (Y, y).Since x and y are supposed orthogonal, (x, y) = (y, x) = O. Thus
1x + y12 = (x, x) (Y, Y) 13112,
which is what we set out to prove.
This theorem can be easily generalized to read: if x, y, z,
-are pairwise orthogonal, then
Ex + y + z +
12 = ixr2 )7I2 1z12 +'
3. The Schwarz inequality. In para. 2. we defined the angle w
between two vectors x and y by means of the relation (x, y)
cos 99
1x1 1Y1
If is to be always computable from this relation we must show
that
I We could have axiomatized the notions of length of a vector and angle between two vectors rather than the notion of inner product. However, this course would have resulted in a more complicated system of axioms than that associated with the notion of an inner product.
y)1/1x1 IYI 1.
(X, Y)
1 <
< 1IXI IYI
or, equivalently, that
(x, y)2
< I
Ix12 13112 which, in turn, is the same as
(6) (x, y)2 (x, x)(y, y).
Inequality (6) is known as the Schwarz inequality.
Thus, before we can correctly define the angle between two vectors by
means of the relation (5) we must prove the Schwarz inequality. 2 To prove the Schwarz inequality we consider the vector x ty where t is any real number. In view of Axiom 4 for inner products,
(x ty, x ty) 0;
i.e., for any t,
12(y , y) 2t(x, y) + (x, x) O.
This inequality implies that the polynomial cannot have two
dis-tinct real roots. Consequently, the discriminant of the equation
t2(y, y)
2t(x, y) + (x, x) =
cannot be positive; i.e.,(x, y)2 (x, x)(y, y) _CO, which is what we wished to prove.
EXERCISE.
Prove that a
necessary and sufficient condition for(x, y), (x, x)(y, y) is the linear dependence of the vectors x and y. EXAMPLES. We have proved the validity of (6) for an axiomatically defined Euclidean space. It is now appropriate to interpret this inequality
in the various concrete Euclidean spaces in para. 1.
1. In the case of Example 1, inequality (6) tells us nothing new. (cf. the remark preceding the proof of the Schwarz inequality.)
2 Note, however that in para. 1, Example /, of this section there is no need to prove this inequality. Namely, in vector analysis the inner product of two vectors is defined in such a way that the quantity (x, y)/1x1 1y1 is the cosine of a previously determined angle between the vectors. Conse-quently, 1(x,
2 In Example 2 the inner product was defined as
(x, y) = E t=1 It follows that
(X, X) E Et2, (y. y) = E i)(8,
2=1 2--1
and inequality (6) becomes
Y n )( n
ei M 5- ( Ei2 tif2).
i=1 i=1 i=1
In Example 3 the inner product was defined as
(1) Y) aikeink, i,n=1 where and (3) E aikeiek >. k=1
for any choice of the ¿i. Hence (6) implies that
il the numbers an satisfy conditions (2) and (3), then the following inequality
holds:
2 ( n
n
( E ao,$((o) E aikeik)( E 6111115112)
k=1 k=1 i, 2=1
EXERCISE. Show that if the numbers an satisfy conditions (2) and (3), anakk. (Hint: Assign suitable values to the numbers Ei, En, , En and /72, , 71 in the inequality just derived.)
In Example 4 the inner product was defined by means of the integral
fb
1(1)g (t) dt. Hence (6) takes the form
0. af(t)g(t) dt))2 fba [f(t)J' dt [g(t)p dt.
This inequality plays an important role in many problems of analysis. We now give an example of an inequality which is a consequence
of the Schwarz inequality.
If x and y are two vectors in a Euclidean space R then
1Yr-Proof:
lx y12 = (x + y, x -1- y) = (x, x) 2(x, y) + (y, y). Since 2(x, y) 213E1 it follows that
1x±y12 = (x+y, x+y) (x, x)+21x1 lyi+ (y, Y) = fix1+IYI)2, i.e., 1x + yl x1 SI, which is the desired conclusion.
EXERCISE. Interpret inequality (7) in each of the concrete Euclidean spaces considered in the beginning of thissection.
In geometry the distance between two points x and y (note the use of the same symbol to denote avectordrawn from the
origin-and a point, the tip of that vector) is defined as the length of the vector x y. In the general case of an n-dimensional Euclidean space we define the distance between x and y by the relation
d lx yl.
§ 3. Orthogonal basis. Isomorphism of Euclidean spaces
I. Orthogonal basis. In § 1 we introduced the notion of a basis (coordinate system) of a vector space. In a vector space there is
no reason to prefer one basis to another. 3 Not so in Euclidean spaces. Here there is every reason to prefer so-called orthogonal
bases to all other bases. Orthogonal bases play the same role in Euclidean spaces which rectangular coord nate systems play in analytic geometry.
DEFINITION 1. The non-zero vectors el, e2,
e of an
n-dimensional Euclidean vector space are said to form an orthogonal
basis if they are pairwise orthogonal, and an orthonormalbasis i f , in addition, each has unit length. Briefly, the vectors e,,e2,
,e
form an orthonormal basis
3 Careful reading of the proof of the isomorphism of vector spaces given in § 1 will show that in addition to proving the theorem we alsoshowed that it is possible to construct an isomorphism of two n-dimensional vector spaces which takes a specified basis in one of these spaces into a specified basis in
the other space. In particular, if e,, e,, e and e'5, e',, are two bases in R, then there exists an isomorphic mappingof R onto itself which
f 1 if
i = k
(ei, ek) =
10 if i k.
For this definition to be correct we must prove that the vectors
ei, e,,
en of the definition actually form a basis, i.e., arelinearly independent.
Thus, let
A2e2 +
+ Ae = O.
We wish to show that (2) implies Ai = 2.2
= A = O. To
this end we multiply both sides of (2) by el (i.e., form the innerproduct of each side of (2) with ei). The result is 21(e1, el) + A2(e1, e2) + + 2(e1, en)
=
O.Now, the definition of an orthogonal basis implies that
ei) 0 0, (e1, ek) = 0
for k é L
Hence A = O.
Likewise, multiplying (2) by e, we find that
A2 = 0, etc. This proves that el, e2, , e are linearly independ-ent.
We shall make use of the so-called orthogonalization procedure to prove the existence of orthogonal bases. This procedure leads
from any basis f, 2, , f to an orthogonal basis el, e2,
, e.
THEOREM 1. Every n-dimensional Euclidean space contains
orthogonal bases.
Proof: By definition of an n-dimensional vector space (§ 1,
para. 2) such a space contains a basis f1, f2,
, f. We put
= f1. Next we put e, = f,
e1, where a is chosen so that
(e2, el) = 0; i.e., (f, e1, ei) = O. This means that (f,, e1)/(e1, e1).
Suppose that we have already constructed non-zero pairwise orthogonal vectors el, e2,
",
To construct ek we pute, =A1e11+ ' + Ak-iet
(ek, e1) = (fk 21e2-1 ' '
+
= 0,(ek, e2) = (fk Aiek.-t. -I-
+ A2e, e2)
= 0,
(fk, e2-1) 21(ek_1, e) = O.
It follows that
2-1 = (f/c, e1)/(e1, el),
=
(fk, e2)/(e2, e2), '(f2,
So far we have not made use of the linear independence of the
vectors f1, fa, , but we shall make use of this fact presently
to prove that e, O. The vector ek is a linear combination of the vectors ek, e2,
, e, f2. But e, can be written as a
linear combination of the vector f_, and the vectors e, e2,,
Similar statements hold for e22, e2_2, , ek. It follows thatek = alfk ci2f2 +
+
2-1 fk.In view of the linear independence of the vectors f1, f2, fk we may conclude on the basis of eq. (5) that ek O.
Just as ek, e2, , ek_k and fk were used to construct e, so
ek, e2,
, e, and fk+, can be used to construct e,, etc.
By continuing the process described above we obtain n non-zero,
pairwise orthogonal vectors ek, e2,
, en, i.e., an orthogonal
basis. This proves our theorem.
It is clear that the vectors
e'k = ek/lekl (k = 1, 2, , n)
form an orthonormal basis.
EXAMPLES OF ORTHOGONALIZATION. 1. Let R be the three-dimensional
space with which we are familiar from elementary geometry. Let fi,f,, f, be three linearly independent vectors in R. Put e, = f,. Next select a ..;ctor e, perpendicular to e, and lying in the plane determined by e, = f,
(e2, e,) = (f2
A1e2-1 + '+
e2-1) O. Since the vectors el, e2, , e, are pairwise orthogonal, the latter equalities become:(fk, el) + 22-1(e1, et) = 0, (fk, 02) 22-2(e2, e2) 0,
and f2. Finally, choose e, perpendicular to ei ande, (i.e., perpendicular to the previously constructed plane).
Let R be the three-dimensional vector space of polynomials of degree not exceeding two. We define the inner product of two vectors in this space by the integral
fi P(t)Q (t) dt.
The vectors 1, t, 12 form a basis in R. We shall now orthogonalize this basis. We put e, = 1. Next we put e, = t I. Since
O (t+ I, I) = f (t dt = 2a,
it follows that a = 0, i.e., e, = t. Finally we put e, = 12 + 131 y 1.
The orthogonality requirements imply ß 0 and y = 1/3, i.e., e, = t2
1/3. Thus 1, t, P 1/3 is an orthogonal basis in R. By dividing each basis
vector by its length we obtain an orthonormal basis for R.
Let R be the space of polynomials of degree not exceeding n 1. We define the inner product of two vectors in this space as in the preceding example.
We select as basis the vectors 1, t, -,t^-2. As in Example 2 the process of orthogonalization leads to the sequence of polynomials
1, t, 12 1/3, (3/5)1,
.
Apart from multiplicative constants these polynomials coincide with the
Legendre polynomials
1 dk (12 1)k
2k k! dtk
The Legendre polynomials form an orthogonal, but not orthonormal basis in R. Multiplying each Legendre polynomial by a suitable constant we obtain an orthonormal basis in R. We shall denote the kth element of this basis by Pk(t).
Let e1, e2, , en be an orthonormal basis of a Euclidean space
R. If
x =
52e, + + enen,y = ?he,
n2e2 ++77e0,
then
(x,Y)=
$2e2+ ee,
n2e2 ++ ne).
Since
ek) = {01 1ff
EXAMPLES. 1. Let Po(t), P,(1), P o(t) be the normed Legendre polynomials of degree 0, 1, , n. Further, let Q (t) be an arbitrary
polyno-it follows that
(x, y) El% + $27)2 + + Enfl.
Thus, the inner product of two vectors relative to an orthonormal basis
is equal to the sum of the products of the corresponding coordinates of
these vectors (cf. Example 2, § 2).
EXERCISES 1. Show that if f., f., f is an arbit ary basis, then
(x, y) = E nikEink, Cle=1
where aik = aki and ei e2, " en and ni, n2, ?I are the coordinates of x and y respectively.
2. Show that if in some basis f1, f'2,
, f
(x, Y) = 071 + + + Enn.
for every x =
" '
Ent. and Y = nifi + + n,,f, then thisbasis is orthonormal.
We shall now find the coordinates of a vector x relative to an orthonormal basis el, e2,
, e.
Let
x = eie,
E2e2ene.
Multiplying both sides of this equation by el we get
(x, e1) = el) + $2(e2 el) +
+ e(en
,el) =
and, similarly,
(7) e, = (x, e2),
,
= (x, e).
Thus the kth coordinate of a vector relative to an orthonormal basis is
the inner product of this vector and the kth basis vector.
It is natural to call the inner product of a vector x and a vector e of length 1 the projection of x on e. The result just proved may be
states as follows: The coordinates of a vector relative to an orthonor-mal basis are the projections of this vector on the basis vectors. This is the exact analog of a statement with which we are familiar from
analytic geometry, except that there we speak of projections on the coordinate axes rather than on the basis vectors.
rnial of degree n. We shall represent Q (t) as a linear combination of the Legendre polynomials. To this end we note that all polynomials of degree
n form an n-dimensional vector space with orthonormal basis Po(t), , P(t). Hence every polynomial Q(t) of degree n can be
rep-resented in the forra
Q (t) = P(t) + c1P1(1) +
+ cP(t).
It follows from (7) that
c, I Q(t)P,(1) dt.
2 Consider the system of functions
(8) 1, cos t, sin t, cos 2t, sin 2t, cos nt, sin nt, on the interval (0, 2n). A linear combination
P(t) = (a012) + a, cos t + b, sin t + al cos 2t + + 6,, sin nt
of these functions is called a trigonometric polynomial of degree n. The totality of trigonometric polynomials of degree n form a (2n + 1) -dimen-sional space R,. We define an inner product in R1 by the usual integral
2n
(P, Q) = fo P(t)Q(t) dt.
It is easy to see that the system (8) is an orthogonal basis Indeed r 2r
cos kt cos It dt = 0 if k 1,
Jo
rn
sin kt cos It dt = 0,f.o 2r sin kt sin lt dt = 0, if k I. Since 27, .f: sin% kt dt = cos% kt dt n o and .{0 227 ldt = 2n, it follows that the functions
(8') 1/1/2a, (l/ n) cos t, (l/ n) sin t, (l/Vn) cos nt, (l/Vn) sin nt
are an orthonormal basis for R1.
2. Perpendicular from a point to a subspace. The shortest distance from a point to a subspace. (This paragraph may be left out in a
first reading.)
DEFINITION 2. Let R, be a subspace of a Euclidean space R. We shall say that a vector h e R is orthogonal to the subspace R, if it is orthogonal to every vector x e RI.
If h is orthogonal to the vectors e, e,,
e then it is also
orthogonal to any linear combination of these vectors. Indeed,(h, et) = 0 (1= 1, 2, , m)
implies that for any numbers 2,, 22,
,
(h, 21e1 + 22; + + 2.,e,) = O.
Hence, for a vector h to be orthogonal to an m-dimensional sub-space of R it is sufficient that it be orthogonal to ni linearly
inde-pendent vectors in It1, i.e., to a basis of 12.1.
Let R, be an m-dimensional subspace of a (finite or infinite dimensional) Euclidean space R and let f be a vector not belonging
to lt1. We pose the problem of dropping a perpendicular from the
point f to 121, i.e., of finding a vector f, in R1 such that the vector
h f f is orthogonal to R1. The vector f, is called the orthogonal
projection of f on the subspace R1. We shall see in the sequel that this problem has always a unique solution. Right now we shall show that, just as in Euclidean geometry, 1111 is the shortest dis-tance from f to RI. In other words, we shall show that if f, e R1
and f1 f, then.
If -
> If - 41.
Indeed, as a difference of two vectors in RI, the vector fo f1
belongs to R, and is therefore orthogonal to h = f
f. By the
theorem of Pythagoras
If
412 + 14 f1I2 = If f0 + 4 f1I2 = If
-so that
If -
> If
We shall now show how one can actually compute the orthogo-nal projection f, of f on the subspace R1 (i.e., how to drop a
perpendicular from f on 141). Let el, e,, e, be a basis of R1.
As a vector in 121, f, must be of the form
f, =
c2e2 +c,e,.
To find the c, we note that f f, must be orthogonal to III, i.e.,
(f fo, ej = O (k = 1, 2, , in), or,
Replacing f, by the expression in (9) we obtain a system of m equations for the c,
c,(ei, e1) c2(e2, e1)
+
+ c,(e,,,, e1)= (f, e1) (k = 1, 2, m). We first consider the frequent case when the vectors e1, e2,
,
e,,, are orthonormal. In this case the problem can be solved with ease. Indeed, in such a basis the system (11) goes over into the
system
ci = (f, e,).
Since it is always possible to select an orthonormal basis in an m-dimensional subspace, we have proved that for every vector f
there exists a unique orthogonal projection f, on the subspace
We shall now show that for an arbitrary basis el, e2, , en, the system (11) must also have a unique solution. Indeed, in view of
the established existence and uniqueness of the vector f0, this
vector has uniquely determined coordinates el, c2, , cm with
respect to the basis el, e2, , e,. Since the c, satisfy the system (11), this system has a unique solution.
Thus, the coordinates c, of the orthogonal projection fo of the vector f
on the subspace 111 are determined from the system (12) or from the system (11) according as the c, are the coordinates of to relative to an orthonormal basis of R1 or a non-orthonormal basis of
A system of m linear equations in in unknowns can have a unique solution only if its determinant is different from zero.
It follows that the determinant of the system (11) (el, (e2, e1) ' (era, ei)
(ex, ;) (e2, e2)
(e
(ex, e,) (e2, e,)
(e e,)
must be different from zero. This determinant is known as the
Gramm determinant of the vectors e1, e2, e,.
EXAMPLES. 1. The method of least squares. Let y be a linear function of x1, x2, , x,; i.e., let
where the c, are fixed unknown coefficients. Frequently the c, are determined experimentally. To this end one carries out a number
of measurements of ael, x2,
, x, and y. Let x, x2,,
X, y,
denote the results of the kth measurement. One could try to determine the coefficients c1, c2, , cm from the system ofequa-tions
X11C1 X21C2 + + XnaCm = y1, Xl2C1 + X22 C2 + + Xm2 = Y2,
xincl x2c2 +
+ xmc, = y .
However usually the number n of measurements exceeds the
number m of unknowns and the results of the measurements are never free from error. Thus, the system (13) isusually incompati-ble and can be solved only approximately. There arises the proincompati-blem of determining t1, c2, , cm so that the left sides of the equations
in (13) are as "close" as possible to the corresponding right sides. As
a measure
of "closeness" we take the so-called meandeviation of the left sides of the equations from the corresponding free terms, i.e., the quantity
E (X1kC1 X2nC2 + + XinkCk Ykr
k=1
The problem of minimizing the mean deviation can be solved directly. However, its solution can be immediately obtained from
the results just presented.
Indeed, let us consider the n-dimensional Euclidean space of
n-tuples and the
following vectors:e, = (x11, x,
, x1), e2 = (X21 X22, 4 4, X2n),,
= (x,, x,2,
, x), and
f =_- (y1, y2, y) in that space. The right sides of (13) are the components of the vector f and the left sides, of the vector
ciel c2e2 +
+
Consequently, (14) represents the square of the distance from f to clel c2e2 + + c,e,n and the problem of minimizing the
mean deviation is equivalent to the problem
of choosing ni numbers e1, e2, , Cm so as to minimize the distance from f tothe vectors el, e2, , e, (supposed linearly independent), then
our problem is the problem of finding the projection of f on RI. As we have seen (cf. formula (11)), the numbers c1, c2, , which solve this problem are found from the system of equations
(e1, ei)c, (e2, e1)c2 +
+ (e, el)a, = (f, e1),
(15) (e1, e2)c1 (e2, e2)c2 + ' (e e2)c,,, = (f, e2),I
x,c2 k(e1, en)ci (e2, em)c, +
+ (e, en)c, = (f, e,),
where(f, ex) = xxiYs, ek) = xoxk,,
I =1 1-1
The system of equations (15) is referred to as the system of normal equations. The method of approximate solution of the system (13) which we have just described is known as the method of least squares.
EXERCISE. Use the method of least squares to solve the system of equations
2c 3 3c 4
4c = 5.
Solution: el (2, 3, 4), f = (3, 4, 5). In this case the normal system
consists of the single equation
(e1, enc = (ee f), 29c = 38; c = 38/29.
When the system (13) consists of n equat ons in one unknown
xic
(13') x2c = y2,
xc
the (least squares) solution is
(x, y) k=1
xox
c
In this case the geometric significance of c is that of the slope of a line through the origin which is "as close as possible" to the points (x1, y1), (x2, y2),
, (x, y).
2. Approximation of functions by means of trigonometric polynomials. Let (t) be a continuous function on the interval [0, 2n]. It is frequently
necessary to find a trigonometric polynomial P(t) of given degree v,Mich differs from f(t) by as little as possible. We shall measure the proximity of
1(t) and P (t) by means of the integral u(t) - 13(1)i2 dl.
Thus, we are to find among all trigonometric polynom als of degree n,
P(t) = (a012) + a, cos t b, sin t + + an cos nt b sin nt,
that polynomial for which the mean deviation from f (t) is a minimum. Let us consider the space R of continuous functions on the interval
10, 2:r] in which the inner product is defined, as usual, by means of the
integral
(I, g) = 21,f(t)g(t) dt. Then the length of a vector f(t) in R is given by
=
6atr
EN)? dt.Consequently, the mean deviation (16) is simply the square of the distance from j(t) to P(t). The trigonometric polynomials (17) form a subspace R, of R of dimension 2n + 1. Our problem is to find that vector of R, which is
closest to fit), and this problem is solved by dropping a perpendicular from f(t) to R1.
Since the functions
1 cos t sin t eo e, e, , V2,7 ' Nhz 1 Or cos nt sin nt e,_, ;
e,
Ahrform an orthonormal basis in R, (cf. para. 1, Example 2), the required element P(t) of R, is 2n P(t) = E ce1, k=0 where ck = (t, el), or
I 27 1 J.2" e,,....1c,
= --
f(t)dt; tik., f(t) cos kt dt; Vart o Vn o 1 27 C2k - , 1(t) sin kt dt. ' 7(/0ThuS, for the mean deviation of the trigonometric polynomial
a, n
P(t) := * Ea,. cos kt + b sin kt
2 k=1
from f(t) to be a minimum the coefficients a and bk must have the values
127
1 2=7
a,5
fit) dt; a =-5
1(1) cos kt dt; X o x o127
b =
5
f(t) sin kt dt."Jo
The numbers a,. and bk defined above are called the Fourier coefficients of
the function fit).
3. Isomorphism of Euclidean spaces. We have investigated a
number of examples of n-dimensional Euclidean spaces. In each of them the word "vector" had a different meaning. Thus in § 2,
Example 2, "vector" stood for an n-tuple of real numbers, in § 2, Example 5, it stood for a polynomial, etc.
The question arises which of these spaces are fundamentally different and which of them differ only in externals. To be more specific:
DEFINITION 2. Two Euclidean spaces R and R', are said to be isomorphic if it is possible to establish a one-to-one correspondence
x 4> x' (x e R, x' e R') such that
I. If X 4> X' and y 4> y', then x + y X' + y', i.e., if OW
correspondence associates with X E R the vector X' E R' and with
y e R the vector y' e R', then it associates with the sum x + y the
sum x' y'.
If x x', then Axe> Ax'.
If x 4> x' and y 4--> y', then (x, y) = (x', y'); i.e., the inner
products of corresponding pairs of vectors are to have the same value.
We observe that if in some n-dimensional Euclidean space R a
theorem stated in terms of addition, scalar multiplication and inner multiplication of vectors has been proved, then the same
theorem is valid in every Euclidean space R', isomorphic to the space R. Indeed, if we replaced vectors from R appearing in the statement and in the proof of the theorem by corresponding
vec-tors from R', then, in view of the properties 1, 2, 3 of the definition
of isomorphism, all arguments would remain unaffected. The following theorem settles the problem of isomorphism of
different Euclidean vector spaces.
THEOREM 2. All Euclidean spaces of dimension n are isomorphic.
We shall show that all n-dimensional Euclidean spaces are
isomorphic to a selected "standard" Euclidean space of dimension
n. This will prove our theorem.
As our standard n-dimensional space R' we shall take the space of Example 2, § 2, in which a vector is an n-tuple of real numbers
and in which the inner product of two vectors x' = (E1, ¿2,
, e)
and y' = (7? n
a
, nn) is defined to be (x', = Eft?' + $2n2 ++
Now let R be any n-dimensional Euclidean space. Let el,
e2, , en be an orthonormal basis in R (we showed earlier that every Euclidean space contains such a basis). We associate with
the vector
x =
e2e2 ++ ene
in R the vector
= (81, 82, , 8n)
in R'.
We now show that this correspondence is an isomorphism. The one-to-one nature of this correspondence is obvious.
Conditions 1 and 2 are also immediately seen to hold. It remains
to prove that our correspondence satisfies condition 3 of the defini-tion of isomorphism, i.e., that the inner products of corresponding pairs of vectors have the same value. Clearly,
(x, Y) = $1ni + $2n2 + + Ennn,
because of the assumed orthonormality of the e,. On the other
hand, the definition of inner multiplication in R' states that
Thus
(x', y') = (x, Y):
i.e., the inner products of corresponding pairs of vectors have
indeed the same value.
This completes the proof of our theorem.
EXERCISE. Prove this theorem by a method analogous to that used in
para. 4, § 1.
The following is an interesting consequence of the isomorphism
theorem. Any "geometric" assertion (i.e., an assertion stated in
terms of addition, inner multiplication and multiplication of vectors by scalars) pertaining to two or three vectors is true if it is true in elementary geometry of three space. Indeed, the vectors in
question span a subspace of dimension at most three. This sub-space is isomorphic to ordinary three sub-space (or a subsub-space of it), and it therefore suffices to verify the assertion in the latter space.
In particular the Schwarz inequality a geometric theorem about a pair of vectors is true in any vector space because it is
true in elementary geometry. We thus have a new proof of the Schwarz inequality. Again, inequality (7) of § 2
Ix + yl ixl
is stated and proved in every textbook of elementary geometry as the proposition that the length of the diagonal of a parallelogram
does not exceed the sum of the lengths of its two non-parallel sides,
and is therefore valid in every Euclidean space. To illustrate, the inequality,
(f(t) g(t))2 dt VSa [
bb
f (t)]2 dt VSa [g(t)]2 dt,which expresses inequality (7), § 2, in the space of continuous func-tions on [a, b], is a direct consequence, via the isomorphism
theo-rem, of the proposition of elementary geometry just mentioned.
§ 4. Bilinear and quadratic forms
In this section we shall investigate the simplest real valued
functions defined on vector spaces.
1. Linear functions. Linear functions are the simplest functions defined on vector spaces.
DEFINITION I. A linear function (linear form) f is said to be
defined on a vector space if with every vector x there is associated a number f(x) so that the following conditions hold:
_fix Y) f(x) +AY),
!(Aa) = 1f (x).
Let et, e2, , en be a basis in an n-dimensional vector space.
Since every vector x can be represented in the form
x =
enen,the properties of a linear function imply that
f(x) =
f&ie, + E2e2 ++ ene)
eifie,)¿j(e2) +
+ enf(e.)-Thus, if et, e2, , en is a basis of an n-dimensional vector space R, x a vector whose coordinates in the given basis are E1, $2,, E,
and f a linear function defined on R, then(1)
f(x) = aiel
a252+ -in amen,where f(e) =
= 1, 2, , n).The definition of a linear function given above coincides with
the definition of a linear function familiar from algebra. What
must be remembered, however, is the dependence of the a, on the choice of a basis. The exact nature of this dependence is easily
explained.
Thus let et, e2,
, e and e'1, e'2,
, e'n be two bases in R.Let the e', be expressed in terms of the basis vectors et, e2,
, e
by means of the equationse', = acne, + ac21e2 + + ocnIenr
e'2 acne, + 122e2 + + (Xn2en,
e'
2,2,e2 ++ 2,e.
Further, let
relative to the basis el, e2, , en, and
f(x) = a'le', + a'2E12 +
+ a'e'
relative to the basis e'1, e'2,
e'.
Since a, = f(e,) and a' k = f(e'), it follows that
ai = f(xikei
cc2ke2 + + C(nk en) = Xlki(ei) ac22f(e2)+
+ akf(e.) = ctik
+ c(2k az ++ anka.
This shows that the coefficients of a linear form transform under a
change of basis like the basis vectors (or, as it is sometimes said, cogrediently).
2. Bilinear forms. In what follows an important role is played by
bilinear and quadratic forms (functions).
DEFINITION 2. A (x; y) is said to be a bilinear function (bilinear form) of the vectors x and y if
for any fixed y, A (x; y) is a linear function of x,
for any fixed x, A (x; y) is a linear function of y.
In other words, noting the definition of a linear function, conditions 1 and 2 above state that
A (xi + x2; y) = A (x,; y) + A (x2; y), A (Ax; y) = (x; y),
A (x; + y2) = A (x; yi) + A (x; yz), A (x; pty) = yA(x; y).
EXAMPLES. 1. Consider the n-dimensional space of n-tuples of
real numbers. Let x = ($1, bt2,
, E), y =
n2, , nk), anddefine
A (x; y) = a12e072 ' '
+
(2) + an 27/1 anE27/2 ' ' a2ne2nn
anlennl an2enn2 + + annennn
A (x; y) is a bilinear function. Indeed, if we keep y fixed, i.e., if we regard ni , n2, n as constants, ae ink depends linearly
i, k1
on the $,; A (x; y) is a linear function of x =- (E1, ey, en). Again, if $1, $2,
E are kept constant, A (x; y) is a linear
2. Let K(s, t) be a (fixed) continuous function of two variables s, t. Let R be the space of continuous functions f(t). If we put
b b
A (f; g)
=
Is
K(s, t)f(s)g(t) ds dt,
then A (f; g) is a bilinear function of the vectors f and g. Indeed,
the first part of condition 1 of the definition of a bilinear form means, in this case, that the integral of a sum is the sum of the
integrals and the second part of condition 1 that the constant A.
may be removed from under the integral sign. Conditions 2 have analogous meaning.
If K(s, t) 1, then
b
A (f; g) =
jib
f(s)g(t) ds dt = f(s) ds g(t) dt,
i.e., A( f; g) is the product of the linearfunctions!: f(s) ds and
b
Ja g(t) dt.
EXERCISE. Show that if 1(x) and g(y) are linear functions, then their product 1(x) g(y) is a bilinear function.
DEFINITION 3. A bilinear function (bill ear form) is called
symmetric 21
A (x; y) = A (y x)
for arbitrary vectors x and y.
In Example / above the bilinear form A (x; y) defined by (2) is
symmetric if and only if aik= aid for all i and k.
The inner product (x, y) in a Euclidean space is an example of a symmetric bilinear form.
Indeed, Axioms I, 2, 3 in the definition of an inner product (§ 2) say that the inner product is a symmetric, bilinear form. 3. The matrix of a bilinear form. We defined a bilinear form
axiomatically. Now let el, e2, en be a basis in n-dimensional space. We shall express the bilinear form A (x; y) using the coordinates ei, e2, e of x and the coordinates ni, n,,
, ri of
y relative to the basis e1, e2,
e. Thus,
A (x; y) = A (ei ei $2e2 + + e en; /he, )72e2 +
+ 71e).
In view of the properties 1 and 2 of bilinear formsA (x; y) = I A (ei; k=1
or, if we denote the constants A (ei; e1) by a,,
A (x; y) = ai,e
To sum up: Every bilinear form in n-dimensional space can be written as
A (x: Y) = azkink, i, 1=1
where X -r- $1e1 +
4- e, y = ?he,. +
+ tine, and
a, = A (ei; ek).The matrix a/ = is called the matrix of the bilinear form
A (x; y) relative to the basis el, e2, , en.
Thus given a basis e1, e2, , e the form A (x; y) is determined
by its matrix at= Ha11.
EXAMPLE. Let R be the three-dimensional vector space of triples (El. EP 59) of real numbers. We define a bilinear form in R by means of the equation
A (x; y) = El% + 2eon + 3eana.
Let us choose as a basis of R the vectors
e, = (1, 1, 1); e, = (1, 1, 1);
e, = (1, 1, 1),
6
0 4
= 0 6
d.
4 2
6It follows that if the coordinates of x and y relative to the basise,, e2, e, are denoted by 5',, 5',, and n'2, TA, respectively, then
A (x; y) = 65'oy, 4E', n'a 6E', + + 2 3.17' +
and compute the matrix .91 of the bilinear form A (x; y). Nlaking use of (4) we find that: an 1 1 + 2 1 1 + 3 1 1 = 6,
a = an
1 1 + 2 1 1 + 3 1 (-1) = 0,a = 11 + 2
1 1 + 3 (-1) (-1) = 6,a= a
11 + 2 U (-1) + 3
1(-1) = 4,
a = a= 1
1 1- 2 1(-1) + 3 (-1)(-1) = 2,
a
1 - 1 + 2 (-1) (-1) + 3(-1) (I) = 6,
(5)
4. Transformation of the matrix of a bilinear form under a change of basis. Let el, e2,
e and f, f2,
f be two bases of ann-dimensional vector space. Let the connection between these
bases be described by the relations
= cue'
c21e2 d-+ ce,
c22e2 +
+
f2 = cue].
In = crne + c2e2 +
+
which state that the coordinates of the vector fn relat ve to the basis el., e2, , e are cm, c22,
- , c. The matrix
1
[clici2
w C21C22 . C2n
cn2c2 cn
is referred to as the matrix of transition from the basis e,, e2,
to the basis f1, f2,
, f.
Let si = I laid I be the matrix of a bilinear form A (x;y) relative to the basis e1, e2, , e and .4 = I ibikl 1, the matrix of that form relative to the basis f1, f2, L. Our problem consists in finding the matrix I IbI I given the matrix
I kJ.
By definition [eq. (4)] b = A (f,,, fe), i.e., bp,is the value of our
bilinear form for x f, y = fe. To find this value we make use of (3) where in place of the ei and ni we put the coordinates of fp
and fe relative to the basis e1, e2,
, e, i.e., the numbers
c, ca,
, en and c, c,
, c,.
It follows that(6) by, = A (fp; fe) =
acic..
k=-1
We shall now express our result in matrix form. To this end we put e1,, = c' The e',,1 are, of course, the elements of the
transpose W' of W. Now b becomes 4
4 As is well known, the element c of a matrix 55' which is the product of two matrices at = 11a0,11
cik E ai,zbk. c/-1
Using this definition twice one can show that i = d.Vt, then
= E abafic Aft.
a.ß=1