Linear Algebra. Week 10

Dr. Marco A Roque Sol

Eigenvalues and eigenvectors of an operator

Definition.

Let $V$ be a vector space and $L : V \to V$ be a linear operator. A number $\lambda$ is called an eigenvalue of the operator $L$ if $L(v) = \lambda v$ for a nonzero vector $v \in V$. The vector $v$ is called an eigenvector of $L$ associated with the eigenvalue $\lambda$. (If $V$ is a function space, then eigenvectors are also called eigenfunctions.)

If $V = \mathbb{R}^n$, then the linear operator $L$ is given by $L(x) = Ax$, where $A$ is an $n \times n$ matrix. In this case, the eigenvalues and eigenvectors of the operator $L$ are precisely the eigenvalues and eigenvectors of the matrix $A$.

Suppose $L : V \to V$ is a linear operator on a finite-dimensional vector space $V$.

Let $\{u_1, u_2, \dots, u_n\}$ be a basis for $V$ and $g : V \to \mathbb{R}^n$ be the corresponding coordinate mapping. Let $A$ be the matrix of $L$ with respect to this basis. Then

$$L(v) = \lambda v \iff A\,g(v) = \lambda\, g(v).$$

Hence the eigenvalues of $L$ coincide with those of the matrix $A$. Moreover, the associated eigenvectors of $A$ are the coordinates of the eigenvectors of $L$.

Definition.

The characteristic polynomial $p(\lambda) = \det(A - \lambda I)$ of the matrix $A$ is called the characteristic polynomial of the operator $L$.

The eigenvalues of $L$ are then the roots of its characteristic polynomial.

Theorem.

The characteristic polynomial of the operator $L$ is well defined. That is, it does not depend on the choice of a basis.

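Proof

Let $B$ be the matrix of $L$ with respect to a different basis $\{v_1, v_2, \dots, v_n\}$. Then $A = UBU^{-1}$, where $U$ is the transition matrix from the basis $\{v_1, v_2, \dots, v_n\}$ to $\{u_1, u_2, \dots, u_n\}$. We have to show that $\det(A - \lambda I) = \det(B - \lambda I)$ for all $\lambda \in \mathbb{R}$. We obtain

$$\det(A - \lambda I) = \det(UBU^{-1} - \lambda I) = \det\big(U[B - \lambda I]U^{-1}\big) = \det(U)\,\det(B - \lambda I)\,\det(U^{-1}) = \det(B - \lambda I).$$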

Eigenspaces

Let $L : V \to V$ be a linear operator. For any $\lambda \in \mathbb{R}$, let $V_\lambda$ denote the set of all solutions of the equation $L(x) = \lambda x$.

Then $V_\lambda$ is a subspace of $V$, since $V_\lambda$ is the kernel of the linear operator given by $x \mapsto L(x) - \lambda x$.

$V_\lambda$ minus the zero vector is the set of all eigenvectors of $L$ associated with the eigenvalue $\lambda$. In particular, $\lambda \in \mathbb{R}$ is an eigenvalue of $L$ if and only if $V_\lambda \neq \{0\}$.

If $V_\lambda \neq \{0\}$, then it is called the eigenspace of $L$ corresponding to the eigenvalue $\lambda$.

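Let $V = C^\infty(\mathbb{R})$ and $D : V \to V$, $Df = f'$.

A function $f \in C^\infty(\mathbb{R})$ is an eigenfunction of the operator $D$ belonging to an eigenvalue $\lambda$ if $f'(x) = \lambda f(x)$ for all $x \in \mathbb{R}$.

It follows that $f(x) = ce^{\lambda x}$, where $c$ is a nonzero constant.

Thus each $\lambda \in \mathbb{R}$ is an eigenvalue of $D$. The corresponding eigenspace is spanned by $e^{\lambda x}$.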

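Let $V = C^\infty(\mathbb{R})$ and $L : V \to V$, $Lf = f''$.

A function $f \in C^\infty(\mathbb{R})$ is an eigenfunction of the operator $L$ belonging to an eigenvalue $\lambda$ if

$$f''(x) = \lambda f(x) \;\Rightarrow\; f''(x) - \lambda f(x) = 0 \quad \text{for all } x \in \mathbb{R}.$$

Note that $Lf = D^2 f$, hence $Df = \mu f \Rightarrow Lf = \mu^2 f$.

If $\lambda > 0$ then $V_\lambda = \mathrm{Span}(e^{\mu x}, e^{-\mu x})$ with $\mu = \sqrt{\lambda}$.

If $\lambda < 0$ then $V_\lambda = \mathrm{Span}(\sin(\mu x), \cos(\mu x))$ with $\mu = \sqrt{-\lambda}$.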

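Let $V$ be a vector space and $L : V \to V$ be a linear operator.

Theorem

If $v \in V$ is an eigenvector of the operator $L$, then the associated eigenvalue is unique.

Proof

Suppose $L(v) = \lambda_1 v$ and $L(v) = \lambda_2 v$. Then

$$\lambda_1 v = \lambda_2 v \;\Rightarrow\; (\lambda_1 - \lambda_2)v = 0 \;\Rightarrow\; \lambda_1 = \lambda_2,$$

since $v \neq 0$.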

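Theorem

Suppose $v_1$ and $v_2$ are eigenvectors of $L$ associated with different eigenvalues $\lambda_1$ and $\lambda_2$. Then $v_1$ and $v_2$ are linearly independent.

Proof

For any scalar $t \neq 0$ the vector $tv_1$ is also an eigenvector of $L$ associated with the eigenvalue $\lambda_1$. Since $\lambda_1 \neq \lambda_2$, it follows that $tv_1 \neq v_2$. That is, $v_2$ is not a scalar multiple of $v_1$. Similarly, $v_1$ is not a scalar multiple of $v_2$. Hence $v_1$ and $v_2$ are linearly independent.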

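Let $L : V \to V$ be a linear operator.

Theorem

If $v_1$, $v_2$, and $v_3$ are eigenvectors of $L$ associated with distinct eigenvalues $\lambda_1$, $\lambda_2$, and $\lambda_3$, then they are linearly independent.

Proof

Suppose that $t_1v_1 + t_2v_2 + t_3v_3 = 0$ for some $t_1, t_2, t_3 \in \mathbb{R}$. Then

$$L(t_1v_1 + t_2v_2 + t_3v_3) = 0,$$

$$t_1L(v_1) + t_2L(v_2) + t_3L(v_3) = 0,$$

$$t_1\lambda_1v_1 + t_2\lambda_2v_2 + t_3\lambda_3v_3 = 0.$$

It follows that

$$t_1\lambda_1v_1 + t_2\lambda_2v_2 + t_3\lambda_3v_3 - \lambda_3(t_1v_1 + t_2v_2 + t_3v_3) = t_1(\lambda_1 - \lambda_3)v_1 + t_2(\lambda_2 - \lambda_3)v_2 = 0.$$

By the above, $v_1$ and $v_2$ are linearly independent.

Hence $t_1(\lambda_1 - \lambda_3) = t_2(\lambda_2 - \lambda_3) = 0 \Rightarrow t_1 = t_2 = 0$. Then $t_3v_3 = 0$, so $t_3 = 0$ as well.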

Theorem

If $v_1, v_2, \dots, v_k$ are eigenvectors of a linear operator $L$ associated with distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$, then $v_1, v_2, \dots, v_k$ are linearly independent.

Corollary

If $\lambda_1, \lambda_2, \dots, \lambda_k$ are distinct real numbers, then the functions $e^{\lambda_1 x}, e^{\lambda_2 x}, \dots, e^{\lambda_k x}$ are linearly independent.

Proof

Consider the linear operator $D : C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ given by $Df = f'$. Then $e^{\lambda_1 x}, e^{\lambda_2 x}, \dots, e^{\lambda_k x}$ are eigenfunctions of $D$ associated with the distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$. By the theorem, the eigenfunctions are linearly independent.

Basis of eigenvectors

Let $V$ be a finite-dimensional vector space and $L : V \to V$ be a linear operator. Let $v_1, v_2, \dots, v_n$ be a basis for $V$ and $A$ be the matrix of the operator $L$ with respect to this basis.

Theorem

The matrix $A$ is diagonal if and only if the vectors $v_1, v_2, \dots, v_n$ are eigenvectors of $L$. If this is the case, then the diagonal entries of the matrix $A$ are the corresponding eigenvalues of $L$.

Proof

$$L(v_i) = \lambda_i v_i \ \text{ for all } i \iff A = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix}$$

How to find a basis of eigenvectors

We know that if $v_1, v_2, \dots, v_k$ are eigenvectors of a linear operator $L$ associated with distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$, then $v_1, v_2, \dots, v_k$ are linearly independent. From this result we obtain the following.

Corollary

Suppose $\lambda_1, \lambda_2, \dots, \lambda_k$ are all the eigenvalues of a linear operator $L : V \to V$. For any $1 \le i \le k$ let $S_i$ be a basis for the eigenspace associated with the eigenvalue $\lambda_i$. Then these bases are disjoint and the union $S = S_1 \cup S_2 \cup \dots \cup S_k$ is a linearly independent set.

Moreover, if the vector space $V$ admits a basis consisting of eigenvectors of $L$, then $S$ is such a basis.

Corollary

Let $A$ be an $n \times n$ matrix such that the characteristic equation $\det(A - \lambda I) = 0$ has $n$ distinct roots. Then (i) there is a basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$; (ii) all eigenspaces of $A$ are one-dimensional.

Theorem

Let $L$ be a linear operator on a finite-dimensional vector space $V$. Then the following conditions are equivalent:

the matrix of $L$ with respect to some basis is diagonal;

there exists a basis for $V$ formed by eigenvectors of $L$.

An operator satisfying these conditions is called diagonalizable.

Theorem

Let $A$ be an $n \times n$ matrix. Then the following conditions are equivalent:

$A$ is the matrix of a diagonalizable operator;

$A$ is similar to a diagonal matrix, i.e., it can be represented as $A = UDU^{-1}$, where the matrix $D$ is diagonal;

there exists a basis for $\mathbb{R}^n$ formed by eigenvectors of $A$.

Example 11.1

Given the matrix

$$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$$

determine whether it is diagonalizable.

Solution

The matrix $A$ has two eigenvalues: 1 and 3.

The eigenspace of $A$ associated with the eigenvalue 1 is the line spanned by $v_1 = (-1, 1)$.

The eigenspace of $A$ associated with the eigenvalue 3 is the line spanned by $v_2 = (1, 1)$.

The eigenvectors $v_1, v_2$ form a basis for $\mathbb{R}^2$.

Thus, the matrix $A$ is diagonalizable. Namely, $A = UBU^{-1}$, where

$$B = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}, \qquad U = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}.$$

Notice that $U$ is the transition matrix from the basis $v_1, v_2$ to the standard basis; in other words, the columns of $U$ are the eigenvectors $v_1, v_2$.

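Example 11.2

Given the matrix

$$A = \begin{pmatrix} 1 & 1 & -1 \\ 1 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix}$$

determine whether it is diagonalizable.

Solution

The matrix $A$ has two eigenvalues: 0 and 2.

The eigenspace of $A$ associated with the eigenvalue 0 is one-dimensional, spanned by $v_1 = (-1, 1, 0)$, and $S_1 = \{v_1\}$.

The eigenspace of $A$ associated with the eigenvalue 2 is two-dimensional, spanned by $v_2 = (1, 1, 0)$, $v_3 = (-1, 0, 1)$, and $S_2 = \{v_2, v_3\}$.

The union $S = S_1 \cup S_2 = \{v_1, v_2, v_3\}$ is a linearly independent set, hence it is a basis for $\mathbb{R}^3$.

Thus, the matrix $A$ is diagonalizable. Namely, $A = UBU^{-1}$, where

$$B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad U = \begin{pmatrix} -1 & 1 & -1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$

Notice that $U$ is the transition matrix from the basis $v_1, v_2, v_3$ to the standard basis.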
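The claims of Example 11.2 can be checked directly. The following is a minimal sketch in plain Python; the helper names `matvec`, `matmul`, and `is_eigenpair` are made up for this illustration. It tests $Av = \lambda v$ for each listed eigenvector, and tests $AU = UB$, which is equivalent to $A = UBU^{-1}$ because $U$ is invertible.

```python
# Matrix, eigenvalues and eigenvectors as stated in Example 11.2.
A = [[1, 1, -1],
     [1, 1, 1],
     [0, 0, 2]]
B = [[0, 0, 0],
     [0, 2, 0],
     [0, 0, 2]]
U = [[-1, 1, -1],
     [1, 1, 0],
     [0, 0, 1]]
eigenpairs = [(0, [-1, 1, 0]), (2, [1, 1, 0]), (2, [-1, 0, 1])]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def is_eigenpair(M, lam, v):
    """True if v is nonzero and M v = lam v."""
    return any(v) and matvec(M, v) == [lam * x for x in v]


checks = [is_eigenpair(A, lam, v) for lam, v in eigenpairs]
similar = matmul(A, U) == matmul(U, B)  # A U = U B
print(checks, similar)
```

Working with integers keeps every comparison exact, so no rounding tolerance is needed.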

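Example 11.3

Given the matrix

$$A = \begin{pmatrix} 4 & 3 \\ 0 & 1 \end{pmatrix}$$

find $A^5$.

Solution

We know that $A = UBU^{-1}$, where

$$B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}.$$

Then

$$A^5 = UBU^{-1}\,UBU^{-1}\,UBU^{-1}\,UBU^{-1}\,UBU^{-1} = UB^5U^{-1}$$

$$= \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1024 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1024 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1024 & 1023 \\ 0 & 1 \end{pmatrix}.$$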

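Example 11.4

Given the matrix

$$A = \begin{pmatrix} 4 & 3 \\ 0 & 1 \end{pmatrix}$$

find $A^k$.

Solution

We know that $A = UBU^{-1}$, where

$$B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}.$$

Then

$$A^k = \underbrace{UBU^{-1}\,UBU^{-1} \cdots UBU^{-1}}_{k \text{ times}} = UB^kU^{-1}$$

$$= \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 4^k & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 4^k & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 4^k & 4^k - 1 \\ 0 & 1 \end{pmatrix}.$$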
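As a sanity check on Examples 11.3 and 11.4, the sketch below compares $UB^kU^{-1}$ with plain repeated multiplication. It is only an illustration in plain Python; the function names are hypothetical, and $U^{-1}$ is entered by hand as the matrix used in the solution.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[4, 3], [0, 1]]
U = [[1, -1], [0, 1]]
U_INV = [[1, 1], [0, 1]]  # inverse of U, as used in the solution


def power_by_multiplication(M, k):
    """Compute M^k by multiplying k copies of M together."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = matmul(result, M)
    return result


def power_by_diagonalization(k):
    """Compute A^k as U B^k U^{-1}, where B = diag(4, 1)."""
    B_k = [[4 ** k, 0], [0, 1]]
    return matmul(matmul(U, B_k), U_INV)


print(power_by_diagonalization(5))
```

The diagonalized form needs only the scalar powers $4^k$ and $1^k$, which is why it yields a closed formula for every $k$.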

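Example 11.5

Given the matrix

$$A = \begin{pmatrix} 4 & 3 \\ 0 & 1 \end{pmatrix}$$

find $C$ such that $C^2 = A$.

Solution

We know that $A = UBU^{-1}$, where

$$B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}.$$

Suppose that $D^2 = B$ for some matrix $D$. Let $C = UDU^{-1}$; then

$$C^2 = UDU^{-1}\,UDU^{-1} = UD^2U^{-1} = UBU^{-1} = A.$$

We can take

$$D = \begin{pmatrix} \sqrt{4} & 0 \\ 0 & \sqrt{1} \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix},$$

then

$$C = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix}.$$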
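The square root found in Example 11.5 is easy to confirm by squaring it. The sketch below (plain Python, with a hypothetical helper `square_root_from`) rebuilds $C = UDU^{-1}$ from a choice of diagonal entries and checks that $C^2 = A$.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


A = [[4, 3], [0, 1]]
U = [[1, -1], [0, 1]]
U_INV = [[1, 1], [0, 1]]  # inverse of U


def square_root_from(d1, d2):
    """Return C = U diag(d1, d2) U^{-1}; C^2 = A whenever d1^2 = 4 and d2^2 = 1."""
    D = [[d1, 0], [0, d2]]
    return matmul(matmul(U, D), U_INV)


C = square_root_from(2, 1)
print(C, matmul(C, C) == A)
```

Any sign choice $d_1 = \pm 2$, $d_2 = \pm 1$ also satisfies $D^2 = B$, so this construction gives four square roots of $A$; the example picks the one with positive entries on the diagonal of $D$.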

Example 11.6

Coupled Oscillations

1) Case of two masses. Consider two identical bodies joined up with identical springs on a frictionless track (figure not reproduced here).

Here $A$ and $B$ represent the equilibrium positions of the two masses. Let $x_1(t)$ and $x_2(t)$ be the displacements of the two masses from their equilibrium positions at time $t$, and let $k$ be the spring constant.

By Hooke's law, the force acting on the first body has two parts. (Hooke's law, discovered by the English scientist Robert Hooke in 1660, states that the force $f$ exerted by a coiled spring is directly proportional to its extension $\Delta x$.) The first part is $-kx_1$, due to the leftmost spring, and the second part is $k(x_2 - x_1)$, due to the center spring. The net force acting on the first mass is then

$$F_1 = -kx_1 + k(x_2 - x_1) = -2kx_1 + kx_2.$$

Similarly, the net force acting on the second mass is

$$F_2 = kx_1 - 2kx_2.$$

Applying Newton's Second Law

(in the presence of external forces, an object experiences an acceleration directly proportional to the net external force and inversely proportional to the mass of the object: $F = ma$)

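gives the following system of differential equations, where $m$ is the mass of each body:

$$m x_1'' = -2kx_1 + kx_2$$

$$m x_2'' = kx_1 - 2kx_2$$

This can be written in matrix form as follows:

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}'' = -\frac{k}{m} \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$$

Let us now find the eigenvalues of the matrix

$$A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix}$$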

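Recall that the eigenvalues of $A$ are those values of $\lambda$ satisfying the equation $\det(A - \lambda I) = 0$, where $I$ is the $2 \times 2$ identity matrix:

$$|A - \lambda I| = \begin{vmatrix} 2 - \lambda & -1 \\ -1 & 2 - \lambda \end{vmatrix} = 0 \;\Rightarrow\; (2 - \lambda)^2 - 1 = 0 \;\Rightarrow\; \lambda_1 = 1,\ \lambda_2 = 3.$$

Let us now find the corresponding eigenvectors. For $\lambda_1 = 1$ the augmented matrix of $(A - \lambda_1 I)v = 0$ is

$$\left(\begin{array}{cc|c} 1 & -1 & 0 \\ -1 & 1 & 0 \end{array}\right)$$

which is equivalent to

$$\left(\begin{array}{cc|c} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right)$$

Thus, one unit eigenvector associated with this eigenvalue is

$$v_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$$

Similarly, one can show that for $\lambda_2 = 3$, one unit eigenvector associated with this eigenvalue is

$$v_2 = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$$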

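Now let

$$V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix};$$

then $V^{-1}AV = D$, or $A = VDV^{-1}$, where

$$D = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}.$$

Thus, the original system of equations can be written as

$$X'' = -\frac{k}{m} VDV^{-1}X \;\Rightarrow\; (V^{-1}X)'' = -\frac{k}{m} DV^{-1}X,$$

and making the change of variable $Y = V^{-1}X$, we obtain

$$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'' = -\omega_0^2 \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \qquad \omega_0 = \sqrt{\frac{k}{m}},$$

which gives us the uncoupled equations

$$y_1'' = -\omega_0^2\, y_1, \qquad y_2'' = -3\omega_0^2\, y_2.$$

Physical interpretation

1) Let us look again at the eigenvector

$$v_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$$

corresponding to the eigenvalue 1. The fact that the components are equal tells us that $x_1$ and $x_2$ are always equal. Consequently, the system oscillates back and forth but the middle spring is never stretched. It is as if we had two masses, each attached to a spring of constant $k$. It is easy to see then that the frequency is given by

$$\omega_0 = \sqrt{\frac{k}{m}}.$$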

2) Let us look again at the second eigenvector

$$v_2 = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix}$$

corresponding to the eigenvalue 3.

Here the components $x_1$ and $x_2$ are always equal in magnitude but opposite in direction, which gives an "in and out" type of motion. The frequency of the system is also predictable in this case: each mass is attached to one spring compressed by a distance $x_1$ and to another stretched by a distance $2x_1$. It is as if the mass were attached to one single spring of constant $3k$. We know that the frequency in this case is

$$\omega = \sqrt{\frac{3k}{m}} = \sqrt{3}\,\omega_0.$$

These two particular cases are called the Normal Modes (https://www.youtube.com/watch?v=x_ZkKPtgTeA) of vibration of the system. As you can guess, they have the property that if the system starts out in one of these modes, it will remain in this mode.

Of course, the above problem involving two masses can be dealt with without talking about eigenvalues and eigenvectors. The benefit of using that algebraic technique is more apparent in the more complicated cases with more than two masses.
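To make the link between eigenvalues and frequencies concrete, here is a small sketch in plain Python. The function names are hypothetical and the values of `k` and `m` in the usage line are arbitrary sample numbers, not data from the example. It computes the eigenvalues of a symmetric $2 \times 2$ matrix from its characteristic polynomial and converts them into the two normal-mode frequencies $\sqrt{\lambda}\,\omega_0$.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]], i.e. the roots of
    (a - t)(d - t) - b^2 = 0, returned in increasing order."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius


def normal_mode_frequencies(k, m):
    """Angular frequencies of the two normal modes for spring constant k and mass m."""
    lam1, lam2 = symmetric_2x2_eigenvalues(2, -1, 2)  # matrix A of the example
    omega0 = math.sqrt(k / m)
    return math.sqrt(lam1) * omega0, math.sqrt(lam2) * omega0


# Sample values (assumed for illustration only): k = 8, m = 2, so omega0 = 2.
slow, fast = normal_mode_frequencies(8.0, 2.0)
print(slow, fast)
```

The slower frequency belongs to the mode in which both masses move together, and the faster one to the "in and out" mode.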

Example 11.6 Fibonacci Numbers

In 1202, an Italian mathematician known as Fibonacci posed the following problem: a male and a female rabbit are born at the beginning of the year. We assume the following conditions:

After reaching the age of two months, each pair produces a mixed pair (one male, one female), and then another mixed pair each month thereafter.

Let $f_n$ denote the number of rabbit pairs at the beginning of month $n$; then we have

$$f_0 = 1,\ f_1 = 1,\ f_2 = 2,\ f_3 = 3,\ f_4 = 5,\ \dots,\ f_n = f_{n-1} + f_{n-2}.$$

In other words, every term in this sequence (starting with $n = 2$) is equal to the sum of the previous two terms. The sequence $1, 1, 2, 3, 5, 8, 13, 21, 34, \dots$ generated according to this scheme was named in honour of Fibonacci.

(Leonardo of Pisa, or Fibonacci, played an important role in reviving ancient mathematics and made significant contributions of his own.)

This sequence occurs in many places in mathematics, general science, in nature and even in art.

In particular, $f_{11} = 144$ and $f_{12} = 233$, so at the end of the first year ($n = 12$) there will be 233 pairs of rabbits. As you can see, the Fibonacci numbers grow quickly.

A natural question arises here: can we find an expression that gives $f_n$ for any $n$? The answer is yes, if one knows a little bit about matrix diagonalization.

Let us now see how all this can be applied to answer our problem about the rabbits of Fibonacci. The general pattern for the Fibonacci sequence was:

$$f_n = f_{n-1} + f_{n-2}$$

A general expression for $f_n$ is not apparent, but a clever twist transforms the problem into a simpler one using matrices. The idea is to compute the vector

$$V_n = \begin{pmatrix} f_{n+1} \\ f_n \end{pmatrix}$$

for each $n \ge 0$ rather than $f_n$ itself. The relation $f_n = f_{n-1} + f_{n-2}$ gives

$$V_{n+1} = \begin{pmatrix} f_{n+2} \\ f_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} f_{n+1} \\ f_n \end{pmatrix} = AV_n, \qquad \text{where } A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix},$$

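and therefore

$$V_n = AV_{n-1} = A(AV_{n-2}) = A^2V_{n-2} = A^3V_{n-3} = \dots = A^nV_0.$$

But the matrix $A$ is diagonalizable, since its eigenvalues are given by

$$|A - \lambda I| = \begin{vmatrix} 1 - \lambda & 1 \\ 1 & -\lambda \end{vmatrix} = \lambda^2 - \lambda - 1 = 0 \;\Rightarrow\; \lambda_1 = \frac{1 + \sqrt{5}}{2} =: \varphi, \quad \lambda_2 = \frac{1 - \sqrt{5}}{2} =: \varphi^*,$$

which are distinct, with associated eigenvectors

$$p_1 = \begin{pmatrix} \varphi \\ 1 \end{pmatrix}, \qquad p_2 = \begin{pmatrix} \varphi^* \\ 1 \end{pmatrix}.$$

Hence, the matrix $A$ is diagonalized by the matrix of eigenvectors

$$P = \begin{pmatrix} \varphi & \varphi^* \\ 1 & 1 \end{pmatrix}$$

and its inverse

$$P^{-1} = \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{1}{\sqrt{5}}\varphi^* \\ -\frac{1}{\sqrt{5}} & \frac{1}{\sqrt{5}}\varphi \end{pmatrix}.$$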

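The above results give the following decomposition of $A$:

$$A = PDP^{-1} = \begin{pmatrix} \varphi & \varphi^* \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi & 0 \\ 0 & \varphi^* \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{1}{\sqrt{5}}\varphi^* \\ -\frac{1}{\sqrt{5}} & \frac{1}{\sqrt{5}}\varphi \end{pmatrix}.$$

Thus, with $V_0 = (f_1, f_0)^T = (1, 1)^T$, we finally obtain

$$V_n = A^nV_0 = PD^nP^{-1}V_0 = \begin{pmatrix} \varphi & \varphi^* \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi^n & 0 \\ 0 & (\varphi^*)^n \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{1}{\sqrt{5}}\varphi^* \\ -\frac{1}{\sqrt{5}} & \frac{1}{\sqrt{5}}\varphi \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$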

If you carefully multiply this out and take the second component of the result (recall that $V_n = (f_{n+1}, f_n)^T$), you get the $n$th Fibonacci term:

$$f_n = \frac{1}{\sqrt{5}} \left( \varphi^{n+1} - (\varphi^*)^{n+1} \right).$$

Note

The number

$$\varphi := \frac{1 + \sqrt{5}}{2}$$

is one of the most mysterious numbers, and is widely known as the golden ratio. It has this name since a rectangle with sides $s$ and $\varphi s$ has a pleasing shape, possibly because of the following fact: if you fold the smaller side (of length $s$) into the rectangle, dividing it into a square of side $s$ and a rectangle with sides $s$ and $s(\varphi - 1)$, this smaller rectangle is similar to the one you began with.
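The closed form can be compared against the recurrence numerically. The sketch below is an illustration in plain Python with made-up function names; it uses the indexing of this example ($f_0 = f_1 = 1$) and the formula with a minus sign between the two powers, which is what the product $PD^nP^{-1}V_0$ yields. Floating-point rounding limits it to moderate $n$.

```python
import math


def fib_recurrence(n):
    """f_n from f_n = f_{n-1} + f_{n-2} with f_0 = f_1 = 1."""
    a, b = 1, 1  # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_closed_form(n):
    """f_n = (phi^(n+1) - phi_star^(n+1)) / sqrt(5), rounded to the nearest integer."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    phi_star = (1 - sqrt5) / 2
    return round((phi ** (n + 1) - phi_star ** (n + 1)) / sqrt5)


print([fib_closed_form(n) for n in range(10)])
```

Because $|\varphi^*| < 1$, the second power shrinks quickly, so $f_n$ is simply the integer nearest to $\varphi^{n+1}/\sqrt{5}$.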

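Example 11.7 Quadratic Equation

A quadratic equation in two variables $x$ and $y$ is an equation of the form

$$ax^2 + 2bxy + cy^2 + dx + ey + f = 0 \qquad (1)$$

which can be rewritten as:

$$\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} d & e \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + f = 0 \qquad (2)$$

Let

$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}.$$

The term $\mathbf{x}^TA\mathbf{x} = ax^2 + 2bxy + cy^2$ is called the quadratic form associated with (1).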

Conic Sections

The graph of an equation of the form (1) is called a conic section. [If there are no ordered pairs $(x, y)$ which satisfy (1), we say that the equation represents an imaginary conic.] If the graph of (1) consists of a single point, a line, or a pair of lines, we say that (1) represents a degenerate conic. Of more interest are the nondegenerate conics. Graphs of nondegenerate conics turn out to be circles, ellipses, parabolas, or hyperbolas. The graph of a conic is particularly easy to sketch when its equation can be put into one of the following standard forms:

$x^2 + y^2 = r^2$ (Circle)

$\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} = 1$ (Ellipse)

$\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = 1$ or $\dfrac{y^2}{a^2} - \dfrac{x^2}{b^2} = 1$ (Hyperbola)

$x^2 = ay$ or $y^2 = ax$ (Parabola)

Here $a$, $b$ and $r$ are nonzero real numbers. Note that the circle is a special case of the ellipse ($a = b = r$). A conic section is said to be in standard position if its equation can be put into one of these four standard forms. The graphs of the first three will all be symmetric with respect to both coordinate axes and the origin. We say that these curves are centered at the origin. A parabola in standard position will have its vertex at the origin and will be symmetric with respect to one of the axes.

Now, if the original quadratic equation has no $xy$ term, then the graph is a standard conic that has merely been shifted horizontally or vertically.

Thus, for instance, we have the following case:

$$9x^2 - 18x + 4y^2 + 16y - 11 = 0 \;\Rightarrow\; 9(x - 1)^2 + 4(y + 2)^2 = 36 \;\Rightarrow\; \frac{(x - 1)^2}{2^2} + \frac{(y + 2)^2}{3^2} = 1,$$

and if we make the change of variable

$$x' = x - 1, \qquad y' = y + 2,$$

we obtain

$$\frac{(x')^2}{2^2} + \frac{(y')^2}{3^2} = 1,$$

which is in standard form with respect to the variables $x'$ and $y'$. If, however, the conic section has also been rotated from the standard position, it is necessary to change coordinates so that the equation in terms of the new coordinates $x'$ and $y'$ involves no $x'y'$ term. Let $\mathbf{x} = (x, y)^T$ and $\mathbf{x}' = (x', y')^T$. Since the new coordinates differ from the old coordinates by a rotation, we have

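$$\mathbf{x} = Q\mathbf{x}' \quad \text{or} \quad \mathbf{x}' = Q^T\mathbf{x},$$

where

$$Q = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \quad \text{or} \quad Q^T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

If $0 < \theta < \pi$, then the matrix $Q$ corresponds to a rotation of $\theta$ radians in the clockwise direction and $Q^T$ corresponds to a rotation of $\theta$ radians in the counterclockwise direction. With this change of variables, (2) becomes

$$(\mathbf{x}')^T \left( Q^TAQ \right) \mathbf{x}' + \begin{pmatrix} d' & e' \end{pmatrix} \mathbf{x}' + f = 0 \qquad (3)$$

where $\begin{pmatrix} d' & e' \end{pmatrix} = \begin{pmatrix} d & e \end{pmatrix} Q$. This equation will involve no $x'y'$ term if and only if $Q^TAQ$ is diagonal. Since $A$ is symmetric, it is possible to find a pair of orthonormal eigenvectors $q_1 = (x_1, -y_1)^T$ and $q_2 = (y_1, x_1)^T$. Thus, if we set $\cos\theta = x_1$ and $\sin\theta = y_1$, then $Q = (q_1\ q_2)$ diagonalizes $A$ and (3) simplifies to

$$\lambda_1 (x')^2 + \lambda_2 (y')^2 + d'x' + e'y' + f = 0.$$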

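Consider the conic section

$$3x^2 + 2xy + 3y^2 - 8 = 0.$$

This equation can be written in the form

$$\begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} - 8 = 0.$$

The matrix

$$A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$$

has eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 4$ with corresponding unit eigenvectors

$$q_1 = \left( \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}} \right)^T, \qquad q_2 = \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right)^T.$$

Let

$$Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{pmatrix}$$

and set

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}.$$

Then $Q^TAQ = \mathrm{diag}(2, 4)$, so the equation reads $2(x')^2 + 4(y')^2 = 8$, and the equation of the conic becomes

$$\frac{(x')^2}{4} + \frac{(y')^2}{2} = 1.$$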

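In the new coordinate system the direction of the $x'$-axis is determined by the point $x' = 1$, $y' = 0$. To translate this to the $xy$ coordinate system, we multiply

$$Q\mathbf{x}' = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} = q_1.$$

The $x'$-axis will be in the direction of $q_1$. Similarly, the direction of the $y'$-axis is determined by $q_2$.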
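The rotation step of the conic example can be checked numerically: the product $Q^TAQ$ should come out diagonal, with the eigenvalues 2 and 4 on the diagonal. The following is a minimal sketch in plain Python; the helper names `matmul` and `transpose` are chosen for this illustration.

```python
import math


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*X)]


s = 1 / math.sqrt(2)        # cos(45 degrees) = sin(45 degrees)
A = [[3, 1], [1, 3]]        # matrix of the quadratic form 3x^2 + 2xy + 3y^2
Q = [[s, s], [-s, s]]       # columns are the unit eigenvectors q1 and q2

D = matmul(matmul(transpose(Q), A), Q)   # expected to be close to diag(2, 4)
QtQ = matmul(transpose(Q), Q)            # expected to be close to the identity
print(D)
```

Since the entries are floating-point numbers, the off-diagonal entries of `D` are only approximately zero, so comparisons should use a small tolerance.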

The Essentials of Quantum Mechanics

In classical mechanics, a particle has an exact, sharply defined position and an exact, sharply defined momentum at all times. Quantum mechanics is a different fundamental formalism, in which observables such as position and momentum are not real numbers but operators; consequently there are uncertainty relations, e.g. $\Delta x\, \Delta p \ge \hbar/2$, which say that as some observables become more sharply defined, others become more uncertain. Experiments show that quantum mechanics, not classical mechanics, is the correct description of nature.

Here is a summary of the essentials of quantum mechanics, focussing on the case of a single non-relativistic particle (e.g. an electron) in one dimension. This is not a complete review of everything you need to know: it is a quick outline of the basics to help you get oriented with this challenging subject.

States. The state of the system is given by a wavefunction $\psi(x)$. The wavefunction is complex, and gives the amplitude for finding the particle at position $x$. The probability density is the square modulus of the amplitude, so in one dimension the probability to find the particle located between $x$ and $x + dx$ is

$$P(x)\,dx = |\psi(x)|^2\,dx \qquad (1)$$

This means that if you start off with an "ensemble" of identical copies of the system, all with the same wavefunction $\psi(x)$, and in each member you measure the position of the particle, then you will get different results from different members of the ensemble. The probability of getting particular answers is given by (1). A physical state must be normalized, so that all the probabilities add up to 1:

$$\int |\psi(x)|^2\,dx = 1 \qquad (2)$$

Operators. Each observable corresponds to a linear operator. A linear operator is something that acts on a state and gives another state, i.e., it changes one function into another.

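Position operator $\hat{x}$, defined by

$$\hat{x}(\psi(x)) = x\,\psi(x)$$

Momentum operator $\hat{p}$, defined by

$$\hat{p}(\psi(x)) = -i\hbar \frac{\partial \psi(x)}{\partial x} \qquad (3)$$

For example, when the position operator acts on the state $\psi(x) = \dfrac{1}{a^2 + x^2}$ it gives $\hat{x}(\psi(x)) = \dfrac{x}{a^2 + x^2}$, while the momentum operator gives $\hat{p}(\psi(x)) = \dfrac{2i\hbar x}{(a^2 + x^2)^2}$.

Linearity means that an operator $\hat{Y}$ acts on a sum of two states in the obvious way:

$$\hat{Y}[\psi_1(x) + \psi_2(x)] = \hat{Y}[\psi_1(x)] + \hat{Y}[\psi_2(x)] \qquad (4)$$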

Expectation values. The expectation value of an observable $\hat{Y}$ in a normalized state $\psi(x)$ is

$$\langle \psi(x) | \hat{Y} | \psi(x) \rangle = \int_{-\infty}^{\infty} \psi(x)^*\, \hat{Y}[\psi(x)]\, dx \qquad (5)$$

This means that if you have an ensemble of identical copies of the system, all with the same wavefunction $\psi(x)$, then when you measure the value of the observable $\hat{Y}$ in all the members, the average value that you get is $\langle \psi(x) | \hat{Y} | \psi(x) \rangle$ (sometimes written $\langle \hat{Y} \rangle_{\psi(x)}$, or even $\langle \hat{Y} \rangle$, if it is obvious from the context which state to use).

So the average position of a particle in a normalized state is

$$\langle \psi(x) | \hat{x} | \psi(x) \rangle = \int_{-\infty}^{\infty} \psi(x)^*\, \hat{x}[\psi(x)]\, dx = \int_{-\infty}^{\infty} \psi(x)^*\, x\, \psi(x)\, dx = \int_{-\infty}^{\infty} x\, |\psi(x)|^2\, dx \qquad (6)$$

Eigenvalues and eigenstates. Each operator $\hat{Y}$ has a set of eigenvalues $y$, which are the possible values you can get on doing a measurement of $\hat{Y}$. Any measurement of $\hat{Y}$ must yield one of the eigenvalues. Each eigenvalue $y$ is associated with an eigenstate $\phi_y(x)$, which is the state for which the value of $\hat{Y}$ is exactly $y$:

$$\hat{Y}[\phi_y(x)] = y\, \phi_y(x) \qquad (7)$$

So if you create an ensemble of systems that are all in the same state, namely the $\hat{Y}$-eigenstate $\phi_y(x)$, then when you measure the value of the observable $\hat{Y}$, each member of the ensemble will give you the same answer, namely $y$.

Physically, $\phi_y(x)$ is the state in which the observable $\hat{Y}$ has the definite value $y$. Note that the eigenvalues of an operator that corresponds to an observable are always real, since they are possible values of that physical observable.

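For example, the eigenstates of the momentum operator are plane waves $\psi_p(x) = e^{ipx/\hbar}$:

$$\hat{p}\, e^{ipx/\hbar} = -i\hbar \frac{\partial e^{ipx/\hbar}}{\partial x} = p\, e^{ipx/\hbar} \qquad (8)$$

The eigenstates of the position operator are $\delta$-functions, $\psi_{x_1}(x) = \delta(x - x_1)$. The function $\delta(x - x_1)$ is zero everywhere except at $x = x_1$, where it is infinite, so

$$\hat{x}(\delta(x - x_1)) = x\, \delta(x - x_1) = x_1\, \delta(x - x_1).$$

The formal definition of the $\delta$-function is:

$$\int f(x)\, \delta(x - x_1)\, dx = f(x_1).$$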

Resolving a State into Eigenstates. The eigenstates of any operator $\hat{Y}$ form a complete orthonormal basis of states, so you can write any state $\psi(x)$ in terms of them:

$$\psi(x) = \sum_y A_y\, \phi_y(x) \quad \text{or} \quad \psi(x) = \int A(y)\, \phi_y(x)\, dy \qquad (9)$$

$A_y$ or $A(y)$ is the amplitude for the value of the observable $\hat{Y}$ for a particle in the state $\psi(x)$ to be $y$. So when you make a measurement of the observable $\hat{Y}$ on a system in the state $\psi(x)$:

Discrete eigenvalues: $|A_y|^2$ = probability to get the value $y$

Continuous eigenvalues: $|A(y)|^2\, dy$ = probability to get a value in the range $y$ to $y + dy$ $\qquad (10)$

The only question is: for a given state $\psi(x)$, how do you express it in terms of eigenstates of $\hat{Y}$, i.e., how do you calculate the coefficients $A_y$ or $A(y)$? Actually, thanks to the orthonormality property this is easy: you multiply the state by the complex conjugate of the eigenstate of $\hat{Y}$ with eigenvalue $y$, and integrate over $x$:

$$A_y \ \text{or} \ A(y) = \int \phi_y^*(x)\, \psi(x)\, dx \qquad (11)$$

Using (10), we then know the probability of getting any of the allowed values when you perform a measurement of $\hat{Y}$ on a particle in state $\psi(x)$. Formula (11) follows from the orthonormality property of the eigenstates:

$$\int \phi_{y_1}^*(x)\, \phi_{y_2}(x)\, dx = \begin{cases} \delta(y_1 - y_2) & \text{(continuous eigenvalues)} \\ \delta_{y_1, y_2} & \text{(discrete eigenvalues)} \end{cases} \qquad (12)$$

Time Evolution. The evolution through time of a state is determined by the energy operator, usually called the Hamiltonian $\hat{H}$, via the time-dependent Schrödinger equation

$$\hat{H}\psi = i\hbar \frac{\partial \psi}{\partial t} \qquad (13)$$

For a spinless, non-relativistic particle of mass $m$ in a one-dimensional potential $V(x)$, the Hamiltonian is

$$\hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{x}) = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x) \qquad (14)$$

Energy Eigenstates. Because time evolution is so important, the energy eigenstates are particularly important. As you would expect from (7), these are defined by

$$\hat{H}\psi_E = E\psi_E \qquad (15)$$

which is called the time-independent Schrödinger equation, but is really just the standard equation for the eigenvalues and eigenstates of the Hamiltonian (i.e. energy) operator. You can apply the time-dependent Schrödinger equation to the energy eigenstates and show that they have a simple time dependence: they oscillate at a frequency determined by their energy,

$$\psi_E(x, t) = \psi_E(x, 0)\, e^{-iEt/\hbar}.$$

So the easiest way to evolve a state forward in time is to resolve it into energy eigenstates, and let each eigenstate oscillate at its own frequency:

$$\psi(x, 0) = \sum_E A_E\, \psi_E(x), \qquad \psi(x, t) = \sum_E A_E\, \psi_E(x)\, e^{-iEt/\hbar} \qquad (16)$$

The tricky part of this is that you have to calculate the eigenstates and corresponding eigenvalues of the Hamiltonian. This is not always easy to do, but once it is done you have solved the system: you know exactly how it will behave.

Length and distance

Definition

The length of a vector $v = (v_1, v_2, \dots, v_n) \in \mathbb{R}^n$ is

$$\|v\| = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2}.$$

The distance between vectors $x$ and $y$ is defined as $\|x - y\|$.

Properties of length

$\|x\| \ge 0$, and $\|x\| = 0$ only if $x = 0$. (Positivity)

$\|rx\| = |r|\,\|x\|$. (Homogeneity)

$\|x + y\| \le \|x\| + \|y\|$. (Triangle inequality)

Scalar product

Definition

The scalar product of vectors $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ is $x \cdot y = x_1y_1 + x_2y_2 + \dots + x_ny_n$.

Alternative notation: $(x, y)$ or $\langle x, y \rangle$.

Properties of the scalar product

$x \cdot x \ge 0$, and $x \cdot x = 0$ only if $x = 0$. (Positivity)

$x \cdot y = y \cdot x$. (Symmetry)

$(x + y) \cdot z = x \cdot z + y \cdot z$. (Distributive law)

$(rx) \cdot y = r(x \cdot y)$. (Homogeneity)

In particular, $x \cdot y$ is a bilinear function (i.e., it is both a linear function of $x$ and a linear function of $y$).

Relations between lengths and scalar products:

$\|x\| = \sqrt{x \cdot x}$

$|x \cdot y| \le \|x\|\,\|y\|$ (Cauchy-Schwarz inequality).

By the Cauchy-Schwarz inequality, for any nonzero vectors $x, y \in \mathbb{R}^n$ we have

$$|x \cdot y| \le \|x\|\,\|y\| \;\Rightarrow\; -\|x\|\,\|y\| \le x \cdot y \le \|x\|\,\|y\| \;\Rightarrow\; -1 \le \frac{x \cdot y}{\|x\|\,\|y\|} \le 1.$$

Thus, we define

$$\cos\theta = \frac{x \cdot y}{\|x\|\,\|y\|} \quad \text{for some } 0 \le \theta \le \pi,$$

where $\theta$ is called the angle between the vectors $x$ and $y$. The vectors $x$ and $y$ are said to be orthogonal (denoted by $x \perp y$) if $x \cdot y = 0$, i.e., if $\theta = 90^\circ$.

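Example 11.9

Find the angle $\theta$ between vectors $x = (2, -1)$ and $y = (3, 1)$.

Solution

$$x \cdot y = 5, \quad \|x\| = \sqrt{5}, \quad \|y\| = \sqrt{10} \;\Rightarrow\; \cos\theta = \frac{x \cdot y}{\|x\|\,\|y\|} = \frac{5}{\sqrt{5}\sqrt{10}} = \frac{1}{\sqrt{2}} \;\Rightarrow\; \theta = 45^\circ.$$

Example 11.10

Find the angle $\varphi$ between vectors $v = (2, 1, 3)$ and $w = (4, 5, 1)$.

Solution

$$v \cdot w = 16, \quad \|v\| = \sqrt{14}, \quad \|w\| = \sqrt{42} \;\Rightarrow\; \cos\varphi = \frac{16}{\sqrt{14}\sqrt{42}} = \frac{8}{7\sqrt{3}} \;\Rightarrow\; \varphi \approx 48.7^\circ.$$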
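The angle formula translates directly into a few lines of code. The sketch below is plain Python with hypothetical helper names (`dot`, `norm`, `angle_degrees`); the clamping of the cosine to $[-1, 1]$ is an added safeguard against rounding error, not part of the definition.

```python
import math


def dot(x, y):
    """Scalar product x . y."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Length ||x|| = sqrt(x . x)."""
    return math.sqrt(dot(x, x))


def angle_degrees(x, y):
    """Angle between nonzero vectors x and y, in degrees."""
    cos_theta = dot(x, y) / (norm(x) * norm(y))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return math.degrees(math.acos(cos_theta))


theta = angle_degrees((2, -1), (3, 1))      # vectors of Example 11.9
phi = angle_degrees((2, 1, 3), (4, 5, 1))   # vectors of Example 11.10
print(theta, phi)
```

The same function works in any dimension, since it only relies on the scalar product.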

Orthogonality

Definition 1

Vectors $x, y \in \mathbb{R}^n$ are said to be orthogonal (denoted by $x \perp y$) if $x \cdot y = 0$.

Definition 2

A vector $x \in \mathbb{R}^n$ is said to be orthogonal to a nonempty set $Y \subset \mathbb{R}^n$ (denoted by $x \perp Y$) if $x \cdot y = 0$ for all $y \in Y$.

Definition 3

Nonempty sets $X, Y \subset \mathbb{R}^n$ are said to be orthogonal sets (denoted by $X \perp Y$) if $x \cdot y = 0$ for all $x \in X$ and $y \in Y$.

Examples in $\mathbb{R}^3$.

The line $x = y = 0$ is orthogonal to the line $y = z = 0$. Indeed, if $v = (0, 0, z)$ and $w = (x, 0, 0)$ then $v \cdot w = 0$.

The line $x = y = 0$ is orthogonal to the plane $z = 0$. Indeed, if $v = (0, 0, z)$ and $w = (x, y, 0)$ then $v \cdot w = 0$.

The line $x = y = 0$ is not orthogonal to the plane $z = 1$. The vector $v = (0, 0, 1)$ belongs to both the line and the plane, and $v \cdot v = 1 \neq 0$. (Note: they meet at a right angle, but they are not orthogonal sets.)

The plane $z = 0$ is not orthogonal to the plane $y = 0$. The vector $v = (1, 0, 0)$ belongs to both planes, and $v \cdot v = 1 \neq 0$. (Note: they meet at a right angle, but they are not orthogonal sets.)

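Proposition 1

If $X, Y \subset \mathbb{R}^n$ are orthogonal sets, then either they are disjoint or $X \cap Y = \{0\}$.

Proof

$$v \in X \cap Y \;\Rightarrow\; v \perp v \;\Rightarrow\; v \cdot v = 0 \;\Rightarrow\; v = 0.$$

Proposition 2

Let $V$ be a subspace of $\mathbb{R}^n$ and $S$ be a spanning set for $V$. Then for any $x \in \mathbb{R}^n$,

$$x \perp S \;\Rightarrow\; x \perp V.$$

Proof

Any $v \in V$ is represented as $v = a_1v_1 + a_2v_2 + \dots + a_kv_k$, where $v_i \in S$ and $a_i \in \mathbb{R}$. Then for any $x \perp S$,

$$x \cdot v = x \cdot (a_1v_1 + a_2v_2 + \dots + a_kv_k) = a_1(x \cdot v_1) + a_2(x \cdot v_2) + \dots + a_k(x \cdot v_k) = 0.$$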

Orthogonal complement

Definition

Let $S \subset \mathbb{R}^n$. The orthogonal complement of $S$, denoted by $S^\perp$, is the set of all vectors $x \in \mathbb{R}^n$ that are orthogonal to $S$. That is, $S^\perp$ is the largest subset of $\mathbb{R}^n$ orthogonal to $S$.

Theorem

$S^\perp$ is a subspace of $\mathbb{R}^n$.

Note that $S \subset (S^\perp)^\perp$, hence $\mathrm{Span}(S) \subset (S^\perp)^\perp$.

Theorem

$(S^\perp)^\perp = \mathrm{Span}(S)$. In particular, for any subspace $V$, we have $(V^\perp)^\perp = V$.

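For instance, the line $L = \{(x, 0, 0) : x \in \mathbb{R}\}$ and the plane $\Pi = \{(0, y, z) : y, z \in \mathbb{R}\}$ in $\mathbb{R}^3$ satisfy $L^\perp = \Pi$ and $\Pi^\perp = L$.

Fundamental subspaces I

Definitions

Given an $m \times n$ matrix $A$, let

$$N(A) = \{x \in \mathbb{R}^n \mid Ax = 0\}$$

$$R(A) = \{b \in \mathbb{R}^m \mid b = Ax \text{ for some } x \in \mathbb{R}^n\}$$

$R(A)$ is the range of the linear mapping $L : \mathbb{R}^n \to \mathbb{R}^m$, $L(x) = Ax$.

Also, $N(A)$ is the nullspace of the matrix $A$, while $R(A)$ is the column space of $A$. The row space of $A$ is $R(A^T)$.

The subspaces $N(A), R(A^T) \subset \mathbb{R}^n$ and $R(A), N(A^T) \subset \mathbb{R}^m$ are the fundamental subspaces associated with the matrix $A$.

Theorem

$$N(A) = R(A^T)^\perp, \qquad N(A^T) = R(A)^\perp$$

That is, the nullspace of a matrix is the orthogonal complement of its row space.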

Proof

The equality $Ax = 0$ means that the vector $x$ is orthogonal to the rows of the matrix $A$. Therefore, $N(A) = S^\perp$, where $S$ is the set of rows of $A$. It remains to note that $S^\perp = \mathrm{Span}(S)^\perp = R(A^T)^\perp$.

Corollary

Let $V$ be a subspace of $\mathbb{R}^n$. Then $\dim(V) + \dim(V^\perp) = n$.

Proof

Pick a basis $v_1, v_2, \dots, v_k$ for $V$. Let $A$ be the $k \times n$ matrix whose rows are the vectors $v_1, v_2, \dots, v_k$. Then $V = R(A^T)$, hence $V^\perp = N(A)$. Consequently, $\dim(V)$ and $\dim(V^\perp)$ are the rank and the nullity of $A$. Therefore $\dim(V) + \dim(V^\perp)$ equals the number of columns of $A$, which is $n$.

Example 11.11

Let $V$ be the plane spanned by vectors $v_1 = (1, 1, 0)$ and $v_2 = (0, 1, 1)$. Find $V^\perp$.

Solution

The orthogonal complement of $V$ is the same as the orthogonal complement of the set $\{v_1, v_2\}$. A vector $u = (x, y, z)$ belongs to the latter if and only if

$$\begin{cases} u \cdot v_1 = 0 \\ u \cdot v_2 = 0 \end{cases} \iff \begin{cases} x + y = 0 \\ y + z = 0 \end{cases}$$

Alternatively, the subspace $V$ is the row space of the matrix

$$A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix},$$

hence $V^\perp$ is the nullspace of $A$. The general solution of the system (or, equivalently, the general element of the nullspace of $A$) is $(t, -t, t) = t(1, -1, 1)$, $t \in \mathbb{R}$. Thus $V^\perp$ is the straight line spanned by the vector $(1, -1, 1)$.
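In $\mathbb{R}^3$ there is a shortcut for this particular kind of problem that the solution above does not use: the cross product of two spanning vectors of a plane is orthogonal to both, so it spans the orthogonal complement. The sketch below (plain Python, hypothetical helper names) applies it to Example 11.11 and checks the result with scalar products. This trick is specific to three dimensions; the nullspace approach of the example works in general.

```python
def dot(x, y):
    """Scalar product x . y."""
    return sum(a * b for a, b in zip(x, y))


def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


v1 = (1, 1, 0)
v2 = (0, 1, 1)
normal = cross(v1, v2)  # spans the orthogonal complement of Span(v1, v2)
print(normal, dot(normal, v1), dot(normal, v2))
```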

Orthogonal projection

Theorem

Let $V$ be a subspace of $\mathbb{R}^n$. Then any vector $x \in \mathbb{R}^n$ is uniquely represented as $x = p + o$, where $p \in V$ and $o \in V^\perp$.

Idea of the proof

Let $v_1, v_2, \dots, v_k$ be a basis for $V$ and $w_1, w_2, \dots, w_m$ be a basis for $V^\perp$. Then $v_1, v_2, \dots, v_k, w_1, w_2, \dots, w_m$ is a linearly independent set. Since $k + m = n$, it is a basis for $\mathbb{R}^n$.

In the above expansion, $p$ is called the orthogonal projection of the vector $x$ onto the subspace $V$.

Theorem

$\|x - v\| > \|x - p\|$ for any $v \neq p$ in $V$.

Thus $\|o\| = \|x - p\| = \min_{v \in V} \|x - v\|$ is the distance from the vector $x$ to the subspace $V$.

Remember that:

Orthogonal complement

Definition

Let $S \subset \mathbb{R}^n$. The orthogonal complement of $S$, denoted $S^\perp$, is the set of all vectors $x \in \mathbb{R}^n$ that are orthogonal to $S$.

Also remember that

Theorem

i) $S^\perp$ is a subspace of $\mathbb{R}^n$

ii) $(S^\perp)^\perp = \mathrm{Span}(S)$

and

Theorem

If $V$ is a subspace of $\mathbb{R}^n$, then

i) $(V^\perp)^\perp = V$

ii) $V \cap V^\perp = \{0\}$

iii) $\dim(V) + \dim(V^\perp) = n$

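Theorem

If $V$ is the row space of a matrix, then $V^\perp$ is the nullspace of the same matrix.

Orthogonal projection onto a subspace

We already saw that

Theorem

Let $V$ be a subspace of $\mathbb{R}^n$. Then any vector $x \in \mathbb{R}^n$ is uniquely represented as $x = p + o$, where $p \in V$ and $o \in V^\perp$.

Orthogonal projection onto a vector

Let $x, y \in \mathbb{R}^n$ with $y \neq 0$. Then there exists a unique decomposition $x = p + o$ such that $p$ is parallel to $y$ and $o$ is orthogonal to $y$; $p$ is known as the orthogonal projection of $x$ onto $y$.

We have $p = \alpha y$ for some $\alpha \in \mathbb{R}$. Then

$$0 = o \cdot y = (x - \alpha y) \cdot y = x \cdot y - \alpha\, y \cdot y \;\Rightarrow\; \alpha = \frac{x \cdot y}{y \cdot y} \;\Rightarrow\; p = \frac{x \cdot y}{y \cdot y}\, y.$$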

Example 11.12

Find the distance from the point $x = (3, 1)$ to the line spanned by $y = (2, -1)$.

Solution

Consider the decomposition $x = p + o$, where $p$ is parallel to $y$ and $o$ is orthogonal to $y$. The required distance is the length of the orthogonal component $o$:

$$p = \frac{x \cdot y}{y \cdot y}\, y = \frac{5}{5}(2, -1) = (2, -1),$$

$$o = x - p = (3, 1) - (2, -1) = (1, 2) \;\Rightarrow\; \|o\| = \sqrt{5}.$$

Example 11.13

Find the point on the line $y = -x$ that is closest to the point $(3, 4)$.

Solution

The required point is the projection $p$ of $v = (3, 4)$ on the vector $w = (1, -1)$ spanning the line $y = -x$:

$$p = \frac{v \cdot w}{w \cdot w}\, w = \frac{-1}{2}(1, -1) = \left( -\tfrac{1}{2}, \tfrac{1}{2} \right).$$

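Example 11.14

Let $\Pi$ be the plane spanned by vectors $v_1 = (1, 1, 0)$ and $v_2 = (0, 1, 1)$.

i) Find the orthogonal projection of the vector $x = (4, 0, -1)$ onto the plane $\Pi$.

ii) Find the distance from $x$ to $\Pi$.

Solution

We have $x = p + o$, where $p \in \Pi$ and $o \in \Pi^\perp$. Then the orthogonal projection of $x$ onto $\Pi$ is $p$, and the distance from $x$ to $\Pi$ is $\|o\|$.

We have $p = \alpha v_1 + \beta v_2$ for some $\alpha, \beta \in \mathbb{R}$. Then $o = x - p = x - \alpha v_1 - \beta v_2$, and

$$\begin{cases} o \cdot v_1 = 0 \\ o \cdot v_2 = 0 \end{cases} \Rightarrow \begin{cases} \alpha(v_1 \cdot v_1) + \beta(v_2 \cdot v_1) = x \cdot v_1 \\ \alpha(v_1 \cdot v_2) + \beta(v_2 \cdot v_2) = x \cdot v_2 \end{cases} \Rightarrow \begin{cases} 2\alpha + \beta = 4 \\ \alpha + 2\beta = -1 \end{cases} \Rightarrow \begin{cases} \alpha = 3 \\ \beta = -2 \end{cases}$$

$$p = \alpha v_1 + \beta v_2 = (3, 1, -2), \qquad o = x - p = (1, -1, 1) \;\Rightarrow\; \|o\| = \sqrt{3}.$$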
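The computation in Example 11.14 follows a fixed recipe: form the scalar products, solve the $2 \times 2$ system for $\alpha$ and $\beta$, then assemble $p$ and $o$. A minimal sketch of that recipe in plain Python follows; the function name `project_onto_plane` is made up for this illustration, Cramer's rule is used for the $2 \times 2$ solve, and integer input vectors are assumed so that `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def dot(x, y):
    """Scalar product x . y."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_plane(x, v1, v2):
    """Split x = p + o with p in Span(v1, v2) and o orthogonal to v1 and v2.
    Assumes v1 and v2 are linearly independent vectors with integer entries."""
    g11, g12, g22 = dot(v1, v1), dot(v1, v2), dot(v2, v2)
    b1, b2 = dot(x, v1), dot(x, v2)
    det = g11 * g22 - g12 * g12
    alpha = Fraction(b1 * g22 - g12 * b2, det)  # Cramer's rule
    beta = Fraction(g11 * b2 - g12 * b1, det)
    p = [alpha * a + beta * b for a, b in zip(v1, v2)]
    o = [xi - pi for xi, pi in zip(x, p)]
    return p, o


p, o = project_onto_plane((4, 0, -1), (1, 1, 0), (0, 1, 1))
squared_distance = dot(o, o)  # the distance from x to the plane is its square root
print(p, o, squared_distance)
```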

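Least Squares Problems

Let us consider the overdetermined system of linear equations:

$$\begin{cases} x + 2y = 3 \\ 3x + 2y = 5 \\ x + y = 2.09 \end{cases} \Rightarrow \begin{cases} x + 2y = 3 \\ -4y = -4 \\ -y = -0.91 \end{cases}$$

The last two equations contradict each other, so the system has no solution.

Now, assume that a solution $(x_0, y_0)$ does in fact exist but the system is not quite accurate; namely, there may be some errors in the right-hand sides (rounding errors, for instance).

Problem

Find a good approximation of $(x_0, y_0)$.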

One approach is the least squares fit. Namely, we look for a pair $(x, y)$ that minimizes the sum

\[
(x + 2y - 3)^2 + (3x + 2y - 5)^2 + (x + y - 2.09)^2
\]

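Least squares solution

System of linear equations:

\[
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
\qquad\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n = b_m
\end{cases}
\]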

(103)

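The least squares solution $x$ to the system is the one that minimizes $\|r(x)\|$ (or, equivalently, $\|r(x)\|^2$), where $r(x) = b - Ax$ is the residual of the system written in matrix form $Ax = b$.

\[
\|r(x)\|^2 = \sum_{i=1}^{m} \left(a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n - b_i\right)^2
\]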
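To make the quantity being minimized concrete, the sketch below evaluates $\|r(x)\|^2$ for the three-equation system from the start of this section. The function name `squared_residual` and the two candidate points are assumptions chosen only for comparison; the decimal 2.09 is stored as an exact fraction.

```python
from fractions import Fraction

# Coefficient rows and right-hand sides of  x + 2y = 3,  3x + 2y = 5,  x + y = 2.09
A = [[1, 2], [3, 2], [1, 1]]
b = [Fraction(3), Fraction(5), Fraction("2.09")]

def squared_residual(A, b, x):
    """Return ||r(x)||^2 = sum_i (a_i1 x_1 + ... + a_in x_n - b_i)^2."""
    return sum((sum(a * xj for a, xj in zip(row, x)) - bi) ** 2
               for row, bi in zip(A, b))

# Two candidate points, picked only to compare their residuals.
s1 = squared_residual(A, b, [1, 1])
s2 = squared_residual(A, b, [Fraction(1), Fraction("1.01")])
assert s2 < s1   # the second candidate fits the three equations better
```

Comparing candidates this way does not find the minimizer; it only illustrates what "smaller residual" means.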

Let $A$ be an $m \times n$ matrix and let $b \in \mathbb{R}^m$.

Theorem

A vector $\hat{x}$ is a least squares solution of the system $Ax = b$ if and only if it is a solution of the associated normal system $A^T A x = A^T b$.

(104)

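Proof

$Ax$ is an arbitrary vector in $R(A)$, the column space of $A$. Hence the length of $r(x) = b - Ax$ is minimal if $Ax$ is the orthogonal projection of $b$ onto $R(A)$, that is, if $r(x)$ is orthogonal to $R(A)$.

We know that $(\text{row space})^\perp = \text{nullspace}$ for any matrix. Since the column space of $A$ is the row space of $A^T$, we get in particular $R(A)^\perp = N(A^T)$, the nullspace of the transpose matrix of $A$. Thus, $\hat{x}$ is a least squares solution if and only if

\[
A^T r(\hat{x}) = 0 \iff A^T (b - A\hat{x}) = 0 \iff A^T A \hat{x} = A^T b
\]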

(105)

Corollary

The normal system $A^T A x = A^T b$ is always consistent.

Example 11.15

Find the least squares solution to

\[
\begin{cases} x + 2y = 3 \\ 3x + 2y = 5 \\ x + y = 2.09 \end{cases}
\]

Solution

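(106)

\[
\begin{pmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} 3 \\ 5 \\ 2.09 \end{pmatrix}
\]

and the normal system is

\[
\begin{pmatrix} 1 & 3 & 1 \\ 2 & 2 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} 1 & 3 & 1 \\ 2 & 2 & 1 \end{pmatrix}
\begin{pmatrix} 3 \\ 5 \\ 2.09 \end{pmatrix}
\]

\[
\begin{pmatrix} 11 & 9 \\ 9 & 9 \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
=
\begin{pmatrix} 20.09 \\ 18.09 \end{pmatrix}
\;\Rightarrow\;
\begin{cases} x = 1 \\ y = 1.01 \end{cases}
\]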
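The steps of Example 11.15 (form $A^T A$ and $A^T b$, then solve) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the method of any particular library: the helper names `transpose`, `matmul`, `solve`, and `least_squares` are hypothetical, the linear solver is a basic Gauss-Jordan elimination over exact fractions, and $A$ is assumed to have linearly independent columns so that $A^T A$ is invertible.

```python
from fractions import Fraction

def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(M, N):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination (M square and invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]        # scale pivot row to 1
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def least_squares(A, b):
    """Least squares solution of A x = b via the normal system A^T A x = A^T b."""
    At = transpose(A)
    AtA = matmul(At, A)
    Atb = [sum(a * bi for a, bi in zip(row, b)) for row in At]
    return solve(AtA, Atb)

# Data of Example 11.15; 2.09 is stored exactly.
A = [[1, 2], [3, 2], [1, 1]]
b = [Fraction(3), Fraction(5), Fraction("2.09")]
x_hat = least_squares(A, b)
assert x_hat == [1, Fraction("1.01")]
```

Exact fractions are used so the result can be compared with `==`; with floating-point data one would compare up to a tolerance instead.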

(107)

Example 11.16

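Find the constant function that is the least squares fit to the following data

\[
\begin{array}{c|cccc}
x & 0 & 1 & 2 & 3 \\ \hline
f(x) & 1 & 0 & 1 & 2
\end{array}
\]

Solution

\[
f(x) = c \;\Rightarrow\;
\begin{cases} c = 1 \\ c = 0 \\ c = 1 \\ c = 2 \end{cases}
\;\Rightarrow\;
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} c
=
\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\]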

(108)

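Then, the normal system is

\[
\begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} c
=
\begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\]

\[
c = \tfrac{1}{4}(1 + 0 + 1 + 2) = 1 \quad \text{(the arithmetic mean)}
\]

Thus, the constant function is $f(x) = 1$.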

(109)

Example 11.17

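Find the linear polynomial function that is the least squares fit to the following data

\[
\begin{array}{c|cccc}
x & 0 & 1 & 2 & 3 \\ \hline
f(x) & 1 & 0 & 1 & 2
\end{array}
\]

Solution

\[
f(x) = c_1 + c_2 x \;\Rightarrow\;
\begin{cases} c_1 = 1 \\ c_1 + c_2 = 0 \\ c_1 + 2c_2 = 1 \\ c_1 + 3c_2 = 2 \end{cases}
\;\Rightarrow\;
\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\]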

(110)

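Then, the normal system is

\[
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\]

\[
\begin{pmatrix} 4 & 6 \\ 6 & 14 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
=
\begin{pmatrix} 4 \\ 8 \end{pmatrix}
\;\Rightarrow\;
\begin{cases} c_1 = 0.4 \\ c_2 = 0.4 \end{cases}
\]

Thus, the linear function is $f(x) = 0.4 + 0.4x$.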

(111)

Example 11.18

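Find the quadratic polynomial function that is the least squares fit to the following data

\[
\begin{array}{c|cccc}
x & 0 & 1 & 2 & 3 \\ \hline
f(x) & 1 & 0 & 1 & 2
\end{array}
\]

Solution

\[
f(x) = c_1 + c_2 x + c_3 x^2 \;\Rightarrow\;
\begin{cases} c_1 = 1 \\ c_1 + c_2 + c_3 = 0 \\ c_1 + 2c_2 + 4c_3 = 1 \\ c_1 + 3c_2 + 9c_3 = 2 \end{cases}
\]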

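(112)

\[
\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\]

Then, the normal system is

\[
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix}
\]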

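(113)

\[
\begin{pmatrix} 4 & 6 & 14 \\ 6 & 14 & 36 \\ 14 & 36 & 98 \end{pmatrix}
\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}
=
\begin{pmatrix} 4 \\ 8 \\ 22 \end{pmatrix}
\;\Rightarrow\;
\begin{cases} c_1 = 0.9 \\ c_2 = -1.1 \\ c_3 = 0.5 \end{cases}
\]

Thus, the quadratic function is $f(x) = 0.9 - 1.1x + 0.5x^2$.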
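Examples 11.16 to 11.18 follow one pattern: build the normal system for a polynomial of a given degree and solve it. The sketch below is one way to write that pattern in plain Python. The function name `polyfit_normal` is hypothetical, the normal system is assembled directly from power sums, and the data are assumed to contain more distinct $x$ values than the degree so that the system has a unique solution.

```python
from fractions import Fraction

def polyfit_normal(xs, ys, degree):
    """Coefficients [c_1, c_2, ...] of the least squares polynomial
    c_1 + c_2 x + c_3 x^2 + ..., obtained from A^T A c = A^T b."""
    n = degree + 1
    # Entry (i, j) of A^T A is sum_k x_k^(i+j); entry i of A^T b is sum_k x_k^i * y_k.
    aug = [[Fraction(sum(x ** (i + j) for x in xs)) for j in range(n)]
           + [Fraction(sum(x ** i * y for x, y in zip(xs, ys)))]
           for i in range(n)]
    # Gauss-Jordan elimination on the augmented matrix.
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Data table shared by Examples 11.16, 11.17 and 11.18.
xs, ys = [0, 1, 2, 3], [1, 0, 1, 2]
constant = polyfit_normal(xs, ys, 0)
linear = polyfit_normal(xs, ys, 1)
quadratic = polyfit_normal(xs, ys, 2)
assert constant == [1]
assert linear == [Fraction(2, 5), Fraction(2, 5)]
assert quadratic == [Fraction(9, 10), Fraction(-11, 10), Fraction(1, 2)]
```

The three results are the exact forms of $c = 1$, $(c_1, c_2) = (0.4, 0.4)$, and $(c_1, c_2, c_3) = (0.9, -1.1, 0.5)$.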

(114)

Orthogonal sets

Let $\langle \cdot, \cdot \rangle$ denote the scalar product in $\mathbb{R}^n$.

Definition

Nonzero vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ form an orthogonal set if they are orthogonal to each other: $\langle v_i, v_j \rangle = 0$ for all $i \neq j$.

If, in addition, all vectors are of unit length, then $v_1, v_2, \dots, v_k$ is called an orthonormal set.

For instance, the standard basis $e_1 = (1, 0, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, $\dots$, $e_n = (0, 0, 0, \dots, 1)$ is an orthonormal set.

(115)

Orthonormal bases

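Suppose $v_1, v_2, \dots, v_n$ is an orthonormal basis for $\mathbb{R}^n$ (i.e., it is a basis and an orthonormal set).

Theorem

Let $x = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n$ and $y = y_1 v_1 + y_2 v_2 + \cdots + y_n v_n$, where $x_i, y_i \in \mathbb{R}$. Then

i) $\langle x, y \rangle = \sum_{i=1}^{n} x_i y_i$

ii) $\|x\| = \sqrt{\sum_{i=1}^{n} x_i^2}$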

(116)

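Proof

i)

\[
\langle x, y \rangle
= \left\langle \sum_{i=1}^{n} x_i v_i, \sum_{j=1}^{n} y_j v_j \right\rangle
= \sum_{i=1}^{n} x_i \left\langle v_i, \sum_{j=1}^{n} y_j v_j \right\rangle
= \sum_{i=1}^{n} x_i \sum_{j=1}^{n} y_j \langle v_i, v_j \rangle
= \sum_{i=1}^{n} x_i y_i
\]

The last step uses $\langle v_i, v_j \rangle = 1$ if $i = j$ and $0$ otherwise.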
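Both identities of the theorem can be verified on a small example. The sketch below uses an orthonormal basis of $\mathbb{R}^2$ with rational entries so that the comparison is exact; the basis, the coordinates, and the helper names `dot` and `combine` are illustrative assumptions, not taken from the notes.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def combine(coeffs, basis):
    """Return sum_i coeffs[i] * basis[i] as a list of components."""
    return [sum(c * v[k] for c, v in zip(coeffs, basis))
            for k in range(len(basis[0]))]

# An orthonormal basis of R^2 with rational entries (illustrative choice).
v1 = [Fraction(3, 5), Fraction(4, 5)]
v2 = [Fraction(-4, 5), Fraction(3, 5)]
basis = [v1, v2]
assert dot(v1, v1) == 1 and dot(v2, v2) == 1 and dot(v1, v2) == 0

xc, yc = [2, 1], [-1, 3]            # coordinates relative to the basis
x, y = combine(xc, basis), combine(yc, basis)

assert dot(x, y) == dot(xc, yc)     # i)  <x, y> = sum of x_i * y_i
assert dot(x, x) == dot(xc, xc)     # ii) ||x||^2 = sum of x_i^2
```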

(117)

Suppose $V$ is a subspace of $\mathbb{R}^n$. Let $p$ be the orthogonal projection of a vector $x \in \mathbb{R}^n$ onto $V$.

If $V$ is a one-dimensional subspace spanned by a vector $v$, then

\[
p = \frac{\langle x, v \rangle}{\langle v, v \rangle}\, v
\]

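If $V$ admits an orthogonal basis $v_1, v_2, \dots, v_k$, then

\[
p = \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1
  + \frac{\langle x, v_2 \rangle}{\langle v_2, v_2 \rangle}\, v_2
  + \cdots
  + \frac{\langle x, v_k \rangle}{\langle v_k, v_k \rangle}\, v_k
\]

Indeed,

\[
\langle p, v_i \rangle
= \sum_{j=1}^{k} \frac{\langle x, v_j \rangle}{\langle v_j, v_j \rangle} \langle v_j, v_i \rangle
= \frac{\langle x, v_i \rangle}{\langle v_i, v_i \rangle} \langle v_i, v_i \rangle
= \langle x, v_i \rangle
\;\Rightarrow\; \langle x - p, v_i \rangle = 0
\;\Rightarrow\; (x - p) \perp v_i
\;\Rightarrow\; (x - p) \perp V
\]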
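The projection formula relies on the basis being orthogonal. The sketch below (hypothetical helper `project_onto_span`) applies it to the plane $\Pi$ of Example 11.14 using the orthogonal basis $(1, 1, 0)$, $(-1, 1, 2)$, where $(-1, 1, 2) = 2v_2 - v_1$ is a vector constructed here for illustration, and then shows that the same formula applied to the original non-orthogonal spanning vectors does not produce the projection.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def project_onto_span(x, basis):
    """Sum of <x, v>/<v, v> * v over the basis.

    This equals the orthogonal projection onto span(basis) only when
    the basis vectors are pairwise orthogonal.
    """
    p = [Fraction(0)] * len(x)
    for v in basis:
        coeff = Fraction(dot(x, v)) / Fraction(dot(v, v))
        p = [pi + coeff * vi for pi, vi in zip(p, v)]
    return p

x = [4, 0, -1]

# Orthogonal basis of the plane from Example 11.14.
u1, u2 = [1, 1, 0], [-1, 1, 2]
assert dot(u1, u2) == 0
p = project_onto_span(x, [u1, u2])
assert p == [3, 1, -2]                  # agrees with Example 11.14

# Original spanning vectors: not orthogonal, so the formula breaks down.
w1, w2 = [1, 1, 0], [0, 1, 1]
q = project_onto_span(x, [w1, w2])
leftover = [a - b for a, b in zip(x, q)]
assert dot(leftover, w1) != 0           # x - q is not orthogonal to the plane
```

For a non-orthogonal spanning set one has to solve the linear system for the coefficients instead, as was done in Example 11.14.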

(118)

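Coordinates relative to an orthogonal basis

Theorem

If $v_1, v_2, \dots, v_n$ is an orthogonal basis for $\mathbb{R}^n$, then

\[
x = \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1
  + \frac{\langle x, v_2 \rangle}{\langle v_2, v_2 \rangle}\, v_2
  + \cdots
  + \frac{\langle x, v_n \rangle}{\langle v_n, v_n \rangle}\, v_n
\]

for any vector $x \in \mathbb{R}^n$.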

Corollary

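If $v_1, v_2, \dots, v_n$ is an orthonormal basis for $\mathbb{R}^n$, then

\[
x = \langle x, v_1 \rangle v_1 + \langle x, v_2 \rangle v_2 + \cdots + \langle x, v_n \rangle v_n
\]

for any vector $x \in \mathbb{R}^n$.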
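As an illustration of the coordinate formula, the sketch below computes the coordinates of a vector relative to an orthogonal (not orthonormal) basis of $\mathbb{R}^3$ and rebuilds the vector from them. The basis, the test vector, and the helper name `coordinates` are assumptions chosen for this example.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def coordinates(x, basis):
    """Coordinates <x, v_i>/<v_i, v_i> of x relative to an orthogonal basis."""
    return [Fraction(dot(x, v)) / Fraction(dot(v, v)) for v in basis]

# An orthogonal basis of R^3 (illustrative choice); check pairwise orthogonality.
basis = [[1, 1, 0], [-1, 1, 2], [1, -1, 1]]
assert all(dot(basis[i], basis[j]) == 0 for i in range(3) for j in range(i))

x = [4, 0, -1]
coords = coordinates(x, basis)
rebuilt = [sum(c * v[k] for c, v in zip(coords, basis)) for k in range(3)]
assert rebuilt == x   # x is recovered from its coordinates
```

For an orthonormal basis every denominator `dot(v, v)` equals 1, which gives the simpler formula of the corollary.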