Linear Algebra. Week 10
Dr. Marco A Roque Sol
Eigenvalues and eigenvectors of an operator
Definition.
Let $V$ be a vector space and $L: V \to V$ be a linear operator. A number $\lambda$ is called an eigenvalue of the operator $L$ if $L(v) = \lambda v$ for a nonzero vector $v \in V$. The vector $v$ is called an eigenvector of $L$ associated with the eigenvalue $\lambda$. (If $V$ is a functional space, then eigenvectors are also called eigenfunctions.)
If $V = \mathbb{R}^n$, then the linear operator $L$ is given by $L(x) = Ax$, where $A$ is an $n \times n$ matrix. In this case, eigenvalues and eigenvectors of the operator $L$ are precisely eigenvalues and eigenvectors of the matrix $A$.
Suppose $L: V \to V$ is a linear operator on a finite-dimensional vector space $V$.
Let $\{u_1, u_2, \dots, u_n\}$ be a basis for $V$ and $g: V \to \mathbb{R}^n$ be the corresponding coordinate mapping. Let $A$ be the matrix of $L$ with respect to this basis. Then
$L(v) = \lambda v \iff A\,g(v) = \lambda\,g(v)$
Hence the eigenvalues of $L$ coincide with those of the matrix $A$. Moreover, the associated eigenvectors of $A$ are coordinates of the eigenvectors of $L$.
Definition.
The characteristic polynomial $p(\lambda) = \det(A - \lambda I)$ of the matrix $A$ is called the characteristic polynomial of the operator $L$.
Then the eigenvalues of $L$ are the roots of its characteristic polynomial.
Theorem.
The characteristic polynomial of the operator $L$ is well defined. That is, it does not depend on the choice of a basis.
Proof
Let $B$ be the matrix of $L$ with respect to a different basis $\{v_1, v_2, \dots, v_n\}$. Then $A = UBU^{-1}$, where $U$ is the transition matrix from the basis $\{v_1, v_2, \dots, v_n\}$ to $\{u_1, u_2, \dots, u_n\}$. We have to show that $\det(A - \lambda I) = \det(B - \lambda I)$ for all $\lambda \in \mathbb{R}$. We obtain
$\det(A - \lambda I) = \det(UBU^{-1} - \lambda I) = \det(U[B - \lambda I]U^{-1}) = \det(U)\,\det(B - \lambda I)\,\det(U^{-1}) = \det(B - \lambda I)$
Eigenspaces
Let $L: V \to V$ be a linear operator. For any $\lambda \in \mathbb{R}$, let $V_\lambda$ denote the set of all solutions of the equation $L(x) = \lambda x$.
Then $V_\lambda$ is a subspace of $V$, since $V_\lambda$ is the kernel of the linear operator given by $x \mapsto L(x) - \lambda x$.
$V_\lambda$ minus the zero vector is the set of all eigenvectors of $L$ associated with the eigenvalue $\lambda$. In particular, $\lambda \in \mathbb{R}$ is an eigenvalue of $L$ if and only if $V_\lambda \neq \{0\}$.
If $V_\lambda \neq \{0\}$, then it is called the eigenspace of $L$ corresponding to the eigenvalue $\lambda$.
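Let $V = C^\infty(\mathbb{R})$ and $D: V \to V$, $Df = f'$.
A function $f \in C^\infty(\mathbb{R})$ is an eigenfunction of the operator $D$ belonging to an eigenvalue $\lambda$ if $f'(x) = \lambda f(x)$ for all $x \in \mathbb{R}$.
It follows that $f(x) = ce^{\lambda x}$, where $c$ is a nonzero constant.
Thus each $\lambda \in \mathbb{R}$ is an eigenvalue of $D$. The corresponding eigenspace is one-dimensional, spanned by $e^{\lambda x}$.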
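Let $V = C^\infty(\mathbb{R})$ and $L: V \to V$, $Lf = f''$.
A function $f \in C^\infty(\mathbb{R})$ is an eigenfunction of the operator $L$ belonging to an eigenvalue $\lambda$ if
$f''(x) = \lambda f(x) \Rightarrow f''(x) - \lambda f(x) = 0$ for all $x \in \mathbb{R}$
Note that $Lf = D^2 f$, hence $Df = \mu f \Rightarrow Lf = \mu^2 f$.
If $\lambda > 0$, then $V_\lambda = \mathrm{Span}(e^{\mu x}, e^{-\mu x})$ with $\mu = \sqrt{\lambda}$.
If $\lambda < 0$, then $V_\lambda = \mathrm{Span}(\sin(\mu x), \cos(\mu x))$ with $\mu = \sqrt{-\lambda}$.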
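Let $V$ be a vector space and $L: V \to V$ be a linear operator.
Theorem
If $v \in V$ is an eigenvector of the operator $L$, then the associated eigenvalue is unique.
Proof
Suppose $L(v) = \lambda_1 v$ and $L(v) = \lambda_2 v$. Then $\lambda_1 v = \lambda_2 v \Rightarrow (\lambda_1 - \lambda_2)v = 0 \Rightarrow \lambda_1 = \lambda_2$, since $v \neq 0$.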
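Theorem
Suppose $v_1$ and $v_2$ are eigenvectors of $L$ associated with different eigenvalues $\lambda_1$ and $\lambda_2$. Then $v_1$ and $v_2$ are linearly independent.
Proof
For any scalar $t \neq 0$ the vector $tv_1$ is also an eigenvector of $L$ associated with the eigenvalue $\lambda_1$. Since $\lambda_1 \neq \lambda_2$, it follows that $tv_1 \neq v_2$. That is, $v_2$ is not a scalar multiple of $v_1$. Similarly, $v_1$ is not a scalar multiple of $v_2$. Hence $v_1$ and $v_2$ are linearly independent.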
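Let $L: V \to V$ be a linear operator.
Theorem
If $v_1$, $v_2$, and $v_3$ are eigenvectors of $L$ associated with distinct eigenvalues $\lambda_1$, $\lambda_2$, and $\lambda_3$, then they are linearly independent.
Proof
Suppose that $t_1 v_1 + t_2 v_2 + t_3 v_3 = 0$ for some $t_1, t_2, t_3 \in \mathbb{R}$. Then $L(t_1 v_1 + t_2 v_2 + t_3 v_3) = 0$, that is,
$t_1 L(v_1) + t_2 L(v_2) + t_3 L(v_3) = 0 \Rightarrow t_1\lambda_1 v_1 + t_2\lambda_2 v_2 + t_3\lambda_3 v_3 = 0$
It follows that
$t_1\lambda_1 v_1 + t_2\lambda_2 v_2 + t_3\lambda_3 v_3 - \lambda_3(t_1 v_1 + t_2 v_2 + t_3 v_3) = t_1(\lambda_1 - \lambda_3)v_1 + t_2(\lambda_2 - \lambda_3)v_2 = 0$
By the above, $v_1$ and $v_2$ are linearly independent.
Hence $t_1(\lambda_1 - \lambda_3) = t_2(\lambda_2 - \lambda_3) = 0 \Rightarrow t_1 = t_2 = 0$. Then $t_3 v_3 = 0$, and hence $t_3 = 0$ as well.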
Theorem
If $v_1, v_2, \dots, v_k$ are eigenvectors of a linear operator $L$ associated with distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$, then $v_1, v_2, \dots, v_k$ are linearly independent.
Corollary
If $\lambda_1, \lambda_2, \dots, \lambda_k$ are distinct real numbers, then the functions $e^{\lambda_1 x}, e^{\lambda_2 x}, \dots, e^{\lambda_k x}$ are linearly independent.
Proof
Consider the linear operator $D: C^\infty(\mathbb{R}) \to C^\infty(\mathbb{R})$ given by $Df = f'$.
Then $e^{\lambda_1 x}, e^{\lambda_2 x}, \dots, e^{\lambda_k x}$ are eigenfunctions of $D$ associated with distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$. By the theorem, the eigenfunctions are linearly independent.
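Basis of eigenvectors
Let $V$ be a finite-dimensional vector space and $L: V \to V$ be a linear operator. Let $v_1, v_2, \dots, v_n$ be a basis for $V$ and $A$ be the matrix of the operator $L$ with respect to this basis.
Theorem
The matrix $A$ is diagonal if and only if the vectors $v_1, v_2, \dots, v_n$ are eigenvectors of $L$. If this is the case, then the diagonal entries of the matrix $A$ are the corresponding eigenvalues of $L$.
Proof
The $i$-th column of $A$ consists of the coordinates of $L(v_i)$ in the basis $v_1, v_2, \dots, v_n$. Therefore
\[ L(v_i) = \lambda_i v_i \ \text{for } i = 1, \dots, n \iff A = \begin{pmatrix} \lambda_1 & & & 0 \\ & \lambda_2 & & \\ & & \ddots & \\ 0 & & & \lambda_n \end{pmatrix} \]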
How to find a basis of eigenvectors
We know that if $v_1, v_2, \dots, v_k$ are eigenvectors of a linear operator $L$ associated with distinct eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_k$, then $v_1, v_2, \dots, v_k$ are linearly independent. The following corollary is a consequence of this result.
Corollary
Suppose $\lambda_1, \lambda_2, \dots, \lambda_k$ are all the eigenvalues of a linear operator $L: V \to V$. For any $1 \le i \le k$ let $S_i$ be a basis for the eigenspace associated with the eigenvalue $\lambda_i$. Then these bases are disjoint and the union $S = S_1 \cup S_2 \cup \dots \cup S_k$ is a linearly independent set.
Moreover, if the vector space $V$ admits a basis consisting of eigenvectors of $L$, then $S$ is such a basis.
Corollary
Let $A$ be an $n \times n$ matrix such that the characteristic equation $\det(A - \lambda I) = 0$ has $n$ distinct roots. Then (i) there is a basis for $\mathbb{R}^n$ consisting of eigenvectors of $A$; (ii) all eigenspaces of $A$ are one-dimensional.
Theorem
Let $L$ be a linear operator on a finite-dimensional vector space $V$. Then the following conditions are equivalent:
(i) the matrix of $L$ with respect to some basis is diagonal;
(ii) there exists a basis for $V$ formed by eigenvectors of $L$.
An operator satisfying these conditions is called diagonalizable.
Theorem
Let $A$ be an $n \times n$ matrix. Then the following conditions are equivalent:
(i) $A$ is the matrix of a diagonalizable operator;
(ii) $A$ is similar to a diagonal matrix, i.e., it can be represented as $A = UDU^{-1}$, where the matrix $D$ is diagonal;
(iii) there exists a basis for $\mathbb{R}^n$ formed by eigenvectors of $A$.
Example 11.1
Given the matrix
\[ A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \]
determine whether it is diagonalizable.
Solution
The matrix $A$ has two eigenvalues: 1 and 3.
The eigenspace of $A$ associated with the eigenvalue 1 is the line spanned by $v_1 = (-1, 1)$.
The eigenspace of $A$ associated with the eigenvalue 3 is the line spanned by $v_2 = (1, 1)$.
The eigenvectors $v_1, v_2$ form a basis for $\mathbb{R}^2$.
Thus, the matrix $A$ is diagonalizable. Namely, $A = UBU^{-1}$, where
\[ B = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix}, \qquad U = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix} \]
Notice that $U$ is the transition matrix from the basis $v_1, v_2$ to the standard basis or, in other words, the columns of $U$ are the vectors $v_1$ and $v_2$.
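Example 11.2
Given the matrix
\[ A = \begin{pmatrix} 1 & 1 & -1 \\ 1 & 1 & 1 \\ 0 & 0 & 2 \end{pmatrix} \]
determine whether it is diagonalizable.
Solution
The matrix $A$ has two eigenvalues: 0 and 2.
The eigenspace of $A$ associated with the eigenvalue 0 is one-dimensional, spanned by $v_1 = (-1, 1, 0)$, and $S_1 = \{v_1\}$.
The eigenspace of $A$ associated with the eigenvalue 2 is two-dimensional, spanned by $v_2 = (1, 1, 0)$ and $v_3 = (-1, 0, 1)$, and $S_2 = \{v_2, v_3\}$.
The union $S = S_1 \cup S_2 = \{v_1, v_2, v_3\}$ is a linearly independent set, hence it is a basis for $\mathbb{R}^3$.
Thus, the matrix $A$ is diagonalizable. Namely, $A = UBU^{-1}$, where
\[ B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}, \qquad U = \begin{pmatrix} -1 & 1 & -1 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
Notice that $U$ is the transition matrix from the basis $v_1, v_2, v_3$ to the standard basis.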
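Example 11.3
Given the matrix
\[ A = \begin{pmatrix} 4 & 3 \\ 0 & 1 \end{pmatrix} \]
find $A^5$.
Solution
We know that $A = UBU^{-1}$, where
\[ B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \]
Then
\[ A^5 = UBU^{-1}\,UBU^{-1}\,UBU^{-1}\,UBU^{-1}\,UBU^{-1} = UB^5U^{-1} \]
\[ = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1024 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1024 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1024 & 1023 \\ 0 & 1 \end{pmatrix} \]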
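Example 11.4
Given the matrix
\[ A = \begin{pmatrix} 4 & 3 \\ 0 & 1 \end{pmatrix} \]
find $A^k$.
Solution
We know that $A = UBU^{-1}$, where
\[ B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \]
Then
\[ A^k = \underbrace{UBU^{-1}\,UBU^{-1} \cdots UBU^{-1}}_{k \text{ times}} = UB^kU^{-1} \]
\[ = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 4^k & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 4^k & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 4^k & 4^k - 1 \\ 0 & 1 \end{pmatrix} \]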
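The power computation in Examples 11.3 and 11.4 can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `matmul`, `power_by_diagonalization`, and `power_by_repeated_multiplication` are hypothetical, and the matrices $U$, $B$, $U^{-1}$ are the ones given in the examples.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def power_by_diagonalization(U, eigenvalues, U_inv, k):
    """Return U * diag(eigenvalues)^k * U_inv."""
    n = len(eigenvalues)
    # A diagonal matrix is raised to a power entry by entry.
    Bk = [[eigenvalues[i] ** k if i == j else 0 for j in range(n)]
          for i in range(n)]
    return matmul(matmul(U, Bk), U_inv)


def power_by_repeated_multiplication(A, k):
    """Reference computation: start from the identity and multiply by A k times."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, A)
    return result


A = [[4, 3], [0, 1]]
U = [[1, -1], [0, 1]]
U_inv = [[1, 1], [0, 1]]  # inverse of U, as used in the examples

A5 = power_by_diagonalization(U, [4, 1], U_inv, 5)
print(A5)
```

The diagonalized form needs only two matrix products regardless of $k$, whereas the direct approach needs $k$ products.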
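Example 11.5
Given the matrix
\[ A = \begin{pmatrix} 4 & 3 \\ 0 & 1 \end{pmatrix} \]
find $C$ such that $C^2 = A$.
Solution
We know that $A = UBU^{-1}$, where
\[ B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}, \qquad U = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \]
Suppose that $D^2 = B$ for some matrix $D$. Let $C = UDU^{-1}$; then
\[ C^2 = UDU^{-1}\,UDU^{-1} = UD^2U^{-1} = UBU^{-1} = A \]
We can take
\[ D = \begin{pmatrix} \sqrt{4} & 0 \\ 0 & \sqrt{1} \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \]
and then
\[ C = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} \]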
Example 11.6
Coupled Oscillations
1) Case of two masses. Consider two identical bodies of mass $m$ joined up with identical springs on a frictionless track.
Here $A$ and $B$ represent the equilibrium positions of the two masses. Let $x_1(t)$ and $x_2(t)$ be the displacements of the two masses from their equilibrium positions at time $t$, and let $k$ be the spring constant.
By Hooke's law, the force acting on the first body has two parts. (Hooke's law, discovered by the English scientist Robert Hooke in 1660, states that the force $f$ exerted by a coiled spring is directly proportional to its extension $\Delta x$.) The first part is $-kx_1$, due to the leftmost spring, and the second part is $k(x_2 - x_1)$, due to the center spring. The net force acting on the first mass is then
$F_1 = -kx_1 + k(x_2 - x_1) = -2kx_1 + kx_2$
Similarly, the net force acting on the second mass is
$F_2 = kx_1 - 2kx_2$
Applying Newton's second law (in the presence of external forces, an object experiences an acceleration directly proportional to the net external force and inversely proportional to the mass of the object: $F = ma$) gives the following system of differential equations:
$m x_1'' = -2kx_1 + kx_2$
$m x_2'' = kx_1 - 2kx_2$
This can be written in matrix form as follows:
\[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}'' = -\frac{k}{m} \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \]
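Let us now find the eigenvalues of the matrix
\[ A = \begin{pmatrix} 2 & -1 \\ -1 & 2 \end{pmatrix} \]
Recall that the eigenvalues of $A$ are those values of $\lambda$ satisfying the equation $\det(A - \lambda I) = 0$, where $I$ is the $2 \times 2$ identity matrix:
\[ |A - \lambda I| = \begin{vmatrix} 2 - \lambda & -1 \\ -1 & 2 - \lambda \end{vmatrix} = 0 \Rightarrow (2 - \lambda)^2 - 1 = 0 \Rightarrow \lambda_1 = 1,\ \lambda_2 = 3 \]
Let us now find the corresponding eigenvectors. For $\lambda_1 = 1$ the augmented matrix of $(A - \lambda_1 I)v = 0$ is
\[ \left(\begin{array}{cc|c} 1 & -1 & 0 \\ -1 & 1 & 0 \end{array}\right) \]
which is equivalent to
\[ \left(\begin{array}{cc|c} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right) \]
Thus, one unit eigenvector associated with this eigenvalue is
\[ v_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \]
Similarly, one can show that for $\lambda_2 = 3$ one unit eigenvector associated with this eigenvalue is
\[ v_2 = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \]
Now let
\[ V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix} \]
Then $V^{-1}AV = D$, or $A = VDV^{-1}$, where
\[ D = \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \]
Thus, the original system of equations can be written as
\[ X'' = -\frac{k}{m} VDV^{-1}X \Rightarrow (V^{-1}X)'' = -\frac{k}{m} D\,V^{-1}X \]
and making the change of variable $Y = V^{-1}X$, we obtain
\[ \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'' = -\omega_0^2 \begin{pmatrix} 1 & 0 \\ 0 & 3 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}, \qquad \omega_0 = \sqrt{\frac{k}{m}} \]
which gives us the uncoupled equations
$y_1'' = -\omega_0^2 y_1$
$y_2'' = -3\omega_0^2 y_2$
Physical interpretation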
1) Let us look again at the eigenvector
\[ v_1 = \begin{pmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \]
corresponding to the eigenvalue 1. The fact that the components are equal tells us that, in this mode, $x_1$ and $x_2$ are always equal. Consequently, the system oscillates back and forth but the middle spring is never stretched. It is as if we had two masses, each attached to a spring of constant $k$. It is easy to see then that the frequency is given by
\[ \omega_0 = \sqrt{\frac{k}{m}} \]
2) Let us look again at the second eigenvector
\[ v_2 = \begin{pmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{pmatrix} \]
corresponding to the eigenvalue 3.
The components have equal magnitude but opposite signs, so in this mode $x_2 = -x_1$ at all times, which gives an "in and out" type of motion. The frequency of the system is also predictable in this case: each mass is attached to a spring compressed by a distance $x_1$ and to another stretched by a distance of $2x_1$. It is as if the mass were attached to one single spring of constant $3k$. We know that the frequency in this case is
\[ \omega = \sqrt{\frac{3k}{m}} = \sqrt{3}\,\omega_0 \]
These two particular cases are called the Normal Modes (https://www.youtube.com/watch?v=x_ZkKPtgTeA) of vibration of the system. As you can guess, they have the property that if the system starts out in one of these modes, it will remain in this mode.
Of course, the above problem involving two masses can be dealt with without talking about eigenvalues and eigenvectors. The benefit of using that algebraic technique is more apparent in the complicated cases of more than two masses.
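The eigenvalue computation behind the two normal modes can be sketched in a few lines. This is an illustration under stated assumptions rather than the method of the notes: the function names `symmetric_2x2_eigen` and `normal_mode_frequencies` are hypothetical, the closed formula used is the standard one for a symmetric $2 \times 2$ matrix with nonzero off-diagonal entry, and the values $k = 2$, $m = 0.5$ are made up for the demonstration.

```python
import math


def symmetric_2x2_eigen(a, b, d):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    pairs = []
    for lam in (mean - radius, mean + radius):
        # The first row of (A - lam*I) v = 0 is solved by v = (b, lam - a).
        vx, vy = b, lam - a
        length = math.hypot(vx, vy)
        pairs.append((lam, (vx / length, vy / length)))
    return pairs


def normal_mode_frequencies(k, m):
    """Angular frequencies sqrt(lam * k / m) for the matrix [[2, -1], [-1, 2]]."""
    return [math.sqrt(lam * k / m) for lam, _ in symmetric_2x2_eigen(2, -1, 2)]


modes = symmetric_2x2_eigen(2, -1, 2)
frequencies = normal_mode_frequencies(k=2.0, m=0.5)  # hypothetical values
for (lam, vec), omega in zip(modes, frequencies):
    print(lam, vec, omega)
```

An eigenvector is only determined up to sign, so the sketch may return the negative of the vectors written above; what matters is that the components are equal for the eigenvalue 1 and opposite for the eigenvalue 3.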
Example 11.6 Fibonacci Numbers
In 1202, an Italian mathematician known as Fibonacci posed the following problem: A male and a female rabbit are born at the beginning of the year. We assume the following condition:
After reaching the age of two months, each pair produces a mixed pair (one male, one female), and then another mixed pair each month thereafter.
Let $f_n$ denote the number of rabbit pairs at the beginning of month $n$; then we have
$f_0 = 1,\ f_1 = 1,\ f_2 = 2,\ f_3 = 3,\ f_4 = 5,\ \dots,\ f_n = f_{n-1} + f_{n-2}$
In other words, every term in this sequence (starting with $n = 2$) is equal to the sum of the previous two terms. The sequence $1, 1, 2, 3, 5, 8, 13, 21, 34, \dots$ generated according to this scheme was named in honour of Fibonacci.
(Leonardo of Pisa, or Fibonacci, played an important role in reviving ancient mathematics and made significant contributions of his own.)
This sequence occurs in many places in mathematics, general science, in nature and even in art.
In particular, with the indexing above, there are already $f_{11} = 144$ pairs of rabbits at the beginning of month 11, and $f_{12} = 233$ pairs one month later. As you can see, the Fibonacci numbers get bigger and bigger.
A natural question arises here: can we find an expression that gives $f_n$ for any $n$? The answer is yes, if one knows a little bit about matrix diagonalization.
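Let us now see how all this can be applied to answer our problem about the rabbits of Fibonacci. The general pattern for the Fibonacci sequence was:
$f_n = f_{n-1} + f_{n-2}$
A general expression for $f_n$ is not apparent, but a clever twist transforms the problem into a simpler one using matrices. The idea is to compute the vector
\[ V_n = \begin{pmatrix} f_{n+1} \\ f_n \end{pmatrix} \]
for each $n \ge 0$ rather than $f_n$ itself. The relation $f_n = f_{n-1} + f_{n-2}$ gives
\[ V_{n+1} = \begin{pmatrix} f_{n+2} \\ f_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} f_{n+1} \\ f_n \end{pmatrix} = AV_n, \qquad \text{where } A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \]
and therefore
$V_n = AV_{n-1} = A(AV_{n-2}) = A^2V_{n-2} = A^3V_{n-3} = \dots = A^nV_0$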
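But the matrix $A$ is diagonalizable, since the eigenvalues are given by
\[ |A - \lambda I| = \begin{vmatrix} 1 - \lambda & 1 \\ 1 & -\lambda \end{vmatrix} = \lambda^2 - \lambda - 1 = 0 \Rightarrow \lambda_1 = \frac{1 + \sqrt{5}}{2} := \varphi, \quad \lambda_2 = \frac{1 - \sqrt{5}}{2} := \varphi^* \]
with associated eigenvectors
\[ p_1 = \begin{pmatrix} \varphi \\ 1 \end{pmatrix}, \qquad p_2 = \begin{pmatrix} \varphi^* \\ 1 \end{pmatrix} \]
Hence, the matrix $A$ is diagonalized by the matrix of eigenvectors
\[ P = \begin{pmatrix} \varphi & \varphi^* \\ 1 & 1 \end{pmatrix} \]
and its inverse
\[ P^{-1} = \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{1}{\sqrt{5}}\varphi^* \\ -\frac{1}{\sqrt{5}} & \frac{1}{\sqrt{5}}\varphi \end{pmatrix} \]
The above results give the following decomposition of $A$:
\[ A = PDP^{-1} = \begin{pmatrix} \varphi & \varphi^* \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi & 0 \\ 0 & \varphi^* \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{1}{\sqrt{5}}\varphi^* \\ -\frac{1}{\sqrt{5}} & \frac{1}{\sqrt{5}}\varphi \end{pmatrix} \]
Thus, finally, since $V_0 = (f_1, f_0)^T = (1, 1)^T$, we obtain
\[ V_n = A^nV_0 = PD^nP^{-1}V_0 = \begin{pmatrix} \varphi & \varphi^* \\ 1 & 1 \end{pmatrix} \begin{pmatrix} \varphi^n & 0 \\ 0 & (\varphi^*)^n \end{pmatrix} \begin{pmatrix} \frac{1}{\sqrt{5}} & -\frac{1}{\sqrt{5}}\varphi^* \\ -\frac{1}{\sqrt{5}} & \frac{1}{\sqrt{5}}\varphi \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
If you carefully multiply this out (using $1 - \varphi^* = \varphi$ and $\varphi - 1 = -\varphi^*$) and take the second component of the result, you get the $n$th Fibonacci term:
\[ f_n = \frac{1}{\sqrt{5}} \left( \varphi^{n+1} - (\varphi^*)^{n+1} \right) \]
Note
The number
\[ \varphi := \frac{1 + \sqrt{5}}{2} \]
is one of the most mysterious numbers, and is widely known as the golden ratio. It has this name since a rectangle with sides $s$ and $\varphi s$ has a pleasing shape, possibly because of the fact that if you fold the smaller side (of length $s$) into the rectangle, dividing it into a square of side $s$ and a rectangle with sides $s$ and $s(\varphi - 1)$, this smaller rectangle is similar to the one you began with.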
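The three ways of computing $f_n$ discussed in this example can be compared directly. The sketch below is illustrative only: the function names are hypothetical, it uses the convention $f_0 = f_1 = 1$ from the text, and it takes the closed form with a minus sign, $f_n = (\varphi^{n+1} - (\varphi^*)^{n+1})/\sqrt{5}$, which is the version that reproduces $f_0 = f_1 = 1$.

```python
import math


def fib_recurrence(n):
    """f_n from f_n = f_{n-1} + f_{n-2} with f_0 = f_1 = 1."""
    a, b = 1, 1  # (f_0, f_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def fib_matrix(n):
    """f_n as the second component of V_n = A^n V_0, with A = [[1, 1], [1, 0]]."""
    upper, lower = 1, 1  # V_0 = (f_1, f_0)
    for _ in range(n):
        # One multiplication by A maps (u, l) to (u + l, u).
        upper, lower = upper + lower, upper
    return lower


def fib_closed_form(n):
    """f_n from the eigenvalues of A; rounding removes floating-point error."""
    sqrt5 = math.sqrt(5)
    phi = (1 + sqrt5) / 2
    phi_conj = (1 - sqrt5) / 2
    return round((phi ** (n + 1) - phi_conj ** (n + 1)) / sqrt5)


print([fib_closed_form(n) for n in range(10)])
```

The closed form relies on floating-point arithmetic, so it is only trustworthy for moderate $n$; the integer recurrence is exact for any $n$.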
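Example 11.7 Quadratic Equation
A quadratic equation in two variables $x$ and $y$ is an equation of the form
$ax^2 + 2bxy + cy^2 + dx + ey + f = 0$ ... (1)
which can be rewritten as:
\[ \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} d & e \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + f = 0 \] ... (2)
Let
\[ \mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad A = \begin{pmatrix} a & b \\ b & c \end{pmatrix} \]
The term $\mathbf{x}^T A \mathbf{x} = ax^2 + 2bxy + cy^2$ is the quadratic part of the equation.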
Conic Sections
The graph of an equation of the form (1) is called a conic section. [If there are no ordered pairs $(x, y)$ which satisfy (1), we say that the equation represents an imaginary conic.] If the graph of (1) consists of a single point, a line, or a pair of lines, we say that (1) represents a degenerate conic. Of more interest are the nondegenerate conics. Graphs of nondegenerate conics turn out to be circles, ellipses, parabolas, or hyperbolas. The graph of a conic is particularly easy to sketch when its equation can be put into one of the following standard forms:
$x^2 + y^2 = r^2$ ... (Circle)
$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ ... (Ellipse)
$\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1$ or $\frac{y^2}{a^2} - \frac{x^2}{b^2} = 1$ ... (Hyperbola)
$x^2 = ay$ or $y^2 = ax$ ... (Parabola)
Here $a$, $b$ and $r$ are nonzero real numbers. Note that the circle is a special case of the ellipse ($a = b = r$). A conic section is said to be in standard position if its equation can be put into one of these four standard forms. The graphs of the first three will all be symmetric to both coordinate axes and the origin. We say that these curves are centered at the origin. A parabola in standard position will have its vertex at the origin and will be symmetric to one of the axes.
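Now, if the original quadratic equation has no $xy$ term, then the graph is just a standard conic shifted vertically or horizontally.
Thus, for instance, we have the following case:
\[ 9x^2 - 18x + 4y^2 + 16y - 11 = 0 \Rightarrow \frac{(x - 1)^2}{2^2} + \frac{(y + 2)^2}{3^2} = 1 \]
and if we make the change of variable $x' = x - 1$, $y' = y + 2$, we obtain
\[ \frac{(x')^2}{2^2} + \frac{(y')^2}{3^2} = 1 \]
which is in standard form with respect to the variables $x'$ and $y'$.
If, however, the conic section has also been rotated from the standard position, it is necessary to change coordinates so that the equation in terms of the new coordinates $x'$ and $y'$ involves no $x'y'$ term. Let $\mathbf{x} = (x, y)^T$ and $\mathbf{x}' = (x', y')^T$. Since the new coordinates differ from the old coordinates by a rotation, we have
\[ \mathbf{x} = Q\mathbf{x}' \ \text{ or } \ \mathbf{x}' = Q^T\mathbf{x}, \quad \text{where } Q = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \ \text{ or } \ Q^T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
If $0 < \theta < \pi$, then the matrix $Q$ corresponds to a rotation of $\theta$ radians in the clockwise direction and $Q^T$ corresponds to a rotation of $\theta$ radians in the counterclockwise direction. With this change of variables, (2) becomes
\[ (\mathbf{x}')^T (Q^TAQ)\, \mathbf{x}' + \begin{pmatrix} d' & e' \end{pmatrix} \mathbf{x}' + f = 0 \] ... (3)
where $(d' \ e') = (d \ e)Q$. This equation will involve no $x'y'$ term if and only if $Q^TAQ$ is diagonal. Since $A$ is symmetric, it is possible to find a pair of orthonormal eigenvectors $q_1 = (x_1, -y_1)^T$ and $q_2 = (y_1, x_1)^T$. Thus, if we set $\cos\theta = x_1$ and $\sin\theta = y_1$, then $Q = (q_1 \ q_2)$ diagonalizes $A$ and (3) simplifies to
$\lambda_1 (x')^2 + \lambda_2 (y')^2 + d'x' + e'y' + f = 0$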
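Consider the conic section
$3x^2 + 2xy + 3y^2 - 8 = 0$
This equation can be written in the form
\[ \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} - 8 = 0 \]
The matrix
\[ A = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix} \]
has eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 4$ with corresponding unit eigenvectors
\[ q_1 = \left( \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}} \right)^T, \qquad q_2 = \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right)^T \]
Let
\[ Q = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} = \begin{pmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{pmatrix} \]
and set
\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} \]
Then $Q^TAQ = \mathrm{diag}(2, 4)$, so the equation of the conic becomes $2(x')^2 + 4(y')^2 = 8$, that is,
\[ \frac{(x')^2}{4} + \frac{(y')^2}{2} = 1 \]
In the new coordinate system the direction of the $x'$-axis is determined by the point $x' = 1$, $y' = 0$. To translate this to the $xy$ coordinate system, we multiply
\[ Q\mathbf{x}' = \begin{pmatrix} \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} = q_1 \]
The $x'$-axis will be in the direction of $q_1$. Similarly, the direction of the $y'$-axis is determined by $q_2$.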
The Essentials of Quantum Mechanics
In classical mechanics, a particle has an exact, sharply defined position and an exact, sharply defined momentum at all times. Quantum mechanics is a different fundamental formalism, in which observables such as position and momentum are not real numbers but operators; consequently there are uncertainty relations, e.g. $\Delta x\,\Delta p \ge \hbar/2$, which say that as some observables become more sharply defined, others become more uncertain. Experiments show that quantum mechanics, not classical mechanics, is the correct description of nature.
Here is a summary of the essentials of quantum mechanics, focussing on the case of a single non-relativistic particle (e.g. an electron) in one dimension. This is not a complete review of everything you need to know: it is a quick outline of the basics to help you get oriented with this challenging subject.
States. The state of the system is given by a wavefunction $\psi(x)$. The wavefunction is complex, and gives the amplitude for finding the particle at position $x$. The probability density is the square modulus of the amplitude, so in one dimension the probability to find the particle located between $x$ and $x + dx$ is
$P(x)\,dx = |\psi(x)|^2\,dx$ ... (1)
This means that if you start off with an "ensemble" of identical copies of the system, all with the same wavefunction $\psi(x)$, and in each member you measure the position of the particle, then you will get different results from different members of the ensemble. The probability of getting particular answers is given by (1). A physical state must be normalized, so that all the probabilities add up to 1:
\[ \int_{-\infty}^{\infty} |\psi(x)|^2\,dx = 1 \] ... (2)
Operators. Each observable corresponds to a linear operator. A linear operator is something that acts on a state and gives another state, i.e., it changes one function into another.
Position Operator $\hat{x}$, defined by
$\hat{x}(\psi(x)) = x\,\psi(x)$
Momentum Operator $\hat{p}$, defined by
\[ \hat{p}(\psi(x)) = -i\hbar \frac{\partial \psi(x)}{\partial x} \] ... (3)
For example, when the position operator acts on the state $\psi(x) = \frac{1}{a^2 + x^2}$ it gives $\hat{x}(\psi(x)) = \frac{x}{a^2 + x^2}$, while the momentum operator gives $\hat{p}(\psi(x)) = \frac{2i\hbar x}{(a^2 + x^2)^2}$.
Linearity means that an operator $\hat{Y}$ acts on a sum of two states in the obvious way:
$\hat{Y}[a\,\psi_1(x) + b\,\psi_2(x)] = a\,\hat{Y}[\psi_1(x)] + b\,\hat{Y}[\psi_2(x)]$ ... (4)
Expectation values. The expectation value of an observable $\hat{Y}$ in a normalized state $\psi(x)$ is
\[ \langle \psi(x) | \hat{Y} | \psi(x) \rangle = \int_{-\infty}^{\infty} \psi(x)^*\, \hat{Y}[\psi(x)]\,dx \] ... (5)
This means that if you have an ensemble of identical copies of the system, all with the same wavefunction $\psi(x)$, then when you measure the value of the observable $\hat{Y}$ in all the members, the average value that you get is $\langle \psi(x) | \hat{Y} | \psi(x) \rangle$ (sometimes written $\langle \hat{Y} \rangle_{\psi(x)}$, or even $\langle \hat{Y} \rangle$, if it is obvious from the context which state to use).
So the average position of a particle in a normalized state is
\[ \langle \psi(x) | \hat{x} | \psi(x) \rangle = \int_{-\infty}^{\infty} \psi(x)^*\, \hat{x}[\psi(x)]\,dx = \int_{-\infty}^{\infty} \psi(x)^*\, x\, \psi(x)\,dx = \int_{-\infty}^{\infty} x\,|\psi(x)|^2\,dx \] ... (6)
Eigenvalues and eigenstates. Each operator $\hat{Y}$ has a set of eigenvalues $y$, which are the possible values you can get on doing a measurement of $\hat{Y}$. Any measurement of $\hat{Y}$ must yield one of the eigenvalues. Each eigenvalue $y$ is associated with an eigenstate $\varphi_y(x)$, which is the state for which the value of $\hat{Y}$ is exactly $y$:
$\hat{Y}[\varphi_y(x)] = y\,\varphi_y(x)$ ... (7)
So if you create an ensemble of systems that are all in the same state, namely the $\hat{Y}$-eigenstate $\varphi_y(x)$, then when you measure the value of the observable $\hat{Y}$, each member of the ensemble will give you the same answer, namely $y$.
Physically, $\varphi_y(x)$ is the state in which the observable $\hat{Y}$ has the definite value $y$. Note that the eigenvalues of an operator that corresponds to an observable are always real, since they are possible values of that physical observable.
For example, the eigenstates of the momentum operator are plane waves $\psi_p(x) = e^{ipx/\hbar}$:
\[ \hat{p}\, e^{ipx/\hbar} = -i\hbar \frac{\partial e^{ipx/\hbar}}{\partial x} = p\, e^{ipx/\hbar} \] ... (8)
The eigenstates of the position operator are $\delta$-functions, $\psi_{x_1}(x) = \delta(x - x_1)$. The function $\delta(x - x_1)$ is zero everywhere except at $x = x_1$, where it is infinite, so
$\hat{x}(\delta(x - x_1)) = x\,\delta(x - x_1) = x_1\,\delta(x - x_1)$
The formal definition of the $\delta$-function is:
\[ \int f(x)\,\delta(x - x_1)\,dx = f(x_1) \]
Resolving a State into Eigenstates. The eigenstates of any operator $\hat{Y}$ form a complete orthonormal basis of states, so you can write any state $\psi(x)$ in terms of them:
\[ \psi(x) = \sum_y A_y\,\varphi_y(x) \quad \text{or} \quad \psi(x) = \int A(y)\,\varphi_y(x)\,dy \] ... (9)
$A_y$ or $A(y)$ is the amplitude for the value of the observable $\hat{Y}$ for a particle in the state $\psi(x)$ to be $y$. So when you make a measurement of the observable $\hat{Y}$ on a system in the state $\psi(x)$:
Discrete Eigenvalues: $|A_y|^2$ = probability to get the value $y$
Continuous Eigenvalues: $|A(y)|^2\,dy$ = probability to get a value between $y$ and $y + dy$ ... (10)
The only question is: for a given state $\psi(x)$, how do you express it in terms of eigenstates of $\hat{Y}$, i.e., how do you calculate the coefficients $A_y$ or $A(y)$? Actually, thanks to the orthonormality property this is easy: you multiply the state by the complex conjugate of the eigenstate of $\hat{Y}$ with eigenvalue $y$, and integrate:
\[ A_y \ \text{or} \ A(y) = \int \varphi_y^*(x)\,\psi(x)\,dx \] ... (11)
Using (10), we then know the probability of getting any of the allowed values when you perform a measurement of $\hat{Y}$ on a particle in state $\psi(x)$. Formula (11) follows from the orthonormality property of the eigenstates:
\[ \int \varphi_{y_1}^*(x)\,\varphi_{y_2}(x)\,dx = \begin{cases} \delta(y_1 - y_2) & \text{(continuous eigenvalues)} \\ \delta_{y_1, y_2} & \text{(discrete eigenvalues)} \end{cases} \] ... (12)
Time Evolution. The evolution through time of a state is determined by the energy operator, usually called the Hamiltonian $\hat{H}$, via the time-dependent Schrödinger equation
\[ \hat{H}\psi = i\hbar \frac{\partial \psi}{\partial t} \] ... (13)
For a spinless, non-relativistic particle of mass $m$, in a one-dimensional potential $V(x)$, the Hamiltonian is
\[ \hat{H} = \frac{\hat{p}^2}{2m} + V(\hat{x}) = -\frac{\hbar^2}{2m} \frac{\partial^2}{\partial x^2} + V(x) \] ... (14)
Energy Eigenstates. Because time evolution is so important, the energy eigenstates are particularly important. As you would expect from (7), these are defined by
$\hat{H}\psi_E = E\,\psi_E$ ... (15)
which is called the time-independent Schrödinger equation, but is really just the standard equation for the Hamiltonian (i.e. energy) operator's eigenvalues and eigenstates. You can apply the time-dependent Schrödinger equation to the energy eigenstates, and show that they have simple time dependence: they oscillate at a frequency determined by their energy.
\[ \psi_E(x, t) = \psi_E(x, 0)\, e^{-iEt/\hbar} \]
So the easiest way to evolve a state forward in time is to resolve it into energy eigenstates, and let each eigenstate oscillate at its own frequency:
\[ \psi(x, 0) = \sum_E A_E\,\psi_E(x) \] ... (16)
\[ \psi(x, t) = \sum_E A_E\,\psi_E(x)\, e^{-iEt/\hbar} \] ... (17)
The tricky part of this is that you have to calculate the eigenstates and corresponding eigenvalues of the Hamiltonian. This is not always easy to do, but once it is done you have solved the system: you know exactly how it will behave.
Length and distance.
Definition
The length of a vector $v = (v_1, v_2, \dots, v_n) \in \mathbb{R}^n$ is
\[ \|v\| = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2} \]
The distance between vectors $x$ and $y$ is defined as $\|x - y\|$.
Properties of length
$\|x\| \ge 0$, $\|x\| = 0$ only if $x = 0$. (Positivity)
$\|rx\| = |r|\,\|x\|$. (Homogeneity)
Scalar product.
Definition
The scalar product of vectors $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_n)$ is $x \cdot y = x_1y_1 + x_2y_2 + \dots + x_ny_n$.
Alternative notation: $(x, y)$ or $\langle x, y \rangle$.
Properties of the scalar product
$x \cdot x \ge 0$, $x \cdot x = 0$ only if $x = 0$. (Positivity)
$x \cdot y = y \cdot x$. (Symmetry)
$(x + y) \cdot z = x \cdot z + y \cdot z$. (Distributive Law)
$(rx) \cdot y = r(x \cdot y)$. (Homogeneity)
In particular, $x \cdot y$ is a bilinear function (i.e., it is both a linear function of $x$ and a linear function of $y$).
Relations between lengths and scalar products:
$\|x\| = \sqrt{x \cdot x}$
$|x \cdot y| \le \|x\|\,\|y\|$ (Cauchy-Schwarz inequality).
By the Cauchy-Schwarz inequality, for any nonzero vectors $x, y \in \mathbb{R}^n$ we have
\[ |x \cdot y| \le \|x\|\,\|y\| \Rightarrow -\|x\|\,\|y\| \le x \cdot y \le \|x\|\,\|y\| \Rightarrow -1 \le \frac{x \cdot y}{\|x\|\,\|y\|} \le 1 \]
Thus, we define
\[ \cos\theta = \frac{x \cdot y}{\|x\|\,\|y\|} \quad \text{for some } 0 \le \theta \le \pi \]
where $\theta$ is called the angle between the vectors $x$ and $y$. The vectors $x$ and $y$ are said to be orthogonal (denoted by $x \perp y$) if $x \cdot y = 0$, i.e., if $\theta = 90^\circ$.
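Example 11.9
Find the angle $\theta$ between vectors $x = (2, -1)$ and $y = (3, 1)$.
Solution
$x \cdot y = 5$, $\|x\| = \sqrt{5}$, $\|y\| = \sqrt{10}$, so
\[ \cos\theta = \frac{x \cdot y}{\|x\|\,\|y\|} = \frac{5}{\sqrt{5}\sqrt{10}} = \frac{1}{\sqrt{2}} \Rightarrow \theta = 45^\circ \]
Example 11.10
Find the angle $\varphi$ between vectors $v = (2, 1, 3)$ and $w = (4, 5, 1)$.
Solution
$v \cdot w = 16$, $\|v\| = \sqrt{14}$, $\|w\| = \sqrt{42}$, so
\[ \cos\varphi = \frac{v \cdot w}{\|v\|\,\|w\|} = \frac{16}{\sqrt{14}\sqrt{42}} = \frac{8}{7\sqrt{3}} \approx 0.66 \Rightarrow \varphi \approx 48.7^\circ \]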
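The angle formula $\cos\theta = \frac{x \cdot y}{\|x\|\,\|y\|}$ used in Examples 11.9 and 11.10 can be evaluated with a short script. This is a minimal sketch with hypothetical helper names (`dot`, `norm`, `angle_degrees`); it assumes both vectors are nonzero.

```python
import math


def dot(x, y):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean length of a vector."""
    return math.sqrt(dot(x, x))


def angle_degrees(x, y):
    """Angle between nonzero vectors x and y, in degrees."""
    cosine = dot(x, y) / (norm(x) * norm(y))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


theta = angle_degrees((2, -1), (3, 1))      # Example 11.9
phi = angle_degrees((2, 1, 3), (4, 5, 1))   # Example 11.10
print(theta, phi)
```

By the Cauchy-Schwarz inequality the quotient always lies in $[-1, 1]$ in exact arithmetic; the clamp only guards against floating-point rounding.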
Orthogonality
Definition 1
Vectors $x, y \in \mathbb{R}^n$ are said to be orthogonal (denoted by $x \perp y$) if $x \cdot y = 0$.
Definition 2
A vector $x \in \mathbb{R}^n$ is said to be orthogonal to a nonempty set $Y \subset \mathbb{R}^n$ (denoted by $x \perp Y$) if $x \cdot y = 0$ for all $y \in Y$.
Definition 3
Nonempty sets $X, Y \subset \mathbb{R}^n$ are said to be orthogonal sets (denoted by $X \perp Y$) if $x \cdot y = 0$ for all $x \in X$ and $y \in Y$.
The following examples are in $\mathbb{R}^3$.
The line $x = y = 0$ is orthogonal to the line $y = z = 0$. Indeed, if $v = (0, 0, z)$ and $w = (x, 0, 0)$, then $v \cdot w = 0$.
The line $x = y = 0$ is orthogonal to the plane $z = 0$. Indeed, if $v = (0, 0, z)$ and $w = (x, y, 0)$, then $v \cdot w = 0$.
The line $x = y = 0$ is not orthogonal to the plane $z = 1$. The vector $v = (0, 0, 1)$ belongs to both the line and the plane, and $v \cdot v = 1 \neq 0$. (OBS: They form a perpendicular angle between them but they are not orthogonal sets.)
The plane $z = 0$ is not orthogonal to the plane $y = 0$. The vector $v = (1, 0, 0)$ belongs to both planes, and $v \cdot v = 1 \neq 0$. (OBS: They form a perpendicular angle between them but they are not orthogonal sets.)
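Proposition 1
If $X, Y \subset \mathbb{R}^n$ are orthogonal sets, then either they are disjoint or $X \cap Y = \{0\}$.
Proof
$v \in X \cap Y \Rightarrow v \perp v \Rightarrow v \cdot v = 0 \Rightarrow v = 0$
Proposition 2
Let $V$ be a subspace of $\mathbb{R}^n$ and $S$ be a spanning set for $V$. Then for any $x \in \mathbb{R}^n$,
$x \perp S \Rightarrow x \perp V$
Proof
Any $v \in V$ is represented as $v = a_1v_1 + a_2v_2 + \dots + a_kv_k$, where $v_i \in S$ and $a_i \in \mathbb{R}$. Then for any $x \perp S$
$x \cdot v = x \cdot (a_1v_1 + a_2v_2 + \dots + a_kv_k) = a_1(x \cdot v_1) + a_2(x \cdot v_2) + \dots + a_k(x \cdot v_k) = 0$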
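Orthogonal complement
Definition
Let $S \subset \mathbb{R}^n$. The orthogonal complement of $S$, denoted by $S^\perp$, is the set of all vectors $x \in \mathbb{R}^n$ that are orthogonal to $S$. That is, $S^\perp$ is the largest subset of $\mathbb{R}^n$ orthogonal to $S$.
Theorem
$S^\perp$ is a subspace of $\mathbb{R}^n$.
Note that $S \subset (S^\perp)^\perp$, hence $\mathrm{Span}(S) \subset (S^\perp)^\perp$.
Theorem
$(S^\perp)^\perp = \mathrm{Span}(S)$. In particular, for any subspace $V$, we have $(V^\perp)^\perp = V$.
For instance, the line $L = \{(x, 0, 0) : x \in \mathbb{R}\}$ and the plane $\Pi = \{(0, y, z) : y, z \in \mathbb{R}\}$ in $\mathbb{R}^3$ satisfy $L^\perp = \Pi$ and $\Pi^\perp = L$.
Fundamental subspaces I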
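Definitions
Given an $m \times n$ matrix $A$, let
$N(A) = \{x \in \mathbb{R}^n \mid Ax = 0\}$
$R(A) = \{b \in \mathbb{R}^m \mid b = Ax \text{ for some } x \in \mathbb{R}^n\}$
$R(A)$ is the range of the linear mapping $L: \mathbb{R}^n \to \mathbb{R}^m$, $L(x) = Ax$.
Also, $N(A)$ is the nullspace of the matrix $A$, while $R(A)$ is the column space of $A$. The row space of $A$ is $R(A^T)$.
The subspaces $N(A), R(A^T) \subset \mathbb{R}^n$ and $R(A), N(A^T) \subset \mathbb{R}^m$ are the fundamental subspaces associated with the matrix $A$.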
Theorem
$N(A) = R(A^T)^\perp$, $\quad N(A^T) = R(A)^\perp$
That is, the nullspace of a matrix is the orthogonal complement of its row space.
Proof
The equality $Ax = 0$ means that the vector $x$ is orthogonal to the rows of the matrix $A$. Therefore, $N(A) = S^\perp$, where $S$ is the set of rows of $A$. It remains to note that $S^\perp = \mathrm{Span}(S)^\perp = R(A^T)^\perp$.
Corollary
Let $V$ be a subspace of $\mathbb{R}^n$. Then $\dim(V) + \dim(V^\perp) = n$.
Proof
Pick a basis $v_1, v_2, \dots, v_k$ for $V$. Let $A$ be the $k \times n$ matrix whose rows are the vectors $v_1, v_2, \dots, v_k$. Then $V = R(A^T)$, hence $V^\perp = N(A)$. Consequently, $\dim(V)$ and $\dim(V^\perp)$ are the rank and the nullity of $A$. Therefore $\dim(V) + \dim(V^\perp)$ equals the number of columns of $A$, which is $n$.
Example 11.11
Let $V$ be the plane spanned by vectors $v_1 = (1, 1, 0)$ and $v_2 = (0, 1, 1)$. Find $V^\perp$.
Solution
The orthogonal complement to $V$ is the same as the orthogonal complement of the set $\{v_1, v_2\}$. A vector $u = (x, y, z)$ belongs to the latter if and only if
\[ \begin{cases} u \cdot v_1 = 0 \\ u \cdot v_2 = 0 \end{cases} \iff \begin{cases} x + y = 0 \\ y + z = 0 \end{cases} \]
Alternatively, the subspace $V$ is the row space of the matrix
\[ A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \]
hence $V^\perp$ is the nullspace of $A$. The general solution of the system (or, equivalently, the general element of the nullspace of $A$) is $(t, -t, t) = t(1, -1, 1)$, $t \in \mathbb{R}$. Thus $V^\perp$ is the straight line spanned by the vector $(1, -1, 1)$.
Orthogonal projection
Theorem
Let $V$ be a subspace of $\mathbb{R}^n$. Then any vector $x \in \mathbb{R}^n$ is uniquely represented as $x = p + o$, where $p \in V$ and $o \in V^\perp$.
Idea of the proof
Let $v_1, v_2, \dots, v_k$ be a basis for $V$ and $w_1, w_2, \dots, w_m$ be a basis for $V^\perp$. Then $v_1, v_2, \dots, v_k, w_1, w_2, \dots, w_m$ is a linearly independent set. Since $k + m = n$, it is a basis for $\mathbb{R}^n$.
In the above expansion, $p$ is called the orthogonal projection of the vector $x$ onto the subspace $V$.
Theorem
$\|x - v\| > \|x - p\|$ for any $v \neq p$ in $V$.
Thus $\|o\| = \|x - p\| = \min_{v \in V} \|x - v\|$ is the distance from the vector $x$ to the subspace $V$.
Remember that:
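Orthogonal complement
Definition
Let $S \subset \mathbb{R}^n$. The orthogonal complement of $S$, denoted $S^\perp$, is the set of all vectors $x \in \mathbb{R}^n$ that are orthogonal to $S$.
Also remember that
Theorem
i) $S^\perp$ is a subspace of $\mathbb{R}^n$
ii) $(S^\perp)^\perp = \mathrm{Span}(S)$
and
Theorem
If $V$ is a subspace of $\mathbb{R}^n$, then
i) $(V^\perp)^\perp = V$
ii) $V \cap V^\perp = \{0\}$
iii) $\dim(V) + \dim(V^\perp) = n$
Theorem
If $V$ is the row space of a matrix, then $V^\perp$ is the nullspace of the same matrix.
Orthogonal projection over a space
We already saw that
Theorem
Let $V$ be a subspace of $\mathbb{R}^n$. Then any vector $x \in \mathbb{R}^n$ is uniquely represented as $x = p + o$, where $p \in V$ and $o \in V^\perp$.
Orthogonal projection onto a vector
Let $x, y \in \mathbb{R}^n$ with $y \neq 0$. Then there exists a unique decomposition $x = p + o$ such that $p$ is parallel to $y$ and $o$ is orthogonal to $y$; $p$ is known as the orthogonal projection of $x$ onto $y$.
We have $p = \alpha y$ for some $\alpha \in \mathbb{R}$. Then
\[ 0 = o \cdot y = (x - \alpha y) \cdot y = x \cdot y - \alpha\, y \cdot y \Rightarrow \alpha = \frac{x \cdot y}{y \cdot y} \Rightarrow p = \frac{x \cdot y}{y \cdot y}\, y \]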
Example 11.12
Find the distance from the point $x = (3, 1)$ to the line spanned by $y = (2, -1)$.
Solution
Consider the decomposition $x = p + o$, where $p$ is parallel to $y$ and $o$ is orthogonal to $y$. The required distance is the length of the orthogonal component $o$:
\[ p = \frac{x \cdot y}{y \cdot y}\, y = \frac{5}{5}(2, -1) = (2, -1), \qquad o = x - p = (3, 1) - (2, -1) = (1, 2) \Rightarrow \|o\| = \sqrt{5} \]
Example 11.13
Find the point on the line $y = -x$ that is closest to the point $(3, 4)$.
Solution
The required point is the projection $p$ of $v = (3, 4)$ on the vector $w = (1, -1)$ spanning the line $y = -x$:
\[ p = \frac{v \cdot w}{w \cdot w}\, w = \frac{-1}{2}(1, -1) = \left( -\tfrac{1}{2}, \tfrac{1}{2} \right) \]
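The projection formula $p = \frac{x \cdot y}{y \cdot y}\, y$ from Examples 11.12 and 11.13 translates directly into code. The following is a sketch with hypothetical names (`dot`, `project_onto_vector`); it assumes integer components and $y \neq 0$, so that exact rational arithmetic can be used.

```python
from fractions import Fraction


def dot(x, y):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_vector(x, y):
    """Split x into p (parallel to y) and o (orthogonal to y).

    Assumes integer components and y != 0, so Fraction keeps the result exact.
    """
    alpha = Fraction(dot(x, y), dot(y, y))
    p = tuple(alpha * yi for yi in y)
    o = tuple(xi - pi for xi, pi in zip(x, p))
    return p, o


# Example 11.12: the distance from x to the line spanned by y is the length of o.
p, o = project_onto_vector((3, 1), (2, -1))
squared_distance = dot(o, o)

# Example 11.13: closest point on the line y = -x to (3, 4).
closest, _ = project_onto_vector((3, 4), (1, -1))
print(p, o, squared_distance, closest)
```

The squared distance is reported instead of the distance itself to stay in exact arithmetic; its square root is $\sqrt{5}$ in Example 11.12.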
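Example 11.14
Let $\Pi$ be the plane spanned by vectors $v_1 = (1, 1, 0)$ and $v_2 = (0, 1, 1)$.
i) Find the orthogonal projection of the vector $x = (4, 0, -1)$ onto the plane $\Pi$.
Solution
We have $x = p + o$, where $p \in \Pi$ and $o \in \Pi^\perp$. Then the orthogonal projection of $x$ onto $\Pi$ is $p$ and the distance from $x$ to $\Pi$ is $\|o\|$.
We have that $p = \alpha v_1 + \beta v_2$ for some $\alpha, \beta \in \mathbb{R}$. Then $o = x - p = x - \alpha v_1 - \beta v_2$ and
\[ \begin{cases} o \cdot v_1 = 0 \\ o \cdot v_2 = 0 \end{cases} \Rightarrow \begin{cases} \alpha (v_1 \cdot v_1) + \beta (v_2 \cdot v_1) = x \cdot v_1 \\ \alpha (v_1 \cdot v_2) + \beta (v_2 \cdot v_2) = x \cdot v_2 \end{cases} \Rightarrow \begin{cases} 2\alpha + \beta = 4 \\ \alpha + 2\beta = -1 \end{cases} \Rightarrow \begin{cases} \alpha = 3 \\ \beta = -2 \end{cases} \]
\[ p = \alpha v_1 + \beta v_2 = (3, 1, -2), \qquad o = x - p = (1, -1, 1) \Rightarrow \|o\| = \sqrt{3} \]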
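The $2 \times 2$ system for $\alpha$ and $\beta$ in Example 11.14 can be set up and solved mechanically. The sketch below is an illustration, not the method prescribed by the notes: the name `project_onto_plane` is hypothetical, the system is solved by Cramer's rule, and the inputs are assumed to have integer components with $v_1$, $v_2$ linearly independent.

```python
from fractions import Fraction


def dot(x, y):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_plane(x, v1, v2):
    """Orthogonal projection of x onto Span(v1, v2).

    Solves  alpha*(v1.v1) + beta*(v2.v1) = x.v1
            alpha*(v1.v2) + beta*(v2.v2) = x.v2
    by Cramer's rule; v1 and v2 must be linearly independent.
    """
    g11, g12, g22 = dot(v1, v1), dot(v1, v2), dot(v2, v2)
    r1, r2 = dot(x, v1), dot(x, v2)
    det = Fraction(g11 * g22 - g12 * g12)
    alpha = (r1 * g22 - g12 * r2) / det
    beta = (g11 * r2 - g12 * r1) / det
    p = tuple(alpha * a + beta * b for a, b in zip(v1, v2))
    o = tuple(xi - pi for xi, pi in zip(x, p))
    return alpha, beta, p, o


alpha, beta, p, o = project_onto_plane((4, 0, -1), (1, 1, 0), (0, 1, 1))
print(alpha, beta, p, o, dot(o, o))
```

The value `dot(o, o)` is the squared distance from $x$ to the plane, so the distance is its square root.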
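Least Squares Problems
Let us consider the overdetermined system of linear equations:
\[ \begin{cases} x + 2y = 3 \\ 3x + 2y = 5 \\ x + y = 2.09 \end{cases} \Rightarrow \begin{cases} x + 2y = 3 \\ -4y = -4 \\ -y = -0.91 \end{cases} \]
The last two equations give $y = 1$ and $y = 0.91$, so the system has no solution.
Now, assume that a solution $(x_0, y_0)$ does in fact exist but the system is not quite accurate; namely, there may be some errors in the right-hand sides (rounding errors, for instance).
Problem
Find a good approximation of $(x_0, y_0)$.
One approach is the least squares fit. Namely, we look for a pair $(x, y)$ that minimizes the sum
$(x + 2y - 3)^2 + (3x + 2y - 5)^2 + (x + y - 2.09)^2$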
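Least squares solution
System of linear equations:
\[ \begin{cases} a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n = b_2 \\ \quad \vdots \\ a_{m1}x_1 + a_{m2}x_2 + \dots + a_{mn}x_n = b_m \end{cases} \iff Ax = b \]
For any $x \in \mathbb{R}^n$ define the residual $r(x) = b - Ax$. The least squares solution $x$ to the system is the one that minimizes $\|r(x)\|$ (or, equivalently, $\|r(x)\|^2$):
\[ \|r(x)\|^2 = \sum_{i=1}^{m} (a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n - b_i)^2 \]
Let $A$ be an $m \times n$ matrix and let $b \in \mathbb{R}^m$.
Theorem
A vector $\hat{x}$ is a least squares solution of the system $Ax = b$ if and only if it is a solution of the associated normal system $A^TAx = A^Tb$.
Proof
$Ax$ is an arbitrary vector in $R(A)$, the column space of $A$. Hence the length of $r(x) = b - Ax$ is minimal if $Ax$ is the orthogonal projection of $b$ onto $R(A)$, that is, if $r(x)$ is orthogonal to $R(A)$.
We know that $(\text{row space})^\perp = \text{nullspace}$ for any matrix. In particular, $R(A)^\perp = N(A^T)$, the nullspace of the transpose matrix of $A$. Thus, $\hat{x}$ is a least squares solution if and only if
$A^T r(\hat{x}) = 0 \iff A^T(b - A\hat{x}) = 0 \iff A^TA\hat{x} = A^Tb$
Corollary
The normal system $A^TAx = A^Tb$ is always consistent.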
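Example 11.15
Find the least squares solution to
\[ \begin{cases} x + 2y = 3 \\ 3x + 2y = 5 \\ x + y = 2.09 \end{cases} \]
Solution
In matrix form the system is
\[ \begin{pmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 \\ 5 \\ 2.09 \end{pmatrix} \]
and the normal system is
\[ \begin{pmatrix} 1 & 3 & 1 \\ 2 & 2 & 1 \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1 & 3 & 1 \\ 2 & 2 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 5 \\ 2.09 \end{pmatrix} \Rightarrow \begin{pmatrix} 11 & 9 \\ 9 & 9 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 20.09 \\ 18.09 \end{pmatrix} \Rightarrow \begin{cases} x = 1 \\ y = 1.01 \end{cases} \]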
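The normal system of Example 11.15 can be formed and solved programmatically. This is a minimal sketch with hypothetical helper names (`normal_system`, `solve_2x2`); it uses exact fractions so that $2.09$ is represented without rounding, and it solves the $2 \times 2$ system by Cramer's rule, which assumes $A^TA$ is invertible.

```python
from fractions import Fraction


def normal_system(A, b):
    """Return (A^T A, A^T b) for a matrix A given as a list of rows."""
    n = len(A[0])
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)]
           for i in range(n)]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    return AtA, Atb


def solve_2x2(M, r):
    """Solve M (x, y)^T = r by Cramer's rule; M must be invertible."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    x = (r[0] * M[1][1] - M[0][1] * r[1]) / det
    y = (M[0][0] * r[1] - M[1][0] * r[0]) / det
    return x, y


A = [[1, 2], [3, 2], [1, 1]]
b = [Fraction(3), Fraction(5), Fraction("2.09")]

AtA, Atb = normal_system(A, b)
x, y = solve_2x2(AtA, Atb)
print(AtA, Atb, x, y)
```

The computed pair agrees with the solution $x = 1$, $y = 1.01$ obtained above.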
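Example 11.16
Find the constant function that is the least squares fit to the following data
x    | 0 | 1 | 2 | 3
f(x) | 1 | 0 | 1 | 2
Solution
\[ f(x) = c \Rightarrow \begin{cases} c = 1 \\ c = 0 \\ c = 1 \\ c = 2 \end{cases} \Rightarrow \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} (c) = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix} \]
Then, the normal system is
\[ \begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} (c) = \begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix} \Rightarrow 4c = 4 \]
$c = \frac{1}{4}(1 + 0 + 1 + 2) = 1$ (arithmetic mean)
Thus, the constant function is $f(x) = 1$.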
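Example 11.17
Find the linear polynomial function that is the least squares fit to the following data
x    | 0 | 1 | 2 | 3
f(x) | 1 | 0 | 1 | 2
Solution
\[ f(x) = c_1 + c_2x \Rightarrow \begin{cases} c_1 = 1 \\ c_1 + c_2 = 0 \\ c_1 + 2c_2 = 1 \\ c_1 + 3c_2 = 2 \end{cases} \Rightarrow \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix} \]
Then, the normal system is
\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix} \]
\[ \begin{pmatrix} 4 & 6 \\ 6 & 14 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 4 \\ 8 \end{pmatrix} \Rightarrow \begin{cases} c_1 = 0.4 \\ c_2 = 0.4 \end{cases} \]
Thus, the linear function is $f(x) = 0.4 + 0.4x$.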
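Example 11.18
Find the quadratic polynomial function that is the least squares fit to the following data
x    | 0 | 1 | 2 | 3
f(x) | 1 | 0 | 1 | 2
Solution
\[ f(x) = c_1 + c_2x + c_3x^2 \Rightarrow \begin{cases} c_1 = 1 \\ c_1 + c_2 + c_3 = 0 \\ c_1 + 2c_2 + 4c_3 = 1 \\ c_1 + 3c_2 + 9c_3 = 2 \end{cases} \Rightarrow \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix} \]
Then, the normal system is
\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \\ 1 & 3 & 9 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 2 & 3 \\ 0 & 1 & 4 & 9 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 2 \end{pmatrix} \]
\[ \begin{pmatrix} 4 & 6 & 14 \\ 6 & 14 & 36 \\ 14 & 36 & 98 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 4 \\ 8 \\ 22 \end{pmatrix} \Rightarrow \begin{cases} c_1 = 0.9 \\ c_2 = -1.1 \\ c_3 = 0.5 \end{cases} \]
Thus, the quadratic function is $f(x) = 0.9 - 1.1x + 0.5x^2$.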
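Examples 11.16 to 11.18 follow one pattern: build the matrix whose columns are the powers of the data points, form the normal system, and solve it. The sketch below illustrates that pattern under stated assumptions: the names `solve_linear_system` and `least_squares_polynomial` are hypothetical, Gauss-Jordan elimination over exact fractions stands in for whatever solver one prefers, and the normal matrix is assumed to be invertible (which holds here because the data points are distinct and the degree is below the number of points).

```python
from fractions import Fraction


def solve_linear_system(M, r):
    """Solve M c = r by Gauss-Jordan elimination; M must be invertible."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)]
           for row, rhs in zip(M, r)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for i in range(n):
            if i != col and aug[i][col] != 0:
                factor = aug[i][col]
                aug[i] = [vi - factor * vc for vi, vc in zip(aug[i], aug[col])]
    return [row[n] for row in aug]


def least_squares_polynomial(xs, ys, degree):
    """Coefficients c_1, ..., c_{degree+1} of the least squares polynomial fit."""
    n = degree + 1
    A = [[x ** j for j in range(n)] for x in xs]
    AtA = [[sum(row[i] * row[j] for row in A) for j in range(n)]
           for i in range(n)]
    Atb = [sum(row[i] * y for row, y in zip(A, ys)) for i in range(n)]
    return solve_linear_system(AtA, Atb)


xs, ys = [0, 1, 2, 3], [1, 0, 1, 2]
for degree in (0, 1, 2):
    print(degree, least_squares_polynomial(xs, ys, degree))
```

Raising the degree to 3 would interpolate the four data points exactly, since the system then has as many unknowns as equations.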
Orthogonal sets
Let $\langle \cdot, \cdot \rangle$ denote the scalar product in $\mathbb{R}^n$.
Definition
Nonzero vectors $v_1, v_2, \dots, v_k \in \mathbb{R}^n$ form an orthogonal set if they are orthogonal to each other: $\langle v_i, v_j \rangle = 0$ for all $i \neq j$.
If, in addition, all vectors are of unit length, then $v_1, v_2, \dots, v_k$ is called an orthonormal set.
For instance, the standard basis $e_1 = (1, 0, 0, \dots, 0)$, $e_2 = (0, 1, 0, \dots, 0)$, $\dots$, $e_n = (0, 0, 0, \dots, 1)$ is an orthonormal set.
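Orthonormal bases
Suppose $v_1, v_2, \dots, v_n$ is an orthonormal basis for $\mathbb{R}^n$ (i.e., it is a basis and an orthonormal set).
Theorem
Let $x = x_1v_1 + x_2v_2 + \dots + x_nv_n$ and $y = y_1v_1 + y_2v_2 + \dots + y_nv_n$, where $x_i, y_i \in \mathbb{R}$. Then
i) $\langle x, y \rangle = \sum_{i=1}^{n} x_iy_i$
ii) $\|x\| = \sqrt{\sum_{i=1}^{n} x_i^2}$
Proof
i)
\[ \langle x, y \rangle = \left\langle \sum_{i=1}^{n} x_iv_i, \sum_{j=1}^{n} y_jv_j \right\rangle = \sum_{i=1}^{n} x_i \left\langle v_i, \sum_{j=1}^{n} y_jv_j \right\rangle = \sum_{i=1}^{n} x_i \sum_{j=1}^{n} y_j \langle v_i, v_j \rangle = \sum_{i=1}^{n} x_iy_i \]
ii) follows from i) with $y = x$.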
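Suppose $V$ is a subspace of $\mathbb{R}^n$. Let $p$ be the orthogonal projection of a vector $x \in \mathbb{R}^n$ onto $V$.
If $V$ is a one-dimensional subspace spanned by a vector $v$, then
\[ p = \frac{\langle x, v \rangle}{\langle v, v \rangle}\, v \]
If $V$ admits an orthogonal basis $v_1, v_2, \dots, v_k$, then
\[ p = \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1 + \frac{\langle x, v_2 \rangle}{\langle v_2, v_2 \rangle}\, v_2 + \dots + \frac{\langle x, v_k \rangle}{\langle v_k, v_k \rangle}\, v_k \]
Indeed,
\[ \langle p, v_i \rangle = \sum_{j=1}^{k} \frac{\langle x, v_j \rangle}{\langle v_j, v_j \rangle} \langle v_j, v_i \rangle = \frac{\langle x, v_i \rangle}{\langle v_i, v_i \rangle} \langle v_i, v_i \rangle = \langle x, v_i \rangle \Rightarrow \langle x - p, v_i \rangle = 0 \Rightarrow (x - p) \perp v_i \Rightarrow (x - p) \perp V \]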
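Coordinates relative to an orthogonal basis
Theorem
If $v_1, v_2, \dots, v_n$ is an orthogonal basis for $\mathbb{R}^n$, then
\[ x = \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1 + \frac{\langle x, v_2 \rangle}{\langle v_2, v_2 \rangle}\, v_2 + \dots + \frac{\langle x, v_n \rangle}{\langle v_n, v_n \rangle}\, v_n \]
for any vector $x \in \mathbb{R}^n$.
Corollary
If $v_1, v_2, \dots, v_n$ is an orthonormal basis for $\mathbb{R}^n$, then
$x = \langle x, v_1 \rangle v_1 + \langle x, v_2 \rangle v_2 + \dots + \langle x, v_n \rangle v_n$
for any vector $x \in \mathbb{R}^n$.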
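The coordinate formula for an orthogonal basis needs no linear system at all: each coordinate is a quotient of two scalar products. The sketch below illustrates this; the function names and the particular basis $(1, 1, 0)$, $(1, -1, 0)$, $(0, 0, 2)$ are hypothetical choices made for the demonstration and do not come from the notes.

```python
from fractions import Fraction


def dot(x, y):
    """Scalar product of two vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def coordinates_in_orthogonal_basis(x, basis):
    """Coordinates of x relative to an orthogonal basis (integer components)."""
    return [Fraction(dot(x, v), dot(v, v)) for v in basis]


def combine(coefficients, basis):
    """Return the linear combination sum(c_i * v_i)."""
    n = len(basis[0])
    return tuple(sum(c * v[k] for c, v in zip(coefficients, basis))
                 for k in range(n))


# Hypothetical orthogonal (not orthonormal) basis of R^3.
basis = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
x = (3, 1, 4)

coords = coordinates_in_orthogonal_basis(x, basis)
print(coords, combine(coords, basis))
```

If the basis is orthonormal, every denominator equals 1 and the coordinates reduce to the scalar products themselves, as in the corollary.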