Let $s(n) \equiv s_n$. Then equation (6.88) is easily extended to $s(mnr\cdots) = s(n) + s(m) + s(r) + \cdots$, so with $n = m = r = \cdots$ we conclude that
$$s(n^k) = k\,s(n).$$
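Now let $u$, $v$ be any two integers bigger than 1. Then for arbitrarily large $n$ we can find $m$ such that
$$\frac{m}{n} \le \frac{\ln v}{\ln u} < \frac{m+1}{n} \quad\Rightarrow\quad u^m \le v^n < u^{m+1}. \tag{1}$$
Since $s$ is monotone increasing,
$$s(u^m) \le s(v^n) < s(u^{m+1}) \quad\Rightarrow\quad m\,s(u) \le n\,s(v) < (m+1)\,s(u) \quad\Rightarrow\quad \frac{m}{n} \le \frac{s(v)}{s(u)} < \frac{m+1}{n}. \tag{2}$$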
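Comparing equation (1) with equation (2), we see that both ratios lie in the same interval of width $1/n$, so
$$\left|\frac{\ln v}{\ln u} - \frac{s(v)}{s(u)}\right| \le \frac{1}{n} \quad\Rightarrow\quad \left|\frac{s(v)}{\ln v} - \frac{s(u)}{\ln u}\right| \le \epsilon,$$
where $\epsilon = s(u)/(n \ln v)$ is arbitrarily small. Thus we have shown that $s(v) \propto \ln v$.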
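The bracketing step in the argument above can be illustrated numerically. The following is a minimal sketch, assuming the particular choice $u = 2$, $v = 3$; the helper name `bracket_exponent` is hypothetical. It uses exact integer arithmetic to find $m$ and shows the interval $[m/n, (m+1)/n)$ closing in on $\ln v/\ln u$ as $n$ grows.

```python
import math

def bracket_exponent(u: int, v: int, n: int) -> int:
    """Return the integer m with u**m <= v**n < u**(m+1), using exact integers."""
    target = v ** n
    m, power = 0, 1
    while power * u <= target:
        power *= u
        m += 1
    return m

u, v = 2, 3                      # illustrative choice, any integers > 1 would do
exact_ratio = math.log(v) / math.log(u)
for n in (1, 10, 100, 1000):
    m = bracket_exponent(u, v, n)
    # The interval [m/n, (m+1)/n) has width 1/n and contains ln v / ln u.
    print(n, m / n, exact_ratio, (m + 1) / n)
```

Because any monotone solution $s$ must satisfy the same bracketing, the ratio $s(v)/s(u)$ is confined to the same shrinking interval.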
“either $x_2$ or $x_3$”. Then we assign a probability $p_{23} = p_2 + p_3$ to getting $x_{23}$, giving missing information $s(p_1, p_{23})$. To this missing information we have to add that associated with resolving the outcome $x_{23}$ into either $x_2$ or $x_3$. The probability that we will have to resolve this missing information is $p_{23}$, and the probability of getting $x_2$ given that we have $x_{23}$ is $p_2/p_{23}$, so
$$s(p_1, p_2, p_3) = s(p_1, p_{23}) + p_{23}\,s\!\left(\frac{p_2}{p_{23}}, \frac{p_3}{p_{23}}\right).$$
This equation is easily generalised: we have $n$ possible outcomes $x_1, \ldots, x_n$ with probabilities $p_1, \ldots, p_n$. We gather the outcomes into $r$ groups and let $y_1$ be the outcome in which one of $x_1, \ldots, x_{k_1}$ was obtained, $y_2$ the outcome in which one of $x_{k_1+1}, \ldots, x_{k_2}$ was obtained, etc., and let $w_i$ denote the probability of the outcome $y_i$. Then since the probability that we get $x_1$ given that we have already obtained $y_1$ is $p_1/w_1$, we have
$$s(p_1, \ldots, p_n) = s(w_1, \ldots, w_r) + w_1\,s(p_1/w_1, \ldots, p_{k_1}/w_1) + \cdots + w_r\,s(p_{k_{r-1}+1}/w_r, \ldots, p_n/w_r). \tag{6.85}$$
Since $s$ is a continuous function of its arguments, it suffices to evaluate it for rational values of the arguments. So we assume that there are integers $n_i$ such that $p_i = n_i/N$, where $\sum_i n_i = N$ by the requirement that the probabilities sum to unity. Consider a system in which there are $N$ equally likely outcomes, and from these form $n$ groups, with $n_i$ possibilities in the $i$th group. Then the probability of the $i$th group is $p_i$, and the probability of getting any possibility in the $i$th group, given that the $i$th group has come up, is $1/n_i$. Hence applying equation (6.85) to the whole system we find
$$s\!\left(\tfrac{1}{N}, \ldots, \tfrac{1}{N}\right) = s(p_1, \ldots, p_n) + \sum_{i=1}^{n} p_i\,s\!\left(\tfrac{1}{n_i}, \ldots, \tfrac{1}{n_i}\right).$$
Writing $s_n \equiv s(1/n, \ldots, 1/n)$ for the missing information associated with $n$ equally likely outcomes, this becomes
$$s(p_1, \ldots, p_n) = s_N - \sum_{i=1}^{n} p_i\,s_{n_i}.$$
This equation relates $s$ evaluated on a general argument list to the values that $s$ takes when all its arguments are equal. Setting all the $n_i = m$, so that $N = nm$ and $p_i = 1/n$, we obtain a relation that involves only $s_n$:
$$s_n = s_{nm} - s_m. \tag{6.88}$$
It is easy to check that this functional equation is solved by $s_n = K \ln n$, where $K$ is an arbitrary constant that we can set to unity. In fact, in Box 6.4 it is shown that this is the only monotone solution of equation (6.88). Hence from equation (6.87) we have that the unique measure of missing information is
$$s(p_1, \ldots, p_n) = \ln N - \sum_i p_i \ln n_i = -\sum_i p_i \ln p_i. \tag{6.89}$$
Since every probability $p_i$ is non-negative and less than or equal to one, $s$ is never negative. Claude Shannon (1916–2001) first demonstrated¹² that the function (6.89) is the only consistent measure of missing information. Since $s(p)$ turns out to be intimately connected to thermodynamic entropy, it is called the Shannon entropy of the probability distribution.
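The grouping rule (6.85) that led to the Shannon entropy can be checked numerically. The sketch below is illustrative only: the function names `shannon` and `grouped` and the sample probabilities are assumptions made for the example, not part of the text.

```python
import math

def shannon(probs):
    """Missing information s = -sum p ln p; outcomes with p = 0 contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def grouped(probs, groups):
    """Right side of the grouping rule (6.85) for a partition given as index lists."""
    w = [sum(probs[i] for i in g) for g in groups]
    inner = sum(wj * shannon([probs[i] / wj for i in g])
                for wj, g in zip(w, groups) if wj > 0)
    return shannon(w) + inner

p = [0.5, 0.2, 0.2, 0.1]          # hypothetical probabilities
groups = [[0], [1, 2], [3]]       # outcomes 1 and 2 are merged into one group
print(shannon(p), grouped(p, groups))      # the two sides of (6.85)
print(shannon([0.25] * 4), math.log(4))    # equal probabilities give s_n = ln n
```

The second printed line illustrates the special case $s_n = \ln n$ for equally likely outcomes.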
The Shannon entropy of a density operator ρ is defined to be
s(ρ) = − Tr ρ ln ρ. (6.90)
The right side of this expression involves a function, $\ln(x)$, of the operator $\rho$. We recall from equation (2.20) that $f(\rho)$ has the same eigenkets as $\rho$ and eigenvalues $f(\lambda_i)$, where $\lambda_i$ are the eigenvalues of $\rho$. Hence
$$s = -\mathrm{Tr}(\rho \ln \rho) = -\sum_n p_n \ln p_n. \tag{6.91}$$
Hence $s$ is simply the Shannon entropy of the probability distribution $\{p_i\}$ that appears in the definition (6.64) of $\rho$.
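As a small illustration of evaluating $-\mathrm{Tr}\,\rho\ln\rho$ through the eigenvalues of $\rho$, the sketch below treats a $2\times2$ Hermitian density matrix with entries $a$, $b$, $d$ (these parameter names, and the restriction to two dimensions, are assumptions made to keep the example free of external libraries).

```python
import math

def eigenvalues_2x2(a, b, d):
    """Eigenvalues of the Hermitian matrix [[a, b], [conj(b), d]] with a, d real."""
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    return mean - radius, mean + radius

def entropy(a, b, d):
    """s = -sum_i lambda_i ln lambda_i over the non-zero eigenvalues of rho."""
    total = 0.0
    for lam in eigenvalues_2x2(a, b, d):
        if lam > 1e-15:           # treat lambda = 0 as contributing nothing
            total -= lam * math.log(lam)
    return total

print(entropy(0.5, 0.5, 0.5))    # pure state: eigenvalues 0 and 1
print(entropy(0.5, 0.0, 0.5))    # maximally mixed state: eigenvalues 1/2 and 1/2
print(entropy(0.5, 0.25, 0.5))   # partially coherent state, in between
```

A pure state carries no missing information, while the maximally mixed state gives $\ln 2$.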
¹² C.E. Shannon, Bell System Technical Journal, 27, 379 (1948). For a much fuller account, see E.T. Jaynes, Probability Theory: The Logic of Science, Cambridge University Press, 2003.
6.4 Thermodynamics
Thermodynamics is concerned with macroscopic systems about which we don’t know very much, certainly vastly less than is required to define a quantum state. For example, the system might consist of a cylinder full of fluid and our knowledge be confined to the chemical nature of the fluid (that it is O₂ or CO₂, or whatever), the mass of fluid, its volume and the temperature of the environment with which it is in equilibrium. In the canonical picture we consider that as a result of exchanges of energy with the environment, the energy of the fluid fluctuates around a mean $U$. The pressure also fluctuates around a mean value $P$, but the volume $V$ is well-defined and under our control.
Thermodynamics applies to systems that are more complex than bodies of fluid, for example to a quantity of diamond. In such a case the stress in the material is not fully described by the pressure, and thermodynamic relations involve also the shear stress and the shear strain within the crystal. If the crystal, like quartz, has interesting electrical properties, the thermodynamic relations will involve the electric field within the material and the polarisation that it induces. A fluid is the simplest non-trivial thermodynamic system and therefore the focus of introductory texts, but the principles that it illuminates are of much wider validity. For simplicity we restrict our discussion to fluids.
To obtain relations between the thermodynamic variables from a knowledge of the system’s microstructure, we need to assign a probability $p_i$ to each of the system’s zillions of quantum states. We argue that the only rational way to assign probabilities to the stationary states of a thermodynamic system is to choose them such that (i) they reproduce any measurements we have of the system, and (ii) they maximise the Shannon entropy. Requirement (ii) follows because in choosing the $\{p_i\}$ we must not specify any information beyond that included when we satisfy requirement (i) – our probabilities must “tell the truth, the whole truth and nothing but the truth”. It is straightforward to show (Problem 6.16) that the $p_i$ that maximise the Shannon entropy for given internal energy
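$$U \equiv \sum_{\text{stationary states } i} E_i p_i \tag{6.92}$$
are given by
$$p_i = \frac{1}{Z} e^{-\beta E_i}, \tag{6.93a}$$
where $\beta \equiv 1/(k_B T)$ is the inverse temperature and
$$Z \equiv \sum_{\text{stationary states } i} e^{-\beta E_i}. \tag{6.93b}$$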
The quantity $Z$ defined above is called the partition function; it is manifestly a function of $T$ and less obviously a function of the volume $V$ and whatever other parameters define the spectrum $\{E_i\}$ of the Hamiltonian. In equation (6.93a) its role is clearly to ensure that the probabilities satisfy the normalisation condition $\sum_i p_i = 1$.
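The claim that the distribution (6.93a) maximises the Shannon entropy at fixed internal energy can be probed numerically. The sketch below is a spot check, not a proof: it assumes a hypothetical four-level spectrum with equal spacing and perturbs the probabilities along a direction that leaves both $\sum_i p_i$ and $\sum_i E_i p_i$ unchanged.

```python
import math

def shannon(probs):
    """Shannon entropy -sum p ln p of a probability list."""
    return -sum(p * math.log(p) for p in probs if p > 0)

energies = [0.0, 1.0, 2.0, 3.0]   # hypothetical spectrum, arbitrary units
beta = 0.7                        # hypothetical inverse temperature
weights = [math.exp(-beta * e) for e in energies]
Z = sum(weights)
gibbs = [w / Z for w in weights]

# This direction has sum(d) = 0 and sum(E * d) = 0 for the spectrum above,
# so it changes neither the normalisation nor the internal energy.
direction = [1.0, -2.0, 1.0, 0.0]

def perturbed(eps):
    """Gibbs probabilities shifted by eps along the constraint-preserving direction."""
    return [p + eps * d for p, d in zip(gibbs, direction)]

for eps in (0.01, -0.01):
    q = perturbed(eps)
    print(eps, shannon(q) - shannon(gibbs))   # negative in both cases
```

Both perturbations lower the entropy, as expected if the Gibbs probabilities sit at the constrained maximum.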
Since the probability distribution (6.93a) maximises the Shannon entropy for given internal energy, we take the density operator of a thermodynamic system to be diagonal in the energy representation and to be given by
$$\rho = \frac{1}{Z} \sum_{\text{stationary states } i} e^{-\beta E_i}\,|i\rangle\langle i|. \tag{6.94}$$
This form of the density operator is called the Gibbs distribution in honour of J.W. Gibbs (1839–1903), who died before quantum mechanics emerged but had already established that probabilities should be given by equation (6.93a).
The sum in equation (6.94) is over quantum states, not energy levels. It is likely that many energy levels will be highly degenerate and in this case the sum simplifies to $Z = \sum_\alpha g_\alpha e^{-\beta E_\alpha}$, where $\alpha$ runs over energy levels and $g_\alpha$ is the number of linearly independent quantum states in level $\alpha$.
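The equivalence of the sum over states and the sum over degenerate levels can be seen in a few lines. This is a minimal sketch with a made-up set of levels and degeneracies; the function names are hypothetical.

```python
import math

def partition_over_states(state_energies, beta):
    """Z as a sum over quantum states, one Boltzmann factor per state."""
    return sum(math.exp(-beta * e) for e in state_energies)

def partition_over_levels(levels, beta):
    """Z as a sum over energy levels, each weighted by its degeneracy g."""
    return sum(g * math.exp(-beta * e) for e, g in levels)

levels = [(0.0, 1), (1.0, 3), (2.0, 5)]                  # hypothetical (E_alpha, g_alpha)
states = [e for e, g in levels for _ in range(g)]        # each level repeated g times
beta = 0.9                                               # hypothetical inverse temperature

z_states = partition_over_states(states, beta)
z_levels = partition_over_levels(levels, beta)
print(z_states, z_levels)
```

The two sums agree because each level simply contributes $g_\alpha$ identical Boltzmann factors.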
The expectation of the Hamiltonian of a thermodynamic system is
$$\langle H \rangle = \mathrm{Tr}(H\rho) = \sum_i p_i E_i = U, \tag{6.95}$$
where we have used the definition (6.92) of the internal energy. Thus the internal energy $U$ of thermodynamics is simply the expectation value of the system’s Hamiltonian. Another important expression for $U$ follows straightforwardly from equations (6.92) and (6.93):
$$U = -\frac{\partial \ln Z}{\partial \beta}. \tag{6.96}$$
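Equation (6.96) can be checked against the direct sum (6.92) with a finite difference. The sketch below assumes a small hypothetical spectrum and a central-difference step `h`; neither is taken from the text.

```python
import math

def log_partition(energies, beta):
    """ln Z for a list of stationary-state energies, one entry per state."""
    return math.log(sum(math.exp(-beta * e) for e in energies))

def internal_energy(energies, beta):
    """U = sum_i E_i p_i with Gibbs probabilities p_i = exp(-beta E_i) / Z."""
    weights = [math.exp(-beta * e) for e in energies]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

energies = [0.0, 1.0, 1.0, 2.5]   # hypothetical spectrum, arbitrary units
beta, h = 0.8, 1e-6               # hypothetical inverse temperature and step

u_direct = internal_energy(energies, beta)
# Central difference approximation to -d(ln Z)/d(beta).
u_from_z = -(log_partition(energies, beta + h) - log_partition(energies, beta - h)) / (2 * h)
print(u_direct, u_from_z)
```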
We obtain an interesting equation using equation (6.93a) to eliminate the second occurrence of $p_n$ from the extreme right of equation (6.91):
$$s = -\sum_n p_n(-\beta E_n - \ln Z) = \beta U + \ln Z. \tag{6.97}$$
In terms of the thermodynamic entropy
$$S \equiv k_B s \tag{6.98}$$
and the Helmholtz free energy
$$F \equiv -k_B T \ln Z \tag{6.99}$$
equation (6.97) can be written
$$F = U - TS, \tag{6.100}$$
which in classical thermodynamics is considered to be the definition of the Helmholtz free energy. When we substitute our definition of $F$ into equation (6.96), we obtain
$$U = \frac{\partial(\beta F)}{\partial \beta} = F + \beta\frac{\partial F}{\partial \beta} = F - T\frac{\partial F}{\partial T}. \tag{6.101}$$
Comparing this equation with equation (6.100) we conclude that
$$S = -\frac{\partial F}{\partial T}. \tag{6.102}$$
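The relations (6.100) and (6.102) lend themselves to a quick numerical check. The following sketch assumes units in which $k_B = 1$ and a hypothetical four-state spectrum; it evaluates $F$, $U$ and $S$ from their definitions and compares $S$ with a finite-difference estimate of $-\partial F/\partial T$.

```python
import math

K_B = 1.0  # assumption: units in which Boltzmann's constant is 1

def gibbs(energies, temperature):
    """Return the Gibbs probabilities and the partition function Z."""
    beta = 1.0 / (K_B * temperature)
    weights = [math.exp(-beta * e) for e in energies]
    z = sum(weights)
    return [w / z for w in weights], z

def free_energy(energies, temperature):
    """F = -k_B T ln Z, equation (6.99)."""
    _, z = gibbs(energies, temperature)
    return -K_B * temperature * math.log(z)

def entropy(energies, temperature):
    """S = -k_B sum_i p_i ln p_i, equations (6.91) and (6.98)."""
    probs, _ = gibbs(energies, temperature)
    return -K_B * sum(p * math.log(p) for p in probs if p > 0)

def internal_energy(energies, temperature):
    """U = sum_i E_i p_i, equation (6.92)."""
    probs, _ = gibbs(energies, temperature)
    return sum(e * p for e, p in zip(energies, probs))

energies = [0.0, 1.0, 1.0, 2.5]   # hypothetical spectrum
T, h = 1.3, 1e-6                  # hypothetical temperature and step

F = free_energy(energies, T)
S = entropy(energies, T)
U = internal_energy(energies, T)
S_from_F = -(free_energy(energies, T + h) - free_energy(energies, T - h)) / (2 * h)
print(F, U - T * S)     # equation (6.100)
print(S, S_from_F)      # equation (6.102)
```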
The difference of equation (6.92) between two similar thermodynamic states is
$$dU = \sum_i (dp_i\,E_i + p_i\,dE_i). \tag{6.103}$$
Similarly differencing the definition $S = -k_B \sum_i p_i \ln p_i$ of the thermodynamic entropy (equations 6.91 and 6.98), we obtain
$$dS = -k_B \sum_i (dp_i \ln p_i + dp_i) = -k_B \sum_i dp_i \ln p_i, \tag{6.104}$$
where the second equality exploits the fact that $\{p_i\}$ is a probability distribution, so $\sum_i p_i = 1$ always and therefore $\sum_i dp_i = 0$. By equation (6.93a), $\ln p_i = -E_i/(k_B T) - \ln Z$, so again using $\sum_i p_i = 1$, equation (6.104) can be rewritten
$$T\,dS = \sum_i E_i\,dp_i. \tag{6.105}$$
If we heat our system up at constant volume, the $E_i$ stay the same but the $p_i$ change because they depend on $T$. In these circumstances the increase in internal energy, $\sum_i E_i\,dp_i$, is the heat absorbed by the system. Consequently, equation (6.105) states that $T\,dS$ is the heat absorbed when the system is heated with no work done. This statement coincides with the definition of entropy in classical thermodynamics.
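A small numerical experiment makes the statement $T\,dS = \sum_i E_i\,dp_i$ concrete. The sketch below assumes $k_B = 1$, a hypothetical three-level spectrum held fixed (constant volume), and a small temperature step `dT`; agreement is expected only to first order in `dT`.

```python
import math

def gibbs(energies, temperature):
    """Gibbs probabilities exp(-E/T)/Z, with k_B = 1 assumed."""
    weights = [math.exp(-e / temperature) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def entropy(probs):
    """S = -sum_i p_i ln p_i in units with k_B = 1."""
    return -sum(p * math.log(p) for p in probs if p > 0)

energies = [0.0, 1.0, 2.0]   # hypothetical levels, unchanged during heating
T, dT = 1.0, 1e-5            # hypothetical temperature and small increment

p_cold = gibbs(energies, T)
p_hot = gibbs(energies, T + dT)

# Heat absorbed: the energies are fixed, only the probabilities change.
heat = sum(e * (ph - pc) for e, pc, ph in zip(energies, p_cold, p_hot))
t_ds = T * (entropy(p_hot) - entropy(p_cold))
print(heat, t_ds)
```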
Substituting equation (6.105) into equation (6.103) yields
$$dU = T\,dS - P\,dV, \tag{6.106a}$$
where
$$P \equiv -\sum_i p_i \frac{\partial E_i}{\partial V}. \tag{6.106b}$$
If we isolate our system from heat sources and then slowly change its volume, the adiabatic principle (§12.1) tells us that the system will stay in whatever stationary state it started in. That is, the $p_i$ will be constant while the volume of the thermally isolated system is slowly changed. In classical thermodynamics this is an ‘adiabatic’ change. From equation (6.104) we see that the entropy $S$ is constant during an adiabatic change, just as classical thermodynamics teaches.
Since dS = 0 in an adiabatic change, the change in U as V is varied must be the mechanical work done on the system, −P dV, where P is the pressure the system exerts. This argument establishes that the quantity P defined by (6.106b) is the pressure.
Differentiating equation (6.100) for the Helmholtz free energy and using equation (6.106a) to eliminate dU , we find that
dF = −SdT − P dV. (6.107)
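From this it immediately follows that
$$S = -\left(\frac{\partial F}{\partial T}\right)_V; \qquad P = -\left(\frac{\partial F}{\partial V}\right)_T. \tag{6.108}$$
The first of these equations was obtained above but the second one is new.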
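The two expressions for the pressure, $P = -\sum_i p_i\,\partial E_i/\partial V$ and $P = -(\partial F/\partial V)_T$, can be compared in a toy model. The sketch below is an illustration under stated assumptions: a single particle in a one-dimensional box of length $L$ (so $L$ plays the role of the volume), levels $E_n = n^2/L^2$ in units where $\hbar^2\pi^2/2m = 1$ and $k_B = 1$, and a truncation of the spectrum at `N_MAX` levels.

```python
import math

N_MAX = 60  # assumption: truncation of the spectrum; higher levels are negligible here

def levels(length):
    """Hypothetical spectrum E_n = n^2 / L^2 for n = 1 .. N_MAX."""
    return [n * n / length ** 2 for n in range(1, N_MAX + 1)]

def free_energy(temperature, length):
    """F = -T ln Z with k_B = 1."""
    z = sum(math.exp(-e / temperature) for e in levels(length))
    return -temperature * math.log(z)

def pressure_from_levels(temperature, length):
    """P = -sum_i p_i dE_i/dL, using dE_n/dL = -2 E_n / L for this spectrum."""
    es = levels(length)
    weights = [math.exp(-e / temperature) for e in es]
    z = sum(weights)
    return sum((w / z) * (2.0 * e / length) for e, w in zip(es, weights))

T, L, h = 5.0, 1.0, 1e-5   # hypothetical temperature, box length and step
p_levels = pressure_from_levels(T, L)
p_free = -(free_energy(T, L + h) - free_energy(T, L - h)) / (2 * h)
print(p_levels, p_free)
```

In this model each level scales as $L^{-2}$, so the first expression reduces to $P = 2U/L$.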
Equation (6.106a) is the central equation of thermodynamics since it embodies both the first and second laws of thermodynamics. This result establishes that classical thermodynamics is a consequence of applying quantum mechanics to systems of which we know very little. Remarkably, physicists working in the first half of the nineteenth century discovered thermodynamics long before quantum mechanics was thought of, using extremely subtle arguments concerning heat engines. Quantum mechanics makes these arguments redundant. Notwithstanding this redundancy, they continue to feature in undergraduate syllabuses the world over because they are beautiful. But then so are copperplate writing and slide rules, which have rightly disappeared from schools.
A possible explanation for the survival of thermodynamics as an independent discipline is as follows. Equations (6.99), (6.100) and (6.108) establish that any thermodynamic quantity can be obtained from the dependence of the partition function on $T$ and $V$. Unfortunately, this dependence can be calculated for only a very few Hamiltonians. In almost all practical cases we cannot proceed by evaluating $Z$. However, once we know that $Z$ and therefore $F$ and $S$ exist, we can determine their functional forms from experimental data. For example, by measuring the heat released on cooling our system at constant volume to absolute zero, we can determine its entropy $S = \int dQ/T$. Similarly, we can measure the system’s pressure as a function of $T$ and $V$. Then by integrating equation (6.107) we can obtain $F(T, V)$ and thus infer $Z(T, V)$. In none of these operations is the involvement of quantum mechanics apparent, so engineers and chemists, who make extensive use of thermodynamics, are generally unaware that it is a consequence of quantum mechanics. Quantum mechanics provides us with relations between thermodynamic quantities but does not enable us to evaluate the quantities themselves. Evaluation must still be done with nineteenth-century technology.
Although thermodynamic systems are inherently macroscopic, quantum mechanics plays a central role in determining their thermodynamic quantities because it defines the stationary states we have to sum over in (6.93b) to form the partition function. Before quantum mechanics was born, the thermodynamic properties of an ideal gas – one composed of molecules that occupy negligible volume and interact only at very short range – were obtained by summing over the phase-space locations of each molecule independently. In this procedure there are six distinct states of a three-molecule gas in which there are molecules at the phase-space locations $x_1$, $x_2$ and $x_3$: in one state molecule 1 is at $x_1$, molecule 2 is at $x_2$ and molecule 3 is at $x_3$, and a distinct state is obtained by swapping the locations of molecule 1 and molecule 2, and so forth. Quantum mechanics teaches that the state of the gas is completely specified by listing the three occupied states, $|1\rangle$, $|2\rangle$ and $|3\rangle$, for it is meaningless to say which molecule is in which state. The classical way of counting states leads to absurd results even for gases at room temperature (Problem 6.22). At low temperatures another aspect of classical physics leads to erroneous results: the low-lying energy levels of a gas are distributed discretely rather than continuously in $E$, with the result that specific heats always vanish in the limit $T \to 0$ (Nernst’s theorem; Problem 6.23), contrary to the prediction of classical physics.
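The difference between the two ways of counting can be made explicit for the three-molecule example. The sketch below is only a counting illustration; the labels `x1`, `x2`, `x3` stand for the three occupied locations and the data structures are chosen for the example.

```python
from itertools import permutations

locations = ("x1", "x2", "x3")   # the three occupied phase-space locations
molecules = (1, 2, 3)            # labels that classical counting treats as meaningful

# Classical counting: every assignment of labelled molecules to locations is distinct.
classical_states = {tuple(zip(molecules, perm)) for perm in permutations(locations)}

# Quantum counting: only the set of occupied states matters, not which molecule is where.
quantum_states = {frozenset(loc for _, loc in state) for state in classical_states}

print(len(classical_states), len(quantum_states))
```

Classical counting yields $3! = 6$ states where quantum mechanics recognises a single one.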
An important lesson to be learnt from the failure of classical physics to predict the properties of an ideal gas is the importance in quantum mechanics of thinking holistically: we have to sum over the quantum states of the whole cylinder of gas, not over the states of individual molecules. This is analogous to the importance for understanding EPR phenomena of considering the quantum system formed by the entangled particles taken together. In quantum mechanics the whole is generally very much more than the sum of its parts because there are non-trivial correlations between the parts.¹³
6.5 Measurement
When we laid the foundations of quantum mechanics in §1.2, we focused on ‘ideal’ measurements, which are perfectly reproducible. We saw that an ideal measurement jogs the system into a special state in which the system is not affected by further measurements of the same observable. Mathematically, the impact of an ideal measurement on a system is represented by the ‘collapse’ of the system’s state into one of the eigenkets of the observable’s operator. This scheme is known as the Copenhagen interpretation of quantum mechanics because it was thrashed out in Niels Bohr’s institute in Copenhagen.
The Copenhagen interpretation relies on the concept of an ideal measurement not only to predict the outcomes of experiments given the system’s state $|\psi\rangle$, but also to deduce what $|\psi\rangle$ is: the claim that a system is in a given state is invariably based on the assumption that the energy or momentum or some other observable has been measured, so the system was recently in a known eigenstate of that observable, and has evolved from that point according to the TDSE, which we know how to solve.
Although useful, the concept of an ideal measurement is not physically satisfying. For one thing it obliges us to imagine that a system’s state changes instantaneously, and without reference to the TDSE, into some other state. Moreover, we have seen that the eigenstates of the operators of such key observables as position and momentum are physically unrealisable (§2.3.2). Finally, it does not apply to real measurements, which always come with non-zero error bars.
The concepts introduced in this chapter should enable us to give a much fuller and more convincing explanation of what happens when we make a measurement than the simple collapse hypothesis. Indeed, the ability of density operators to handle classical uncertainty arising from incomplete information should enable us to take into account non-zero error bars. Moreover, we make measurements by bringing an instrument into contact with the system to be measured and allowing it to entangle itself with the system, so our discussion of entanglement and reduced density operators must be key to a satisfactory description of measurement.
¹³ The origin of these correlations is the subject of §11.1.
Let A denote the system we are measuring, and B denote the measuring instrument, which will itself be a dynamical system governed by quantum mechanics. In general B will be a macroscopic system, so we do not expect to know its state at the start of the measuring process, and must describe it via its density operator $\rho_B$ (§6.3). Given our ignorance of the initial state of B, the evolution of A will become increasingly uncertain from the moment it is brought into contact with B. That is, A will experience decoherence. This observation suggests that the uncertainty in the outcome of the measurement (which is reflected in the probabilistic nature of state collapse) may be entirely due to our ignorance of the initial state of B.
Let $|A, \psi\rangle$ be the state of A before the measurement. Once the measuring process is underway, A will cease to be characterised by a pure state and predictions will have to be based on the density operator $\rho_{AB}$ of the composite system AB, whose evolution we can follow. This evolution depends on the initial state $|A, \psi\rangle$ of A, and our hope is that the structure of the post-measurement form of $\rho_{AB}$ will enable us to infer the post-measurement state of A.
The density operator $\rho_{AB}$, which eventually contains the results of our measurement, is best understood by examining its eigenkets $|AB, k\rangle$, where $k$ enumerates these objects, and its eigenvalues $p_k$; following the measurement, AB is in one of the states $|AB, k\rangle$ with probability $p_k$. Let’s suppose that A is just the spin of a spin-half particle and B is an apparatus for measuring the $z$ component of this spin. Then we have
$$|A, \psi\rangle = a_-|-\rangle + a_+|+\rangle \tag{6.109}$$
for some amplitudes $a_\pm$ and we hope that after the measurement the eigenkets of $\rho_{AB}$ that contribute most of the probability are of the form
$$|A, -\rangle|B, n\rangle \quad\text{and}\quad |A, +\rangle|B, n'\rangle, \tag{6.110}$$
where $n$ and $n'$ refer to states of B that are dramatically different from one another, so it is easy to tell whether B is in a state of class $n$ or of class $n'$ without measuring B carefully enough to come close to determining what its quantum state actually is. We hope further that
$$\sum_n p_n = |a_-|^2 \quad\text{and}\quad \sum_{n'} p_{n'} = |a_+|^2, \tag{6.111}$$
for then if examination of B reveals that it was in a state of class $n$, which will occur with probability $|a_-|^2$, we can infer that the measurement has left A in the state $|-\rangle$, while if it turns out that B is in a state of class $n'$ (with probability $|a_+|^2$), then A has been left in the state $|+\rangle$. Thus if most of the probability in $\rho_{AB}$ is associated with states of the form (6.110), we can understand how to recover the physical content of the collapse hypothesis from a proper dynamical model of the measurement process.
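The bookkeeping behind (6.110) and (6.111) can be sketched as follows. This is not a dynamical model of a measurement: it simply assumes that the eigenkets of $\rho_{AB}$ already have the hoped-for product form, and the amplitudes and pointer-state weights below are invented for illustration.

```python
# Hypothetical real amplitudes with |a_-|^2 + |a_+|^2 = 1.
a_minus, a_plus = 0.6, 0.8

# Hypothetical weights over the apparatus states within each class; each set sums to 1.
pointer_minus = {"n1": 0.5, "n2": 0.3, "n3": 0.2}   # states of class n
pointer_plus = {"m1": 0.7, "m2": 0.3}               # states of class n'

# Eigenvalues p_k of rho_AB, keyed by (spin label of A, apparatus state of B).
eigen = {}
for label, q in pointer_minus.items():
    eigen[("-", label)] = abs(a_minus) ** 2 * q
for label, q in pointer_plus.items():
    eigen[("+", label)] = abs(a_plus) ** 2 * q

def class_probability(spin):
    """Total probability of the apparatus class paired with the given spin state."""
    return sum(p for (s, _), p in eigen.items() if s == spin)

print(class_probability("-"), abs(a_minus) ** 2)
print(class_probability("+"), abs(a_plus) ** 2)
```

Coarse inspection of B, which only distinguishes the two classes, then returns each outcome with the probability that the collapse hypothesis assigns to it.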
This picture of the measurement process can be straightforwardly generalised to the case in which a complete basis for system A contains not two states but infinitely many: in this case we would hope to find that after the measuring process is complete, the eigenkets of $\rho_{AB}$ that contain most of the probability are of the form
$$|A, i\rangle|B, n_i\rangle \tag{6.112}$$
where $i$ enumerates members of a basis for system A and $|B, n_i\rangle$ is a member of the $i$th group of states in a basis for system B, it being possible to determine whether B’s state lies in the $i$th group rather than the $j$th without actually determining the state of B. If, for example, B were an apparatus to measure energy and A were an anharmonic oscillator (§3.2.1), the states $|A, i\rangle$ would be the oscillator’s conventional stationary states so infinitely many values of $i$ would occur and for large $i$, $E_{i+1} - E_i$ would be small. Consequently, we would not expect to be able to distinguish between the $i$th and the $(i+1)$th groups of B’s states, with the result that we would remain uncertain what A’s energy was; we would assign non-zero probabilities (computed from the $p_{n_i}$) to all $E_i$ which were paired with groups $i$ of B’s states that were compatible with our examination of B. Thus we could model non-ideal measurements, including their error bars.
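One way to picture the assignment of probabilities in a non-ideal measurement is the following sketch. It assumes that the probabilities $p_{n_i}$ have already been summed within each group $i$, and that examination of B is compatible with a set of neighbouring groups; the numbers and the function name `posterior` are hypothetical.

```python
def posterior(group_probs, compatible):
    """Renormalise the group probabilities over the groups compatible with the reading."""
    total = sum(p for i, p in group_probs.items() if i in compatible)
    return {i: (p / total if i in compatible else 0.0)
            for i, p in group_probs.items()}

# Hypothetical total probability of each group i of B's states (sum of p_{n_i} over n_i).
group_probs = {0: 0.05, 1: 0.15, 2: 0.30, 3: 0.30, 4: 0.15, 5: 0.05}

# Hypothetical reading: B cannot be resolved more finely than "group 2, 3 or 4".
reading = {2, 3, 4}

post = posterior(group_probs, reading)
for i, p in post.items():
    print(i, p)
```

The spread of the resulting distribution over $E_i$ plays the role of the error bar on the measured energy.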
The feasibility of this programme for recovering the world of real experiments from the convenient fiction of ideal measurements is unproven. Given a system A, the challenge is to design a system B with its Hamiltonian $H_B$ and a coupling $H_{\rm int}$ to A, such that the entanglement of this system from a generic impure state $\rho_B$ (such as the Gibbs distribution, eq. 6.94) leads after a reasonable time to a density operator $\rho_{AB}$ whose eigenkets have the trivial form (6.112); unless B is cunningly designed the eigenkets of $\rho_{AB}$ will be states in which A and B are entangled, and