QUANTUM MECHANICAL DISTRIBUTION FUNCTIONS: THE WIGNER FUNCTION AND THE MARCINKIEWICZ THEOREM

§3.1. Introduction

Generally speaking, contemporary quantum transport physics is performed assuming the state-space interpretation of quantum mechanics. Historically, in order to incorporate the existence of discrete energy levels in systems described by Hamiltonians which were continuous functions of position and momentum, it was found necessary to introduce an algebra of non-commuting Hermitian operators acting on vectors in some state space [81].

Within this context the state of a particular system is characterised by a vector (denoted by $|\psi\rangle$) in this space, observable quantities by Hermitian operators $\hat{S}$, and the allowed values $S$ that any measurement of this observable may yield are obtained from an eigenvalue equation

$$\hat{S}|\psi\rangle = S|\psi\rangle \qquad (3.1.1.)$$

which, upon using the completeness relation

$$\int d\tau\,\psi^{*}(\tau)\psi(\tau) = \langle\psi|\psi\rangle = 1 \qquad (3.1.2.)$$

yields

$$S = \langle\psi|\hat{S}|\psi\rangle \qquad (3.1.3.)$$

where $\langle\psi|$ are vectors in the space conjugate to the one defined by $|\psi\rangle$. This description corresponds to the "pure state" situation where we know that our system is in a particular state $|\psi\rangle$. Of course in general we can only assign a statistical probability $P_n$ of the system being in any given state $|\psi_n\rangle$, and so if the corresponding eigenvalue is $S_n$ we have a statistical interpretation of the average value of an observable, denoted by

$$\langle S\rangle = \sum_n P_n S_n = \sum_n P_n\langle\psi_n|\hat{S}|\psi_n\rangle \qquad (3.1.4.)$$

$$= \sum_{n,m} P_n\langle\psi_m|\psi_n\rangle\langle\psi_n|\hat{S}|\psi_m\rangle = \mathrm{Tr}[\hat{\rho}\hat{S}] \qquad (3.1.5.)$$

where (3.1.5.) defines the statistical density matrix

$$\hat{\rho} = \sum_n P_n|\psi_n\rangle\langle\psi_n| \qquad (3.1.6.)$$

that totally characterises the system. This density matrix is the central object in quantum theory viewed from the state-space interpretation: it is the most general representation of a quantum mechanical system and once given may be used to calculate the expectation value of observables through (3.1.4.) by summing the leading diagonal of the matrix product of the density matrix with the matrix of the operator corresponding to the observable.
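As a minimal numerical sketch of the statement above, the following compares the weighted sum of eigenvalues with the trace of the density matrix times the operator. The two-level observable and the weights are illustrative values chosen here, not taken from the text, and the states are assumed orthonormal.

```python
import numpy as np

# Hypothetical two-level example: the observable S and the state
# probabilities P below are illustrative values, not from the text.
S = np.array([[1.0, 0.5],
              [0.5, -1.0]])          # Hermitian (real symmetric) observable

# Orthonormal states |psi_n>: the eigenvectors of S, so that
# S|psi_n> = S_n|psi_n> as in the eigenvalue equation (3.1.1.).
S_n, psi = np.linalg.eigh(S)         # columns of psi are the |psi_n>
P = np.array([0.3, 0.7])             # statistical weights, summing to 1

# Density matrix rho = sum_n P_n |psi_n><psi_n|
rho = sum(P[n] * np.outer(psi[:, n], psi[:, n].conj()) for n in range(2))

avg_from_weights = float(np.dot(P, S_n))          # sum_n P_n S_n
avg_from_trace = float(np.trace(rho @ S).real)    # Tr[rho S]
```

The two averages agree to rounding error, and the trace of `rho` is unity, which is the normalisation of the weights.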

However, this state-space interpretation is far removed from the phase space description that has been used so successfully in classical mechanics. In this picture the state of a system is represented by a point in a 6N-dimensional phase space of the co-ordinates and momenta of the N-particle system. In terms of this phase space a distribution function f(p,q,t) may be introduced as the density of phase points, in terms of which the expected value of an observable (now represented by a function A(p,q) of the phase space co-ordinates) is calculated in the conventional sense of a statistical ensemble, i.e.

$$\langle A\rangle = \int dp\,dq\,A(p,q)f(p,q,t) \qquad (3.1.7.)$$

Therefore if we know the distribution function we can calculate any relevant system observable, and so in terms of dynamics the distribution could be obtained in principle from a transport equation such as the one introduced by Boltzmann.

The differences between the two interpretations of the framework of quantum and classical mechanics are summarised in Table 3.1.1.

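APPROACHES                                CLASSICAL                                    QUANTUM

Base space components                     Phase-space points $(p,q)$                   State-space vectors $|\psi\rangle$

Space densities                           Distribution function $f(p,q,t)$             Statistical density matrix $\hat{\rho}$

Liouville's equation of motion            $\partial f/\partial t = \sum_i(\partial_{q_i}H\,\partial_{p_i}f - \partial_{p_i}H\,\partial_{q_i}f)$    $\partial\hat{\rho}/\partial t = (i\hbar)^{-1}(\hat{H}\hat{\rho} - \hat{\rho}\hat{H})$

Expected value of observable quantities   $\int dp\,dq\,A(p,q)f(p,q,t)$                $\mathrm{Tr}[\hat{A}\hat{\rho}]$

TABLE 3.1.1.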

Obviously the algebras governing quantum and classical mechanics are different; however, comparisons between the two are complicated by the fact that the algebras operate on different types of basis space. Therefore, to exhibit explicitly the differences in the dynamics of quantum and classical mechanics, it would be preferable to look at the action of the two algebras on the same basis space [17]. If we choose to look at the problem from a unified point of view based on the phase space interpretation, then the hope must be that some of the many techniques, both analytic and numerical, developed for classical transport theory (and particularly with regard to the Boltzmann equation) may be transposed successfully to analyse quantum transport problems.

Before we consider quantum transport theory in phase space (Chapter 4) we need to discuss a few problems associated with general quantum mechanical phase space distributions, since we already know they cannot have the same interpretation as classical distributions: the function refers to specific momentum and position values at the same time, something against the spirit of the uncertainty relations.

In order to introduce quantum distributions it is useful first to consider the relevance to physical transport theories of a mathematical theorem due to Marcinkiewicz [62]. This theorem is illustrated in the next section, where its implications are discussed in terms of a need to relax some of the inherent restrictions we impose on classical distribution functions.

§3.2. The Marcinkiewicz Theorem

We have already noted in §2.3. that a large class of physical transport theories involve the reduction of an infinite set of coupled equations (the BBGKY hierarchy) to a finite number by the complete neglect of high order correlations between many particles. However, as was first pointed out by Robinson and later discussed by Rajagopal, Sudarshan [75] and Titulaer [93], this procedure in general violates a mathematical theorem proposed by Marcinkiewicz [62] in 1938, which may be stated in the following terms.

The theorem refers to the behaviour of a quantity known as the characteristic or moment generating function C(t), which is defined as the expectation value of exp[itx] where x is a generalised random variable (which in quantum mechanics may be interpreted as a q-number [53]), i.e.

$$C(t) = \langle\exp[itx]\rangle = \sum_{n=0}^{\infty}\frac{(it)^n}{n!}\mu_n \qquad (3.2.1.)$$

where $\mu_n = \langle x^n\rangle$ is the n-th moment of the distribution. Expression (3.2.1.) may be used to define the n-th cumulant $\kappa_n$ by expressing it in terms of the cumulant generating function $K(t) = \sum_{n=1}^{\infty}\frac{(it)^n}{n!}\kappa_n$ as

$$C(t) = \exp K(t) \qquad (3.2.2.)$$

The cumulants are not simple averages but may be expressed in terms of lower order moments by equating the coefficients in the respective expansions of (3.2.1.) and (3.2.2.), for example

$$\kappa_2(x_1,x_2) = \langle x_1x_2\rangle - \langle x_1\rangle\langle x_2\rangle \qquad (3.2.3.)$$

$$\kappa_3(x_1,x_2,x_3) = \langle x_1x_2x_3\rangle - \{\langle x_1\rangle\langle x_2x_3\rangle + \langle x_2\rangle\langle x_3x_1\rangle + \langle x_3\rangle\langle x_1x_2\rangle\} + 2\langle x_1\rangle\langle x_2\rangle\langle x_3\rangle \qquad (3.2.4.)$$

Therefore we see that the cumulants are what are known physically as correlation functions, and the truncation schemes mentioned in §2.3. usually relied on the assumption that $\kappa_3 = 0$ in (3.2.4.), where the random variable x referred to the local density of electrons.
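As a small check of the moment-to-cumulant relations, the following sketch specialises to a single variable (x1 = x2 = x3 = x), for which the third cumulant is mu3 - 3 mu1 mu2 + 2 mu1^3. The two-valued test variable and the function name are illustrative choices made here, not part of the text.

```python
from fractions import Fraction

def cumulants_from_moments(mu1, mu2, mu3):
    """Return (kappa1, kappa2, kappa3) from the raw moments mu_n = <x^n>."""
    k1 = mu1
    k2 = mu2 - mu1**2
    k3 = mu3 - 3 * mu1 * mu2 + 2 * mu1**3
    return k1, k2, k3

# Two-valued variable x in {0, 1} with probability p of x = 1:
# every raw moment <x^n> then equals p.
p = Fraction(1, 4)
k1, k2, k3 = cumulants_from_moments(p, p, p)

# Known closed forms for this variable, used as a cross-check.
expected_k2 = p * (1 - p)
expected_k3 = p * (1 - p) * (1 - 2 * p)
```

Exact rational arithmetic is used so that the comparison with the closed forms involves no rounding.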

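If F(x) is the probability distribution corresponding to the random variable x, with density f(x) so that dF(x) = f(x)dx, then the definition of the characteristic function (3.2.1.) may be written in terms of the probability density as

$$C(t) = \int\exp[itx]\,dF(x) = \int\exp[itx]\,f(x)\,dx \qquad (3.2.5.)$$

which upon inverting furnishes a definition of the distribution function f(x) in terms of the characteristic function:

$$f(x) = \frac{1}{2\pi}\int\exp[-itx]\,C(t)\,dt \qquad (3.2.6.)$$

In this sense the characteristic function may be regarded as the fundamental quantity, and the distribution function is a secondary defined quantity.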

One particularly important example of a characteristic function is

$$C_1(t) = \exp[iat - \tfrac{1}{2}\sigma^2t^2] \qquad (3.2.7.)$$

in which case its corresponding probability density is just the normal distribution [59]

$$f_1(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp[-(x - a)^2/2\sigma^2] \qquad (3.2.8.)$$
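The correspondence between the normal density and its characteristic function can be checked numerically. The sketch below integrates exp(itx) against the Gaussian density with a simple Riemann sum and compares the result with the closed form exp[iat - sigma^2 t^2/2]; the values of `a` and `sigma` and the grid sizes are arbitrary illustrative choices.

```python
import numpy as np

a, sigma = 0.7, 1.3                      # illustrative mean and width
x = np.linspace(a - 12 * sigma, a + 12 * sigma, 4001)
dx = x[1] - x[0]
density = np.exp(-(x - a)**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)

def char_numeric(t):
    """C(t) = integral of exp(itx) f(x) dx, by a simple Riemann sum."""
    return np.sum(np.exp(1j * t * x) * density) * dx

def char_closed(t):
    """Closed form exp[iat - sigma^2 t^2 / 2] of the Gaussian case."""
    return np.exp(1j * a * t - 0.5 * sigma**2 * t**2)

t_values = np.linspace(-3.0, 3.0, 13)
max_error = max(abs(char_numeric(t) - char_closed(t)) for t in t_values)
```

Because the integrand is smooth and negligible at the ends of the grid, the plain Riemann sum is already accurate far beyond plotting precision.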

We now come to a statement of the Marcinkiewicz theorem which, in terms of the characteristic function, says that if the cumulant generating function is a polynomial then, to be consistent with conventional probability theory, this polynomial must necessarily be no more than quadratic. In other words, if there are only a finite number of non-zero correlation functions then all correlations above second order must be taken to be zero.
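The content of the theorem can be illustrated numerically with a simple counter-example. The sketch below takes the purely quartic choice K(t) = -t^4 (an illustrative polynomial picked here, not one used in the text), inverts C(t) = exp K(t) by a Riemann-sum Fourier transform, and shows that the resulting function, although real and normalised, dips below zero and so is not a probability density.

```python
import numpy as np

t = np.linspace(-4.0, 4.0, 1601)          # exp(-t^4) is negligible beyond |t| = 4
dt = t[1] - t[0]
C = np.exp(-t**4)                          # K(t) = -t^4: a quartic polynomial

x = np.linspace(-15.0, 15.0, 601)
dx = x[1] - x[0]

# f(x) = (1/2pi) * integral exp(-itx) C(t) dt.
# C is even in t, so only the cosine part of the exponential survives.
f = (np.cos(np.outer(x, t)) @ C) * dt / (2 * np.pi)

total = np.sum(f) * dx                     # close to C(0) = 1
most_negative = f.min()                    # strictly below zero
```

The negative lobes appear a few units away from the origin; replacing the quartic by a quadratic, for example `C = np.exp(-t**2)`, removes them and returns a Gaussian.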

The implications of this theorem with regard to truncation schemes in many body approximations are clear: as long as one is content with a transport theory defined by the first two moments of the random variable considered, then consistent results may be obtained. However, this consistency will be lost if attempts are made to extend the theory by including a finite number of non-zero higher order moments. Therefore it would appear that the most general situation that may be accurately described by using the truncation technique is a generalised free field model given by the displaced Maxwellian akin to (3.2.8.). Although this represents a very strong argument against terminating the BBGKY hierarchy in such a fashion, the limitations imposed by this theorem have largely been ignored in the literature.

The theorem also has consequences with regard to the construction of quantum mechanical phase space distribution functions, but in order to see why, it is necessary to rederive the Marcinkiewicz theorem using concepts more familiar to the physicist than to the pure mathematician.

First of all, if we consider what useful properties characterise a distribution function then, by analogy with classical mechanics, we would perhaps impose the three minimal requirements that the distribution function be

(a) Real

(b) Bounded (in the sense that $\int dF(x)$ is finite) (3.2.9.)

and (c) Non-negative

Therefore we would also suppose that these three conditions represent restrictions on the behaviour of the characteristic function defining the distribution function, and indeed these restrictions may be derived as follows.

We see from (3.2.6.) that if the distribution function is real then the characteristic function must be Hermitian, i.e.

(a1) $C^{*}(t) = C(-t)$ (3.2.10.)

Secondly, if the distribution function is bounded (which we would require in the quantum mechanical case so that the projections in momentum and position may be regarded as true probability distributions) it follows from (3.2.5.) that C(t) is also bounded, and moreover

(b1) $|C(t)| \le 1$ (3.2.11.)

if f is non-negative.

The third requirement on the distribution function, that of non-negativity, implies that the equality in (3.2.11.) holds only for t = 0, i.e.

(c1) $|C(t)| = 1$ only if $t = 0$

This latter restraint is not obvious but may be illustrated as follows. From (3.2.5.) we have

$$|C(t)|^2 = [\mathrm{Re}\,C(t)]^2 + [\mathrm{Im}\,C(t)]^2 \le \int\cos^2(tx)\,dx\int f^2(x)\,dx + \int\sin^2(tx)\,dx\int f^2(x)\,dx$$

by the Schwarz inequality [90], where the equality holds only if the function g(x) (= cos tx or sin tx) is directly proportional to f(x) or is altogether independent of x, in which case we must have t = 0, at which value

$$|\mathrm{Re}\,C(t)| = \int|f(x)|\,dx \qquad (3.2.13.)$$

and

$$|\mathrm{Im}\,C(t)| = 0 \qquad (3.2.14.)$$

Since we are assuming that the distribution function is non-negative we must have

$$\left|\int dF(x)\right| = \int|dF(x)| = 1 \qquad (3.2.15.)$$

so that through (3.2.13.) we have the restriction (c1), i.e. that |C(t)| = 1 only at t = 0.

Note that (c1) need not be true if the distribution function is allowed to assume negative values, since then

$$\int|dF(x)| > \left|\int dF(x)\right| = 1 \qquad (3.2.16.)$$

which would permit |C(t)| to be greater than unity for a range of t values, and consequently it would be possible for |C(t)| to equal unity at values of t other than zero.

Therefore, to reiterate, we recognise that the three physically reasonable requirements (3.2.9.) impose three constraints (a1)-(c1) on the behaviour of the characteristic function. We will now see that these three constraints are sufficient to generate the Marcinkiewicz theorem.

We have seen that the characteristic function may be expanded in terms of correlation functions through

$$C(t) = \exp\left[\sum_{n=1}^{\infty}\frac{(it)^n}{n!}\kappa_n\right]$$

Consequently if we assume only a finite number of correlations are involved so that

$$C(t) = \exp P(t)$$

where P(t) is a polynomial of the form

$$P(t) = itQ(t^2) - t^2R(t^2) \qquad (3.2.17.)$$

and Q, R are polynomials in $t^2$, then restrictions (a1)-(c1) may be used to determine the precise form of Q, R as follows.

Since we know C(t) is Hermitian (a1), Q and R must be polynomials with real coefficients (in other words the correlations themselves are real). Consequently the real and imaginary parts of the characteristic function may be expressed as

$$\mathrm{Re}\,C(t) = \exp[-t^2R(t^2)]\cos[tQ(t^2)] \qquad (3.2.18.)$$

$$\mathrm{Im}\,C(t) = \exp[-t^2R(t^2)]\sin[tQ(t^2)] \qquad (3.2.19.)$$

Moreover, since the magnitude of C(t) is governed by $\exp[-t^2R(t^2)]$ and we know from (b1) that $|C(t)| \le 1$, we must ensure that $R(t^2)$ cannot assume a negative value for any choice of t, and consequently we must infer that $R(t^2)$ is a real polynomial with positive coefficients.

We now impose condition (c1) on (3.2.18.) and (3.2.19.) in two stages. First we recall from (c1) that t = 0 is the only value of t for which $|\mathrm{Re}\,C(t)| = 1$, and so, from (3.2.18.), the bound

$$|\mathrm{Re}\,C(t)| \le \exp[-t^2R(t^2)]$$

must be less than unity for all t other than zero. The only conclusive way of ensuring this is to assume that $R(t^2) = r_0$ = constant for all t.

Similarly, if t = 0 is the only solution that satisfies Im C(t) = 0 then from (3.2.19.), to be sure that $Q(t^2)$ has no roots, it is necessary to take $Q(t^2) = q_0$ = constant.

Therefore combining the restrictions on (3.2.17.) due to the conditions (a1)-(c1) leads to a characteristic function of the form

$$C(t) = \exp(iq_0t - r_0t^2) \qquad (3.2.20.)$$

where $q_0$, $r_0$ are real constants and $r_0$ is positive, i.e. the characteristic function must be the exponential of a quadratic polynomial, which is precisely a restatement of the Marcinkiewicz theorem. Indeed (3.2.20.) is identical to the characteristic function of the normal probability density (3.2.7.), even down to the prediction of the correct sign of $r_0$ (which, being positive, may be written in the form $\tfrac{1}{2}\sigma^2$, say).

We note that although the original proof of the Marcinkiewicz theorem [62] relied on involved mathematics, the construction just presented depends only on the three physically reasonable restraints (3.2.9.) which have been borrowed from our knowledge of classical distribution functions. Consequently it is not difficult to see that the Marcinkiewicz theorem presents a severe restriction on the construction of quantum distribution functions in general, which may be expressed in these terms: if we require a real, bounded, non-negative distribution function constructed out of a finite number of correlation functions, then the only probability density we are allowed is the normal distribution, being constructed out of the first two moments only.

This is evidently far too restrictive for practical purposes and yet is a direct consequence of requiring that the distribution function satisfy the three apparently reasonable constraints (3.2.9.).

Of course in quantum dynamical systems, even if we could initialise a distribution function to be in this restrictive form, its subsequent evolution (under an electric field for instance) would progressively involve moments higher than second order, due both to collisions with other particles and to interactions with the inbuilt system potentials such as boundaries and static charge inhomogeneities introduced by doping, for example. Moreover, we could only ever be aware of a finite number of these moments, because if we had the information of all moments we would know everything about our system, which is contrary to the fundamental principles of quantum mechanics; herein lies the contradiction with the Marcinkiewicz theorem.

We can see that in order to reconcile the construction of a quantum distribution function with this theorem, we are forced to relax at least one of the prejudices in (3.2.9.) inherent in classical distribution theory. The choice we will use in this thesis is to take a real, bounded distribution function which by the foregoing discussion must in general be allowed to assume negative values. This particular choice is the Wigner distribution function to be constructed in the next section.

It is important to recognise that the construction of a quantum distribution function is not unique; it would be perfectly feasible to construct a quantum distribution function which was always positive [44], for example, but if we required it to be bounded then it would also in general have to be a complex valued quantity. The construction may only be made unique by the additional imposition of a set of extra restraints. This lack of uniqueness is not surprising, since the only significance of a distribution function is its use in calculating the correct average values of observable quantities: as long as the calculations lead to the same value it does not matter what type of distribution function is employed. The situation is similar to the choice of a convenient representation in the conventional state-space approach to quantum mechanics.

§3.3. Wigner Phase Space Functions

In the preceding section a quantity C(t) known as a characteristic function was mentioned (3.2.5.) which was considered to be the fundamental quantity in terms of which a general distribution function is defined as

$$f(x) = \frac{1}{2\pi}\int dt\,\exp[-itx]\,C(t)$$

This section considers an explicit construction of such a distribution function appropriate to a quantum mechanical phase space, where the random variables are generated by q-numbers $\hat{p}$, $\hat{q}$ taken to obey the commutation relations

$$[\hat{q},\hat{p}] = i\hbar \qquad (3.3.1.)$$

Therefore in terms of these variables the general distribution function is defined as

$$f(p,q) = \left(\frac{1}{2\pi}\right)^2\int\!\!\int d\xi\,d\eta\,\exp[-i(\xi p + \eta q)]\,C(\xi,\eta) \qquad (3.3.2.)$$

where the characteristic function

$$C(\xi,\eta) = \langle\exp[i\xi\hat{p} + i\eta\hat{q}]\rangle = \langle\hat{C}(\xi,\eta)\rangle = \mathrm{Tr}[\hat{C}\hat{\rho}] \qquad (3.3.3.)$$

Note that at this stage (3.3.3.) assumes nothing regarding the form of the characteristic function, which remains indeterminate until we specify the relationship between $\hat{p}$ and $\hat{q}$ in order that the average in (3.3.3.) may be calculated. It becomes determined when we assume the commutation relations (3.3.1.) and the position representation of the momentum operator, i.e.

$$\hat{p} = -i\hbar\,\frac{\partial}{\partial x} \qquad (3.3.4.)$$

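since then the characteristic function may be evaluated using the complete set of wavefunctions $\{\psi_n(x)\}$ through (3.1.4.) as

$$C(\xi,\eta) = \sum_n P_n\int d\tau\,\psi_n^{*}(\tau)\exp[i\xi\hat{p} + i\eta\hat{q}]\,\psi_n(\tau) \qquad (3.3.5.)$$

$$= \sum_n P_n\int d\tau\,\psi_n^{*}(\tau)\exp[i\eta\hat{q}]\exp[i\xi\hat{p}]\,\psi_n(\tau)\exp[i\xi\eta\hbar/2] \qquad (3.3.6.)$$

$$= \sum_n P_n\int d\tau\,\psi_n^{*}(\tau)\exp[i\eta\hat{q}]\,\psi_n(\tau + \hbar\xi)\exp[i\xi\eta\hbar/2] \qquad (3.3.7.)$$

$$= \sum_n P_n\int dx\,\psi_n^{*}(x - \hbar\xi/2)\,\psi_n(x + \hbar\xi/2)\exp[i\eta x] \qquad (3.3.8.)$$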

In going from (3.3.5.) to (3.3.6.) we have used the Baker-Hausdorff lemma [96],

$$\exp[\hat{A} + \hat{B}] = \exp[\hat{A}]\exp[\hat{B}]\exp\left(\tfrac{1}{2}[\hat{B},\hat{A}]\right) \qquad (3.3.9.)$$

which holds whenever the operators $\hat{A}$ and $\hat{B}$ commute with their commutator, i.e. whenever

$$[\hat{A},[\hat{A},\hat{B}]] = [\hat{B},[\hat{A},\hat{B}]] = 0 \qquad (3.3.10.)$$

(which of course they do in the case of the operators $\hat{p}$, $\hat{q}$, since their commutator is just a constant (3.3.1.)). Also the transformation from (3.3.6.) to (3.3.7.) was performed utilising the translation property

$$\exp[a\,\partial_x]f(x) = f(x + a) \qquad (3.3.11.)$$
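The translation property can be checked directly for a polynomial, for which the power series of the exponential in d/dx terminates. The sketch below is only an illustration: the polynomial coefficients, the shift `a` and the helper name `translate` are hypothetical choices made here.

```python
import math
from numpy.polynomial import Polynomial

f = Polynomial([1.0, -2.0, 0.5, 3.0])     # f(x) = 1 - 2x + 0.5x^2 + 3x^3
a = 0.8                                    # illustrative shift

def translate(poly, shift):
    """Apply exp[shift * d/dx] to a polynomial as a power series in d/dx.

    For a polynomial the series terminates after degree + 1 terms.
    """
    result = Polynomial([0.0])
    for k in range(poly.degree() + 1):
        derivative = poly.deriv(k) if k > 0 else poly
        result = result + (shift**k / math.factorial(k)) * derivative
    return result

shifted = translate(f, a)
x0 = 1.7
lhs = shifted(x0)          # (exp[a d/dx] f)(x0)
rhs = f(x0 + a)            # f(x0 + a)
```

The two values agree to rounding error, which is just Taylor's theorem written in operator form.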

If we now substitute this reduced form of the characteristic function (3.3.8.) into the general expression (3.3.2.) we obtain a particular distribution function:

$$f(p,q,t) = \left(\frac{1}{2\pi}\right)^2\int\!\!\int d\xi\,d\eta\,\exp[-i(\xi p + \eta q)] \times \sum_n P_n\int dx\,\psi_n^{*}(x - \hbar\xi/2,t)\,\psi_n(x + \hbar\xi/2,t)\exp[i\eta x] \qquad (3.3.12.)$$

This particular expression of a quantum distribution function defined in terms of wavefunctions was first introduced for pure states in 1932 by Wigner and is known as the Wigner distribution function. As can be seen from (3.3.12.), the Wigner distribution may be interpreted as a partial Fourier transform on the off-diagonal elements of the density matrix.
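The interpretation as a partial Fourier transform of the density matrix can be made explicit by carrying out the integration over the second transform variable. The following is a sketch under stated assumptions: the Fourier convention with prefactor (1/2pi)^2 and kernel exp[-i(xi p + eta q)], and the reduced characteristic function with arguments x -/+ hbar xi/2, are assumed here because the extracted equations are incomplete.

```latex
\begin{align*}
f(p,q,t)
 &= \frac{1}{(2\pi)^2}\int\!\!\int d\xi\,d\eta\; e^{-i(\xi p+\eta q)}
    \sum_n P_n \int dx\;
    \psi_n^{*}\!\left(x-\tfrac{\hbar\xi}{2},t\right)
    \psi_n\!\left(x+\tfrac{\hbar\xi}{2},t\right) e^{i\eta x} \\
 &= \frac{1}{2\pi}\int d\xi\; e^{-i\xi p}
    \sum_n P_n \int dx\;\delta(x-q)\;
    \psi_n^{*}\!\left(x-\tfrac{\hbar\xi}{2},t\right)
    \psi_n\!\left(x+\tfrac{\hbar\xi}{2},t\right) \\
 &= \frac{1}{2\pi}\int d\xi\; e^{-i\xi p}\,
    \Big\langle q+\tfrac{\hbar\xi}{2}\Big|\,\hat{\rho}(t)\,\Big|q-\tfrac{\hbar\xi}{2}\Big\rangle ,
\end{align*}
% using \int d\eta\, e^{i\eta(x-q)} = 2\pi\,\delta(x-q) and
% \langle x|\hat{\rho}|x'\rangle = \sum_n P_n \psi_n(x)\psi_n^{*}(x').
```

Only the elements of the density matrix connecting the points q + hbar xi/2 and q - hbar xi/2 enter, and the transform is taken over their separation alone.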

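The Wigner distribution has many useful properties, most of which are listed in Appendix I, but a few properties of direct interest may be extracted immediately from its definition (3.3.12.). The first is that it is a real function, since (3.3.12.) is invariant under taking its complex conjugate and changing variables $\xi \to -\xi$ (together with $\eta \to -\eta$). Secondly, integrating (3.3.12.) over p and q respectively shows that the projections of the Wigner function are just the usual position and momentum probability distributions, so that it is normalised to unity. That is to say

$$\int dp\,f(p,q,t) = \sum_n P_n|\psi_n(q,t)|^2, \qquad \int dq\,f(p,q,t) = \sum_n P_n|\phi_n(p,t)|^2, \qquad \int\!\!\int dp\,dq\,f(p,q,t) = 1$$

where $\phi_n(p,t)$ denotes the momentum-space wavefunction corresponding to $\psi_n(x,t)$.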

Therefore in the context of §3.2. we see that the Wigner distribution is both real and bounded and as a consequence of the Marcinkiewicz theorem we would generally expect it to assume negative values. This is indeed the case as an explicit example pertinent to an excited state of the harmonic oscillator shows in Appendix I.
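The negative values can be seen in a short numerical sketch for the first excited state of the harmonic oscillator. It assumes units with hbar = m = omega = 1 and the single-integral form W(q,p) = (1/2pi) * integral dxi exp(-i xi p) psi*(q - xi/2) psi(q + xi/2); the grid sizes and names are illustrative choices made here, and the example is not the calculation of Appendix I.

```python
import numpy as np

def psi1(x):
    """First excited harmonic-oscillator state for hbar = m = omega = 1."""
    return np.sqrt(2.0) * np.pi**-0.25 * x * np.exp(-x**2 / 2)

q = np.linspace(-5.0, 5.0, 81)
p = np.linspace(-5.0, 5.0, 81)
xi = np.linspace(-12.0, 12.0, 481)
dq, dp, dxi = q[1] - q[0], p[1] - p[0], xi[1] - xi[0]

# g[i, k] = psi*(q_i - xi_k/2) psi(q_i + xi_k/2); psi1 is real and g is even in xi.
g = psi1(q[:, None] - xi[None, :] / 2) * psi1(q[:, None] + xi[None, :] / 2)

# W(q, p) = (1/2pi) * integral dxi exp(-i xi p) g(q, xi); only cos(xi p) survives.
W = (g @ np.cos(np.outer(xi, p))) * dxi / (2 * np.pi)

norm = W.sum() * dq * dp                  # integral over phase space
W_origin = W[40, 40]                      # value at q = p = 0
position_marginal = W.sum(axis=1) * dp    # to be compared with |psi1(q)|^2
```

In these units the value at the origin is -1/pi, while the phase-space integral is unity and the projection onto q reproduces the position probability density, so the function is real and normalised but not non-negative.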

Of course this negative-going behaviour of the Wigner distribution invalidates any direct interpretation of it as a probability function in the conventional sense. However we have already commented that a distribution function by itself should be regarded as a secondary quantity when compared

In document High field quantum transport theory in semiconductors (Page 47-64)