2.3 Quantum Monte Carlo Methods
2.3.1 Statistical Methods
This chapter covers the elements of probability theory that are important for quantum Monte Carlo. A more extended discussion of this matter is to be found in the QMC books of Hammond and Kalos.1,2
Terms
Probability theory takes hold where an experiment under a well defined complex of conditions does not end with a secure result, but where one of multiple random results occurs, i.e a certain event happens which is equivalent to a certain state being occupied by the system. All possible states are called event space. If the states cannot be further decomposed they are elementary (otherwise: composite) and the set of the occurring elementary states is declared as the state space. A random variable is the vector or the scalar, respectively, on which every elementary state being occupied under the given conditions, can be projected. Random variables can be discrete or continuous, of which the latter are important in the following.
The Probability Density
Consider a continuous random variable X that can take one of the infinitely many values x from the interval WX, for example ] − ∞, ∞[ as outcome from the random event. X is
further defined by the non-negative discrete probability pi or the probability density function
pX(x), respectively, both being ∈ [0, 1]. The probability to realize a random variable X in
the interval I ∈ WX is then defined as
P {X ∈ I} =
P
ipixi for discrete variables
R
IpX(x)dx for continuous variables
(2.68)
From this definition
Z
WX
pX(x)dx = 1 (2.69)
follows. This means that the probability of X having any outcome in the complete interval WX is normalized to one. Furthermore, the probability that the X has an exact, predefined
value x0 ∈ WX goes to zero. Analogous definitions hold for multi-dimensional random
variables X, with x being vectors. The interval I will then be a multi-dimensional region Γ and dx a general volume element. Results from probability theory of one-dimensional continuous problems can easily be extended to higher dimensions as needed for the quantum Monte Carlo method.
The Distribution Function
The normalized distribution function F (z) of a random variable X is the probability of X having a value x in Γ ∈ WX
F (z) = Z
Γ
pX(x)dx. (2.70)
It completely characterizes the random quantity. If the exact distribution function is un- known, its characteristic parameters represent the major properties of the distribution. These are the expectation value, the variance, and higher moments of the variable.
The expectation or stochastic mean value hXi of a continuous random variable X is defined as:
hXi = E(X) = Z
WX
x pX(x)dx. (2.71)
In analogy to that the expectation values of the function f (X) of a random variable is determined:
hf (X)i = E(f (X)) = Z
WX
f (x) pX(x)dx. (2.72)
In particular if f (X) = Xk, one speaks of the kth moment of X.
The moments of the random variate X − hXi are the central moments of X. The second central moment plays an especially important role as being the variance of X:
σ2 = var(X) = hX2i − hXi2 = h(X − hXi)2i = Z
WX
(X − hXi)2pX(x)dx. (2.73)
σ is the standard deviation which can be interpreted as a measure for the dispersion of the random variables:
σ =pvar(X). (2.74)
Pairs of Random Variates
An elementary state is often not only described by one but two (or more) random variates X and Y (e.g. three spatial coordinates for the description of the position of a point in three dimensional space). If these variables are independent from each other then the joint probability density p(x, y) is the product of the single probability densities. The joint expec- tation value hXY i is the product of the separate expectation values. pX(x) is the marginal
probability density function of one random state (X) independent of the value of the other: pX(x) =
Z
WY
The probabilities for X not depending on any other random variable Y is then: P {X ∈ Γ} = Z Γ pX(x)dx = Z Γ Z WY p(x, y)dy. (2.76)
Per contra, when the variables depend on each other the transition probability can be defined:
P {X ∈ Γ|Y = y} = Z
Γ
p(x|y)dx. (2.77) p(x|y) is the conditional probability density function of the dependent variables normalized to one, referred to as transition density:
p(x|y) = R p(x, y)
WXp(x, y)dx
= p(x, y) pY(y)
(2.78) It describes the probability for X = x if Y = y. With the equations (2.75) and (2.78) at hand one arrives at the marginal probability density of X for dependent variables:
pX(x) =
Z
Γ
p(x|y)pY(y)dy. (2.79)
Samples of Random Variables
To predict a random event for a given pX(x) in Γ many random variables are needed. With
these it is possible to establish statements on the distribution function without knowledge of probability density function. This is important in QMC where the latter are usually unknown. In a first step the sample of size n is generated in an experimental or simulated fashion by realizing the random variable X n times. X1, X2, . . . , Xn are drawn from the probability
density function pX(x). For a given function f (X) Sn is defined as the average over the
f (Xi): Sn= 1 n n X i=1 f (Xi) (2.80)
The expectation value of the function given as its first central moment (eq. (2.72)) is: hSni = * 1 n n X i=1 f (Xi) + = 1 n n X i=1 hf (Xi)i = hf (X)i. (2.81)
hSni is said to be an estimator of hf (X)i. The quality of the estimate rises for n → ∞ due
to the central limit theorem which means that f(X) converges. When hSni equals hf (X)i for
finite n then Sn is referred to as unbiased estimator. If this condition is only fulfilled for an
infinitely large sample Sn would be called asymptotically unbiased.
As in equation (2.73) the variance of the mean Sn becomes
var(Sn) = var( 1 n n X i=1 f (Xi)) = n X i=1 1 n2var(f (Xi)) = 1 nvar(f (X)) (2.82)
The standard deviation is again the square root of the variance (compare eq. (2.74)): σ(Sn) = p var(Sn) = pvar(f(X)) √ n (2.83)
To obtain the variance from the observed independent values of f (Xi), i.e. as a sample
estimate instead of using Sn, is done as follows:2
Vn= 1 n − 1 " 1 n n X i=1 f (Xi)2− ( 1 n n X i=1 f (Xi))2 # . (2.84)
Generation of Random Variables
Random variates suitable for the Monte Carlo techniques are obtained in two steps: Getting uniformly distributed random numbers from a pseudo random number generator (RNG) and transforming them into random variables with a certain probability density. A typical RNG for this purpose is for example the linear congruence generator, explained in more detail in literature.1 Methods for the transformation are the inversion and the rejection method. The Box-Muller method described below is an application of the former, whereas the Metropolis algorithm goes back to the latter.
Gaussian Distributed Random Variables
The probability density pX(x) for a variable X in a d-dimensional Gaussian distribution is
given by: pX(x) = 1 σ√2πe −(x−µ)2 2σ2 . (2.85)
µ is the expectation value and center of the function, σ2 determines its variance as widths of the distribution.
A way to obtain Gaussian distributed random variables is the Box-Muller algorithm. Here the generation of always two Gaussian distributed random variables with a mean of zero and a variance of one is performed using a two-dimensional Gaussian distribution. Given be the joint distribution p(y1, y2) of the desired Gaussian distributed variables y1 and y2:
p(y1, y2) = pY(y1)pY(y2) = 1 2πe −1 2(y 2 1+y22). (2.86)
y1 and y2 are transformed into the polar coordinates r and ϕ
p0(r, ϕ) = pR(r)pΦ(ϕ) = re−
1 2r
2 1
Both polar coordinates r =√−2 ln ξ1 and ϕ = 2πξ2 can be gained from uniformly distributed
random numbers ξ1 and ξ2. Back transformation from equation (2.87) to the inverse of
equation (2.86) renders Gaussian distributed variables y1 and y2:
y1 = p−2 ln ξ1cos 2πξ2 (2.88)
y2 = p−2 ln ξ1sin 2πξ2. (2.89)
Gaussian distributed random variables are needed for the diffusion step in QMC.
Random Walk Processes
Quantum Monte Carlo simulations need pseudo random numbers to create a meaningful multi-dimensional density function which is usually Ψ2. Electrons are propagated in a Metropolis random walk process so that large samples of random electron configurations are obtained.67
The random walk describes the random movement of a mathematical entity (a walker) in space. The presently discussed sequence of random walk steps is the so-called Markow chain. The new position of the walker depends on its former position, but not on the time and history of the process. The walk itself is the summation of random variables (equally for random numbers and vectors):
Zn= X1+ X2+ . . . + Xn. (2.90)
The recursive definition of the formula reads
Zn= Zn−1+ Xn (2.91)
with Z0 = 0, Z1= X1, and n = 0, 1, 2, . . ..
For the summation of two independent random variables X + Y = Z their joint proba- bility is given as
pZ(z) = pZ(x, y) = pX(x)pY(y). (2.92)
Their cumulative distribution function reads F (z) =
Z Z
pX(x)pY(y)dxdy = F (z − x)pX(x)dx. (2.93)
This reformulation is done using y = z − x and F (y) =R pY(y)dy. The probability density
function then reads
pZ(z) =
Z
pY(z − x)pX(x)dx =
Z
In analogy to this the marginal probability density φn(z) for the addition of many
random variables Zn in a Markow process is obtained using xn= z − z0 and equation (2.79)
φn(z) =
Z
p(z − z0, z0)dz0 = Z
p(z − z0|z0)φn−1(z0)dz0. (2.95)
z0 does represent the old and z the new position, n = 0, 1, 2, . . . is in analogy to above. Conclusively the process is uniquely described by the density of the preceding walker φn−1(z0),
i.e. actually by the density of the first walker φ0(z0) and the transition density p(z − z0|z0)
which is needed to be known. If the Markow chain evolves in space towards an equilibrium density p∞(z) then the mean value Snof the function f (Z) and hSni is an estimate for hf (Z)i
(see eq. (2.80) and (2.81)).
When Z depends on former Z0 data following each other are serially correlated and the standard deviation may not be calculated directly from the data according to equation (2.83). To estimate the standard deviation b data points are usually grouped in a block so that
Sn= 1 n n X i=1 (1 b b(i+1) X a=(i×b)+1 f (Za)) = 1 n n X i=1 hf (Zi)iblock. (2.96)
The hf (Zi)iblock are the statistically independent mean values of the blocks that can be used
to estimate the variance and standard deviation σblock2 ≈ 1 n n X i=1 (hf (Zi)iblock− hf (Z)i)2. (2.97)
Generalized Metropolis Algorithm
The random walk process for the Monte Carlo integration aims in rendering samples of a predefined density φ(z) starting from the arbitrary initial density φ0(z). The transition den-
sity p(z − z0|z0) of equation (2.95) is now unknown. That is a case to which the generalized Metropolis method applies to.68The Metropolis random walk is performed based on the sym- metric d-dimensional Gaussian distribution as transition density T (z|z0) with an acceptance step being introduced: the random walk step from position z0 to position z will only be ac- cepted with certain probability A(z|z0) ≤ 1, and only if accepted the step will be performed. With the probability 1 − A(z|z0) the walker remains at its former position z0. So the density φn(z) at a position z and with n = 1, 2, . . . is given as:
φn(z) = Z [A(z|z0)T (z|z0) + ¯a(z0)δ(z − z0)] | {z } p(z|z0) φn−1(z0)dz0. (2.98)
¯
a(z0) is the probability of rejecting all possible steps away from z0. The converged density is given by
φ(z) = f (z)
R f (z)dz. (2.99)
Furthermore the acceptance probability is chosen to fulfill the detailed balance condition A(z0|z)T (z0|z)pZ(z) = A(z|z0)T (z|z0)pZ(z0). (2.100) This is because of min 1,T (z|z 0)f (z0) T (z0|z)f (z) T (z0|z) f (z) R f (z)dz = min 1, T (z 0|z)f (z) T (z|z0)f (z0) T (z|z0) f (z 0) R f (z)dz. (2.101) Equation (2.100) can be understood as an equation for a dynamic equilibrium, which has adjusted itself when the density limit is approached so that the propagation for z → z0 is equal to that for z0 → z. That would be after a certain number of random walk steps when equation (2.101) is fulfilled and the iteration of the density (2.98) is converged to
φ(z) = φn(z) = φn−1(z). (2.102)
The mean step length can be determined by the choice of T . The smaller the step length the more steps will be accepted and vice versa. By rule of thumb the acceptance ratio should be 50% for a well-behaved convergence. Only when having approached equilibrium the density may be introduced into further computations. As will be seen in the following subsections T for QMC is represented by the drift diffusion terms.