Chapter 2 Simulation methods—A review
2.4 Multicanonical simulation
By running multiple Markov chains in parallel with tempered distributions, a parallel tempering simulation is able to move across free energy barriers, and enhanced sampling can often be achieved at the lowest temperature. Until now we have been focusing on the temperature spacing problem, which is related to the actual implementation of the algorithm; another problem intrinsic to the method is the sampling of rare configurations, such as configurations that define the transition states in a potential energy surface. For example, problems can arise in systems involving phase transitions, where we typically observe a low exchange probability between low and high energy states, even with small temperature spacing. Ultimately, this is due to the Boltzmann weight in the canonical ensemble, which is the ensemble we have been using. The multicanonical simulation (MUCA) [6] takes a different perspective by sampling from a modified ensemble in which the energy is approximately uniformly distributed; that is, the multicanonical density is proportional to the inverse of the density of states:
\[ \pi_{\mathrm{mu}}(x) \propto \frac{1}{\Omega(U(x))}. \tag{2.8} \]
Because $\Omega(U)$ is not known a priori, the actual simulation only samples from an approximation $\hat{\pi}_{\mathrm{mu}}$ of (2.8). Had it been known, there would be no need to do simulations, because we could compute all thermodynamic quantities from the density of states. Thus, the idea of MUCA is to iteratively construct a sequence of approximations $\hat{\pi}^{n}_{\mathrm{mu}}$ ($n = 1, 2, \ldots$):
\[ \hat{\pi}^{n}_{\mathrm{mu}} \propto \bigl(\hat{\Omega}_n(U(x))\bigr)^{-1}, \qquad n = 1, 2, \ldots \]
such that $\hat{\pi}^{n}_{\mathrm{mu}} \approx \pi_{\mathrm{mu}}$ when $n$ is large.
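To make the target ensemble (2.8) concrete, the following minimal sketch samples a toy system whose density of states is known exactly, so no iteration is needed. The toy model is an assumption made purely for illustration and is not taken from the text: the configuration is a set of two-state spins, U(x) is the number of "up" spins, and therefore Ω(U) is a binomial coefficient. The function name `multicanonical_histogram` is hypothetical.

```python
import math
import random

def multicanonical_histogram(n_spins, n_steps, seed=0):
    """Metropolis sampling of n_spins two-state spins with weight 1/Omega(U).

    Toy assumption: U(x) is the number of 'up' spins, so the density of
    states is known exactly, Omega(U) = C(n_spins, U).
    """
    rng = random.Random(seed)
    log_omega = [math.log(math.comb(n_spins, u)) for u in range(n_spins + 1)]
    spins = [0] * n_spins
    energy = 0
    hist = [0] * (n_spins + 1)
    for _ in range(n_steps):
        i = rng.randrange(n_spins)
        new_energy = energy + (1 - 2 * spins[i])  # a flip changes U by +1 or -1
        # log of pi_mu(new) / pi_mu(old) = log Omega(old) - log Omega(new)
        log_ratio = log_omega[energy] - log_omega[new_energy]
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            spins[i] ^= 1
            energy = new_energy
        hist[energy] += 1
    return hist

hist = multicanonical_histogram(n_spins=20, n_steps=200_000)
print("max/min bin count:", max(hist) / min(hist))
```

Under uniform sampling of configurations the central energy level of this toy model would be visited far more often than the extremes, since Ω(10)/Ω(0) = 184756 for 20 spins; with the weight 1/Ω(U) every energy level is visited with comparable frequency.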
In practice, this is usually done by running many small-scale simulations, where each simulation yields an approximation $\hat{\pi}^{n}_{\mathrm{mu}}$, until one is satisfied with some $N$th simulation which produces an approximately flat energy histogram. By approximately flat we mean that the sampler is able to visit all energy regions relatively frequently; a difference of up to a factor of ten is often deemed acceptable [5]. One can then run a longer simulation with the weights $\hat{\pi}^{N}_{\mathrm{mu}}$ and obtain estimates with respect to the canonical ensemble through importance reweighting.

The success of a MUCA implementation therefore depends on the recursion rule used to update $\hat{\pi}_{\mathrm{mu}}$. Initially, $\hat{\pi}^{1}_{\mathrm{mu}}$ is set to 1, indicating that there is no prior information about the system and that every configuration is equally likely to be sampled. A simple update rule proceeds by giving to each $\hat{\Omega}_n(U_m)$, where $\{U_m\}_{m=1}^{M}$ is a discretization of the energy $U$, a weight proportional to the observed energy histogram count $H_m^n$ in bin $m$:
\[ \hat{\Omega}_{n+1}(U_m) \propto \hat{\Omega}_n(U_m)\, H_m^n. \tag{2.9} \]
The proportionality constant is irrelevant. We may write
\[ \hat{\Omega}_{n+1}(U_m) = \hat{\Omega}_n(U_m)\, \frac{H_m^n}{H_{\mathrm{exp}}}, \tag{2.10} \]
where $H_{\mathrm{exp}}$ is the expected count per bin, equal to the total number of samples divided by the number of bins. The logic behind (2.10) is clear: if $H_m^n / H_{\mathrm{exp}} > 1$ then bin $m$ is oversampled, so in order to drive the sampler towards a flat energy histogram we increase $\hat{\Omega}_n(U_m)$, so that bin $m$ is likely to be sampled less in the next round, and vice versa if $H_m^n / H_{\mathrm{exp}} < 1$.
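The update (2.10) can be sketched in a few lines. The version below works with logarithms of the density of states, which is a common way to avoid overflow but is an assumption of this sketch rather than something prescribed by the text. The function name `update_log_omega` is hypothetical, and the treatment of empty bins (leaving the estimate unchanged) is an ad hoc choice, since (2.10) itself gives no usable value when a bin count is zero.

```python
import math

def update_log_omega(log_omega, hist):
    """One step of the simple recursion (2.10), carried out in log space.

    log_omega[m] is the current estimate of log Omega_n(U_m) and hist[m] is
    the number of samples that fell in bin m during the nth simulation.
    Returns a new list holding log Omega_{n+1}(U_m).
    """
    h_exp = sum(hist) / len(hist)  # expected count per bin
    new_log_omega = []
    for lw, h in zip(log_omega, hist):
        if h == 0:
            # (2.10) would set Omega to zero here; keeping the old value is
            # an ad hoc choice for this sketch, not part of the rule.
            new_log_omega.append(lw)
        else:
            new_log_omega.append(lw + math.log(h / h_exp))
    return new_log_omega

# Start from a flat guess and apply one update with an illustrative histogram.
print(update_log_omega([0.0, 0.0, 0.0, 0.0], [10, 30, 40, 20]))
```

In this usage example the expected count is 25 per bin, so the estimates for the two oversampled bins (counts 30 and 40) increase and those for the undersampled bins (counts 10 and 20) decrease.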
There are several problems with this simple recursion. First, there is always statistical noise associated with $H_m^n$, and this noise is erroneously treated as a correction factor for the density of states. An extreme situation is when we feed into our simulation the exact density of states: in this case the new update is doomed to be worse because all that we have added is statistical noise. Another drawback is that each update is based only on the most recent $H_m^n$, and historical data from previous simulations are ignored. Also, if $H_m^n$ is zero then the multicanonical density is undefined.
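The first problem can be illustrated numerically. The sketch below assumes an idealized sampler that, given the exact density of states, draws energy bins uniformly and independently; a real Markov chain produces correlated samples, which would only make the noise larger. The correct correction in (2.10) is then exactly zero in log space for every bin, so any nonzero value returned is pure statistical noise. The function name `noise_from_exact_weights` is hypothetical.

```python
import math
import random

def noise_from_exact_weights(n_bins, n_samples, seed=0):
    """Return log(H_m / H_exp) per bin when the energy distribution is exactly flat.

    Idealized assumption: bins are drawn uniformly and independently, as if
    the exact density of states were used and samples were uncorrelated.
    """
    rng = random.Random(seed)
    hist = [0] * n_bins
    for _ in range(n_samples):
        hist[rng.randrange(n_bins)] += 1
    h_exp = n_samples / n_bins
    return [math.log(h / h_exp) for h in hist]

corrections = noise_from_exact_weights(n_bins=20, n_samples=1000)
print("largest spurious correction:", max(abs(c) for c in corrections))
```

With only about 50 samples per bin, as in this example, the spurious corrections are not negligible, which is why short iterations make the simple recursion unstable.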
A modified recursion which takes these problems into account was proposed by Berg [4]. In its original formulation, the approximation to the target density $\pi_{\mathrm{mu}}$ was written under a new parameterization:
\[ \hat{\pi}_{\mathrm{mu}} \propto e^{-S(U)} = e^{-b(U)U + a(U)}, \tag{2.11} \]
where $S(U)$ is the microcanonical entropy, $b(U)$ is the inverse microcanonical temperature, given by the derivative of $S$ with respect to $U$, and $a(U)$ is the fugacity. The weight to be used at the $(n+1)$th simulation follows once $b_{n+1}(U)$ and $a_{n+1}(U)$ have been determined from the $n$th simulation:
\[ \hat{\pi}^{n+1}_{\mathrm{mu}} \propto e^{-b_{n+1}(U)U + a_{n+1}(U)}. \tag{2.12} \]
Although introducing additional parameters may seem to complicate matters, note that only one of them, say $b(U)$, is a "real" parameter, because $a(U)$ follows from (2.11) and the fact that $b = \partial S / \partial U$.
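The dependence of a(U) on b(U) can be made explicit with a short calculation. This is a sketch that assumes a(U) and b(U) are differentiable functions of a continuous energy; in a simulation the energy is discretized and the relation would be applied in finite-difference form. The reference energy $U_0$ is introduced here only for illustration.

```latex
% From (2.11): S(U) = b(U) U - a(U). Differentiate with respect to U:
\frac{\partial S}{\partial U} = b(U) + U\, b'(U) - a'(U).
% Imposing b = \partial S / \partial U leaves
a'(U) = U\, b'(U),
% so a(U) is fixed by b(U) up to an additive constant:
a(U) = a(U_0) + \int_{U_0}^{U} U'\, b'(U')\, \mathrm{d}U'.
```

The constant $a(U_0)$ only rescales the unnormalized weight and therefore does not affect the sampling.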
Therefore, the modified recursion only involves the determination of $b_{n+1}(U)$, and this is done in a way that historical knowledge about the parameter is properly incorporated. Essentially, the update $b_{n+1}$ not only uses data from the $n$th simulation, but also incorporates $b_n$, which encapsulates the history of the previous $n-1$ simulations. A weight proportional to the inverse of the variance of $b_n$ is used as a guide to combine the most recent and historical simulations.
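The principle of combining estimates with inverse-variance weights can be sketched generically. The code below is not Berg's recursion, whose variance estimates are derived from the histogram counts and are not reproduced in this section; it only shows how two estimates of the same quantity, each with an assumed known variance, are merged so that the more precise one dominates. The function name `combine_inverse_variance` and the numerical values are hypothetical.

```python
def combine_inverse_variance(b_old, var_old, b_new, var_new):
    """Combine two estimates of the same quantity with inverse-variance weights.

    Returns the combined estimate and its variance, assuming the two
    estimates are independent.
    """
    w_old = 1.0 / var_old
    w_new = 1.0 / var_new
    b = (w_old * b_old + w_new * b_new) / (w_old + w_new)
    var = 1.0 / (w_old + w_new)
    return b, var

# Accumulated estimate (variance 0.01) merged with a noisier new one (variance 0.04).
b, var = combine_inverse_variance(0.50, 0.01, 0.80, 0.04)
print(b, var)
```

In this example the combined value is about 0.56, much closer to the accumulated estimate than to the new one, and the combined variance (about 0.008) is smaller than either input variance. Applied repeatedly, this kind of weighting lets the history accumulate instead of being overwritten by each new short simulation.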
This recursion scheme addresses most problems typical of the simple recursion (2.9), but does not eliminate them. For example, statistical noise is still present, since the estimate of the variance is based on finite and often very short simulations. In his paper, Berg used around 9000 simulations with 1000 MC steps each to study a 10-state Potts model and claimed that frequent iterations increase the stability of the result. However, because short simulations generally yield larger statistical uncertainties than longer simulations, it is unclear whether, given the same amount of CPU time, we should use more iterations with fewer steps per simulation or fewer iterations with more steps per simulation. Fortunately, with the estimates of the density of states derived from our new method, this multicanonical recursion is often not necessary.
Apart from the actual recursion step, another notable problem with MUCA is that it often requires human input to guide the simulation. This is related to the fact that the weights stay fixed during an iteration; it is only after an iteration finishes that the weights are updated. Since a MUCA simulation starts by assigning an equal weight to each configuration, it may not be able to visit the low-energy regions of the system within an affordable time, and hence proper guesses of $\hat{\Omega}_n(U)$ near the ground state are often needed in the course of the simulation.
As mentioned at the beginning of this section, methods based on tempered distributions, such as ST and PT, are often effective in moving across energy barriers and thus exploring configuration space more rapidly. This is one of the motivations for combining the strengths of PT and MUCA, and how this can be done efficiently will be discussed in Chapter 5.