1.2 Gauge Ensembles

1.2.1 Ensemble Generation [63]

An ensemble of gauge configurations is generally created as a sequence, called a Markov chain,

where each individual gauge configuration in the sequence is generated from the previous

member. The process associated with transforming one gauge configuration into another is

characterized by a transition probability whose elements P[U', U] determine the likelihood
of transitioning from the gauge configuration U to U'. By accounting for all the possible
ways of transitioning into a new configuration U', the transition probability can also be
interpreted as a transformation of the ensemble probability distribution W[U],

\[ W[U'] = \int [dU]\, P[U', U]\, W[U] \equiv P W, \tag{1.48} \]

where the latter form is expressed as matrix multiplication. For the generated sequence of

gauge configurations to correspond to a sample from the desired probability density Weq

specified in Eq. (1.37), the transition probability must satisfy a few conditions. To reach the desired distribution from any initial configuration, repeated application of the transition

probability must move the distribution towards Weq. This can be summarized as a limiting

procedure,

\[ \lim_{n \to \infty} P^{n} W = W_{\mathrm{eq}}. \tag{1.49} \]

Once the equilibrium distribution is reached, the sequence must continue to sample from the

same distribution. In other words, the equilibrium distribution must be an eigenvector of

the transition probability with eigenvalue 1 (due to normalization).

\[ P W_{\mathrm{eq}} = W_{\mathrm{eq}} \tag{1.50} \]

As long as the equilibrium distribution satisfying these two properties is unique for a given

transition probability, the Markov chain initialized from an arbitrary gauge configuration

will eventually converge to the required equilibrium distribution.
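As a hedged illustration of Eqs. (1.49) and (1.50), the sketch below replaces the space of gauge configurations with three discrete states, so that the transition probability becomes a 3×3 matrix whose columns sum to one. The matrix entries and the helper names `apply` and `iterate` are assumptions chosen for illustration; they are not taken from the text.

```python
# Toy stand-in for the space of gauge configurations: three discrete states.
# P[i][j] is the probability of moving from state j to state i, matching the
# P[U', U] ordering used in the text, so every column sums to 1.
P = [
    [0.5, 0.2, 0.3],
    [0.3, 0.6, 0.3],
    [0.2, 0.2, 0.4],
]


def apply(P, W):
    """One transition: W'[i] = sum_j P[i][j] * W[j]."""
    n = len(W)
    return [sum(P[i][j] * W[j] for j in range(n)) for i in range(n)]


def iterate(P, W, steps):
    """Apply the transition `steps` times in a row."""
    for _ in range(steps):
        W = apply(P, W)
    return W


# Two very different starting distributions.
from_first = iterate(P, [1.0, 0.0, 0.0], 200)
from_last = iterate(P, [0.0, 0.0, 1.0], 200)

print(from_first)
print(from_last)
```

Both starting points end at the same distribution, and one further application of `apply` leaves that distribution unchanged, which is the discrete analogue of the limiting condition and the eigenvector condition.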

While these properties are straightforward to understand, it is difficult to immediately check

if a transition probability satisfies them. Instead, a handful of simpler conditions are usually

imposed which are sufficient to guarantee the above conditions. Chief among these is strong

ergodicity

\[ P[U', U] > 0. \tag{1.51} \]

In practice, strong ergodicity may not be satisfied by a single update; however, P[U', U] can
be redefined to include any finite number of transitions. In this sense, strong ergodicity is
satisfied when any gauge configuration is reachable from any other after a finite number of
transitions. The second simpler condition that is often employed is detailed balance,

\[ P[U', U]\, W_{\mathrm{eq}}[U] = P[U, U']\, W_{\mathrm{eq}}[U']. \tag{1.52} \]

Detailed balance requires that during any transition at equilibrium, the probability that is
redistributed from configuration U to U' is balanced by the probability redistributed from
the reverse process, U' to U. When applied together, strong ergodicity and detailed balance
are sufficient to derive Eqs. (1.49) and (1.50). Detailed balance is sufficient here but not necessary; in practical applications, however, most algorithms do satisfy detailed balance.
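The statement that strong ergodicity can hold after a finite number of transitions, even when a single update fails it, can be sketched with a small discrete chain. The four-state matrix below is a hypothetical example in which one update only reaches a neighbouring state; `matmul` and `is_strictly_positive` are illustrative helpers.

```python
def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_strictly_positive(M):
    """True when every element is > 0, the discrete analogue of Eq. (1.51)."""
    return all(x > 0.0 for row in M for x in row)


# Four states on an open chain. A single update either stays put or hops to
# a neighbouring state, so distant states are connected by zero entries.
# P[i][j] is the probability of moving from state j to state i.
P = [
    [0.75, 0.25, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.25, 0.75],
]

# Compose updates until every state is reachable from every other state.
power = P
steps = 1
while not is_strictly_positive(power):
    power = matmul(power, P)
    steps += 1

print(steps)
```

For this chain the single-update matrix is not strictly positive, but the matrix for three consecutive updates is, so the redefined transition probability satisfies strong ergodicity.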

The equilibrium density is shown to be an eigenvector by directly applying detailed balance

to the left hand side of the eigenvalue equation, where the eigenvalue 1 results from the

normalization of the transition probability.

\[ P W_{\mathrm{eq}} \equiv \int [dU]\, P[U', U]\, W_{\mathrm{eq}}[U] = \int [dU]\, P[U, U']\, W_{\mathrm{eq}}[U'] = W_{\mathrm{eq}} \tag{1.53} \]
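The chain of equalities above can be written out one step at a time. The sketch below assumes detailed balance in the form P[U', U] Weq[U] = P[U, U'] Weq[U'] and the normalization of the transition probability, i.e. that the total probability of leaving U' for any configuration U is one.

```latex
\begin{aligned}
(P W_{\mathrm{eq}})[U']
  &= \int [dU]\, P[U', U]\, W_{\mathrm{eq}}[U] \\
  &= \int [dU]\, P[U, U']\, W_{\mathrm{eq}}[U']
     && \text{(detailed balance)} \\
  &= W_{\mathrm{eq}}[U'] \int [dU]\, P[U, U']
     && \text{($W_{\mathrm{eq}}[U']$ is independent of $U$)} \\
  &= W_{\mathrm{eq}}[U']
     && \text{(normalization, $\textstyle\int [dU]\, P[U, U'] = 1$)}
\end{aligned}
```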

Deriving the limiting condition in Eq. (1.49) is more involved, but is a direct consequence of the transition probability being strictly positive, i.e. strong ergodicity. The Perron-Frobenius theorem states that there must exist a unique maximum eigenvalue and corresponding eigenvector for positive matrices. By proving the eigenvalue 1 associated with the equilibrium density is the largest eigenvalue for the transition matrix, this theorem shows the equilibrium distribution is a unique eigenvector and all other eigenvalues must be smaller than one in magnitude. As a consequence, all contributions to an initial density that are orthogonal to the equilibrium density decay away with successive transitions.

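Showing λ = 1 is the maximum possible eigenvalue is a straightforward result after integrating the magnitude of each side of the eigenvalue equation P W = λW. Both integrations can be expressed in terms of the same factor C, the integral of |W[U]|:

\[
\begin{aligned}
|\lambda|\, C = |\lambda| \int [dU']\, \bigl| W[U'] \bigr|
  &= \int [dU']\, \bigl| \lambda W[U'] \bigr|
   = \int [dU']\, \left| \int [dU]\, P[U', U]\, W[U] \right| \\
  &\le \int [dU'] \int [dU]\, P[U', U]\, \bigl| W[U] \bigr|
   = \int [dU]\, \bigl| W[U] \bigr| \equiv C
\end{aligned}
\]

Cancelling this factor yields the desired result of |λ| ≤ 1.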
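The decay of the non-equilibrium contributions can be made concrete with a two-state chain, for which the eigenvalues of the transition matrix are 1 and 1 - a - b. The hopping probabilities `a` and `b` below are hypothetical values chosen for illustration; the sketch only checks that the distance to equilibrium shrinks by the second eigenvalue at every step.

```python
# Hypothetical hopping probabilities: a for state 0 -> 1, b for state 1 -> 0.
a, b = 0.3, 0.1

# Column-stochastic transition matrix, P[i][j] = probability of j -> i.
P = [
    [1 - a, b],
    [a, 1 - b],
]

# Known results for the two-state chain.
W_eq = [b / (a + b), a / (a + b)]  # eigenvector with eigenvalue 1
lam2 = 1 - a - b                   # the only other eigenvalue


def apply(P, W):
    """One transition: W'[i] = sum_j P[i][j] * W[j]."""
    n = len(W)
    return [sum(P[i][j] * W[j] for j in range(n)) for i in range(n)]


# Start far from equilibrium and record the deviation before each step.
W = [1.0, 0.0]
deviations = []
for _ in range(10):
    deviations.append(abs(W[0] - W_eq[0]))
    W = apply(P, W)

# Ratio of successive deviations; each should equal the second eigenvalue.
ratios = [deviations[k + 1] / deviations[k] for k in range(len(deviations) - 1)]
print(lam2, ratios)
```

Because the magnitude of the second eigenvalue is below one, the deviation from `W_eq` falls off geometrically with the number of transitions.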

Now that the simpler conditions of strong ergodicity and detailed balance are understood

to be sufficient, the main question still remains: how can a transition probability satisfying

these conditions be constructed? Many algorithms exist that meet these conditions. In this

section we’ll focus on Metropolis algorithms.

The Metropolis algorithm is best summarized as an accept-reject procedure. The first step

is to randomly modify the individual links of a gauge configuration U to produce a potential
next configuration U'. Then, the modified gauge configuration is either accepted or rejected
based on the relative weight of each configuration in the equilibrium probability density,
r = Weq[U'] / Weq[U]. If the new gauge configuration has a lower action (and therefore a higher
probability) then the gauge configuration is accepted. If the new gauge configuration has a
higher action, then it is randomly accepted with probability r.

\[ P[U', U] \propto \min(1, r) \tag{1.54} \]

Detailed balance can be verified by simple inspection of each case (r > 1 and r < 1). Strong

ergodicity depends primarily on how new gauge links are proposed. Often it is impractical

to update all the links simultaneously, due to the high likelihood that r ≪ 1. Instead, an

individual update might only change one or a smaller subset of links at a time. In such a

case ergodicity is restored by repeatedly applying the Metropolis step until every link within

the lattice has been considered for update.
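A single-link Metropolis sweep of this kind can be sketched on a toy model. The example below is not the algorithm used for the ensembles discussed in this work: it assumes a two-dimensional compact U(1) gauge theory with the plaquette action S = β Σ (1 - cos θ_P) and a weight proportional to exp(-S), and the lattice size, coupling, proposal width and function names are all illustrative choices.

```python
import math
import random


def plaquette_angle(theta, x, y, L):
    """Oriented sum of the four link angles around the plaquette at (x, y)."""
    xp, yp = (x + 1) % L, (y + 1) % L
    return theta[0][x][y] + theta[1][xp][y] - theta[0][x][yp] - theta[1][x][y]


def local_action(theta, mu, x, y, L, beta):
    """Action of the two plaquettes that contain the link (mu, x, y)."""
    if mu == 0:
        sites = [(x, y), (x, (y - 1) % L)]
    else:
        sites = [(x, y), ((x - 1) % L, y)]
    return beta * sum(1.0 - math.cos(plaquette_angle(theta, px, py, L))
                      for px, py in sites)


def metropolis_sweep(theta, L, beta, step, rng):
    """Propose one update per link; return the fraction of accepted proposals."""
    accepted = 0
    for mu in range(2):
        for x in range(L):
            for y in range(L):
                old = theta[mu][x][y]
                s_old = local_action(theta, mu, x, y, L, beta)
                # Symmetric proposal: shift the link angle by a uniform amount.
                theta[mu][x][y] = old + rng.uniform(-step, step)
                delta = local_action(theta, mu, x, y, L, beta) - s_old
                # Accept with probability min(1, r), where r = exp(-delta).
                if delta <= 0.0 or rng.random() < math.exp(-delta):
                    accepted += 1
                else:
                    theta[mu][x][y] = old  # reject: restore the old link
    return accepted / (2 * L * L)


def average_plaquette(theta, L):
    """Lattice average of cos(theta_P)."""
    total = sum(math.cos(plaquette_angle(theta, x, y, L))
                for x in range(L) for y in range(L))
    return total / (L * L)


rng = random.Random(1234)
L, beta, step = 8, 2.0, 1.0
# Cold start: every link angle is zero.
theta = [[[0.0] * L for _ in range(L)] for _ in range(2)]
rates = [metropolis_sweep(theta, L, beta, step, rng) for _ in range(200)]

print(sum(rates) / len(rates), average_plaquette(theta, L))
```

Only the two plaquettes touching the proposed link enter the action difference, which keeps r from becoming vanishingly small, and one call to `metropolis_sweep` considers every link of the lattice for update exactly once.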

One additional feature of the Metropolis algorithm that makes it more advantageous for

lattice field theory computations is its independence from any normalization constant. In

the case of lattice field theory, the overall normalization constant Z[0] is not calculable,

so it is necessary to have an algorithm that avoids it. Some alternatives to the Metropolis

algorithm, like heat bath, still require a partial normalization to be calculated.

In document Lattice Scales from Gradient Flow and Chiral Analysis on the MILC Collaboration's HISQ Ensembles (Page 42-46)