Two-Electron Integral Evaluation - Electronic Structure Methods

2.3 Electronic Structure Methods

2.3.3 Two-Electron Integral Evaluation

Evaluation and processing of the two-electron repulsion integrals (ERIs) dominate the HF method. A simple ERI over four primitive s functions⁶, i.e. $(ss|ss)$, is,

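$$I_{ERI} = \int\!\!\int e^{-\alpha|\mathbf{r}_1-\mathbf{A}|^2}\, e^{-\beta|\mathbf{r}_1-\mathbf{B}|^2}\, \frac{1}{|\mathbf{r}_1-\mathbf{r}_2|}\, e^{-\gamma|\mathbf{r}_2-\mathbf{C}|^2}\, e^{-\delta|\mathbf{r}_2-\mathbf{D}|^2}\, d\mathbf{r}_1\, d\mathbf{r}_2 \tag{2.16}$$

where $\mathbf{A}, \mathbf{B}, \mathbf{C}, \mathbf{D}$ are the centers of the four functions and $\alpha, \beta, \gamma, \delta$ are their exponents. Using the Gaussian product rule [33, 34, 271], the four-center problem can be combined into a two-center problem,

$$I_{ERI} = G_{AB}\, G_{CD} \int\!\!\int e^{-\zeta|\mathbf{r}_1-\mathbf{P}|^2}\, \frac{1}{|\mathbf{r}_1-\mathbf{r}_2|}\, e^{-\eta|\mathbf{r}_2-\mathbf{Q}|^2}\, d\mathbf{r}_1\, d\mathbf{r}_2 \tag{2.17}$$

where

$$\begin{aligned}
G_{AB} &= \exp\!\left[-\frac{\alpha\beta}{\alpha+\beta}|\mathbf{A}-\mathbf{B}|^2\right], & G_{CD} &= \exp\!\left[-\frac{\gamma\delta}{\gamma+\delta}|\mathbf{C}-\mathbf{D}|^2\right], \\
\mathbf{P} &= \frac{\alpha\mathbf{A}+\beta\mathbf{B}}{\alpha+\beta}, & \mathbf{Q} &= \frac{\gamma\mathbf{C}+\delta\mathbf{D}}{\gamma+\delta}, \\
\zeta &= \alpha+\beta, & \eta &= \gamma+\delta
\end{aligned} \tag{2.18}$$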
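The product rule in Equation 2.18 can be checked numerically at a single point. The following is a minimal sketch, not code from this work; the function names `gaussian_product` and `s_gaussian` and the numerical values are illustrative assumptions.

```python
import math

def gaussian_product(alpha, A, beta, B):
    """Return (zeta, P, G_AB) of Eq. 2.18 for two primitive s Gaussians."""
    zeta = alpha + beta
    P = tuple((alpha * a + beta * b) / zeta for a, b in zip(A, B))
    AB2 = sum((a - b) ** 2 for a, b in zip(A, B))
    G_AB = math.exp(-alpha * beta / zeta * AB2)
    return zeta, P, G_AB

def s_gaussian(exponent, center, r):
    """Unnormalized primitive s Gaussian exp(-exponent * |r - center|^2)."""
    d2 = sum((x - c) ** 2 for x, c in zip(r, center))
    return math.exp(-exponent * d2)

# Illustrative values (hypothetical, chosen only for the check).
alpha, A = 0.8, (0.0, 0.0, 0.0)
beta, B = 1.3, (0.5, -0.2, 1.0)
r = (0.3, 0.1, 0.4)

zeta, P, G_AB = gaussian_product(alpha, A, beta, B)
lhs = s_gaussian(alpha, A, r) * s_gaussian(beta, B, r)  # two-center product
rhs = G_AB * s_gaussian(zeta, P, r)                     # single Gaussian at P
print(lhs, rhs)  # the two values should agree to rounding error
```

The same function applied to the pair (γ, C), (δ, D) gives η, Q and G_CD, which is how the four-center integrand collapses to the two-center form of Equation 2.17.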

The final ERI expression for an s function is,

$$I_{ERI} = G_{AB}\, G_{CD}\, \frac{2\pi^{5/2}}{\zeta\eta\,(\zeta+\eta)^{1/2}}\, F_m(T) \tag{2.19}$$

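where $F_m(T)$ is the incomplete Gamma function,

$$F_m(T) = \int_0^1 u^{2m}\, e^{-Tu^2}\, du \tag{2.20}$$

with $m = 0$ for the $(ss|ss)$ case and $T = \frac{\zeta\eta}{\zeta+\eta}|\mathbf{P}-\mathbf{Q}|^2$.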
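Equations 2.18 to 2.20 are enough to evaluate a primitive $(ss|ss)$ integral. The sketch below is an illustration under stated assumptions rather than the implementation used in this work: `boys` evaluates $F_m(T)$ with a simple convergent series (production codes typically use interpolation tables and recursion instead), and the names `boys`, `dist2` and `eri_ssss` are hypothetical.

```python
import math

def boys(m, T, tol=1e-16):
    """F_m(T) of Eq. 2.20 via the series
    F_m(T) = exp(-T) * sum_k (2T)^k / ((2m+1)(2m+3)...(2m+2k+1)), T >= 0."""
    term = 1.0 / (2 * m + 1)
    total = term
    k = 0
    while term > tol * total:
        k += 1
        term *= 2.0 * T / (2 * m + 2 * k + 1)
        total += term
    return math.exp(-T) * total

def dist2(X, Y):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(X, Y))

def eri_ssss(alpha, A, beta, B, gamma, C, delta, D):
    """Primitive, unnormalized (ss|ss) integral from Eqs. 2.18 and 2.19."""
    zeta, eta = alpha + beta, gamma + delta
    P = [(alpha * a + beta * b) / zeta for a, b in zip(A, B)]
    Q = [(gamma * c + delta * d) / eta for c, d in zip(C, D)]
    G_AB = math.exp(-alpha * beta / zeta * dist2(A, B))
    G_CD = math.exp(-gamma * delta / eta * dist2(C, D))
    T = zeta * eta / (zeta + eta) * dist2(P, Q)
    prefactor = 2.0 * math.pi ** 2.5 / (zeta * eta * math.sqrt(zeta + eta))
    return G_AB * G_CD * prefactor * boys(0, T)
```

For $m = 0$ the series can be compared against the closed form $F_0(T) = \tfrac{1}{2}\sqrt{\pi/T}\,\mathrm{erf}(\sqrt{T})$, which gives a convenient sanity check on any Boys-function routine.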

Boys [33] showed how higher angular momentum functions can be obtained from lower ones by partial differentiation with respect to a given coordinate. Functions with higher angular momentum values are computed using a set of recursive relations which successively build up angular momentum from primitive s functions. There are several schemes for evaluating ERIs [73, 109, 112, 161, 197, 198, 229, 272]. In this thesis the McMurchie-Davidson scheme is of interest and is outlined in the following sub-section.

⁶ An s function arises when a given orbital has $l = 0$.

Figure 2.7: The McMurchie-Davidson ERI scheme. From [229].

2.3.3.1 The McMurchie-Davidson scheme for ERIs

The McMurchie and Davidson (MD) scheme computes ERIs based on Hermite Gaussians [182, 308]. Hermite Gaussians offer a compact means of representing one- and two-electron integrals. Using Boys’ observation [34] on partial differentiation with respect to a center, MD derived recurrence relations using Hermite Gaussians to allow calculation of ERIs for CGTOs. Referring back to Equation 2.10, pairs of basis functions, $(\phi_\mu \phi_\lambda|$ and $|\phi_\nu \phi_\sigma)$, are grouped together when computing the Fock matrix. This grouping is referred to as a shell-pair.

The MD ERI evaluation procedure is shown in Figure 2.7. The process begins with the evaluation of a quantity labelled $[\mathbf{0}]^{(m)}$, which is related to $F_m(T)$. The $[\mathbf{0}]^{(m)}$ quantities are formed using $F_m(T)$ (i.e. Step (a) in Figure 2.7),

$$[\mathbf{0}]^{(m)} = D_A D_B D_C D_D\, G_{AB}\, G_{CD}\, \sqrt{\frac{2\pi^5}{(\zeta\eta)^3}}\; (2\vartheta^2)^{m+\frac{1}{2}}\, F_m(T) \tag{2.21}$$

where $D_A, D_B, D_C, D_D$ are the contraction coefficients of the Gaussians; $m$ is determined by the sum of the angular momenta of the constituent basis functions; $\vartheta = \sqrt{\zeta\eta/(\zeta+\eta)}$; and $0 \le m \le L$.

The first step in the MD process constructs the $[\mathbf{r}]^{(m)}$ vector from the $[\mathbf{0}]^{(m)}$ scalar quantity.
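As an illustration of Step (a), the $[\mathbf{0}]^{(m)}$ values of Equation 2.21 can be tabulated for $0 \le m \le L$ once the Boys function values are known. The sketch below assumes the $F_m(T)$ values are supplied by the caller; the function name `zero_m_integrals` and its argument names are hypothetical.

```python
import math

def zero_m_integrals(L, D_prod, G_AB, G_CD, zeta, eta, boys_values):
    """Tabulate [0]^(m), m = 0..L, from Eq. 2.21.

    D_prod      : product D_A * D_B * D_C * D_D of contraction coefficients
    boys_values : sequence with boys_values[m] = F_m(T) for m = 0..L
    """
    theta2 = zeta * eta / (zeta + eta)  # vartheta squared
    prefactor = D_prod * G_AB * G_CD * math.sqrt(2.0 * math.pi ** 5 / (zeta * eta) ** 3)
    return [prefactor * (2.0 * theta2) ** (m + 0.5) * boys_values[m]
            for m in range(L + 1)]
```

With unit contraction coefficients and $m = 0$, the prefactor reduces to $2\pi^{5/2} / (\zeta\eta\sqrt{\zeta+\eta})$, so $[\mathbf{0}]^{(0)}$ reproduces the $(ss|ss)$ expression of Equation 2.19.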

MD showed that $[\mathbf{r}] \equiv [\mathbf{r}]^{(0)}$ can be generated using a two-term recurrence relation (RR) (Step (b)),

$$[\mathbf{r}]^{(m)} = R_i\, [\mathbf{r}-\mathbf{1}_i]^{(m+1)} - (r_i - 1)\, [\mathbf{r}-\mathbf{2}_i]^{(m+1)} \tag{2.22}$$

where $i$ is the Cartesian axis direction, and $\mathbf{1}_i$ and $\mathbf{2}_i$ represent vectors with a value of 1 or 2 in direction $i$. Equation 2.22 is a recursive relation which shows that any given $[\mathbf{r}]$ integral can be assembled from an elementary set of $[\mathbf{0}]^{(m)}$ integrals. Any given $[\mathbf{r}]^{(m)}$ can be generated from lower angular momentum terms in up to three different ways, each of which has a different number of terms and, as a result, a different cost in accessing data operands from memory. Finding the optimal path from the elementary $[\mathbf{0}]^{(m)}$ to $[\mathbf{r}]^{(m)}$ involves a tree search procedure, which for $L \ge 5$ is non-trivial and for $L \ge 8$ is unsolved [135].
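The recurrence in Equation 2.22 can be written as a short memoized recursion. This is a sketch for illustration only: it always reduces along the first axis with a non-zero index, which is one arbitrary path through the tree and not an optimized one, and the name `build_r_integrals` is hypothetical. The $[\mathbf{0}]^{(m)}$ values and the vector $\mathbf{R}$ are taken as inputs.

```python
from functools import lru_cache

def build_r_integrals(zero_m, R):
    """Return a function r_int(r, m) evaluating [r]^(m) via Eq. 2.22.

    zero_m : sequence with zero_m[m] = [0]^(m); needs indices up to m + sum(r)
    R      : the three Cartesian components R_i
    """
    R = tuple(R)

    @lru_cache(maxsize=None)
    def r_int(r, m=0):
        if min(r) < 0:
            return 0.0            # terms with a negative index vanish
        if r == (0, 0, 0):
            return zero_m[m]      # elementary [0]^(m) integral
        # Pick the first axis that still carries angular momentum.
        i = next(k for k in range(3) if r[k] > 0)
        lower1 = tuple(r[k] - (k == i) for k in range(3))      # r - 1_i
        lower2 = tuple(r[k] - 2 * (k == i) for k in range(3))  # r - 2_i
        return R[i] * r_int(lower1, m + 1) - (r[i] - 1) * r_int(lower2, m + 1)

    return r_int

# Illustrative (hypothetical) inputs.
r_int = build_r_integrals([1.0, 2.0, 3.0, 4.0], (0.5, -1.0, 2.0))
print(r_int((2, 0, 0)), r_int((1, 1, 0)))
```

Memoization via `lru_cache` avoids recomputing shared intermediates, but the order in which axes are reduced still determines how many intermediates are touched, which is the cost that the tree search described above tries to minimize.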

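Transferring angular momentum from $[\mathbf{r}]$ to $[\mathbf{p}|\mathbf{q}]$ proceeds by using the following relation to shift angular momentum from one $[\mathbf{r}]$ to two centers $\mathbf{p}$ and $\mathbf{q}$ ($[\mathbf{p}|\mathbf{q}] \equiv [\mathbf{00p}|\mathbf{q00}]$),

$$[\mathbf{p}|\mathbf{q}] = (-1)^{q}\, [\mathbf{p}+\mathbf{q}] \tag{2.23}$$

where $\mathbf{p}$ and $\mathbf{q}$ are products of functions $a, b$ and $c, d$ respectively. Here $\mathbf{p}$ and $\mathbf{q}$ are Hermite functions [92].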

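Step (d) proceeds by converting a Hermite Gaussian into two Cartesian Gaussians using the following RR, which shifts angular momentum from the Hermite function $\mathbf{q}$ onto a Cartesian Gaussian $\mathbf{c}$,

$$|\mathbf{c}\,\mathbf{d}\,\mathbf{q}] = q_i\, |(\mathbf{c}-\mathbf{1}_i)\,\mathbf{d}\,(\mathbf{q}-\mathbf{1}_i)] + (Q_i - C_i)\, |(\mathbf{c}-\mathbf{1}_i)\,\mathbf{d}\,\mathbf{q}] + (2\eta)^{-1}\, |(\mathbf{c}-\mathbf{1}_i)\,\mathbf{d}\,(\mathbf{q}+\mathbf{1}_i)] \tag{2.24}$$

where the angular momentum is shifted from the Hermite function $\mathbf{q}$ onto $\mathbf{c}, \mathbf{d}$.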
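The structure of Equation 2.24 can be checked in one dimension at the level of the underlying functions. The sketch below assumes that a Hermite Gaussian of order $q$ is the $q$-th derivative of $e^{-\eta(x-Q)^2}$ with respect to the center $Q$, and that $\mathbf{d}$ is an s function so it drops out; under those assumptions $(x-C)^c \Lambda_q(x)$ should satisfy the same three-term relation pointwise. All function names are hypothetical.

```python
import math

def hermite_poly(n, y):
    """Physicists' Hermite polynomial H_n(y) via H_{k+1} = 2y H_k - 2k H_{k-1}."""
    if n < 0:
        return 0.0
    h_prev, h = 0.0, 1.0
    for k in range(n):
        h_prev, h = h, 2.0 * y * h - 2.0 * k * h_prev
    return h

def hermite_gaussian(q, eta, Q, x):
    """Lambda_q(x) = (d/dQ)^q exp(-eta (x - Q)^2); zero for q < 0."""
    if q < 0:
        return 0.0
    y = math.sqrt(eta) * (x - Q)
    return eta ** (q / 2) * hermite_poly(q, y) * math.exp(-y * y)

def cart_hermite(c, q, eta, Q, C, x):
    """1D analogue of |c d q] with d an s function: (x - C)^c * Lambda_q(x)."""
    return (x - C) ** c * hermite_gaussian(q, eta, Q, x)

def shifted(c, q, eta, Q, C, x):
    """Right-hand side of Eq. 2.24 in 1D, valid for c >= 1."""
    return (q * cart_hermite(c - 1, q - 1, eta, Q, C, x)
            + (Q - C) * cart_hermite(c - 1, q, eta, Q, C, x)
            + cart_hermite(c - 1, q + 1, eta, Q, C, x) / (2.0 * eta))

# Illustrative (hypothetical) parameters.
eta, Q, C, x = 0.9, 0.3, -0.4, 1.1
print(cart_hermite(2, 1, eta, Q, C, x), shifted(2, 1, eta, Q, C, x))
```

Because the identity holds for the functions themselves, it carries over to integrals over them, which is why each application lowers the Cartesian index $\mathbf{c}$ by one at the cost of touching three Hermite orders.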

Step (e) involves carrying out a contraction step amongst the $|\mathbf{cd}]$ integrals (denoted by the change from square brackets to round brackets), in which the constituent GTOs that make up a CGTO are combined,

$$[\mathbf{p}|\mathbf{cd}) = \sum_{i=1}^{K_A} \sum_{j=1}^{K_B} [\mathbf{p}|\mathbf{c}_i \mathbf{d}_j] \tag{2.25}$$
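The contraction in Equation 2.25 is a plain double sum over primitives, because the contraction coefficients have already been folded into $[\mathbf{0}]^{(m)}$ in Equation 2.21. A minimal sketch follows, assuming each primitive pair $(i, j)$ contributes a list holding every component of the integral class; the name `contract_ket` and the data layout are illustrative assumptions.

```python
def contract_ket(primitive_blocks):
    """Contract primitive integrals as in Eq. 2.25.

    primitive_blocks[i][j] is a list of [p|c_i d_j] values, one entry per
    component of the integral class; all blocks have the same length.
    Returns the element-wise sum over i and j, i.e. the [p|cd) values.
    """
    n = len(primitive_blocks[0][0])
    contracted = [0.0] * n
    for row in primitive_blocks:       # loop over primitives i
        for block in row:              # loop over primitives j
            for k, value in enumerate(block):
                contracted[k] += value
    return contracted

# Two primitives on each center, two components per block (hypothetical data).
blocks = [[[1.0, 0.5], [0.2, 0.1]],
          [[0.3, 0.0], [0.4, 0.6]]]
print(contract_ket(blocks))
```

The work per contracted integral scales with the product of the contraction lengths, which is why the point at which this step is performed matters for the overall cost.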

Steps (f) and (g) are carried out using transfer and contraction relations similar to Equations 2.24 and 2.25. As a pedagogical exercise, a basic MD integral evaluation algorithm and SCF code were developed in C++ and parallelized using OpenMP.

2.3.3.2 The PRISM algorithm

Figure 2.8: The MD-PRISM ERI scheme. From [229], [92].

The efficacy of an ERI algorithm depends on when PGTOs are combined to form fully contracted CGTOs. Algorithms like MD choose to carry out integral contraction late, i.e. the contraction step, $(ab|cd)$, is the last to be performed. Gill et al. [95] realized that the efficiency of various integral algorithms is tied not only to the nature of the integrals being computed, but also to when the contraction steps are performed. In recognition of this, they created the PRISM algorithm, which dynamically chooses when contractions are performed. Figure 2.7 is the front face of MD-PRISM. There are at most three transformation (T) steps and two contraction (C) steps to generate a given $(ab|cd)$ shell-quartet. The MD algorithm corresponds to the T1T4C5T8C8 path in PRISM. PRISM further recognizes that shell-quartets that have identical angular momentum types and contraction lengths can be treated together. This allows these shell-quartets to be vectorized in batches on vector machines, or the batches to be further cache-blocked for operation on cache-based architectures.
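The grouping of shell-quartets by type can be sketched as a dictionary keyed on angular momentum and contraction length. This is an illustration of the idea only, not PRISM's data structures; the `ShellQuartet` record and `batch_quartets` function are hypothetical.

```python
from collections import defaultdict, namedtuple

# Hypothetical record: angular momenta (la, lb, lc, ld) and
# contraction lengths (Ka, Kb, Kc, Kd) of the four shells.
ShellQuartet = namedtuple("ShellQuartet", ["ang_mom", "contraction"])

def batch_quartets(quartets):
    """Group quartet indices by (angular momentum type, contraction lengths)."""
    batches = defaultdict(list)
    for index, quartet in enumerate(quartets):
        batches[(quartet.ang_mom, quartet.contraction)].append(index)
    return dict(batches)

quartets = [
    ShellQuartet((0, 0, 0, 0), (3, 3, 3, 3)),
    ShellQuartet((1, 0, 0, 0), (3, 3, 1, 1)),
    ShellQuartet((0, 0, 0, 0), (3, 3, 3, 3)),
]
for key, members in batch_quartets(quartets).items():
    print(key, members)
```

Every quartet in a batch follows the same sequence of transformation and contraction steps, so the inner loops can run over the batch index with identical instructions, which is what makes vectorization or cache blocking applicable.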

PRISM's mode of operation is as follows. Once the shell-pair data is collected, integral screening is performed to reduce the number of significant shell-pairs that need to be considered. The list of remaining shell-pairs is sorted and the shell-pairs are paired to form shell-quartets. At this stage, similar shell-quartets of the same type are batched to increase the sharing of intermediate quantities. After this step, the $[\mathbf{0}]^{(m)}$ integrals are computed. Each of the transformation steps uses driver routines which take pre-computed solutions to the tree search problem (i.e. the best way of computing an integral without resorting to recursion) and execute a series of operations on array locations in order to generate the required integral [92].
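The screening, sorting and pairing stages described above can be sketched as follows. The text does not say which screening test or sort key is used, so this sketch makes explicit assumptions: each shell-pair carries a magnitude estimate `bound` (for example a Schwarz-type bound, for which the product of two pair bounds limits the quartet magnitude), pairs are sorted by decreasing bound, and both tolerances are hypothetical parameters.

```python
from collections import namedtuple

# Hypothetical record: a label and an upper-bound estimate for the pair.
ShellPair = namedtuple("ShellPair", ["name", "bound"])

def significant_quartets(pairs, pair_tol, quartet_tol):
    """Screen shell-pairs, sort them, and pair them into shell-quartets.

    Assumes bound(bra) * bound(ket) estimates the quartet magnitude.
    Returns a list of (bra_name, ket_name) tuples.
    """
    # 1. Screening: discard shell-pairs that are negligible on their own.
    kept = [p for p in pairs if p.bound >= pair_tol]
    # 2. Sorting: largest bound first, so the inner loop can stop early.
    kept.sort(key=lambda p: p.bound, reverse=True)
    # 3. Pairing: combine each bra with kets up to and including itself.
    quartets = []
    for i, bra in enumerate(kept):
        for ket in kept[: i + 1]:
            if bra.bound * ket.bound < quartet_tol:
                break  # every later ket has an even smaller bound
            quartets.append((bra.name, ket.name))
    return quartets

pairs = [ShellPair("ab", 1e-3), ShellPair("aa", 1.0), ShellPair("bb", 1e-7)]
print(significant_quartets(pairs, pair_tol=1e-6, quartet_tol=1e-5))
```

The surviving quartets would then be grouped by type before the $[\mathbf{0}]^{(m)}$ integrals are formed, following the order of operations given in the text.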

We note that the implementation of PRISM in the Gaussian code has additional paths, which correspond to the generalisation of the Obara-Saika (OS) integral algorithm [197, 198], in addition to MD-PRISM. Reference [92] casts the OS recurrence relations into a form similar to MD-PRISM. These OS paths are subsequently used by PRISM for certain contracted integrals, as their implementation yields lower run-times.

In document Performance Models for Electronic Structure Methods on Modern Computer Architectures (Page 58-62)