Many-body Study of Core-valence Partitioning and Correlation in Systems with Large-Z Element.


ABSTRACT

ZHU, MINYI. Many-body Study of Core-valence Partitioning and Correlation in Systems with Large-Z Element. (Under the direction of Lubos Mitas.)

Quantum Monte Carlo (QMC) is one of the most promising many-body electronic structure approaches for studying strongly correlated electron systems. We have applied QMC to calculations of transition metal oxides and of large systems containing heavy elements such as thorium and lead (Pb). Relativistic effects become non-negligible when heavy elements are involved. However, a direct calculation of relativistic effects in QMC is not practicable because it requires a 4-component framework and all of the electrons need to be considered. Scalar-relativistic effective core potentials (ECPs) and 2-component relativistic ECPs (RECPs) were introduced in quantum chemistry to mimic all-electron calculations including relativistic effects.

We study the accuracy of ECPs with two different core sizes using Hartree-Fock, density functional theory (DFT) and QMC methods, with the MnO molecule as a test system. We show that the discrepancies between all-electron and ECP calculations of transition metal oxides are actually caused by the non-linear exchange-correlation functionals in DFT rather than by inaccuracy of the ECPs. High-accuracy diffusion Monte Carlo calculations of the MnO molecule confirm that the Ne-core and He-core ECPs are of comparable quality and therefore make it possible to reproduce energy differences to within 0.1 eV or better. In addition, we have corroborated previous results showing that nodal surfaces are most accurate when the trial functions are built from orbitals obtained with hybrid functionals.

We further modify ordinary QMC to extend its applicability to inherently complex wavefunctions. A complex state can result from the presence of a magnetic field, from boundary conditions, or from spin-orbit interactions. The spin-orbit interaction is particularly interesting because it requires the spin to be a dynamic variable, unlike the spin-free formulation of ordinary QMC, in which the spin is restricted to a static label. We implement an innovative spin-sampling technique and the fixed-phase approximation for diffusion Monte Carlo (DMC). With the help of RECPs, we calculate the excitation energies of the Pb atom and the binding properties of Pb molecules. The excellent agreement with experimental results shows that our new spin-orbit QMC is very promising and capable of reproducing spin-orbit interactions.

The study of thorium halides was completed in collaboration with Shi Guo and Shuming Hu. We investigate the bond dissociation energies (BDEs) of ThX_n (X = Cl, Br). Comparison of experimental results with theoretical calculations, including DFT and QMC, shows better agreement when DMC is used for ThCl_n. However, an abnormal experimental BDE curve of ThBr_n is not predicted by our calculations, which indicates that additional work including both theoretical


Many-body Study of Core-valence Partitioning and Correlation in Systems with Large-Z Element

by Minyi Zhu

A dissertation submitted to the Graduate Faculty of North Carolina State University

in partial fulfillment of the requirements for the Degree of

Doctor of Philosophy

Physics

Raleigh, North Carolina

2013

APPROVED BY:

Celeste Sagui Dean Lee

Elena Jakubikova Lubos Mitas


DEDICATION


BIOGRAPHY

The author was born on April 18, 1985, in Shanghai. He obtained his bachelor's degree from Shanghai Jiao Tong University. In 2007, he decided to attend NC State University (NCSU) and started


ACKNOWLEDGEMENTS

I would like to thank my advisor, Lubos Mitas, for his help and support. I could not have completed my research without his encouragement and advice. His patience, enthusiasm and immense knowledge helped me to accomplish my PhD study. The experience under his guidance will be an invaluable part of my life.

I would like to thank Prof. Celeste Sagui, Prof. Dean Lee and Prof. Elena Jakubikova for being my committee members and giving me advice on my work. I would also like to thank the Department of Physics, NC State University, especially Prof. Harald Ade and Prof. Chueng Ji for serving as directors of the graduate programs.

Thank you to all of my past and current group members: Jindrich Kolorenc, Michal Bajdich, Shuming Hu, Xin Li, Rene Derian and Shi Guo. I really enjoyed working with my colleagues, who have given me many suggestions on my research, great conversation and great friendship.


TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Chapter 1 Introduction
1.1 Organization of This Thesis

Chapter 2 Traditional Electronic Structure Theories
2.1 Many-body Problem in Electronic Structure Theory
2.2 Hartree-Fock Method
2.3 Post-Hartree-Fock Method
2.4 Density Functional Theory
REFERENCES

Chapter 3 Quantum Monte Carlo Methods
3.1 Introduction
3.2 Principle of Monte Carlo Methods
3.2.1 Probability Theory
3.2.2 Advantages of Monte Carlo Method
3.2.3 Importance Sampling
3.2.4 Metropolis Algorithm
3.3 Variational Monte Carlo
3.4 Variational Trial Wavefunction
3.4.1 Cusp Conditions
3.4.2 Form of Trial Wavefunction
3.5 Optimization of Wavefunction
3.5.1 Variance Minimization
3.5.2 Energy Minimization
3.5.3 Newton Method and Beyond
3.6 Diffusion Monte Carlo
3.6.1 Time Dependent Green’s Function
3.6.2 Short-time Approximation
3.6.3 Importance Sampling and Outline of DMC
3.6.4 Fixed-Node Approximation
3.7 Summary
REFERENCES

Chapter 4 Study of Ne-core and He-core Pseudopotential Errors in the MnO Molecule
4.2.1 Ne-core and He-core ECPs
4.2.2 Ansatz for the He-core ECP
4.2.3 Optimization of ECP
4.2.4 Implementation of ECP in QMC
4.3 Results
4.4 Conclusion
REFERENCES

Chapter 5 Calculation of Spin-orbit Interaction in QMC
5.2 Principles of Relativistic Quantum Chemistry
5.3 2-component Relativistic Effective Core Potential (RECP) and SCF Theories
5.4 Spin-Orbit Coupling in Quantum Monte Carlo Calculations
5.4.1 Trial Wavefunction of p² Configuration
5.4.2 Spin-dependent Hamiltonian
5.4.3 ECP Operator
5.4.4 Fixed-phase Approximation and Diffusion Monte Carlo
5.5 Results
5.5.1 Atomic Results
5.5.2 Molecular Results
5.6 Additional Benchmarks on Spin Timestep
REFERENCES

Chapter 6 Dissociation Energy Study of Thorium Halides
6.2 Computational Methodology
6.3 Results
REFERENCES

Appendices
Appendix A Parameters of the He-core ECP for 3d transition-metal elements
Appendix B Parameters of modified Pb RECPs


LIST OF TABLES

Table 4.1 The discrepancies between all-electron and Ne-core pseudopotential for the energy difference (eV) between antiferromagnetic and nonmagnetic states in the MnO solid [2].

Table 4.2 Parameters of the He-core ECP for Mn. The conventional ECP representation in quantum chemistry programs is used: $V_l(r) = r^{-2} \sum_k A_{kl}\, r^{n_{kl}}\, e^{-B_{kl} r^2}$

Table 4.3 Errors (in eV) of the self-consistent excitation energies of the Mn atom for the two types of ECPs with regard to the all-electron Dirac-Fock calculation

Table 4.4 Energy difference Δ = E_hs − E_ls between high-spin (2S + 1 = 6) and low-spin (2S + 1 = 2) states of the MnO molecule and corresponding errors quantified as the disagreement between Δ_all and Δ_ECP for different treatment of the cores and methods.

Table 4.5 Comparison of QMC excitation energies (eV) and the differences with regard to the He-core ECP and differences between the three pseudopotentials. In the brackets are types of trial functions which are either single reference built from DFT orbitals or, in the last row, multi-reference from CI.

Table 5.1 Total energy of carbon atom (a.u.)

Table 5.2 Excitation energies of Pb atom using large-core RECP [24] (eV)

Table 5.3 Excitation energies of Pb: comparison of large-core and small-core RECP (eV)

Table 5.4 Total energy (atomic units) of Pb ground state; c: conventional with single v_p channel; s: spin sampling, 2-component v_pj channels

Table 5.5 Excitation energies of Pb atom using conventional and spin sampling QMC (eV)

Table 5.6 Dissociation energies of PbH systems using large-core RECP (eV)

Table 5.7 Bond lengths (r_e) and dissociation energies (D_e) of PbO

Table 5.8 DMC total energies of ground state Pb using different spin time steps

Table 6.1 List of bond lengths [Å], symmetries, and total energies [a.u.] from DFT (B3LYP) and DMC

Table 6.2 Comparison of the BDEs (kcal/mol) of ThBr_n from hybrid-functional DFT (B3LYP) calculation and experiment. The two columns (78e− RECP and 60e− RECP) indicate large-core scalar-relativistic ECP and small-core 2-component relativistic ECP with spin-orbit effect of thorium atom [11].

Table A.1 Parameters of the He-core ECP for Sc

Table A.2 Parameters of the He-core ECP for Ti

Table A.3 Parameters of the He-core ECP for V

Table A.4 Parameters of the He-core ECP for Cr

Table A.5 Parameters of the He-core ECP for Mn


LIST OF FIGURES

Figure 3.1 Target function in the original integral

Figure 3.2 Trial distribution in the importance sampling

Figure 3.3 Comparison of exact wavefunction Φ_0 and fixed-node solution Φ_FN in 1-dimension; the error caused by the fixed-node approximation is of the order of 5% of the correlation energy

Figure 4.1 He-core pseudopotentials for the Mn atom.

Figure 5.1 Pb atom: 1-component AREP (weighted average of RECP) and 2-component RECP (s, p channels only) [24]

Figure 5.2 Energy levels of group 14 elements. Plotted by I. Kim and Y. S. Lee [31] using experimental data for C-Pb and theoretical results for Fl

Figure 6.1 Experimental bond dissociation energies of Th-F, Th-Cl and Th-Br. Plotted by D. L. Hildenbrand and K. H. Lau [3, 4, 5]

Figure 6.2 Bond dissociation energies for ThCl_n molecules. (Data from Shi Guo’s


Chapter 1

Introduction

Over the past several decades, computational simulation has become an important tool in physics, chemistry, materials science and bioscience. Ab initio calculations start from the first principles of quantum mechanics, without introducing any empirical or semi-empirical parameters. With the enormous development of high-performance computing, first-principles methods now enable us to simulate large systems containing hundreds or thousands of atoms. Electronic structure calculations of atoms, molecules and solids help not only to verify but also to predict new properties of nanoscale materials.

The main issue in first-principles calculations is how to solve the quantum mechanical equations both efficiently and accurately. As Dirac famously wrote:

“The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.”

(Post-)Hartree-Fock, density functional theory (DFT), quantum Monte Carlo (QMC), pseudopotential models and many other theories have been successfully developed to solve these complicated equations. The treatment of electron-nucleus and electron-electron interactions and of correlation is at the heart of these computational methods. This dissertation discusses my research on strongly correlated systems, such as transition metal and heavy-element systems, using high-accuracy quantum Monte Carlo methods and effective core potential models in comparison with traditional mean-field approaches.

1.1 Organization of This Thesis

We start by giving a brief introduction to Born-Oppenheimer Approximation, Hartree-Fock,


In Chapter 3, we discuss two typical quantum Monte Carlo methods and the concept of the trial wavefunction.

In Chapter 4, we study the accuracy of ECPs by comparing all-electron, He-core and Ne-core ECP calculations in DFT, HF and QMC approaches applied to the MnO molecule.

In Chapter 5, we demonstrate the innovative implementation of the 2-component RECP in QMC, which makes it possible to treat the spin-orbit interaction in QMC with high accuracy.

Chapter 6 is part of my contribution (collaborated with Hu, Shuming and Guo, Shi) to the


Chapter 2

Traditional Electronic Structure

Theories

For an n-particle system, the exact solution of the many-body Schrödinger equation is prohibitively complicated and impossible in most cases. Several popular electronic structure theories and approximations for simplifying non-relativistic many-body problems are briefly reviewed in this chapter. A comprehensive treatment of relativistic effects is given in Chapter 5.

2.1 Many-body Problem in Electronic Structure Theory

The description of the electronic structure of matter is derived from the Schrödinger equation of quantum mechanics. The many-body system consists of both heavy nuclei and light electrons, which interact with each other through the Coulomb interaction. The typical non-relativistic Hamiltonian for this system is:

$$\hat{H} = -\frac{\hbar^2}{2m_e}\sum_i \nabla_i^2 - \sum_{i,I}\frac{Z_I e^2}{|r_i - R_I|} + \frac{1}{2}\sum_{i\neq j}\frac{e^2}{|r_i - r_j|} - \sum_I \frac{\hbar^2}{2M_I}\nabla_I^2 + \frac{1}{2}\sum_{I\neq J}\frac{Z_I Z_J e^2}{|R_I - R_J|} \tag{2.1}$$

where $m_e$ and $M_I$ are the masses of the electrons and nuclei, with coordinates $r_i$ and $R_I$, respectively. Because the mass of a nucleus far exceeds the mass of an electron, we can assume that the motion (kinetic energy) of the nuclei is negligible compared to that of the electrons. The second


Born-Oppenheimer Approximation. The simplified Hamiltonian becomes

$$\hat{H} = -\frac{1}{2}\sum_i \nabla_i^2 - \sum_{i,I}\frac{Z_I}{|r_i - R_I|} + \frac{1}{2}\sum_{i\neq j}\frac{1}{|r_i - r_j|} \tag{2.2}$$

where atomic units ($\hbar = m_e = e = 1$) are used hereafter. The ith eigenstate $\Psi_i(1, 2, \ldots, N)$ of the time-independent Schrödinger equation

$$\hat{H}\,\Psi_i(1, 2, \ldots, N) = E_i\,\Psi_i(1, 2, \ldots, N) \tag{2.3}$$

is a many-body wavefunction. The generalized coordinates $(1, 2, \ldots, N)$ include both spin and spatial coordinates. The total energy is the expectation value of the Hamiltonian,

$$E = \frac{\langle\Psi|\hat{H}|\Psi\rangle}{\langle\Psi|\Psi\rangle} \tag{2.4}$$

Of particular interest is the ground state of the system, which has the lowest eigenvalue of Eq. 2.3. Although the Hamiltonian is simplified by the Born-Oppenheimer approximation, it is still impractical to obtain an exact analytical solution of the many-body Schrödinger equation, even for a system with only a few electrons. A few traditional quantum chemistry methods, in which various further approximations are made, are discussed below.

2.2 Hartree-Fock Method

An early attempt to tackle the many-body problem is known as the Hartree product. In 1926, Hartree introduced a theory based on the assumption that each electron is subjected only to the mean field created by all other electrons. The wavefunction of such a non-interacting system, also called the Hartree product, is given by:

$$\Psi_{HP}(1, \ldots, N) = \varphi_1(r_1)\cdots\varphi_N(r_N) \tag{2.5}$$

However, the Hartree approximation fails to describe other important properties correctly, such as the indistinguishability of electrons. To overcome this problem, the Hartree-Fock method was developed as an extension of the Hartree approximation. In Hartree-Fock theory, the motion of each electron


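The spin restriction (Pauli principle) forces an antisymmetric form of the wavefunction:

$$\Psi_{HF}(1, \ldots, N) = \frac{1}{\sqrt{N!}}
\begin{vmatrix}
\varphi_1(r_1) & \varphi_2(r_1) & \cdots & \varphi_N(r_1) \\
\varphi_1(r_2) & \varphi_2(r_2) & \cdots & \varphi_N(r_2) \\
\vdots & \vdots & \ddots & \vdots \\
\varphi_1(r_N) & \varphi_2(r_N) & \cdots & \varphi_N(r_N)
\end{vmatrix} \tag{2.6}$$

which is written as a normalized Slater determinant.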
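The antisymmetry built into Eq. 2.6 can be checked numerically for the smallest case, N = 2. The following is a minimal sketch under stated assumptions: the two one-dimensional orbitals `phi_1` and `phi_2` are hypothetical illustrative choices (not orbitals used in this thesis), and spin is ignored.

```python
import math


def phi_1(x):
    """Hypothetical first orbital (Gaussian), for illustration only."""
    return math.exp(-0.5 * x * x)


def phi_2(x):
    """Hypothetical second orbital (Gaussian times x), for illustration only."""
    return x * math.exp(-0.5 * x * x)


def slater_2(x1, x2):
    """Normalized 2x2 Slater determinant, the N = 2 case of Eq. (2.6)."""
    return (phi_1(x1) * phi_2(x2) - phi_2(x1) * phi_1(x2)) / math.sqrt(2.0)


# Exchanging the two electrons flips the sign of the wavefunction ...
print(slater_2(0.3, -1.2), slater_2(-1.2, 0.3))
# ... and the wavefunction vanishes when both electrons sit at the same point.
print(slater_2(0.7, 0.7))
```

The sign change under exchange and the zero at coincident coordinates are exactly the properties that a simple Hartree product (Eq. 2.5) lacks.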

The expectation value of the Hamiltonian is evaluated by

$$E_{HF} = \sum_{s_i}\int \Psi^*(1, \ldots, N)\,\hat{H}\,\Psi(1, \ldots, N)\,dr_1 \ldots dr_N \tag{2.7}$$

$E_{HF}$ is always greater than or equal to the exact ground state energy. Therefore, the ground state is obtained through the so-called variational principle. We can use Lagrange's method of undetermined multipliers to find the stationary points of the total energy with respect to variations $\delta\psi_i$ in the one-particle wavefunctions. This finally leads to the one-electron Hartree-Fock equations:

$$\left[-\frac{1}{2}\nabla^2 + V_{ion}(r) + V_{ele}(r)\right]\psi_i(r) - \sum_j \delta_{s_i s_j}\int dr'\,\frac{1}{|r - r'|}\,\psi_j^*(r')\,\psi_i(r')\,\psi_j(r) = \epsilon_i\,\psi_i(r) \tag{2.8}$$

where $s_i$ is the spin state. $V_{ele}$ represents the Coulomb potential of an electron in the average charge distribution, and the last term on the left side is the exchange term of the electron-electron interaction. The advantage of Hartree-Fock theory is that the exchange energy is treated exactly. The Coulomb interaction of an electron with itself (self-interaction error) is cancelled by an equivalent term in the exchange part. On the other hand, the wavefunction in general cannot be described accurately by just one determinant. The incomplete wavefunction leads to an error, the so-called correlation energy. Here and in the rest of this thesis, the electron correlation energy is defined as the difference between the exact energy and the Hartree-Fock energy:

$$E_{corr} = E_{exact} - E_{HF} \tag{2.9}$$


2.3 Post-Hartree-Fock Method

Post-Hartree-Fock methods, of which configuration interaction (CI) is the prototype, are many-body ab initio methods designed to recover the correlation energy lost in the Hartree-Fock approximation. In principle, a full CI provides the exact solution of the many-body Hamiltonian within the chosen one-electron basis. The wavefunction is written as a linear combination of determinants:

$$|\Psi\rangle = c_0|\Psi_0\rangle + \sum_{a,r} c_a^r|\Psi_a^r\rangle + \sum_{\substack{a<b \\ r<s}} c_{ab}^{rs}|\Psi_{ab}^{rs}\rangle + \sum_{\substack{a<b<c \\ r<s<t}} c_{abc}^{rst}|\Psi_{abc}^{rst}\rangle + \sum_{\substack{a<b<c<d \\ r<s<t<u}} c_{abcd}^{rstu}|\Psi_{abcd}^{rstu}\rangle + \cdots \tag{2.10}$$

where the expansion runs over the complete set of determinants, including the dominant Hartree-Fock determinant $|\Psi_0\rangle$, singly excited determinants $|\Psi_a^r\rangle$, doubly excited determinants $|\Psi_{ab}^{rs}\rangle$, etc., up to N-tuply excited determinants. Given N electrons and a finite set of 2K one-electron orbitals (occupied and unoccupied), there exist $\binom{2K}{N}$ different determinants. Therefore, even for a small system and a moderate number of one-electron orbitals, the computational cost of full CI is prohibitive.
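To give a feeling for how quickly the number of determinants grows, the binomial coefficient above can be evaluated directly. This is a small illustrative sketch; the function name `num_determinants` and the chosen system sizes are assumptions made here, not values taken from the calculations in this thesis.

```python
from math import comb


def num_determinants(n_electrons, n_orbitals):
    """Number of N-electron determinants from 2K one-electron orbitals: C(2K, N)."""
    return comb(n_orbitals, n_electrons)


# (N, 2K) pairs chosen only to illustrate the growth of the full CI space
for n_electrons, n_orbitals in [(2, 4), (10, 20), (10, 40), (20, 80)]:
    count = num_determinants(n_electrons, n_orbitals)
    print(f"N = {n_electrons:2d}, 2K = {n_orbitals:2d}: {count:.3e} determinants")
```

Already for 10 electrons in 20 orbitals the expansion contains more than 10^5 determinants, and the count grows combinatorially from there.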

Practical CI approaches usually limit the length of the CI expansion. For example, in the CISD method, only single and double excitations of the electrons into the virtual orbitals are considered. However, a truncated CI wavefunction will generally not describe electron-electron correlation correctly, especially for macroscopic systems such as a crystal. To simulate solids, millions of determinants would have to be constructed and optimized. The lack of size consistency makes typical CI appropriate only for small molecules with up to about 50 electrons (if quadruple excitations are included), whereas another highly accurate many-body approach, the quantum Monte Carlo method, can treat up to about 1000 electrons.

2.4 Density Functional Theory

Density functional theory (DFT) is one of the most popular methods for computing properties of solids or super-molecules. The electron density $n(r)$ describes the probability of finding an electron in a volume element $dr$ around the position $r$. Mathematically, it is defined as [1]

$$n(r) = N\sum_{s_1}\cdots\sum_{s_N}\int dr_2 \cdots \int dr_N\,|\Psi(r, s_1, r_2, s_2, \ldots, r_N, s_N)|^2 \tag{2.11}$$


In 1964, Hohenberg and Kohn [2] proposed and proved two simple theorems which make DFT possible. The Hohenberg-Kohn theorems demonstrate that, for any system of interacting particles, the ground state properties are uniquely determined by the density, and the ground state can be found by minimizing the functional $E_{tot}[n(r)]$. The theorems result in a remarkable simplification: the properties of the system depend on only 3 spatial coordinates instead of 3N coordinates.

However, for an exact many-body wavefunction, Eq. 2.11 still has infinitely many terms in the summation. In 1965, Kohn and Sham [3] went a step further and stated that the problem of interacting electrons can be mapped onto a fictitious system of non-interacting “electrons”. One-electron orbitals of a non-interacting auxiliary system are introduced, from which the kinetic energy can be computed accurately, leaving only a small correction to be calculated separately. The energy of such a system is expressed as a functional of the density:

$$E_{tot}[n(r)] = -\frac{1}{2}\sum_{i=1}^{N}\int \psi_i^*(r)\,\nabla^2\psi_i(r)\,dr + \int n(r)\,V_{ion}(r)\,dr + \frac{1}{2}\iint \frac{n(r)\,n(r')}{|r - r'|}\,dr\,dr' + E_{xc}[n(r)] \tag{2.12}$$

which is composed of the kinetic energy of a non-interacting system, the electron-ion Coulomb energy, the Hartree energy and the exchange-correlation energy. The well-known Kohn-Sham equation (an effective one-particle equation) is obtained by minimizing the total energy functional and is solved iteratively until self-consistency is achieved:

$$\left[-\frac{1}{2}\nabla^2 + V_{ion}(r) + V_H[n(r)] + V_{xc}[n(r)]\right]\psi_i(r) = \epsilon_i\,\psi_i(r) \tag{2.13}$$

where the Hartree potential $V_H$ is

$$V_H(r) = \int dr'\,\frac{n(r')}{|r - r'|} \tag{2.14}$$

and the exchange-correlation potential is

$$V_{xc}(r) = \frac{\delta E_{xc}[n(r)]}{\delta n(r)} \tag{2.15}$$

All terms in Eq. 2.13 are exact and no approximations have been made except for the exchange-correlation potential, which is not known exactly. One of the most widely used approximations to the exchange-correlation functional is the local density approximation (LDA).


or a perfect infinite crystal with uniformly distributed valence electrons and positive ions. The approximation works very well when the electron density is almost uniform. Surprisingly, it also works well for systems in which the electron density varies rapidly. QMC played an important role in finding a highly accurate exchange-correlation functional [4].

However, it is common for DFT to predict some properties correctly but fail for others, especially in strongly correlated materials. For example, it is well known that the LDA and the generalized gradient approximation (GGA) predict an incorrect equilibrium crystal structure and electronic state for FeO. In addition, the approximate xc terms lead to a self-interaction error which is very difficult to quantify. To partially remove the impact of this error, hybrid functionals are constructed to incorporate exact exchange by combining the Hartree-Fock and density functional treatments of exchange. The best-known hybrid functionals are B3LYP [5, 6, 7]:

$$E_{xc}^{B3LYP} = E_{xc}^{LDA} + a_0\left(E_x^{HF} - E_x^{LDA}\right) + a_x\left(E_x^{GGA} - E_x^{LDA}\right) + a_c\left(E_c^{GGA} - E_c^{LDA}\right) \tag{2.16}$$

and the PBE0 functional [8]:


REFERENCES

[1] R. G. Parr and W. Yang, Density-Functional Theory of Atoms and Molecules, Oxford University Press, New York (1989).

[2] P. Hohenberg and W. Kohn, Phys. Rev., 136, B864 (1964).

[3] W. Kohn and L. J. Sham, Phys. Rev., 140, A1133 (1965).

[4] D. M. Ceperley and B. J. Alder, Phys. Rev. Lett., 45, 566 (1980).

[5] A. D. Becke, J. Chem. Phys., 98, 1372 (1993).

[6] K. Kim and K. D. Jordan, J. Phys. Chem., 98, 10089 (1994).

[7] P. J. Stephens, F. J. Devlin, C. F. Chabalowski and M. J. Frisch, J. Phys. Chem., 98, 11623 (1994).


Chapter 3

Quantum Monte Carlo Methods

3.1 Introduction

In this chapter, the key ideas of Monte Carlo simulation and its application in many-body quantum mechanics are discussed. Section 3.2 reviews the principles and background of the Monte Carlo method. Sections 3.3–3.6 introduce several popular implementations of quantum Monte Carlo in electronic structure calculations.

3.2 Principle of Monte Carlo Methods

3.2.1 Probability Theory

In mathematics, a probability measure is associated with an event. The probability $p_k$ of the event is defined as the volume of its outcomes ($N_k$) relative to all possible outcomes ($N$):

$$p_k = \lim_{N\to\infty}\frac{N_k}{N} \tag{3.1}$$

A function that maps the outcomes of an event in the sample space onto a set of real numbers is called a random variable. The expectation value of a random variable is defined as

$$\langle x\rangle = E(x) = \sum_i p_i x_i \tag{3.2}$$

Monte Carlo methods are based on the relation of probability and volume. In fact it uses

the identity of probability in reverse. In Monte Carlo simulation, one draws random samples from a universe of all possible outcomes and interprets the fraction of samples falling into a

given set as the probability. The estimation will converge to the exact value because of the law


The law of large numbers ensures that the average of independent and identically distributed (i.i.d.) random variables tends to stabilize at the exact expectation value of the distribution. For example, suppose we draw i.i.d. samples $x_1, \ldots, x_m$ from a distribution with mean $\mu$ and variance $\sigma_f^2$. The convergence rate of the sample mean is given by the Central Limit Theorem, which states that its variance is

$$\sigma^2 = \frac{\sigma_f^2}{m} \tag{3.3}$$

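Let us now move a small step forward to integration, which is an important application of Monte Carlo. A simple integral can be written as

$$I = \int_a^b f(x)\,dx \tag{3.4}$$

On the other hand, the expectation value of $f(x)$ is

$$E[f(x)] = \int_a^b f(x)\,p(x)\,dx = \lim_{m\to\infty}\frac{1}{m}\sum_{i=1}^{m} f(x_i) \tag{3.5}$$

if $x_1, \ldots, x_m$ are randomly drawn from a uniform distribution in which $p(x_i) = \frac{1}{b-a}$. The Monte Carlo estimate of the integral (Eq. 3.4) is then

$$\hat{I}_m = \frac{b-a}{m}\sum_{i=1}^{m} f(x_i) \tag{3.6}$$

which converges to $I$ as $m\to\infty$. The error $\hat{I}_m - I$ of the Monte Carlo estimate follows a normal distribution $N(0, \sigma_f/\sqrt{m})$, where $\sigma_f$ is the standard deviation of the sampled quantity $(b-a)f(x)$.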
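The estimator of Eq. 3.6 and its $\sigma_f/\sqrt{m}$ error bar can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `mc_integrate`, the fixed seed and the test integrand $x^2$ on $[0, 1]$ are assumptions made here, not part of the original calculations.

```python
import math
import random


def mc_integrate(f, a, b, m, seed=0):
    """Estimate the integral of f over [a, b] from m uniform samples (Eq. 3.6).

    Returns the estimate and its standard error sigma_f / sqrt(m), where
    sigma_f is the sample standard deviation of (b - a) * f(x).
    """
    rng = random.Random(seed)
    values = [(b - a) * f(rng.uniform(a, b)) for _ in range(m)]
    mean = sum(values) / m
    variance = sum((v - mean) ** 2 for v in values) / (m - 1)
    return mean, math.sqrt(variance / m)


# Illustrative integrand: the exact integral of x^2 over [0, 1] is 1/3.
for m in (100, 10000):
    estimate, std_error = mc_integrate(lambda x: x * x, 0.0, 1.0, m)
    print(f"m = {m:6d}: I = {estimate:.4f} +/- {std_error:.4f}")
```

Increasing the sample size by a factor of 100 should shrink the reported standard error by roughly a factor of 10, in line with Eq. 3.3.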

3.2.2 Advantages of Monte Carlo Method

Many problems in physics involve the calculation of particular integrals (e.g., in solving the Schrödinger equation). In many cases these integrals are computed by numerical quadrature, such as the Newton-Cotes formulas and Simpson's rule. However, traditional numerical integration techniques fail in high-dimensional spaces. Let us consider an integral over a d-dimensional hypercube. Using a standard quadrature method on a grid with fixed spacing and m points in total, the error scales as $O(m^{-2/d})$ for the trapezoidal rule and as $O(m^{-4/d})$ for Simpson's rule.

One advantage of Monte Carlo methods over other techniques is that the error retains the form $\sigma_f/\sqrt{m}$ in any dimension d. In electronic structure theory, the many-body problem contains N electrons, which means a 3N-dimensional integral, so that an analytical or grid-based numerical solution becomes intractable. One of the most accurate many-body approaches so far is the quantum Monte Carlo method.

Figure 3.1: Target function f(x, y) in the original integral, plotted over the X axis and Y axis from −1 to 1

not very competitive. However, rather than drawing uniform random points from the space, we can optimize the simulation by using a better sampling technique such as importance sampling, described below. The convergence rate is then improved by a significant factor.

3.2.3 Importance Sampling

An example of the efficiency of Monte Carlo simulation is given in Liu’s book [1]. The target function in the example is:

$$f(x, y) = 0.5\,e^{-90(x-0.5)^2 - 45(y+0.1)^4} + e^{-45(x+0.4)^2 - 60(y-0.5)^2} \tag{3.7}$$

where $(x, y) \in [-1, 1]\times[-1, 1]$, see Figure 3.1. Taking $m = 1000\times1000$ uniform grid points in the area $[-1, 1]\times[-1, 1]$, the integral $I = \iint f(x, y)\,dx\,dy$ is evaluated by a deterministic algorithm:

$$\hat{I} = \frac{4}{m}\left[f^{(1)} + \cdots + f^{(m)}\right] \approx \iint f(x, y)\,dx\,dy \tag{3.8}$$


uniformly distributed samples. However, most of the computation time in numerical integration and simple Monte Carlo simulation is wasted on evaluating regions where the function is nearly zero. Instead of such uniform sampling, a much more efficient sampling technique gives more weight to “important” outcomes and is called “importance sampling”. We can write the integral of interest as

$$I = \int h(x)\,\pi(x)\,dx = E_\pi[h(x)] \tag{3.9}$$

To evaluate this integral, an ordinary Monte Carlo method draws samples $x_1, \ldots, x_m$ from the probability density function $\pi$:

$$\hat{I} = \frac{1}{m}\sum_i h(x_i) \tag{3.10}$$

Sampling from $\pi$ directly is certainly an efficient method. However, in some cases (e.g., a PDE solution), it is impossible to take the samples from $\pi$ because the distribution has no explicit analytical form. In importance sampling Monte Carlo, we instead represent $I$ as

$$I = \int \frac{h(x)\,\pi(x)}{g(x)}\,g(x)\,dx = E_g\!\left[\frac{h(x)\,\pi(x)}{g(x)}\right] \tag{3.11}$$

where $g$ is a nonnegative and normalized probability density function close to $\pi$. Importance sampling draws $x_1, \ldots, x_m$ from this trial distribution $g$ (Figure 3.2). The Monte Carlo estimator with importance sampling associated with $g$ is then given by:

$$\hat{I} = \frac{1}{m}\sum_i h(x_i)\,\frac{\pi(x_i)}{g(x_i)} \tag{3.12}$$

The importance weight $\pi(x_i)/g(x_i)$ is a bias correction which can be determined exactly at any point $x_i$. The key to a smaller estimation error lies in selecting an effective importance sampling density $g$, which should be as “close” in shape to $h(x)\pi(x)$ as possible. In the ideal situation, $h(x)\pi(x)/g(x)$ is a constant for all possible $x_i$ and the importance sampling estimator has zero variance. In other cases, the better the approximation is, the smaller the variance of the estimator will be. In the example, instead of a uniform distribution, if we choose a trial distribution $g(x, y)$ as

$$g(x, y) \propto 0.5\,e^{-90(x-0.5)^2 - 10(y+0.1)^2} + e^{-45(x+0.4)^2 - 60(y-0.5)^2} \tag{3.13}$$

the estimate of the integral $I$ is 0.1259 and the standard error is 0.0005 with sample size $m = 2500$. This is a significant improvement compared to the simple Monte Carlo algorithm in

Figure 3.2: Trial distribution g(x, y) in the importance sampling, plotted over the X axis and Y axis from −1 to 1
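The importance sampling example built from Eqs. 3.7 and 3.13 can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the procedure used in the cited book: the trial density is sampled as an untruncated mixture of two Gaussians, the target function is set to zero outside the square $[-1, 1]\times[-1, 1]$ so that the estimator remains unbiased, and all function names and the seed are hypothetical.

```python
import math
import random

# Gaussian pieces of g in Eq. (3.13): c * exp(-a (x - x0)^2 - b (y - y0)^2)
COMPONENTS = [
    (0.5, 90.0, 10.0, 0.5, -0.1),
    (1.0, 45.0, 60.0, -0.4, 0.5),
]
# Integral of each piece over the whole plane: c * pi / sqrt(a * b)
MASSES = [c * math.pi / math.sqrt(a * b) for c, a, b, _, _ in COMPONENTS]
NORM = sum(MASSES)


def target(x, y):
    """Target function f(x, y) of Eq. (3.7), set to zero outside [-1, 1]^2."""
    if abs(x) > 1.0 or abs(y) > 1.0:
        return 0.0
    return (0.5 * math.exp(-90.0 * (x - 0.5) ** 2 - 45.0 * (y + 0.1) ** 4)
            + math.exp(-45.0 * (x + 0.4) ** 2 - 60.0 * (y - 0.5) ** 2))


def trial_density(x, y):
    """Normalized trial density g(x, y): a mixture of two Gaussians."""
    total = sum(c * math.exp(-a * (x - x0) ** 2 - b * (y - y0) ** 2)
                for c, a, b, x0, y0 in COMPONENTS)
    return total / NORM


def sample_trial(rng):
    """Draw one point from g: pick a component, then sample its Gaussian."""
    _, a, b, x0, y0 = rng.choices(COMPONENTS, weights=MASSES)[0]
    # exp(-a u^2) is a Gaussian with standard deviation 1 / sqrt(2 a)
    return (rng.gauss(x0, 1.0 / math.sqrt(2.0 * a)),
            rng.gauss(y0, 1.0 / math.sqrt(2.0 * b)))


def importance_sampling(m, seed=0):
    """Importance sampling estimate of the integral of f, with standard error."""
    rng = random.Random(seed)
    weights = []
    for _ in range(m):
        x, y = sample_trial(rng)
        weights.append(target(x, y) / trial_density(x, y))
    mean = sum(weights) / m
    variance = sum((w - mean) ** 2 for w in weights) / (m - 1)
    return mean, math.sqrt(variance / m)


estimate, std_error = importance_sampling(2500)
print(f"I = {estimate:.4f} +/- {std_error:.4f}")
```

Because the second Gaussian term of $g$ coincides with the second term of $f$, the weights are constant in that region and only the first term contributes to the variance, which is why so few samples suffice.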

3.2.4 Metropolis Algorithm

The example in the previous section introduces a trial distribution $g(x, y)$ in order to develop an efficient Monte Carlo algorithm. However, the trial distribution in importance sampling must be a normalized probability density function. Therefore, a normalization constant, or partition function, is required in $g(x, y)$. In general, given any function $\Psi(X)$ in configuration space $X$, the probability density is

$$P(X) = \frac{|\Psi(X)|^2}{\int dX\,|\Psi(X)|^2} \tag{3.14}$$

The partition function $\int dX\,|\Psi(X)|^2$ is a nontrivial integral which in most cases is no easier to evaluate than the original integral. To avoid this additional integral, the Metropolis algorithm [13] was developed to draw a collection of samples from any unnormalized distribution function $|\Psi|^2$, which is proportional to the desired probability density $P$.

The Metropolis algorithm is a simulation method based on Markov chains. This algorithm


state $X_0$, the Metropolis algorithm can be illustrated by the following steps:

1. Propose a trial move from the current state $X_t$ to $X'$. The probability of the transition is defined by a transition probability function $T(X_t, X')$. Mathematically, it is a symmetric and ergodic function which satisfies

$$T(x, y) = T(y, x), \qquad \sum_y T(x, y) = 1 \tag{3.15}$$

2. Define the acceptance function $A(X_t, X')$ as the probability of accepting the trial move. At equilibrium, due to the detailed balance condition [13], $A(X_t, X')$ can be calculated by:

$$A(X_t, X') = \min\left[1, \frac{T(X', X_t)\,P(X')}{T(X_t, X')\,P(X_t)}\right] = \min\left[1, \frac{P(X')}{P(X_t)}\right] \tag{3.16}$$

The normalization of the target distribution function $P$ cancels in the above equation.

3. Generate a random number $u \sim \mathrm{Uniform}[0, 1]$ and update the state by

$$X_{t+1} = \begin{cases} X' & \text{if } u \le A(X_t, X') \\ X_t & \text{otherwise} \end{cases} \tag{3.17}$$

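Given a variable or operator $\hat{O}$, the expectation value is represented by the average over the ensemble of configurations $X_i$ sampled at equilibrium:

$$\langle\hat{O}\rangle_N = \frac{\int O(X)\,P(X)\,dX}{\int P(X)\,dX} \approx \frac{1}{N}\sum_{i=1}^{N} O(X_i) \tag{3.18}$$

The variance of the measurement is defined as

$$\sigma^2 \approx \frac{1}{N-1}\sum_{i=1}^{N}\left(O(X_i) - \langle\hat{O}\rangle_N\right)^2 \tag{3.19}$$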
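The three Metropolis steps and the ensemble average of Eq. 3.18 can be sketched as follows. This is a hedged illustration, not the implementation used in this work: the one-dimensional target $|\Psi(x)|^2 = e^{-x^2}$, the uniform symmetric proposal, the step size, the burn-in length and the name `metropolis` are all assumptions chosen for the example.

```python
import math
import random


def metropolis(log_weight, x0, n_steps, step_size=1.5, n_burn=1000, seed=0):
    """Sample x from P(x) proportional to exp(log_weight(x)).

    Uses a symmetric uniform proposal, so T cancels in Eq. (3.16).
    Returns the samples kept after burn-in and the acceptance rate.
    """
    rng = random.Random(seed)
    x = x0
    samples = []
    n_accept = 0
    for t in range(n_burn + n_steps):
        # Step 1: propose a trial move X' from a symmetric distribution
        x_trial = x + rng.uniform(-step_size, step_size)
        # Step 2: acceptance probability min[1, P(X') / P(X_t)]
        accept_prob = math.exp(min(0.0, log_weight(x_trial) - log_weight(x)))
        # Step 3: accept or reject with a uniform random number
        if rng.random() <= accept_prob:
            x = x_trial
            n_accept += 1
        if t >= n_burn:
            samples.append(x)
    return samples, n_accept / (n_burn + n_steps)


# Unnormalized target |Psi(x)|^2 = exp(-x^2); the exact <x^2> is 1/2.
samples, acceptance = metropolis(lambda x: -x * x, 0.0, 50000)
mean_x2 = sum(s * s for s in samples) / len(samples)
print(f"<x^2> = {mean_x2:.3f}, acceptance rate = {acceptance:.2f}")
```

Only ratios of the target appear in the acceptance step, so the normalization integral of Eq. 3.14 is never needed. Successive samples are correlated, so the naive variance of Eq. 3.19 underestimates the true statistical error unless the samples are decorrelated first.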


3.3 Variational Monte Carlo

We will introduce the first QMC method, variational Monte Carlo(VMC) method in this section.

In QMC, we use the non-relativistic many-body Hamiltonian without mean-field approxima-tions. The kinetic and potential energy operators are given by

T =−1

2 Nele

X

i=1

52_i (3.20)

V = Ne X i=1 Nion X A=1 ZA riA + Ne X i=1 Ne X j>i 1 rij (3.21)

VMC is based on the variational principle. Given a trial wavefunction $\Psi_T$, the expectation value of the Hamiltonian is

$$E_T = \frac{\langle\Psi_T|H|\Psi_T\rangle}{\langle\Psi_T|\Psi_T\rangle} = \frac{\int dR\,|\Psi_T(R)|^2\,\frac{H\Psi_T(R)}{\Psi_T(R)}}{\int dR\,|\Psi_T(R)|^2} = \frac{\int dR\,|\Psi_T(R)|^2\,E_L(R)}{\int dR\,|\Psi_T(R)|^2} \ge E_0 \tag{3.22}$$

where $E_0$ is the exact ground state energy and $E_T$ is an upper bound of $E_0$. The quantity $H\Psi_T(R)/\Psi_T(R)$ is called the local energy $E_L$. Since $|\Psi_T(R)|^2$ is a non-negative function and not necessarily normalized, we can use importance sampling and the Metropolis algorithm introduced in the previous section. The trial ground state energy $E_T$ is the average of $E_L$

$$E_T = \frac{1}{N}\sum_{i=1}^{N} E_L(R_i) \tag{3.23}$$

with estimated variance

$$\sigma^2_{E_T} = \frac{1}{N}\sum_{i=1}^{N}\left[E_L(R_i) - E_T\right]^2 \tag{3.24}$$

Here R is a configuration of electron positions $(r_1, r_2, \ldots, r_N)$. The trial move from R to R' is realized by a random vector $\vec{\Delta} = (\Delta r_1, \Delta r_2, \ldots, \Delta r_N)$ which follows a Gaussian distribution, so that

$$R' = R + \vec{\Delta} \tag{3.25}$$


individually instead of all together.

If the trial wavefunction $\Psi_T$ is an exact eigenfunction of H, $E_L$ is a constant and $\sigma^2_{E_T} = 0$ (zero variance). In general, the better $\Psi_T$ approximates the ground state wavefunction, the smaller the variance will be. The next section introduces the variational trial wavefunction used in our calculations.
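The variational bound and the zero-variance property can be checked on the hydrogen atom. The sketch below is illustrative only and is not the production code of this work: it assumes the one-parameter trial function ψ = exp(−αr) in atomic units, for which the local energy is −α²/2 + (α − 1)/r, and the function names and move size are hypothetical.

```python
import math
import random


def local_energy(r, alpha):
    """E_L = H psi / psi for psi = exp(-alpha * r) in the hydrogen atom (atomic units)."""
    return -0.5 * alpha * alpha + (alpha - 1.0) / r


def vmc_hydrogen(alpha, n_steps=50_000, n_equil=2_000, step=0.5, seed=1):
    rng = random.Random(seed)
    pos = [1.0, 0.0, 0.0]
    r = 1.0
    energies = []
    for it in range(n_steps + n_equil):
        trial = [x + rng.gauss(0.0, step) for x in pos]   # R' = R + Gaussian vector
        r_trial = math.sqrt(sum(x * x for x in trial))
        # |psi(R')|^2 / |psi(R)|^2 for psi = exp(-alpha r)
        ratio = math.exp(-2.0 * alpha * (r_trial - r))
        if rng.random() <= ratio:                         # Metropolis acceptance
            pos, r = trial, r_trial
        if it >= n_equil:
            energies.append(local_energy(r, alpha))
    n = len(energies)
    e_mean = sum(energies) / n
    e_var = sum((e - e_mean) ** 2 for e in energies) / n
    return e_mean, e_var
```

For α = 1 the trial function is the exact 1s state, so every sample returns −0.5 and the variance vanishes; for α = 0.8 the mean lies near the analytic value α²/2 − α = −0.48, above the exact energy.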

3.4 Variational Trial Wavefunction

The form of the trial wavefunction is not uniquely determined. Typically, a Hartree-Fock wavefunction or a more general multi-configuration CI wavefunction multiplied by a Jastrow factor is the most commonly used trial wavefunction. In VMC, about 50-80% of the correlation energy can be recovered by using optimized trial wavefunctions. Although we have freedom in choosing the form of the wavefunction, it must satisfy several basic properties. First, the cusp conditions at the coalescence of two particles must be satisfied in order to obtain a small variance of the local energy. Second, the antisymmetry of $\Psi_T$ under the interchange of two electrons is imposed by the fermionic nature of the system.

3.4.1 Cusp Conditions

The cusp conditions were derived by Kato [3]; they ensure that the divergent kinetic and potential energies cancel each other at electron-nucleus and electron-electron coalescence points. Without the cusp conditions, the local energy $H\psi/\psi$ diverges. We first focus on the electron-nucleus cusp condition. The hydrogen-like Schrödinger equation for a single electron can be written as

$$\left[\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} + \frac{2Z}{r} - \frac{l(l+1)}{r^2} + 2E\right]\psi(r) = 0 \tag{3.26}$$

where $\psi(r)$ is the radial function of a one-electron orbital. All singular terms $r^{-n}$ must cancel at $r_{iI} = 0$, and the cusp condition is summarized as:

$$\left.\frac{1}{\psi}\frac{\partial\psi}{\partial r_{iI}}\right|_{r_{iI}=0} = -Z_I \tag{3.27}$$
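A short numerical illustration of why the cusp matters: for an s-type orbital ψ = exp(−ζr) near a nucleus of charge Z, the local energy is −ζ²/2 + (ζ − Z)/r in atomic units, which stays finite as r → 0 only if ζ = Z. The function name and the chosen values of Z and ζ are assumptions made for this example.

```python
def local_energy_1s(r, zeta, Z):
    """Local energy of psi = exp(-zeta * r) for a nucleus of charge Z (atomic units, l = 0)."""
    # kinetic part: -1/2 (psi'' + 2 psi'/r) / psi = -zeta^2/2 + zeta/r ; potential part: -Z/r
    return -0.5 * zeta * zeta + (zeta - Z) / r


Z = 2.0
for r in (1e-1, 1e-3, 1e-6):
    with_cusp = local_energy_1s(r, zeta=Z, Z=Z)            # (1/psi) dpsi/dr = -Z at r = 0
    without_cusp = local_energy_1s(r, zeta=0.9 * Z, Z=Z)   # cusp condition violated
    print(f"r={r:g}  E_L(cusp)={with_cusp:.3f}  E_L(no cusp)={without_cusp:.3e}")
```

With the cusp satisfied the local energy is constant, while a 10% error in ζ produces a 1/r divergence that would dominate the variance.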

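Similarly, the electron-electron cusp can be derived from a two-body “radial” equation for the relative coordinate $r_{ij}$ (reduced mass 1/2, repulsive interaction $1/r_{ij}$):

$$\left[2\frac{d^2}{dr_{ij}^2} + \frac{4}{r_{ij}}\frac{d}{dr_{ij}} - \frac{2}{r_{ij}} - \frac{2\,l(l+1)}{r_{ij}^2} + 2E\right]\psi_S(r_{ij}) = 0 \tag{3.28}$$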


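electron-electron cusp condition:

$$\left.\frac{1}{\psi_S}\frac{\partial\psi_S}{\partial r_{ij}}\right|_{r_{ij}=0} = \frac{1}{2(l+1)} \tag{3.29}$$

For like (parallel) spins, the spatial function is antisymmetric (l = 1) and

$$\left.\frac{1}{\psi_S}\frac{\partial\psi_S}{\partial r_{ij}}\right|_{r_{ij}=0} = \frac{1}{4} \tag{3.30}$$

and for unlike (antiparallel) spins (l = 0)

$$\left.\frac{1}{\psi_S}\frac{\partial\psi_S}{\partial r_{ij}}\right|_{r_{ij}=0} = \frac{1}{2} \tag{3.31}$$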

3.4.2 Form of Trial Wavefunction

The trial wavefunction in our QMC calculations is represented by the product of an antisymmetric wavefunction and an exponential Jastrow factor (Slater-Jastrow form):

$$\Psi_T(R) = \Psi_A(R)\exp[U_{corr}(R)] = \sum_{n=1}^{N_{det}} d_n D_n^{\uparrow} D_n^{\downarrow} \times \exp\left[\sum_{iI}\chi(r_{iI}) + \sum_{i\neq j} u(r_{ij}) + \sum_{i\neq j,I} w(r_{ij}, r_{iI}, r_{jI})\right] \tag{3.32}$$

We can write the antisymmetric function in terms of Slater determinants $D_n^{\uparrow}$ for up spins and $D_n^{\downarrow}$ for down spins. Each determinant is composed of one-electron orbitals $\phi$ taken from (post-)HF or DFT calculations. Assuming $\phi$ and $\Psi_T$ are real functions (we will deal with complex wavefunctions in Chapter 5), each orbital is written as the product of a radial part and an angular part

$$\phi_k(r,\Omega) = R_k(r)\,Y_{l_k m_k}(\Omega) \tag{3.33}$$

The radial function is expressed as an expansion in a (Gaussian) basis set or as a numerical orbital. The spherical harmonic functions $Y_{lm}$ are complex-valued. In most quantum chemistry packages, a unitary transformation is made in order to use real spherical harmonic functions $S_{lm}$ (in Chapter 5, we will show that the two approaches give equivalent results):

$$\begin{pmatrix} S_{l,m} \\ S_{l,-m} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} (-1)^m & 1 \\ -(-1)^m\, i & i \end{pmatrix} \begin{pmatrix} Y_{l,m} \\ Y_{l,-m} \end{pmatrix} \tag{3.34}$$

We have already mentioned above that one-particle orbital must satisfy cusp conditions


derivative is zero at $r_{iI} = 0$. However, if we choose proper pseudopotentials (non-divergent at the nucleus), an orbital with zero derivative also satisfies the electron-nucleus cusp condition. In Chapter 4, a He-core pseudopotential that we constructed will be discussed. It is designed for QMC and can reproduce most all-electron results owing to its extremely small core.

In QMC, the electron-electron cusp condition is fulfilled by the Jastrow factor. In CI calculations, the slow convergence of the wavefunction results partly from the attempt to reproduce the correct electron-electron cusp [4]. Therefore, if the Jastrow parameters are chosen to satisfy the cusp condition, a more compact wavefunction can be used in the calculation. This is one of the advantages of QMC over traditional quantum chemistry methods.

The Jastrow factor includes a one-body term $\chi(r_{iI})$ for electron-nucleus correlation, a two-body term $u(r_{ij})$ for electron-electron correlation, and a three-body term $w(r_{ij}, r_{iI}, r_{jI})$ for electron-electron-nucleus correlation. The two-body term tends to expand the electron density globally, and the three-body term readjusts the high-density region near the nuclei. These components are easily expanded in one-dimensional basis sets such as:

$$\chi(r) = \sum_k c_k^{en}\,a_k(r) \tag{3.35}$$

$$u(r) = \sum_k c_k^{ee}\,b_k(r) \tag{3.36}$$

$$w(r_{ij}, r_{iI}, r_{jI}) = \sum_{klm} c_{klm}^{een}\left[a_k(r_{iI})\,a_l(r_{jI}) + a_k(r_{jI})\,a_l(r_{iI})\right] b_m(r_{ij}) \tag{3.37}$$

whose specific forms may vary. The most commonly used Jastrow functions are the Pade-Jastrow [5] and Boys-Handy functions [6][7]. The basis functions employed in our program are listed in Ref. [8]:

$$f_{cusp}(x,\gamma) = C\left(\frac{x - x^2 + x^3/3}{1 + \gamma\,(x - x^2 + x^3/3)} - \frac{1}{\gamma+3}\right) \tag{3.38}$$

$$f_{poly-Pade}(x,\beta) = \frac{1 - x^2(6 - 8x + 3x^2)}{1 + \beta\,x^2(6 - 8x + 3x^2)} \tag{3.39}$$
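The two basis functions of Eq. 3.38 and Eq. 3.39 are easy to evaluate directly, and a few of their properties can be checked numerically. In this sketch x is assumed to be a distance scaled to the interval [0, 1] (for example by a cutoff radius) and C is treated as a free constant; both are assumptions for illustration, as are the function names.

```python
def f_cusp(x, gamma, C):
    """Cusp basis function of Eq. 3.38; x is a scaled distance in [0, 1]."""
    p = x - x * x + x ** 3 / 3.0
    return C * (p / (1.0 + gamma * p) - 1.0 / (gamma + 3.0))


def f_poly_pade(x, beta):
    """Polynomial Pade basis function of Eq. 3.39."""
    q = x * x * (6.0 - 8.0 * x + 3.0 * x * x)
    return (1.0 - q) / (1.0 + beta * q)


def slope(f, x, h=1e-6):
    """Forward finite-difference derivative."""
    return (f(x + h) - f(x)) / h


cusp_slope = slope(lambda x: f_cusp(x, 1.0, 0.5), 0.0)   # close to C = 0.5
pade_slope = slope(lambda x: f_poly_pade(x, 2.0), 0.0)   # close to 0
```

Under these assumptions the cusp function has slope C at x = 0 and vanishes at x = 1, while the poly-Pade function starts at 1 with zero slope and also vanishes at x = 1, so only the cusp function contributes to the derivative at coalescence.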

Using this form of the Jastrow factor, we have obtained excellent trial wavefunctions for atomic systems (C–Si) [9][10], molecules [11][12] and solids [13]. In the optimization procedure, the coefficients and parameters of these basis functions are optimized to satisfy the cusp conditions and to make the total energy as low as possible. Furthermore, since electron-electron correlation effects are incorporated in the Jastrow functions, it is not necessary to re-optimize the coefficients of the determinants. The correlation energy recovered in QMC depends on the number


our VMC calculation with three-body Jastrow function.

The main purpose of Jastrow factor is to help reduce statistical variance in QMC calculation which will save a large amount of CPU time. Although the Jastrow recovers part of the cusp

conditions, two-body and three-body correlation, it does not change the nodal surface of the

wavefunction. An accurate nodal surface is the key factor of Diffusion Monte Carlo(DMC) calculations. For large systems, multi-determinant wave functions are still necessary to describe

accurate nodal surfaces. We will discuss DMC and its fixed-node approximation in the last

section.

3.5 Optimization of Wavefunction

Typical optimization methods in QMC include variance minimization, energy minimization and a mixture of both. In our calculations, the variational parameters and coefficients in the cusp (Eq. 3.38) and poly-Pade functions (Eq. 3.39) are optimized while the antisymmetric wavefunctions $\Psi_A$ and the configurations are fixed. An observable in VMC is estimated as

$$\langle\hat{O}\rangle = \frac{\int dR\,\Psi_T^{\dagger}(R)\,\hat{O}\,\Psi_T(R)}{\int dR\,|\Psi_T(R)|^2} = \langle O_L\rangle_{\Psi_T^2} \equiv \bar{O} \tag{3.40}$$

where $O_L = \hat{O}\Psi_T/\Psi_T$. Variance and energy are the most common quantities to be minimized. Gradients (denoted as $\langle\cdot\rangle_i$) and Hessians (denoted as $\langle\cdot\rangle_{ij}$) with respect to the set of parameters and coefficients $\{c_i\}$ in the Jastrow factors are also calculated so that optimization procedures such as the quasi-Newton or Levenberg-Marquardt method can be applied.

3.5.1 Variance Minimization

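Although energy minimization is straightforward, variance minimization is more efficient for obtaining an optimized wavefunction [5]. The reason is that the variance is bounded from below by zero, whereas the energy estimated on a finite set of configurations has no such lower bound (the estimate on a finite sample can decrease while the true energy actually increases). The variance of the energy is defined as

$$\sigma_c^2 = \frac{\int dR\,\Psi_c^2\,(E_L - \bar{E})^2}{\int dR\,\Psi_c^2} = \langle (E_L - \bar{E})^2 \rangle_{\Psi_c^2} \tag{3.41}$$

where c is the set of parameters. The gradient of the variance with respect to the parameter $c_i$ is

$$(\sigma^2)_i = 2\left[\langle E_{L,i}(E_L - \bar{E})\rangle + \left\langle \frac{\Psi_i}{\Psi} E_L^2 \right\rangle - \left\langle \frac{\Psi_i}{\Psi}\right\rangle \langle E_L^2\rangle - 2\bar{E}\left\langle \frac{\Psi_i}{\Psi}(E_L - \bar{E})\right\rangle\right] \tag{3.42}$$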

Here we take the derivative of each component in Eq. 3.41. Since we use a fixed set of configurations and correlated sampling as described in [5], the change of $\Psi$ can be ignored:

$$(\sigma^2)_i = 2\langle E_{L,i}(E_L - \bar{E})\rangle = 2\langle (E_{L,i} - \bar{E}_i)(E_L - \bar{E})\rangle \tag{3.43}$$

Using a similar approximation, the Hessian of the variance is calculated as

$$(\sigma^2)_{ij} = 2\langle (E_{L,i} - \bar{E}_i)(E_{L,j} - \bar{E}_j)\rangle \tag{3.44}$$

which is a symmetric and positive-definite Hessian matrix.
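Variance minimization on a fixed set of configurations can be illustrated with the hydrogen trial function ψ_α = exp(−αr). This is a toy sketch under stated assumptions, not the optimizer used in this work: configurations are drawn once at a reference value α0, reweighted by |ψ_α/ψ_α0|² (correlated sampling), and the variance is scanned on a grid instead of being minimized with gradients. All names and the grid are hypothetical.

```python
import math
import random


def local_energy(r, alpha):
    # hydrogen atom, psi_alpha = exp(-alpha * r), atomic units
    return -0.5 * alpha * alpha + (alpha - 1.0) / r


def reweighted_variance(radii, alpha, alpha0):
    """Variance of E_L for psi_alpha estimated on configurations drawn from |psi_alpha0|^2."""
    weights = [math.exp(-2.0 * (alpha - alpha0) * r) for r in radii]   # |psi_alpha / psi_alpha0|^2
    energies = [local_energy(r, alpha) for r in radii]
    norm = sum(weights)
    e_bar = sum(w * e for w, e in zip(weights, energies)) / norm
    return sum(w * (e - e_bar) ** 2 for w, e in zip(weights, energies)) / norm


rng = random.Random(3)
alpha0 = 0.8
# radial density r^2 exp(-2 alpha0 r) is a Gamma distribution with shape 3, scale 1/(2 alpha0)
radii = [rng.gammavariate(3.0, 1.0 / (2.0 * alpha0)) for _ in range(5_000)]

grid = [k / 20 for k in range(12, 29)]      # alpha from 0.60 to 1.40
best_alpha = min(grid, key=lambda a: reweighted_variance(radii, a, alpha0))
```

Because the variance is zero for the exact eigenfunction on any sample, the scan selects α = 1 regardless of which configurations were drawn, which is the robustness argument made above.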

There are several drawbacks to using variance minimization in QMC. First, although it is very efficient in optimizing Jastrow parameters, it becomes less effective in optimizing the determinantal coefficients of the orbitals [14][15][16]. Second, it has been demonstrated that energy-optimized wavefunctions usually provide better estimates of non-energy-related observables than variance-optimized ones [17].

3.5.2 Energy Minimization

In this section, we will briefly summarize an energy-minimization method following the approach

of Umrigar and Filippi[16]. The gradient of energy(Eq.3.22) is

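$$\bar{E}_i = \left\langle \frac{\Psi_i}{\Psi}E_L + \frac{H\Psi_i}{\Psi} - 2\bar{E}\,\frac{\Psi_i}{\Psi} \right\rangle \tag{3.45}$$

$$\bar{E}_i = 2\left\langle \frac{\Psi_i}{\Psi}\,(E_L - \bar{E}) \right\rangle \tag{3.46}$$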
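The covariance form of the energy gradient can be tested on the same hydrogen example, where the exact derivative is known: E(α) = α²/2 − α, so dE/dα = α − 1. The sketch below assumes ψ = exp(−αr), for which Ψ_α/Ψ = −r, and samples r exactly from a Gamma distribution; the function name and sample size are choices made for this illustration.

```python
import random


def energy_gradient(alpha, n_samples=100_000, seed=7):
    """Estimate dE/dalpha = 2 <(Psi_alpha/Psi)(E_L - E_bar)> for psi = exp(-alpha r) in hydrogen."""
    rng = random.Random(seed)
    # exact sampling of r from r^2 exp(-2 alpha r): Gamma with shape 3, scale 1/(2 alpha)
    radii = [rng.gammavariate(3.0, 1.0 / (2.0 * alpha)) for _ in range(n_samples)]
    e_loc = [-0.5 * alpha * alpha + (alpha - 1.0) / r for r in radii]
    e_bar = sum(e_loc) / n_samples
    dlog_psi = [-r for r in radii]            # Psi_alpha / Psi = d ln(psi) / d alpha = -r
    grad = 2.0 * sum(d * (e - e_bar) for d, e in zip(dlog_psi, e_loc)) / n_samples
    return grad, e_bar
```

At α = 0.8 the estimate is close to −0.2, and at α = 1 every term in the sum is zero, so the estimator has no statistical fluctuation when the trial function is an exact eigenfunction.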

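The gradient expression (Eq. 3.46), which is simplified using Hermiticity, has zero fluctuation in the limit of $\Psi$ being an exact eigenfunction. This is a very important property for optimization methods such as the Newton method. The straightforward estimator of the energy Hessian is given by

$$\bar{E}_{ij} = 2\left[\left\langle\left(\frac{\Psi_{ij}}{\Psi} + \frac{\Psi_i\Psi_j}{\Psi^2}\right)(E_L - \bar{E})\right\rangle - \left\langle\frac{\Psi_i}{\Psi}\right\rangle\bar{E}_j - \left\langle\frac{\Psi_j}{\Psi}\right\rangle\bar{E}_i + \left\langle\frac{\Psi_i}{\Psi}E_{L,j}\right\rangle\right] \tag{3.47}$$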

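Umrigar and Filippi [16] suggested a rearrangement of the last term in the Hessian so that its fluctuations are much smaller than those of the straightforward Hessian:

$$\bar{E}_{ij} = 2\left[\left\langle\left(\frac{\Psi_{ij}}{\Psi} + \frac{\Psi_i\Psi_j}{\Psi^2}\right)(E_L - \bar{E})\right\rangle - \left\langle\frac{\Psi_i}{\Psi}\right\rangle\bar{E}_j - \left\langle\frac{\Psi_j}{\Psi}\right\rangle\bar{E}_i\right] + \left\langle\frac{\Psi_i}{\Psi}E_{L,j}\right\rangle - \left\langle\frac{\Psi_i}{\Psi}\right\rangle\langle E_{L,j}\rangle + \left\langle\frac{\Psi_j}{\Psi}E_{L,i}\right\rangle - \left\langle\frac{\Psi_j}{\Psi}\right\rangle\langle E_{L,i}\rangle \tag{3.48}$$

This symmetric estimator has the same expectation value as Eq. 3.47 in the limit of infinite sampling, while the added terms, which are nonzero only for a finite sample, cancel much of the fluctuation. As mentioned above, most of our optimizations are performed on the Jastrow part only (recent work shows that non-linear parameters such as the configuration interaction coefficients and orbital parameters can be optimized by the energy fluctuation potential (EFP) method [18][19]). Therefore a special rearrangement is applied to Eq. 3.48 [20][21]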

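$$\bar{E}_{ij} = 2\left\langle\left(\frac{\Psi_{ij}}{\Psi} - \frac{\Psi_i\Psi_j}{\Psi^2}\right)(E_L - \bar{E})\right\rangle + 4\left\langle\left(\frac{\Psi_i}{\Psi} - \left\langle\frac{\Psi_i}{\Psi}\right\rangle\right)\left(\frac{\Psi_j}{\Psi} - \left\langle\frac{\Psi_j}{\Psi}\right\rangle\right)(E_L - \bar{E})\right\rangle + \left\langle\frac{\Psi_i}{\Psi}E_{L,j}\right\rangle - \left\langle\frac{\Psi_i}{\Psi}\right\rangle\langle E_{L,j}\rangle + \left\langle\frac{\Psi_j}{\Psi}E_{L,i}\right\rangle - \left\langle\frac{\Psi_j}{\Psi}\right\rangle\langle E_{L,i}\rangle \tag{3.49}$$

The first term of Eq. 3.49 vanishes if we optimize only parameters that enter the exponent of the Jastrow factor linearly, since then $\Psi_{ij}/\Psi = \Psi_i\Psi_j/\Psi^2$. Thus the expensive calculation of the Hessian of the wavefunction is avoided.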

Analytical forms of the first and second derivatives of the energy and variance are given above. Sometimes a mixture of the two schemes yields a better trial wavefunction because it decreases the fluctuations in the Hessian. The general form of the cost function is then written as

$$C(p) = x\,\bar{E}(p) + (1-x)\,\langle\sigma(p)^2\rangle \tag{3.50}$$

where p is the set of parameters to be optimized.

3.5.3 Newton Method and Beyond

The Newton method has long been used to find local minima of a cost function. However, our expressions for the Hessian (Eq. 3.44 and Eq. 3.49) avoid expensive direct calculations. In fact, the Hessian matrix is obtained from the gradient vectors g instead. The Newton method thus becomes a quasi-Newton method:

1) calculate the (quasi-)Newton direction

$$\Delta p = -H_k^{-1}\,g_k \tag{3.51}$$

2) find the step size $\gamma$ by performing a line search and move to a new position

$$p_{k+1} = p_k + \gamma\,\Delta p \tag{3.52}$$

3) calculate the new gradient $g_{k+1}$ and Hessian matrix $H_{k+1}$. Go back to step 1.

If the Hessian matrix is not positive definite, a positive constant $a_{diag}$ must be adjusted and added to its diagonal. A further improvement to the Newton method is to combine it with the steepest descent method, because the Newton method is less effective far from the minimum. This is the so-called Levenberg-Marquardt method [21][22]. When the current position is far from the minimum, the algorithm behaves like steepest descent; when it is close to the minimum, it behaves like the Newton method. The step direction in the parameter space is modified according to

$$(H_k + \mu_k I)\,\Delta p_k = -g_k \tag{3.53}$$

where I is the identity matrix and $\mu_k$ is a positive damping parameter. These optimization methods are applied not only to QMC calculations but also to the construction of pseudopotentials for transition metal elements (see Chapter 4).
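The damped step of Eq. 3.53 can be sketched on a two-parameter toy problem. Everything below is illustrative: the quadratic cost, its exact gradient and Hessian, the factor-of-ten update rule for µ, and the function names are assumptions chosen to keep the example short, whereas in QMC these quantities would come from the statistical estimators above.

```python
def cost(p):
    x, y = p[0] - 1.0, p[1] + 2.0
    return x * x + 10.0 * y * y + x * y          # toy cost with minimum at (1, -2)


def gradient(p):
    x, y = p[0] - 1.0, p[1] + 2.0
    return [2.0 * x + y, 20.0 * y + x]


def hessian(p):
    return [[2.0, 1.0], [1.0, 20.0]]


def solve_2x2(a, b):
    """Solve a @ d = b for a 2x2 matrix by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


def levenberg_marquardt(p, mu=1.0, max_iter=100, tol=1e-10):
    for _ in range(max_iter):
        g = gradient(p)
        if max(abs(gi) for gi in g) < tol:
            break
        h = hessian(p)
        damped = [[h[0][0] + mu, h[0][1]], [h[1][0], h[1][1] + mu]]   # H + mu * I
        dp = solve_2x2(damped, [-g[0], -g[1]])                        # (H + mu I) dp = -g
        trial = [p[0] + dp[0], p[1] + dp[1]]
        if cost(trial) < cost(p):
            p, mu = trial, mu * 0.1      # success: move and trust the Newton step more
        else:
            mu *= 10.0                   # failure: lean towards steepest descent
    return p


p_min = levenberg_marquardt([5.0, 5.0])
```

A large µ makes the step short and nearly parallel to −g, while µ → 0 recovers the Newton step; adapting µ according to whether the cost decreased is one common way to interpolate between the two.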

3.6 Diffusion Monte Carlo

Diffusion Monte Carlo (DMC) is based on the concept of projection and on the Green’s function formalism of the diffusion process. One of the major advantages of DMC over VMC is that it projects out the ground state from the set of eigenstates contained in the trial wavefunction, so that about 95% of the correlation energy can be recovered.

3.6.1 Time Dependent Green’s Function

The imaginary-time many-body Schrödinger equation is given by:

$$-\frac{\partial\Psi(R,\tau)}{\partial\tau} = (H - E_T)\,\Psi(R,\tau) \tag{3.54}$$

where $E_T$ is an energy offset and $\tau$ is a real variable obtained by the Wick rotation t→iτ. To solve for $\Psi(R,\tau)$ by iteration, Eq. 3.54 can be transformed into an integral form

$$\Psi(R,\tau) = \int G(R,\tau,R',\tau')\,\Psi(R',\tau')\,dR' \tag{3.55}$$

To obtain an expression for the Green’s function $G(R,\tau,R',\tau')$, we can apply $(H - E_T)$ and $-\partial/\partial\tau$ to both sides of Eq. 3.55:

$$(H - E_T)\,\Psi(R,\tau) = \int (H - E_T)\,G(R,\tau,R',\tau')\,\Psi(R',\tau')\,dR' \tag{3.56}$$

and

$$-\frac{\partial}{\partial\tau}\Psi(R,\tau) = -\int \frac{\partial G(R,\tau,R',\tau')}{\partial\tau}\,\Psi(R',\tau')\,dR' \tag{3.57}$$

By Eq. 3.54, the right-hand sides of Eq. 3.56 and Eq. 3.57 are equal, which leads to the relation

$$-\frac{\partial}{\partial\tau}G(R,\tau,R',\tau') = (H - E_T)\,G(R,\tau,R',\tau') \tag{3.58}$$

together with the initial condition

$$G(R,\tau,R',\tau) = \delta(R - R') \tag{3.59}$$

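The formal solution is

$$G(R,\tau,R',\tau') = \langle R|e^{-(H-E_T)(\tau-\tau')}|R'\rangle = \sum_{i}^{\infty} e^{-(E_i - E_T)\,\delta\tau}\,\phi_i(R)\,\phi_i(R') \tag{3.60}$$

where a complete set of eigenstates $\phi_i$ has been inserted and $\delta\tau = \tau - \tau'$. Rewriting Eq. 3.55 for its first iteration, with $\Psi(0)$ denoting the initial wavefunction, gives

$$\Psi(R,\delta\tau) = \int \sum_{i}^{\infty} e^{-(E_i-E_T)\delta\tau}\,\phi_i(R)\,\phi_i(R') \cdot \sum_k \langle\phi_k|\Psi(0)\rangle\,\phi_k(R')\,dR' = \sum_{k=0}^{\infty}\langle\phi_k|\Psi(0)\rangle\,\phi_k(R)\,e^{-(E_k-E_T)\delta\tau} \tag{3.61}$$

And for the nth iteration, we have

$$\Psi(R,n\delta\tau) = \sum_{k=0}^{\infty}\langle\phi_k|\Psi(0)\rangle\,\phi_k(R)\,\exp[-(E_k-E_T)\,n\delta\tau] \tag{3.62}$$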

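In the limit $\tau \to \infty$, only the ground state is projected out from an arbitrary trial wavefunction $\Psi$, because all other states are damped exponentially relative to it during the evolution:

$$\lim_{\tau\to\infty}\Psi(R,\tau) = \langle\phi_0|\Psi(0)\rangle\,\phi_0(R)\,e^{-(E_0-E_T)\tau} \tag{3.63}$$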
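The projection can be made concrete by following the expansion coefficients of a state under the factor exp[−(E_k − E_T)τ]. The three-level spectrum and the initial coefficients below are hypothetical numbers chosen for illustration, and the state is renormalized at each τ so that the coefficients are easy to compare.

```python
import math


def projected_coefficients(coeffs, energies, e_t, tau):
    """Apply exp[-(E_k - E_T) tau] to the coefficients <phi_k|Psi(0)> and renormalize."""
    damped = [c * math.exp(-(e - e_t) * tau) for c, e in zip(coeffs, energies)]
    norm = math.sqrt(sum(d * d for d in damped))
    return [d / norm for d in damped]


energies = [-0.5, -0.125, -0.0556]     # hypothetical spectrum: E_0 < E_1 < E_2
coeffs = [0.2, 0.7, 0.6]               # initial state dominated by excited states
for tau in (0.0, 5.0, 20.0, 50.0):
    c = projected_coefficients(coeffs, energies, e_t=energies[0], tau=tau)
    print(tau, [round(x, 6) for x in c])
```

With E_T set to the ground state energy the ground state component is stationary, while each excited component decays at a rate given by its excitation energy, so the state converges to the ground state even from a poor starting point, provided the initial overlap is nonzero.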

3.6.2 Short-time Approximation

However, the exact form of the Green’s function is not known, since neither the eigenstates nor the eigenvalues appearing in it are known. To obtain an analytic Green’s function, a short-time approximation is made, which maps the problem onto solving a diffusion equation. In the short-time approximation, the Green’s function is approximated by the Trotter-Suzuki formula

$$G(R,\tau,R',0) = \langle R|e^{-\tau(\hat{T}+\hat{V}-E_T)}|R'\rangle \approx G_{diff}(R,\tau,R',0)\,G_B(R,\tau,R',0) \equiv G_{diff}G_B \tag{3.64}$$

Note that $\hat{T}$ and $\hat{V}$ do not commute, and the correction is given by

$$G - G_{diff}G_B = \frac{1}{2}[V,T]\,\tau^2 + O(\tau^3) \tag{3.65}$$

The approximation of Eq. 3.64 is valid only for small $\tau$, hence the name short-time approximation. $G_{diff}$ is a Gaussian that spreads in time, and $G_B$ is an exponential weight. The first term

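$$G_{diff}(R,\tau,R',0) = (2\pi\tau)^{-3N/2}\exp\left[-\frac{(R-R')^2}{2\tau}\right] \tag{3.66}$$

is a Green’s function which satisfies a diffusion equation in 3N-dimensional space

$$\frac{\partial}{\partial\tau}G(R,\tau,R',0) = \frac{1}{2}\sum_{i=1}^{N}\nabla_i^2\,G(R,\tau,R',0) \tag{3.67}$$

$G_B$ is a branching term

$$G_B(R,\tau,R',0) = \exp\left[-\frac{\tau}{2}\left(V(R)+V(R')-2E_T\right)\right] \tag{3.68}$$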

which satisfies the rate equation

− ∂

∂τG(R, τ,R

0_,_{0) = (}_E


3.6.3 Importance Sampling and Outline of DMC

Based on Green’s function Monte Carlo, an improved and more efficient method called diffusion Monte Carlo was developed to reduce the statistical error of the calculation. The singularities and the large variance of the potential $\hat{V}$ entering $G_B$ make the simple algorithm inefficient. In order to reduce the fluctuations in the sampling, we can implement an importance sampling technique [23][24] with a mixed distribution $f(R,\tau) = \Psi_T(R)\Psi(R,\tau)$, in which $\Psi_T$ is a trial wavefunction.

Rearranging the diffusion equation in terms of f, we find

$$-\frac{\partial f(R,\tau)}{\partial\tau} = -\frac{1}{2}\nabla^2 f(R,\tau) + \nabla\cdot[v_D(R)\,f(R,\tau)] + [E_L(R)-E_T]\,f(R,\tau) \tag{3.70}$$

where the local energy is $E_L \equiv H\Psi_T/\Psi_T$ and $v_D = \nabla\ln|\Psi_T(R)| = \nabla\Psi_T/\Psi_T$. The new “drift” term $v_D$ displaces each walker with an additional effective velocity. The new diffusion and branching Green’s functions become

$$\tilde{G}_{diff}(R,\tau,R',0) = (2\pi\tau)^{-3N/2}\exp\left[-\frac{\left(R-R'-\tau\,v_D(R')\right)^2}{2\tau}\right] \tag{3.71}$$

and

$$\tilde{G}_B(R,\tau,R',0) = \exp\left[-\frac{\tau}{2}\left(E_L(R)+E_L(R')-2E_T\right)\right] \tag{3.72}$$

The drift velocity in $\tilde{G}_{diff}$ drives walkers towards the regions where $|\Psi_T|^2$ is large. The local energy in $\tilde{G}_B$ reduces the fluctuations and avoids the singularities of the potential V(R). The transformation also satisfies the zero-variance property of importance sampling: when $\Psi_T$ is the exact wavefunction, $E_L(R) = E_0$.

However, the detailed balance condition of the Metropolis algorithm is no longer fulfilled automatically, because the Green’s function with the drift term is not symmetric:

$$\tilde{G}_{diff}(R,\tau,R',\tau') \neq \tilde{G}_{diff}(R',\tau,R,\tau') \tag{3.73}$$

In order to reach equilibrium, the move from R' to R is accepted with probability

$$A(R\leftarrow R') = \min\left[1, \frac{\tilde{G}_{diff}(R\to R')\,|\Psi_T(R)|^2}{\tilde{G}_{diff}(R\leftarrow R')\,|\Psi_T(R')|^2}\right] \tag{3.74}$$

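Using the mixed estimator, the DMC energy is estimated as

$$E_{DMC} = \lim_{\tau\to\infty}\frac{\langle e^{-\tau H}\Psi_T|H|\Psi_T\rangle}{\langle e^{-\tau H}\Psi_T|\Psi_T\rangle} = \lim_{\tau\to\infty}\frac{\int\Psi^*(R,\tau)\,H\Psi_T(R)\,dR}{\int\Psi^*(R,\tau)\,\Psi_T(R)\,dR} \approx \frac{1}{M}\sum_m E_L(R_m) \tag{3.75}$$

Here $\{R_m\}$ is a set of M electron configurations sampled from the importance function $f(R) = \lim_{\tau\to\infty}\Psi(R,\tau)\Psi_T(R)$.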
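The drift, diffusion and branching steps, together with the mixed estimator, fit into a short toy program. The sketch below is an illustration under several assumptions and is not the production algorithm: it treats a one-dimensional harmonic oscillator (exact ground state energy 0.5) with the deliberately imperfect trial function ψ_T = exp(−ax²/2), takes the drift as the logarithmic derivative ψ_T'/ψ_T (the form consistent with a diffusion term ½∇²), omits the accept/reject step because this nodeless problem has a small time-step error, and uses a simple hypothetical population-control rule.

```python
import math
import random


def dmc_harmonic(a=0.8, tau=0.01, n_target=200, n_steps=2000, n_equil=500, seed=11):
    """Importance-sampled DMC for H = -1/2 d^2/dx^2 + x^2/2 with psi_T = exp(-a x^2 / 2)."""
    rng = random.Random(seed)

    def e_local(x):                       # H psi_T / psi_T
        return 0.5 * a + 0.5 * x * x * (1.0 - a * a)

    def drift(x):                         # psi_T' / psi_T
        return -a * x

    walkers = [rng.gauss(0.0, 1.0) for _ in range(n_target)]
    e_ref = 0.5 * a                       # initial energy offset E_T
    e_sum, n_sum = 0.0, 0
    for step in range(n_steps):
        new_walkers = []
        for x in walkers:
            # drift + diffusion move
            x_new = x + tau * drift(x) + math.sqrt(tau) * rng.gauss(0.0, 1.0)
            # branching weight built from the local energies at both end points
            weight = math.exp(-tau * (0.5 * (e_local(x_new) + e_local(x)) - e_ref))
            for _ in range(int(weight + rng.random())):   # stochastic number of copies
                new_walkers.append(x_new)
        walkers = new_walkers
        e_gen = sum(e_local(x) for x in walkers) / len(walkers)
        # population control: steer the walker count back towards n_target
        e_ref = e_gen - math.log(len(walkers) / n_target) / (10.0 * tau)
        if step >= n_equil:
            e_sum += e_gen                # mixed estimator: average of E_L over walkers
            n_sum += 1
    return e_sum / n_sum


e_dmc = dmc_harmonic()
```

The variational energy of this trial function is (a + 1/a)/4 ≈ 0.5125 for a = 0.8, so a DMC result close to 0.5 shows the projection removing the error of the trial function while the trial function only guides the sampling.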

3.6.4 Fixed-Node Approximation

There is a requirement in choosing any function $\Phi(R)$ as the distribution of walkers: $\Phi$ must be non-negative everywhere so that it can be interpreted as a probability density. For bosonic particles, the wavefunction is symmetric under the exchange of any two particles

$$\Phi_B(r_1,\ldots,r_i,\ldots,r_j,\ldots,r_N) = \Phi_B(r_1,\ldots,r_j,\ldots,r_i,\ldots,r_N) \tag{3.76}$$

The bosonic ground state can be chosen positive everywhere and can therefore be used directly as a probability density. However, antisymmetry for many-fermion systems requires a change of sign when two fermions are exchanged

$$\Phi_F(r_1,\ldots,r_i,\ldots,r_j,\ldots,r_N) = -\Phi_F(r_1,\ldots,r_j,\ldots,r_i,\ldots,r_N) \tag{3.77}$$

which means that there exist positive and negative regions. Therefore, $\Phi_F$ cannot be used as a distribution directly. In order to solve this fermion sign problem, we represent the wavefunction $\Phi(R)$ as the difference of two positive functions, $\Phi^+$ and $\Phi^-$:

$$\Phi_F = \Phi^+ - \Phi^- \tag{3.78}$$

$\Phi^+$ and $\Phi^-$ are separated by the nodal surface where $\Phi = 0$. They can be thought of as the distributions of walkers in the positive and negative regions. The nodal surface of an N-particle system is a (3N − 1)-dimensional hypersurface since the wavefunction is continuous. The problem then becomes finding an accurate nodal surface ($\Phi = 0$). But since we do not know the analytical form of the exact ground state wavefunction $\Phi$, we cannot know the exact


Figure 3.3: Comparison of the exact wavefunction Φ0 and the fixed-node solution ΦFN in one dimension (Φ(r) versus r). The error caused by the fixed-node approximation is of the order of 5% of the correlation energy.

The practical implementation of the DMC method requires the so-called fixed-node approximation, which makes the nodes of the solution $\Phi_{FN}$ identical to the nodes of $\Psi_T$ and thus avoids the fermion sign problem. In the importance sampling DMC formalism, the distribution of walkers is not the ground state wavefunction $\Phi$ but the mixed function $f(R,\infty) = \Phi(R)\Psi_T(R)$. The fixed-node approximation is naturally made by using the nodes of $\Psi_T$ as a boundary condition that approximates the exact nodes. The non-negativity restriction is then easily satisfied, so that

$$f(R,\tau) > 0 \tag{3.79}$$

In fact, the drift term $v_D$ pushes walkers away from the nodes because $v_D$ diverges at $f(R) = 0$. It is still possible for a walker to cross a node, although the probability is low. The diffusion displacement is proportional to $\sqrt{\Delta\tau}$. If the drift fails to push the walker away from the nodes because the diffusion move is sufficiently large, the walker will cross the nodes and sign


99%. In release-node diffusion Monte Carlo, gaps are opened in the original nodal surface and

walkers can now pass through different nodal pockets[25][26].

In Figure 3.3, we compare an exact wavefunction and the fixed-node solution $\Phi_{FN}$ in one dimension. Clearly, this approximation introduces a fixed-node error. The next question is how the fixed-node error affects the DMC result. It has been shown that the fixed-node energy is an upper bound to the true ground state energy and that the energy error is of second order in the nodal error [27]. Furthermore, the fixed-node DMC error is about an order of magnitude smaller than the VMC error. DMC calculations typically recover 90–95% of the correlation energy.

Apart from the fixed-node approximation for real wavefunction, it can be extended to

complex-valued wavefunction. We will discuss the fixed-phase approximation in Chapter 5.

3.7 Summary

In this chapter, we have reviewed several applications of the Monte Carlo method to quantum mechanical many-body systems. With modern supercomputers, it is possible to perform QMC calculations on systems of up to 1000 particles. With a sufficiently large set of samples, VMC is capable of recovering about 80% of the correlation energy, and DMC can reach about 95%. The optimization of Jastrow parameters reduces the statistical error and saves a large amount of computing time in DMC. Although the most accurate ground state energies can be obtained by DMC, the nodal surface of the trial wavefunction significantly affects the accuracy. We will


REFERENCES

[1] J. S. Liu, Monte Carlo Strategies in Scientific Computing, pp. 31-37, Springer (2001)

[2] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953)

[3] T. Kato. Commun. Pure Appl. Math, 10, 151 (1957)

[4] B. Klahn and J.D. Morgan III, J. Chem. Phys. 81, 410 (1984).

[5] C. J. Umrigar , K. G. Wilson and J. W. Wilkins , Phys. Rev. Lett. 60, 1719 (1988)

[6] K. E. Schmidt and J. W. Moskowitz, J. Chem. Phys. 97 3382 (1992)

[7] P. J. Reynolds, D. M. Ceperley, B. J. Alder and W. A. Lester, Jr, J. Chem. Phys. 77 5593 (1982)

[8] Michal Bajdich, thesis, Generalized Pairing Wave Functions and Nodal Properties for Electronic Structure Quantum Monte Carlo, Appendix A, NCSU, March 2007

[9] Kevin Rasch, Shuming Hu and Lubos Mitas, Unpublished.

[10] K.M.Rasch and L.Mitas, Chemical Physics Letter, 525, 59, (2012)

[11] Shi Guo, Michal Bajdich, Lubos Mitas, Peter J. Reynolds, arXiv:1301.1723 (2013)

[12] Minyi Zhu and Lubos Mitas, Chemical Physics Letters, 572, 136 (2013)

[13] Jindrich Kolorenc and Lubos Mitas, Physical Review B, 75, 235118 (2007)

[14] C. J. Umrigar and Claudia Filippi, J. Chem. Phys. 105, 213 (1996)

[15] Chien-Jung Huang, C. J. Umrigar, and M. P. Nightingale, J. Chem. Phys. 107, 3007 (1997)

[16] C. J. Umrigar and Claudia Filippi, Phys. Rev. Lett. 94, 150201 (2005)

[17] Martin Snajdr and Stuart M. Rothstein, J. Chem. Phys. 112, 11 (2000)

[18] Friedemann Schautz and Stephen Fahy, J. Chem. Phys. 116, 3533 (2002)

[19] Friedemann Schautz and Claudia Filippi, J. Chem. Phys. 120, 10931 (2004)

[20] Xi Lin, Hongkai Zhang, and Andrew M. Rappe, J. Chem. Phys. 112, 2650 (2000)

[21] Michal Bajdich, thesis, Generalized Pairing Wave Functions and Nodal Properties for Electronic Structure Quantum Monte Carlo, pp. 27-28, NCSU, March 2007


[23] P. J. Reynolds, D. M. Ceperley, B. J. Alder, and W. A. Lester Jr., J. Chem. Phys. 77, 5593 (1982)

[24] R. C. Grimm and R. G. Storer, Journal of Computational Physics 7, 134 (1971).

[25] D. M. Ceperley and B. J. Alder., Phys. Rev. Lett., 45, 566 (1980)

[26] Xin Li, Jindrich Kolorenc and Lubos Mitas, Physical Review A , 84, 023615 (2011)


Chapter 4

Study of Ne-core and He-core

Pseudopotential Errors in the MnO

Molecule

sections of this chapter also appeared in

Study of Ne-core and He-core pseudopotential errors in the MnO molecule: Quantum Monte Carlo benchmark

Minyi Zhu, Lubos Mitas


In this chapter, accuracy of effective core potential (ECP) is studied for two sizes of cores

by Density Functional Theory, Hartree-Fock and quantum Monte Carlo (QMC) methods using the MnO molecule as a test system. We compare the energy differences between high-spin and

low-spin states that were previously found to be problematic for transition metal oxide solids

calculations with ECPs. In order to disentangle errors caused by ECPs and by subsequent methods used in calculations, we construct a scalar-relativistic He-core ECP for Mn atom. We

find that within high quality correlated calculations both Ne-core and He-core ECPs provide

energy differences with comparable, high accuracy. The rest of chapter is organized as follows. After the introduction, a brief summary of relevant methodology is presented in Sec. II, including

the construction of the He-core ECP and its implementation in QMC calculations. In Sec. III, we

show the performance of the constructed ECP and primary results of DFT and QMC methods. We then discuss further aspects of ECPs and electronic structure methods for similar systems.

4.1 Introduction

For several decades transition metal oxide systems have been one of the most challenging

focal topics in chemical and condensed matter physics. Mainstream approaches such as density

functional theory (DFT) and Hartree-Fock (HF) have been applied to these systems quite extensively, however, many of obtained results are mixed at best. For example, it is well-known

that traditional DFT approaches underestimate the band gap of transition metal oxides very

significantly and even fail to predict correct ground states such as insulating antiferromagnets for FeO, CoO and other systems.

Several post-DFT approaches, such as hybrid functionals, on-site Coulomb repulsion correction (LDA+U) or the GW approximation, have been applied in order to better capture the physics of these strongly correlated systems. The performance of some of these approaches for transition metal oxide systems has been benchmarked for properties such as band structures, equations of state and magnetic moments; see, for example, Ref. [1]. Although these more advanced DFT approaches showed marked improvements, significant uncertainties are still present, for example, in the prediction of pressure-induced structural phase transitions [1, 2] and other properties. Of

particular interest are, for example, phase transitions related to the collapse of local magnetic moments such as the one observed in MnO [1, 2]. Similar importance of high-spin vs low-spin

state energy differences can be found in physics and chemistry of molecular (nano) systems

with potential use in spintronics or other applications [3, 4].

The electronic structure calculations for transition metal oxides are often carried out with

the frozen core or with effective core potentials (pseudopotentials) (ECPs). It is assumed that such modification of the original Hamiltonian does not affect the valence properties within the


small and the corresponding ECPs are accurate enough. For 3d elements the Ne-core is considered to be deep and compact enough so that ECPs with 3s, 3p, 3d, 4s states in the valence space faithfully capture the essential physics of all valence properties.

Unfortunately, for some approaches the proper treatment of calculations with ECPs is less straightforward than one would expect and wish. We illustrate this issue by mentioning a few results of our recent study of the MnO solid. In Table 4.1 [2] we show errors for the difference between the nonmagnetic (NM) and antiferromagnetic (AFM) states of MnO as obtained from all-electron and ECP calculations.

Although highly accurate small Ne-core pseudopotentials [5, 6] were used, significant disagreements with all-electron calculations were found, ranging from +0.57 eV in generalized gradient DFT to -0.55 eV in HF.

In this work we show that these "pseudopotential errors" originate in biases generated by the application of approximate methods such as DFT or HF to bonded systems. Assuming that the valence space is sufficiently large, the net contribution of the ECPs to such errors is marginal when compared to the errors generated by the approximate approaches applied to many-body systems with bonds. In this work we explicitly demonstrate that deep cores have a much smaller effect on the basic valence properties than the results from such approximate calculations might suggest. The fact that approximate theories can generate additional errors for ECP Hamiltonians has been known in the DFT context for a long time. For example, the nonlinear core corrections for DFT pseudopotentials were devised just for this purpose, i.e., to remove the differences between all-electron and pseudopotential calculations [8]. This was originally justified by the DFT pseudopotential construction, which is burdened by the nonlinearity of exchange-correlation functionals, so that the core-valence partitioning is difficult when these two densities overlap appreciably [8]. Very recently, these types of corrections have been suggested not only for transition elements but also for sp systems in order to decrease the related errors [10]. However, the most accurate ECPs are generated from Dirac-Fock atomic calculations in an energy-adjusted framework (i.e., by reproducing not only one-particle norm-conserving properties but also excitation energies) and are designed to reproduce the physical ion in a true ab initio sense, i.e., for use in calculations which can reach nearly exact eigenstates. Therefore any nonlinearity-like corrections are unusable in many-body wave function methods. Of key importance is the genuine many-body accuracy of the (effective)

Table 4.1: The discrepancies between all-electron and Ne-core pseudopotential for the energy difference (eV) between antiferromagnetic and nonmagnetic states in the MnO solid[2].

HF B3LYP PW91