Wave Function–Based Quantum Chemistry
3. SINGLE-CONFIGURATIONAL AND MULTICONFIGURATIONAL HARTREE–FOCK THEORY
In the present section, we discuss how the exact wave function may be approximately described by a few important configurations, constructed from a set of variationally optimized orbitals. These wave function models are often used on their own, for a crude but qualitatively correct description of the electronic system. In addition, they are important as starting points for the more advanced, quantitatively correct treatments discussed in Sec. 4.
3.1. The Hartree–Fock Model
The Hartree–Fock model is the simplest, most basic model in ab initio electronic structure theory [28]. In this model, the wave function is approximated by a single Slater determinant constructed from a set of orthonormal spin orbitals:
$$|\mathrm{HF}\rangle = |\phi_1, \phi_2, \ldots, \phi_n|. \tag{21}$$
The spin orbitals are determined by invoking the variation principle (8) (i.e., by minimizing the energy with respect to variations in the spin orbitals):
$$E_{\mathrm{HF}} = \min_{\{\phi_i\}} \frac{\langle \mathrm{HF}|\hat{H}|\mathrm{HF}\rangle}{\langle \mathrm{HF}|\mathrm{HF}\rangle}. \tag{22}$$
The Hartree–Fock energy therefore constitutes a rigorous upper bound to the exact energy, $E_{\mathrm{HF}} \geq E_{\mathrm{exact}}$. By expanding each spin orbital in AOs according to Eq. (17), the minimization is achieved by varying the AO expansion coefficients.
In a variational sense, the Hartree–Fock model represents the best one-determinant approximation to the exact electronic state. It typically recovers 99% or more of the total electronic energy and it yields, for most molecular properties, results within 5%–10% of the exact values. For many purposes, therefore, the Hartree–Fock model represents an adequate model by itself. Just as important, it constitutes a natural starting point for the more elaborate treatments of electronic structure discussed in Sec. 4.
The optimization of the Hartree–Fock spin orbitals in Eq. (21) is a nonlinear minimization problem. By recasting Eq. (22) as a generalized eigenvalue problem, the optimization may be accomplished by repeated solution of the pseudo-eigenvalue equations:
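$$\mathbf{F}\mathbf{C}_i = \varepsilon_i \mathbf{S}\mathbf{C}_i, \qquad i = 1, 2, \ldots, n, \tag{23}$$
whose eigenvectors $\mathbf{C}_i$ represent the molecular orbitals (MOs) and whose eigenvalues $\varepsilon_i$ are the orbital energies [29]. Assuming real AOs, the elements of the overlap matrix $\mathbf{S}$ are given by:
$$S_{jk} = \int \chi_j(\mathbf{x})\, \chi_k(\mathbf{x})\, d\mathbf{x}, \tag{24}$$
and the elements of the Fock matrix $\mathbf{F}$ are calculated as:
$$F_{jk} = h_{jk} + \sum_{lm} P_{lm} \left[ (lm|jk) - (lk|jm) \right]. \tag{25}$$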
Here the one-electron and two-electron Hamiltonian integrals are given by:
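$$h_{jk} = \int \chi_j(\mathbf{x}_1)\, \hat{h}_1\, \chi_k(\mathbf{x}_1)\, d\mathbf{x}_1, \tag{26}$$
$$(lm|jk) = \iint \chi_l(\mathbf{x}_1)\, \chi_j(\mathbf{x}_2)\, r_{12}^{-1}\, \chi_m(\mathbf{x}_1)\, \chi_k(\mathbf{x}_2)\, d\mathbf{x}_1\, d\mathbf{x}_2, \tag{27}$$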
and the AO density matrix elements are given by:
$$P_{lm} = \sum_{i=1}^{n} C_{li} C_{mi}. \tag{28}$$
In Eq. (26), $\hat{h}_1$ is the one-electron Hamiltonian of Eq. (5).
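Equations (25) and (28) translate directly into nested loops. The following is a minimal sketch in plain Python, not an efficient implementation; the function names, the storage convention `eri[l][m][j][k]` for $(lm|jk)$, and the two-function toy integrals are assumptions introduced here purely for illustration.

```python
def density_matrix(C, n_occ):
    """AO density matrix of Eq. (28): P[l][m] = sum_i C[l][i] * C[m][i]."""
    K = len(C)
    return [[sum(C[l][i] * C[m][i] for i in range(n_occ)) for m in range(K)]
            for l in range(K)]


def fock_matrix(h, eri, P):
    """Fock matrix of Eq. (25); eri[l][m][j][k] holds the integral (lm|jk)."""
    K = len(h)
    F = [row[:] for row in h]  # start from the one-electron part h_jk
    for j in range(K):
        for k in range(K):
            for l in range(K):
                for m in range(K):
                    coulomb = eri[l][m][j][k]
                    exchange = eri[l][k][j][m]
                    F[j][k] += P[l][m] * (coulomb - exchange)
    return F


# Hypothetical toy data: two orthonormal basis functions, one occupied orbital.
h = [[-1.0, -0.2], [-0.2, -0.5]]
eri = [[[[0.0] * 2 for _ in range(2)] for _ in range(2)] for _ in range(2)]
eri[0][0][1][1] = eri[1][1][0][0] = 0.5  # only (11|22) = (22|11) is nonzero
C = [[1.0, 0.0], [0.0, 1.0]]             # column i holds the coefficients of MO i
P = density_matrix(C, n_occ=1)
F = fock_matrix(h, eri, P)
```

The four nested loops make the formal quartic cost of the Fock matrix construction visible. In this toy case, the occupied orbital feels no repulsion from itself because the Coulomb and exchange contributions to `F[0][0]` cancel, whereas the empty orbital is shifted upward by the repulsion from the occupied one.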
The generalized eigenvalue problem (Eq. (23)) is a pseudo-eigenvalue problem in the sense that the Fock matrix (Eq. (25)) depends, through $P_{lm}$, on its own eigenvectors. The eigenvalue problem (Eq. (23)) must therefore be iterated until the orbitals that are generated by the diagonalization are the same as those used in the construction of the Fock matrix. A self-consistent field (SCF) solution has then been established, and the resulting Fock matrix constitutes the AO representation of an effective one-electron operator called the Fock operator:
$$\hat{F}_1 = -\frac{1}{2}\nabla_1^2 + \hat{V}_1^{\mathrm{SCF}}. \tag{29}$$
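Apart from the attractive nuclear potential, the effective one-electron potential
$$\hat{V}_1^{\mathrm{SCF}} = -\sum_I \frac{Z_I}{|\vec{r}_1 - \vec{R}_I|} + \hat{J}_1 - \hat{K}_1 \tag{30}$$
contains a repulsive potential, where the Coulomb operator $\hat{J}_1$ and the exchange operator $\hat{K}_1$ are defined as:
$$\hat{J}_1 \phi_i(\mathbf{x}_1) = \sum_j \phi_i(\mathbf{x}_1) \int \phi_j(\mathbf{x}_2)\, r_{12}^{-1}\, \phi_j(\mathbf{x}_2)\, d\mathbf{x}_2, \tag{31}$$
$$\hat{K}_1 \phi_i(\mathbf{x}_1) = \sum_j \phi_j(\mathbf{x}_1) \int \phi_j(\mathbf{x}_2)\, r_{12}^{-1}\, \phi_i(\mathbf{x}_2)\, d\mathbf{x}_2. \tag{32}$$
In the limit of a complete basis, the pseudo-eigenvalue problem (Eq. (23)) may be expressed in the form:
$$\hat{F}_1 \phi_i(\mathbf{x}_1) = \varepsilon_i \phi_i(\mathbf{x}_1), \tag{33}$$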
showing that the MOs are eigenfunctions of the Fock operator.
The structure of Eq. (33) is similar to that of Eq. (5), indicating that, in the Hartree–Fock model, the electrons experience an average potential as described by the Coulomb and exchange operators. In Kohn–Sham DFT (this volume, chapter by Ayers and Yang), the exchange operator $\hat{K}_1$ is omitted and exchange is instead accounted for via an additional contribution to the effective potential from the exchange–correlation functional; in hybrid DFT, some proportion of the Hartree–Fock exchange operator $\hat{K}_1$ is retained.
The eigenvalues of the Fock eigenvalue problem (the orbital energies) satisfy Koopmans' theorem, which states that the orbital energy $\varepsilon_i$ is equal to minus the ionization potential (IP) associated with the removal of an electron from orbital $\phi_i$ in the Hartree–Fock state without modifying the remaining orbitals. The agreement with the observed IPs is crude but useful for qualitative discussions.
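The theorem follows in a few lines from the energy expression for a single determinant. The sketch below assumes the standard spin-orbital form of the Hartree–Fock energy; the antisymmetrized integrals $\langle ij \| ij \rangle$ (Coulomb minus exchange, with $\langle kk \| kk \rangle = 0$) and the labels $E_N$ and $E_{N-1}^{(k)}$ are notation introduced here, not taken from the text. All sums run over the occupied spin orbitals of the $N$-electron state.

```latex
\begin{align*}
E_N &= \sum_{i} h_{ii} + \frac{1}{2}\sum_{i,j} \langle ij \| ij \rangle, \qquad
\varepsilon_k = h_{kk} + \sum_{j} \langle kj \| kj \rangle, \\
E_{N-1}^{(k)} &= E_N - h_{kk} - \sum_{j} \langle kj \| kj \rangle
              = E_N - \varepsilon_k, \\
\mathrm{IP}_k &= E_{N-1}^{(k)} - E_N = -\varepsilon_k .
\end{align*}
```

The second line holds only because the remaining orbitals are kept frozen; relaxing them would lower $E_{N-1}^{(k)}$, which is one reason the agreement with observed IPs is only crude.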
Hartree–Fock calculations carried out without restrictions on the spatial parts of the alpha and beta spin orbitals are referred to as unrestricted Hartree–Fock (UHF) calculations. Often, it is useful to impose the condition that the alpha and beta spin orbitals occur in pairs, with the same spatial parts. Such calculations are referred to as restricted Hartree–Fock (RHF) calculations. Unlike UHF wave functions, RHF wave functions are pure spin states. On the other hand, because of the variation principle, the UHF energy is always equal to, or lower than, the RHF energy; when the two energies differ, the RHF model is said to be unstable [30].
The difference between the RHF and UHF models is illustrated for water in Fig. 1, where, for a fixed HOH bond angle, the UHF and RHF potential energy curves are plotted as functions of the OH bond distance, with the FCI curve included for comparison. The RHF instability sets in at $2.64\,a_0$, beyond which the UHF curve lies below the RHF curve.
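Both the self-consistent iteration of Eq. (23) and the onset of an RHF instability can be reproduced in a very small model. The sketch below is not the water calculation of Fig. 1: it assumes a hypothetical two-site system with one alpha and one beta electron, a hopping integral `t` between the sites, zero on-site one-electron energies, and an on-site repulsion `u` as the only nonzero two-electron integral. Under these assumptions, Eq. (25) reduces to $F^{\alpha}_{jj} = u P^{\beta}_{jj}$ and $F^{\alpha}_{12} = -t$ (and likewise with alpha and beta interchanged). All function names are illustrative.

```python
import math


def lowest_orbital(f11, f22, f12):
    """Normalized lowest eigenvector of [[f11, f12], [f12, f22]] (f12 != 0)."""
    mean = 0.5 * (f11 + f22)
    root = math.hypot(0.5 * (f11 - f22), f12)
    eps = mean - root
    v1, v2 = -f12, f11 - eps  # solves (f11 - eps) * v1 + f12 * v2 = 0
    norm = math.hypot(v1, v2)
    return v1 / norm, v2 / norm


def density(c):
    """Density matrix P[j][k] = c[j] * c[k] of one electron of a given spin."""
    return [[c[0] * c[0], c[0] * c[1]], [c[1] * c[0], c[1] * c[1]]]


def energy(pa, pb, t, u):
    """Model Hartree-Fock energy: hopping term plus on-site repulsion."""
    hopping = -2.0 * t * (pa[0][1] + pb[0][1])
    repulsion = u * (pa[0][0] * pb[0][0] + pa[1][1] * pb[1][1])
    return hopping + repulsion


def uhf(t, u, max_iter=500, tol=1e-12):
    """Iterate both spin blocks to self-consistency from a broken-symmetry guess."""
    pa = density((math.cos(0.3), math.sin(0.3)))  # alpha mostly on site 1
    pb = density((math.sin(0.3), math.cos(0.3)))  # beta mostly on site 2
    for _ in range(max_iter):
        # Each spin feels only the on-site repulsion from the other spin.
        new_pa = density(lowest_orbital(u * pb[0][0], u * pb[1][1], -t))
        new_pb = density(lowest_orbital(u * pa[0][0], u * pa[1][1], -t))
        change = max(abs(new_pa[j][k] - pa[j][k]) + abs(new_pb[j][k] - pb[j][k])
                     for j in range(2) for k in range(2))
        pa, pb = new_pa, new_pb
        if change < tol:  # orbitals reproduce the density that generated them
            break
    return energy(pa, pb, t, u)


def rhf(t, u):
    """Restricted solution: both spins occupy the symmetric bonding orbital."""
    p = density((math.sqrt(0.5), math.sqrt(0.5)))
    return energy(p, p, t, u)
```

With `t = 1`, the unrestricted iteration collapses onto the restricted solution for `u = 1`, so that `uhf(1.0, 1.0)` and `rhf(1.0, 1.0)` agree. For `u = 4`, the repulsion is strong enough to localize the two spins on different sites, and `uhf(1.0, 4.0)` converges to an energy of -0.5, below the restricted value of 0.0. In this model, the restricted solution becomes unstable once `u` exceeds `2 * t`.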
3.2. Hartree–Fock Methods for Large Systems: Linear Scaling Methods
Nowadays, the Hartree–Fock method can be applied to systems containing several hundred atoms. In this section, we briefly review those aspects of Hartree–Fock theory that are important for large systems [32].
As described in Sec. 3.1, each Hartree–Fock iteration involves the construction of the Fock matrix for a given density matrix, followed by the diagonalization of the Fock matrix to generate a set of improved spin orbitals and thus an improved density matrix. Formally, the construction of the Fock matrix requires a number of operations proportional to $K^4$, where $K$ is the number of atoms (because the number of AOs grows in proportion to $K$, the number of two-electron integrals scales as $K^4$). For large systems, however, this quartic scaling with $K$ (i.e., with system size) can be reduced to linear by special techniques, as will now be discussed.
Figure 1 RHF and UHF dissociation of H2O (atomic units).
A first reduction in cost is achieved by recognizing that the AOs are localized in space and that, for insulating electronic systems at least, the density matrix P is sparse. Therefore, many of the two-electron integrals that formally contribute to the Fock matrix need not be computed. In the construction of the Fock matrix, prescreening techniques are used to identify and calculate only those integrals that make a significant contribution (i.e., a contribution greater than some prescribed threshold) [28]. All other integrals are neglected, resulting in a dramatic reduction in computational cost for all but the smallest systems. Indeed, for large systems, the cost of this direct Hartree–Fock method scales only quadratically with system size.
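The quadratic scaling can be made plausible with a small counting experiment. The text does not specify the screening criterion; the sketch below assumes one common choice, the Cauchy–Schwarz bound $|(lm|jk)| \leq Q_{lm} Q_{jk}$ with $Q_{lm} = \sqrt{(lm|lm)}$. The factors $Q_{lm}$ are not computed from real integrals: they are modeled by the Gaussian decay expected for equal s-type functions placed on a line, with the prefactor set to 1. The spacing, exponent, and threshold are hypothetical values.

```python
import math


def count_significant_quartets(n_functions, spacing=2.0, exponent=1.0,
                               threshold=1e-8):
    """Count ordered quartets (lm|jk) whose model Schwarz bound passes the threshold."""
    factors = []
    for l in range(n_functions):
        for m in range(n_functions):
            distance = spacing * (l - m)
            # Model for Q_lm: Gaussian decay with the distance between centers.
            q = math.exp(-0.5 * exponent * distance * distance)
            if q >= threshold:  # Q never exceeds 1, so smaller pairs cannot survive
                factors.append(q)
    return sum(1 for q_lm in factors for q_jk in factors
               if q_lm * q_jk >= threshold)


small = count_significant_quartets(20)
large = count_significant_quartets(40)
growth = large / small
```

Doubling the chain length multiplies the number of surviving quartets by a factor close to 4 rather than the factor of 16 implied by quartic scaling, because each function forms a significant pair with only a fixed number of neighbors.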
By further rearranging the calculations, it is possible to reduce the scaling of the Fock matrix construction to linear [33]. This may be achieved by treating the classical, long-range (one-electron and two-electron) Coulomb interactions by special multipole methods, organized in such a manner that the total cost of the Fock matrix construction scales linearly with the size of the system. Because, for systems containing up to several hundred atoms, the Fock matrix construction is the time-critical step, such fast multipole methods (FMMs) have significantly extended the range of systems that can be treated by the Hartree–Fock method [34]. In passing, we note that all steps in the construction of the Fock matrix are ideally suited to modern parallel computer architectures.
Having reduced the cost of the Fock matrix construction to linear, another computational bottleneck arises for large systems—the diagonalization of the Fock matrix, whose cost scales cubically with system size. By developing schemes for directly optimizing the AO density matrix P in Eq. (28) without introducing MOs, linear scaling has been achieved also for this step [36,37]. Although promising, these experimental techniques cannot yet be applied in a routine manner.
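The text does not say which density-matrix schemes are meant. As one illustration of how a density matrix can be improved without any diagonalization, the sketch below uses McWeeny purification, $P \leftarrow 3P^2 - 2P^3$, a well-known building block that is assumed here and not attributed to Refs. [36,37]. It is written for an orthonormal basis (S equal to the unit matrix), and the trial matrix is hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def purify(p):
    """One McWeeny step, P <- 3 P^2 - 2 P^3, valid in an orthonormal basis."""
    p2 = matmul(p, p)
    p3 = matmul(p2, p)
    n = len(p)
    return [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)]
            for i in range(n)]


def idempotency_error(p):
    """Largest element of P^2 - P; zero for a valid one-determinant density."""
    p2 = matmul(p, p)
    n = len(p)
    return max(abs(p2[i][j] - p[i][j]) for i in range(n) for j in range(n))


p = [[0.9, 0.1], [0.1, 0.1]]  # hypothetical, slightly non-idempotent trial density
for _ in range(8):
    p = purify(p)
```

Each step drives occupation numbers above one half toward 1 and those below one half toward 0, using only matrix multiplications. For sparse matrices, such multiplications can be carried out at a cost that grows linearly with system size, which is what makes approaches of this kind attractive.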
3.3. Calculation of Molecular Properties
As discussed hitherto, the Hartree–Fock method allows for the calculation of the electronic energy at a given nuclear configuration. From the density matrix P, we obtain:
$$\rho(\mathbf{x}) = \sum_{lm} P_{lm}\, \chi_l(\mathbf{x})\, \chi_m(\mathbf{x}), \tag{34}$$
from which we may extract the electron density $\rho(\vec{r})$ and the spin density $\rho^{s}(\vec{r})$ as well as various one-electron properties such as dipole and quadrupole moments. Moreover, a molecular electrostatic potential (MEP) (this volume, chapter by Politzer and Murray) can be derived by computing the Coulomb interaction between a charged particle and the electronic charge given by $\rho(\vec{r})$. Furthermore, by splitting the summation over $l$ and $m$ in Eq. (34) into sums over atoms and their respective basis functions, the (spin) densities can be partitioned into atomic contributions known as Mulliken charges.
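The Mulliken partitioning can be sketched in a few lines. Integrating Eq. (34) over all space gives the electron count as the trace of PS, and assigning each diagonal element $(PS)_{ll}$ to the atom that carries basis function $l$ yields the atomic populations. The function name, the argument layout, and the two-function diatomic below are hypothetical illustrations; the density matrix is that of two electrons in one doubly occupied spatial orbital.

```python
def mulliken_charges(P, S, atom_of_function, nuclear_charges):
    """Atomic charges q_A = Z_A minus the sum of (P S)_ll over functions l on atom A."""
    K = len(P)
    populations = [0.0] * len(nuclear_charges)
    for l in range(K):
        gross = sum(P[l][m] * S[m][l] for m in range(K))
        populations[atom_of_function[l]] += gross
    return [z - n for z, n in zip(nuclear_charges, populations)]


# Hypothetical diatomic: one function per atom, overlap 0.5, and two electrons
# in the orbital proportional to (2, 1), normalized so that c^T S c = 1.
S = [[1.0, 0.5], [0.5, 1.0]]
c = [2.0 / 7.0 ** 0.5, 1.0 / 7.0 ** 0.5]
P = [[2.0 * c[l] * c[m] for m in range(2)] for l in range(2)]
charges = mulliken_charges(P, S, atom_of_function=[0, 1],
                           nuclear_charges=[1.0, 1.0])
```

The atom with the larger orbital coefficient acquires a negative charge, and the charges sum to the total charge of the system (zero here). The overlap population is split equally between the two atoms, which is the defining and somewhat arbitrary choice of the Mulliken scheme.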
For a quantum chemical method to be useful for the general chemist, algorithms for calculating other properties must also be developed. For example, to determine the equilibrium structure, the change in the energy induced by a nuclear displacement must be known. The theoretical prediction of harmonic frequencies involves the second derivative of the electronic energy with respect to changes in the nuclear coordinates. Similarly, electrical and magnetic properties such as polarizabilities and magnetizabilities as well as NMR parameters may be calculated as second derivatives of the energy with respect to various time-independent perturbations. Efficient schemes for calculating first and higher derivatives of the Hartree–Fock energy have been developed, applicable to small and large systems [38,39].
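The role of the first and second derivatives can be illustrated on a one-dimensional model curve. The sketch below replaces the analytical derivative schemes mentioned above by simple central finite differences, and it replaces a computed potential energy curve by a hypothetical Morse function; the well depth, range parameter, equilibrium distance, and reduced mass are arbitrary values in atomic units.

```python
import math


def morse(r, depth=0.3, a=1.2, r_e=2.0):
    """Hypothetical Morse potential standing in for a computed energy curve."""
    return depth * (1.0 - math.exp(-a * (r - r_e))) ** 2


def first_derivative(f, x, step=1e-3):
    """Central-difference estimate of df/dx."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def second_derivative(f, x, step=1e-3):
    """Central-difference estimate of the second derivative of f."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step ** 2


gradient = first_derivative(morse, 2.0)         # vanishes at the equilibrium distance
force_constant = second_derivative(morse, 2.0)  # curvature of the curve at the minimum
reduced_mass = 1000.0                           # hypothetical value
omega = math.sqrt(force_constant / reduced_mass)
```

A geometry optimization searches for the point where the gradient vanishes, and the harmonic frequency then follows from the curvature at that point. For the Morse function, the exact curvature is `2 * depth * a**2`, which the finite-difference estimate reproduces closely.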
The response to frequency-dependent external fields may be obtained from Hartree–Fock response theory, yielding dynamical polarizabilities and hyperpolarizabilities. The identification of excitation energies as the poles of the dynamical polarizability tensor may be invoked to calculate excitation energies as well as one-photon and two-photon transition moments from the time development of the ground state [40–42].
The performance of the Hartree–Fock model is illustrated in Table 1, where we have listed the electronic dissociation energy ($D_e$), the equilibrium bond distance ($r_e$), and the harmonic ($\omega_e$) and fundamental ($\nu$) frequencies calculated at the Hartree–Fock/cc-pVXZ levels. Basis set convergence is in all cases rapid. Compared with experiment, the dissociation energy is strongly underestimated, the bond distance is too short, and the vibrational frequencies are too high. This behavior is typical of the Hartree–Fock model, reflecting the inadequacy of the mean field description, which ignores the instantaneous interaction among the electrons.

Table 1 Calculations of the Electronic Dissociation Energy $D_e$ (kJ/mol), the Equilibrium Geometry $r_e$ (pm), and the Harmonic $\omega_e$ (cm$^{-1}$) and Fundamental $\nu$ (cm$^{-1}$) Vibrational Frequencies of the N2 Molecule

Method      Basis       $D_e$    $r_e$     $\omega_e$   $\nu$
RHF         cc-pVDZ     469.3    107.73    2758.3       2735.7
            cc-pVTZ     503.7    106.71    2731.7       2710.3
            cc-pVQZ     509.7    106.56    2729.7       2708.1
            cc-pV5Z     510.6    106.54    2730.3       2708.5
CASSCF      cc-pVDZ     857.8    111.62    2354.3       2325.6
            cc-pVTZ     885.3    110.56    2339.4       2312.1
            cc-pVQZ     890.9    110.39    2339.5       2312.1
            cc-pV5Z     891.9    110.37    2340.4       2313.0
MP2         cc-pCVDZ    897.0    112.84    2175.8       2135.7
            cc-pCVTZ    962.8    111.01    2207.6       2169.9
            cc-pCVQZ    988.1    110.78    2218.1       2180.7
            cc-pCV5Z    998.8    110.70    2221.8       2184.4
CCSD        cc-pCVDZ    813.4    111.12    2411.8       2384.9
            cc-pCVTZ    873.7    109.35    2434.3       2408.4
            cc-pCVQZ    896.9    109.08    2446.8       2421.0
            cc-pCV5Z    905.6    108.99    2451.6       2425.7
CCSD(T)     cc-pCVDZ    843.8    111.74    2341.3       2312.4
            cc-pCVTZ    911.8    110.06    2354.7       2326.8
            cc-pCVQZ    936.3    109.81    2365.8       2338.0
            cc-pCV5Z    945.6    109.72    2370.1       2342.2
Experiment              956.3    109.77    2358.6       2329.9

In the correlated calculations, all electrons are correlated.
3.4. Limitations of the Hartree–Fock Method
Although the Hartree–Fock model is applicable in many situations, providing a useful qualitative description of a wide variety of molecular systems and processes, it is important to realize that it fails in certain cases. In particular, the Hartree–Fock model fails to provide a reasonable approximation to the exact state whenever there are several Slater determinants with large weights in the FCI wave function (Eq. (14)).
This often happens for excited electronic states and for molecules far away from their equilibrium geometry, particularly in regions of bond breaking and spin recoupling.
Moreover, in molecules with more than one resonance structure or in molecules containing transition metal atoms, several determinants may be important even for the electronic ground state at equilibrium.
To illustrate the incorrect behavior of the RHF model upon bond breaking, we return to Fig. 1, which contains the potential energy curve for the symmetrical dissociation of the water molecule (i.e., for a fixed HOH bond angle). For OH bond distances far from equilibrium, the RHF curve is qualitatively different from the FCI curve, grossly overestimating the energy required for dissociation. By contrast, the UHF model dissociates correctly, at least in a qualitative if not a quantitative sense, due to the mixing of several states of different multiplicities in the dissociation limit.
From the optimization itself, it may often be difficult to judge whether the Hartree–Fock wave function is a good approximation to the exact wave function and, in particular, whether or not the FCI wave function is dominated by one Slater determinant. However, the presence of several important determinants often gives rise to negative eigenvalues in the Hartree–Fock electronic Hessian (i.e., the second derivative of the Hartree–Fock energy with respect to changes in the MOs) [30,31]. For systems whose one-determinant dominance is in doubt, one should therefore inspect the electronic Hessian for negative eigenvalues (instabilities). However, even the absence of such instabilities does not ensure the correctness of the Hartree–Fock model. In difficult cases, therefore, it may be necessary to perform exploratory calculations using multiconfigurational methods, as discussed in Sec. 3.5.
3.5. Multiconfigurational Self-Consistent Field Theory
In Sec. 3.1, we saw that the Hartree–Fock model often gives results in qualitative agreement with experiment. It fails, however, in situations where static or near-degeneracy correlation becomes important (i.e., when several electronic configurations have the same or nearly the same energy). Such situations typically arise in the course of molecular reactions, when bonds are broken or formed. Sometimes, near-degeneracies may also be present at the equilibrium ground state geometry. Because only one of the nearly degenerate configurations can be occupied at the single-configuration Hartree–Fock level, the Hartree–Fock model breaks down and cannot be applied. Instead, even for qualitative agreement, we must adopt a multiconfigurational description of the electronic state.
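The effect of a near-degeneracy is already visible when only two configurations interact. The sketch below diagonalizes a 2x2 Hamiltonian matrix analytically and returns the weights of the two configurations in the lowest state; the configuration energies and the coupling element are hypothetical numbers, not results for any molecule.

```python
import math


def ground_state_weights(e1, e2, coupling):
    """Weights of two configurations in the lowest eigenvector of a 2x2 CI matrix."""
    # Rotation angle that diagonalizes [[e1, coupling], [coupling, e2]].
    theta = 0.5 * math.atan2(2.0 * coupling, e1 - e2)
    # (cos, sin) belongs to the upper root, so the lower root is (-sin, cos).
    return math.sin(theta) ** 2, math.cos(theta) ** 2


dominant = ground_state_weights(0.0, 2.0, 0.1)    # well-separated configurations
degenerate = ground_state_weights(0.0, 0.0, 0.1)  # degenerate configurations
```

When the second configuration lies far above the first, the lowest state is dominated by a single configuration and a one-determinant description is reasonable. When the two configurations are degenerate, they enter with equal weights however small the coupling, and no single configuration can represent the state.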
The multiconfigurational SCF (MCSCF) model [43,44] is a generalization of the Hartree–Fock model to several configurations: the wave function is written as a linear combination of configurations, whose expansion coefficients and MOs are simultaneously determined by optimizing the energy with respect to variations in both the MOs and the configuration coefficients.
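In generic notation, the MCSCF wave function and its energy can be sketched as follows. The symbols $|\mathrm{MC}\rangle$, $c_k$, and $|k\rangle$ (a configuration built from the MOs $\phi_i$) are assumptions chosen here for illustration, and the expressions are deliberately left unnumbered.

```latex
\begin{align*}
  |\mathrm{MC}\rangle &= \sum_{k} c_k\, |k\rangle, \\
  E_{\mathrm{MCSCF}} &= \min_{\{\phi_i\},\,\{c_k\}}
    \frac{\langle \mathrm{MC}|\hat{H}|\mathrm{MC}\rangle}
         {\langle \mathrm{MC}|\mathrm{MC}\rangle}.
\end{align*}
```

With a single configuration, the sum reduces to one determinant and the minimization reduces to the Hartree–Fock problem of Eq. (22).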
The MCSCF procedure may be applied to excited states as well as to the ground state.
For the ground state, $E_{\mathrm{MCSCF}} \geq E_{\mathrm{exact}}$; for excited states, by contrast, the MCSCF energy may sometimes be lower than the corresponding exact energy (unless the calculated state is required to be orthogonal to all lower-lying states). This behavior of MCSCF theory occurs because the MCSCF energies of different electronic states are obtained not by the diagonalization of a single Hamiltonian but instead by separate nonlinear optimizations of the energy function.
For the optimization of Hartree–Fock wave functions, it is usually sufficient to apply the SCF scheme described in Sec. 3.1. By contrast, the optimization of MCSCF wave functions requires more advanced methods (e.g., the quasi-Newton method or some globally convergent modification of Newton’s method, which involves, directly or indirectly, the calculation of the electronic Hessian as well as the electronic gradient at each iteration) [45].
3.6. Complete Active Space MCSCF Theory
When carrying out an MCSCF calculation, we must first decide which configurations to include in the wave function. Although the configurations may be selected individually, it is more convenient to proceed by dividing the orbital space into subspaces and then to generate configurations by distributing electrons among these subspaces.
In the popular complete active space SCF (CASSCF) method, for example, the orbital space is divided into inactive, active, and secondary (external) subspaces [43,44]. The CASSCF model is now completely defined: the inactive orbitals are doubly occupied in all configurations, the secondary orbitals are unoccupied in all configurations, whereas the remaining electrons are distributed in all possible ways among the active orbitals.
In a sense, we are carrying out an FCI calculation in the configuration space spanned by the active orbitals except that, during the optimization of the FCI wave function, not only the configuration coefficients but also the orbitals are optimized so as to yield the best possible wave function in the chosen configuration space. However, it is not always necessary to optimize all orbitals during the MCSCF optimization. For example, the core orbitals are usually described well at the Hartree–Fock level and are therefore often kept "frozen" (i.e., unchanged) during the MCSCF optimization.
Let us consider how we may go about setting up an active space for an MCSCF calculation. For the study of reactive systems, we would preferably include in the active space all valence orbitals (at least of all atoms involved in the reactions), leaving the core inactive. Thus, for first-row atoms, all orbitals belonging to the L shell are active and those in the K shell are inactive. In this manner, we ensure a balanced description of the reactive system, no matter what reaction path is followed.
Unfortunately, at present, it is not possible to treat active spaces containing more than, say, 16 orbitals and the same number of valence electrons, confining this full-valence approach to rather small systems. To treat larger systems at the CASSCF level, we must exclude from the active space all orbitals that are deemed unimportant in a given chemical reaction, guided by our chemical intuition.
As an example, consider the symmetrical dissociation of H2O. In H2O, bonding arises from the combination of the two 1s orbitals on the hydrogens with two sp$^3$ hybrid orbitals on oxygen. A minimal active space consists of four active orbitals with four electrons, excluding the remaining two sp$^3$ hybrids, which do not participate in the dissociation. In general, a minimal active space of 2n orbitals is required to dissociate n single bonds (each with two paired electrons) into 2n unpaired electrons. The disadvantage of this scheme is that it introduces a bias toward the reaction path under study, making comparisons with other reactions difficult. In most cases, however, an unbiased full-valence CASSCF description of the reactive system will be prohibitively expensive and not applicable.
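The steep growth of the cost with the size of the active space can be made concrete by counting determinants. The sketch below assumes that the active electrons are distributed in all possible ways among the active orbitals at a fixed spin projection and that no spatial symmetry is used; the function name and its arguments are illustrative.

```python
from math import comb


def cas_determinants(n_electrons, n_orbitals, spin_projection_2=0):
    """Number of Slater determinants in a CAS at fixed 2*M_S (no spatial symmetry)."""
    n_alpha = (n_electrons + spin_projection_2) // 2
    n_beta = n_electrons - n_alpha
    # Alpha and beta electrons are placed independently among the active orbitals.
    return comb(n_orbitals, n_alpha) * comb(n_orbitals, n_beta)


minimal_h2o = cas_determinants(4, 4)        # four electrons in four orbitals
full_valence_h2o = cas_determinants(8, 6)   # eight electrons in six orbitals
large_space = cas_determinants(16, 16)      # sixteen electrons in sixteen orbitals
```

The minimal space for H2O contains 36 determinants and the full-valence space of eight electrons in six orbitals contains 225, whereas sixteen electrons in sixteen orbitals already give 165,636,900. This factorial growth is the reason why the full-valence approach is confined to rather small systems.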
Fig. 2 shows the CASSCF potential energy curve for the symmetrical dissociation of H2O, using a full-valence active space of six orbitals and eight electrons. For comparison, the figure also contains the FCI and RHF energy curves. Around equilibrium, the differences of the CASSCF and RHF curves from the FCI curve are similar. However, as the bonds are stretched, the RHF model dissociates incorrectly, whereas the CASSCF curve remains parallel to the FCI curve. Thus, the qualitative agreement that the RHF model exhibits around equilibrium has, in the CASSCF model, been extended to all bond distances, making it an ideal method for studies of reactions, at least in a qualitative sense.
Figure 2 CASSCF dissociation of H2O (atomic units).
More generally, properties such as vibrational frequencies and reaction energies, which depend on the form of the potential curve, are better predicted with CASSCF theory than with RHF theory, provided the active orbital space has been properly defined. In Table 1, we compare the N2 full-valence CASSCF and RHF results for $D_e$, $r_e$, $\omega_e$, and $\nu$ in various correlation-consistent basis sets. At the CASSCF level, $D_e$ is in much better agreement with experiment than at the RHF level. Moreover, the CASSCF $r_e$ and $\omega_e$ are both close to the experimental values, although the RHF errors are slightly overcorrected because the CASSCF method overemphasizes the role of the