Optimisation Approaches - Theoretical Methodology 1 Density Functional Theory

2.2.2 Optimisation Approaches

This section describes the methods used to relax the atomic positions and the lattice parameters. These methods rely upon the minimisation of the electronic energy and the inter-nuclear Coulombic energy at 0 K.

2.2.2.1 Geometry Optimisations: Conjugate Gradients

The conjugate gradient (CG) method is one of several methods used to find the minimum of a multi-variable function. When applied to the problem of relaxing the atomic positions, the function to be minimised is the energy E(R). Its negative gradient with respect to the atomic positions R is the force acting on the atoms, $F = -\partial E(R)/\partial R$, which is obtained via the Hellmann-Feynman theorem and is minimised for the system of study.

The CG method is based upon the steepest descent (SD) algorithm, in which the atoms are moved in the direction of F. The SD method evaluates E(R) along a line at regular intervals between two points. For a starting point $R_m$, the new position is

$R_{m+1} = R_m + b_m F(R_m)$,

where $b_m$ is chosen such that $F(R_{m+1}) \cdot F(R_m) = 0$. The new gradient $F(R_{m+1})$ is therefore perpendicular to the previous search line, and the procedure is repeated along the direction of $F(R_{m+1})$ until the minimum is located. The first step of the SD and CG techniques is the same. In the CG method, however, successive displacements can take any direction, which can be expressed as:

$R_{m+1} = R_m + b_m S_m$ (2.25)

The search vector $S_m$ now contains information from the gradient and the search direction of the previous step (equation 2.26).

$S_m = F(R_m) + \gamma_m S_{m-1}$ (2.26)

The scalar coefficient, $\gamma_m$, is zero for m = 1 and is defined by Fletcher and Reeves as equation 2.27.[187]

$\gamma_m = \dfrac{F(R_m) \cdot F(R_m)}{F(R_{m-1}) \cdot F(R_{m-1})}$ (2.27)

The advantage of the CG method when compared to the SD method is that it is able to drastically reduce the number of search steps required to locate the minimum of the energy function. In the CG approach, search directions that are optimally independent from one another are utilised.
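The difference in step count between SD and CG can be illustrated with a minimal sketch. It assumes a model quadratic energy E(R) = ½ R·A·R − c·R (the matrix `A`, vector `c` and function name `minimise` are hypothetical and chosen only for illustration), for which the line minimum along a search direction is known analytically; a real relaxation would instead obtain E and F from the electronic structure code.

```python
import numpy as np

def minimise(A, c, R0, use_cg, tol=1e-8, max_steps=10_000):
    """Minimise the model energy E(R) = 0.5 R.A.R - c.R.

    Returns the final position and the number of line searches taken.
    use_cg=False gives steepest descent, use_cg=True gives
    Fletcher-Reeves conjugate gradients.
    """
    R = np.array(R0, dtype=float)
    F = c - A @ R          # force = -dE/dR for the model energy
    S = F.copy()           # the first search direction is the same for SD and CG
    for step in range(max_steps):
        if np.linalg.norm(F) < tol:
            return R, step
        b_m = (F @ S) / (S @ A @ S)   # exact line minimum (valid for a quadratic only)
        R = R + b_m * S               # equation 2.25
        F_new = c - A @ R
        if use_cg:
            gamma = (F_new @ F_new) / (F @ F)   # equation 2.27
            S = F_new + gamma * S               # equation 2.26
        else:
            S = F_new                           # SD: always follow the force
        F = F_new
    return R, max_steps

# Hypothetical, poorly conditioned 3-dimensional model problem
A = np.diag([1.0, 10.0, 100.0])
c = np.ones(3)
R_sd, n_sd = minimise(A, c, np.zeros(3), use_cg=False)
R_cg, n_cg = minimise(A, c, np.zeros(3), use_cg=True)
print(f"SD line searches: {n_sd}, CG line searches: {n_cg}")
```

For a quadratic function in N dimensions, CG with exact line searches needs at most N steps in exact arithmetic, whereas SD zig-zags when the curvature differs strongly between directions.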

2.2.2.2 Broyden–Fletcher–Goldfarb–Shanno Algorithm

The Broyden–Fletcher–Goldfarb–Shanno (BFGS) method was proposed as a means of solving nonlinear optimisation problems in a more efficient manner than the CG method outlined above.[188] In the BFGS method (and other quasi-Newton methods), the Hessian matrix is not directly evaluated, but is instead approximated by updates from the gradient evaluations. The BFGS method (and its limited-memory variant, L-BFGS) is well suited to the efficient treatment of large multi-dimensional problems. The search direction is obtained from equation 2.28.

$B_k p_k = -\nabla f(x_k)$, (2.28)

where $p_k$ defines the search direction, $B_k$ is the approximated Hessian matrix updated at each stage, and $\nabla f(x_k)$ is the gradient of the function evaluated at $x_k$. A line search in the direction of $p_k$ locates the next point of evaluation $x_{k+1}$ (equation 2.29). The approximate Hessian matrix is then updated via matrix addition (equation 2.30).

$x_{k+1} = x_k + \alpha_k p_k$ (2.29)

$\alpha_k$ defines the step size and is determined by the line search along the direction $p_k$ found from equation 2.28. The step taken is then $s_k = \alpha_k p_k$ and the corresponding change in gradient is $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$. Enforcing the quasi-Newton condition, $B_{k+1} s_k = y_k$, these are put together to give:

$B_{k+1} = B_k + \dfrac{y_k y_k^{T}}{y_k^{T} s_k} - \dfrac{B_k s_k s_k^{T} B_k}{s_k^{T} B_k s_k}$ (2.30)

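Here, $f(x)$ is the function to be minimised. In the first step no curvature information is available, so $B_0$ is commonly set to the identity matrix and the search direction is simply the negative gradient; further steps are refined by the updated $B_k$.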
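The update cycle of equations 2.28 to 2.30 can be sketched as follows. This is an illustrative sketch only: it assumes a model quadratic function f(x) = ½ x·A·x − c·x so that the line search can be done analytically, it starts from the identity matrix as the initial Hessian guess, and the names `bfgs_update` and `bfgs_minimise` are hypothetical rather than taken from any particular code.

```python
import numpy as np

def bfgs_update(B, s, y):
    """Equation 2.30: update the approximate Hessian B from step s and gradient change y."""
    Bs = B @ s
    return B + np.outer(y, y) / (y @ s) - np.outer(Bs, Bs) / (s @ Bs)

def bfgs_minimise(A, c, x0, tol=1e-8, max_steps=100):
    """Minimise the model function f(x) = 0.5 x.A.x - c.x with BFGS."""
    x = np.array(x0, dtype=float)
    B = np.eye(len(x))                   # assumed initial guess: identity matrix
    g = A @ x - c                        # gradient of the model function
    for _ in range(max_steps):
        if np.linalg.norm(g) < tol:
            break
        p = np.linalg.solve(B, -g)       # equation 2.28: B_k p_k = -grad f(x_k)
        alpha = -(g @ p) / (p @ A @ p)   # exact line search (quadratic model only)
        s = alpha * p
        x = x + s                        # equation 2.29
        g_new = A @ x - c
        B = bfgs_update(B, s, g_new - g) # equation 2.30 with y_k = g_new - g
        g = g_new
    return x, B

# Hypothetical 2-dimensional example
A = np.array([[4.0, 1.0], [1.0, 3.0]])
c = np.array([1.0, 2.0])
x_min, B_final = bfgs_minimise(A, c, [0.0, 0.0])
print("minimum at", x_min)
```

By construction, the updated matrix satisfies the quasi-Newton condition for the most recent step, which is what allows curvature information to accumulate without ever evaluating the Hessian directly.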

It is important to note that the above techniques assume there is a single minimum associated with the given energy function. Where there are several minima, only the minimum of the basin in which the search starts will be located. Because these methods are unable to move between basins, configurational space must be searched. This ensures the configuration being studied is a meaningful minimum as opposed to a metastable configuration.
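The dependence on the starting point can be seen in a one-dimensional toy example. The double-well energy function below is purely hypothetical and is relaxed with a simple fixed-step descent rather than CG or BFGS; the point is only that each starting position ends in the minimum of its own basin.

```python
def energy(x):
    """Hypothetical asymmetric double-well energy with two minima."""
    return x**4 - 2.0 * x**2 + 0.3 * x

def force(x):
    """Force = -dE/dx for the double well above."""
    return -(4.0 * x**3 - 4.0 * x + 0.3)

def relax(x, step=0.01, tol=1e-8, max_steps=100_000):
    """Follow the force downhill with a fixed step until it (nearly) vanishes."""
    for _ in range(max_steps):
        f = force(x)
        if abs(f) < tol:
            break
        x += step * f
    return x

x_left = relax(-0.5)    # starts in the left-hand basin
x_right = relax(0.5)    # starts in the right-hand basin
print(f"left:  x = {x_left:.3f}, E = {energy(x_left):.3f}")
print(f"right: x = {x_right:.3f}, E = {energy(x_right):.3f}")
```

Both runs converge to zero force, but only one of them finds the lower-energy minimum; the other is the analogue of a metastable configuration.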

2.2.2.3 The Relaxation of Lattice Parameters: Pulay Stress and the Equation of State Method

The previous sections covered the relaxation of local geometries. However, each functional gives a variation in bond length, which by extension makes the lattice parameters dependent upon the functional. With this in mind, fixing the cell parameters can in effect lead to the description of a geometrically excited state.

When both atom positions and lattice parameters are being relaxed, Pulay stress may arise.[189] This error comes about because the plane-wave basis set is incomplete with respect to changes in volume: the number of basis functions is defined by both the energy cut-off and the size of the reciprocal cell. The volume variation that occurs during the optimisation of the lattice parameters therefore changes the number of plane waves in the basis set for a given energy cut-off. The effect of this error is to introduce a positive, non-zero pressure, or stress, that tends to decrease the volume. This is typically remedied by setting the cut-off energy 30% higher than required for energy convergence, as this is typically enough to converge the stress tensor. However, the higher cut-off increases the computational time required.
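The coupling between cell volume and basis-set size can be made concrete with a rough estimate. The sketch below assumes the standard sphere-counting approximation, in which the number of plane waves with kinetic energy below the cut-off is N ≈ V(2E_cut)^(3/2)/(6π²) in Hartree atomic units; it ignores the discreteness of the reciprocal lattice and any k-point dependence, and the function names and example numbers are hypothetical.

```python
import math

HARTREE_IN_EV = 27.211386
BOHR_IN_ANGSTROM = 0.529177

def n_plane_waves(volume_A3, ecut_eV):
    """Estimated number of plane waves for a cell volume (A^3) and cut-off (eV)."""
    V = volume_A3 / BOHR_IN_ANGSTROM**3
    E = ecut_eV / HARTREE_IN_EV
    return V * (2.0 * E) ** 1.5 / (6.0 * math.pi**2)

def effective_cutoff(n_pw, volume_A3):
    """Cut-off (eV) that a fixed set of n_pw plane waves corresponds to at a given volume."""
    V = volume_A3 / BOHR_IN_ANGSTROM**3
    return 0.5 * (6.0 * math.pi**2 * n_pw / V) ** (2.0 / 3.0) * HARTREE_IN_EV

# Hypothetical cell: 100 A^3 at a 400 eV cut-off, then expanded by 3 %
n_start = n_plane_waves(100.0, 400.0)
print(f"plane waves at V0:            {n_start:.0f}")
print(f"plane waves at 1.03 V0:       {n_plane_waves(103.0, 400.0):.0f}")
print(f"effective cut-off at 1.03 V0: {effective_cutoff(n_start, 103.0):.1f} eV")
```

If the basis set is kept fixed while the cell expands, the same plane waves correspond to a lower effective cut-off, so the quality of the basis differs between volumes. This is the inconsistency that a higher cut-off is intended to suppress.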

An alternative approach that avoids the Pulay stress is to carry out a constant volume relaxation in which only the ion positions and cell shape are allowed to relax. This keeps the basis set constant and removes the above issues induced by volume variations. The relaxation is repeated at a number of volumes and the resulting energy-volume points are fitted to a cubic equation of state, the minimum of which gives the equilibrium volume of the cell.
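The fitting step can be sketched as below, assuming the cubic equation of state is a simple cubic polynomial in the volume. The energy-volume points are synthetic values generated for illustration only, not calculated results, and the function name `equilibrium_volume` is hypothetical.

```python
import numpy as np

def equilibrium_volume(volumes, energies):
    """Fit E(V) to a cubic polynomial and return (V0, E0) at its minimum."""
    volumes = np.asarray(volumes, dtype=float)
    energies = np.asarray(energies, dtype=float)
    v_centre = volumes.mean()                # shift volumes for a well-conditioned fit
    coeffs = np.polyfit(volumes - v_centre, energies, 3)
    d1 = np.polyder(coeffs)                  # dE/dV
    d2 = np.polyder(d1)                      # d2E/dV2
    for root in np.roots(d1):
        v0 = root.real + v_centre
        is_real = abs(root.imag) < 1e-9
        is_minimum = np.polyval(d2, root.real) > 0
        in_range = volumes.min() <= v0 <= volumes.max()
        if is_real and is_minimum and in_range:
            return v0, np.polyval(coeffs, root.real)
    raise ValueError("no minimum found inside the sampled volume range")

# Synthetic energy-volume points (illustrative values only)
V = np.linspace(18.0, 22.0, 7)
E = -10.0 + 0.05 * (V - 20.0) ** 2 - 0.002 * (V - 20.0) ** 3
V0, E0 = equilibrium_volume(V, E)
print(f"equilibrium volume: {V0:.3f}, minimum energy: {E0:.3f}")
```

Restricting the accepted root to the sampled volume range guards against picking up the second stationary point of the cubic, which lies outside the region the data describe.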

2.2.2.4 Transition States

The transition state is the highest-energy point (a saddle point) on the minimum energy path (MEP) between two local minima; the MEP defines the reaction coordinate for a given transition (reaction). The barrier height gives the adiabatic energy cost for a given reaction to occur, which for elementary reactions is equal to the activation energy.

A variety of methods have been designed to tackle the problem of finding the MEP; the two focused on here are the climbing image nudged elastic band (CI-NEB)[190], [191] and the improved dimer method (IDM).[192] The CI-NEB is based upon the nudged elastic band method, in which a series of replica images along the reaction coordinate (the band) are created and kept equidistant during the relaxation through the addition of spring forces between the images.[193], [194] All images except the initial and final ones are optimised using the residual minimisation method with direct inversion in the iterative subspace (RMM-DIIS). This is a quasi-Newton method based on the forces and stress tensor, in which the norm of the residual vector is minimised through diagonalisation of the inverse Hessian matrix. In the CI-NEB approach, the method is refined to give an improved description of the saddle point with the same (or fewer) images. The highest energy image is freed from the spring constraints and the component of the force along the tangent is inverted, allowing this image to maximise its energy along the band and minimise it in all other directions.
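The force projections described above can be written compactly. The following is a minimal sketch, assuming each image is stored as a flat coordinate vector and using a simplified tangent estimate (the normalised vector between the two neighbouring images); production implementations generally use more robust tangent definitions, and the function names here are hypothetical.

```python
import numpy as np

def tangent(R_prev, R_next):
    """Simplified unit tangent to the band at an image (an assumption of this sketch)."""
    t = R_next - R_prev
    return t / np.linalg.norm(t)

def neb_force(R_prev, R, R_next, F_true, k_spring):
    """Force on an ordinary image: true force perpendicular to the band
    plus the spring force acting along the band."""
    tau = tangent(R_prev, R_next)
    F_perp = F_true - (F_true @ tau) * tau
    stretch = np.linalg.norm(R_next - R) - np.linalg.norm(R - R_prev)
    return F_perp + k_spring * stretch * tau

def climbing_image_force(R_prev, R_next, F_true):
    """Force on the climbing image: no springs, and the component of the
    true force along the tangent is inverted."""
    tau = tangent(R_prev, R_next)
    return F_true - 2.0 * (F_true @ tau) * tau

# Hypothetical two-dimensional example with equidistant images along x
R_prev, R, R_next = np.array([0.0, 0.0]), np.array([1.0, 0.0]), np.array([2.0, 0.0])
F_true = np.array([1.0, 2.0])
print("ordinary image force:", neb_force(R_prev, R, R_next, F_true, k_spring=5.0))
print("climbing image force:", climbing_image_force(R_prev, R_next, F_true))
```

Because only the parallel component changes sign, the climbing image moves uphill along the band while still relaxing downhill in every perpendicular direction, which drives it towards the saddle point.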

The IDM is implemented in VASP and CP2K. This algorithm identifies a transition state starting from only the initial configuration. By identifying the negative (imaginary-frequency) vibrational mode that defines the reaction coordinate, a dimer axis is established and followed to give the transition state.[195]

In document The characterisation of performance limiting defects in 4H-SiC devices using density functional theory (Page 70-74)