2.2 Performing molecular dynamics simulations
2.2.2 Simulation algorithms
An idealised MD simulation in an isolated environment needs only, with an appro- priate force-field, to numerically solve Newtonian dynamics for an entire molecular ensemble. As such, the basic algorithm underpinning MD simulation is, first, calcu- lating forces and, thus, velocities based on the force-field and the system coordinates, then translating the coordinates over a discrete time step according to the calculated velocities, and repeating. Of course, the key practical difficulties in a production simulation are in the amount of computational power required and the intrinsic in- adequacy of a classical representation of a quantum mechanical system. With the additional caveat of simulating a system in equilibrium with a constant temperature and pressure environment, these points lead to a series of necessary modifications to this basic MD algorithm.
Arguably, the most crucial algorithm and variable is the integrator method and timestep, ∆t, for the Newtonian dynamics. Any numerical differential equation solver requires a time integration step-size, ∆t, small enough to provide an accu- rate description of the dynamics. On the other hand, decreasing ∆t leads to a
proportional decrease in simulation efficiency. Furthermore, there are several inte- gration methods for the dynamics which, similarly, can be optimised for balancing simulation efficiency and accuracy.
Beyond the basic MD algorithm, there are a wide variety of algorithms for im- plementing temperature and pressure coupling. As with integration methods, the most efficient algorithms can come at the expense of accuracy and identifying how, if at all, this loss of accuracy manifests in the simulations we perform and, thus, the optimal algorithm is key to improved performance.
Numerous other algorithms are utilised in a production MD run such as the long- range electrostatic treatments discussed in Section 2.1.3, algorithms for maintaining bond lengths such as the LINCS and P-LINCS algorithms [167,168], and algorithms, such as the Verlet algorithm [178], for maintaining and updating interacting neigh- bour lists. Along with the references listed, an overview of these and the many other algorithms often utilised in MD simulations can be found in reference [166]. In this section, we focus on the integrator and environmental coupling algorithms and present an overview of the basic principles underlying these algorithms and the optimisation and testing procedures we have carried out.
Integrators for Newtonian dynamics
Of the integrators for Newtonian dynamics available in Gromacs, we use the default leap-frog algorithm [191]: ˙ri(t + 1 2∆t) = ˙ri(t− 1 2∆t) + ∆t mi Fi(t), (2.20) ri(t + ∆t) = ri(t) + ∆t ˙ri(t + 1 2∆t). (2.21)
This integrator, as can be seen from the above, has positions and velocities specified at different times in all iterations. This may have consequences in systems in which very tightly controlled energetics are required. In such systems, the Verlet integrator [192], which recasts the leap-frog algorithm so that the positions and velocities are specified at the same time step and, therefore, provides a more consistent input to temperature and velocity scaling, is often required. This integrator is given by:
˙ri(t + ∆t) = ˙ri(t) + ∆t mi (Fi(t) + Fi(t + ∆t)), (2.22) ri(t + ∆t) = ri(t) + ∆t ˙ri(t) + ∆t2 2mFi(t). (2.23)
While the Verlet integrator step incurs approximately the same processing cost as the leap-frog algorithm, the additional terms in the integration step mean that around
100 120 140 160 180
Dihedral Angle (º)
0.5 fs 2 fs 4fs 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1Time (ps)
100 110 120 130 140Bond Angle (º)
0 2 4 6 8 10Error (º)
4 fs 2 fs 0 0.2 0.4 0.6 0.8 1Time (ps)
0 1 2 3 4 5 6Error (º)
(a) (b) (c) (d)Figure 2.7: Comparison of time traces of (a) a dihedral angle and (c) a bond angle between carbons at the intermonomer junction of a dioctyl-fluorene 2mer simulated in chloroform for integrator step-sizes of ∆t = 0.5, 2, and 4 fs. In (b) and (d), the absolute error, with respect to the 0.5 fs trace, is given for the dihedral angle and bond angle, respectively. (Legends in (a) and (b) apply to (c) and (d), respectively.)
twice as many communication calls are required [166] which, for large and highly parallelised simulations such as ours, is a significant disadvantage. Furthermore, as the fine details of thermodynamics are not a primary aim of our simulations and, also, given its succesful use in other MD works of a similar nature [46, 93, 141, 193], we find the leap-frog algorithm to be the optimal compromise.
Due to the implementation of constraints on bond-stretching terms (typically of vibrational period T ∼ 10 fs), the highest frequency vibrations present in a simula- tion are the bond angle terms (T ∼ 50 fs). As such, it follows that a suitable inte- gration timestep would be ∆t∼ 1 fs with values of 1 fs and 2 fs common in other MD works [46, 47, 93, 141–143]. In order to obtain an optimal value of ∆t, we performed simulations with no temperature or pressure coupling (N V E ensemble simulations) with varying values of ∆t. With no external coupling, the simulation dynamics are entirely deterministic and lead to comparable trajectories. In Figure 2.7, it is seen that the trajectories for an example dihedral angle and bond angle, each taken at the intermonomer junction of a dioctyl-fluorene 2mer, are near-identical over the course of 1ps with absolute errors in the range of 1°-2° between each trace. In contrast, an integrator step of 4 fs leads to the propagation of significant errors which are 3°-4° after 0.5 ps and as high as 5° for the bond angle and 10° for the dihedral angle after 1 ps.
Note that a typical simulation run is over 100 ns so the correspondence of these trajectories over only 1 ps is not fully indicative of the stability of the integrator over simulations of 10-100 ns in length. However, in production simulations with temperature and pressure coupling (as discussed later in this subsection), consistency is only required over this time-scale due to, in particular, the randomisation of velocities over a temperature coupling step ∼ 0.1 ps. Therefore, fully deterministic dynamics only ever occur over this timescale. With this point and the results shown in Figure 2.7 in mind, it is evident that 2 fs is a sufficiently small and accurate integration timestep.
Temperature and pressure scaling
In order to bring into effect any temperature or pressure scaling, one must first form a definition of each which can be extracted from the simulation state. The instantaneous, simulation temperature, T , is defined based on the relationship be- tween kinetic energy, EK, and temperature in analogy with the equipartition of
energy [194]: EK = 1 2NdkBT = 1 2 N X i mi|˙ri|2, (2.24)
with N the number of atoms in the simulation and Nd the number of degrees of
freedom available. Eq. 2.24 reduces immediately to the standard definition of en- semble temperature upon averaging over all states. In defining temperature in this way, it follows that an external source of heat may be coupled to the system via modification of a suitable sample of the system velocities.
To mediate the modification of velocities, we use the velocity-rescale thermostat [195]. This is a modified version of the Berendsen thermostat [194] whose basic operation is in modifying the simulation temperature T to the reference temperature T0 after an an interval τT as:
dT
dt =
T − T0
τT
. (2.25)
The modifying interval τT can be viewed as a coupling constant which controls the
influence of the external environment over the simulated system. The modification of the system temperature is performed by a uniform scaling of all velocities in the system. With the velocity rescale thermostat, the basic scaling of Eq. 2.25 is used along with an additional stochastic noise term so as to avoid suppressing fluctuations entirely. (For a full discussion of this point, see reference [195].)
This simple form of thermostat, even with the modification of preserving the kinetic energy distribution, is not fully capable of perfectly describing the internal thermodynamics of all systems. The failing of this method is in controlling temper- ature often too tightly and not allowing for the natural thermal fluctuations in the
system. Examples of approaches which can improve this are thermostats are the Nos´e-Hoover (NH) [196, 197] and Andersen [198] thermostats. The NH thermostat is often used in simulations of conjugated polymers though we find that adding this additional complexity to temperature control yields insignificant differences in our results. However, it is, as with the choice of integrator algorithm, noted that this may not be true if one is interested in strictly thermodynamic properties.
Given the definition of a system boundary, the simulation pressure can be defined using an instantaneous version of the virial equation [194].
P = 1 3V N X i mi|˙ri|2+ N X i N X j<i rij · Fij ! . (2.26)
The first term on the R.H.S. of this definition is related to the kinetic energy and, therefore, the simulation temperature. The second term is the virial of the system. Again, averaging over Eq 2.26 returns the usual definition of the virial equation [113]. The definition given here is also simplified due to our use of isotropic systems. Also, this definition is a modified variant for use with a pair-wise force-field in which rij ≡ rj − ri and Fij ≡ ∂Vij(rij)/∂rij are easily accesible.
The variable used for rescaling pressure is the system volume which is controlled by rescaling the box vectors accordingly. This is mediated by the Berendsen baro- stat [194], which, as its name suggests, performs in an analogous manner to the thermostat of Eq. 2.25 for a simulation pressure, P , reference pressure, P0, and cou-
pling interval, τP. For an isotropic system, the box vectors are scaled uniformly in
order to return the system pressure to its equilibrium value.
As with the Berendsen and velocity-rescale thermostats, the Berendsen barostat is a simplified barostat which suppresses fluctuations in the system. However, as with the simplified thermostat, we find this barostat equivalent in accuracy to, as an example, the Parrinello-Rahman barostat [199, 200] with fewer complications (such as simulations crashing in the equilibration phase).
In all our simulations, we use a τT = 0.1 ps and τP = 0.4 ps. These values are
recommended by the Gromacs program in combination with our choice of integrator step-size and neighbour searching frequency though, given our choice of thermostat and barostat, varying them has little effect on the resulting accuracy or efficiency. Further details of our tests of the velocity-rescale and Nos´e-Hoover thermostats, and Berendsen and Parrinello-Rahman barostats are provided in Appendix B.
Having discussed our use of thermostats and barostats in controlling temperature and pressure, and, thus, controlling the system equilibrium, we can now proceed to detail the process of achieving equilibrium in a simulation: the process of equilibra- tion.