4.3 Dropout neural network potential
4.3.3 Uncertainty quantification
Now that we have a method to determine whether a DNNIP is transferable to a new problem, the next question is how to quantify the uncertainty in a property obtained through atomistic simulations. We provide two ways to compute the uncertainty: a direct method and an indirect method. In the direct method, we compute the property multiple times, each time with different but fixed dropout matrices in the IP, and then take the average and standard deviation of the outputs from these runs as the predictive mean and uncertainty, respectively. This method applies to any property. However, if a property has a “simple” relation with the IP energy and/or forces, the indirect method can be employed to propagate the uncertainty in the energy and/or forces obtained from the IP to the property.

⁵ Arguably, a better way is to compare the relative uncertainty (the uncertainty normalized by the predictive mean), but here it is not a problem because the energy scale of all four carbon allotropes is around 8 eV/atom.
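The direct method can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the one-hidden-layer network `toy_energy`, its random weights, the keep probability of 0.9, and the 100 dropout samples are all hypothetical stand-ins for a trained DNNIP and for an arbitrary property calculation. The sample standard deviation (`ddof=1`) is also a choice made here. The point is only the procedure: fix a set of dropout masks, evaluate the property once per set, and report the mean and standard deviation.

```python
import numpy as np


def make_dropout_mask(rng, n_hidden, keep_prob):
    """Draw one binary dropout mask that is then held fixed for a whole run."""
    return rng.random(n_hidden) < keep_prob


def toy_energy(x, weights, mask, keep_prob):
    """Hypothetical stand-in for a dropout IP: one hidden layer, fixed mask."""
    hidden = np.tanh(weights["W1"] @ x + weights["b1"])
    hidden = hidden * mask / keep_prob  # inverted-dropout scaling
    return float(weights["w2"] @ hidden)


def direct_uncertainty(property_fn, masks):
    """Evaluate the property once per fixed mask; return (mean, std)."""
    values = np.array([property_fn(mask) for mask in masks])
    return values.mean(axis=0), values.std(axis=0, ddof=1)


rng = np.random.default_rng(0)
n_in, n_hidden, keep_prob = 4, 32, 0.9  # illustrative sizes only
weights = {
    "W1": rng.normal(size=(n_hidden, n_in)),
    "b1": rng.normal(size=n_hidden),
    "w2": rng.normal(size=n_hidden) / n_hidden,
}
x = rng.normal(size=n_in)  # stand-in for a descriptor of one configuration
masks = [make_dropout_mask(rng, n_hidden, keep_prob) for _ in range(100)]

mean, std = direct_uncertainty(
    lambda mask: toy_energy(x, weights, mask, keep_prob), masks
)
print(f"predictive mean = {mean:.4f}, uncertainty = {std:.4f}")
```

In a real calculation, `property_fn` would wrap a full simulation (for example an MD run or a phonon calculation) carried out with one fixed set of dropout matrices.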
As an example, we compute the potential part of the virial stress in a monolayer graphene using molecular dynamics (MD) simulations. The potential part of the virial stress (stress for short below) can be expressed as [17,257]

$$s_{ij} = \frac{1}{VT} \sum_{t=1}^{T} \sum_{\alpha=1}^{N_a} r_{i,t}^{\alpha} f_{j,t}^{\alpha}, \tag{4.44}$$

where $i, j \in \{1, 2, 3\}$ are Cartesian components, $r_{i,t}^{\alpha}$ is the $i$th component of the position of atom $\alpha$ at MD time step $t$, $f_{j,t}^{\alpha}$ is the $j$th component of the force on atom $\alpha$ at MD step $t$, $N_a$ is the total number of atoms in the system, $T$ is the total number of MD steps, and $V$ is the volume of the system, defined as the area of graphene multiplied by the van der Waals thickness, 3.4 Å in the present case.
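Equation (4.44) maps directly onto a single tensor contraction. The sketch below assumes that positions and forces are stored as arrays of shape (T, N_a, 3), in Å and eV/Å respectively, so that the result is in eV/Å³ before conversion; the array layout, the function names, and the unit choice are assumptions made for illustration and are not specified in the text.

```python
import numpy as np

# 1 eV/Angstrom^3 expressed in GPa (follows from 1 eV = 1.602176634e-19 J).
EV_PER_A3_TO_GPA = 160.2176634


def graphene_volume(area, thickness=3.4):
    """Volume as in-plane area (A^2) times the van der Waals thickness (A)."""
    return area * thickness


def virial_stress_potential(positions, forces, volume):
    """Potential part of the virial stress, equation (4.44).

    positions, forces: arrays of shape (T, N_a, 3).
    Returns the 3x3 array s_ij = 1/(V T) * sum_t sum_alpha r_i f_j.
    """
    n_steps = positions.shape[0]
    return np.einsum("tai,taj->ij", positions, forces) / (volume * n_steps)


# Tiny made-up example: 2 steps, 3 atoms (numbers are not from this work).
rng = np.random.default_rng(1)
positions = rng.normal(size=(2, 3, 3))
forces = rng.normal(size=(2, 3, 3))
volume = graphene_volume(area=10.0)

stress = virial_stress_potential(positions, forces, volume)
print(stress * EV_PER_A3_TO_GPA)  # stress tensor in GPa under the assumed units
```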
[Figure 4.8: plot not reproduced. Horizontal axis: lattice parameter a (Å), from 2.343 to 2.589. Left vertical axis: stress s11 (GPa). Right vertical axis: uncertainty in atomic energy σe (meV). Legend: Direct, Indirect, σe.]

Figure 4.8: The potential part of the virial stress $s_{11}$ and uncertainty in atomic energy $\sigma_e$ in monolayer graphene at various lattice parameters. The left y axis is for the error bar plot of $s_{11}$, where we show the predictive mean and uncertainty obtained using both the direct and indirect methods. The right y axis is for the box and whisker plot of $\sigma_e$, where the bar inside the box denotes the median, the ends of the whiskers represent the lowest datum and highest datum still within 1.5 times the interquartile range of the lower quartile and upper quartile, respectively, and the circles represent outliers.

In the indirect method, we rewrite equation (4.44) in matrix form

$$s = \frac{1}{VT} R f, \tag{4.45}$$

where the stress $s$, in Voigt notation, is a column vector of 6 components, the coordinate matrix $R$ is a $6 \times 3N_aT$ matrix, and the force vector $f$ is a column vector of length $3N_aT$. See appendix F for a method to construct $R$ and $f$. Unlike the direct method, where we run multiple MD simulations to compute the average and standard deviation of the outputs, only one MD trajectory is generated in the indirect method. At each MD step, we evaluate the forces multiple times with different dropout matrices and then update the positions of the atoms by integrating the equations of motion using the average forces from the multiple evaluations. Therefore, we can assume that $R$ is a coefficient matrix without any uncertainty.⁶ Then the covariance of $s$ can be estimated as [254]

$$\Sigma_s = \frac{1}{V^2 T^2} R \Sigma_f R^{\top}, \tag{4.46}$$

where $\Sigma_s$ and $\Sigma_f$ denote the covariance matrices of $s$ and $f$, respectively, $R^{\top}$ is the transpose of $R$, and the square roots of the 6 diagonal elements of $\Sigma_s$ give the uncertainty in the stress. The force covariance matrix, $\Sigma_f$, can be obtained from the multiple evaluations of the IP at each MD step.
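The propagation in equations (4.45) and (4.46) can be sketched as follows. Several choices here are assumptions rather than the construction of appendix F: the Voigt ordering (11, 22, 33, 23, 13, 12), the column ordering of R (step, atom, component), and the treatment of each dropout sample as one draw of the full stacked force vector, from which a dense sample covariance is formed. All names and numbers are hypothetical.

```python
import numpy as np

# Assumed Voigt ordering (11, 22, 33, 23, 13, 12); appendix F may differ.
VOIGT_PAIRS = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def build_R(positions):
    """Build the 6 x 3*N_a*T coordinate matrix from positions (T, N_a, 3).

    Columns follow the order of forces.reshape(-1): step, atom, component.
    Row k for the pair (i, j) holds r_i in the columns that multiply f_j.
    """
    n_steps, n_atoms, _ = positions.shape
    R = np.zeros((6, n_steps, n_atoms, 3))
    for k, (i, j) in enumerate(VOIGT_PAIRS):
        R[k, :, :, j] = positions[:, :, i]
    return R.reshape(6, -1)


def propagate_stress_uncertainty(R, force_samples, volume, n_steps):
    """Apply equations (4.45) and (4.46).

    force_samples: (M, 3*N_a*T), one stacked force vector per dropout sample.
    Returns the mean stress (6,) and its uncertainty (6,).
    """
    f_mean = force_samples.mean(axis=0)
    sigma_f = np.cov(force_samples, rowvar=False)  # sample covariance of f
    s_mean = R @ f_mean / (volume * n_steps)
    sigma_s = R @ sigma_f @ R.T / (volume * n_steps) ** 2
    return s_mean, np.sqrt(np.diag(sigma_s))


# Toy sizes, far smaller than the system studied in this section.
rng = np.random.default_rng(2)
n_steps, n_atoms, n_dropout = 4, 3, 30
volume = 5.0
positions = rng.normal(size=(n_steps, n_atoms, 3))
force_samples = rng.normal(size=(n_dropout, 3 * n_atoms * n_steps))

R = build_R(positions)
s_mean, s_std = propagate_stress_uncertainty(R, force_samples, volume, n_steps)
print(s_mean, s_std)
```

Because the stress is linear in the forces, the result agrees with computing the stress separately for every dropout sample and taking the standard deviation. For 96 atoms and 1,000 sampled steps, a dense force covariance would have 288,000 rows, so a practical implementation would likely exploit this linearity or some block structure; how this is handled in the present work is not stated in this section.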
Using both the direct and indirect methods, we computed the stress in monolayer graphene at various in-plane lattice parameters. We constructed a rectangular monolayer graphene sheet consisting of 96 atoms, with the in-plane lattice parameter ranging from 2.343 to 2.589 Å. The zigzag and armchair edges of the graphene are aligned with the first and second Cartesian directions, respectively. Periodic boundary conditions are applied in both in-plane directions. The equations of motion were integrated using a velocity-Verlet algorithm with a time step of Δt = 1 fs. The system was thermalized at a constant temperature of T = 300 K in the canonical (NVT) ensemble using a Langevin thermostat. For both the direct and indirect methods, we discard the first 10,000 steps, during which the system has not yet stabilized, and then sample 1 out of every 10 steps to obtain a total of 1,000 steps for computing the stress.

⁶ For a random variable $x$, the standard deviation of its sample mean $\bar{x} = \sum_{i=1}^{n} x_i / n$ is $\sigma_{\bar{x}} = \sigma / \sqrt{n}$, where $\sigma$ is the standard deviation of $x$ and $n$ is the sample size. We assume that the number of dropout evaluations is large enough that the standard deviation of the mean of the forces is effectively 0, thus introducing no uncertainty to the positions of atoms.
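The single-trajectory procedure of the indirect method (several dropout force evaluations per step, integration with their mean, and subsampling after an initial discarded segment) can be sketched as below. This is a toy under loud assumptions: the "IP" is a harmonic force whose stiffness varies between hypothetical dropout samples, the Langevin thermostat is omitted, and the step counts are scaled down from the 10,000 discarded and 1,000 sampled steps quoted in the text. The indexing convention for which steps are kept is also an assumption.

```python
import numpy as np


def dropout_forces(positions, stiffness_samples):
    """Toy stand-in for M dropout evaluations of the IP forces.

    Each hypothetical dropout sample is a harmonic force with its own
    stiffness. Returns an array of shape (M, N_a, 3).
    """
    return -stiffness_samples[:, None, None] * positions[None, :, :]


def run_indirect_md(positions, velocities, stiffness_samples, mass, dt,
                    n_discard, stride, n_samples):
    """Velocity-Verlet MD driven by the mean force (no thermostat here)."""
    kept_positions, kept_force_samples = [], []
    forces = dropout_forces(positions, stiffness_samples).mean(axis=0)
    n_total = n_discard + stride * n_samples
    for step in range(n_total):
        velocities = velocities + 0.5 * dt * forces / mass
        positions = positions + dt * velocities
        samples = dropout_forces(positions, stiffness_samples)
        forces = samples.mean(axis=0)  # integrate with the mean force
        velocities = velocities + 0.5 * dt * forces / mass
        if step >= n_discard and (step - n_discard) % stride == 0:
            kept_positions.append(positions.copy())
            kept_force_samples.append(samples)  # keep all M evaluations
    return (np.array(kept_positions), np.array(kept_force_samples),
            positions, velocities)


rng = np.random.default_rng(3)
pos0 = rng.normal(size=(2, 3))              # 2 toy atoms
vel0 = np.zeros_like(pos0)
stiffness_samples = 1.0 + 0.05 * rng.normal(size=20)  # 20 dropout samples
mass, dt = 1.0, 0.01

kept_pos, kept_f, pos_end, vel_end = run_indirect_md(
    pos0, vel0, stiffness_samples, mass, dt,
    n_discard=100, stride=10, n_samples=50,
)
print(kept_pos.shape, kept_f.shape)
```

The stored positions would feed the coordinate matrix of equation (4.45), and the stored per-sample forces would feed the force covariance in equation (4.46).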
The stress in the x direction, $s_{11}$, and its uncertainty, $\sigma_{s_{11}}$, are plotted in figure 4.8. The direct and indirect methods yield almost the same stress and uncertainty.⁷ The stress $s_{11}$ in the graphene has the smallest magnitude at the equilibrium lattice parameter a = 2.466 Å, and the magnitude increases as the graphene moves away from its equilibrium structure. The uncertainty in the stress, $\sigma_{s_{11}}$, follows the same trend as the stress $s_{11}$ (i.e. small near a = 2.466 Å and larger farther away from it); however, the underlying mechanism is entirely different. For $s_{11}$, the trend is purely due to the physical law that governs the material behavior: the tensile (compressive) stress grows as a material is progressively stretched (squeezed). For $\sigma_{s_{11}}$, in contrast, moving away from the equilibrium lattice parameter means making predictions for configurations that deviate from the training set,⁸ and thus we would expect higher uncertainty in the predictions. This is in agreement with the uncertainty in atomic energy, which measures the distance between these configurations and the training set as discussed in section 4.3.2. The uncertainty in atomic energy is presented as box and whisker plots in figure 4.8.
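The box and whisker convention used for the uncertainty in atomic energy in figure 4.8 (median bar, whiskers at the most extreme data within 1.5 times the interquartile range of the quartiles, remaining points drawn as outliers) can be written out explicitly. The sketch below is illustrative only: the function name is hypothetical, the quartiles use NumPy's default linear interpolation (the plotting software used for the figure may use another rule), and the input numbers are made up rather than taken from this work.

```python
import numpy as np


def box_whisker_stats(values, whisker=1.5):
    """Median, quartiles, whisker ends, and outliers for one box plot."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    inside = values[(values >= low_fence) & (values <= high_fence)]
    outliers = values[(values < low_fence) | (values > high_fence)]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whisker_low": inside.min(),    # lowest datum within the fence
        "whisker_high": inside.max(),   # highest datum within the fence
        "outliers": outliers,
    }


# Made-up per-atom energy uncertainties (meV) for one lattice parameter.
sigma_e = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
stats = box_whisker_stats(sigma_e)
print(stats)
```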
As a second example, we consider the phonon dispersions in a monolayer graphene, which provide a comprehensive view of the elastic vibrational behavior predicted by IPs. Unlike the stress, there is no simple linear relation between the phonon frequency and the IP energy (or forces), so we compute the phonon dispersions using only the direct method. The phonon dispersions were calculated using the finite difference method as implemented in the phonopy package [209]. The phonon dispersions along a path connecting high-symmetry points in the first Brillouin zone are plotted in figure 4.9. We see that the predictive mean (dashed line) is in excellent agreement with DFT results (solid line). Specifically, it correctly captures the characteristics of the flexural acoustic (ZA) branch (e.g. the quadratic nature near the Γ point) that is associated with out-of-plane vibrations, which provides the dominant contribution to lattice thermal conductivity in graphene [218,219]. The uncertainty in the phonon frequency is small for acoustic branches and becomes larger for optical branches as the absolute phonon frequency increases. Also plotted in figure 4.9 is the prediction obtained using the reactive empirical bond order (REBO) potential [83], which performs the best among a number of physics-based potentials such as the Tersoff [113], adaptive intermolecular reactive empirical bond order (AIREBO) [148], long-range carbon bond order potential (LCBOP) [84], and reactive force field (ReaxFF) [201] models. See figure 3.5 for a comparison. The REBO potential performs comparably to our DNNIP for the low-frequency acoustic branches, whereas its predictions for the high-frequency TO and LO branches deviate significantly from DFT results and are much worse than those of the DNNIP.

⁷ The slight difference originates from the fact that the stress and uncertainty are obtained from a single MD trajectory in the indirect method, whereas the direct method requires multiple distinct MD trajectories, all different from the one used in the indirect method.

⁸ For monolayer graphene, our training set only includes ab initio molecular dynamics trajectories using an initial lattice parameter of a = 2.466 Å and slightly stretched and compressed configurations using a lattice parameter a ∈ [2.40, 2.52] Å.