
In this section we give an example to demonstrate how to use the KIM-based learning-integrated fitting framework (KLIFF) to train a Stillinger–Weber (SW) potential for silicon. Note that the code shown in this section is compatible with KLIFF v0.1.1 and may not work with later versions of KLIFF. An up-to-date example is available in the KLIFF documentation: https://kliff.readthedocs.io.

Here, we train an SW potential for silicon that is archived in the open knowledgebase of interatomic models (OpenKIM) repository. Before training the model, let’s first install it:

 

$ kim-api-collections-management install user \
    SW_StillingerWeber_1985_Si__MO_405512056662_005

 

We are going to fit the model to a training set of energies and forces from compressed and stretched diamond silicon structures as well as configurations drawn from molecular dynamics (MD) trajectories at different temperatures. The training set is stored in the extended XYZ format. A tarball of the training set can be downloaded from https://raw.githubusercontent.com/mjwen/kliff/master/examples/Si_training_set.tar.gz. To extract the training set, do

 

$ tar xzf Si_training_set.tar.gz

 

Note that Si_training_set is just a toy data set intended to demonstrate how to use KLIFF to train models; it is by no means suitable for training interatomic potentials (IPs) for real simulations.

Model

We first instantiate a knowledgebase of interatomic models (KIM) model for the SW potential and print out all of its parameters that can be optimized (we call these model parameters):

from kliff.models import KIM

model = KIM(model_name="SW_StillingerWeber_1985_Si__MO_405512056662_005")
model.echo_model_params()

The output generated by the last line reads:

 

#===========================================================================
# Available parameters to optimize.
#
# Model: SW_StillingerWeber_1985_Si__MO_405512056662_005
#===========================================================================
name: A
value: [15.28484792]
size: 1
dtype: Double
description: Multiplicative factors on the two-body energy function as a whole for each binary species combination. In terms of the original SW parameters, each quantity is equal to A*epsilon for the corresponding species combination. This array corresponds to a lower-triangular matrix (of size N=1) in row-major storage. Ordering is according to SpeciesCode values. For example, to find the parameter related to SpeciesCode 'i' and SpeciesCode 'j' (i >= j), use (zero-based) index = (j*N + i - (j*j + j)/2).

name: B
value: [0.60222456]
size: 1
description: Multiplicative factors on the repulsive term in the two-body energy function for each binary species combination. This array corresponds to a lower-triangular matrix (of size N=1) in row-major storage. Ordering is according to SpeciesCode values. For example, to find the parameter related to SpeciesCode 'i' and SpeciesCode 'j' (i >= j), use (zero-based) index = (j*N + i - (j*j + j)/2).

...

 

which shows the name, value, size, data type and a description of each parameter. In fact, there are other model parameters in the SW potential available for optimization (e.g. p, sigma, gamma, lambda, etc.), but we omit them here for the sake of space.
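The index formula quoted in the parameter descriptions can be checked with a few lines of Python. The following is a minimal sketch, not part of KLIFF or the KIM API: the function name `packed_index` is hypothetical, and swapping the arguments when i < j assumes that pair parameters are symmetric in the two species.

```python
def packed_index(i: int, j: int, n: int) -> int:
    """Zero-based position of the species pair (i, j) in the packed array.

    Implements the formula quoted in the parameter description,
    index = j*N + i - (j*j + j)/2, which is stated for i >= j.
    """
    if i < j:
        # Assumption: pair parameters are symmetric, so (i, j) and (j, i) coincide.
        i, j = j, i
    if not 0 <= j <= i < n:
        raise ValueError("species codes must satisfy 0 <= j <= i < n")
    # j*j + j is always even, so integer division is exact.
    return j * n + i - (j * j + j) // 2


# Single species (N = 1, as in this silicon model): the only pair is (0, 0).
print(packed_index(0, 0, 1))

# Hypothetical three-species model: enumerate all pairs with i >= j.
print([packed_index(i, j, 3) for j in range(3) for i in range(j, 3)])
```

For the single-species silicon model every parameter array has size 1, so the formula only matters for multi-species SW models.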

Now that we know what model parameters are available for fitting, we optimize a subset of them to reproduce the training set.

model.set_fitting_params(
    gamma=[[1.5]],
    B=[["default"]],
    sigma=[[2.0951, "fix"]],
    A=[[5, 1, 20]])
model.echo_fitting_params()

Here, we tell KLIFF to fit four parameters of the SW potential: gamma, B, sigma, and A. The information for each fitting parameter should be provided as a list of lists, where the size of the outer list should be equal to the size of the parameter given by model.echo_model_params(). For each inner list, we can provide one, two, or three items.

• One item. We can use a numerical value to provide an initial guess of the parameter (for example, gamma). Alternatively, the string default can be provided to use the default value in the model (for example, B).

• Two items. The first item should be a numerical value and the second item should be the string fix, which tells KLIFF to use the first item as the value of the parameter but not to optimize it (for example, sigma).

• Three items. The first item can be a numerical value or the string default, with the same meaning as in the one-item case. The second and third items are the lower and upper bounds for the parameter, respectively. A bound can be provided as either a numerical value or None, with the latter indicating that no bound is applied (for example, A).

The call to model.echo_fitting_params() prints to stdout the fitting parameters that we ask KLIFF to optimize:

 

#===========================================================================
# Model parameters that are optimized.
#===========================================================================

A 1
5.0000000000000000e+00 1.0000000000000000e+00 2.0000000000000000e+01

B 1
6.0222455840000000e-01

sigma 1
2.0951000000000000e+00 fix

gamma 1
1.5000000000000000e+00

where the number 1 after the name of each parameter indicates the size of the parameter. Parameters that are not included as fitting parameters are fixed to their default values in the model during the optimization.
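The one-, two-, and three-item conventions described above can be made concrete with a small parser. This is only a sketch of the stated rules, not KLIFF's implementation: `parse_param_spec` is a hypothetical helper, and the default values passed in the usage lines are the ones printed by model.echo_model_params() for A and B and unused placeholders for gamma and sigma.

```python
from typing import List, Optional, Tuple, Union

Item = Union[float, int, str, None]
Bounds = Tuple[Optional[float], Optional[float]]


def parse_param_spec(spec: List[Item],
                     default: float) -> Tuple[float, bool, Optional[Bounds]]:
    """Return (initial_value, is_fixed, bounds) for one inner list."""
    if not 1 <= len(spec) <= 3:
        raise ValueError("an inner list must hold one, two, or three items")

    if len(spec) == 2:
        # Two items: a numerical value followed by the string "fix".
        if spec[1] != "fix":
            raise ValueError('the second of two items must be the string "fix"')
        return float(spec[0]), True, None

    # One or three items: a number or the string "default".
    value = default if spec[0] == "default" else float(spec[0])
    if len(spec) == 1:
        return value, False, None

    # Three items: lower and upper bounds, where None means unbounded.
    lower, upper = spec[1], spec[2]
    bounds = (None if lower is None else float(lower),
              None if upper is None else float(upper))
    return value, False, bounds


# The four specifications used in the text.
print(parse_param_spec([1.5], default=0.0))                # gamma
print(parse_param_spec(["default"], default=0.60222456))   # B
print(parse_param_spec([2.0951, "fix"], default=0.0))      # sigma
print(parse_param_spec([5, 1, 20], default=15.28484792))   # A
```

Separating the parsed result into a value, a fixed flag, and optional bounds mirrors the three pieces of information that the echoed output reports for each parameter.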

Training set

KLIFF has a Dataset class to deal with the training data (and test data). For the silicon training set, we can read and process the extended XYZ files by:

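from kliff.dataset import Dataset

# Directory obtained by extracting Si_training_set.tar.gz
# (assumed to be located in the current working directory)
dataset_name = "Si_training_set"

tset = Dataset()
tset.read(dataset_name)
configs = tset.get_configs()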

The configs in the last line is a list of Configuration objects. Each Configuration is an internal representation of a processed extended XYZ file, consisting of the species, coordinates, energy, forces, and other related information of a system of atoms.

Calculator

The Calculator is the central agent that exchanges information and orchestrates the fitting process. It computes a set of predictions using the model and provides them to the loss function (discussed below) to compute the loss value. It also takes the new parameters from the optimizer and updates the parameters in the model so that the up-to-date parameters are used the next time the model is evaluated. The calculator can be created by:

from kliff.calculator import Calculator

calc = Calculator(model)
calc.create(configs)

where calc.create(configs) does necessary initializations for each configuration in the training set such as creating the neighbor list.
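The role of the calculator described above (hold the model, accept new parameters from the optimizer, and produce predictions for every configuration) can be illustrated with a toy class. Everything in this sketch is hypothetical and independent of KLIFF: `ToyCalculator`, `update_params`, `compute`, and the harmonic `harmonic_energy` model are stand-ins chosen for illustration, and a configuration is reduced to a single bond length.

```python
from typing import Callable, List, Sequence


class ToyCalculator:
    """Holds a model and its parameters and produces predictions (illustrative only)."""

    def __init__(self, model: Callable[[Sequence[float], float], float],
                 params: Sequence[float]) -> None:
        self.model = model
        self.params = list(params)
        self.configs: List[float] = []

    def create(self, configs: Sequence[float]) -> None:
        # Stand-in for per-configuration setup such as building neighbor lists.
        self.configs = list(configs)

    def update_params(self, new_params: Sequence[float]) -> None:
        # Called with the optimizer's latest guess before the next evaluation.
        self.params = list(new_params)

    def compute(self) -> List[float]:
        # Evaluate the model on every configuration with the current parameters.
        return [self.model(self.params, c) for c in self.configs]


def harmonic_energy(params: Sequence[float], r: float) -> float:
    """E(r) = k * (r - r0)**2, a placeholder for a real interatomic potential."""
    k, r0 = params
    return k * (r - r0) ** 2


toy_calc = ToyCalculator(harmonic_energy, params=[1.0, 2.0])
toy_calc.create([1.5, 2.0, 2.5])
initial_predictions = toy_calc.compute()   # uses k = 1.0

toy_calc.update_params([2.0, 2.0])         # the optimizer proposes a stiffer spring
updated_predictions = toy_calc.compute()   # now uses k = 2.0
print(initial_predictions, updated_predictions)
```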

Loss function

KLIFF uses a loss function to quantify the difference between the model predictions and the corresponding reference data in the training set, and then uses an optimization algorithm to reduce the loss as much as possible. For physics-based IPs, any algorithm available through scipy.optimize.minimize or scipy.optimize.least_squares can be used. In the following code snippet, we create a loss function and then use the L-BFGS-B algorithm to minimize the loss. The minimization runs with one process and is allowed at most 100 iterations.

from kliff.loss import Loss

steps = 100
loss = Loss(calc, nprocs=1)
loss.minimize(method="L-BFGS-B", options={"disp": True, "maxiter": steps})

The output reads:

 

RUNNING THE L-BFGS-B CODE

* * *

Machine precision = 2.220D-16

N = 3 M = 10

At X0 0 variables are exactly at the bounds

At iterate 0 f= 1.65618D+07 |proj g|= 1.63611D+07
At iterate 1 f= 4.50459D+06 |proj g|= 7.90884D+06
. . .
At iterate 25 f= 3.25435D+03 |proj g|= 1.16308D+02
At iterate 26 f= 3.25435D+03 |proj g|= 3.06113D+00
At iterate 27 f= 3.25435D+03 |proj g|= 6.61066D-01

* * *

Tit = total number of iterations

Tnf = total number of function evaluations

Tnint = total number of segments explored during Cauchy searches

Skip = number of BFGS updates skipped

Nact = number of active bounds at final generalized Cauchy point

Projg = norm of the final projected gradient

* * *

N Tit Tnf Tnint Skip Nact Projg F

3 27 36 28 0 0 6.611D-01 3.254D+03

F = 3254.3480974009767

CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH

Cauchy time 0.000E+00 seconds.

Subspace minimization time 0.000E+00 seconds.

Line search time 0.000E+00 seconds.

 

As seen, the minimization converges after running for 27 steps.
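To make the quantity being minimized more concrete, the following sketch evaluates a weighted sum-of-squares loss between predictions and reference values. The exact functional form, including the factor of 0.5 and the per-item weights, is an assumption made for illustration, since this section does not spell out the loss; `sum_of_squares_loss` is a hypothetical name and not a KLIFF function.

```python
from typing import Optional, Sequence


def sum_of_squares_loss(predictions: Sequence[float],
                        references: Sequence[float],
                        weights: Optional[Sequence[float]] = None) -> float:
    """Assumed form: 0.5 * sum_i w_i * (prediction_i - reference_i)**2."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if weights is None:
        weights = [1.0] * len(predictions)  # unweighted by default
    if len(weights) != len(predictions):
        raise ValueError("weights must match the number of predictions")
    return 0.5 * sum(w * (p - q) ** 2
                     for w, p, q in zip(weights, predictions, references))


# Made-up reference energies and two sets of model predictions.
reference = [0.0, 1.0]
poor_fit = sum_of_squares_loss([0.5, 1.0], reference)
perfect_fit = sum_of_squares_loss([0.0, 1.0], reference)
print(poor_fit, perfect_fit)
```

An optimizer such as L-BFGS-B repeatedly proposes new parameters, the predictions are recomputed, and a scalar of this kind is re-evaluated until it stops decreasing.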

Save trained model

After training, it is advisable to save the model to disk so that it can be loaded later for retraining, evaluation, or other analysis. If we are satisfied with the fitted model, we can also write it as a KIM model; then it can be used with simulation codes that conform to the KIM application programming interface (API).

model.echo_fitting_params()
model.save("kliff_model.pkl")
model.write_kim_model()

The first line of the above code generates:

 

#===========================================================================
# Model parameters that are optimized.

#===========================================================================

A 1

1.5008554501462323e+01 1.0000000000000000e+00 2.0000000000000000e+01

B 1

sigma 1

2.0951000000000000e+00 fix

gamma 1

2.4122637121188939e+00

 

A comparison with the default parameters of the model shows that the fit recovers them reasonably well; for example, the fitted value of A, about 15.01, is close to the default value of 15.28. The second line saves the fitted model to disk with a file name of kliff_model.pkl, and the third line writes out a KIM model named SW_StillingerWeber_1985_Si__MO_405512056662_005_kliff_trained.

So far, we have successfully trained the physics-based SW potential for silicon. For machine learning IPs, the procedures would be largely the same. Again, refer to the KLIFF documentation for other examples.

Chapter 6

Conclusions and Future Work

Atomistic simulation with empirical interatomic potentials (IPs) is a useful computational tool to investigate materials on a microscopic level. IPs that describe the interactions between atoms and thus produce the forces governing atomic motion and deformation are arguably the most important element that determines the quality of atomistic simulations.

During my doctoral studies, I have developed both physics-based and machine learning IPs for two-dimensional (2D) materials and heterostructures and applied these IPs to study their mechanical and thermal properties. In particular, I have created a Stillinger–Weber (SW) potential for monolayer MoS2, a registry-dependent potential for the interlayer interactions in graphene, as well as a neural network (NN) potential for multilayer graphene structures. These IPs are either built on existing models to improve or correct their behavior, or built from scratch to capture the physics deemed to be important for 2D materials and heterostructures.

Atomistic simulation with IPs is often viewed as a tool limited to providing only qualitative insight. A key reason is that such simulations contain many sources of uncertainty that are difficult to quantify, so confidence intervals cannot be placed on the simulation results. A novel contribution of my work is the development and application of techniques to quantify the uncertainty in IPs themselves and in simulation results obtained using IPs. For physics-based IPs, I demonstrate how to analyze the parameter sensitivity of an IP and the uncertainty in its predictions using the Fisher information theory extended to path space. For machine learning NN potentials, I show how to apply the dropout technique to train an NN and then obtain the predictive mean and variance (the uncertainty) as ensemble averages. In addition, I have discovered the correlation between the uncertainty in atomic energies and the distance between the training set and the configurations characterizing a new problem of interest, and proposed a practical method to determine the transferability of NN potentials.
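The idea of obtaining a predictive mean and variance as ensemble averages over dropout realizations can be sketched in a few lines. The example below is a deliberately simplified illustration, not the NN potential of this thesis: it applies dropout to the inputs of a toy linear model, and the names `dropout_forward` and `predictive_mean_and_variance`, the keep probability, and the sample count are all assumptions.

```python
import random
from statistics import mean, pvariance
from typing import Sequence, Tuple


def dropout_forward(x: Sequence[float], weights: Sequence[float],
                    keep_prob: float, rng: random.Random) -> float:
    """One stochastic pass of a toy linear model with (inverted) dropout."""
    total = 0.0
    for xi, wi in zip(x, weights):
        if rng.random() < keep_prob:
            # Rescale kept terms so the expected output matches the full model.
            total += wi * xi / keep_prob
    return total


def predictive_mean_and_variance(x: Sequence[float], weights: Sequence[float],
                                 keep_prob: float = 0.9, n_samples: int = 200,
                                 seed: int = 0) -> Tuple[float, float]:
    """Average many stochastic passes; the variance serves as the uncertainty."""
    rng = random.Random(seed)
    samples = [dropout_forward(x, weights, keep_prob, rng)
               for _ in range(n_samples)]
    return mean(samples), pvariance(samples)


x, w = [1.0, 2.0], [0.5, 0.25]
print(predictive_mean_and_variance(x, w, keep_prob=1.0))  # no dropout: zero variance
print(predictive_mean_and_variance(x, w, keep_prob=0.5))  # dropout: nonzero variance
```

Each stochastic pass plays the role of one member of the ensemble, so no separate models need to be trained to estimate the spread.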

I have also developed an open-source KIM-based learning-integrated fitting framework (KLIFF) to train both physics-based and machine learning IPs. KLIFF integrates closely with the knowledgebase of interatomic models (KIM) ecosystem to either use the physics-based IPs archived in the open knowledgebase of interatomic models (OpenKIM) repository or deploy the trained model via the KIM application programming interface (API). It supports a variety of atomic environment descriptors and machine learning regression methods, and specifically PyTorch is used internally to provide state-of-the-art deep learning techniques for NN models. In addition, KLIFF provides a number of tools to assess the quality of IPs, such as computing the Fisher information matrix (FIM). I hope other researchers will find KLIFF useful when developing their own IPs.

There are many open directions for future research:

Potentials for other 2D materials and heterostructures. In this thesis, I present NN potentials for multilayer graphene structures. In the future, I intend to build NN potentials and perform large-scale simulations for other 2D materials and heterostructures. One interesting problem is to investigate how heterostructures behave when different types of 2D materials are stacked on top of each other and whether new physics emerges when they are stacked in different manners.

Active learning. The IPs in this thesis are fit to predetermined training sets that are believed to be important for the problems of interest. Constructing such training sets is a nontrivial task, especially for machine learning models, because we may not know a priori the configurations that carry the information needed by a problem of interest. With the uncertainty quantification capability of dropout NN, we are interested in applying active learning to generate the training set automatically. First, we train an NN model to a preliminary training set. Next, the trained NN model is applied to make predictions for the problem of interest, and at the same time we measure the associated uncertainty in the predictions. Then, we add the configurations with large uncertainty back to the training set and retrain the model against the updated training set. We repeat this cycle of training, uncertainty quantification, and training set enrichment until the uncertainty in the predictions is below a threshold value. With this active learning technique, training an NN potential can be largely, if not fully, automated.
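The training, uncertainty quantification, and enrichment cycle described above can be written as a short loop. This is a schematic sketch under stated assumptions rather than an implemented workflow: `active_learning`, `train`, and `predict_with_uncertainty` are hypothetical names, configurations are reduced to single numbers, and the toy uncertainty is simply the distance to the nearest training point (a stand-in for the dropout variance).

```python
from typing import Callable, List, Sequence, Tuple


def active_learning(train: Callable[[List[float]], object],
                    predict_with_uncertainty: Callable[[object, float], Tuple[float, float]],
                    training_set: Sequence[float],
                    candidates: Sequence[float],
                    threshold: float,
                    max_rounds: int = 10) -> Tuple[object, List[float]]:
    """Enrich the training set until all uncertainties fall below the threshold."""
    training_set = list(training_set)
    model = train(training_set)
    for _ in range(max_rounds):
        # Collect configurations whose predicted uncertainty is too large.
        uncertain = [c for c in candidates
                     if predict_with_uncertainty(model, c)[1] > threshold]
        if not uncertain:
            break
        # In practice, reference energies and forces would be computed here.
        training_set.extend(c for c in uncertain if c not in training_set)
        model = train(training_set)
    return model, training_set


def toy_train(data: List[float]) -> List[float]:
    # Toy "model": it simply remembers the training points.
    return list(data)


def toy_predict(model: List[float], c: float) -> Tuple[float, float]:
    # Toy prediction 0.0; uncertainty is the distance to the nearest training point.
    return 0.0, min(abs(c - t) for t in model)


final_model, final_set = active_learning(
    toy_train, toy_predict,
    training_set=[0.0], candidates=[0.0, 0.4, 1.0, 3.0], threshold=0.5)
print(final_set)
```

The `max_rounds` cap is a safeguard added for the sketch so that the loop terminates even if some uncertainties never drop below the threshold.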

Faster atomic environment descriptors. Although significantly faster than first-principles approaches, machine learning IPs are more computationally expensive than physics-based IPs. For example, our NN potential that uses the symmetry functions as the atomic environment descriptor is about 100 times slower than a Tersoff [113] potential. In fact, most of the evaluation time of a machine learning IP is spent on computing the atomic environment descriptor. We have initiated an effort to design new atomic environment descriptors that not only satisfy all the requirements discussed in section 3.1, but are also computationally far less expensive. Preliminary results suggest that the Gabor transformation [288] of the atomic density function is a promising method.

Extending KLIFF. For now, KLIFF only allows the use of energy, forces, and stress as the training target. We are planning to extend the framework such that any materials property can be employed in the training, for example, equilibrium lattice parameters, elastic moduli, and phonon dispersion curves, to name a few. This is especially useful for physics-based IPs, where the number of parameters is small enough that we can afford to compute these properties during the training process. We also hope to support more physics-based IPs and machine learning regression methods.

Bibliography

[1] N. R. Council. Materials and Man’s Needs: Materials Science and Engineering – Volume I, The History, Scope, and Nature of Materials Science and Engineering. The National Academies Press, Washington, DC, 1975.

[2] L. A. Dobrzański. Significance of materials science for the future development of societies. J. Mater. Process. Technol., 175(1-3):133–148, 2006.

[3] N. A. Spaldin. Fundamental materials research and the course of human civilization. arXiv preprint arXiv:1708.01325, 2017.

[4] D. L. Schodek, P. Ferreira, and M. F. Ashby. Nanomaterials, nanotechnologies and design: an introduction for engineers and architects. Butterworth-Heinemann, 2009.

[5] J. Ramsden. Nanotechnology: An Introduction. Micro and Nano Technologies. Elsevier Science, 2011.

[6] W. H. Hunt. Nanomaterials: Nomenclature, novelty, and necessity. JOM, 56(10):13–18, 2004.

[7] G. Binnig and H. Rohrer. Scanning tunneling microscopy. Surf. Sci., 126(1-3):236–244, 1983.

[8] G. Binnig, C. F. Quate, and C. Gerber. Atomic force microscope. Phys. Rev. Lett., 56(9):930, 1986.

[9] F. Cailliez and P. Pernot. Statistical approaches to forcefield calibration and prediction uncertainty in molecular simulation. J. Chem. Phys., 134(5):054124, 2011.

[10] P. Angelikopoulos, C. Papadimitriou, and P. Koumoutsakos. Bayesian uncertainty quantification and propagation in molecular dynamics simulations: a high performance computing framework. J. Chem. Phys., 137(14):144103, 2012.

[11] R. Z. Khaliullin, H. Eshet, T. D. Kühne, J. Behler, and M. Parrinello. Nucleation mechanism for the direct graphite-to-diamond phase transition. Nat. Mater., 10(9):693, 2011.

[12] K. Chenoweth, A. C. Van Duin, and W. A. Goddard. ReaxFF reactive force field for molecular dynamics simulations of hydrocarbon oxidation. J. Phys. Chem. A, 112(5):1040–1053, 2008.

[13] S. Piana, K. Lindorff-Larsen, and D. E. Shaw. Protein folding kinetics and thermodynamics from atomistic simulation. Proc. Natl. Acad. Sci., 109(44):17845–17850, 2012.

[14] R. A. Messerly, T. A. Knotts, and W. V. Wilding. Uncertainty quantification and propagation of errors of the Lennard-Jones 12-6 parameters for n-alkanes. J. Chem. Phys., 146(19):194110, 2017.

[15] K. S. Novoselov, A. K. Geim, S. V. Morozov, D. Jiang, Y. Zhang, S. V. Dubonos, I. V. Grigorieva, and A. A. Firsov. Electric field effect in atomically thin carbon films. Science, 306(5696):666–669, 2004.

[16] D. Griffiths. Introduction to Quantum Mechanics. Pearson international edition. Pearson Prentice Hall, 2005.

[17] E. B. Tadmor and R. E. Miller. Modeling materials: continuum, atomistic and multiscale techniques. Cambridge University Press, 2011.

[18] M. Ishigami, J. Chen, W. Cullen, M. Fuhrer, and E. Williams. Atomic structure of graphene on SiO2. Nano Lett., 7(6):1643–1648, 2007.

[19] E. Fradkin. Critical behavior of disordered degenerate semiconductors. II. Spectrum and transport properties in mean-field theory. Phys. Rev. B, 33(5):3263, 1986.

[20] The Nobel Prize in physics 2010. https://www.nobelprize.org/prizes/physics/2010/summary/. Retrieved: 2019-03-08.

[21] Y. Hernandez, V. Nicolosi, M. Lotya, F. M. Blighe, Z. Sun, S. De, I. McGovern, B. Holland, M. Byrne, Y. K. Gun’Ko, et al. High-yield production of graphene by liquid-phase exfoliation of graphite. Nat. Nanotechnol., 3(9):563, 2008.

[22] K. V. Emtsev, A. Bostwick, K. Horn, J. Jobst, G. L. Kellogg, L. Ley, J. L. McChesney, T. Ohta, S. A. Reshanov, J. Röhrl, et al. Towards wafer-size graphene layers by atmospheric pressure graphitization of silicon carbide. Nat. Mater., 8(3):203, 2009.

[23] X. Li, W. Cai, J. An, S. Kim, J. Nah, D. Yang, R. Piner, A. Velamakanni, I. Jung, E. Tutuc, et al. Large-area synthesis of high-quality and uniform graphene films on copper foils. Science, 324(5932):1312–1314, 2009.

[24] W. Kern and G. L. Schnable. Low-pressure chemical vapor deposition for very large-scale integration processing—a review. IEEE Trans. Electron Devices, 26(4):647–657, 1979.

[25] A. H. C. Neto, F. Guinea, N. M. R. Peres, K. S. Novoselov, and A. K. Geim. The electronic properties of graphene. Rev. Mod. Phys., 81(1):109–162, 2009.

[26] C. Berger, Z. Song, X. Li, X. Wu, N. Brown, C. Naud, D. Mayou, T. Li, J. Hass, A. N. Marchenkov, et al. Electronic confinement and coherence in patterned epitaxial graphene. Science, 312(5777):1191–1196, 2006.

[27] Y. Zhang, Y.-W. Tan, H. L. Stormer, and P. Kim. Experimental observation of the quantum Hall effect and Berry’s phase in graphene. Nature, 438(7065):201–204, 2005.

[28] S.-E. Zhu, S. Yuan, and G. Janssen. Optical transmittance of multilayer graphene. EPL (Europhysics Letters), 108(1):17007, 2014.

[29] R. R. Nair, P. Blake, A. N. Grigorenko, K. S. Novoselov, T. J. Booth, T. Stauber, N. M. Peres, and A. K. Geim. Fine structure constant defines visual transparency of graphene. Science, 320(5881):1308–1308, 2008.

[30] D. E. Sheehy and J. Schmalian. Optical transparency of graphene as determined by the fine-structure constant. Phys. Rev. B, 80(19):193411, 2009.

[31] W. Cai, A. L. Moore, Y. Zhu, X. Li, S. Chen, L. Shi, and R. S. Ruoff. Thermal transport in suspended and supported monolayer graphene grown by chemical vapor deposition. Nano Lett., 10(5):1645–1651, 2010.

[32] C. Faugeras, B. Faugeras, M. Orlita, M. Potemski, R. R. Nair, and A. Geim. Thermal conductivity of graphene in Corbino membrane geometry. ACS Nano, 4(4):1889–1892, 2010.

[33] X. Xu, L. F. Pereira, Y. Wang, J. Wu, K. Zhang, X. Zhao, S. Bae, C. T. Bui, R. Xie, J. T. Thong, et al. Length-dependent thermal conductivity in suspended single-layer graphene. Nat. Commun., 5:3689, 2014.

[34] J.-U. Lee, D. Yoon, H. Kim, S. W. Lee, and H. Cheong. Thermal conductivity of suspended pristine graphene measured by Raman spectroscopy. Phys. Rev. B, 83(8):081419, 2011.

[35] A. Yousefzadi Nobakht and S. Shin. Anisotropic control of thermal transport in graphene/Si heterostructures. J. Appl. Phys., 120(22):225111, 2016.

[36] J. Los, K. Zakharchenko, M. Katsnelson, and A. Fasolino. Melting temperature of graphene. Phys. Rev. B, 91(4):045415, 2015.

[37] List of thermal conductivities. https://en.wikipedia.org/wiki/List_of_thermal_conductivities. Retrieved: 2019-03-08.

[38] C. Lee, X. Wei, J. W. Kysar, and J. Hone. Measurement of the elastic properties and intrinsic strength of monolayer graphene. Science, 321(5887):385–388, 2008.

[39] H. C. Schulitz, W. Sobek, and K. J. Habermann. Steel Construction Manual. Walter de Gruyter, 2012.

[40] A. K. Geim and I. V. Grigorieva. Van der Waals heterostructures. Nature, 499(7459):419–425, 2013.

[41] A. Jain, S. P. Ong, G. Hautier, W. Chen, W. D. Richards, S. Dacek, S. Cholia, D. Gunter, D. Skinner, G. Ceder, and K. A. Persson. The Materials Project: A materials genome approach to accelerating materials innovation. APL Mater., 1(1):011002, 2013.

[42] M. Ashton, J. Paul, S. B. Sinnott, and R. G. Hennig. Topology-scaling identification of layered solids and stable exfoliated 2D materials. Phys. Rev. Lett., 118(10):106101, 2017.

[43] K. S. Novoselov, A. Mishchenko, A. Carvalho, and A. H. C. Neto. 2D materials and van der Waals heterostructures. Science, 353(6298):aac9439, 2016.

[44] Y. Zhang, T.-T. Tang, C. Girit, Z. Hao, M. C. Martin, A. Zettl, M. F. Crommie, Y. R. Shen, and F. Wang. Direct observation of a widely tunable bandgap in bilayer graphene. Nature, 459(7248):820–823, 2009.

[45] Y. Cao, V. Fatemi, S. Fang, K. Watanabe, T. Taniguchi, E. Kaxiras, and P. Jarillo-Herrero. Unconventional superconductivity in magic-angle graphene superlattices. Nature, 556(7699):43–50, 2018.

[46] L. Ponomarenko, F. Schedin, M. Katsnelson, R. Yang, E. Hill, K. Novoselov, and A. Geim. Chaotic Dirac billiard in graphene quantum dots. Science, 320(5874):356–358, 2008.

[47] J. Kedzierski, P.-L. Hsu, P. Healey, P. W. Wyatt, C. L. Keast, M. Sprinkle, C. Berger, and W. A. De Heer. Epitaxial graphene transistors on SiC substrates. IEEE Trans. Electron Devices, 55(8):2078–2085, 2008.

[48] Y.-M. Lin, C. Dimitrakopoulos, K. A. Jenkins, D. B. Farmer, H.-Y. Chiu, A. Grill, and P. Avouris. 100-GHz transistors from wafer-scale epitaxial graphene. Science, 327(5966):662–662, 2010.

[49] L. Yang, T. Hu, R. Hao, C. Qiu, C. Xu, H. Yu, Y. Xu, X. Jiang, Y. Li, and J. Yang. Low-chirp high-extinction-ratio modulator based on graphene–silicon waveguide. Opt. Lett., 38(14):2512–2515, 2013.

[50] X. Li, M. Zhu, M. Du, Z. Lv, L. Zhang, Y. Li, Y. Yang, T. Yang, X. Li, K. Wang, H. Zhu, and Y. Fang. High detectivity graphene-silicon heterojunction photodetector. Small, 12(5):595–601, 2015.

[51] X. Li, H. Zhu, K. Wang, A. Cao, J. Wei, C. Li, Y. Jia, Z. Li, X. Li, and D. Wu.
