Accelerating the Training and Improving the Reliability of Machine-Learned Interatomic Potentials for Strongly Anharmonic Materials through Active Learning
Kisung Kang, Thomas A. R. Purcell, Christian Carbogno, Matthias Scheffler
arXiv - PHYS - Materials Science, arXiv:2409.11808, published 2024-09-18
Abstract
Molecular dynamics (MD) employing machine-learned interatomic potentials
(MLIPs) serves as an efficient, urgently needed complement to ab initio
molecular dynamics (aiMD). When these potentials are trained on data generated
from ab initio methods, their averaged predictions can exhibit performance
comparable to ab initio methods at a fraction of the cost. However,
insufficient training sets might lead to an improper description of the
dynamics in strongly anharmonic materials, because critical effects might be
overlooked in relevant cases, or only incorrectly captured, or hallucinated by
the MLIP when they are not actually present. In this work, we show that an
active learning scheme that combines MD with MLIPs (MLIP-MD) and uncertainty
estimates can avoid such problematic predictions. In short, efficient MLIP-MD
is used to explore configuration space quickly, whereby an acquisition function
based on uncertainty estimates and on energetic viability is employed to
maximize the value of the newly generated data and to focus on the most
unfamiliar but reasonably accessible regions of phase space. To verify our
methodology, we screen over 112 materials and identify 10 examples experiencing
the aforementioned problems. Using CuI and AgGaSe$_2$ as archetypes for these
problematic materials, we discuss the physical implications for strongly
anharmonic effects and demonstrate how the developed active learning scheme can
address these issues.
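
The acquisition step described above can be illustrated with a minimal Python sketch. The abstract does not specify the form of the uncertainty estimate or of the energetic-viability criterion, so everything here is an assumption made for illustration: the uncertainty is taken to be the spread of energies predicted by an ensemble of MLIPs, viability is taken to be an energy window above a reference energy, and the function name, parameters, and numbers are hypothetical rather than the authors' implementation.

```python
from statistics import mean, pstdev


def select_for_labeling(energy_predictions, reference_energy,
                        energy_window, uncertainty_threshold):
    """Pick MLIP-MD snapshots to recompute with the ab initio method.

    energy_predictions[i] lists the energies (eV/atom) that each member of a
    hypothetical MLIP ensemble predicts for snapshot i.
    """
    candidates = []
    for index, member_energies in enumerate(energy_predictions):
        # Assumption: ensemble spread serves as the uncertainty estimate.
        uncertainty = pstdev(member_energies)
        # Assumption: a snapshot is "reasonably accessible" if its mean
        # predicted energy lies within energy_window of the reference.
        excess_energy = mean(member_energies) - reference_energy
        is_unfamiliar = uncertainty > uncertainty_threshold
        is_accessible = excess_energy <= energy_window
        if is_unfamiliar and is_accessible:
            candidates.append((uncertainty, index))
    # Most uncertain snapshots first.
    return [index for _, index in sorted(candidates, reverse=True)]


# Toy data: four snapshots, three ensemble members each (made-up numbers).
snapshots = [
    [0.010, 0.011, 0.009],  # small spread: already familiar
    [0.020, 0.060, 0.100],  # large spread, low energy
    [0.900, 1.200, 1.500],  # large spread, but energetically out of reach
    [0.030, 0.050, 0.070],  # moderate spread, low energy
]
picked = select_for_labeling(snapshots, reference_energy=0.0,
                             energy_window=0.2, uncertainty_threshold=0.01)
print(picked)
```

In an active learning loop of this kind, the selected snapshots would be recomputed ab initio, added to the training set, and the MLIP retrained before the next round of MLIP-MD exploration. The energy window is what keeps the selection away from configurations that are uncertain only because they are unphysical.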