Renzhe Li, Jiaqi Wang, Akksay Singh, Bai Li, Zichen Song, Chuan Zhou, Lei Li
{"title":"Automatic Feature Selection for Atom-Centered Neural Network Potentials Using a Gradient Boosting Decision Algorithm.","authors":"Renzhe Li, Jiaqi Wang, Akksay Singh, Bai Li, Zichen Song, Chuan Zhou, Lei Li","doi":"10.1021/acs.jctc.4c01176","DOIUrl":null,"url":null,"abstract":"<p><p>Atom-centered neural network (ANN) potentials have shown high accuracy and computational efficiency in modeling atomic systems. A crucial step in developing reliable ANN potentials is the proper selection of atom-centered symmetry functions (ACSFs), also known as atomic features, to describe atomic environments. Inappropriate selection of ACSFs can lead to poor-quality ANN potentials. Here, we propose a gradient boosting decision tree (GBDT)-based framework for the automatic selection of optimal ACSFs. This framework takes uniformly distributed sets of ACSFs as input and evaluates their relative importance. The ACSFs with high average importance scores are selected and used to train an ANN potential. We applied this method to the Ge system, resulting in an ANN potential with root-mean-square errors (RMSE) of 10.2 meV/atom for energy and 84.8 meV/Å for force predictions, utilizing only 18 ACSFs to achieve a balance between accuracy and computational efficiency. The framework is validated using the grid searching method, demonstrating that ACSFs selected with our framework are in the optimal region. Furthermore, we also compared our method with commonly used feature selection algorithms. The results show that our algorithm outperforms the others in terms of effectiveness and accuracy. This study highlights the significance of the ACSF parameter effect on the ANN performance and presents a promising method for automatic ACSF selection, facilitating the development of machine learning potentials.</p>","PeriodicalId":45,"journal":{"name":"Journal of Chemical Theory and Computation","volume":" ","pages":""},"PeriodicalIF":5.7000,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Chemical Theory and Computation","FirstCategoryId":"92","ListUrlMain":"https://doi.org/10.1021/acs.jctc.4c01176","RegionNum":1,"RegionCategory":"化学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"CHEMISTRY, PHYSICAL","Score":null,"Total":0}
引用次数: 0
Abstract
Atom-centered neural network (ANN) potentials have shown high accuracy and computational efficiency in modeling atomic systems. A crucial step in developing reliable ANN potentials is the proper selection of atom-centered symmetry functions (ACSFs), also known as atomic features, to describe atomic environments. Inappropriate selection of ACSFs can lead to poor-quality ANN potentials. Here, we propose a gradient boosting decision tree (GBDT)-based framework for the automatic selection of optimal ACSFs. This framework takes uniformly distributed sets of ACSFs as input and evaluates their relative importance. The ACSFs with high average importance scores are selected and used to train an ANN potential. We applied this method to the Ge system, resulting in an ANN potential with root-mean-square errors (RMSE) of 10.2 meV/atom for energy and 84.8 meV/Å for force predictions, utilizing only 18 ACSFs to achieve a balance between accuracy and computational efficiency. The framework is validated using the grid searching method, demonstrating that ACSFs selected with our framework are in the optimal region. Furthermore, we also compared our method with commonly used feature selection algorithms. The results show that our algorithm outperforms the others in terms of effectiveness and accuracy. This study highlights the significance of the ACSF parameter effect on the ANN performance and presents a promising method for automatic ACSF selection, facilitating the development of machine learning potentials.
期刊介绍:
The Journal of Chemical Theory and Computation invites new and original contributions with the understanding that, if accepted, they will not be published elsewhere. Papers reporting new theories, methodology, and/or important applications in quantum electronic structure, molecular dynamics, and statistical mechanics are appropriate for submission to this Journal. Specific topics include advances in or applications of ab initio quantum mechanics, density functional theory, design and properties of new materials, surface science, Monte Carlo simulations, solvation models, QM/MM calculations, biomolecular structure prediction, and molecular dynamics in the broadest sense including gas-phase dynamics, ab initio dynamics, biomolecular dynamics, and protein folding. The Journal does not consider papers that are straightforward applications of known methods including DFT and molecular dynamics. The Journal favors submissions that include advances in theory or methodology with applications to compelling problems.