Gaussian Processes for Finite Size Extrapolation of Many-Body Simulations

IF 3.3 | Zone 3, Chemistry | Q2 Chemistry, Physical | Faraday Discussions | Pub Date: 2024-03-25 | DOI: 10.1039/d4fd00051j

Edgar Josue Landinez-Borda, Kenneth O. Berard, Annette Lopez, Brenda M Rubenstein


Abstract

Accurately modeling the properties of realistic materials requires predicting those properties in the thermodynamic limit. However, because most many-body electronic structure methods scale as a high-order polynomial, or even exponentially, with system size, directly simulating systems large enough to approach the thermodynamic limit rapidly becomes computationally intractable. As a result, researchers typically estimate the properties of large systems by extrapolating the properties of smaller, computationally accessible systems using relatively simple scaling expressions. In this work, we employ Gaussian processes to extrapolate many-body simulations to their thermodynamic limit more accurately and efficiently. We train our Gaussian processes on Smooth Overlap of Atomic Positions (SOAP) descriptors to extrapolate the energies of one-dimensional hydrogen chains obtained using two high-accuracy many-body methods: Coupled Cluster theory and Auxiliary Field Quantum Monte Carlo (AFQMC). In so doing, we show that Gaussian processes trained on relatively short chains of 10 to 30 atoms can predict the energies of both homogeneous and inhomogeneous hydrogen chains in their thermodynamic limit with sub-millihartree accuracy. Unlike standard scaling expressions, our Gaussian process regression (GPR) approach is highly generalizable given representative training data and does not depend on the systems' geometries or dimensionality. This work highlights the potential for machine learning to correct for the finite-size effects that routinely complicate the interpretation of many-body simulations.
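To make the idea of a Gaussian process extrapolation concrete, the following is a minimal sketch and not the authors' implementation. It is a toy stand-in under several assumptions: the input feature is simply 1/N (the paper uses SOAP descriptors), the per-atom energies are synthetic values generated from a hypothetical limit `E_INF` and a made-up 1/N² correction (not Coupled Cluster or AFQMC data), and the kernel hyperparameters are hand-picked rather than optimized. The function names `rbf_kernel` and `gp_predict` are illustrative only.

```python
import numpy as np


def rbf_kernel(a, b, length_scale, signal_var):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)


def gp_predict(x_train, y_train, x_test,
               length_scale=0.1, signal_var=1e-4, noise_var=1e-10):
    """Posterior mean and standard deviation of a GP with a constant prior mean.

    The hyperparameter defaults are hand-picked assumptions for this toy data.
    """
    prior_mean = y_train.mean()
    k_tt = rbf_kernel(x_train, x_train, length_scale, signal_var)
    k_tt = k_tt + noise_var * np.eye(len(x_train))  # jitter for stability
    k_st = rbf_kernel(x_test, x_train, length_scale, signal_var)

    # Solve (K + noise*I) alpha = y - prior_mean via a Cholesky factorization.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train - prior_mean))
    mean = prior_mean + k_st @ alpha

    # Posterior variance: k(x*, x*) - k*^T (K + noise*I)^{-1} k*.
    v = np.linalg.solve(chol, k_st.T)
    var = signal_var - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))


# Synthetic per-atom energies for chains of 10 to 30 atoms (assumed form).
E_INF = -0.5                      # hypothetical thermodynamic-limit value
n_atoms = np.arange(10, 32, 2)    # 10, 12, ..., 30
x_train = 1.0 / n_atoms
y_train = E_INF + 0.3 * x_train**2

# Extrapolate to 1/N = 0, i.e. the thermodynamic limit of the toy model.
mean, std = gp_predict(x_train, y_train, np.array([0.0]))
print(f"GP estimate at 1/N = 0: {mean[0]:.6f} +/- {std[0]:.1e}")
print(f"Deviation from the synthetic limit: {abs(mean[0] - E_INF):.1e}")
```

One reason to prefer a Gaussian process over a fixed scaling expression in a sketch like this is that the posterior standard deviation gives an uncertainty estimate alongside the extrapolated value. With real data, the hyperparameters would normally be fitted (for example by maximizing the marginal likelihood) rather than set by hand.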

