Balancing Wigner sampling and geometry interpolation for deep neural networks learning photochemical reactions

Li Wang, Zhendong Li, Jingbai Li
Journal: Artificial intelligence chemistry. Published 2023-10-16. DOI: 10.1016/j.aichem.2023.100018
Article: https://www.sciencedirect.com/science/article/pii/S2949747723000180

Abstract

Machine learning photodynamics simulations are revolutionary tools for resolving elusive photochemical reaction mechanisms with time-dependent, high-fidelity structural information. Despite recent advances in neural network (NN) potentials, a general rule for designing training data with Wigner sampling and geometry interpolation to learn photochemical reaction mechanisms is still lacking. We present an in-depth investigation of the relationship between the accuracy of multilayer NNs and the combinations of training data based on Wigner sampling and geometry interpolation, using model photochemical reactions of the [3]-ladderdiene systems. The NNs trained with Wigner sampling data show underfitting, with NN errors increasing with structural complexity and diversity. The NNs trained with composite Wigner sampling and geometry interpolation data show errors reduced by one order of magnitude, suggesting an essential role of geometry interpolation in helping NNs learn the potential energy surfaces. However, increasing the number of interpolation steps results in overfitting if the Wigner-sampled configuration space is narrowed. Correlating the mean absolute errors (MAEs) of the NN-predicted energies for the sampled and out-of-sample structures shows an optimal combination ratio of 100:10 between the Wigner sampling structures and geometry interpolation steps for 1000 training data, where the MAE of the sampled structures achieves chemical accuracy while the MAE of the out-of-sample structures is minimized. The NNs trained with the optimally combined data can detect out-of-sample structures in adaptive sampling, with a positive correlation between the maximum standard deviation and the MAE of the predicted energies. Collectively, our findings suggest a general rule for designing the training data for ML photodynamics.
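The 100:10 combination for 1000 training data can be read as 100 Wigner-sampled structures combined with 10 interpolation steps. The Python sketch below illustrates that reading only, and is not the authors' implementation. It assumes linear interpolation in Cartesian coordinates, uses Gaussian noise as a stand-in for real Wigner-sampled displacements, and assumes the standard deviation is taken across an ensemble of NN predictions. All function names (`interpolate_path`, `composite_training_set`, `mean_absolute_error`, `max_ensemble_std`) are hypothetical.

```python
import numpy as np


def interpolate_path(start, end, n_steps):
    """Linearly interpolate Cartesian coordinates from start to end (inclusive)."""
    fractions = np.linspace(0.0, 1.0, n_steps)
    return np.stack([(1.0 - t) * start + t * end for t in fractions])


def composite_training_set(start, end, displacements, n_steps):
    """Add every sampled displacement to every interpolated geometry."""
    path = interpolate_path(start, end, n_steps)  # (n_steps, n_atoms, 3)
    # Broadcast (n_samples, 1, n_atoms, 3) + (1, n_steps, n_atoms, 3)
    combined = displacements[:, None] + path[None, :]
    return combined.reshape(-1, *start.shape)


def mean_absolute_error(predicted, reference):
    """Mean absolute error between predicted and reference energies."""
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(reference))))


def max_ensemble_std(ensemble_predictions):
    """Largest standard deviation over states for each structure.

    ensemble_predictions has shape (n_models, n_structures, n_states);
    the use of a model ensemble here is an assumption.
    """
    return np.std(ensemble_predictions, axis=0).max(axis=-1)


rng = np.random.default_rng(0)
n_atoms = 4  # toy system, not the [3]-ladderdiene
reactant = rng.normal(size=(n_atoms, 3))
product = rng.normal(size=(n_atoms, 3))
# Placeholder displacements: Gaussian noise, NOT a true Wigner distribution
displacements = rng.normal(scale=0.05, size=(100, n_atoms, 3))

training = composite_training_set(reactant, product, displacements, n_steps=10)
print(training.shape)  # (1000, 4, 3)
```

In an adaptive sampling loop, structures whose `max_ensemble_std` value exceeds a chosen threshold would be flagged as out-of-sample and sent for reference calculations. The threshold itself is not given in the abstract.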
