在评估构象能量时,哪些分子可以挑战密度函数紧密结合方法? 利用机器学习工具集进行研究

Pub Date : 2024-03-26 DOI:10.1063/10.0024962
Andrii Terets, Tymofii Nikolaienko
{"title":"在评估构象能量时,哪些分子可以挑战密度函数紧密结合方法? 利用机器学习工具集进行研究","authors":"Andrii Terets, Tymofii Nikolaienko","doi":"10.1063/10.0024962","DOIUrl":null,"url":null,"abstract":"Large organic molecules and biomolecules can adopt multiple conformations, with the occurrences determined by their relative energies. Identifying the energetically most favorable conformations is crucial, especially when interpreting spectroscopic experiments conducted under cryogenic conditions. When the effects of irregular surrounding medium, such as noble gas matrices, on the vibrational properties of molecules become important, semi-empirical (SE) quantum-chemical methods are often employed for computational simulations. Although SE methods are computationally more efficient than first-principle quantum-chemical methods, they can be inaccurate in determining the energies of conformers in some molecules while displaying good accuracy in others. In this study, we employ a combination of advanced machine learning techniques, such as graph neural networks, to identify molecules with the highest errors in the relative energies of conformers computed by the semi-empirical tight-binding method GFN1-xTB. The performance of three different machine learning models is assessed by comparing their predicted errors with the actual errors in conformer energies obtained via the GFN1-xTB method. We further applied the ensemble machine-learning model to a larger collection of molecules from the ChEMBL database and identified a set of molecules as being challenging for the GFN1-xTB method. These molecules hold potential for further improvement of the GFN1-xTB method, showcasing the capability of machine learning models in identifying molecules that can challenge its physical model.","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Which molecules can challenge density-functional tight-binding methods in evaluating the energies of conformers? investigation with machine-learning toolset\",\"authors\":\"Andrii Terets, Tymofii Nikolaienko\",\"doi\":\"10.1063/10.0024962\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Large organic molecules and biomolecules can adopt multiple conformations, with the occurrences determined by their relative energies. Identifying the energetically most favorable conformations is crucial, especially when interpreting spectroscopic experiments conducted under cryogenic conditions. When the effects of irregular surrounding medium, such as noble gas matrices, on the vibrational properties of molecules become important, semi-empirical (SE) quantum-chemical methods are often employed for computational simulations. Although SE methods are computationally more efficient than first-principle quantum-chemical methods, they can be inaccurate in determining the energies of conformers in some molecules while displaying good accuracy in others. In this study, we employ a combination of advanced machine learning techniques, such as graph neural networks, to identify molecules with the highest errors in the relative energies of conformers computed by the semi-empirical tight-binding method GFN1-xTB. The performance of three different machine learning models is assessed by comparing their predicted errors with the actual errors in conformer energies obtained via the GFN1-xTB method. We further applied the ensemble machine-learning model to a larger collection of molecules from the ChEMBL database and identified a set of molecules as being challenging for the GFN1-xTB method. These molecules hold potential for further improvement of the GFN1-xTB method, showcasing the capability of machine learning models in identifying molecules that can challenge its physical model.\",\"PeriodicalId\":0,\"journal\":{\"name\":\"\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0,\"publicationDate\":\"2024-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"\",\"FirstCategoryId\":\"101\",\"ListUrlMain\":\"https://doi.org/10.1063/10.0024962\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"101","ListUrlMain":"https://doi.org/10.1063/10.0024962","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

大型有机分子和生物大分子可以采用多种构象,其发生率取决于它们的相对能量。识别能量上最有利的构象至关重要,尤其是在解释低温条件下进行的光谱实验时。当周围不规则介质(如惰性气体基质)对分子振动特性的影响变得重要时,通常会采用半经验(SE)量子化学方法进行计算模拟。虽然半经验量子化学方法在计算上比第一原理量子化学方法更有效率,但在确定某些分子的构象能量时可能不准确,而在另一些分子中却表现出很好的准确性。在本研究中,我们结合使用了图神经网络等先进的机器学习技术,以识别在半经验紧密结合方法 GFN1-xTB 计算的构象相对能量中误差最大的分子。通过比较三种不同机器学习模型的预测误差和通过 GFN1-xTB 方法获得的构象能量的实际误差,评估了它们的性能。我们进一步将集合机器学习模型应用于 ChEMBL 数据库中更多的分子集合,并确定了一组对 GFN1-xTB 方法具有挑战性的分子。这些分子具有进一步改进 GFN1-xTB 方法的潜力,展示了机器学习模型识别可能挑战其物理模型的分子的能力。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
Which molecules can challenge density-functional tight-binding methods in evaluating the energies of conformers? investigation with machine-learning toolset
Large organic molecules and biomolecules can adopt multiple conformations, with the occurrences determined by their relative energies. Identifying the energetically most favorable conformations is crucial, especially when interpreting spectroscopic experiments conducted under cryogenic conditions. When the effects of irregular surrounding medium, such as noble gas matrices, on the vibrational properties of molecules become important, semi-empirical (SE) quantum-chemical methods are often employed for computational simulations. Although SE methods are computationally more efficient than first-principle quantum-chemical methods, they can be inaccurate in determining the energies of conformers in some molecules while displaying good accuracy in others. In this study, we employ a combination of advanced machine learning techniques, such as graph neural networks, to identify molecules with the highest errors in the relative energies of conformers computed by the semi-empirical tight-binding method GFN1-xTB. The performance of three different machine learning models is assessed by comparing their predicted errors with the actual errors in conformer energies obtained via the GFN1-xTB method. We further applied the ensemble machine-learning model to a larger collection of molecules from the ChEMBL database and identified a set of molecules as being challenging for the GFN1-xTB method. These molecules hold potential for further improvement of the GFN1-xTB method, showcasing the capability of machine learning models in identifying molecules that can challenge its physical model.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1