Automatic molecular fragmentation by evolutionary optimisation

Journal of Cheminformatics, 16(1) · IF 7.1 · Zone 2 (Chemistry) · JCR Q1 (Chemistry, Multidisciplinary) · Pub Date: 2024-08-19 · DOI: 10.1186/s13321-024-00896-z
Fiona C. Y. Yu, Jorge L. Gálvez Vallejo, Giuseppe M. J. Barca
Citations: 0

Automatic molecular fragmentation by evolutionary optimisation

Molecular fragmentation is an effective suite of approaches to reduce the formal computational complexity of quantum chemistry calculations while enhancing their algorithmic parallelisability. However, the practical applicability of fragmentation techniques remains hindered by a dearth of automation and of effective metrics to assess the quality of a fragmentation scheme. In this article, we present the Quick Fragmentation via Automated Genetic Search (QFRAGS), a novel automated fragmentation algorithm that uses a genetic optimisation procedure to generate molecular fragments that yield low energy errors when adopted in Many Body Expansions (MBEs). Benchmark testing of QFRAGS on protein systems with fewer than 500 atoms, using two-body (MBE2) and three-body (MBE3) MBE calculations at the HF/6-31G* level, reveals mean absolute energy errors (MAEEs) of 20.6 and 2.2 kJ \(\mathrm{mol}^{-1}\), respectively. For larger protein systems exceeding 500 atoms, MAEEs are 181.5 kJ \(\mathrm{mol}^{-1}\) for MBE2 and 24.3 kJ \(\mathrm{mol}^{-1}\) for MBE3. Furthermore, when compared to three manual fragmentation schemes on a 40-protein dataset, using both MBE and Fragment Molecular Orbital techniques, QFRAGS achieves comparable or often lower MAEEs. When applied to a 10-lipoglycan/glycolipid dataset, MAEs of 7.9 and 0.3 kJ \(\mathrm{mol}^{-1}\) were observed at the MBE2 and MBE3 levels, respectively.
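To make the MBE2/MBE3 terminology and the MAEE metric concrete, here is a minimal Python sketch of a truncated many-body expansion and a mean absolute error. It is not taken from the article: the function names, the integer fragment labels, and the `toy_energy` model (a made-up energy with one-, two-, and three-body terms) are assumptions chosen only for illustration. In a real calculation the energy callback would be a quantum chemistry call such as HF/6-31G*.

```python
from itertools import combinations

def mbe_energy(fragments, energy, order=2):
    """Truncated many-body expansion (orders 1 to 3) over `fragments`.

    `energy` is a callback taking a tuple of fragments and returning the
    energy of their union. `fragments` must be in a fixed order so that
    the pair keys match between the two-body and three-body steps.
    """
    e1 = {f: energy((f,)) for f in fragments}
    total = sum(e1.values())
    d2 = {}
    if order >= 2:
        for i, j in combinations(fragments, 2):
            # Two-body correction: dimer energy minus its monomers.
            d2[(i, j)] = energy((i, j)) - e1[i] - e1[j]
        total += sum(d2.values())
    if order >= 3:
        for i, j, k in combinations(fragments, 3):
            # Three-body correction: trimer minus all lower-order terms.
            total += (energy((i, j, k))
                      - d2[(i, j)] - d2[(i, k)] - d2[(j, k)]
                      - e1[i] - e1[j] - e1[k])
    return total

def mean_absolute_energy_error(approx, reference):
    """Mean of |approx - reference| over a set of systems."""
    return sum(abs(a - r) for a, r in zip(approx, reference)) / len(reference)

# Hypothetical toy model: each fragment has a weight, and the energy
# contains one-, two-, and three-body contributions only.
WEIGHTS = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0}

def toy_energy(group):
    w = [WEIGHTS[f] for f in group]
    e = -sum(w)
    e -= 0.1 * sum(a * b for a, b in combinations(w, 2))
    e -= 0.01 * sum(a * b * c for a, b, c in combinations(w, 3))
    return e

frags = [0, 1, 2, 3]
exact = toy_energy(tuple(frags))
e_mbe2 = mbe_energy(frags, toy_energy, order=2)
e_mbe3 = mbe_energy(frags, toy_energy, order=3)
```

Because the toy model has no interactions beyond three-body terms, `e_mbe3` reproduces `exact`, while `e_mbe2` misses exactly the three-body contribution. This mirrors, in a contrived setting, why the MBE3 errors reported in the abstract are smaller than the MBE2 errors.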

Scientific Contribution: This article presents the Quick Fragmentation via Automated Genetic Search (QFRAGS), an innovative molecular fragmentation algorithm that significantly improves upon existing molecular fragmentation approaches by specifically addressing their lack of automation and of effective fragmentation quality metrics. With an evolutionary optimisation strategy, QFRAGS actively pursues high-quality fragments, generating fragmentation schemes that exhibit minimal energy errors on systems with hundreds to thousands of atoms. The advent of QFRAGS represents a significant advancement in molecular fragmentation, greatly improving the accessibility and computational feasibility of accurate quantum chemistry calculations.
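The abstract does not describe how QFRAGS encodes or scores fragmentation schemes, so the following is only a generic sketch of what an evolutionary search over fragmentations can look like, not the QFRAGS algorithm. It assumes a linear chain whose bonds may each be cut, a hypothetical `penalty` function standing in for a fragmentation quality metric, and arbitrary values for population size, mutation rate, and target fragment size.

```python
import random

def fragment_sizes(cuts):
    """Sizes of the contiguous fragments of a chain.

    `cuts` holds one boolean per bond; True means the bond is cut.
    """
    sizes, current = [], 1
    for cut in cuts:
        if cut:
            sizes.append(current)
            current = 1
        else:
            current += 1
    sizes.append(current)
    return sizes

def penalty(cuts, bond_cost, target_size=4):
    """Hypothetical quality metric (lower is better): fragments should be
    close to `target_size`, and expensive bonds should stay intact."""
    size_term = sum((s - target_size) ** 2 for s in fragment_sizes(cuts))
    cut_term = sum(c for c, cut in zip(bond_cost, cuts) if cut)
    return size_term + cut_term

def evolve(bond_cost, generations=60, pop_size=30, mutation_rate=0.05, seed=0):
    """Elitist genetic search over cut patterns; returns the best pattern,
    its score, and the best score seen in each generation."""
    rng = random.Random(seed)
    n = len(bond_cost)

    def score(genome):
        return penalty(genome, bond_cost)

    pop = [[rng.random() < 0.3 for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=score)
        history.append(score(pop[0]))
        elite = pop[: pop_size // 5]          # survivors kept unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)       # two distinct parents
            point = rng.randrange(1, n)       # one-point crossover
            child = a[:point] + b[point:]
            # Flip each gene with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    best = min(pop, key=score)
    return best, score(best), history

# A 16-unit chain has 15 bonds; alternate cheap and expensive bonds.
costs = [1.0 if i % 2 == 0 else 5.0 for i in range(15)]
best, best_score, history = evolve(costs)
```

Keeping the elite unchanged means the best score can never get worse from one generation to the next, which is a common design choice when each evaluation is expensive. In a realistic setting the `penalty` function would be replaced by a metric that correlates with the MBE energy error, which is the kind of quality metric the abstract says has been lacking.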

Source journal: Journal of Cheminformatics (Chemistry, Multidisciplinary; Computer Science, Information Systems)
CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: 82
Review time: 3 months
About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.