Ichor: A Python library for computational chemistry data management and machine learning force field development

IF 3.4 | Zone 3, Chemistry | Q2 CHEMISTRY, MULTIDISCIPLINARY | Journal of Computational Chemistry | Pub Date: 2024-08-31 | DOI: 10.1002/jcc.27477
Yulian T. Manchev, Matthew J. Burn, Paul L. A. Popelier
Journal of Computational Chemistry, volume 45, issue 32, pages 2912-2928. Publication type: Journal Article. Full text: https://onlinelibrary.wiley.com/doi/10.1002/jcc.27477
Citation count: 0

Abstract

We present ichor, an open-source Python library that simplifies data management in computational chemistry and streamlines machine learning force field development. Ichor implements many easily extensible file management tools, in addition to a lazy file reading system, allowing efficient management of hundreds of thousands of computational chemistry files. Data from calculations can be readily stored into databases for easy sharing and post-processing. Raw data can be directly processed by ichor to create machine learning-ready datasets. In addition to powerful data-related capabilities, ichor provides interfaces to popular workload management software employed by High Performance Computing clusters, making for effortless submission of thousands of separate calculations with only a single line of Python code. Furthermore, a simple-to-use command line interface has been implemented through a series of menu systems to further increase accessibility and efficiency of common important ichor tasks. Finally, ichor implements general tools for visualization and analysis of datasets and tools for measuring machine-learning model quality both on test set data and in simulations. With the current functionalities, ichor can serve as an end-to-end data procurement, data management, and analysis solution for machine-learning force-field development.
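
The lazy file reading idea mentioned in the abstract can be illustrated with a minimal sketch. This is not ichor's API: the `LazyFile` class, its `lines` and `natoms` properties, the `read_count` counter, and the XYZ-style file layout are all hypothetical names and assumptions chosen for illustration. The point is only that indexing many files can be cheap if the contents are read on first access rather than at construction.

```python
import tempfile
from pathlib import Path


class LazyFile:
    """Hypothetical wrapper (not ichor's API) that defers reading until needed."""

    def __init__(self, path):
        self.path = Path(path)
        self._lines = None   # nothing is read at construction time
        self.read_count = 0  # counts actual disk reads, for illustration only

    @property
    def lines(self):
        # Read and cache the contents on first access only.
        if self._lines is None:
            self._lines = self.path.read_text().splitlines()
            self.read_count += 1
        return self._lines

    @property
    def natoms(self):
        # Assumes an XYZ-style layout whose first line is the atom count.
        return int(self.lines[0])


# Build a small directory of XYZ-style files to index.
workdir = Path(tempfile.mkdtemp())
for i in range(3):
    (workdir / f"geom{i}.xyz").write_text(
        "2\ncomment\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    )

# Indexing the directory creates wrappers but touches no file contents.
files = [LazyFile(p) for p in sorted(workdir.glob("*.xyz"))]
reads_after_indexing = sum(f.read_count for f in files)

# Only the file that is actually queried gets read, and only once.
first_natoms = files[0].natoms
first_natoms_again = files[0].natoms
reads_after_query = sum(f.read_count for f in files)
```

With this pattern the cost of building the file index scales with the number of paths, not with the total size of the files, which is one plausible reason a lazy scheme helps when handling hundreds of thousands of output files.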

Source journal: Journal of Computational Chemistry
CiteScore: 6.60
Self-citation rate: 3.30%
Articles published: 247
Review time: 1.7 months
About the journal: This distinguished journal publishes articles concerned with all aspects of computational chemistry: analytical, biological, inorganic, organic, physical, and materials. The Journal of Computational Chemistry presents original research, contemporary developments in theory and methodology, and state-of-the-art applications. Computational areas that are featured in the journal include ab initio and semiempirical quantum mechanics, density functional theory, molecular mechanics, molecular dynamics, statistical mechanics, cheminformatics, biomolecular structure prediction, molecular design, and bioinformatics.