A reproducibility study of atomistic line graph neural networks for materials property prediction

Digital Discovery · IF 6.2 · JCR Q1 (Chemistry, Multidisciplinary) · Pub Date: 2024-04-30 · pp. 1123-1129 · DOI: 10.1039/D4DD00064A
Kangming Li, Brian DeCost, Kamal Choudhary and Jason Hattrick-Simpers
https://pubs.rsc.org/en/content/articlelanding/2024/dd/d4dd00064a

Abstract

The use of machine learning has become increasingly popular in materials science as data-driven materials discovery becomes the new paradigm. Reproducibility of findings is paramount for promoting transparency and accountability in research and for building trust in the scientific community. Here we conduct a reproducibility analysis of the work by K. Choudhary and B. DeCost [npj Comput. Mater., 7, 2021, 185], in which a new graph neural network architecture was developed with improved performance on multiple atomistic prediction tasks. We examine the reproducibility of the model performance on 29 regression tasks and of an ablation analysis of the graph neural network layers. We find that the reproduced results generally exhibit good quantitative agreement with the initial study, despite minor disparities in model performance and training efficiency that may result from factors such as hardware differences and the stochasticity involved in model training and data splits. The ease of conducting these reproducibility experiments confirms the great benefits of the open data and code practices to which the initial work adhered. We also discuss some further enhancements to reproducible practices, such as archiving code and data and providing the data identifiers used in dataset splits.
