Reusability report: Annotating metabolite mass spectra with domain-inspired chemical formula transformers

Nature Machine Intelligence (IF 18.8; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE). Pub Date: 2024-09-27. DOI: 10.1038/s42256-024-00909-4
Janne Heirman, Wout Bittremieux

Abstract

We present an in-depth exploration of the Metabolite Inference with Spectrum Transformers (MIST) tool for annotating small-molecule mass spectrometry (MS) data, focusing on its reproducibility and generalizability. MIST innovates by integrating a ‘chemical formula transformer’ to process tandem MS spectra, aiming to bridge the substantial knowledge gap in untargeted MS studies, in which only a fraction of spectra are confidently annotated. Here we critically assessed the reproducibility of MIST by following the tool’s original training and testing protocols, encountering minor challenges but largely succeeding in replicating the results. We also evaluated the generalizability of MIST by applying it to an external dataset from the Critical Assessment of Small Molecule Identification 2022 challenge, which provided insight into the model’s performance on previously unseen data. An ablation study further investigated the impact of various model features on database retrieval performance, suggesting that some of the algorithmic complexity may not significantly improve performance. Through rigorous evaluation, this study underscores the challenges and considerations in developing robust computational tools for MS data analysis. We advocate community-wide efforts in benchmarking, transparency and data sharing to foster advancements in metabolomics and computational biology.
