从测序数据中检索到的微生物丰度--NCBI 自动分类法(MARS):为微生物组建模工具箱创建相对微生物丰度数据的管道,并利用同义词高效地映射资源

IF 2.4 Q2 MATHEMATICAL & COMPUTATIONAL BIOLOGY Bioinformatics advances Pub Date : 2024-05-10 DOI:10.1093/bioadv/vbae068
T. Hulshof, Bram Nap, Filippo Martinelli, Ines Thiele
{"title":"从测序数据中检索到的微生物丰度--NCBI 自动分类法(MARS):为微生物组建模工具箱创建相对微生物丰度数据的管道,并利用同义词高效地映射资源","authors":"T. Hulshof, Bram Nap, Filippo Martinelli, Ines Thiele","doi":"10.1093/bioadv/vbae068","DOIUrl":null,"url":null,"abstract":"\n \n \n Computational approaches to the functional characterisation of the microbiome, such as the Microbiome Modelling Toolbox, require precise information on microbial composition and relative abundances. However, challenges arise from homosynonyms—different names referring to the same taxon, which can hinder the mapping process and lead to missed species mapping when using microbial metabolic reconstruction resources, such as AGORA and APOLLO.\n \n \n \n We introduce the integrated MARS pipeline, a user-friendly Python-based solution that addresses these challenges. MARS automates the extraction of relative abundances from metagenomic reads, maps species and genera onto microbial metabolic reconstructions, and accounts for alternative taxonomic names. It normalises microbial reads, provides an optional cut-off for low-abundance taxa, and produces relative abundance tables apt for integration with the Microbiome Modelling Toolbox. A sub-component of the pipeline automates the task of identifying homosynonyms, leveraging web scraping to find taxonomic IDs of given species, searching NCBI for alternative names, and cross-reference them with microbial reconstruction resources. Taken together, MARS streamlines the entire process from processed metagenomic reads to relative abundance, thereby significantly reducing time and effort when working with microbiome data.\n \n \n \n MARS is implemented in Python. It can be found as an interactive application here: https://mars-pipeline.streamlit.app/along with a detailed documentation here: https://github.com/ThieleLab/mars-pipeline.\n","PeriodicalId":72368,"journal":{"name":"Bioinformatics advances","volume":null,"pages":null},"PeriodicalIF":2.4000,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Microbial abundances retrieved from sequencing data—automated NCBI taxonomy (MARS): a pipeline to create relative microbial abundance data for the microbiome modelling toolbox and utilising homosynonyms for efficient mapping to resources\",\"authors\":\"T. Hulshof, Bram Nap, Filippo Martinelli, Ines Thiele\",\"doi\":\"10.1093/bioadv/vbae068\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\n \\n \\n Computational approaches to the functional characterisation of the microbiome, such as the Microbiome Modelling Toolbox, require precise information on microbial composition and relative abundances. However, challenges arise from homosynonyms—different names referring to the same taxon, which can hinder the mapping process and lead to missed species mapping when using microbial metabolic reconstruction resources, such as AGORA and APOLLO.\\n \\n \\n \\n We introduce the integrated MARS pipeline, a user-friendly Python-based solution that addresses these challenges. MARS automates the extraction of relative abundances from metagenomic reads, maps species and genera onto microbial metabolic reconstructions, and accounts for alternative taxonomic names. It normalises microbial reads, provides an optional cut-off for low-abundance taxa, and produces relative abundance tables apt for integration with the Microbiome Modelling Toolbox. A sub-component of the pipeline automates the task of identifying homosynonyms, leveraging web scraping to find taxonomic IDs of given species, searching NCBI for alternative names, and cross-reference them with microbial reconstruction resources. Taken together, MARS streamlines the entire process from processed metagenomic reads to relative abundance, thereby significantly reducing time and effort when working with microbiome data.\\n \\n \\n \\n MARS is implemented in Python. It can be found as an interactive application here: https://mars-pipeline.streamlit.app/along with a detailed documentation here: https://github.com/ThieleLab/mars-pipeline.\\n\",\"PeriodicalId\":72368,\"journal\":{\"name\":\"Bioinformatics advances\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":2.4000,\"publicationDate\":\"2024-05-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Bioinformatics advances\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/bioadv/vbae068\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATHEMATICAL & COMPUTATIONAL BIOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics advances","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioadv/vbae068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATHEMATICAL & COMPUTATIONAL BIOLOGY","Score":null,"Total":0}
引用次数: 0

摘要

微生物组功能表征的计算方法(如微生物组建模工具箱)需要有关微生物组成和相对丰度的精确信息。然而,在使用 AGORA 和 APOLLO 等微生物代谢重建资源时,同源异名(指同一分类群的不同名称)会阻碍绘图过程并导致错过物种绘图。 我们介绍了集成的 MARS 管道,这是一种基于 Python 的用户友好型解决方案,可以解决这些难题。MARS 可自动从元基因组读数中提取相对丰度,将物种和属映射到微生物代谢重建上,并考虑到其他分类名称。它对微生物读数进行归一化处理,为低丰度类群提供可选的截止值,并生成适合与微生物组建模工具箱整合的相对丰度表。该管道的一个子组件可自动识别同义词,利用网络搜索功能查找给定物种的分类标识,在 NCBI 中搜索替代名称,并与微生物重建资源相互参照。总之,MARS 简化了从处理元基因组读数到相对丰度的整个过程,从而大大减少了处理微生物组数据的时间和精力。 MARS 使用 Python 实现。它的交互式应用程序见 https://mars-pipeline.streamlit.app/along,详细文档见 https://github.com/ThieleLab/mars-pipeline。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Microbial abundances retrieved from sequencing data—automated NCBI taxonomy (MARS): a pipeline to create relative microbial abundance data for the microbiome modelling toolbox and utilising homosynonyms for efficient mapping to resources
Computational approaches to the functional characterisation of the microbiome, such as the Microbiome Modelling Toolbox, require precise information on microbial composition and relative abundances. However, challenges arise from homosynonyms—different names referring to the same taxon, which can hinder the mapping process and lead to missed species mapping when using microbial metabolic reconstruction resources, such as AGORA and APOLLO. We introduce the integrated MARS pipeline, a user-friendly Python-based solution that addresses these challenges. MARS automates the extraction of relative abundances from metagenomic reads, maps species and genera onto microbial metabolic reconstructions, and accounts for alternative taxonomic names. It normalises microbial reads, provides an optional cut-off for low-abundance taxa, and produces relative abundance tables apt for integration with the Microbiome Modelling Toolbox. A sub-component of the pipeline automates the task of identifying homosynonyms, leveraging web scraping to find taxonomic IDs of given species, searching NCBI for alternative names, and cross-reference them with microbial reconstruction resources. Taken together, MARS streamlines the entire process from processed metagenomic reads to relative abundance, thereby significantly reducing time and effort when working with microbiome data. MARS is implemented in Python. It can be found as an interactive application here: https://mars-pipeline.streamlit.app/along with a detailed documentation here: https://github.com/ThieleLab/mars-pipeline.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
1.60
自引率
0.00%
发文量
0
期刊最新文献
motifbreakR v2: expanded variant analysis including indels and integrated evidence from transcription factor binding databases. TransAnnot-a fast transcriptome annotation pipeline. PatchProt: hydrophobic patch prediction using protein foundation models. Accelerating protein-protein interaction screens with reduced AlphaFold-Multimer sampling. CAPTVRED: an automated pipeline for viral tracking and discovery from capture-based metagenomics samples.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1