Metapresence: a tool for accurate species detection in metagenomics based on the genome-wide distribution of mapping reads.

IF 5 | Zone 2 (Biology) | Q1 Microbiology | mSystems | Pub Date: 2024-08-20 | Epub Date: 2024-07-09 | DOI: 10.1128/msystems.00213-24
Davide Sanguineti, Guido Zampieri, Laura Treu, Stefano Campanaro
Citations: 0

Abstract

Shotgun metagenomics allows comprehensive sampling of the genomic information of microbes in a given environment and is a tool of choice for studying complex microbial systems. Mapping sequencing reads against a set of reference or metagenome-assembled genomes is in principle a simple and powerful way to define the species-level composition of the microbial community under investigation. However, despite the widespread use of this approach, there is no established way to interpret the alignment results, and arbitrary relative abundance thresholds are routinely used to discriminate between present and absent species. Such thresholds can introduce significant biases, especially in the identification of rare species, so new metrics are needed to overcome them. Here, we present Metapresence, a new tool for reliable identification of the species in metagenomic samples based on the distribution of mapped reads on the reference genomes. The analysis is based on two metrics describing the breadth of coverage and the genomic distance between consecutive reads. We demonstrate the high precision and wide applicability of the tool using data from various synthetic communities, a real mock community, and the gut microbiome of healthy individuals and patients with antibiotic-associated diarrhea. Overall, our results suggest that the proposed approach performs robustly in hard-to-analyze microbial communities containing contaminated or closely related genomes in low abundance.

IMPORTANCE

Despite the prevalent use of genome-centric, alignment-based methods to characterize microbial community composition, there is no standardized approach for accurately identifying the species within a sample. Currently, arbitrary relative abundance thresholds are commonly employed for this purpose. However, due to the inherent complexity of genome structure and the biases associated with genome-centric approaches, this practice tends to be imprecise. Notably, it introduces significant biases, particularly in the identification of rare species. The method presented here addresses these limitations and helps overcome inaccuracies in defining community composition, especially when dealing with rare members.
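
The abstract names two kinds of metrics, breadth of coverage and the genomic distance between consecutive reads, without giving formulas. The following is only a minimal illustrative sketch of how quantities of this kind could be computed from read alignments; it is not the Metapresence implementation. The function names, the 0-based end-exclusive interval convention, the Poisson-style expected breadth `1 - exp(-depth)`, and the mean-gap ratio are all assumptions introduced here for illustration.

```python
import math


def breadth_of_coverage(read_intervals, genome_length):
    """Fraction of genome positions covered by at least one read.

    read_intervals: list of (start, end) pairs, 0-based, end-exclusive
    (an assumed convention for this sketch).
    """
    covered = 0
    current_end = 0  # rightmost position already counted as covered
    for start, end in sorted(read_intervals):
        start = max(start, current_end)  # skip bases already counted
        end = min(end, genome_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


def expected_breadth(read_intervals, genome_length):
    """Breadth expected if reads fell uniformly at random on the genome.

    Uses the simple approximation 1 - exp(-mean_depth); this is an
    assumption of the sketch, not a formula taken from the paper.
    """
    total_bases = sum(end - start for start, end in read_intervals)
    mean_depth = total_bases / genome_length
    return 1.0 - math.exp(-mean_depth)


def mean_gap_ratio(read_starts, genome_length):
    """Observed mean distance between consecutive read starts, divided by
    the spacing expected for evenly spread reads (genome_length / n_reads).

    Values far below 1 suggest reads clustered in a few regions.
    Returns None when fewer than two reads are available.
    """
    starts = sorted(read_starts)
    if len(starts) < 2:
        return None
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    observed = sum(gaps) / len(gaps)
    expected = genome_length / len(starts)
    return observed / expected


if __name__ == "__main__":
    # Hypothetical toy data: three reads on a 1,000 bp genome.
    reads = [(0, 100), (50, 150), (400, 500)]
    print(breadth_of_coverage(reads, 1000))
    print(expected_breadth(reads, 1000))
    print(mean_gap_ratio([s for s, _ in reads], 1000))
```

The intuition the sketch tries to capture is that a genome which is truly present should have reads spread across its whole length, so its observed breadth should be close to the expected breadth and its read spacing close to the even-spacing expectation. Reads piled onto a few conserved or contaminated regions would give a low breadth and a small gap ratio even when the read count, and hence the relative abundance, looks non-negligible.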

Source journal
mSystems (Biochemistry, Genetics and Molecular Biology: Biochemistry)
CiteScore: 10.50
Self-citation rate: 3.10%
Articles published: 308
Review time: 13 weeks
Journal description: mSystems™ will publish preeminent work that stems from applying technologies for high-throughput analyses to achieve insights into the metabolic and regulatory systems at the scale of both the single cell and microbial communities. The scope of mSystems™ encompasses all important biological and biochemical findings drawn from analyses of large data sets, as well as new computational approaches for deriving these insights. mSystems™ will welcome submissions from researchers who focus on the microbiome, genomics, metagenomics, transcriptomics, metabolomics, proteomics, glycomics, bioinformatics, and computational microbiology. mSystems™ will provide streamlined decisions, while carrying on ASM's tradition of rigorous peer review.