CoverM: read alignment statistics for metagenomics.

Samuel T N Aroney, Rhys J P Newell, Jakob N Nissen, Antonio Pedro Camargo, Gene W Tyson, Ben J Woodcroft
Journal: Bioinformatics (Oxford, England). Published: 2025-03-29. DOI: https://doi.org/10.1093/bioinformatics/btaf147. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11993303/pdf/

Abstract

Summary: Genome-centric analysis of metagenomic samples is a powerful method for understanding the function of microbial communities. Calculating read coverage is central to this analysis: it enables differential coverage binning for genome recovery and estimation of microbial community composition. Coverage is determined by processing read alignments to reference sequences, either contigs or genomes. Per-reference coverage is typically calculated in an ad hoc manner, with each software package providing its own implementation and its own definition of coverage. Here we present CoverM, a unified software package that calculates several coverage statistics for contigs and genomes in an ergonomic and flexible manner. It uses "Mosdepth arrays" for computational efficiency and avoids unnecessary I/O overhead by calculating coverage statistics from streamed read alignment results.
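
To make the "Mosdepth array" idea concrete, the following is a minimal Python sketch of the general technique, not CoverM's Rust implementation. Each alignment adds 1 at its start position and subtracts 1 at its end position, and a single cumulative sum then yields per-base depth. Alignments are consumed one at a time from an iterator, loosely mirroring streamed input. The function names, the half-open 0-based interval convention, and the three statistics shown (mean, covered fraction, trimmed mean) are assumptions chosen for illustration; the abstract does not list which statistics CoverM reports or how it defines them.

```python
from itertools import accumulate


def depth_from_alignments(ref_len, alignments):
    """Per-base depth from a difference ("Mosdepth-style") array.

    `alignments` is any iterable of (start, end) pairs, assumed here to be
    0-based half-open intervals on a single reference of length `ref_len`.
    """
    delta = [0] * (ref_len + 1)
    for start, end in alignments:      # consumed one at a time (streaming)
        start = max(start, 0)          # clip to the reference bounds
        end = min(end, ref_len)
        if start >= end:
            continue
        delta[start] += 1              # depth rises where the read begins
        delta[end] -= 1                # and falls just past where it ends
    # The running sum of the differences is the depth at each position.
    return list(accumulate(delta[:-1]))


def coverage_stats(depth, trim=0.1):
    """Illustrative per-reference statistics (hypothetical definitions)."""
    n = len(depth)
    ordered = sorted(depth)
    k = int(n * trim)                  # positions dropped from each tail
    kept = ordered[k:n - k] if n - 2 * k > 0 else ordered
    return {
        "mean": sum(depth) / n,
        "covered_fraction": sum(1 for d in depth if d > 0) / n,
        "trimmed_mean": sum(kept) / len(kept),
    }


def example_alignments():
    """A toy alignment stream for a 10 bp reference."""
    yield (0, 5)
    yield (3, 8)
    yield (4, 6)


depth = depth_from_alignments(10, example_alignments())
stats = coverage_stats(depth)
print(depth)
print(stats)
```

The appeal of this layout is that each alignment costs two array updates regardless of read length, and the per-base depth is recovered in one linear pass at the end.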

Availability and implementation: CoverM is free software available at https://github.com/wwood/coverm. CoverM is implemented in Rust, with Python (https://github.com/apcamargo/pycoverm) and Julia (https://github.com/JuliaBinaryWrappers/CoverM_jll.jl) interfaces.
