kMetaShot: a fast and reliable taxonomy classifier for metagenome-assembled genomes.

Briefings in Bioinformatics; IF 6.8; Zone 2 (Biology); JCR Q1 (Biochemical Research Methods); Pub Date: 2024-11-22; DOI: 10.1093/bib/bbae680
Giuseppe Defazio, Marco Antonio Tangaro, Graziano Pesole, Bruno Fosso
Volume 26, issue 1. Publication type: Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11695915/pdf/
Citations: 0

Abstract

The advent of high-throughput sequencing (HTS) technologies unlocked the complexity of the microbial world through the development of metagenomics, which now provides an unprecedented and comprehensive overview of that world's taxonomic and functional contribution in a wide variety of macro- and micro-ecosystems. In particular, shotgun metagenomics allows the reconstruction of microbial genomes through the assembly of reads into metagenome-assembled genomes (MAGs). MAGs are an information-rich proxy for inferring the taxonomic composition and functional contribution of microbiomes, although the relevant analytical approaches are not trivial and can still be improved. In this regard, tools like CAMITAX and GTDBtk implement complex approaches that rely on marker gene identification and sequence alignments and therefore require long processing times. To provide an effective tool for fast and reliable MAG taxonomic classification, we present kMetaShot, a taxonomy classifier based on k-mer/minimizer counting. We benchmarked kMetaShot against CAMITAX and GTDBtk using both in silico and real mock communities and showed that, while implementing a fast and concise algorithm, it outperforms the other tools in classification accuracy. Additionally, kMetaShot is an easy-to-install and easy-to-use bioinformatic tool that is also suitable for researchers with limited command-line skills. It is available and documented at https://github.com/gdefazio/kMetaShot.
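
The abstract does not describe how kMetaShot counts k-mers and minimizers, so the following is only a generic illustration of the idea, not the authors' implementation. It assumes lexicographic ordering of k-mers, a window of `w` consecutive k-mers, and a toy scoring rule (count of query minimizers found in each reference); the function names and parameters are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def minimizers(seq, k, w):
    """Return (position, k-mer) for the smallest k-mer in each window.

    A window holds w consecutive k-mers. Ties are broken by the leftmost
    position, and a minimizer shared by adjacent windows is reported once.
    """
    all_kmers = list(kmers(seq, k))
    picked = []
    last_pos = -1
    for start in range(len(all_kmers) - w + 1):
        window = all_kmers[start:start + w]
        offset = min(range(w), key=lambda j: (window[j], j))
        pos = start + offset
        if pos != last_pos:
            picked.append((pos, window[offset]))
            last_pos = pos
    return picked


def classify(query, references, k=5, w=4):
    """Toy classifier: pick the reference sharing the most query minimizers.

    `references` maps a taxon label to a reference sequence (hypothetical
    input format, chosen only for this sketch).
    """
    query_counts = Counter(m for _, m in minimizers(query, k, w))
    scores = {}
    for label, ref_seq in references.items():
        ref_set = {m for _, m in minimizers(ref_seq, k, w)}
        scores[label] = sum(c for m, c in query_counts.items() if m in ref_set)
    best = max(scores, key=scores.get)
    return best, scores
```

Because only one k-mer per window is kept, a minimizer index is much smaller than a full k-mer index, which is one reason such schemes tend to be faster than alignment-based approaches.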

Source journal
Briefings in Bioinformatics (Biology: Biochemical Research Methods)
CiteScore: 13.20
Self-citation rate: 13.70%
Articles published: 549
Review time: 6 months
Journal description: Briefings in Bioinformatics is an international journal serving as a platform for researchers and educators in the life sciences. It also appeals to mathematicians, statisticians, and computer scientists applying their expertise to biological challenges. The journal focuses on reviews tailored for users of databases and analytical tools in contemporary genetics, molecular and systems biology. It stands out by offering practical assistance and guidance to non-specialists in computerized methodologies. Covering a wide range from introductory concepts to specific protocols and analyses, the papers address bacterial, plant, fungal, animal, and human data. The journal's detailed subject areas include genetic studies of phenotypes and genotypes, mapping, DNA sequencing, expression profiling, gene expression studies, microarrays, alignment methods, protein profiles and HMMs, lipids, metabolic and signaling pathways, structure determination and function prediction, phylogenetic studies, and education and training.
Latest articles from this journal
TRIAGE: an R package for regulatory gene analysis.
AutoXAI4Omics: an automated explainable AI tool for omics and tabular data.
MCGAE: unraveling tumor invasion through integrated multimodal spatial transcriptomics.
tcrBLOSUM: an amino acid substitution matrix for sensitive alignment of distant epitope-specific TCRs.
A versatile pipeline to identify convergently lost ancestral conserved fragments associated with convergent evolution of vocal learning.