mettannotator: a comprehensive and scalable Nextflow annotation pipeline for prokaryotic assemblies.

Tatiana A Gurbich, Martin Beracochea, Nishadi H De Silva, Robert D Finn
Journal: Bioinformatics (Oxford, England). Published: 2025-02-04. DOI: 10.1093/bioinformatics/btaf037 (https://doi.org/10.1093/bioinformatics/btaf037). Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11842068/pdf/

Abstract

Summary: In recent years, there has been a surge in prokaryotic genome assemblies, coming from both isolated organisms and environmental samples. These assemblies often include novel species that are poorly represented in reference databases, creating a need for a tool that can annotate both well-described and novel taxa and can run at scale. Here, we present mettannotator, a comprehensive, scalable Nextflow pipeline for prokaryotic genome annotation that identifies coding and noncoding regions, predicts protein functions, including antimicrobial resistance, and delineates gene clusters. The pipeline summarizes these results in a GFF (General Feature Format) file that can be easily used in downstream analysis or visualized using common genome browsers. We show how it works on 200 genomes from 29 prokaryotic phyla, including isolate genomes and known and novel metagenome-assembled genomes, and present metrics on its performance in comparison to other tools.

Availability and implementation: The pipeline is written in Nextflow and Python and published under an open source Apache 2.0 licence. Instructions and source code can be accessed at https://github.com/EBI-Metagenomics/mettannotator. The pipeline is also available on WorkflowHub: https://workflowhub.eu/workflows/1069.
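Because the summary names a GFF file as the pipeline's main output, a short sketch of how such a file might be read downstream may be useful. The example below relies only on the standard nine tab-separated GFF3 columns; the sample records, the `example` source label, and the attribute keys (`ID`, `product`) are made up for illustration and are not taken from actual mettannotator output.

```python
from collections import Counter
from urllib.parse import unquote

# The nine tab-separated columns defined by the GFF3 format.
GFF_COLUMNS = ("seqid", "source", "type", "start", "end",
               "score", "strand", "phase", "attributes")


def parse_attributes(field):
    """Split a GFF3 attribute column (key=value pairs joined by ';') into a dict."""
    attributes = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[unquote(key)] = unquote(value)
    return attributes


def parse_gff3(lines):
    """Yield one dict per feature line, skipping comments and any embedded FASTA."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # ignore malformed lines in this sketch
        record = dict(zip(GFF_COLUMNS, fields))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        record["attributes"] = parse_attributes(record["attributes"])
        yield record


# Hypothetical records for illustration only (not real pipeline output).
EXAMPLE_GFF = """##gff-version 3
contig_1\texample\tCDS\t100\t400\t.\t+\t0\tID=gene_0001;product=hypothetical protein
contig_1\texample\tCDS\t500\t900\t.\t-\t0\tID=gene_0002;product=beta-lactamase
contig_1\texample\ttRNA\t1000\t1075\t.\t+\t.\tID=trna_0001;product=tRNA-Ala
"""

features = list(parse_gff3(EXAMPLE_GFF.splitlines()))
type_counts = Counter(feature["type"] for feature in features)
print(dict(type_counts))
```

To read a real file, the same generator can be given an open file handle, for example `parse_gff3(open(path))`, where `path` points to a GFF file produced by any annotation tool.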

Supplementary information: Supplementary data are available at Bioinformatics online.