zol and fai: large-scale targeted detection and evolutionary investigation of gene clusters

IF 13.1 2区 生物学 Q1 BIOCHEMISTRY & MOLECULAR BIOLOGY Nucleic Acids Research Pub Date : 2025-02-05 DOI:10.1093/nar/gkaf045
Rauf Salamzade, Patricia Q Tran, Cody Martin, Abigail L Manson, Michael S Gilmore, Ashlee M Earl, Karthik Anantharaman, Lindsay R Kalan
{"title":"zol and fai: large-scale targeted detection and evolutionary investigation of gene clusters","authors":"Rauf Salamzade, Patricia Q Tran, Cody Martin, Abigail L Manson, Michael S Gilmore, Ashlee M Earl, Karthik Anantharaman, Lindsay R Kalan","doi":"10.1093/nar/gkaf045","DOIUrl":null,"url":null,"abstract":"Many universally and conditionally important genes are genomically aggregated within clusters. Here, we introduce fai and zol, which together enable large-scale comparative analysis of different types of gene clusters and mobile-genetic elements, such as biosynthetic gene clusters (BGCs) or viruses. Fundamentally, they overcome a current bottleneck to reliably perform comprehensive orthology inference at large scale across broad taxonomic contexts and thousands of genomes. First, fai allows the identification of orthologous instances of a query gene cluster of interest amongst a database of target genomes. Subsequently, zol enables reliable, context-specific inference of ortholog groups for individual protein-encoding genes across gene cluster instances. In addition, zol performs functional annotation and computes a variety of evolutionary statistics for each inferred ortholog group. Importantly, in comparison to tools for visual exploration of homologous relationships between gene clusters, zol can scale to handle thousands of gene cluster instances and produce detailed reports that are easy to digest. To showcase fai and zol, we apply them for: (i) longitudinal tracking of a virus in metagenomes, (ii) performing population genetic investigations of BGCs for a fungal species, and (iii) uncovering evolutionary trends for a virulence-associated gene cluster across thousands of genomes from a diverse bacterial genus.","PeriodicalId":19471,"journal":{"name":"Nucleic Acids Research","volume":"27 1","pages":""},"PeriodicalIF":13.1000,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Nucleic Acids Research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1093/nar/gkaf045","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"BIOCHEMISTRY & MOLECULAR BIOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Many universally and conditionally important genes are genomically aggregated within clusters. Here, we introduce fai and zol, which together enable large-scale comparative analysis of different types of gene clusters and mobile-genetic elements, such as biosynthetic gene clusters (BGCs) or viruses. Fundamentally, they overcome a current bottleneck to reliably perform comprehensive orthology inference at large scale across broad taxonomic contexts and thousands of genomes. First, fai allows the identification of orthologous instances of a query gene cluster of interest amongst a database of target genomes. Subsequently, zol enables reliable, context-specific inference of ortholog groups for individual protein-encoding genes across gene cluster instances. In addition, zol performs functional annotation and computes a variety of evolutionary statistics for each inferred ortholog group. Importantly, in comparison to tools for visual exploration of homologous relationships between gene clusters, zol can scale to handle thousands of gene cluster instances and produce detailed reports that are easy to digest. To showcase fai and zol, we apply them for: (i) longitudinal tracking of a virus in metagenomes, (ii) performing population genetic investigations of BGCs for a fungal species, and (iii) uncovering evolutionary trends for a virulence-associated gene cluster across thousands of genomes from a diverse bacterial genus.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Zol和fai:基因簇的大规模靶向检测和进化研究
许多普遍和有条件的重要基因在基因组上聚集在簇内。在这里,我们引入fai和zol,它们一起可以对不同类型的基因簇和移动遗传元件(如生物合成基因簇(BGCs)或病毒)进行大规模比较分析。从根本上说,他们克服了目前的瓶颈,可以在广泛的分类背景和数千个基因组中可靠地进行大规模的全面的同源推断。首先,fai允许在目标基因组数据库中识别感兴趣的查询基因簇的同源实例。随后,zol能够在基因簇实例中对单个蛋白质编码基因的同源群进行可靠的、上下文特定的推断。此外,zol执行功能注释,并为每个推断的同源群计算各种进化统计信息。重要的是,与用于可视化探索基因簇之间同源关系的工具相比,zol可以扩展到处理数千个基因簇实例,并生成易于消化的详细报告。为了展示fai和zol,我们将它们应用于:(i)在宏基因组中对病毒进行纵向跟踪,(ii)对真菌物种的BGCs进行群体遗传调查,以及(iii)揭示来自不同细菌属的数千个基因组的毒力相关基因簇的进化趋势。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Nucleic Acids Research
Nucleic Acids Research 生物-生化与分子生物学
CiteScore
27.10
自引率
4.70%
发文量
1057
审稿时长
2 months
期刊介绍: Nucleic Acids Research (NAR) is a scientific journal that publishes research on various aspects of nucleic acids and proteins involved in nucleic acid metabolism and interactions. It covers areas such as chemistry and synthetic biology, computational biology, gene regulation, chromatin and epigenetics, genome integrity, repair and replication, genomics, molecular biology, nucleic acid enzymes, RNA, and structural biology. The journal also includes a Survey and Summary section for brief reviews. Additionally, each year, the first issue is dedicated to biological databases, and an issue in July focuses on web-based software resources for the biological community. Nucleic Acids Research is indexed by several services including Abstracts on Hygiene and Communicable Diseases, Animal Breeding Abstracts, Agricultural Engineering Abstracts, Agbiotech News and Information, BIOSIS Previews, CAB Abstracts, and EMBASE.
期刊最新文献
CMDdemux: an efficient single cell demultiplexing method HM-DyadCap - capture and mapping of 5-hydroxymethylcytosine/5-methylcytosine CpG dyads in mammalian DNA. Cell-type-specific ablation of siRNAs by Arabidopsis RTL1 reveals a role of phloem companion cells in systemic post-transcriptional gene silencing. Functional analyses identify the L1 ORF2p Cryptic domain as a hub for intramolecular interfaces required for retrotransposition. PhaBOX2: an enhanced web server for discovering and analyzing viral contigs in metagenomic data.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1