VISTA:病毒基因组序列快速分类分配工具。

Tao Zhang, Yiyun Liu, Xutong Guo, Xinran Zhang, Xinchang Zheng, Mochen Zhang, Yiming Bao
{"title":"VISTA:病毒基因组序列快速分类分配工具。","authors":"Tao Zhang, Yiyun Liu, Xutong Guo, Xinran Zhang, Xinchang Zheng, Mochen Zhang, Yiming Bao","doi":"10.1093/gpbjnl/qzae082","DOIUrl":null,"url":null,"abstract":"<p><p>The rapid expansion of the number of viral genome sequences in public databases necessitates a scalable, universal, and automated preliminary taxonomic framework for comprehensive virus studies. Here, we introduce VISTA (Virus Sequence-based Taxonomy Assignment), a computational tool that employs a novel pairwise sequence comparison system and an automatic demarcation threshold identification framework for virus taxonomy. Leveraging physio-chemical property sequences, k-mer profiles, and machine learning techniques, VISTA constructs a robust distance-based framework for taxonomic assignment. Functionally similar to PASC (Pairwise Sequence Comparison), a widely used virus assignment tool based on pairwise sequence comparison, VISTA demonstrates superior performance by providing significantly improved separation for taxonomic groups, more objective taxonomic demarcation thresholds, greatly enhanced speed, and a wider application scope. We successfully applied VISTA to 38 virus families, as well as to the class Caudoviricetes. This demonstrates VISTA's scalability, robustness, and ability to automatically and accurately assign taxonomy to both prokaryotic and eukaryotic viruses. Furthermore, the application of VISTA to 679 unclassified prokaryotic virus genomes recovered from metagenomic data identified 46 novel virus families. VISTA is available as both a command line tool and a user-friendly web portal at https://ngdc.cncb.ac.cn/vista.</p>","PeriodicalId":94020,"journal":{"name":"Genomics, proteomics & bioinformatics","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"VISTA: A Tool for Fast Taxonomic Assignment of Viral Genome Sequences.\",\"authors\":\"Tao Zhang, Yiyun Liu, Xutong Guo, Xinran Zhang, Xinchang Zheng, Mochen Zhang, Yiming Bao\",\"doi\":\"10.1093/gpbjnl/qzae082\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>The rapid expansion of the number of viral genome sequences in public databases necessitates a scalable, universal, and automated preliminary taxonomic framework for comprehensive virus studies. Here, we introduce VISTA (Virus Sequence-based Taxonomy Assignment), a computational tool that employs a novel pairwise sequence comparison system and an automatic demarcation threshold identification framework for virus taxonomy. Leveraging physio-chemical property sequences, k-mer profiles, and machine learning techniques, VISTA constructs a robust distance-based framework for taxonomic assignment. Functionally similar to PASC (Pairwise Sequence Comparison), a widely used virus assignment tool based on pairwise sequence comparison, VISTA demonstrates superior performance by providing significantly improved separation for taxonomic groups, more objective taxonomic demarcation thresholds, greatly enhanced speed, and a wider application scope. We successfully applied VISTA to 38 virus families, as well as to the class Caudoviricetes. This demonstrates VISTA's scalability, robustness, and ability to automatically and accurately assign taxonomy to both prokaryotic and eukaryotic viruses. Furthermore, the application of VISTA to 679 unclassified prokaryotic virus genomes recovered from metagenomic data identified 46 novel virus families. VISTA is available as both a command line tool and a user-friendly web portal at https://ngdc.cncb.ac.cn/vista.</p>\",\"PeriodicalId\":94020,\"journal\":{\"name\":\"Genomics, proteomics & bioinformatics\",\"volume\":\" \",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-11-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Genomics, proteomics & bioinformatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/gpbjnl/qzae082\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Genomics, proteomics & bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/gpbjnl/qzae082","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

随着公共数据库中病毒基因组序列数量的迅速增加,需要一个可扩展、通用和自动化的初步分类框架来进行全面的病毒研究。我们在此介绍 VISTA(基于病毒序列的分类分配),它是一种计算工具,采用了新颖的成对序列比较系统和自动分界阈值识别框架来进行病毒分类。VISTA 利用物理化学特性序列、k-mer 剖面和机器学习技术,构建了一个基于距离的稳健分类分配框架。VISTA 在功能上类似于 PASC(成对序列比较),后者是一种广泛使用的基于成对序列比较的病毒分类工具,VISTA 通过显著提高分类组的分离度、更客观的分类划分阈值、大大提高的速度和更广泛的应用范围,展示了卓越的性能。我们成功地将 VISTA 应用于 38 个病毒科和 Caudoviricetes 类。这证明了 VISTA 的可扩展性、稳健性以及自动、准确地对原核和真核病毒进行分类的能力。此外,将 VISTA 应用于从元基因组数据中恢复的 679 个未分类的原核病毒基因组,发现了 46 个新的病毒科。VISTA 既是命令行工具,也是用户友好的门户网站,网址是 https://ngdc.cncb.ac.cn/vista。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
VISTA: A Tool for Fast Taxonomic Assignment of Viral Genome Sequences.

The rapid expansion of the number of viral genome sequences in public databases necessitates a scalable, universal, and automated preliminary taxonomic framework for comprehensive virus studies. Here, we introduce VISTA (Virus Sequence-based Taxonomy Assignment), a computational tool that employs a novel pairwise sequence comparison system and an automatic demarcation threshold identification framework for virus taxonomy. Leveraging physio-chemical property sequences, k-mer profiles, and machine learning techniques, VISTA constructs a robust distance-based framework for taxonomic assignment. Functionally similar to PASC (Pairwise Sequence Comparison), a widely used virus assignment tool based on pairwise sequence comparison, VISTA demonstrates superior performance by providing significantly improved separation for taxonomic groups, more objective taxonomic demarcation thresholds, greatly enhanced speed, and a wider application scope. We successfully applied VISTA to 38 virus families, as well as to the class Caudoviricetes. This demonstrates VISTA's scalability, robustness, and ability to automatically and accurately assign taxonomy to both prokaryotic and eukaryotic viruses. Furthermore, the application of VISTA to 679 unclassified prokaryotic virus genomes recovered from metagenomic data identified 46 novel virus families. VISTA is available as both a command line tool and a user-friendly web portal at https://ngdc.cncb.ac.cn/vista.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
iMFP-LG: Identification of Novel Multi-Functional Peptides by Using Protein Language Models and Graph-Based Deep Learning. ProtPipe: A Multifunctional Data Analysis Pipeline for Proteomics and Peptidomics. VISTA: A Tool for Fast Taxonomic Assignment of Viral Genome Sequences. Pangenome Reveals Gene Content Variations and Structural Variants Contributing to Pig Characteristics. SoyOD: An Integrated Soybean Multi-omics Database for Mining Genes and Biological Research.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1