From Trees to Clouds: PhageClouds for Fast Comparison of ∼640,000 Phage Genomic Sequences and Host-Centric Visualization Using Genomic Network Graphs.

IF 1.8 PHAGE (New Rochelle, N.Y.) Pub Date : 2021-12-01 Epub Date: 2021-12-16 DOI:10.1089/phage.2021.0008
Guillermo Rangel-Pineros, Andrew Millard, Slawomir Michniewski, David Scanlan, Kimmo Sirén, Alejandro Reyes, Bent Petersen, Martha R J Clokie, Thomas Sicheritz-Pontén
Pages: 194-203 Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/7d/81/phage.2021.0008.PMC9041511.pdf

Abstract

Background: Fast and computationally efficient strategies are required to explore genomic relationships within an increasingly large and diverse phage sequence space. Here, we present PhageClouds, a novel approach using a graph database of phage genomic sequences and their intergenomic distances to explore the phage genomic sequence space.

Methods: A total of 640,000 phage genomic sequences were retrieved from a variety of databases and public virome assemblies. Intergenomic distances were calculated with dashing, an alignment-free method suitable for handling massive data sets. These data were used to build a Neo4j® graph database.

Results: PhageClouds supported the search of related phages among all complete phage genomes from GenBank for a single query phage in just 10 s. Moreover, PhageClouds expanded the number of closely related phage sequences detected for both finished and draft phage genomes, in comparison with searches exclusively targeting phage entries from GenBank.

Conclusions: PhageClouds is a novel resource that will facilitate the analysis of phage genomic sequences and the characterization of assembled phage genomes.
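
The Methods outline (alignment-free intergenomic distances stored as a graph, then queried for relatives of a single phage) can be illustrated with a small sketch. This is not the PhageClouds implementation: an exact k-mer Jaccard distance stands in for dashing's sketch-based estimate, a plain in-memory dictionary stands in for the Neo4j® database, and the sequences, the function names, the k-mer size and the distance cutoff are all hypothetical values chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k=4):
    """Return the set of k-mers in a sequence (assumed toy k; real tools use larger k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b, k=4):
    """Alignment-free distance: 1 - |A ∩ B| / |A ∪ B| over k-mer sets.

    Stand-in for a sketch-based estimator; computed exactly here.
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        return 1.0
    return 1.0 - len(ka & kb) / len(union)


def build_graph(genomes, max_distance=0.5, k=4):
    """Build an undirected graph: nodes are genomes, edges carry the distance.

    Only pairs at or below max_distance (a hypothetical cutoff) get an edge.
    """
    graph = defaultdict(dict)
    for (n1, s1), (n2, s2) in combinations(genomes.items(), 2):
        d = jaccard_distance(s1, s2, k)
        if d <= max_distance:
            graph[n1][n2] = d
            graph[n2][n1] = d
    return graph


def related(graph, query):
    """Return the neighbours of a query genome, closest first."""
    return sorted(graph.get(query, {}).items(), key=lambda kv: kv[1])


# Made-up toy sequences, not real phage genomes.
genomes = {
    "phage_A": "ATGCGTACGTTAGCATGCGTACGT",
    "phage_B": "ATGCGTACGTTAGCATGCGTACGA",
    "phage_C": "TTTTTTTTGGGGGGGGCCCCCCCC",
}
graph = build_graph(genomes)
print(related(graph, "phage_A"))
```

Because the distances are computed once and stored as edges, a query reduces to a neighbour lookup rather than a fresh all-against-all comparison, which is the general idea behind keeping precomputed intergenomic distances in a graph database.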
