From Trees to Clouds: PhageClouds for Fast Comparison of ∼640,000 Phage Genomic Sequences and Host-Centric Visualization Using Genomic Network Graphs

PHAGE (New Rochelle, N.Y.) Pub Date: 2021-12-01 Epub Date: 2021-12-16 DOI: 10.1089/phage.2021.0008
Guillermo Rangel-Pineros, Andrew Millard, Slawomir Michniewski, David Scanlan, Kimmo Sirén, Alejandro Reyes, Bent Petersen, Martha R J Clokie, Thomas Sicheritz-Pontén
Pages: 194-203 PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/7d/81/phage.2021.0008.PMC9041511.pdf

Abstract



Background: Fast and computationally efficient strategies are required to explore genomic relationships within an increasingly large and diverse phage sequence space. Here, we present PhageClouds, a novel approach using a graph database of phage genomic sequences and their intergenomic distances to explore the phage genomic sequence space. Methods: A total of 640,000 phage genomic sequences were retrieved from a variety of databases and public virome assemblies. Intergenomic distances were calculated with dashing, an alignment-free method suitable for handling massive data sets. These data were used to build a Neo4j® graph database. Results: PhageClouds supported the search of related phages among all complete phage genomes from GenBank for a single query phage in just 10 s. Moreover, PhageClouds expanded the number of closely related phage sequences detected for both finished and draft phage genomes, in comparison with searches exclusively targeting phage entries from GenBank. Conclusions: PhageClouds is a novel resource that will facilitate the analysis of phage genomic sequences and the characterization of assembled phage genomes.
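
The idea in the Methods, storing genomes as nodes and pairwise intergenomic distances as edges and then looking up the neighbours of a query, can be illustrated with a minimal Python sketch. This is not the PhageClouds implementation. It assumes an exact k-mer Jaccard distance as a stand-in for dashing, a plain dictionary as a stand-in for the Neo4j graph database, and three made-up toy sequences. The function names, the k-mer size, and the `max_distance` cutoff are all hypothetical choices made for illustration.

```python
from itertools import combinations


def kmer_set(seq, k=4):
    """Return the set of all k-mers in a DNA string (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B|; defined as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def build_graph(genomes, k=4, max_distance=0.5):
    """Link every pair of genomes whose distance is at most max_distance.

    The result maps each genome name to {neighbour_name: distance}.
    All pairs are compared, so this only suits small toy inputs.
    """
    kmers = {name: kmer_set(seq, k) for name, seq in genomes.items()}
    graph = {name: {} for name in genomes}
    for u, v in combinations(sorted(genomes), 2):
        d = jaccard_distance(kmers[u], kmers[v])
        if d <= max_distance:
            graph[u][v] = d
            graph[v][u] = d
    return graph


def related(graph, query):
    """Return the neighbours of `query` as (name, distance), closest first."""
    return sorted(graph[query].items(), key=lambda item: item[1])


# Toy sequences (hypothetical, not real phage genomes).
genomes = {
    "phage_A": "ATGCGTACGTTAGCATGCGTACG",
    "phage_B": "ATGCGTACGTTAGCATGCGTACC",
    "phage_C": "TTTTTTTTTTGGGGGGGGGGCCC",
}
graph = build_graph(genomes)
print(related(graph, "phage_A"))
```

In this toy input, `phage_A` and `phage_B` differ only in their final base and end up linked, while `phage_C` shares no 4-mers with either and stays isolated. Because the edges are computed once when the graph is built, a query reduces to a neighbour lookup, which is the property the sketch is meant to show.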
