Variant Evolution Graph: Can We Infer How SARS-CoV-2 Variants are Evolving?

Badhan Das, Lenwood S. Heath
bioRxiv - Bioinformatics, published 2024-09-15
DOI: 10.1101/2024.09.13.612805 (https://doi.org/10.1101/2024.09.13.612805)

Abstract

SARS-CoV-2 has mutated over time, producing genetic diversity among circulating viral strains. This diversity can affect the characteristics of the virus, including its transmissibility and the severity of symptoms in infected individuals. During the pandemic, frequent mutation has created an enormous cloud of variants known as a viral quasispecies, although most variation is lost to the tight bottlenecks imposed by transmission and survival. Advances in next-generation sequencing have made it fast and cost-effective to produce complete viral genomes, enabling ongoing monitoring of the evolution of the SARS-CoV-2 genome. However, inferring a reliable phylogeny from the sequences in GISAID (the Global Initiative on Sharing All Influenza Data) is daunting because of their vast number. To address this complexity, we propose a new method, inspired by quasispecies theory, for representing the evolutionary and epidemiological relationships among SARS-CoV-2 variants. We aim to build a Variant Evolution Graph (VEG), a novel way to model viral evolution in a local pandemic region based on the mutational distance between the genotypes of the variants. A VEG is a directed acyclic graph and not necessarily a tree, because a variant can evolve from more than one variant; its vertices represent the genotypes of the variants associated with their human hosts, and its edges represent the evolutionary relationships among these variants. We also propose a disease transmission network (DTN), derived from the VEG, which represents the transmission relationships among the hosts. We downloaded from GISAID the genotypes of variants that are complete, have high coverage, and have a complete collection date, from five countries: Somalia (22), Bhutan (102), Hungary (581), Iran (1334), and Nepal (1719). We ran our algorithm on these datasets to recover the evolutionary history of the variants, build the variant evolution graph, represented as an adjacency matrix, and infer the disease transmission network. This work offers insights and approaches to viral evolution that have not been explored in prior studies.
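
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the data structures it names, under loudly stated assumptions. It assumes each genotype is a set of mutations relative to a reference, that mutational distance is the size of the symmetric difference of two such sets, and that each variant is linked to the strictly earlier-collected variant(s) at minimum distance. The sample records, host names, and the functions `mutational_distance`, `build_veg`, and `derive_dtn` are hypothetical and are not taken from the paper.

```python
from datetime import date

# Hypothetical toy records: (host id, collection date, mutations vs. a reference).
samples = [
    ("host_A", date(2021, 1, 5), frozenset({"C241T"})),
    ("host_B", date(2021, 1, 9), frozenset({"C241T", "A23403G"})),
    ("host_C", date(2021, 1, 9), frozenset({"C241T", "C3037T"})),
    ("host_D", date(2021, 1, 20), frozenset({"C241T", "A23403G", "C3037T"})),
]


def mutational_distance(a, b):
    """Assumed distance: number of mutations present in exactly one genotype."""
    return len(a ^ b)


def build_veg(samples):
    """Return an adjacency matrix where adj[i][j] == 1 means variant i -> variant j."""
    n = len(samples)
    adj = [[0] * n for _ in range(n)]
    for j, (_, date_j, muts_j) in enumerate(samples):
        # Only strictly earlier samples may be parents, so no cycle can form.
        earlier = [i for i, (_, date_i, _) in enumerate(samples) if date_i < date_j]
        if not earlier:
            continue
        best = min(mutational_distance(samples[i][2], muts_j) for i in earlier)
        for i in earlier:
            if mutational_distance(samples[i][2], muts_j) == best:
                adj[i][j] = 1  # ties give a vertex several parents: a DAG, not a tree
    return adj


def derive_dtn(samples, adj):
    """Map each variant edge i -> j to an edge between the corresponding hosts."""
    return sorted(
        (samples[i][0], samples[j][0])
        for i, row in enumerate(adj)
        for j, flag in enumerate(row)
        if flag and samples[i][0] != samples[j][0]
    )


veg = build_veg(samples)
dtn = derive_dtn(samples, veg)
```

In this toy input, host_D's genotype is equally close to those of host_B and host_C, so its vertex receives two incoming edges. That is the situation the abstract describes when it says the graph is not necessarily a tree.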