Frontiers | Phylogenetic-based methods for fine-scale classification of PRRSV-2 ORF5 sequences: a comparison of their robustness and reproducibility

Frontiers in Virology (IF 2.0, JCR Q4, Virology) | Pub Date: 2024-07-24 | DOI: 10.3389/fviro.2024.1433931
Kimberly VanderWaal, Nakarin Pamornchainavakul, Mariana Kikuti, Daniel C. Linhares, Giovani Trevisan, Jianqiang Zhang, Tavis K. Anderson, Michael Zeller, Stephanie Rossow, Derald Holtkamp, Dennis N. Makau, Cesar A. Corzo, Igor Paploski

Abstract

Disease management and epidemiological investigations of porcine reproductive and respiratory syndrome virus-type 2 (PRRSV-2) often rely on grouping together highly related sequences. In the USA, the last five years have seen a major shift within the swine industry when classifying PRRSV-2, beginning to move away from RFLP (restriction fragment length polymorphisms)-typing and adopting the use of phylogenetic lineage-based classification. However, lineages and sub-lineages are large and genetically diverse, making them insufficient for identifying new and emerging variants. Thus, within the lineage system, a dynamic fine-scale classification scheme is needed to provide better resolution on the relatedness of PRRSV-2 viruses to inform disease management and monitoring efforts and facilitate research and communication surrounding circulating PRRSV viruses. Here, we compare fine-scale systems for classifying PRRSV-2 variants (i.e., genetic clusters of closely related ORF5 sequences at finer scales than sub-lineage) using a database of 28,730 sequences from 2010 to 2021, representing >55% of the U.S. pig population. In total, we compared 140 approaches that differed in their tree-building method, criteria, and thresholds for defining variants within phylogenetic trees. Three approaches resulted in variant classifications that were reproducible and robust even when the input data or input phylogenies were changed. For these approaches, the average genetic distance among sequences belonging to the same variant was 2.1–2.5%, and the genetic divergence between variants was 2.5–2.7%. Machine learning classification algorithms were trained to assign new sequences to an existing variant with >95% accuracy, which shows that newly generated sequences can be assigned to a variant without repeating the phylogenetic and clustering analyses. 
Finally, we identified 73 sequence-clusters (dated <1 year apart with close phylogenetic relatedness) associated with circulation events on single farms. The percent of farm sequence-clusters with an ID change was 6.5–8.7% for our approaches. In contrast, ~43% of farm sequence-clusters had variation in their RFLP-type, further demonstrating how our proposed fine-scale classification system addresses shortcomings of RFLP-typing. Through identifying robust and reproducible classification approaches for PRRSV-2, this work lays the foundation for a fine-scale system that would more reliably group related field viruses and provide better resolution for decision-making surrounding disease management.
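The two summary statistics quoted in the abstract (average genetic distance within a variant, and the share of farm sequence-clusters whose variant ID changes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not say which distance measure was used, so a plain proportion of differing sites (p-distance) is assumed here, and the sequences, variant labels, farm clusters, and function names are all hypothetical toy values.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_within_variant_distance(seqs: dict, labels: dict) -> dict:
    """Mean pairwise p-distance among sequences sharing a variant label.

    Variants with a single member have no pairs and are omitted.
    """
    groups = {}
    for seq_id, variant in labels.items():
        groups.setdefault(variant, []).append(seqs[seq_id])
    result = {}
    for variant, members in groups.items():
        pairs = list(combinations(members, 2))
        if pairs:
            result[variant] = sum(p_distance(a, b) for a, b in pairs) / len(pairs)
    return result


def fraction_with_id_change(farm_clusters: list, labels: dict) -> float:
    """Share of farm sequence-clusters whose members carry more than one label."""
    changed = sum(1 for cluster in farm_clusters
                  if len({labels[s] for s in cluster}) > 1)
    return changed / len(farm_clusters)


# Hypothetical toy data: four aligned 10-nt fragments (real ORF5 is ~600 nt).
seqs = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAA",
    "s3": "ACGTTCGTAC",
    "s4": "TTGTACGGAC",
}
labels = {"s1": "A", "s2": "A", "s3": "A", "s4": "B"}  # hypothetical variant IDs
farm_clusters = [["s1", "s2"], ["s3", "s4"]]           # hypothetical farm clusters

within = mean_within_variant_distance(seqs, labels)
id_change = fraction_with_id_change(farm_clusters, labels)
print(within, id_change)
```

The same `fraction_with_id_change` function could be applied to RFLP-type labels instead of variant IDs, which is the kind of comparison the abstract reports (6.5–8.7% versus ~43%).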