Fast and Accurate Maximum-Likelihood Estimation of Multi-Type Birth-Death Epidemiological Models from Phylogenetic Trees.

Systematic Biology (IF 6.1; Zone 1, Biology; Q1, Evolutionary Biology), pp. 1387-1402. Pub Date: 2023-12-30. DOI: 10.1093/sysbio/syad059
Anna Zhukova, Frédéric Hecht, Yvon Maday, Olivier Gascuel

Abstract

Multi-type birth-death (MTBD) models are phylodynamic analogues of compartmental models in classical epidemiology. They serve to infer epidemiological parameters, such as the average number of secondary infections R_e and the infectious time, from a phylogenetic tree (a genealogy of pathogen sequences). The members of this model family focus on various aspects of pathogen epidemics. For instance, the birth-death exposed-infectious (BDEI) model describes the transmission of pathogens featuring an incubation period (a delay between the moment of infection and becoming infectious, as for Ebola and SARS-CoV-2), and permits the estimation of this period along with other parameters. With constantly growing sequencing data, MTBD models should be extremely useful for unravelling information on pathogen epidemics. However, existing implementations of these models in a phylodynamic framework have not yet caught up with the sequencing speed. Computing time and numerical instability issues limit their applicability to medium data sets (≤ 500 samples), while the accuracy of estimations should increase with more data. We propose a new highly parallelizable formulation of ordinary differential equations for MTBD models. We also extend them to forests to represent situations when a (sub-)epidemic started from several cases (e.g., multiple introductions to a country). We implemented this formulation for the BDEI model in a maximum likelihood framework using a combination of numerical analysis methods for efficient equation resolution. Our implementation estimates epidemiological parameter values and their confidence intervals in two minutes on a phylogenetic tree of 10,000 samples. Comparison to the existing implementations on simulated data shows that it is not only much faster but also more accurate. An application of our tool to the 2014 Ebola epidemic in Sierra Leone is also convincing, with very fast calculation and precise estimates. As MTBD models are closely related to Cladogenetic State Speciation and Extinction (ClaSSE)-like models, our findings could also be easily transferred to the macroevolution domain.
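
The abstract names the quantities a BDEI analysis yields (R_e, the infectious time, and the incubation period) without stating how they relate to the model rates. The sketch below is a minimal illustration, assuming the common constant-rate parameterisation in which exposed individuals become infectious at rate mu, infectious individuals transmit at rate la, and are removed at rate psi. The class name, field names, and numeric values are hypothetical and do not come from the abstract or from the authors' tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BDEIRates:
    """Hypothetical container for BDEI rates (names are assumptions)."""

    mu: float   # rate of becoming infectious (exposed -> infectious)
    la: float   # transmission rate of an infectious individual
    psi: float  # removal rate of an infectious individual

    def __post_init__(self):
        # Rates are per unit time and must be strictly positive.
        if min(self.mu, self.la, self.psi) <= 0:
            raise ValueError("all rates must be positive")

    @property
    def reproduction_number(self) -> float:
        # Expected number of secondary infections over the infectious time.
        return self.la / self.psi

    @property
    def infectious_time(self) -> float:
        # Mean time spent in the infectious state.
        return 1.0 / self.psi

    @property
    def incubation_period(self) -> float:
        # Mean time spent in the exposed (not yet infectious) state.
        return 1.0 / self.mu


# Illustrative values only; they are not estimates from any data set.
rates = BDEIRates(mu=0.5, la=0.4, psi=0.2)
print(rates.reproduction_number, rates.infectious_time, rates.incubation_period)
```

Under these assumptions, a maximum-likelihood fit would return the rates (with confidence intervals), and the epidemiological quantities follow as simple ratios and reciprocals.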

Source journal
Systematic Biology (Biology, Evolutionary Biology)
CiteScore: 13.00
Self-citation rate: 7.70%
Articles published: 70
Review time: 6-12 weeks
Journal description: Systematic Biology is the bimonthly journal of the Society of Systematic Biologists. Papers for the journal are original contributions to the theory, principles, and methods of systematics as well as phylogeny, evolution, morphology, biogeography, paleontology, genetics, and the classification of all living things. A Points of View section offers a forum for discussion, while book reviews and announcements of general interest are also featured.