Leveraging advances in data-driven deep learning methods for hybrid epidemic modeling
Shi Chen, Daniel Janies, Rajib Paul, Jean-Claude Thill
Epidemics, volume 48, Article 100782. Published 2024-06-24. DOI: 10.1016/j.epidem.2024.100782
Impact factor: 3.0. JCR: Q2 (Infectious Diseases).
Full text: https://www.sciencedirect.com/science/article/pii/S1755436524000434
Citation count: 0
Abstract
Mathematical modeling of epidemic dynamics is crucial for understanding the underlying mechanisms, quantifying important parameters, and making predictions that facilitate more informed decision-making. There are three major types of models: mechanistic models, including the SEIR-type paradigm; alternative data-driven (DD) approaches; and hybrid models that combine mechanistic models with DD approaches. In this paper, we summarize our work over more than 12 rounds of the COVID-19 Scenario Modeling Hub (SMH) since early 2021 to support informed decision-making. We emphasize the importance of deep learning techniques for epidemic modeling via a flexible DD framework that substantially complements the mechanistic paradigm in evaluating various future epidemic scenarios. We start with a traditional curve-fitting approach to model cumulative COVID-19 cases based on the underlying SEIR-type mechanisms. Hospitalizations and deaths are modeled as binomial processes of cases and hospitalizations, respectively. We further formulate two types of deep learning models based on multivariate long short-term memory (LSTM) to address the challenges of more traditional DD models. The first LSTM is structurally similar to the curve-fitting approach and assumes that hospitalizations and deaths are binomial processes of cases. Instead of using a predefined exponential curve, the LSTM relies on the underlying data to identify the most appropriate functions and can capture both long-term and short-term epidemic behaviors. We then relax the assumption of dependent inputs among cases, hospitalizations, and deaths, and develop a second type of LSTM, the independent multivariate LSTM, which handles all input time series as parallel signals. The independent multivariate LSTM can incorporate a wide range of data sources beyond traditional case-based epidemiological surveillance. The DD framework realizes its potential in the big data era by drawing on previously neglected heterogeneous surveillance data sources, such as syndromic, environmental, genomic, serologic, infoveillance, and mobility data. DD approaches, especially LSTM, complement and integrate with the mechanistic modeling paradigm, provide a feasible alternative approach to modeling today’s complex socio-epidemiological systems, and further strengthen our ability to explore different scenarios for more informed decision-making during health emergencies.
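The binomial assumption described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weekly case counts, and the probabilities `p_hosp` and `p_death` are all hypothetical values chosen for illustration, and the sketch ignores reporting delays between cases, hospitalizations, and deaths.

```python
import random


def binomial_draw(n, p, rng):
    """Draw one Binomial(n, p) sample by summing n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_outcomes(cases, p_hosp, p_death, seed=0):
    """Chain two binomial processes over a case time series.

    Hospitalizations are drawn as Binomial(cases, p_hosp) and deaths as
    Binomial(hospitalizations, p_death), one draw per time step.
    Both probabilities are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    hospitalizations = [binomial_draw(c, p_hosp, rng) for c in cases]
    deaths = [binomial_draw(h, p_death, rng) for h in hospitalizations]
    return hospitalizations, deaths


# Made-up weekly case counts, used only to exercise the sketch.
weekly_cases = [120, 340, 610, 480, 150]
hosp, deaths = simulate_outcomes(weekly_cases, p_hosp=0.05, p_death=0.10)
print(hosp, deaths)
```

By construction, each hospitalization count is bounded by the corresponding case count and each death count by the corresponding hospitalization count, which is the dependency among inputs that the independent multivariate LSTM later relaxes.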
About the journal:
Epidemics publishes papers on infectious disease dynamics in the broadest sense. Its scope covers both within-host dynamics of infectious agents and dynamics at the population level, particularly the interaction between the two. Areas of emphasis include: spread, transmission, persistence, implications and population dynamics of infectious diseases; population and public health as well as policy aspects of control and prevention; dynamics at the individual level; interaction with the environment, ecology and evolution of infectious diseases, as well as population genetics of infectious agents.