Pub Date: 2025-04-12; DOI: 10.1016/j.epidem.2025.100824
Sophie Phillips, George Mohler, Frederic Schoenberg
Hawkes point process models have been shown to forecast the number of daily new cases of epidemic diseases, including SARS-CoV-2 (Covid-19), with high accuracy. Here, we explore how accurately Hawkes models forecast surges of Covid-19 in the United States. We use Hawkes models to estimate the effective reproduction rate $R_t$ and transmission density parameters for Covid-19 case counts in each of the 50 US states, then forecast $R_t$ in future weeks with simple exponential smoothing. A classifier based on $R_t > x$ is applied to predict upcoming surges in cases each week from August 2020 to December 2021, using only data available up to that week. At false alarm rates below 5%, the forecasts based on $R_t$ are correct more often than forecasts based on smoothing the raw case count data, achieving a maximum accuracy of 90% with $R_t > 1.39$. The optimal decision boundary uses a combination of $R_t$ and observed data.
Title: Detection of surges of SARS-Cov-2 using nonparametric Hawkes models. Journal: Epidemics, volume 51, Article 100824.
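The forecasting-and-thresholding step described in the abstract can be illustrated with a minimal sketch. It assumes a weekly series of already-estimated reproduction numbers (the Hawkes fitting itself is not shown), a hypothetical smoothing weight `alpha`, and made-up input values; the function names are illustrative and are not taken from the paper.

```python
def exponential_smoothing_forecast(series, alpha):
    """One-step-ahead forecast from simple exponential smoothing.

    The forecast is the final smoothed level, where each level is
    alpha * observation + (1 - alpha) * previous level.
    """
    if not series:
        raise ValueError("series must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    for value in series[1:]:
        level = alpha * value + (1.0 - alpha) * level
    return level


def predict_surge(rt_history, threshold, alpha=0.5):
    """Flag an upcoming surge when the forecast R_t exceeds the threshold."""
    forecast = exponential_smoothing_forecast(rt_history, alpha)
    return forecast > threshold, forecast


# Made-up weekly R_t estimates for one state (illustration only).
weekly_rt = [0.9, 1.0, 1.2, 1.5, 1.6]
surge_flag, rt_forecast = predict_surge(weekly_rt, threshold=1.39, alpha=0.5)
print(surge_flag, round(rt_forecast, 5))
```

The threshold of 1.39 is the value the abstract reports as giving maximum accuracy; the smoothing weight of 0.5 is an arbitrary choice for the example.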
Pub Date: 2025-04-05; DOI: 10.1016/j.epidem.2025.100826
Regina Manansala, Joke Bilcke, Lander Willem, Niel Hens, Philippe Beutels
Background
Many European countries prioritize groups for annual influenza vaccination based on risk of severe disease and death. This has resulted in relatively high influenza vaccination coverage in older adults in Belgium. However, coverage is much lower in younger adults and negligible in children. Children and young adults are known to play a major role in the transmission dynamics of influenza. Thus, an important policy question is how influenza vaccines can be optimally allocated across age groups, taking indirect effects into account.
Methods
We adapted a dynamic transmission model to reproduce influenza seasonality in Belgium comparing 6720 mutually exclusive vaccination options, including current practice. Vaccination options were defined by different combinations of coverage level changes in nine age groups. We performed an economic evaluation comparing all options from a healthcare payer perspective. Quality-adjusted life-years (QALYs) were the primary health outcome. We expressed parametric uncertainty using the Incremental Net Monetary Benefits (INMB) approach.
Results
Of all the vaccination options considered, over 90 % dominated the current Belgian vaccination strategy in terms of cost-effectiveness. Children were estimated to contribute a substantial indirect protective effect to the overall population. The most cost-effective program increases vaccination coverage rates for children to 90 %, 50–64 years old to 48 %, and 65–74 years old to 75 %.
Discussion
Overall QALY gains can be maximized in seasonal influenza vaccination programs at acceptable costs by achieving high vaccination coverage in childhood age groups. Programmatic and ethical concerns towards such an implementation in the Belgian context need to be separately considered.
Title: Optimizing influenza vaccine allocation by age using cost-effectiveness analysis: A comparison of 6720 vaccination program scenarios in children and adults in Belgium. Journal: Epidemics, volume 51, Article 100826.
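As a rough illustration of the Incremental Net Monetary Benefits approach named in the Methods, the sketch below uses the standard definition, INMB = willingness-to-pay × incremental QALYs − incremental cost, and averages it over parameter draws to express parametric uncertainty. The willingness-to-pay value, the option names, and all numbers are hypothetical and do not come from the study.

```python
def incremental_net_monetary_benefit(delta_qalys, delta_cost, wtp):
    """INMB of an option versus the comparator (here: current practice)."""
    return wtp * delta_qalys - delta_cost


def expected_inmb(draws, wtp):
    """Mean INMB over parameter draws, each a (delta_qalys, delta_cost) pair."""
    values = [incremental_net_monetary_benefit(dq, dc, wtp) for dq, dc in draws]
    return sum(values) / len(values)


def rank_options(options, wtp):
    """Sort option names by expected INMB, best first."""
    scored = {name: expected_inmb(draws, wtp) for name, draws in options.items()}
    return sorted(scored, key=scored.get, reverse=True), scored


# Hypothetical draws of (incremental QALYs, incremental cost in EUR).
options = {
    "option_A": [(10.0, 200_000.0), (12.0, 220_000.0)],
    "option_B": [(4.0, -50_000.0), (6.0, -30_000.0)],
}
ranking, scores = rank_options(options, wtp=35_000.0)  # hypothetical threshold
print(ranking, scores)
```

An option with a positive expected INMB is preferred to current practice at the chosen willingness-to-pay threshold; the option with the highest expected INMB is the most cost-effective of those compared.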
The lack of conventional methods of estimating real-time infectious disease burden in granular regions inhibits timely and efficient public health response. Comprehensive data sources (e.g., state health department data) typically needed for such estimation are often limited due to 1) substantial delays in data reporting and 2) lack of geographic granularity in data provided to researchers. Leveraging real-time local health system data presents an opportunity to overcome these challenges. This study evaluates the effectiveness of machine learning and statistical approaches using local health system data to estimate current and previous COVID-19 hospitalizations in South Carolina. Random Forest models demonstrated consistently higher average median percent agreement accuracy compared to generalized linear mixed models for current weekly hospitalizations across 123 ZIP codes (72.29 %, IQR: 63.20–75.62 %) and 28 counties (76.43 %, IQR: 70.33–81.16 %) with sufficient health system coverage. To account for underrepresented populations in health systems, we combined Random Forest models with Classification and Regression Trees (CART) for imputation. The average median percent agreement was 61.02 % (IQR: 51.17–72.29 %) for all ZIP codes and 72.64 % (IQR: 66.13–77.69 %) for all counties. Median percent agreement for cumulative hospitalizations over the previous 6 months was 80.98 % (IQR: 68.99–89.66 %) for all ZIP codes and 81.17 % (IQR: 68.55–91.33 %) for all counties. These findings emphasize the effectiveness of utilizing real-time health system data to estimate infectious disease burden. Moreover, the methodologies developed in this study can be adapted to estimate hospitalizations for other diseases, offering a valuable tool for public health officials to respond swiftly and effectively to various health crises.
Title: Machine learning approaches for real-time ZIP code and county-level estimation of state-wide infectious disease hospitalizations using local health system data. Authors: Tanvir Ahammed, Md Sakhawat Hossain, Christopher McMahan, Lior Rennert. Pub Date: 2025-04-03; DOI: 10.1016/j.epidem.2025.100823. Journal: Epidemics, volume 51, Article 100823.
Pub Date: 2025-03-28; DOI: 10.1016/j.epidem.2025.100825
KM O’Reilly, MJ Wade, K. Farkas, F. Amman, A. Lison, JD Munday, J. Bingham, ZE Mthombothi, Z. Fang, CS Brown, RR Kao, L. Danon
Wastewater-based epidemiology is the detection of pathogens from sewage systems and the interpretation of these data to improve public health. Its use has increased in scope since 2020, when it was demonstrated that SARS-CoV-2 RNA could be successfully extracted from the wastewater of affected populations. In this Perspective we provide an overview of recent advances in pathogen detection within wastewater, propose a framework for identifying the utility of wastewater sampling for pathogen detection and suggest areas where analytics require development. Ensuring that both data collection and analysis are tailored towards key questions at different stages of an epidemic will improve the inference made. For analyses to be useful we require methods to determine the absence of infection, detect infection early, reliably estimate epidemic trajectories and prevalence, and detect novel variants without reliance on consensus sequences. This research area has included many innovations that have improved the interpretation of collected data and we are optimistic that innovation will continue in the future.
Title: Analysis insights to support the use of wastewater and environmental surveillance data for infectious diseases and pandemic preparedness. Journal: Epidemics, volume 51, Article 100825.
Pub Date: 2025-03-18; DOI: 10.1016/j.epidem.2025.100820
Gabrielle Thivierge, Aaron Rumack, F. William Townes
Seasonal influenza forecasting is critical for public health and individual decision making. We investigate whether the inclusion of data about influenza activity in neighboring states can improve point predictions and distribution forecasting of influenza-like illness (ILI) in each US state using statistical regression models. Using CDC FluView ILI data from 2010–2019, we forecast weekly ILI in each US state with quantile, linear, and Poisson autoregressive models fit using different combinations of ILI data from the target state, neighboring states, and the US population-weighted average. Scoring with root mean squared error and weighted interval score indicated that the covariate sets including neighbors and/or the US weighted average ILI showed slightly higher accuracy than models fit only using lagged ILI in the target state, on average. Additionally, the improvement in performance when including neighbors was similar to the improvement when including the US average instead, suggesting the proximity of the neighboring states is not the driver of the slight increase in accuracy. There is also clear within-season and between-season variability in the effect of spatial information on prediction accuracy.
Title: Does spatial information improve forecasting of influenza-like illness? Journal: Epidemics, volume 51, Article 100820.
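The weighted interval score used for scoring distributional forecasts can be sketched as follows. This uses the commonly cited definition (a weighted sum of the absolute error of the median and the interval scores of central prediction intervals); whether the paper uses exactly this weighting is an assumption, and the forecast values are made up.

```python
def interval_score(y, lower, upper, alpha):
    """Interval score of a central (1 - alpha) prediction interval.

    Width of the interval plus a penalty of 2/alpha times the distance
    by which the observation falls outside it.
    """
    score = upper - lower
    if y < lower:
        score += (2.0 / alpha) * (lower - y)
    elif y > upper:
        score += (2.0 / alpha) * (y - upper)
    return score


def weighted_interval_score(y, median, intervals):
    """WIS from a predictive median and a dict {alpha: (lower, upper)}."""
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(y, lower, upper, alpha)
    return total / (len(intervals) + 0.5)


# Made-up weekly ILI forecast (percent of visits) and observed value.
forecast_intervals = {0.5: (2.0, 3.2), 0.1: (1.0, 2.8)}
wis = weighted_interval_score(y=3.0, median=2.5, intervals=forecast_intervals)
print(round(wis, 3))
```

Lower scores are better; the score rewards narrow intervals and penalizes observations that fall outside them.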
Pub Date: 2025-03-11; DOI: 10.1016/j.epidem.2025.100821
Etthel M. Windels, Cecilia Valenzuela Agüí, Bouke C. de Jong, Conor J. Meehan, Chloé Loiseau, Galo A. Goig, Michaela Zwyer, Sonia Borrell, Daniela Brites, Sebastien Gagneux, Tanja Stadler
Mycobacterium tuberculosis complex (MTBC) lineages show substantial variability in virulence, but the epidemiological consequences of this variability have not been studied in detail. Here, we aimed for a lineage-specific epidemiological characterization by applying phylodynamic models to genomic data from different countries, representing the most abundant MTBC lineages. Our results suggest that all lineages are associated with similar durations and levels of infectiousness, resulting in similar reproductive numbers. However, L1 and L6 are associated with a delayed onset of infectiousness, leading to longer periods between subsequent transmission events. Together, our findings highlight the role of MTBC genetic diversity in tuberculosis disease progression and transmission.
Title: Onset of infectiousness explains differences in transmissibility across Mycobacterium tuberculosis lineages. Journal: Epidemics, volume 51, Article 100821.
Pub Date: 2025-02-14; DOI: 10.1016/j.epidem.2025.100819
Stefania Fiandrino, Andrea Bizzotto, Giorgio Guzzetta, Stefano Merler, Federico Baldo, Eugenio Valdano, Alberto Mateo Urdiales, Antonino Bella, Francesco Celino, Lorenzo Zino, Alessandro Rizzo, Yuhan Li, Nicola Perra, Corrado Gioannini, Paolo Milano, Daniela Paolotti, Marco Quaggiotto, Luca Rossi, Ivan Vismara, Alessandro Vespignani, Nicolò Gozzi
Collaborative hubs that integrate multiple teams to generate ensemble projections and forecasts for shared targets are now regarded as state-of-the-art in epidemic predictive modeling. In this paper, we introduce Influcast, Italy’s first epidemic forecasting hub for influenza-like illness. During the 2023/2024 winter season, Influcast provided 20 rounds of forecasts, involving five teams and eight models to predict influenza-like illness incidence up to four weeks in advance at the national and regional administrative level. The individual forecasts were synthesized into an ensemble and benchmarked against a baseline model. Across all models, the ensemble most frequently ranked among the top performers at the national level considering different metrics and forecasting rounds. Additionally, the ensemble outperformed the baseline and most individual models across all regions. Despite a decline in absolute performance over longer horizons, the ensemble model outperformed the baseline in all considered horizons. These findings show the importance of multimodel forecasting hubs in producing reliable short-term influenza-like illness forecasts that can inform public health preparedness and mitigation strategies.
Title: Collaborative forecasting of influenza-like illness in Italy: The Influcast experience. Journal: Epidemics, volume 50, Article 100819.
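The abstract does not say how the individual forecasts were combined. One common choice in forecasting hubs is a quantile-wise median across models, and the sketch below assumes that rule purely for illustration; the model names and values are hypothetical.

```python
from statistics import median


def median_ensemble(model_quantiles):
    """Combine quantile forecasts by taking the median at each quantile level.

    model_quantiles maps a model name to a dict {quantile level: value}.
    Only quantile levels reported by every model are combined.
    """
    if not model_quantiles:
        raise ValueError("at least one model forecast is required")
    level_sets = [set(q) for q in model_quantiles.values()]
    shared_levels = sorted(set.intersection(*level_sets))
    return {
        level: median(q[level] for q in model_quantiles.values())
        for level in shared_levels
    }


# Hypothetical one-week-ahead ILI incidence forecasts from three models.
forecasts = {
    "model_1": {0.25: 8.0, 0.5: 10.0, 0.75: 12.0},
    "model_2": {0.25: 9.0, 0.5: 11.0, 0.75: 14.0},
    "model_3": {0.25: 7.0, 0.5: 9.5, 0.75: 11.0},
}
ensemble = median_ensemble(forecasts)
print(ensemble)
```

A median is less sensitive to a single outlying model than a mean, which is one reason it is often used for this purpose.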
Pub Date: 2025-02-07; DOI: 10.1016/j.epidem.2025.100816
Austin G. Meyer, Fred Lu, Leonardo Clemente, Mauricio Santillana
Accurate, real-time forecasts of influenza hospitalizations would facilitate prospective resource allocation and public health preparedness. State-of-the-art machine learning methods are a promising approach to produce such forecasts, but they require extensive historical data to be properly trained. Unfortunately, data on influenza hospitalizations, for the 50 states in the United States, are only available since the beginning of 2020. In addition, the data are far from perfect as they were under-reported for several months before health systems began consistently submitting their data. To address these issues, we propose a transfer learning approach. We extend the currently available two-season dataset for state-level influenza hospitalizations by an additional ten seasons. Our method leverages influenza-like illness (ILI) data to infer historical estimates of influenza hospitalizations. This data augmentation enables the implementation of advanced machine learning techniques, multi-horizon training, and an ensemble of models to improve hospitalization forecasts. We evaluated the performance of our machine learning approaches by prospectively producing forecasts for future weeks and submitting them in real time to the Centers for Disease Control and Prevention FluSight challenges during two seasons: 2022–2023 and 2023–2024. Our methodology demonstrated good accuracy and reliability, achieving a fourth place finish (among 20 participating teams) in the 2022–23 and a second place finish (among 20 participating teams) in the 2023–24 CDC FluSight challenges. Our findings highlight the utility of data augmentation and knowledge transfer in the application of machine learning models to public health surveillance where only limited historical data is available.
Title: A prospective real-time transfer learning approach to estimate influenza hospitalizations with limited data. Journal: Epidemics, volume 50, Article 100816.
Pub Date: 2025-01-26; DOI: 10.1016/j.epidem.2025.100818
Matteo Perini, Teresa K. Yamana, Marta Galanti, Jiyeon Suh, Roselyn Kaondera-Shava, Jeffrey Shaman
Background
Understanding the dynamics of infectious disease spread and predicting clinical outcomes are critical for managing large-scale epidemics and pandemics, such as COVID-19. Effective modeling of disease transmission in interconnected populations helps inform public health responses and interventions across regions.
Methods
We developed a novel metapopulation model for simulating respiratory virus transmission in the North America region, specifically for the 96 states, provinces, and territories of Canada, Mexico, and the United States. The model is informed by COVID-19 case data, which are assimilated using the Ensemble Adjustment Kalman filter (EAKF), a Bayesian inference algorithm. Additionally, commuting and mobility data are used to build and adjust the network and movement across locations on a daily basis.
Results
This model-inference system provides estimates of transmission dynamics, infection rates, and ascertainment rates for each of the 96 locations from January 2020 to March 2021. The results highlight differences in disease dynamics and ascertainment among the three countries.
Conclusions
The metapopulation structure enables rapid simulation at a large scale, and the data assimilation method makes the system responsive to changes in system dynamics. This model can serve as a versatile platform for modeling other infectious diseases across the North American region.
Title: Modelling COVID-19 in the North American region with a metapopulation network and Kalman filter. Journal: Epidemics, volume 50, Article 100818. Publication date: 2025-01-26.
Pub Date: 2025-01-25; DOI: 10.1016/j.epidem.2025.100817
Jaime Cascante Vega, Rami Yaari, Tal Robin, Lingsheng Wen, Jason Zucker, Anne-Catrin Uhlemann, Sen Pei, Jeffrey Shaman
Pathogenic bacteria are a major threat to patient health in hospitals. Here we leverage electronic health records from a major New York City hospital system collected during 2020–2021 to support simulation inference of nosocomial transmission and pathogenic bacteria detection using an agent-based model (ABM). The ABM uses these data to inform simulation of importation from the community, nosocomial transmission, and patient spontaneous decolonization of bacteria. We additionally use patient clinical culture results to inform an observational model of detection of the pathogenic bacteria. The model is coupled with a Bayesian inference algorithm, an iterated ensemble adjustment Kalman filter, to estimate the likelihood of detection upon testing and nosocomial transmission rates. We evaluate parameter identifiability for this model-inference system and find that the system is able to estimate modelled nosocomial transmission and effective sensitivity upon clinical culture testing. We apply the framework to estimate both quantities for seven prevalent bacterial pathogens: Escherichia coli, Klebsiella pneumoniae, Pseudomonas aeruginosa, Staphylococcus aureus (both sensitive, MSSA, and resistant, MRSA, phenotypes), Enterococcus faecium and Enterococcus faecalis. We estimate that nosocomial transmission for E. coli is negligible. While bacterial pathogens have different importation rates, nosocomial transmission rates were similar among organisms, except E. coli. We also find that estimated likelihoods of detection are similar for all pathogens. This work highlights how fine-scale patient data can support inference of the epidemiological properties of micro-organisms and how hospital traffic and patient contact determine epidemiological features. Evaluation of the transmission potential for different pathogens could ultimately support the development of hospital control measures, as well as the design of surveillance strategies.
Title: Estimating nosocomial transmission of micro-organisms in hospital settings using patient records and culture data. Journal: Epidemics, volume 50, Article 100817. Publication date: 2025-01-25.