Early prediction of the outbreak risk of dengue fever in Ba Ria-Vung Tau province, Vietnam: An analysis based on Google trends and statistical models
Authors: Dang Anh Tuan, Pham Vu Nhat Uyen
Journal: Infectious Disease Modelling, vol. 10, no. 3, pages 743-757 (Journal Article; JCR Q1, Medicine; impact factor 8.8)
Published: 2025-03-05
DOI: 10.1016/j.idm.2025.03.001
URL: https://www.sciencedirect.com/science/article/pii/S2468042725000120
Citations: 0
Abstract
Dengue fever (DF), caused by the dengue virus and transmitted by Aedes mosquitoes, is a dangerous infectious disease with the potential to become a global epidemic. Vietnam, and Ba Ria-Vung Tau (BRVT) province in particular, faces a high risk of DF. This study aims to determine the relationship between Google Trends search volume for DF and DF cases in BRVT province, and from it to construct a model for early prediction of local DF outbreak risk. Seven forecast models were evaluated on a weekly dataset of DF cases and Google Trends Index (GTI) search volume for the phrase "sốt xuất huyết" (dengue fever) in BRVT province from January 2019 to August 2023 (243 weeks). The models use Poisson regression (adjusted by quasi-Poisson), account for the lagged effect of GTI search volume on DF cases, and remove the autocorrelation (AC) of DF cases through appropriate transformations. The model with the lowest dispersion index was selected. The correlation coefficient (95% confidence interval) and dispersion index of the seven models are:
(1) Basis TSR: 0.71 (0.63–0.76) and 74.2
(2) Basis TSR + AC: Lag(Residuals,1): 0.79 (0.73–0.83) and 48.6
(3) Basis TSR + AC: Lag(SXH,1): 0.89 (0.87–0.92) and 37.3
(4) Basis TSR + AC: Lag(log(SXH+1),1): 0.98 (0.97–0.99) and 7.2
(5) TSR Lag(GTI,2) + AC: Lag(log(SXH+1),2): 0.96 (0.95–0.97) and 14.3
(6) TSR Lag(GTI,3) + AC: Lag(log(SXH+1),3): 0.93 (0.91–0.94) and 25.7
(7) TSR Lag(GTI,0) + AC: Lag(log(SXH+1),1): 0.98 (0.97–0.99) and 6.8
The final model, TSR Lag(GTI,0) + AC: Lag(log(SXH+1),1), has the lowest dispersion index (6.8) and was therefore selected. Evaluated with ROC curve analysis and the Youden criterion, the selected model achieves an AUC of 0.982 at the 75% threshold and 0.984 at the 95% threshold, indicating very good predictive ability. In summary, the results show the potential for applying this model in Vietnam, especially in BRVT, to make epidemic prevention measures more effective and protect public health.
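The abstract relies on three building blocks: lagged predictors such as Lag(log(SXH+1),1), the dispersion index used to rank the models, and ROC analysis with the Youden criterion. The following is a minimal Python sketch of those pieces, not the authors' code. All function names and the toy numbers are hypothetical, the dispersion index is assumed to be the usual quasi-Poisson estimate (Pearson chi-square divided by residual degrees of freedom), and the rule used to label outbreak weeks is an assumption because the abstract does not spell it out.

```python
from math import log1p


def lag(series, k):
    """Shift a weekly series back by k weeks; the first k weeks have no value."""
    if k == 0:
        return list(series)
    return [None] * k + list(series[:-k])


def log1p_lag(cases, k):
    """Lag(log(cases + 1), k): the log-transformed autocorrelation term."""
    return [None if c is None else log1p(c) for c in lag(cases, k)]


def dispersion_index(observed, fitted, n_params):
    """Pearson chi-square over residual degrees of freedom (assumed definition)."""
    pearson = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return pearson / (len(observed) - n_params)


def roc_points(scores, labels):
    """(cut-off, sensitivity, specificity) for each distinct score used as cut-off."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if s >= t and l == 1)
        tn = sum(1 for s, l in zip(scores, labels) if s < t and l == 0)
        points.append((t, tp / pos, tn / neg))
    return points


def youden_threshold(scores, labels):
    """Cut-off that maximises Youden's J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)


def auc(scores, labels):
    """AUC as the chance an outbreak week outranks a non-outbreak week (ties 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up weekly case counts and model fits, for illustration only.
    cases = [3, 5, 8, 13, 21, 18, 9, 4]
    fitted = [4.0, 5.5, 7.0, 12.0, 19.0, 20.0, 10.0, 4.5]
    print(log1p_lag(cases, 1))
    print(dispersion_index(cases, fitted, n_params=3))
    # Hypothetical labelling rule: a week with 13 or more cases is an outbreak week.
    labels = [1 if c >= 13 else 0 for c in cases]
    print(youden_threshold(fitted, labels), auc(fitted, labels))
```

A dispersion index close to 1 means the variance of the counts roughly matches the Poisson assumption, while larger values indicate overdispersion. This is why the quasi-Poisson adjustment is used and why a lower index is treated as a better fit when the seven models are compared.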
About the journal:
Infectious Disease Modelling is a peer-reviewed, open access journal. Its main objective is to facilitate research that combines mathematical modelling, retrieval and analysis of infectious disease data, and public health decision support. The journal actively encourages original research that improves this interface, as well as review articles that highlight innovative methodologies relevant to data collection, informatics, and policy making in public health.