Chen Lin, Jianghong Zhou, Jing Zhang, Carl Yang, Eugene Agichtein
Graph Neural Network Modeling of Web Search Activity for Real-time Pandemic Forecasting
IEEE International Conference on Healthcare Informatics, vol. 2023, pp. 128-137. Published 2023-06-01.
DOI: https://doi.org/10.1109/ichi57859.2023.00027
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10853009/pdf/
Abstract
Forecasting pandemics from web search activity has significant implications for managing disease spread and informing policy decisions. However, web search records tend to be noisy and vary with geographical location, making it difficult to develop large-scale models. While regularized linear models have been effective in predicting the spread of respiratory illnesses like COVID-19, they are limited to specific locations. Two gaps have impeded further progress: these models do not incorporate data from neighboring areas, and they cannot be transferred to new locations with limited data. To address these limitations, this study proposes a novel self-supervised message-passing neural network (SMPNN) framework for modeling local and cross-location dynamics in pandemic forecasting. The SMPNN framework uses a message-passing neural network (MPNN) module to learn cross-location dependencies through self-supervised learning and to improve local predictions with graph-generated features. The framework is designed as an end-to-end solution and is compared with state-of-the-art statistical and deep learning models using COVID-19 data from England and the US. The results show that the SMPNN model outperforms the other models, achieving up to a 6.9% improvement in prediction accuracy and lower prediction errors during the early stages of disease outbreaks. The work contributes a novel methodology, datasets, and insights that combine web search data and spatial information for disease surveillance and forecasting. By leveraging both local and cross-location information, the proposed SMPNN framework offers a promising avenue for modeling the spread of pandemics and has the potential to inform public health policy decisions.
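
The idea of enriching each location's predictors with graph-generated features can be illustrated with a minimal sketch. This is not the authors' SMPNN implementation: it shows only a single round of mean aggregation over neighboring locations, with no learned weights and no self-supervised objective. The three-location chain graph, the two search-frequency features per location, and the function names `normalize_adjacency` and `aggregate` are all hypothetical, chosen purely for illustration.

```python
def normalize_adjacency(adj):
    """Add self-loops, then row-normalize so each row sums to 1."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in with_loops]


def aggregate(features, adj_norm):
    """One message-passing round: each location takes the weighted
    average of its own features and those of its neighbors."""
    n, d = len(features), len(features[0])
    return [
        [sum(adj_norm[i][j] * features[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]


# Hypothetical toy data: locations 0-1-2 form a chain (0 borders 1, 1 borders 2).
adj = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
# Hypothetical local features: two normalized search-term frequencies per location.
features = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

adj_norm = normalize_adjacency(adj)
graph_features = aggregate(features, adj_norm)

# Concatenate local and graph-generated features; a local forecasting model
# (for example, a regularized linear model) could then be fit on these rows.
augmented = [local + graph for local, graph in zip(features, graph_features)]

for location, row in enumerate(augmented):
    print(location, row)
```

In this sketch, location 0 ends up with graph features that average its own search signal with that of location 1, so a local model sees information from its neighbor as well. A trained MPNN would replace the fixed averaging with learned transformations.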