Navid Shaghaghi, Yash Kamdar, Ron Huang, A. Calle, Jaidev Mirchandani, Michael Castillo
Title: Attempts at Enhancing eVision’s Influenza Forecasting Using Social Media
Venue: 2022 14th Biomedical Engineering International Conference (BMEiCON)
Publication date: 2022-11-10
DOI: https://doi.org/10.1109/BMEiCON56653.2022.10012095
Citations: 1
Abstract
Predicting the spread of infectious diseases such as seasonal influenza is essential for preparing for them and mitigating the severity of their impact. eVision (short for Epidemic Vision) is a machine learning time series forecaster under research and development by Santa Clara University’s EPIC (Ethical, Pragmatic, and Intelligent Computing) and BioInnovation & Design laboratories. Because eVision’s Long Short-Term Memory (LSTM) neural network uses influenza-related Google Trends keywords as prediction features, it stands to reason that additional features selected from trending flu-related keywords in social media posts could improve its predictions. On close examination, the only social media platforms capable of supplying relevant data for time series analysis are Twitter, a micro-blogging platform, and Reddit, a social news aggregation and discussion forum; other platforms are meant either for sharing images and videos or for private multicast communication rather than public broadcasting and discourse. However, flu-related Reddit posts are too bursty for any useful time series forecasting feature to be extracted from that platform. Twitter, which numerous other researchers have examined for influenza forecasting with successful results, poses a number of obstacles, such as policy changes and the placement of features behind expensive paywalls through the disabling of existing free APIs. In any case, as delineated in this paper, adding Twitter data as another feature in eVision’s LSTM yielded an almost negligible predictive improvement.
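To make the idea of "adding Twitter data as another feature" concrete, the following is a minimal sketch of how aligned weekly series might be sliced into look-back windows for a sequence model such as an LSTM. It is not eVision's actual pipeline: the function name `make_windows`, the series names, the look-back length, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    target: Sequence[float],
    lookback: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Slice aligned weekly series into windows shaped (samples, lookback, n_features).

    `features` holds one series per feature, each the same length as `target`.
    Every window of `lookback` consecutive weeks is paired with the target
    value of the week that immediately follows it.
    """
    n = len(target)
    if any(len(series) != n for series in features):
        raise ValueError("every feature series must align with the target")
    if not 0 < lookback < n:
        raise ValueError("lookback must be positive and shorter than the series")

    X: List[List[List[float]]] = []
    y: List[float] = []
    for start in range(n - lookback):
        # One row per week in the window, one column per feature.
        window = [[series[t] for series in features]
                  for t in range(start, start + lookback)]
        X.append(window)
        y.append(target[start + lookback])
    return X, y


# Hypothetical toy data (not taken from the paper).
ili_rate = [1.0, 1.2, 1.5, 2.1, 2.8, 3.0]   # assumed weekly flu activity
trends_flu = [10, 12, 18, 25, 31, 30]       # assumed Google Trends keyword index
tweet_count = [5, 7, 6, 15, 22, 20]         # assumed weekly flu-related tweet count

# Baseline feature set versus the same set with the extra social media column.
X_base, y_base = make_windows([ili_rate, trends_flu], ili_rate, lookback=3)
X_plus, y_plus = make_windows([ili_rate, trends_flu, tweet_count], ili_rate, lookback=3)
```

Under these assumptions the two feature sets differ only in the width of each time step, so the same model architecture can be trained on both and the forecasts compared on identical targets.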