Prediction of influenza outbreaks by integrating Wikipedia article access logs and Google flu trend data
Batuhan Bardak, Mehmet Tan
2015 IEEE 15th International Conference on Bioinformatics and Bioengineering (BIBE), published 2015-11-02
DOI: https://doi.org/10.1109/BIBE.2015.7367640
Citations: 23
Abstract
Predicting influenza outbreaks is of utmost importance to health practitioners, public officials, and the general public. As internet usage has grown, it has become easier and more valuable to collect and process internet search query data. Two platforms are especially widely used: Google and Wikipedia. Both make access statistics available, so we can see how often a given query was searched or a given article was visited. Google has its own web service for monitoring and forecasting influenza activity, called Google Flu Trends, which provides estimates for some countries. The second source is the Wikipedia access logs, which provide the number of visits to Wikipedia articles. Previous papers have worked with these sources separately. In this paper, we propose a new technique that uses the two sources together to improve the prediction of influenza outbreaks. Using linear regression models, we achieved promising results for both nowcasting and forecasting.
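The abstract does not spell out the model, so the following is only a minimal sketch of how two signals could be combined in one linear regression, not the authors' actual method. Everything in it is an assumption made for illustration: the weekly series are synthetic, and the names `gft`, `wiki`, `ili`, `make_design`, and `horizon` are hypothetical. A horizon of 0 pairs features with the same week's target (nowcasting); a positive horizon pairs them with a later week (forecasting).

```python
import numpy as np


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, slopes...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a feature matrix (or a single 1-D feature)."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def make_design(gft, wiki, ili, horizon):
    """Pair features at week t with the target at week t + horizon.

    horizon == 0 is nowcasting; horizon > 0 is forecasting.
    """
    X = np.column_stack([gft, wiki])
    if horizon > 0:
        return X[:-horizon], ili[horizon:]
    return X, ili


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic weekly data (assumption: NOT real surveillance or search data).
rng = np.random.default_rng(0)
weeks = np.arange(104)
ili = 2.0 + 1.5 * np.sin(2 * np.pi * weeks / 52) + rng.normal(0, 0.1, weeks.size)
gft = 1.1 * ili + rng.normal(0, 0.2, weeks.size)      # stand-in for a Google Flu Trends estimate
wiki = 1000.0 * ili + rng.normal(0, 150, weeks.size)  # stand-in for Wikipedia article view counts

# Nowcasting: both signals at week t explain the target at week t.
X_now, y_now = make_design(gft, wiki, ili, horizon=0)
coef_now = fit_linear(X_now, y_now)
rmse_combined = rmse(y_now, predict(coef_now, X_now))

# Single-source baseline for comparison (first signal only).
coef_gft = fit_linear(gft, ili)
rmse_gft_only = rmse(ili, predict(coef_gft, gft))

# Forecasting: both signals at week t explain the target one week ahead.
X_fc, y_fc = make_design(gft, wiki, ili, horizon=1)
coef_fc = fit_linear(X_fc, y_fc)
```

In this sketch the in-sample error of the combined model can never exceed that of the single-source baseline, because the baseline is a special case of the combined model. A real evaluation would need held-out weeks to show whether the second source actually improves prediction.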