Multistep Forecasting of New COVID-19 Cases Based on LSTMs Using Bayesian Optimization

Tianqian Chen, Shuyu Chen, Shan Mei, Shuqi An, Xiaohan Yuan, Yuwen Lu
Published in: 2021 International Symposium on Electrical, Electronics and Information Engineering
Publication date: 2021-02-19
DOI: 10.1145/3459104.3459116 (https://doi.org/10.1145/3459104.3459116)

Abstract

Multistep prediction of new coronavirus disease (COVID-19) cases plays a vital role during epidemic control, and the time series model based on Long Short-Term Memory (LSTM) networks is the most frequently used of the many prediction methods. However, both the cumulative error of multistep prediction and the instability of COVID-19 new-case data degrade LSTM performance on this task. In this paper, we select three countries with severe COVID-19 epidemics (India, Russia, and Chile), predict new cases for the next 15 days with different multistep LSTM network models, and use Bayesian optimization to explore the hyperparameter space. The results show that: a) the recursive prediction LSTM performs best (Mean Absolute Percentage Error, MAPE, was reduced to 14.88%, 6.46%, and 16.31% for the three countries, respectively), the encoder-decoder LSTM is second (15.52%, 19.61%, 19.87%), and the vector output LSTM performs worst (23.55%, 26.82%, 19.57%); b) the hyperparameter space contains obvious regions of extremely poor performance, and the Bayesian optimizer can concentrate on the good regions, avoiding the cost of tuning around bad hyperparameters; c) new-case data from different countries call for very different model hyperparameters. These poor hyperparameter regions and the differing hyperparameter requirements are likely among the reasons why COVID-19 data from different countries are hard to train on jointly.
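As a rough illustration of two ideas named in the abstract, the sketch below shows how a recursive multistep forecast feeds each one-step prediction back in as input, and how MAPE is computed. It is not the authors' implementation: the function names, the window size, the toy data, and the `window_mean` stand-in (used here in place of a trained one-step LSTM) are all assumptions made for this example.

```python
import numpy as np


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)


def recursive_forecast(one_step_model, history, window, horizon):
    """Forecast `horizon` steps by feeding each prediction back as the newest input."""
    buffer = list(history[-window:])
    forecasts = []
    for _ in range(horizon):
        # The model only ever predicts one step ahead from the latest window.
        next_value = float(one_step_model(np.asarray(buffer[-window:])))
        forecasts.append(next_value)
        # Appending the prediction is what lets errors accumulate over the horizon.
        buffer.append(next_value)
    return np.asarray(forecasts)


def window_mean(window_values):
    """Hypothetical stand-in for a trained one-step LSTM: the mean of the window."""
    return window_values.mean()


# Toy daily new-case counts (made up for illustration, not real data).
history = np.array([100.0, 110.0, 120.0, 130.0])
forecast = recursive_forecast(window_mean, history, window=2, horizon=3)
print(forecast)
print(mape([100.0, 200.0], [110.0, 180.0]))
```

Because later steps are predicted from earlier predictions rather than from observed values, any one-step error propagates forward, which is the cumulative-error effect the abstract refers to. A vector output model would instead emit all 15 values in a single forward pass.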