KNN imputation to missing values of regression-based rain duration prediction on BMKG data

Ikke Dian Oktaviani, Aji Gautama Putrada
Journal: Jurnal Infotel
Publication date: 2022-11-01
Publication type: Journal Article
DOI: 10.20895/infotel.v14i4.840 (https://doi.org/10.20895/infotel.v14i4.840)
Citations: 4

Abstract

Predicting rain duration from Meteorology, Climatology, and Geophysics Agency (BMKG) data is an important but still open problem. At the same time, several studies have shown that missing values can degrade a model's predictive performance. This study proposes k-nearest neighbors (KNN) imputation to handle missing values when predicting rain duration. The rain duration dataset comes from BMKG data. For the regression model, we compared gradient boosting regression (GBR), adaptive boosting regression (ABR), and linear regression (LR). We compared KNN imputation with several benchmark methods: zero imputation, mean imputation, and iterative imputation. The coefficient of determination (r2), mean squared error (MSE), and mean bias error (MBE) measure the performance of these imputation methods. The test results show that, among the regression methods, GBR performs best on both the training data and the test data, with r2 = 0.915 and 0.776, respectively. Our proposed KNN imputation performs best for missing value imputation compared to the benchmark imputation methods. With KNN imputation at a missing percentage of 90%, the prediction r2 and MSE are 0.71 and 0.36, respectively.
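The imputation and scoring steps named in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the value of k, the distance metric, or the input features, so the NaN-aware Euclidean distance, k = 2, the three-column feature rows, and the MBE sign convention (prediction minus observation) are all hypothetical choices made here.

```python
import math

def nan_euclidean(a, b):
    """Euclidean distance over coordinates observed in both rows,
    rescaled by (total coordinates / shared coordinates)."""
    pairs = [(x, y) for x, y in zip(a, b)
             if not (math.isnan(x) or math.isnan(y))]
    if not pairs:
        return math.inf  # no shared coordinates: treat as infinitely far
    squared = sum((x - y) ** 2 for x, y in pairs)
    return math.sqrt(len(a) / len(pairs) * squared)

def knn_impute(rows, k=2):
    """Replace each NaN with the mean of that column over the k nearest
    rows that have the column observed. Distances use the original rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if not math.isnan(value):
                continue
            donors = sorted(
                (nan_euclidean(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and not math.isnan(other[j])
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:  # leave NaN in place when no usable donor exists
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mbe(y_true, y_pred):
    """Mean bias error; sign convention (prediction - observation) is assumed."""
    return sum(p - t for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical rows: [temperature, humidity, wind speed]; NaN marks a gap.
nan = math.nan
rows = [
    [25.0, 80.0, 2.0],
    [26.0, nan, 2.5],
    [31.0, 60.0, 5.0],
    [25.5, 82.0, nan],
]
filled = knn_impute(rows, k=2)
print(filled[1][1], filled[3][2])  # prints "81.0 2.25"
```

In a benchmark like the one the abstract describes, the imputed feature matrix would then be passed to each regressor, and `r2`, `mse`, and `mbe` would be computed on held-out rain durations for each imputation method and missing percentage.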