A Triangulation Meta-Learning Framework for Imputing Missing Values in Weather Time Series

Revista Brasileira de Cartografia (JCR Q4, Social Sciences). Pub Date: 2021-10-18. DOI: 10.14393/rbcv73n4-59795
Vinícius H. A. Alves, Marconi de Arruda Pereira

Abstract

Machine learning and statistical methods can help model meteorological phenomena, especially when many variables are involved. However, it is not unusual for measurements of those variables to fail, leaving gaps in the data and compromising the analysis of historical records. The framework combines the predictions of three machine learning methods (decision trees, artificial neural networks, and support vector machines) with values calculated by five triangulation methods: arithmetic average, inverse distance weighted, optimized inverse distance weighted, optimized normal ratio, and regional weight. Each machine learning algorithm generates eight regression models. One of them makes predictions based only on the date; the remaining seven each make predictions based on the date plus one weather parameter (maximum temperature, minimum temperature, insolation, among others). The triangulation methods use climatic data from three neighboring cities to estimate the parameter for the target city. The resulting dataset is subsequently optimized by meta-learning algorithms. The results show that the additional information provided by the new machine learning models and the triangulation methods significantly increased the accuracy of the imputed data. Moreover, the statistical analysis and the coefficient of determination R² showed that the meta-learning model based on regression trees successfully combined the base-level outputs to produce the values that best fill in the missing values of the time series studied in this paper.
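As a rough illustration of the two simplest triangulation methods named in the abstract, the sketch below estimates a missing value at a target city from three neighboring stations. It uses the textbook forms of the arithmetic average and inverse distance weighting, which may differ from the exact formulations in the paper. The function names, the `power=2.0` exponent, and all station values and distances are hypothetical and chosen only for illustration.

```python
def arithmetic_average(neighbor_values):
    """Estimate the target value as the plain mean of the neighbor values."""
    return sum(neighbor_values) / len(neighbor_values)


def inverse_distance_weighted(neighbor_values, distances, power=2.0):
    """Weight each neighbor by 1 / distance**power.

    power=2.0 is an assumed, commonly used exponent, not a value
    taken from the paper.
    """
    weights = [1.0 / d ** power for d in distances]
    weighted_sum = sum(w * v for w, v in zip(weights, neighbor_values))
    return weighted_sum / sum(weights)


# Hypothetical maximum temperatures (deg C) recorded on one day at three
# neighboring stations, and hypothetical distances (km) to the target city.
values = [30.0, 28.0, 26.0]
distances = [10.0, 20.0, 40.0]

aa = arithmetic_average(values)
idw = inverse_distance_weighted(values, distances)
print(f"arithmetic average: {aa:.2f}")
print(f"inverse distance weighted: {idw:.2f}")
```

In a setup like the one the abstract describes, estimates of this kind would sit alongside the machine learning predictions as input features for the meta-learner. Here the closest station dominates the inverse distance weighted estimate, so it lands nearer to that station's reading than the plain mean does.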