Synergising Simulated Annealing and Generative Adversarial Network for Enhanced Wind Data Imputation in Climate Change Modelling
Soumyabrata Bhattacharjee, Gaurav Kumar Gugliani
Journal of Climate Change, published 2024-03-20
DOI: https://doi.org/10.3233/jcc240004
Abstract
Climate models simulate and predict how the Earth's climate will change. Wind speed data are critical for developing and validating such models. In practice, however, a considerable share of wind data goes missing because of factors such as station maintenance and sensor failures. Generative Adversarial Networks (GANs) have been used to impute missing wind data, but the handling of unrealistic GAN output has remained largely unstudied. In this paper, we propose a novel hybrid approach that combines a GAN with the dual annealing algorithm, both to impute missing wind speed data and to counter unrealistic GAN outcomes. Hourly mean wind data were collected from the National Centers for Environmental Information for four Indian stations: Ahmedabad, Indore, Mangaluru and Mumbai. We compared the proposed approach with k-nearest neighbours (k-NN), soft imputation, and plain GAN-based approaches on mean, variance, standard deviation, kurtosis, skewness, and R-square. Our approach ranks first by R-square value at all four stations and, unlike the plain GAN, consistently produces realistic results. Mumbai has the lowest percentage of missing data (13.14%) and the highest R-square value (0.9999186451), whereas Indore has the highest percentage of missing data (46.6463%) and the lowest R-square value (0.9046885604).
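The comparison criteria named in the abstract (mean, variance, standard deviation, kurtosis, skewness, and R-square) can be illustrated with a minimal sketch. The abstract does not say which estimator conventions the authors used, so the choices here are assumptions: population moments (divisor n), non-excess kurtosis, and R-square taken as the coefficient of determination of the imputed series against a reference series. The function names `moments` and `r_square` and the sample values are hypothetical and do not come from the paper.

```python
import math
from statistics import fmean


def moments(xs):
    """Return mean, population variance, standard deviation, skewness and kurtosis."""
    n = len(xs)
    mu = fmean(xs)
    # Central moments of order 2, 3 and 4 (population form, divisor n: an assumption).
    m2 = sum((x - mu) ** 2 for x in xs) / n
    m3 = sum((x - mu) ** 3 for x in xs) / n
    m4 = sum((x - mu) ** 4 for x in xs) / n
    return {
        "mean": mu,
        "variance": m2,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,  # non-excess form: a normal distribution gives 3
    }


def r_square(reference, estimate):
    """Coefficient of determination of `estimate` against `reference`."""
    mu = fmean(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mu) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Made-up hourly wind speeds in m/s; these are NOT values from the paper.
observed = [2.1, 3.4, 4.0, 5.2, 3.8, 2.9, 4.6, 6.1]
imputed = [2.0, 3.5, 4.1, 5.0, 3.9, 3.0, 4.4, 6.0]

# Pair each statistic of the reference series with that of the imputed series.
imputed_stats = moments(imputed)
report = {name: (value, imputed_stats[name]) for name, value in moments(observed).items()}
score = r_square(observed, imputed)
```

Here `report` pairs each statistic of the reference series with the same statistic of the imputed series, and `score` approaches 1 as the imputed values approach the reference values. An imputation method that preserves the distribution should keep each pair close.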