A comparative analysis of machine learning approaches to gap filling meteorological datasets

IF 2.8 | Zone 4: Environmental Science and Ecology | JCR Q3: Environmental Sciences | Environmental Earth Sciences, vol. 83, issue 24 | Pub Date: 2024-12-02 | DOI: 10.1007/s12665-024-11982-8

Branislava Lalic, Adam Stapleton, Thomas Vergauwen, Steven Caluwaerts, Elke Eichelmann, Mark Roantree

Citations: 0

Abstract

Observational data of the Earth’s weather and climate at the level of ground-based weather stations are prone to gaps due to a variety of causes. These gaps can inhibit scientific research because they impede the use of numerical models for agricultural, meteorological and climatological applications and introduce analytic biases. In this research, different machine learning techniques are evaluated together with traditional approaches to gap filling automated weather station data. When filling gaps for a specific data stream, data from neighbouring weather stations are used in addition to reanalysis data from the European Centre for Medium-Range Weather Forecasts atmospheric reanalyses of the global climate, ERA5-Land. A novel gap creation method is introduced that provides 100% coverage in sampling the dataset while ensuring that the sampled data are randomly distributed. Gap-filling performance across a range of gap lengths and target variables is compared using a range of error functions. The variables selected for modelling are mean air temperature, dew point, mean relative humidity and leaf wetness. Our results show that models perform best when gap filling temperature and dew point and worst on leaf wetness. As expected, model performance decreases with increasing gap length. Comparison between machine learning and reanalysis approaches shows very promising results from several of the machine learning models.
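
The abstract does not describe how the gap creation method works, so the following is only a minimal sketch of one way to obtain artificial gaps that cover every sample exactly once while being placed at random. The block-and-shuffle scheme, the function names (make_gap_folds, fill_linear), the synthetic temperature series and the choice of linear interpolation as the "traditional" baseline are all assumptions made for illustration, not the authors' procedure. RMSE and MAE stand in for the unspecified "range of error functions".

```python
import numpy as np

def make_gap_folds(n, gap_len, n_folds, rng):
    """Hypothetical gap creation: cut indices 0..n-1 into contiguous blocks
    of length gap_len, shuffle the blocks, and deal them into n_folds masks.
    Each index is masked in exactly one fold (100% coverage)."""
    starts = np.arange(0, n, gap_len)
    rng.shuffle(starts)  # random placement of gaps across folds
    masks = np.zeros((n_folds, n), dtype=bool)
    for i, s in enumerate(starts):
        # Adjacent blocks may land in the same fold and merge into a longer gap.
        masks[i % n_folds, s:s + gap_len] = True
    return masks

def fill_linear(t, y, mask):
    """Baseline filler: linear interpolation between observed neighbours."""
    filled = y.copy()
    filled[mask] = np.interp(t[mask], t[~mask], y[~mask])
    return filled

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def mae(a, b):
    return float(np.mean(np.abs(a - b)))

# Synthetic hourly air temperature for 30 days (assumed data, not from the paper).
rng = np.random.default_rng(0)
t = np.arange(24 * 30, dtype=float)
temp = 15 + 8 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.5, t.size)

masks = make_gap_folds(t.size, gap_len=6, n_folds=5, rng=rng)

for k, mask in enumerate(masks):
    filled = fill_linear(t, temp, mask)
    # Errors are computed only on the artificially removed points.
    print(k, rmse(filled[mask], temp[mask]), mae(filled[mask], temp[mask]))
```

Repeating the loop with larger gap_len values would be one way to examine how error grows with gap length, the trend the abstract reports. A machine learning filler using neighbouring stations or reanalysis data as predictors could replace fill_linear without changing the masking or scoring code.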

Source journal

Environmental Earth Sciences (Environmental Sciences - Geosciences, Multidisciplinary)

CiteScore: 5.10
Self-citation rate: 3.60%
Articles published: 494
Review time: 8.3 months

Journal description: Environmental Earth Sciences is an international multidisciplinary journal concerned with all aspects of interaction between humans, natural resources, ecosystems, special climates or unique geographic zones, and the earth. Topics include: water and soil contamination caused by waste management and disposal practices; environmental problems associated with transportation by land, air, or water; geological processes that may impact biosystems or humans; man-made or naturally occurring geological or hydrological hazards; environmental problems associated with the recovery of materials from the earth; environmental problems caused by extraction of minerals, coal, and ores, as well as oil and gas, water and alternative energy sources; environmental impacts of exploration and recultivation; environmental impacts of hazardous materials; management of environmental data and information in data banks and information systems; and dissemination of knowledge on techniques, methods, approaches and experiences to improve and remediate the environment. In pursuit of these topics, the geoscientific disciplines are invited to contribute their knowledge and experience. Major disciplines include hydrogeology, hydrochemistry, geochemistry, geophysics, engineering geology, remediation science, natural resources management, environmental climatology and biota, environmental geography, soil science and geomicrobiology.