Machine learning based post‐processing of model‐derived near‐surface air temperature – a multi‐model approach
Gabriel Stachura, Zbigniew Ustrnul, Piotr Sekuła, Bogdan Bochenek, Marcin Kolonko, Małgorzata Szczęch‐Gajewska
Quarterly Journal of the Royal Meteorological Society, published 2023-11-05
DOI: 10.1002/qj.4613 (https://doi.org/10.1002/qj.4613)
Impact factor: 3.0; JCR: Q2 (Meteorology & Atmospheric Sciences)
Citations: 0
Abstract
This article proposes a machine‐learning‐based tool for calibrating numerical forecasts of near‐surface air temperature. The study area is Poland, which has a temperate climate with transitional features and highly variable weather. Direct output of numerical weather prediction (NWP) models is often biased and needs to be adjusted to observed values, and forecasters have to reconcile forecasts from several NWP models during their operational work. Because the proposed method is based on deterministic forecasts from three short‐range limited‐area models (ALARO, AROME and COSMO), it can support forecasters in their decision‐making. Predictors include forecasts of weather elements produced by the NWP models at synoptic weather stations across Poland and station‐embedded data on ambient orography. The Random Forests (RF) algorithm was used to produce bias‐corrected forecasts on a test set spanning one year. Its performance was evaluated against the NWP models, a linear combination of all predictors (multiple linear regression, MLR) and a basic artificial neural network (ANN). A detailed evaluation was carried out to identify potential strengths and weaknesses of the model across temporal and spatial scales. The RMSE of the RF forecast was 11% lower than that of the MLR model and 27% lower than that of the best‐performing NWP model. The ANN model performed better still, outperforming RF by around 2.5%. The greatest improvement occurred for the warm bias during nighttime from July to September. The largest difference in forecast accuracy between RF and ANN appeared for temperature drops on April nights. The poor performance of RF for extreme temperature ranges may be mitigated by training the model on the forecast error instead of the observed values of the variable.
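The last point of the abstract, training on forecast error rather than on observed temperature, can be illustrated with a small sketch. It is not the authors' implementation: `BinnedMeanRegressor` is a hypothetical toy stand-in for a tree‐based learner (it is not Random Forests), the single predictor is the raw forecast, and all data are synthetic with an assumed warm bias. The stand-in shares one relevant property with regression trees: its predictions are averages of training targets, so it cannot predict beyond the range of targets it has seen.

```python
import math


def rmse(pred, obs):
    """Root-mean-square error of two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def relative_rmse_reduction(rmse_new, rmse_ref):
    """Percentage by which rmse_new is lower than rmse_ref."""
    return 100.0 * (rmse_ref - rmse_new) / rmse_ref


class BinnedMeanRegressor:
    """Hypothetical toy learner: predicts the mean training target of the
    bin an input falls into, or of the nearest bin seen in training."""

    def __init__(self, width=5.0):
        self.width = width
        self.means = {}

    def _bin(self, x):
        return math.floor(x / self.width)

    def fit(self, xs, ys):
        groups = {}
        for x, y in zip(xs, ys):
            groups.setdefault(self._bin(x), []).append(y)
        self.means = {b: sum(v) / len(v) for b, v in groups.items()}
        return self

    def predict(self, xs):
        out = []
        for x in xs:
            b = self._bin(x)
            nearest = min(self.means, key=lambda k: abs(k - b))
            out.append(self.means[nearest])
        return out


# Synthetic data (assumption): forecasts run warm, more so at high temperatures.
train_fc = [float(t) for t in range(0, 21)]       # forecasts seen in training, 0..20 degC
train_obs = [0.95 * t - 1.0 for t in train_fc]    # matching "observations"
test_fc = [30.0, 32.0, 35.0]                      # hotter than anything in training
test_obs = [0.95 * t - 1.0 for t in test_fc]

# Variant A: learn the observed temperature directly.
direct_pred = BinnedMeanRegressor().fit(train_fc, train_obs).predict(test_fc)

# Variant B: learn the forecast error (obs - forecast), then add it back.
train_err = [o - f for o, f in zip(train_obs, train_fc)]
err_pred = BinnedMeanRegressor().fit(train_fc, train_err).predict(test_fc)
corrected = [f + e for f, e in zip(test_fc, err_pred)]

rmse_raw = rmse(test_fc, test_obs)
rmse_direct = rmse(direct_pred, test_obs)
rmse_error_target = rmse(corrected, test_obs)

print("raw forecast RMSE:        ", round(rmse_raw, 3))
print("trained on observed value:", round(rmse_direct, 3))
print("trained on forecast error:", round(rmse_error_target, 3))
print("reduction vs raw (%):     ",
      round(relative_rmse_reduction(rmse_error_target, rmse_raw), 1))
```

In this toy setting the direct variant is capped at the warmest temperature seen in training, while the error‐target variant inherits the extreme value from the forecast itself and only has to learn a correction. Whether the same holds for the paper's RF model and real predictors is what the abstract suggests, not something this sketch demonstrates.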
About the journal:
The Quarterly Journal of the Royal Meteorological Society is a journal published by the Royal Meteorological Society. It aims to communicate and document new research in the atmospheric sciences and related fields. The journal is considered one of the leading publications in meteorology worldwide. It accepts articles, comprehensive review articles, and comments on published papers. It is published eight times a year, with additional special issues.
The Quarterly Journal has a wide readership of scientists in the atmospheric and related fields. It is indexed and abstracted in various databases, including Advanced Polymers Abstracts, Agricultural Engineering Abstracts, CAB Abstracts, CABDirect, COMPENDEX, CSA Civil Engineering Abstracts, Earthquake Engineering Abstracts, Engineered Materials Abstracts, Science Citation Index, SCOPUS, Web of Science, and more.