Deep ensemble machine learning with Bayesian blending improved accuracy and precision of modelled ground-level ozone for region with sparse monitoring: Australia, 2005–2018
I.C. Hanigan , W. Yu , C. Yuen , K. Gopi , L.D. Knibbs , C.T. Cowie , B. Jalaludin , M. Cope , M.L. Riley , J. Heyworth , L. Morawska , G.B. Marks , G.G. Morgan , Y. Guo
Environmental Modelling & Software, Volume 187, Article 106378
Published: 2025-02-16
DOI: 10.1016/j.envsoft.2025.106378
URL: https://www.sciencedirect.com/science/article/pii/S1364815225000623
Citations: 0
Abstract
Ground-level ozone (O₃) is a significant public health concern. We developed maps of monthly average 1-h maximum O₃ concentrations in New South Wales, Australia (2005–2018), a region with sparse monitoring. For the first time, Bayesian Maximum Entropy (BME) blending was used within a Deep Ensemble Machine Learning (DEML) framework for air pollution predictions. The DEML combined geographical predictors in random forest (RF), extreme gradient boosting (XGBoost), and gradient boosted machine (GBM) models with three meta-models. BME blending incorporated observed O₃ data into posterior predictions. We generated 2.5 km × 2.5 km resolution gridded surfaces. The DEML estimates achieved an R² of 0.89 and RMSE of 2.3 ppb in the held-out test dataset at monitors. DEML grid cell predictions (R²: 0.84, RMSE: 3.03 ppb) were improved by BME blending (R²: 0.89, RMSE: 2.49 ppb). Mean bias was reduced from −0.7 ppb to −0.4 ppb. This demonstrates high accuracy and precision in a sparsely monitored region.
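The three measures quoted in the abstract (R², RMSE, and mean bias) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `evaluation_metrics`, the toy values, and the sign convention for bias (predicted minus observed) are assumptions made for illustration, and R² is computed here as the coefficient of determination, which may differ from the definition used in the paper.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return R^2, RMSE, and mean bias for paired observed/predicted values.

    Assumptions (not taken from the paper): bias is predicted minus observed,
    and R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    mean_bias = sum(errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance; R^2 is undefined")

    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "rmse": rmse, "mean_bias": mean_bias}


# Hypothetical monthly O3 values in ppb at four monitors (toy data only).
observed_ppb = [20.0, 25.0, 30.0, 35.0]
predicted_ppb = [19.0, 25.0, 29.0, 35.0]
metrics = evaluation_metrics(observed_ppb, predicted_ppb)
print(metrics)
```

A negative mean bias under this convention means the predictions sit below the observations on average, which is the direction of the −0.7 ppb and −0.4 ppb figures reported above if the paper uses the same sign convention.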
About the journal:
Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.