Wenfu Sun, Frederik Tack, Lieven Clarisse, Rochelle Schneider, Trissevgeni Stavrakou, Michel Van Roozendael
Inferring Surface NO2 Over Western Europe: A Machine Learning Approach With Uncertainty Quantification
Journal of Geophysical Research: Atmospheres, vol. 129, no. 20, published 2024-10-15
DOI: 10.1029/2023JD040676
https://onlinelibrary.wiley.com/doi/10.1029/2023JD040676
Citation count: 0
Abstract
Nitrogen oxides (NOx = NO + NO2) are of great concern due to their impact on human health and the environment. In recent years, machine learning (ML) techniques have been widely used for surface NO2 estimation, helped by rapid developments in computational power and big data. However, the uncertainties inherent to such retrievals are rarely studied. In this study, a novel ML framework, enhanced with uncertainty quantification techniques, has been developed to estimate surface NO2 and provide the corresponding data-induced uncertainty. We apply the Boosting Ensemble Conformal Quantile Estimator (BEnCQE) model to infer surface NO2 concentrations over Western Europe at the daily scale and 1 km spatial resolution from May 2018 to December 2021. High NO2 concentrations appear mainly over urban areas, industrial areas, and roads. The space-based cross-validation shows that our model achieves accurate point estimates (r = 0.8, R² = 0.64, root mean square error = 8.08 μg/m³) and reliable prediction intervals (coverage probability, PI-50%: 51.0%, PI-90%: 90.5%). The model results also agree with the Copernicus Atmosphere Monitoring Service (CAMS) model. The quantile regression in our model enables us to understand the importance of predictors for estimating different NO2 levels. Additionally, the uncertainty information reveals additional potential exceedances of the World Health Organization (WHO) 2021 limit in some locations, which point estimates alone cannot detect. Meanwhile, the uncertainty quantification allows assessment of the model's robustness away from existing in-situ station measurements. It reveals the challenges of NO2 estimation over urban and mountainous areas, where NO2 is highly variable and heterogeneously distributed.
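The coverage probabilities quoted in the abstract measure how often observations fall inside a prediction interval of a given nominal level. The abstract does not describe the internals of BEnCQE, so the following is only a minimal sketch of a generic split-conformal adjustment of quantile intervals, not the authors' implementation. The function names (`conformal_margin`, `coverage`, `demo`), the synthetic data, and the fixed raw half-width are all assumptions made for illustration.

```python
import numpy as np


def conformal_margin(y_cal, lo_cal, hi_cal, alpha):
    """Split-conformal margin for quantile intervals (generic, hypothetical helper)."""
    y_cal, lo_cal, hi_cal = map(np.asarray, (y_cal, lo_cal, hi_cal))
    # Conformity score: distance by which each calibration observation
    # falls outside its raw interval (negative when it lies inside).
    scores = np.sort(np.maximum(lo_cal - y_cal, y_cal - hi_cal))
    n = scores.size
    # Finite-sample rank ceil((n + 1)(1 - alpha)); capping at n is a
    # simplification (strictly, the interval would be unbounded there).
    k = min(n, int(np.ceil((n + 1) * (1 - alpha))))
    return float(scores[k - 1])


def coverage(y, lo, hi):
    """Fraction of observations inside [lo, hi], i.e. the coverage probability."""
    y, lo, hi = map(np.asarray, (y, lo, hi))
    return float(np.mean((y >= lo) & (y <= hi)))


def demo(seed=0, n=2000):
    """Compare coverage of a too-narrow raw interval before and after calibration."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in for model predictions and station observations
    # (assumed values, not data from the study).
    center_cal = rng.uniform(5.0, 40.0, n)
    center_test = rng.uniform(5.0, 40.0, n)
    y_cal = center_cal + rng.normal(0.0, 1.0, n)
    y_test = center_test + rng.normal(0.0, 1.0, n)
    half = 0.5  # deliberately too narrow for a nominal 90% interval
    before = coverage(y_test, center_test - half, center_test + half)
    q = conformal_margin(y_cal, center_cal - half, center_cal + half, alpha=0.1)
    after = coverage(y_test, center_test - half - q, center_test + half + q)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print(f"PI-90% coverage before calibration: {before:.3f}, after: {after:.3f}")
```

In this sketch the calibrated interval is obtained by widening both raw quantile bounds by the same margin, so the empirical coverage on held-out data should land near the nominal level, which is the property the PI-50% and PI-90% figures in the abstract are checking.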
About the journal:
JGR: Atmospheres publishes articles that advance and improve understanding of atmospheric properties and processes, including the interaction of the atmosphere with other components of the Earth system.