Uncovering Key Sources of Regional Ozone Simulation Biases Using Machine Learning and SHAP Analysis
Authors: Xin Yuan, Xinlong Hong, Zhijiong Huang, Li Sheng, Jinlong Zhang, Duohong Chen, Zhuangmin Zhong, Boguang Wang, Junyu Zheng
Journal: Environmental Pollution (impact factor 7.6; JCR Q1, Environmental Sciences)
Published: 2025-03-06
DOI: 10.1016/j.envpol.2025.126012 (https://doi.org/10.1016/j.envpol.2025.126012)
Citations: 0
Abstract
Atmospheric chemical transport models (CTMs) are widely used in air quality management, but their simulations still carry large biases. Identifying the key sources of these biases accurately and efficiently is crucial for model improvement. Traditional approaches, such as sensitivity and uncertainty analyses, are computationally intensive because they require multiple model runs. In this study, we explored machine learning, specifically XGBoost combined with SHAP analysis, as an efficient diagnostic tool for analyzing simulation biases, using ozone modeling in Guangdong Province as a case study. We used the biases of model inputs as features and excluded a dataset that was more susceptible to observational uncertainties to better target bias sources. The results reveal that biases in NO₂, NO, and PM₂.₅ concentrations, temperature, and biogenic emissions are important sources of O₃ simulation biases. Notably, NOₓ emissions were identified as the primary cause, particularly in VOC-limited regimes during autumn and winter. Additionally, underestimated NOₓ emissions caused the model to misrepresent the NO₂-O₃ relationship, leading to an underestimation of the spatial extent of VOC-limited regimes in the Pearl River Delta (PRD). This study demonstrates that improving NOₓ emission estimates reduces O₃ simulation biases in the PRD by 34% and improves the representation of the NO₂-O₃ relationship.
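To make the attribution idea in the abstract concrete, the following is a minimal sketch of exact Shapley value attribution on a toy surrogate. It is not the authors' XGBoost and SHAP pipeline: the function `toy_o3_bias`, its coefficients, the feature names, and the sample values are all hypothetical and chosen only to illustrate how an output bias can be split among input biases.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names: biases (simulated minus observed) in model inputs.
FEATURES = ["no2_bias", "temp_bias", "pm25_bias"]


def toy_o3_bias(x):
    """Hypothetical surrogate mapping input biases to an O3 bias.

    The coefficients are invented for illustration; they are not from the paper.
    """
    return (
        -0.8 * x["no2_bias"]
        + 1.5 * x["temp_bias"]
        - 0.1 * x["no2_bias"] * x["pm25_bias"]
    )


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a baseline point.

    Features outside a coalition are set to their baseline value.
    """
    n = len(FEATURES)
    phi = {}
    for i in FEATURES:
        others = [j for j in FEATURES if j != i]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = {
                    j: (x[j] if j in subset else baseline[j]) for j in FEATURES
                }
                with_i = dict(without_i)
                with_i[i] = x[i]
                total += weight * (f(with_i) - f(without_i))
        phi[i] = total
    return phi


# Baseline: no bias in any input. Sample: one hypothetical grid cell and hour.
baseline = {name: 0.0 for name in FEATURES}
sample = {"no2_bias": -5.0, "temp_bias": 1.0, "pm25_bias": 10.0}

phi = shapley_values(toy_o3_bias, sample, baseline)
for name, value in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the surrogate's output at the sample and at the baseline, which is the property that lets a ranking of this kind be read as a decomposition of the simulated bias. In practice, tree-specific SHAP algorithms compute such values efficiently for a fitted gradient-boosted model, so no enumeration of coalitions is needed.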
About the journal:
Environmental Pollution is an international peer-reviewed journal that publishes high-quality research papers and review articles covering all aspects of environmental pollution and its impacts on ecosystems and human health.
Subject areas include, but are not limited to:
• Sources and occurrences of pollutants that are clearly defined and measured in environmental compartments, food and food-related items, and human bodies;
• Interlinks between contaminant exposure and biological, ecological, and human health effects, including those of climate change;
• Contaminants of emerging concern (including but not limited to antibiotic-resistant microorganisms or genes, microplastics/nanoplastics, electronic wastes, light, and noise) and/or their biological, ecological, or human health effects;
• Laboratory and field studies on the remediation/mitigation of environmental pollution via new techniques and with clear links to biological, ecological, or human health effects;
• Modeling of pollution processes, patterns, or trends that is of clear environmental and/or human health interest;
• New techniques that measure and examine environmental occurrences, transport, behavior, and effects of pollutants within the environment or the laboratory, provided that they can be clearly used to address problems within regional or global environmental compartments.