Chun-Sheng Huang, Kang Lo, Yee-Lin Wu, Fu-Cheng Wang, Yi-Shiang Shiu, Chu-Chih Chen, Yuan-Chien Lin, Cheng-Pin Kuo, Ho-Tang Liao, Tang-Huang Lin, Chang-Fu Wu
Title: Estimating and characterizing spatiotemporal distributions of elemental PM2.5 using an ensemble machine learning approach in Taiwan
Journal: Atmospheric Pollution Research, vol. 16, no. 5, Article 102463 (impact factor 3.9; JCR Q2, Environmental Sciences)
Published: 2025-02-14
DOI: 10.1016/j.apr.2025.102463
URL: https://www.sciencedirect.com/science/article/pii/S1309104225000650
Abstract
This paper presents an ensemble machine learning approach that combines a generalized additive model (GAM) with eXtreme Gradient Boosting (XGBoost) to estimate and characterize the spatiotemporal distributions of elemental PM2.5 in Taiwan. Daily field measurements of 12 PM2.5 elemental components were collected at 28 air quality monitoring stations between June 2021 and May 2022. Time-variant meteorological factors and time-invariant land-use patterns were used as predictors. The ensemble model effectively captured spatial variation in elemental PM2.5 levels, as shown by the numerous time-invariant features identified through Shapley additive explanations (SHAP) analysis. In a comparison, a model using only XGBoost outperformed the ensemble model, with higher cross-validated R² and lower prediction errors. While the XGBoost-only model is recommended for exposure prediction, the ensemble model offers superior interpretability for investigating air pollution sources and aids in formulating air quality strategies from a spatial perspective.
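The model comparison in the abstract rests on cross-validated R² and prediction error. As a rough illustration of how such scores are commonly computed from out-of-fold predictions, here is a minimal sketch in plain Python. The abstract does not state the cross-validation scheme, so the 10-fold random split, the synthetic data, and the two stand-in models (`fit_mean`, `fit_line`) are all assumptions made for illustration; they are not the paper's GAM or XGBoost models.

```python
import math
import random
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root-mean-square prediction error."""
    return math.sqrt(mean([(o - p) ** 2 for o, p in zip(observed, predicted)]))


def k_fold_splits(n, k, seed=0):
    """Yield (train_idx, test_idx); every sample is in exactly one test fold."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


def cross_validate(fit, x, y, k=10, seed=0):
    """Collect out-of-fold predictions, then score them once over all samples."""
    preds = [0.0] * len(y)
    for train, test in k_fold_splits(len(y), k, seed):
        predict = fit([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = predict(x[i])
    return r_squared(y, preds), rmse(y, preds)


# Hypothetical stand-in models (NOT the paper's GAM or XGBoost).
def fit_mean(x_train, y_train):
    """Baseline that always predicts the training mean."""
    m = mean(y_train)
    return lambda x_new: m


def fit_line(x_train, y_train):
    """Ordinary least-squares line through one predictor."""
    mx, my = mean(x_train), mean(y_train)
    sxx = sum((a - mx) ** 2 for a in x_train)
    slope = sum((a - mx) * (b - my) for a, b in zip(x_train, y_train)) / sxx
    return lambda x_new: my + slope * (x_new - mx)


# Synthetic data (an assumption): one predictor, linear signal plus noise.
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 * xi + 1.0 + rng.gauss(0, 1) for xi in x]

r2_mean, rmse_mean = cross_validate(fit_mean, x, y)
r2_line, rmse_line = cross_validate(fit_line, x, y)
print(f"mean baseline: R2={r2_mean:.3f}, RMSE={rmse_mean:.3f}")
print(f"linear model:  R2={r2_line:.3f}, RMSE={rmse_line:.3f}")
```

Because every prediction is scored on data the model did not see during fitting, the model with the higher cross-validated R² and lower RMSE is the one expected to generalize better. With monitoring-station data, a split grouped by station may be more appropriate than a random one; which scheme the authors used is not stated in the abstract.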
About the journal:
Atmospheric Pollution Research (APR) is an international journal designed for the publication of articles on air pollution. Papers should present novel experimental results, theory and modeling of air pollution on local, regional, or global scales. Areas covered are research on inorganic, organic, and persistent organic air pollutants, air quality monitoring, air quality management, atmospheric dispersion and transport, air-surface (soil, water, and vegetation) exchange of pollutants, dry and wet deposition, indoor air quality, exposure assessment, health effects, satellite measurements, natural emissions, atmospheric chemistry, greenhouse gases, and effects on climate change.