Unveiling the Transparency of Prediction Models for Spatial PM2.5 over Singapore: Comparison of Different Machine Learning Approaches with eXplainable Artificial Intelligence
M. S. Shyam Sunder, Vinay Anand Tikkiwal, Arun Kumar, Bhishma Tyagi
Journal: AI (Basel, Switzerland). Published: 2023-09-27. DOI: 10.3390/ai4040040 (https://doi.org/10.3390/ai4040040). Impact factor: 3.1. JCR: Q2 (Computer Science, Artificial Intelligence). Source platform: Semanticscholar.
Citations: 0
Abstract
Aerosols play a crucial role in the climate system through direct and indirect effects, such as scattering and absorbing radiant energy. They also degrade visibility and harm human health. Exposure to fine particulate matter (PM2.5) is associated with cardiovascular and respiratory diseases. Long-term trends in PM concentrations are influenced by both emissions and meteorological variations, whereas short-term variations are driven primarily by meteorological factors. Factors such as vegetation cover, relative humidity, temperature, and wind speed affect the variability of surface PM2.5 concentrations. Machine learning has proved to be a good predictor of air quality. This study focuses on predicting PM2.5 using these parameters as inputs to capture spatial and temporal information. The work analyzes in situ PM2.5 observations at five locations over Singapore for seven years (2014–2021), and these datasets are used for spatial prediction of PM2.5. The study aims to provide a novel framework for temporal prediction using Random Forest (RF), Gradient Boosting (GB) regression, and the Tree-based Pipeline Optimization Tool (TP), an AutoML method based on a meta-heuristic genetic algorithm. TP produced reasonable Global Performance Index (GPI) values: the highest was 7.4 in August 2016 and the lowest was −0.6 in June 2019. This indicates positive performance of the TP model; even its negative values are smaller in magnitude than those of the other models, denoting less pessimistic predictions. The outcomes are explained with eXplainable Artificial Intelligence (XAI) techniques, which help to investigate the fidelity of the models' feature importance and to extract information about the rhythmic shift of the PM2.5 pattern.
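The abstract does not name the specific XAI technique used. As one illustration of how the feature importance of a fitted regressor can be checked, the sketch below applies permutation importance, a common model-agnostic technique that is swapped in here and is not necessarily the authors' method. Everything in it is an assumption made for illustration: the synthetic data, the value ranges, and the hypothetical `toy_model`, which stands in for a fitted RF, GB, or TP pipeline. Only the four feature names come from the abstract.

```python
import random

# Feature names taken from the abstract; all values below are synthetic.
FEATURES = ["relative_humidity", "temperature", "wind_speed", "vegetation_cover"]


def make_synthetic_data(n_rows, rng):
    """Generate made-up rows in which PM2.5 depends only on humidity and wind speed."""
    rows, targets = [], []
    for _ in range(n_rows):
        row = [
            rng.uniform(50, 100),  # relative humidity (%), assumed range
            rng.uniform(24, 34),   # temperature (deg C), assumed range
            rng.uniform(0, 10),    # wind speed (m/s), assumed range
            rng.uniform(0, 1),     # vegetation index (unitless), assumed range
        ]
        pm25 = 5.0 + 0.3 * row[0] - 1.5 * row[2] + rng.gauss(0, 1)
        rows.append(row)
        targets.append(pm25)
    return rows, targets


def toy_model(row):
    """Hypothetical stand-in for a fitted regressor; it ignores temperature and vegetation."""
    return 5.0 + 0.3 * row[0] - 1.5 * row[2]


def rmse(model, rows, targets):
    """Root-mean-square error of the model on the given rows."""
    squared = [(model(r) - t) ** 2 for r, t in zip(rows, targets)]
    return (sum(squared) / len(squared)) ** 0.5


def permutation_importance(model, rows, targets, n_repeats, rng):
    """Mean increase in RMSE when a single feature column is shuffled."""
    baseline = rmse(model, rows, targets)
    importances = {}
    for j, name in enumerate(FEATURES):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(rmse(model, shuffled, targets) - baseline)
        importances[name] = sum(increases) / n_repeats
    return importances


rng = random.Random(0)
rows, targets = make_synthetic_data(500, rng)
importances = permutation_importance(toy_model, rows, targets, n_repeats=5, rng=rng)
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Features the model relies on show a clear increase in error when shuffled, while features it ignores score zero. Comparing such rankings across RF, GB, and TP models is one way to probe whether their stated feature importances are consistent.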