Muhammad Haseeb, Zainab Tahir, Syed Amer Mahmood, Hania Arif, Khalid F. Almutairi, Walid Soufan, Aqil Tariq
Title: Comparative analysis of machine learning models for predicting PM2.5 concentrations using meteorological and chemical indicators
Journal: Journal of Atmospheric and Solar-Terrestrial Physics, volume 263, Article 106338
DOI: 10.1016/j.jastp.2024.106338
Publication date: 2024-08-27
Publication type: Journal Article
Impact factor: 1.8 (JCR Q3, Geochemistry & Geophysics)
URL: https://www.sciencedirect.com/science/article/pii/S1364682624001664
Citations: 0
Abstract
Air pollution significantly harms human health and causes numerous premature deaths, particularly as PM2.5 concentrations rise. Comparing machine learning (ML) models for predicting PM2.5 concentration is therefore important. This research compares six ML models: Linear Regression (LR), Regression Tree (RT), Support Vector Machine (SVM), Ensemble Regression (ERT), Gaussian Process Regression (GPR), and Artificial Neural Networks (ANN). The models are trained with optimized hyperparameters on roughly six years of data (July 2015 to December 2021) and use eight meteorological and chemical indicators as PM2.5 predictors: temperature, relative humidity, air pressure, O₃, SO₂, NO₂, dew point, and wind speed. Model performance is assessed using Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Correlation Coefficient (R), and Coefficient of Determination (R²). The models achieve the following (R², RMSE) values: LR (0.72, 13.52), RT (0.8, 12.156), SVM (0.82, 10.28), ERT (0.81, 11.87), GPR (0.94, 7.65), and ANN (0.99, 2.36). ANN performs best, with the R² value closest to 1 and the lowest RMSE of the six models, followed by GPR. The results highlight the effectiveness of ANN, particularly the model with three hidden layers, in predicting PM2.5 concentration. ML models of this kind can support efforts to understand and mitigate the effects of PM2.5 on human health and the environment, with ANN emerging as a promising tool for further investigations.
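The abstract names four evaluation metrics but does not give their formulas. The sketch below computes them from paired observed and predicted values using the standard textbook definitions. It assumes R² is defined as 1 − SS_res/SS_tot; some tools instead report the squared Pearson correlation, and the abstract does not say which convention the authors used. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, Pearson R, and R^2 for paired observations and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n

    # Residual sum of squares, mean squared error, and its square root.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mse = ss_res / n
    rmse = math.sqrt(mse)

    # Sums of squared deviations around each mean.
    ss_true = sum((t - mean_true) ** 2 for t in y_true)
    ss_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if ss_true == 0 or ss_pred == 0:
        raise ValueError("R and R^2 are undefined for constant inputs")

    # Pearson correlation coefficient between observations and predictions.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    r = cov / math.sqrt(ss_true * ss_pred)

    # Coefficient of determination (assumed definition: 1 - SS_res / SS_tot).
    r2 = 1.0 - ss_res / ss_true

    return {"mse": mse, "rmse": rmse, "r": r, "r2": r2}


# Hypothetical PM2.5 values, for illustration only (not data from the paper).
observed = [35.0, 50.0, 80.0, 120.0]
predicted = [38.0, 47.0, 84.0, 118.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because RMSE is expressed in the same units as the target variable, it is the easier of the two error metrics to compare against typical PM2.5 levels, while R² summarizes the share of variance explained regardless of units.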
About the journal:
The Journal of Atmospheric and Solar-Terrestrial Physics (JASTP) is an international journal concerned with the inter-disciplinary science of the Earth's atmospheric and space environment, especially the highly varied and highly variable physical phenomena that occur in this natural laboratory and the processes that couple them.
The journal covers the physical processes operating in the troposphere, stratosphere, mesosphere, thermosphere, ionosphere, magnetosphere, the Sun, interplanetary medium, and heliosphere. Phenomena occurring in other "spheres", solar influences on climate, and supporting laboratory measurements are also considered. The journal deals especially with the coupling between the different regions.
Solar flares, coronal mass ejections, and other energetic events on the Sun create interesting and important perturbations in the near-Earth space environment. The physics of such "space weather" is central to the Journal of Atmospheric and Solar-Terrestrial Physics, and the journal welcomes papers that lead in the direction of a predictive understanding of the coupled system. Regarding the upper atmosphere, the subjects of aeronomy, geomagnetism and geoelectricity, auroral phenomena, radio wave propagation, and plasma instabilities are examples within the broad field of solar-terrestrial physics that emphasise the energy exchange between the solar wind, the magnetospheric and ionospheric plasmas, and the neutral gas. In the lower atmosphere, topics covered range from mesoscale to global scale dynamics, to atmospheric electricity, lightning and its effects, and to anthropogenic changes.