Short-term air pollution prediction using graph convolutional neural networks

IF 12.9 | Zone 1, Management | JCR Q1, Business | Technological Forecasting and Social Change | Pub Date: 2024-08-22 | DOI: 10.1016/j.techfore.2024.123684

Article URL: https://www.sciencedirect.com/science/article/pii/S0040162524004827

Citations: 0

Abstract

Pollution is a major present-day concern, causing illness and death, particularly in developing countries in Asia and Africa. As governments legislate to meet air-quality criteria, forecasting pollutant concentrations has become a major challenge. Short-term forecasting in particular provides information on concentrations of harmful pollutants over the coming hours and supports better decisions on controlling air pollution. In this paper, we investigate spatio-temporal graph-based models to determine the best methods for spatial and temporal analysis of the data. The models can also make multivariate predictions of correlated data, i.e., predict multiple pollutant concentrations simultaneously, which lowers the computational effort. A real-world pollution dataset measured over Delhi, India, is used to compare the proposed models with baselines, and shows that the Spatio-Temporal Graph Convolution Neural Network (STGCN) models perform better than the others. To understand which architectures and strategies are most effective for spatial and temporal data analysis, three models, STGCN-A, STGCN-B, and STGCN-C, have been developed. They are compared with six other baselines over forecasting horizons of 1 h, 24 h, and 48 h using mean absolute error (MAE), root mean square error (RMSE), and mean absolute percentage error (MAPE). On the PM2.5 dataset of Delhi, STGCN-B achieves 10.53 MAE, 6.92 RMSE, and 25.25 MAPE for a 1 h forecast, while STGCN-C achieves 20.18 MAE, 14.73 RMSE, and 55.45 MAPE for a 24 h forecast. In general, both structures achieve similar results, with STGCN-C being better in many cases. They are further analysed through observation-prediction graphs and Taylor diagrams, which give insight into our findings.

The models are additionally validated on a benchmark real-world dataset from California, USA, to better understand the spatio-temporal relations and model performance on a more stable dataset. There, STGCN-C performs best for PM2.5, with 4.30 RMSE, 1.98 MAE, and 25.96 MAPE for 1 h predictions on univariate data, and 3.63 RMSE, 1.88 MAE, and 25.91 MAPE in multivariate forecasting. The developed spatio-temporal graph-based models hold promising applications in urban air quality management, aiding policymakers in implementing targeted interventions to mitigate pollution-related health risks. They can also support public health agencies by providing timely and accurate forecasts of pollutant concentrations, enabling proactive measures to safeguard community well-being. Our study showcases the efficacy of spatio-temporal graph-based models in accurately forecasting air pollutant concentrations, with particular emphasis on short-term predictions. By leveraging multivariate capabilities, our proposed models demonstrate superior performance compared to baseline approaches across various forecasting horizons.
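
For readers unfamiliar with the three error metrics named in the abstract, the following is a minimal sketch of how MAE, RMSE, and MAPE are conventionally computed. It is not the authors' evaluation code: the function names and the sample readings are hypothetical, chosen only for illustration, and MAPE is assumed to be reported in percent with no zero-valued observations.

```python
import math


def mae(observed, predicted):
    """Mean absolute error: average magnitude of the errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: square root of the average squared error."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def mape(observed, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no observation is zero; a real pipeline would need to
    mask or otherwise handle zero readings.
    """
    ratios = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * ratios / len(observed)


# Hypothetical hourly PM2.5 readings (ug/m3), invented for illustration only.
observed = [100.0, 200.0, 50.0, 80.0]
predicted = [110.0, 180.0, 55.0, 80.0]

print("MAE :", round(mae(observed, predicted), 2))
print("RMSE:", round(rmse(observed, predicted), 2))
print("MAPE:", round(mape(observed, predicted), 2))
```

Because squaring weights large errors more heavily, RMSE is never smaller than MAE when both are computed on the same set of predictions; MAPE, being relative, tends to grow when observed concentrations are low.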

Source journal

Technological Forecasting and Social Change Multiple-

CiteScore

21.30

Self-citation rate

10.80%

Articles published

813

About the journal: Technological Forecasting and Social Change is a prominent platform for individuals engaged in the methodology and application of technological forecasting and future studies as planning tools, exploring the interconnectedness of social, environmental, and technological factors. In addition to serving as a key forum for these discussions, we offer numerous benefits for authors, including complimentary PDFs, a generous copyright policy, exclusive discounts on Elsevier publications, and more.