Multiple PM Low-Cost Sensors, Multiple Seasons’ Data, and Multiple Calibration Models

Impact factor: 2.5 | Zone 4 (Environmental Science and Ecology) | JCR Q3 (Environmental Sciences) | Aerosol and Air Quality Research | Pub Date: 2023-01-01 | DOI: 10.4209/aaqr.220428

Srishti Singh, Pratyush Agrawal, P. Kulkarni, H. Gautam, Meenakshi Kushwaha, V. Sreekanth


Citations: 0

Abstract

In this study, we combined state-of-the-art data modelling techniques (machine learning [ML] methods) and data from state-of-the-art low-cost particulate matter (PM) sensors (LCSs) to improve the accuracy of LCS-measured PM2.5 (PM with aerodynamic diameter less than 2.5 microns) mass concentrations. We collocated nine LCSs and a reference PM2.5 instrument for 9 months, covering all local seasons, in Bengaluru, India. Using the collocation data, we evaluated the performance of the LCSs and trained around 170 ML models to reduce the observed bias in the LCS-measured PM2.5. The ML models included (i) Decision Tree, (ii) Random Forest (RF), (iii) eXtreme Gradient Boosting, and (iv) Support Vector Regression (SVR). A hold-out validation was performed to assess the model performance. Model performance metrics included (i) coefficient of determination (R²), (ii) root mean square error (RMSE), (iii) normalised RMSE, and (iv) mean absolute error. We found that the bias in the LCS PM2.5 measurements varied across different LCS types (RMSE = 8–29 µg m⁻³) and that SVR models performed best in correcting the LCS PM2.5 measurements. Hyperparameter tuning improved the performance of the ML models (except for RF). The performance of ML models trained with significant predictors (a subset of all predictors, chosen using a recursive feature elimination algorithm) was comparable to that of the models trained with all predictors (except for RF). The performance of most ML models was better than that of the linear models. Finally, as a research objective, we introduced the collocated black carbon mass concentration measurements into the ML models but found no significant improvement in the model performance.
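The evaluation workflow named in the abstract (hold-out validation scored with R², RMSE, normalised RMSE, and mean absolute error) can be illustrated with a minimal sketch. This is not the authors' pipeline: it uses synthetic data in place of the collocation measurements, a simple linear calibration in place of the ML models, an assumed 80/20 split, and an assumed normalisation of RMSE by the reference mean (the abstract does not state which normalisation was used). The function names `metrics` and `fit_linear` are hypothetical.

```python
import math
import random


def metrics(reference, predicted):
    """Return R^2, RMSE, normalised RMSE, and MAE for paired values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    errors = [p - r for r, p in zip(reference, predicted)]
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "nrmse": rmse / mean_ref,  # assumption: normalised by the reference mean
        "mae": sum(abs(e) for e in errors) / n,
    }


def fit_linear(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic stand-in for collocation data (NOT the study's measurements):
# the "sensor" over-reads the "reference" with a gain, an offset, and noise.
rng = random.Random(0)
reference = [rng.uniform(10.0, 80.0) for _ in range(500)]
sensor = [1.6 * r + 5.0 + rng.gauss(0.0, 4.0) for r in reference]

# Hold-out validation: shuffle once, train on 80 %, test on the unseen 20 %.
indices = list(range(len(reference)))
rng.shuffle(indices)
split = int(0.8 * len(indices))
train, test = indices[:split], indices[split:]

# Calibration model: predict the reference value from the sensor reading.
slope, intercept = fit_linear(
    [sensor[i] for i in train], [reference[i] for i in train]
)

test_ref = [reference[i] for i in test]
raw = metrics(test_ref, [sensor[i] for i in test])
calibrated = metrics(test_ref, [slope * sensor[i] + intercept for i in test])

print("raw sensor :", raw)
print("calibrated :", calibrated)
```

Scoring on the held-out 20 % rather than on the training data is what makes the comparison between raw and calibrated readings meaningful; the same `metrics` function could score any of the ML models listed in the abstract in place of the linear fit.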


Source journal: Aerosol and Air Quality Research (Environmental Sciences)

CiteScore: 8.30

Self-citation rate: 10.00%

Articles published: 163

Review time: 3 months

About the journal: The international journal of Aerosol and Air Quality Research (AAQR) covers all aspects of aerosol science and technology, atmospheric science and air quality related issues. It encompasses a multi-disciplinary field, including:
- Aerosol, air quality, atmospheric chemistry and global change;
- Air toxics (hazardous air pollutants (HAPs), persistent organic pollutants (POPs)) - Sources, control, transport and fate, human exposure;
- Nanoparticle and nanotechnology;
- Sources, combustion, thermal decomposition, emission, properties, behavior, formation, transport, deposition, measurement and analysis;
- Effects on the environments;
- Air quality and human health;
- Bioaerosols;
- Indoor air quality;
- Energy and air pollution;
- Pollution control technologies;
- Invention and improvement of sampling instruments and technologies;
- Optical/radiative properties and remote sensing;
- Carbon dioxide emission, capture, storage and utilization; novel methods for the reduction of carbon dioxide emission;
- Other topics related to aerosol and air quality.