Applying machine learning for large scale field calibration of low-cost PM2.5 and PM10 air pollution sensors

Applied AI Letters. Pub Date: 2022-07-31. DOI: 10.1002/ail2.76

Priscilla Adong, Engineer Bainomugisha, Deo Okure, Richard Sserunjogi

Citations: 10

Abstract

Low-cost air quality monitoring networks can potentially increase the availability of high-resolution monitoring to inform analytic and evidence-informed approaches to better manage air quality. This is particularly relevant in low- and middle-income settings where access to traditional reference-grade monitoring networks remains a challenge. However, low-cost air quality sensors are impacted by ambient conditions, which could lead to over- or underestimation of pollution concentrations, and thus require field calibration to improve their accuracy and reliability. In this paper, we demonstrate the feasibility of using machine learning methods for large-scale calibration of AirQo sensors, low-cost PM sensors custom-designed for and deployed in Sub-Saharan urban settings. The performance of various machine learning methods is assessed by comparing model-corrected PM using k-nearest neighbours, support vector regression, multivariate linear regression, ridge regression, lasso regression, elastic net regression, XGBoost, multilayer perceptron, random forest and gradient boosting with collocated reference PM concentrations from a Beta Attenuation Monitor (BAM). Random forest and lasso regression models were superior for PM2.5 and PM10 calibration, respectively. Employing the random forest model decreased RMSE of raw data from 18.6 μg/m³ to 7.2 μg/m³ with an average BAM PM2.5 concentration of 37.8 μg/m³, while the lasso regression model decreased RMSE from 13.4 μg/m³ to 7.9 μg/m³ with an average BAM PM10 concentration of 51.1 μg/m³. We validate our models through cross-unit and cross-site validation, allowing analysis of AirQo devices' consistency. The resulting calibration models were deployed to the entire large-scale air quality monitoring network consisting of over 120 AirQo devices, which demonstrates the use of machine learning systems to address practical challenges in a developing world setting.
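
The idea of calibrating against a collocated reference and then checking the result on a device the model has not seen (cross-unit validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: it swaps in a single-feature least-squares fit in place of the random forest and lasso models named in the abstract, and it runs on synthetic data. The unit names, the `make_synthetic_unit` helper, and the gain, offset, and noise values are all hypothetical and chosen only for illustration.

```python
import math
import random


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(reference))


def fit_linear(x, y):
    """Closed-form ordinary least squares for y ~= slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_synthetic_unit(rng, n=200, gain=1.4, offset=6.0, noise=3.0):
    """Hypothetical collocation data: a low-cost unit that over-reads the reference."""
    reference = [rng.uniform(10.0, 80.0) for _ in range(n)]
    raw = [gain * r + offset + rng.gauss(0.0, noise) for r in reference]
    return raw, reference


rng = random.Random(0)  # fixed seed so the synthetic data is reproducible
units = {
    "unit_a": make_synthetic_unit(rng, gain=1.40),
    "unit_b": make_synthetic_unit(rng, gain=1.45),
    "unit_c": make_synthetic_unit(rng, gain=1.35),
}

# Cross-unit validation: hold out one device, fit on the others, score on the held-out one.
results = {}
for held_out in units:
    train_raw, train_ref = [], []
    for name, (raw, ref) in units.items():
        if name != held_out:
            train_raw += raw
            train_ref += ref
    slope, intercept = fit_linear(train_raw, train_ref)
    test_raw, test_ref = units[held_out]
    calibrated = [slope * value + intercept for value in test_raw]
    results[held_out] = (rmse(test_raw, test_ref), rmse(calibrated, test_ref))

for name, (raw_err, cal_err) in results.items():
    print(f"{name}: raw RMSE {raw_err:.1f}, calibrated RMSE {cal_err:.1f} (synthetic data)")
```

Holding out a whole device, rather than random rows, is what makes this a test of consistency across units: the held-out score only improves if a correction learned on other devices transfers. The numbers printed here come from synthetic data and say nothing about the RMSE values reported in the abstract.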
