Priscilla Adong, Engineer Bainomugisha, Deo Okure, Richard Sserunjogi
Applied AI Letters. Published 2022-07-31. DOI: 10.1002/ail2.76. https://onlinelibrary.wiley.com/doi/10.1002/ail2.76
Applying machine learning for large scale field calibration of low-cost PM2.5 and PM10 air pollution sensors
Low-cost air quality monitoring networks can potentially increase the availability of high-resolution monitoring and thereby inform analytic, evidence-based approaches to managing air quality. This is particularly relevant in low- and middle-income settings, where access to traditional reference-grade monitoring networks remains a challenge. However, low-cost air quality sensors are affected by ambient conditions, which can lead to over- or underestimation of pollutant concentrations, and so they require field calibration to improve their accuracy and reliability. In this paper, we demonstrate the feasibility of using machine learning methods for large-scale calibration of AirQo sensors, low-cost PM sensors custom-designed for and deployed in Sub-Saharan urban settings. We assess ten machine learning methods (k-nearest neighbours, support vector regression, multivariate linear regression, ridge regression, lasso regression, elastic net regression, XGBoost, multilayer perceptron, random forest and gradient boosting) by comparing model-corrected PM with collocated reference PM concentrations from a Beta Attenuation Monitor (BAM). Random forest and lasso regression performed best for PM2.5 and PM10 calibration, respectively. The random forest model reduced the RMSE of the raw PM2.5 data from 18.6 μg/m³ to 7.2 μg/m³, against an average BAM PM2.5 concentration of 37.8 μg/m³, while the lasso regression model reduced the PM10 RMSE from 13.4 μg/m³ to 7.9 μg/m³, against an average BAM PM10 concentration of 51.1 μg/m³. We validate our models through cross-unit and cross-site validation, which also allows us to analyse the consistency of AirQo devices. The resulting calibration models were deployed across the entire large-scale air quality monitoring network of over 120 AirQo devices, demonstrating the use of machine learning systems to address practical challenges in a developing-world setting.
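To put the reported error figures in proportion, the short sketch below defines RMSE and derives two summary quantities from the numbers in the abstract: the relative RMSE reduction after calibration, and the calibrated RMSE as a percentage of the mean BAM concentration. The function names and the `REPORTED` table are illustrative assumptions, and the derived percentages are simple arithmetic on the abstract's figures, not values reported by the authors.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Percentage by which an error metric fell from `before` to `after`."""
    return 100.0 * (before - after) / before


def normalised_rmse(rmse_value, mean_reference):
    """RMSE expressed as a percentage of the mean reference concentration."""
    return 100.0 * rmse_value / mean_reference


# Figures taken from the abstract (all in ug/m3); the dict layout is hypothetical.
REPORTED = {
    "PM2.5 (random forest)": {"raw": 18.6, "calibrated": 7.2, "mean_bam": 37.8},
    "PM10 (lasso)": {"raw": 13.4, "calibrated": 7.9, "mean_bam": 51.1},
}


def summarise(reported=REPORTED):
    """Return {label: (RMSE reduction %, calibrated RMSE as % of mean BAM)}."""
    summary = {}
    for label, row in reported.items():
        reduction = relative_reduction(row["raw"], row["calibrated"])
        nrmse = normalised_rmse(row["calibrated"], row["mean_bam"])
        summary[label] = (round(reduction, 1), round(nrmse, 1))
    return summary


if __name__ == "__main__":
    for label, (reduction, nrmse) in summarise().items():
        print(f"{label}: RMSE reduced by {reduction}%, "
              f"calibrated RMSE is {nrmse}% of mean BAM concentration")
```

On these figures, calibration cuts RMSE by roughly 61% for PM2.5 and 41% for PM10, leaving a residual error of about 19% and 15% of the respective mean reference concentrations.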