Predicting Personal Exposure to PM2.5 Using Different Determinants and Machine Learning Algorithms in Two Megacities, China

Indoor Air; IF 4.3; Zone 2 (Environmental Science and Ecology); Q1 (Construction & Building Technology); Pub Date: 2024-03-08; DOI: 10.1155/2024/5589891
Na Li, Yunpu Li, Dongqun Xu, Zhe Liu, Ning Li, Ryan Chartier, Junrui Chang, Qin Wang, Chunyu Xu

Abstract

The primary aim of this study is to explore the utility of machine learning algorithms for predicting the personal PM2.5 exposures of elderly participants and to evaluate the effect of individual variables on model performance. Personal PM2.5 was measured on five consecutive days across seasons in 66 retired adults in Beijing (BJ) and Nanjing (NJ), China. Potential predictors were extracted from routine monitoring data (ambient PM2.5 concentrations and meteorological factors), basic questionnaires (personal and household characteristics), and time-activity diaries (TAD). Prediction models were developed using either traditional multiple linear regression (MLR) or one of five advanced machine learning methods. Personal PM2.5 exposures were well predicted by both MLR and machine learning models using predictors extracted from routine monitoring data alone, as indicated by high nested cross-validation (CV) R² values ranging from 0.76 to 0.88. The addition of predictors from either the questionnaire or TAD did not improve predictive accuracy for all algorithms. Ambient PM2.5 concentration was the most important predictor. Overall, the random forest (RF), support vector machine, and extreme gradient boosting algorithms outperformed the reference MLR method. Compared with the traditional MLR approach in BJ, the CV R² of the RF model increased by up to 7% (from 0.82 ± 0.13 to 0.88 ± 0.10), while the RMSE decreased by up to 18% (from 19.8 ± 5.4 to 16.3 ± 4.5).
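
To make the evaluation design concrete, the following is a minimal sketch of nested cross-validation comparing MLR with a random forest, written with scikit-learn. It is not the authors' code: the data are synthetic stand-ins generated on the spot, the predictor names (`ambient`, `temperature`, `humidity`), the fold counts, the plain `KFold` splitting, and the small hyperparameter grid are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, KFold, cross_validate

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's measurements): three
# hypothetical "routine monitoring" predictors and a noisy target.
n = 300
ambient = rng.gamma(shape=2.0, scale=30.0, size=n)
temperature = rng.normal(15.0, 10.0, size=n)
humidity = rng.uniform(20.0, 90.0, size=n)
X = np.column_stack([ambient, temperature, humidity])
y = 0.6 * ambient + 0.1 * humidity + rng.normal(0.0, 8.0, size=n)

# Inner folds tune hyperparameters; outer folds estimate performance
# on data that played no part in the tuning.
inner_cv = KFold(n_splits=3, shuffle=True, random_state=1)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=2)

models = {
    # MLR has no hyperparameters to tune, so only the outer loop applies.
    "MLR": LinearRegression(),
    # Wrapping the forest in GridSearchCV makes the tuning happen
    # inside each outer training fold (the "nested" part).
    "RF": GridSearchCV(
        RandomForestRegressor(n_estimators=50, random_state=0),
        param_grid={"max_depth": [3, None]},  # assumed toy grid
        cv=inner_cv,
        scoring="r2",
    ),
}

results = {}
for name, model in models.items():
    scores = cross_validate(
        model, X, y, cv=outer_cv,
        scoring=("r2", "neg_root_mean_squared_error"),
    )
    rmse = -scores["test_neg_root_mean_squared_error"]  # flip sign back
    results[name] = {
        "r2_mean": float(scores["test_r2"].mean()),
        "r2_sd": float(scores["test_r2"].std()),
        "rmse_mean": float(rmse.mean()),
        "rmse_sd": float(rmse.std()),
    }

for name, r in results.items():
    print(f"{name}: R2 = {r['r2_mean']:.2f} +/- {r['r2_sd']:.2f}, "
          f"RMSE = {r['rmse_mean']:.1f} +/- {r['rmse_sd']:.1f}")
```

Reporting the mean and standard deviation over the outer folds mirrors the "value ± spread" form used in the abstract. With repeated measurements per participant, a grouped splitter such as `GroupKFold` may be the more appropriate choice; whether the study did this is not stated in the abstract.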

Source journal: Indoor Air (Environmental Science; Engineering, Environmental)
CiteScore: 10.80
Self-citation rate: 10.30%
Articles published: 175
Review time: 3 months
Journal description: The quality of the environment within buildings is a topic of major importance for public health. Indoor Air provides a location for reporting original research results in the broad area defined by the indoor environment of non-industrial buildings. An international journal with multidisciplinary content, Indoor Air publishes papers reflecting the broad categories of interest in this field: health effects; thermal comfort; monitoring and modelling; source characterization; ventilation and other environmental control techniques. The research results present the basic information to allow designers, building owners, and operators to provide a healthy and comfortable environment for building occupants, as well as giving medical practitioners information on how to deal with illnesses related to the indoor environment.