Na Li, Yunpu Li, Dongqun Xu, Zhe Liu, Ning Li, Ryan Chartier, Junrui Chang, Qin Wang, Chunyu Xu
Indoor Air, volume 2024, issue 1. Published 2024-03-08. DOI: 10.1155/2024/5589891. https://onlinelibrary.wiley.com/doi/10.1155/2024/5589891
Predicting Personal Exposure to PM2.5 Using Different Determinants and Machine Learning Algorithms in Two Megacities, China
The primary aim of this study is to explore the utility of machine learning algorithms for predicting the personal PM2.5 exposures of elderly participants and to evaluate the effect of individual variables on model performance. Personal PM2.5 was measured on five consecutive days across seasons in 66 retired adults in Beijing (BJ) and Nanjing (NJ), China. The potential predictors were extracted from routine monitoring data (ambient PM2.5 concentrations and meteorological factors), basic questionnaires (personal and household characteristics), and a time-activity diary (TAD). Prediction models were developed using either traditional multiple linear regression (MLR) or one of five advanced machine learning methods. Our results revealed that personal PM2.5 exposures were well predicted by both MLR and machine learning models using predictors extracted from routine monitoring data, as indicated by the high nested cross-validation (CV) R² values, which ranged from 0.76 to 0.88. The addition of predictors from either the questionnaire or TAD did not improve predictive accuracy for all algorithms. The ambient PM2.5 concentration was the most important predictor. Overall, the random forest (RF), support vector machine, and extreme gradient boosting algorithms outperformed the reference MLR method. Compared with the traditional MLR approach, the CV R² of the RF model increased by up to 7% (from 0.82 ± 0.13 to 0.88 ± 0.10), while the RMSE decreased by up to 18% (from 19.8 ± 5.4 to 16.3 ± 4.5) in BJ.
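The abstract reports nested cross-validation R² values but does not describe the procedure. The following is a minimal sketch of the general idea, not the authors' implementation: an inner loop selects a hyperparameter using only the outer-training data, and an outer loop scores the refitted model on a held-out fold. Everything specific here is an assumption made for illustration: the data are synthetic, the single predictor stands in for ambient PM2.5, and a one-variable ridge regression replaces the study's algorithms so that the example needs only the Python standard library.

```python
import random
from statistics import mean


def fit_ridge(xs, ys, lam):
    """Fit y = a + b*x with an L2 penalty `lam` on the slope only."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / (sxx + lam)
    return my - slope * mx, slope


def predict(model, xs):
    intercept, slope = model
    return [intercept + slope * x for x in xs]


def mse(ys, preds):
    return mean((y - p) ** 2 for y, p in zip(ys, preds))


def r2(ys, preds):
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def k_folds(indices, k):
    """Split a list of indices into k (train, test) pairs."""
    folds = [indices[i::k] for i in range(k)]
    return [
        ([i for j, f in enumerate(folds) if j != n for i in f], folds[n])
        for n in range(k)
    ]


def nested_cv(xs, ys, lambdas, outer_k=5, inner_k=3, seed=0):
    """Return one held-out R2 score per outer fold."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for train, test in k_folds(idx, outer_k):
        # Inner loop: choose lambda using the outer-training data only.
        def inner_error(lam):
            errs = []
            for tr, va in k_folds(train, inner_k):
                m = fit_ridge([xs[i] for i in tr], [ys[i] for i in tr], lam)
                errs.append(mse([ys[i] for i in va],
                                predict(m, [xs[i] for i in va])))
            return mean(errs)

        best_lam = min(lambdas, key=inner_error)
        # Outer loop: refit on all outer-training data, score on the held-out fold.
        m = fit_ridge([xs[i] for i in train], [ys[i] for i in train], best_lam)
        outer_scores.append(r2([ys[i] for i in test],
                               predict(m, [xs[i] for i in test])))
    return outer_scores


# Synthetic data (hypothetical): personal exposure as a noisy linear
# function of an ambient concentration. Not the study's measurements.
rng = random.Random(42)
ambient = [rng.uniform(10, 150) for _ in range(200)]
personal = [5 + 0.6 * a + rng.gauss(0, 8) for a in ambient]

scores = nested_cv(ambient, personal, lambdas=[0.0, 1.0, 10.0, 100.0])
print(f"nested CV R2: mean={mean(scores):.2f}, folds={len(scores)}")
```

Because the hyperparameter is never tuned on the fold used for scoring, the outer-fold R² values are not inflated by the tuning step, which is the usual reason for preferring nested over single-loop cross-validation when comparing algorithms.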
About the journal:
The quality of the environment within buildings is a topic of major importance for public health.
Indoor Air provides a location for reporting original research results in the broad area defined by the indoor environment of non-industrial buildings. An international journal with multidisciplinary content, Indoor Air publishes papers reflecting the broad categories of interest in this field: health effects; thermal comfort; monitoring and modelling; source characterization; ventilation and other environmental control techniques.
The research results present the basic information to allow designers, building owners, and operators to provide a healthy and comfortable environment for building occupants, as well as giving medical practitioners information on how to deal with illnesses related to the indoor environment.