[Construction and Analysis of Machine Learning Based Transportation Carbon Emission Prediction Model].

Huanjing Kexue/Environmental Science (JCR Q2, Environmental Science) | Pub Date: 2024-06-08 | DOI: 10.13227/j.hjkx.202305234

Hui-Tian Liu, Da-Wei Hu


Citations: 0

Abstract


[Construction and Analysis of Machine Learning Based Transportation Carbon Emission Prediction Model].

To address carbon emissions in the transportation sector, this study built predictive models with multiple machine learning algorithms, using panel data from 30 provinces in China for 2005 to 2019. The aim was to identify the best-performing algorithm and the key factors influencing transportation carbon emissions, giving policymakers and decision-makers a reference for reducing emissions and promoting the sustainable development of the sector. First, drawing on the concept of the fixed effects model, we included heterogeneity among provinces as an explanatory factor. We then combined Pearson's correlation coefficient and Spearman's rank correlation coefficient to screen 18 factors influencing transportation carbon emissions. We made a preliminary selection of seven common machine learning algorithms and trained them with the screened factors as explanatory variables; the three best-performing algorithms were then further optimized and retrained. Next, we applied K-fold cross-validation, plotted learning curves to test the performance of each predictive model, and used MSE, MAE, R², and MAPE as evaluation indicators to determine the best model. SHAP values were used to calculate the importance of each explanatory variable in the best model. The results indicated that multicollinearity among seven factors (provincial differences, total consumption of social goods, urban green space area, freight turnover, number of private cars, transportation industry output, and permanent population) was weak and that all seven passed the significance test, so they could serve as explanatory variables in the prediction model of transportation carbon emissions. Random Forest and XGBoost both predicted well, with R² values above 0.97 and errors below 10%, and showed no signs of overfitting or underfitting. XGBoost performed best, whereas KNN performed poorly. The importance ranking of the explanatory variables was: provincial differences > total consumption of social goods > number of private cars > permanent population > freight turnover > urban green space area > transportation industry output. A combined analysis of correlation and importance showed that provincial differences were an indispensable variable in predicting transportation carbon emissions. In conclusion, this study provides a new approach to the governance of carbon emissions in the transportation industry, and the results can serve as a reference for policymakers and decision-makers. Future policy design and decision-making should not overlook the distinctive factors of each province, and region-specific measures need to be formulated to promote the sustainable development of the transportation industry.
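The abstract names four evaluation indicators (MSE, MAE, R², MAPE) and K-fold cross-validation but gives no formulas or settings. The sketch below illustrates them using the standard textbook definitions, which are an assumption here rather than the authors' exact procedure. The function names, the choice to report MAPE in percent, the contiguous unshuffled folds, and the toy numbers are all hypothetical and are not taken from the study.

```python
from math import fsum


def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return fsum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute residuals."""
    return fsum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no true value is zero (plausible for emission totals).
    """
    ratios = (abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * fsum(ratios) / len(y_true)


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    No shuffling is done; whether the study shuffled or grouped
    samples by province or year is not stated in the abstract.
    """
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the paper.
    y_true = [100.0, 200.0, 300.0, 400.0]
    y_pred = [110.0, 190.0, 310.0, 390.0]
    print("MSE :", mse(y_true, y_pred))
    print("MAE :", mae(y_true, y_pred))
    print("R2  :", r2(y_true, y_pred))
    print("MAPE:", mape(y_true, y_pred), "%")
    for train_idx, test_idx in k_fold_indices(len(y_true), 2):
        print("train", train_idx, "test", test_idx)
```

In a cross-validated comparison, each candidate model would be fit on the training indices of every fold and scored on the held-out indices, with the four indicators averaged across folds. Read this way, the abstract's "errors below 10%" would most naturally correspond to MAPE, though the abstract does not say so explicitly.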


Source journal: Huanjing Kexue/Environmental Science (Environmental Science: Environmental Science (all))

CiteScore: 4.40

Self-citation rate: 0.00%

Articles published: 15329