The Application of Machine Learning Models in the Prediction of PM2.5/PM10 Concentration

Xinzhi Lin
Proceedings of the 2021 4th International Conference on Computers in Management and Business
Published: 2021-01-30
DOI: 10.1145/3450588.3450605 (https://doi.org/10.1145/3450588.3450605)
Citations: 3

Abstract

While the world economy and science are developing rapidly, Beijing is experiencing chronic air pollution. Air quality matters for people's travel, the development of enterprises, and the normal operation of traffic. PM2.5 and PM10 are the main components of this pollution, so predicting their concentrations in the air is valuable [1]. Although some traditional models, such as basic linear regression, have been proposed to predict PM2.5/PM10 levels, they use few predictor variables and run with low efficiency and low accuracy. In the big data era, models need to handle large and varied data sets. With ample data from different meteorological stations in Beijing, we can use a richer set of variables, such as the mass of SO2 and NO2, wind direction, and other weather observations, to predict PM2.5/PM10 levels. We build machine learning models with higher efficiency, higher accuracy, and stronger learning ability; the primary algorithms are multiple linear regression, decision trees, boosting and random forests (both based on decision trees), and neural networks. The results show that the models based on neural networks and ensemble learning predict best. Among these models, boosting performs best, achieving an R-squared of 84.2% for PM2.5 and 75.7% for PM10 on the test set.
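The abstract does not give the data or the implementation, so the following is only a minimal sketch of the general idea: fit a boosted ensemble of decision trees to pollutant and weather features, then report R-squared on a held-out test set. Everything specific here is an assumption made for illustration: the data are synthetic, the feature names (so2, no2, wind_speed), the coefficients, and the hyperparameters are hypothetical, and the trees are depth-1 stumps trained with squared-error gradient boosting, which may differ from the boosting variant used in the paper.

```python
import random

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_stump(X, residuals, n_bins=16):
    """Pick the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Coarse grid of candidate thresholds keeps the search cheap.
        grid = values[::max(1, len(values) // n_bins)]
        for lo, hi in zip(grid, grid[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, left_mean, right_mean)
    return best[1:]

def stump_predict(stump, row):
    j, thr, left_mean, right_mean = stump
    return left_mean if row[j] <= thr else right_mean

def fit_boosting(X, y, n_rounds=100, learning_rate=0.3):
    """Gradient boosting for squared error: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, X):
    base, learning_rate, stumps = model
    return [base + learning_rate * sum(stump_predict(s, row) for s in stumps)
            for row in X]

def make_synthetic(n, rng):
    """Synthetic stand-in data; NOT the Beijing station data from the paper."""
    X, y = [], []
    for _ in range(n):
        so2 = rng.uniform(0, 50)
        no2 = rng.uniform(0, 100)
        wind_speed = rng.uniform(0, 10)
        X.append([so2, no2, wind_speed])
        # Hypothetical relationship chosen only so there is a signal to learn.
        y.append(20 + 0.8 * so2 + 0.6 * no2 - 3.0 * wind_speed + rng.gauss(0, 5))
    return X, y

rng = random.Random(0)
X_train, y_train = make_synthetic(200, rng)
X_test, y_test = make_synthetic(100, rng)

model = fit_boosting(X_train, y_train)
test_r2 = r_squared(y_test, predict(model, X_test))
print(f"test R^2 = {test_r2:.3f}")
```

Scoring on samples the model never saw during fitting mirrors the abstract's practice of reporting R-squared on the test set. The value printed here reflects only the synthetic data and says nothing about the paper's reported 84.2% and 75.7%.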