A Novel Random Forest and its Application on Classification of Air Quality

Hualing Yi, Qingyu Xiong, Qinghong Zou, Rui Xu, Kai Wang, Min Gao
Published in: 2019 8th International Congress on Advanced Applied Informatics (IIAI-AAI)
Publication date: 2019-07-01
DOI: 10.1109/IIAI-AAI.2019.00018 (https://doi.org/10.1109/IIAI-AAI.2019.00018)
Citations: 17

Abstract

Air pollution has a serious impact on daily life. The public needs timely information about air quality so that protective measures can be taken in advance. Machine learning methods such as random forest are well suited to classifying air quality grades. We find that the distribution of air quality data is imbalanced, which degrades the performance of random forest classifiers. To address this problem, we propose a random forest method based on a sample-grouped bootstrap. We then design three sets of experiments to evaluate the proposed method. The results indicate that the proposed method improves on the standard random forest when both are applied to balanced datasets. The improvement is especially pronounced on imbalanced datasets, where the new method is much better at classifying minority samples.
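
The abstract does not spell out how the grouped bootstrap is built, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that samples are grouped by class label and that each group is resampled with replacement separately, so every bag (the training set for one tree) is guaranteed to contain minority-class samples. The function name `grouped_bootstrap`, the `per_group` parameter, and the example class labels are all hypothetical.

```python
import random
from collections import defaultdict


def grouped_bootstrap(labels, per_group=None, rng=None):
    """Return bootstrap indices drawn separately within each class group.

    Assumption (not taken from the paper): groups are defined by class label.
    If per_group is None, each group keeps its original size; otherwise every
    group contributes exactly per_group draws, which balances the bag.
    """
    rng = rng or random.Random()

    # Collect the indices that belong to each class label.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    # Sample with replacement inside each group, then concatenate.
    bag = []
    for members in groups.values():
        k = len(members) if per_group is None else per_group
        bag.extend(rng.choices(members, k=k))
    return bag


# Hypothetical imbalanced label vector: 90 / 8 / 2 samples per class.
labels = ["good"] * 90 + ["moderate"] * 8 + ["unhealthy"] * 2

# One balanced bag: 10 draws from each class, regardless of class size.
bag = grouped_bootstrap(labels, per_group=10, rng=random.Random(0))
counts = {c: sum(labels[i] == c for i in bag) for c in set(labels)}
print(counts)
```

In a full forest, one such bag would be drawn per tree and used to fit that tree. A plain bootstrap over all 100 samples, by contrast, can easily produce a bag with no "unhealthy" sample at all, which is the kind of imbalance effect the abstract describes.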