Diversity in Ensemble Model for Classification of Data Streams with Concept Drift

Michal Kolárik, M. Sarnovský, Ján Paralič
Published in: 2021 IEEE 19th World Symposium on Applied Machine Intelligence and Informatics (SAMI)
Publication date: 2021-01-21
DOI: 10.1109/SAMI50585.2021.9378625 (https://doi.org/10.1109/SAMI50585.2021.9378625)
Citation count: 2

Abstract

A data stream is a continuous flow of data, in many forms, arriving from different sources. Data streams are usually non-stationary: their underlying structure changes continually. Predictive and classification tasks on such data must take this into account. Traditional machine learning models applied to drifting data may become invalid when a concept change occurs. To tackle this problem, we must use special adaptive learning models equipped with tools that can reflect the drift in the data. Adaptive ensembles are one of the most popular groups of such methods. This paper describes the design and implementation of a novel adaptive ensemble learning model built as a robust ensemble with a heterogeneous set of members. We used k-NN, Naive Bayes, and Hoeffding trees as base learners and implemented an update mechanism that considers dynamic class weighting and a Q-statistic diversity calculation to ensure the diversity of the ensemble. The model was evaluated experimentally on streaming datasets, and the effects of the diversity calculation were analyzed.
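
For readers unfamiliar with the Q statistic named in the abstract: for a pair of classifiers it is commonly defined from their joint correctness counts as Q = (N11*N00 - N01*N10) / (N11*N00 + N01*N10), where N11 counts samples both classify correctly, N00 samples both misclassify, and N10 and N01 samples only one of them gets right. Values near 1 mean the two members tend to be right and wrong on the same samples (low diversity); values near 0 or below indicate more diverse errors. The Python sketch below illustrates that standard definition only. It is not the authors' implementation: the function names, the choice to return 0.0 when the denominator is zero, and the averaging over all pairs are assumptions made for illustration, and the abstract does not say how the value enters the update mechanism.

```python
from itertools import combinations
from typing import Hashable, Sequence


def q_statistic(y_true: Sequence[Hashable],
                pred_a: Sequence[Hashable],
                pred_b: Sequence[Hashable]) -> float:
    """Yule's Q statistic for two classifiers evaluated on the same samples."""
    if not (len(y_true) == len(pred_a) == len(pred_b)):
        raise ValueError("all sequences must have the same length")

    n11 = n00 = n10 = n01 = 0
    for y, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == y), (b == y)
        if a_ok and b_ok:
            n11 += 1          # both correct
        elif not a_ok and not b_ok:
            n00 += 1          # both wrong
        elif a_ok:
            n10 += 1          # only classifier A correct
        else:
            n01 += 1          # only classifier B correct

    denom = n11 * n00 + n01 * n10
    if denom == 0:
        # Assumption: Q is undefined here; this sketch reports 0.0.
        return 0.0
    return (n11 * n00 - n01 * n10) / denom


def mean_pairwise_q(y_true: Sequence[Hashable],
                    predictions: Sequence[Sequence[Hashable]]) -> float:
    """Average Q over all pairs of ensemble members (hypothetical helper)."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        raise ValueError("need predictions from at least two classifiers")
    return sum(q_statistic(y_true, a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    y = [0, 1, 0, 1, 0, 1, 0, 1]
    knn = [0, 1, 0, 1, 0, 1, 1, 0]   # toy predictions, not real results
    nb = [0, 1, 0, 1, 1, 0, 0, 0]
    print(q_statistic(y, knn, nb))
```

In a streaming setting, one plausible use is to compute these counts over the most recent chunk or sliding window of labelled samples, so that the diversity estimate follows the current concept rather than the whole history.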