Unsupervised Drift Detector Ensembles for Data Stream Mining
Lukasz Korycki, B. Krawczyk
2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2019-10-01
DOI: 10.1109/DSAA.2019.00047 (https://doi.org/10.1109/DSAA.2019.00047)
Data stream mining is among the most contemporary branches of machine learning. Potentially infinite data sources offer many opportunities and, at the same time, pose new challenges. To handle streaming data properly, we need to adapt our well-established methods so that they can work with dynamic data under strict constraints. Supervised streaming machine learning algorithms require a certain number of labeled instances to stay up to date. Since large labeling budgets are usually infeasible, we have to limit supervision as much as we can. One possible approach is to trigger labeling only if a change is explicitly indicated by a detector. While several supervised algorithms are dedicated to this purpose, the more practical unsupervised ones still lack proper attention. In this paper, we propose a novel unsupervised ensemble drift detector that recognizes local changes in feature subspaces (EDFS) without additional supervision, using specialized committees of incremental Kolmogorov-Smirnov tests. We combine it with an adaptive classifier, which is updated only if the drift detector signals a change. Our experiments show that the framework adapts efficiently to various concept drifts and outperforms other unsupervised algorithms.
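The general idea of label-free drift detection over feature subspaces can be illustrated with a minimal sketch. This is not the authors' EDFS implementation: it swaps the incremental Kolmogorov-Smirnov test for a plain batch two-sample test over fixed windows, and the names `SubspaceDriftEnsemble`, `windows_needing_labels`, the random choice of subspaces, the "any member flags a feature" rule, and the default `alpha` are all assumptions made for illustration.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_rejects(a, b, alpha=0.001):
    """Reject 'same distribution' using the large-sample critical value."""
    n, m = len(a), len(b)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > critical


class SubspaceDriftEnsemble:
    """Hypothetical ensemble: each member watches a random feature subspace."""

    def __init__(self, n_features, n_members=5, subspace_size=2, alpha=0.001, seed=0):
        rng = random.Random(seed)
        self.subspaces = [rng.sample(range(n_features), subspace_size)
                          for _ in range(n_members)]
        self.alpha = alpha

    def drifted_features(self, reference, recent):
        """Features flagged by at least one member; no class labels are used."""
        flagged = set()
        for subspace in self.subspaces:
            for f in subspace:
                ref_col = [row[f] for row in reference]
                new_col = [row[f] for row in recent]
                if ks_rejects(ref_col, new_col, self.alpha):
                    flagged.add(f)
        return flagged

    def detect(self, reference, recent):
        """Signal drift if any watched feature changed (assumed combination rule)."""
        return bool(self.drifted_features(reference, recent))


def windows_needing_labels(reference, windows, ensemble):
    """Indices of windows for which labels would be requested."""
    requested = []
    for i, window in enumerate(windows):
        if ensemble.detect(reference, window):
            requested.append(i)  # only here would labels be queried and the classifier updated
            reference = window   # the drifted window becomes the new reference
    return requested
```

In this sketch a labeling request is issued only for windows where the ensemble signals a change, which mirrors the "trigger labeling only if a change is indicated" strategy described in the abstract. A streaming implementation would replace the batch test with an incremental one so that each new instance updates the statistic without re-sorting the windows.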