Concept Drift Detection Based on Pre-Clustering and Statistical Testing
Authors: Jones Sai-Wang Wan, Shenglin Wang
Journal: Journal of Internet Technology, volume 22 1, pages 465-472
Publication date: 2021-03-30
Publication type: Journal Article
DOI: 10.3966/160792642021032202020 (https://doi.org/10.3966/160792642021032202020)
Impact factor: 0.9; JCR: Q4 (Computer Science, Information Systems)
Citations: 1
Abstract
Stream data processing has become an important research topic over the last decade. Data streams are generated on the fly, and their data distribution may change over time. Such a change in the data distribution is called concept drift, and data stream processing requires mechanisms to adapt to it. Detecting concept drift can be challenging because the data labels are not known. In this paper, we propose a drift detection method based on a statistical test, with clustering and feature extraction as preprocessing. Principal component analysis (PCA) is used as the feature extraction method, with the goal of reducing detection time. Experimental results on synthetic and real-world streaming data show that the clustering preprocessing improves drift detection performance, and that feature extraction trades an insignificant loss in detection performance for a faster execution time.
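The abstract does not name the clustering algorithm, the statistical test, or the windowing scheme, so the following is only an illustrative sketch of the general idea (cluster, reduce dimensionality, then test), not the authors' method. It assumes k-means for the pre-clustering, a projection onto the first principal component as the PCA step, a two-sample Kolmogorov-Smirnov test per cluster, and a Bonferroni-style threshold `alpha / k`. All function names and parameters (`detect_drift`, `min_size`, and so on) are hypothetical.

```python
import math
import random


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))  # gap between the two empirical CDFs
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        return d, 1.0  # p is ~1 here and the series below converges slowly
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


def dist2(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda c: dist2(point, centroids[c]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def first_principal_axis(points, iters=100):
    """Mean and first principal axis, estimated by power iteration."""
    dim = len(points[0])
    mean = [sum(col) / len(points) for col in zip(*points)]
    centered = [[x - m for x, m in zip(p, mean)] for p in points]
    v = [1.0] * dim
    for _ in range(iters):
        # Multiply v by X^T X without forming the covariance matrix.
        scores = [sum(x * w for x, w in zip(row, v)) for row in centered]
        v = [sum(s * row[d] for s, row in zip(scores, centered))
             for d in range(dim)]
        norm = math.sqrt(sum(w * w for w in v)) or 1.0
        v = [w / norm for w in v]
    return mean, v


def detect_drift(reference, current, k=2, alpha=0.05, min_size=5):
    """Return True if any cluster's projected values differ significantly."""
    centroids = kmeans(reference, k)                  # pre-clustering (assumed)
    mean, axis = first_principal_axis(reference)      # 1-D PCA (assumed)

    def project(p):
        return sum((x - m) * w for x, m, w in zip(p, mean, axis))

    ref_groups = [[] for _ in range(k)]
    cur_groups = [[] for _ in range(k)]
    for p in reference:
        ref_groups[nearest(p, centroids)].append(project(p))
    for p in current:
        cur_groups[nearest(p, centroids)].append(project(p))
    for ref, cur in zip(ref_groups, cur_groups):
        if len(ref) < min_size or len(cur) < min_size:
            continue  # too few points for a meaningful test
        _, p_value = ks_two_sample(ref, cur)
        if p_value < alpha / k:  # Bonferroni-style correction (assumed)
            return True
    return False
```

A caller would pass a reference window and a more recent window of equal-dimension points, for example `detect_drift(reference_window, current_window, k=3)`. Testing only the first principal component is what buys the speedup in this sketch, at the cost of missing shifts that are orthogonal to that axis; the sketch also ignores changes in cluster sizes, which a fuller detector would probably check as well.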
About the journal:
The Journal of Internet Technology accepts original technical articles in all disciplines of Internet Technology & Applications. Manuscripts are submitted for review with the understanding that they have not been published elsewhere.
Topics of interest to JIT include, but are not limited to:
Broadband Networks
Electronic service systems (Internet, Intranet, Extranet, E-Commerce, E-Business)
Network Management
Network Operating System (NOS)
Intelligent systems engineering
Government or Staff Jobs Computerization
National Information Policy
Multimedia systems
Network Behavior Modeling
Wireless/Satellite Communication
Digital Library
Distance Learning
Internet/WWW Applications
Telecommunication Networks
Security in Networks and Systems
Cloud Computing
Internet of Things (IoT)
IPv6 related topics are especially welcome.