Syed Sajjad Hussain, M. Hashmani, Vali Uddin, T. Ansari, Muslim Jameel
A Novel Approach to Detect Concept Drift Using Machine Learning
2021 International Conference on Computer & Information Sciences (ICCOINS), published 2021-07-13
DOI: 10.1109/ICCOINS49721.2021.9497232 (https://doi.org/10.1109/ICCOINS49721.2021.9497232)
Citations: 5
Abstract
Concept drift in data is reported to be one of the critical causes of performance degradation in machine learning, especially for volumetric data. Annotating concept drift is also a major research problem in this domain. This paper presents a novel approach for detecting concept drift in data. In addition, the performance of various machine learning algorithms after removing the instances that exhibit concept drift is compared with their performance on the original dataset. Specifically, the Euclidean distance of an instance within clusters and the mutual information of the instance are taken to indicate the instance's degree of concept drift. The approach has been applied to the SEA dataset.
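The abstract does not spell out how the cluster distance and the mutual information are combined, so the following is only a minimal sketch of one possible reading, not the authors' method. It assumes that cluster centroids are already available, that the "mutual information of an instance" can be approximated by the pointwise mutual information between the instance's cluster and its label, and that the two are combined by simple subtraction. The names `nearest_centroid`, `pointwise_mi`, `drift_scores`, the toy data, and the threshold are all hypothetical.

```python
import math
from collections import Counter


def nearest_centroid(x, centroids):
    """Return (index, Euclidean distance) of the centroid closest to x."""
    dists = [math.dist(x, c) for c in centroids]
    idx = min(range(len(centroids)), key=dists.__getitem__)
    return idx, dists[idx]


def pointwise_mi(cluster_ids, labels):
    """PMI(c, y) = log2(p(c, y) / (p(c) * p(y))) for every observed pair."""
    n = len(labels)
    joint = Counter(zip(cluster_ids, labels))
    p_cluster = Counter(cluster_ids)
    p_label = Counter(labels)
    return {
        (c, y): math.log2((k / n) / ((p_cluster[c] / n) * (p_label[y] / n)))
        for (c, y), k in joint.items()
    }


def drift_scores(X, y, centroids):
    """Assumed score: distance to the nearest centroid minus the PMI of
    the instance's (cluster, label) pair. Higher means more drift-like."""
    assigned = [nearest_centroid(x, centroids) for x in X]
    cluster_ids = [c for c, _ in assigned]
    pmi = pointwise_mi(cluster_ids, y)
    return [d - pmi[(c, label)] for (c, d), label in zip(assigned, y)]


# Toy data (invented for illustration): two tight groups plus one instance
# that sits far from its centroid and carries the label unusual for its cluster.
X = [(0, 0), (1, 0), (0, 1), (4, 4), (10, 10), (9, 10), (10, 9)]
y = [0, 0, 0, 1, 1, 1, 1]
centroids = [(0, 0), (10, 10)]

scores = drift_scores(X, y, centroids)

# Remove the instances whose score exceeds an arbitrary, hypothetical threshold.
threshold = 3.0
kept = [(x, label) for x, label, s in zip(X, y, scores) if s <= threshold]
```

In this toy run the instance `(4, 4)` receives the highest score, because it is both far from its nearest centroid and labeled unlike most of its cluster. A classifier could then be trained on `kept` and compared with one trained on the full data, which mirrors the comparison described in the abstract.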