Outliers classification for mining evolutionary community using Support Vector Machine and Logistic Regression on Azure ML
Authors: Shahi Dost, Sajid Anwer, Faryal Saud, Maham Shabbir
Published in: 2017 International Conference on Communication, Computing and Digital Systems (C-CODE), 2017-03-01
DOI: 10.1109/C-CODE.2017.7918931
Citations: 2
Abstract
Evolutionary Community Outliers (ECO) are a type of outlier group in which the average evolutionary behavior of certain objects differs, by some measure, from that of the rest of their community. Detecting, classifying and removing ECO is an important and challenging task in data cleaning and preprocessing. ECO detection differs from traditional outlier detection techniques, which detect objects whose nature is fairly different from that of the other objects in the community. Several outlier detection and removal studies have been presented in the last few years, but they are limited because they apply a single process at a time. The proposed approach uses Support Vector Machine (SVM) and Logistic Regression (LR) together to classify, detect and remove outliers from the evolutionary community dataset, which is a new technique for outlier detection and removal. We used Azure Machine Learning (ML), a state-of-the-art data processing tool, to test the proposed technique on forest fire data from Southern Algarve, Portugal, where it gave very good results. The accuracy of outlier detection and removal ranges from 73 to 93 percent, varying with the data acquisition process. The proposed technique shows remarkable results in detecting diverse ECO on both real-time and temporal datasets.
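The abstract does not say how the SVM and LR outputs are combined, and the Azure ML pipeline itself is not described here. As a rough illustration only, the sketch below assumes one possible reading: train both linear classifiers on the same labeled data and flag an object as an outlier only when both agree. The synthetic 2-D data, the agreement rule, the hyperparameters, and all function names are assumptions made for this example, not the authors' method or the Azure ML API.

```python
import math
import random


def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(X, y, lr=0.1, epochs=200):
    """Logistic regression by batch gradient descent; labels are 0 or 1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(dot(w, xi) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j] / n
            gb += err / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def train_linear_svm(X, y, lr=0.1, epochs=200, lam=0.01):
    """Linear SVM by batch subgradient descent on the L2-regularized hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            s = 2 * yi - 1  # map labels {0, 1} to {-1, +1}
            if s * (dot(w, xi) + b) < 1:  # point violates the margin
                for j in range(d):
                    gw[j] -= s * xi[j] / n
                gb -= s / n
        w = [wj - lr * gj for wj, gj in zip(w, gw)]
        b -= lr * gb
    return w, b


def is_outlier(x, lr_model, svm_model):
    """Assumed rule: flag x only when both models put it on the outlier side."""
    return all(dot(w, x) + b > 0 for w, b in (lr_model, svm_model))


def make_data(seed=0, n_in=40, n_out=10):
    """Synthetic 2-D points (hypothetical): 0 = community member, 1 = outlier."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(n_in)]
    X += [[rng.gauss(4, 0.5), rng.gauss(4, 0.5)] for _ in range(n_out)]
    return X, [0] * n_in + [1] * n_out


X, y = make_data()
lr_model = train_logistic(X, y)
svm_model = train_linear_svm(X, y)
flags = [is_outlier(x, lr_model, svm_model) for x in X]
accuracy = sum(f == bool(t) for f, t in zip(flags, y)) / len(y)
print(f"training accuracy of the combined rule: {accuracy:.2f}")
```

Requiring agreement trades recall for precision: an object is removed only when both models consider it an outlier. A union rule or averaged scores would be equally plausible readings of "at the same time".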