An Improved Over-sampling Algorithm based on iForest and SMOTE
Authors: Yifeng Zheng, Guohe Li, Teng Zhang
Published in: Proceedings of the 2019 8th International Conference on Software and Computer Applications
Publication date: 2019-02-19
DOI: 10.1145/3316615.3316641 (https://doi.org/10.1145/3316615.3316641)
Citations: 3
Abstract
Imbalanced learning is one of the most challenging problems in supervised learning, and many strategies have been designed to tackle imbalanced sample distributions. Over-sampling techniques, which achieve a relatively balanced class distribution by synthesizing samples, are receiving increasing attention. In this paper, we present an over-sampling approach based on Isolation Forest (iForest) and SMOTE, called iForest-SMOTE. First, an iForest-score, computed from an iForest model, is used to assess the importance of each minority class sample. Then, in each SMOTE step, roulette wheel selection based on the iForest-score is used to select the neighbor sample. Finally, an M-dimensional-sphere interpolation approach is employed to generate the new sample. The experiments illustrate that our approach accounts for both the spatial distribution of the minority class samples and the way new samples are synthesized. Therefore, iForest-SMOTE can effectively improve the performance of the classification model.
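The three steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions because the abstract does not specify them: the per-sample `scores` are taken as given, non-negative importance weights (they could, for instance, be derived from an isolation forest's anomaly scores), the neighbor candidates are the `k` nearest minority samples, and "M-dimensional-sphere interpolation" is read as drawing a point uniformly from the ball whose diameter is the segment between the sample and its chosen neighbor. All function names are hypothetical.

```python
import math
import random


def roulette_pick(weights, rng):
    """Return an index with probability proportional to its non-negative weight."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    threshold = rng.random() * total
    running = 0.0
    for index, weight in enumerate(weights):
        running += weight
        if threshold < running:
            return index
    return len(weights) - 1  # guard against floating-point round-off


def k_nearest(points, i, k):
    """Indices of the k points closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]


def sphere_interpolate(a, b, rng):
    """Draw a point uniformly from the ball whose diameter is the segment a-b.

    Assumption: one plausible reading of "M-dimensional-sphere
    interpolation"; the abstract does not define the procedure.
    """
    m = len(a)
    center = [(x + y) / 2 for x, y in zip(a, b)]
    radius = math.dist(a, b) / 2
    # A normalized Gaussian vector gives a uniformly random direction.
    direction = [rng.gauss(0.0, 1.0) for _ in range(m)]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    # Scaling by u**(1/m) makes the point uniform over the ball's volume.
    scale = radius * rng.random() ** (1.0 / m) / norm
    return [c + scale * d for c, d in zip(center, direction)]


def oversample(minority, scores, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))  # base sample, chosen uniformly
        neighbors = k_nearest(minority, i, k)
        # Score-weighted roulette wheel over the candidate neighbors.
        j = neighbors[roulette_pick([scores[n] for n in neighbors], rng)]
        synthetic.append(sphere_interpolate(minority[i], minority[j], rng))
    return synthetic
```

Compared with standard SMOTE, which picks a neighbor uniformly and interpolates along the connecting line segment, this sketch biases the neighbor choice by score and spreads new samples over a ball rather than a line. How the paper maps iForest output to selection weights is not stated in the abstract, so the weighting here should be treated as a placeholder.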