Bank Customer Churn Prediction Using SMOTE: A Comparative Analysis
M. A. Hambali, Ishaku Andrew
Qeios, published 2024-03-18
DOI: 10.32388/h82xtw (https://doi.org/10.32388/h82xtw)
Citations: 0
Abstract
In today's market, customers have a plethora of options when deciding where to invest their money. Consequently, customer churn and engagement have emerged as prominent concerns. With an increasing number of service providers targeting the same customer base, providers must understand evolving customer behavior and heightened expectations in order to retain their clientele. Numerous studies have addressed customer churn, with data mining frequently employed to predict bank customer attrition. Although researchers have proposed various approaches for predicting churn, some machine learning (ML) algorithms struggle to identify churners accurately, especially when the dataset is imbalanced. This paper therefore presents an application of the Synthetic Minority Over-sampling Technique (SMOTE) to a bank churn dataset. SMOTE was employed to address the class imbalance, and a Genetic Algorithm (GA) was applied to select the most informative features from the original dataset. The selected features were evaluated using four classification algorithms: Random Forest (RF), K-Nearest Neighbor (KNN), Artificial Neural Network (ANN), and AdaBoost. The KNN model outperformed the other models, achieving 96% accuracy, 96% precision, and a 96% F-measure. Furthermore, we compared our results with existing models that utilized the same dataset, and our proposed strategy outperformed them.
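To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. It is a simplified pure-Python illustration, not the authors' implementation. The function names `smote` and `balance`, the toy data, and the choice of k are assumptions made for this example, and GA feature selection and the classifiers are not shown.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority samples
    and their k nearest minority-class neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbours of `base` among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal size."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return list(X) + new_points, list(y) + [minority_label] * len(new_points)


# Hypothetical toy data: 8 retained customers (label 0), 3 churners (label 1).
X = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4),
     (0.4, 0.2), (0.1, 0.5), (0.5, 0.1), (0.3, 0.1),
     (0.9, 0.8), (0.8, 0.9), (1.0, 1.0)]
y = [0] * 8 + [1] * 3

X_bal, y_bal = balance(X, y, k=2)
print(y_bal.count(0), y_bal.count(1))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the evaluation data. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn library, would usually be preferred over a hand-written version.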