Title: Identification of Customer Churn Considering Difficult Case Mining
Authors: Jianfeng Li, Xuepeng Bai, Qian Xu, Dexiang Yang
Journal: Systems
Published: 2023-06-25
DOI: https://doi.org/10.3390/systems11070325
Abstract
In customer churn modeling, churned users are far fewer than retained users, so traditional classification models often fail to identify users with a tendency to churn accurately and comprehensively. Simply increasing the misclassification cost of minority-class samples, as cost-sensitive methods do, is not sufficient to address this imbalance. This paper proposes using Focal Loss, a hard example mining technique, to add a class weight α and a focusing parameter γ to the cross-entropy loss function of LightGBM. The modified loss emphasizes the identification of customers at risk of churning by raising the misclassification cost of both minority-class and difficult-to-classify samples. On this basis, the FocalLoss_LightGBM model is proposed and compared with random forests, SVM, XGBoost, and LightGBM. The empirical analysis is based on a dataset of credit card users publicly available on Kaggle. The proposed model achieves higher AUC, TPR, and G-mean values than the existing models, indicating that it can effectively improve the accuracy and stability of identifying potential churners.
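To make the role of α and γ concrete, the following is a minimal sketch of the standard binary focal loss and its gradient with respect to the raw score, written with NumPy. It is not the authors' implementation: the function names (`focal_loss`, `focal_loss_grad`), the label convention (churn = 1), and the values α = 0.75 and γ = 2 are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map raw scores to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def _terms(z, y, alpha):
    """Return p_t (probability of the true class), alpha_t and the label sign."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    p = sigmoid(z)
    p_t = np.clip(np.where(y == 1, p, 1.0 - p), 1e-12, 1.0)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    sign = 2.0 * y - 1.0  # +1 for churn (assumed label 1), -1 otherwise
    return p_t, alpha_t, sign

def focal_loss(z, y, alpha=0.75, gamma=2.0):
    """Per-sample focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p_t, alpha_t, _ = _terms(z, y, alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def focal_loss_grad(z, y, alpha=0.75, gamma=2.0):
    """Derivative of the per-sample focal loss with respect to the raw score z."""
    p_t, alpha_t, sign = _terms(z, y, alpha)
    return sign * alpha_t * (1.0 - p_t) ** gamma * (
        gamma * p_t * np.log(p_t) + p_t - 1.0
    )

# Two churners (y = 1): one scored confidently right, one confidently wrong.
scores = np.array([3.0, -3.0])
labels = np.array([1, 1])
cross_entropy = -np.log(sigmoid(scores))
focal = focal_loss(scores, labels)

# The hard (misclassified) sample dominates far more under focal loss.
print("cross-entropy hard/easy ratio:", cross_entropy[1] / cross_entropy[0])
print("focal loss hard/easy ratio:   ", focal[1] / focal[0])
```

With γ = 0 the modulating factor disappears and the loss reduces to class-weighted cross-entropy, which is the plain cost-sensitive setting the abstract describes as insufficient. Using such a loss as a custom objective in a gradient boosting library would also require the second derivative, which this sketch omits.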