Dr. Sonali Nemade, Dr. Sujata Patil, Mrs. Deepashree Mehendale, Mrs. Vidya Shinde, Mrs. Reshma Masurekar
International Journal of Scientific Research in Science, Engineering and Technology, vol. 9. Published 2024-07-18. DOI: https://doi.org/10.32628/ijsrset241143
To Study and Analyse the Customer Churn Prediction using Machine Learning Algorithm
Customer churn prediction (CCP) is one of the challenging problems in the e-commerce industry. With advances in machine learning and artificial intelligence, the ability to predict customer churn has improved significantly. Our proposed methodology consists of six phases. In the first two phases, data pre-processing and feature analysis are performed. In the third phase, feature selection is carried out. Next, the data is split into a training set and a test set in an 80:20 ratio. In the prediction phase, popular predictive models, such as logistic regression and a random forest classifier, are fitted on the training set and compared on accuracy. In addition, K-fold cross-validation is applied to the training set for hyperparameter tuning and to guard against overfitting. Finally, the results obtained on the test set are evaluated using a confusion matrix and the area under the ROC curve (AUC).
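The split and evaluation steps described in the abstract can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`split_train_test`, `k_fold_indices`, `confusion_matrix`, `roc_auc`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for the example, and model fitting is left out. In practice the models would come from a machine learning library.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test); 80:20 by default."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) index lists for K-fold cross-validation."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = list(range(0, start)) + list(range(stop, n_samples))
        yield train_idx, val_idx
        start = stop


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means 'churned'."""
    tn = fp = fn = tp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the fraction of
    (churner, non-churner) pairs in which the churner gets the higher
    score; tied pairs count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration, not taken from the paper).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]          # hypothetical churn probabilities
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(confusion_matrix(labels, preds))  # (2, 0, 1, 1)
print(roc_auc(labels, scores))          # 0.75
```

Hyperparameter tuning would loop over candidate settings, fit a model on each `train_idx` subset of the training set, score it on the matching `val_idx` subset, and keep the setting with the best average validation score. The test set is touched only once, for the final confusion matrix and AUC.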