Predicting customer churn: A systematic literature review

Soumi De, P. Prabu
Journal of Discrete Mathematical Sciences & Cryptography, published 2022-10-03
DOI: 10.1080/09720529.2022.2133238 (https://doi.org/10.1080/09720529.2022.2133238)

Abstract

Churn prediction is an active research topic, and machine learning approaches have made significant contributions to it. Models built to address customer churn aim to identify customers who are at high risk of terminating the services a company offers. An effective machine learning model therefore contributes indirectly to an organization's revenue growth by identifying “at risk” customers well in advance, which improves the success rate of retention campaigns and reduces the costs associated with churn. This study explores the state-of-the-art machine learning techniques used in churn prediction. It presents a systematic literature review driven by 5 research questions and rigorous quality assessment criteria. Of 420 studies published between 2018 and 2021, 38 were selected as primary studies. The review identifies popular machine learning techniques used in churn prediction and provides directions for future research. Firstly, the study finds that churn models lack generalization capability across industry domains. It therefore identifies a need for researchers to explore techniques that extend beyond model experimentation in order to improve the efficiency of classifiers across domains. Secondly, traditional approaches to churn prediction depend heavily on demographic, product-usage, and revenue features alone. Recent papers, however, have integrated features derived from social network analysis into churn models and achieved satisfactory results. Furthermore, there is a lack of scientific work that uses the information-rich content of customer-company interactions conducted via email, chat conversations, and other means. This area is the least explored. Thirdly, there is scope to investigate the effect of hybrid sampling strategies on model performance, which has not been extensively evaluated in the literature. Lastly, there is no formal guideline on the correct evaluation parameters to use for models applied to imbalanced churn datasets. This is a grey area that requires greater attention.
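The last point, on evaluating models on imbalanced churn datasets, can be illustrated with a small sketch. It is not taken from the review: the customer counts, the 5% churn rate, and the names `confusion_counts`, `churn_metrics`, `always_stay`, and `flagging_model` are all hypothetical. Under those assumptions, it shows how a model that never predicts churn can score higher on accuracy than a model that finds most churners, which is one reason the choice of evaluation parameter matters.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn), treating label 1 as the churn class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def churn_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for the churn class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical data: 1000 customers, of whom 50 churn (a 5% churn rate).
y_true = [1] * 50 + [0] * 950

# Baseline that predicts "no churn" for every customer.
always_stay = [0] * 1000

# Hypothetical model: flags 40 of the 50 churners and 60 of the 950 non-churners.
flagging_model = [1] * 40 + [0] * 10 + [1] * 60 + [0] * 890

baseline = churn_metrics(y_true, always_stay)
model = churn_metrics(y_true, flagging_model)

print("baseline:", baseline)
print("model:   ", model)
```

With these made-up numbers the baseline reaches 95% accuracy while identifying no churners at all, whereas the flagging model has lower accuracy (93%) but a recall of 80% on the churn class. Class-specific measures such as precision, recall, and F1 expose that difference; accuracy alone hides it.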