Churn Prediction in Telecommunications Industry Based on Conditional Wasserstein GAN

Chang Su, Linglin Wei, Xianzhong Xie
Published in: 2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC)
Publication date: 2022-12-01
DOI: 10.1109/HiPC56025.2022.00034 (https://doi.org/10.1109/HiPC56025.2022.00034)

Abstract

In recent years, with the globalization and advancement of the telecommunications industry, competition in the telecommunications market has intensified, accompanied by high customer churn rates. Telecom operators therefore urgently need effective marketing strategies to prevent customers from churning. Customer churn prediction is an important means of preventing churn, but because data in the telecommunications industry are imbalanced, prediction results are often unsatisfactory. The most common way to improve prediction performance is to oversample the minority class. Standard methods such as SMOTE usually focus only on the minority class samples and tend to ignore the relationship between minority and majority class samples. In addition, when the data are high-dimensional and follow a complex distribution, the Euclidean distance used in the SMOTE algorithm is not particularly meaningful, and SMOTE tends to underperform. Generative Adversarial Networks (GANs), by contrast, are able to model complex distributions and can in principle be used to generate minority class cases. This paper therefore adopts a combined GAN model (CWGAN), based on Wasserstein GAN with Gradient Penalty (WGANGP) and Conditional GAN (CGAN), to handle the imbalanced data in the telecom industry. This is also the first time that a GAN has been used to deal with the data imbalance problem in the telecom industry. The paper also introduces a hybrid attention mechanism (CBAM) to help the generator focus on features relevant to the classification task. Finally, the effectiveness of the adopted method is demonstrated on four commonly used machine learning classifiers.
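To make the abstract's criticism of SMOTE concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the paper's CWGAN and not a reference SMOTE implementation. The function name `smote_like_oversample`, its parameters, and the toy feature values are assumptions introduced for illustration only, as is treating churners as the minority class. The sketch shows the two properties the abstract points to: synthetic points are built from minority samples alone, and neighbours are chosen by Euclidean distance.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.

    Hypothetical helper for illustration; majority samples are never used.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority-class (assumed: churner) feature vectors; values are made up.
churners = [(0.9, 0.1), (0.8, 0.2), (1.0, 0.3), (0.7, 0.15)]
new_points = smote_like_oversample(churners, n_new=6)
```

Every synthetic point is a convex combination of two existing minority samples, so it can never leave the region those samples already span. The majority class plays no part in where new points are placed. A conditional generative model, as adopted in the paper, is instead trained on both classes with the class label as an input.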