A cost-sensitive ensemble deep forest approach for extremely imbalanced credit fraud detection

IF 1.5 | Zone 4 (Economics) | JCR Q3, BUSINESS, FINANCE | Quantitative Finance | Pub Date: 2023-07-26 | DOI: 10.1080/14697688.2023.2230264

Fang Zhao, Gang Li, Yanxia Lyu, Hong-Dong Ma, Xiaoqian Zhu

Quantitative Finance, pp. 1397-1409. Journal Article.

Citations: 0

Abstract

Credit fraud detection modeling helps prevent default risks and reduce economic losses, and increasingly sophisticated methods have been designed for predicting the default probability of clients. In such problems, fraud clients are far fewer than good clients, which makes the fraud class hard to detect. To minimize financial losses in extremely imbalanced datasets, this paper proposes a novel cost-sensitive ensemble model under the deep forest framework. The model first introduces a cost-sensitive strategy that assigns a higher cost to the fraud class, thereby increasing the model's attention to fraud samples. For the base classifiers of an ensemble, greater diversity among them generally yields better performance after combination. The model therefore adds superior cost-sensitive base classifiers to the cascade structure to improve overall performance. The model also uses Type II error as the convergence index to automatically adjust the depth of the cascade structure. Experiments on the European credit dataset and a private electronic transaction dataset demonstrate the performance of the proposed method. The results indicate that the proposed model outperforms most benchmarks in detecting fraud samples.
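The abstract does not give implementation details, so the following is only a minimal sketch of two of the ideas it names: a cost-sensitive decision rule and a cascade depth chosen by Type II error. It is not the authors' method. It assumes fraud is the positive class (label 1), so that Type II error means the share of fraud cases missed. The function names, the cost values, and the `level_error` callback (a stand-in for training a cascade of a given depth and scoring it on validation data) are all hypothetical.

```python
from typing import Callable, Sequence


def cost_sensitive_threshold(cost_fn: float, cost_fp: float) -> float:
    """Probability cutoff that minimizes expected misclassification cost.

    Flag a sample as fraud when p * cost_fn > (1 - p) * cost_fp,
    which rearranges to p > cost_fp / (cost_fp + cost_fn).
    """
    return cost_fp / (cost_fp + cost_fn)


def type_ii_error(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Share of actual fraud cases (label 1) predicted as legitimate (label 0)."""
    fraud_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not fraud_preds:
        return 0.0
    missed = sum(1 for p in fraud_preds if p == 0)
    return missed / len(fraud_preds)


def grow_cascade(
    level_error: Callable[[int], float],
    max_levels: int = 10,
    tol: float = 1e-6,
) -> int:
    """Add cascade levels until validation Type II error stops improving.

    level_error(k) is a hypothetical callback returning the validation
    Type II error of a cascade with k levels. Returns the chosen depth.
    """
    best_depth, best_err = 1, level_error(1)
    for depth in range(2, max_levels + 1):
        err = level_error(depth)
        if err >= best_err - tol:  # no meaningful improvement: stop growing
            break
        best_depth, best_err = depth, err
    return best_depth


# Illustrative values only: a missed fraud is assumed 9x as costly as a false alarm.
threshold = cost_sensitive_threshold(cost_fn=9.0, cost_fp=1.0)
probs = [0.05, 0.2, 0.8, 0.08]   # made-up predicted fraud probabilities
y_true = [0, 1, 1, 1]
y_pred = [int(p > threshold) for p in probs]
miss_rate = type_ii_error(y_true, y_pred)

# Made-up validation errors per depth, standing in for real training runs.
fake_errors = {1: 0.4, 2: 0.3, 3: 0.3, 4: 0.1}
chosen_depth = grow_cascade(lambda k: fake_errors[k], max_levels=4)
```

Raising the assumed cost of a missed fraud lowers the cutoff, so more borderline samples are flagged. The depth search here stops at the first level that fails to reduce Type II error, which is one simple reading of "convergence index"; the paper may use a different stopping rule.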

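Source journal

Quantitative Finance (Social Sciences: Mathematics, Interdisciplinary Applications)

CiteScore: 3.20

Self-citation rate: 7.70%

Articles published: 102

Review time: 4-8 weeks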

About the journal: The frontiers of finance are shifting rapidly, driven in part by the increasing use of quantitative methods in the field. Quantitative Finance welcomes original research articles that reflect the dynamism of this area. The journal provides an interdisciplinary forum for presenting both theoretical and empirical approaches and offers rapid publication of original new work with high standards of quality. The readership is broad, embracing researchers and practitioners across a range of specialisms and within a variety of organizations. All articles should aim to be of interest to this broad readership.