Fighting sampling bias: A framework for training and evaluating credit scoring models

European Journal of Operational Research | IF 6.0 | Zone 2 (Management) | Q1 (Operations Research & Management Science) | Pub Date: 2025-02-07 | DOI: 10.1016/j.ejor.2025.01.040
Nikita Kozodoi, Stefan Lessmann, Morteza Alamgir, Luis Moreira-Matias, Konstantinos Papakonstantinou
European Journal of Operational Research, Volume 324, Issue 2, Pages 616-628. Journal Article. https://www.sciencedirect.com/science/article/pii/S0377221725000839

Abstract

Scoring models support decision-making in financial institutions. Their estimation and evaluation rely on labeled data from previously accepted clients. Ignoring rejected applicants with unknown repayment behavior introduces sampling bias, as the available labeled data only partially represents the population of potential borrowers. This paper examines the impact of sampling bias and introduces new methods to mitigate its adverse effect. First, we develop a bias-aware self-labeling algorithm for scorecard training, which debiases the training data by adding selected rejects with an inferred label. Second, we propose a Bayesian framework to address sampling bias in scorecard evaluation. To provide reliable projections of future scorecard performance, we include rejected clients with random pseudo-labels in the test set and use Monte Carlo sampling to estimate the scorecard’s expected performance across label realizations. We conduct extensive experiments using both synthetic and observational data. The observational data includes an unbiased sample of applicants accepted without scoring, representing the true borrower population and facilitating a realistic assessment of reject inference techniques. The results show that our methods outperform established benchmarks in predictive accuracy and profitability. Additional sensitivity analysis clarifies the conditions under which they are most effective. Comparing the relative effectiveness of addressing sampling bias during scorecard training versus evaluation, we find the latter much more promising. For example, we estimate the expected return per dollar issued to increase by up to 2.07 and up to 5.76 percentage points when using bias-aware self-labeling and Bayesian evaluation, respectively.
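The evaluation idea in the abstract (adding rejects with random pseudo-labels to the test set and averaging a metric over label realizations) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `auc` and `mc_expected_auc`, the use of AUC as the metric, and the per-reject prior `p_bad_rejects` (an assumed probability that each rejected applicant would default) are all assumptions made for illustration, and the abstract does not specify how the paper constructs its prior.

```python
import numpy as np


def auc(y_true, scores):
    """AUC via pairwise comparison; a higher score means more likely bad (1)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    if len(pos) == 0 or len(neg) == 0:
        return np.nan  # AUC is undefined when only one class is present
    diff = pos[:, None] - neg[None, :]
    # Pairs ranked correctly count 1, ties count 0.5.
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))


def mc_expected_auc(y_accepts, s_accepts, s_rejects, p_bad_rejects,
                    n_draws=200, seed=0):
    """Monte Carlo estimate of expected AUC on accepts plus pseudo-labeled rejects.

    y_accepts:     observed labels of accepted clients (1 = bad, 0 = good)
    s_accepts:     scorecard scores for accepted clients
    s_rejects:     scorecard scores for rejected clients
    p_bad_rejects: ASSUMED prior probability of default for each reject
    """
    rng = np.random.default_rng(seed)
    scores = np.concatenate([s_accepts, s_rejects])
    draws = []
    for _ in range(n_draws):
        # One label realization: a Bernoulli pseudo-label for every reject.
        pseudo = rng.binomial(1, p_bad_rejects)
        labels = np.concatenate([y_accepts, pseudo])
        draws.append(auc(labels, scores))
    # Average the metric over all sampled label realizations.
    return float(np.nanmean(draws))


if __name__ == "__main__":
    # Tiny synthetic example; all numbers are made up for illustration.
    y_acc = np.array([0, 0, 0, 1])
    s_acc = np.array([0.10, 0.20, 0.30, 0.60])
    s_rej = np.array([0.70, 0.80, 0.55])
    p_rej = np.array([0.60, 0.70, 0.50])
    print(mc_expected_auc(y_acc, s_acc, s_rej, p_rej))
```

In this sketch the estimate is only as good as the assumed prior for rejects: a prior taken from the very scorecard being evaluated would tend to flatter that scorecard, which is one reason the choice of prior deserves care.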

Source journal
European Journal of Operational Research (Management Science: Operations Research & Management Science)
CiteScore: 11.90
Self-citation rate: 9.40%
Articles published: 786
Review time: 8.2 months
Journal description: The European Journal of Operational Research (EJOR) publishes high quality, original papers that contribute to the methodology of operational research (OR) and to the practice of decision making.