A machine learning workflow to address credit default prediction

Rambod Rahmani, Marco Parola, Mario G. C. A. Cimino
arXiv - QuantFin - Risk Management. Published 2024-03-06. DOI: arxiv-2403.03785 (https://doi.org/arxiv-2403.03785)

Abstract

Due to the recent surge of interest in Financial Technology (FinTech), applications such as credit default prediction (CDP) are gaining significant industrial and academic attention. CDP plays a crucial role in assessing the creditworthiness of individuals and businesses, enabling lenders to make informed decisions regarding loan approvals and risk management. In this paper, we propose a workflow-based approach to improve CDP, the task of assessing the probability that a borrower will default on his or her credit obligations. The workflow consists of multiple steps, each designed to leverage the strengths of different techniques featured in machine learning pipelines and thus best solve the CDP task. We employ a comprehensive and systematic approach, starting with data preprocessing using Weight of Evidence encoding, a technique that scales the data in a single step by removing outliers, handling missing values, and making the data uniform for models that work with different data types. Next, we train several families of learning models, introducing ensemble techniques to build more robust models and hyperparameter optimization via multi-objective genetic algorithms to account for both predictive accuracy and financial aspects. Our research aims to contribute to the FinTech industry by providing a tool for more accurate and reliable credit risk assessment, benefiting both lenders and borrowers.
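To make the preprocessing step more concrete, the following is a minimal sketch of Weight of Evidence encoding for a single categorical feature. It is not the authors' implementation. The function names (`woe_table`, `woe_encode`), the toy data, the additive `smoothing` constant, and the sign convention WoE = ln(share of non-defaults / share of defaults) are all assumptions made for illustration. Missing values are placed in their own bin, which is one common way the technique handles them.

```python
import math
from collections import defaultdict


def woe_table(categories, defaults, smoothing=0.5):
    """Map each category to its Weight of Evidence.

    Assumed convention: WoE = ln(share of non-defaults / share of defaults).
    `smoothing` (hypothetical parameter) is added to every bin count so that
    empty bins do not produce log(0). None values form a separate bin.
    """
    good = defaultdict(float)  # non-default counts per bin
    bad = defaultdict(float)   # default counts per bin
    for cat, y in zip(categories, defaults):
        key = "MISSING" if cat is None else cat
        if y == 1:
            bad[key] += 1
        else:
            good[key] += 1

    bins = set(good) | set(bad)
    total_good = sum(good[b] + smoothing for b in bins)
    total_bad = sum(bad[b] + smoothing for b in bins)

    return {
        b: math.log(
            ((good[b] + smoothing) / total_good)
            / ((bad[b] + smoothing) / total_bad)
        )
        for b in bins
    }


def woe_encode(categories, table):
    """Replace each raw category with its numeric WoE value."""
    return [table["MISSING" if c is None else c] for c in categories]


# Toy data (invented for illustration): housing status and default flag.
housing = ["rent", "rent", "own", "own", None, "rent"]
default = [1, 1, 0, 0, 1, 0]

table = woe_table(housing, default)
encoded = woe_encode(housing, table)
```

Under this convention a positive value marks a bin with relatively fewer defaults, so every feature, whatever its original type, ends up on the same numeric log-odds scale.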