Machine learning and artificial neural networks to construct P2P lending credit-scoring model: A case using Lending Club data

Quantitative Finance and Economics (IF 3.2, Q1, Business, Finance) Pub Date: 2022-01-01 DOI: 10.3934/qfe.2022013
An-Hsing Chang, Li-Kai Yang, R. Tsaih, Shih-Kuei Lin
Citations: 5

Abstract

In this study, we constructed a credit-scoring model for P2P loans using several machine learning and artificial neural network (ANN) methods: logistic regression (LR), a support vector machine, a decision tree, random forest, XGBoost, LightGBM and 2-layer neural networks. For each method, we explored several hyperparameter settings through grid search and cross-validation to find the most suitable credit-scoring model in terms of training time and test performance. We obtained the open P2P loan data from Lending Club and cleaned it using feature engineering concepts. To identify significant default factors, we pre-trained an XGBoost model on all of the data and extracted the feature importance. The 16 selected features carry economic implications for research on default prediction in P2P loans. The empirical results also show that gradient-boosting decision tree methods, including XGBoost and LightGBM, outperform the ANN and LR methods that are commonly used for traditional credit scoring. Among all of the methods, XGBoost performed the best.
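The grid search with cross-validation mentioned in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the data are synthetic (not Lending Club), the model is a small gradient-descent logistic regression standing in for the methods compared in the paper, and every name here (`train_logreg`, `grid_search_cv`, `make_toy_loans`, the `lr` and `l2` hyperparameters, the grid values) is a hypothetical choice made for illustration. In practice a library routine such as scikit-learn's `GridSearchCV` would typically take the place of the hand-written loop.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logreg(X, y, lr, l2, epochs=200):
    """Fit logistic regression by batch gradient descent with an L2 penalty."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def accuracy(model, X, y):
    """Share of rows whose thresholded prediction matches the label."""
    w, b = model
    hits = sum(
        (sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5) == bool(yi)
        for xi, yi in zip(X, y)
    )
    return hits / len(y)

def k_fold_indices(n, k, seed=0):
    # Shuffle row indices once, then deal them into k disjoint folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def grid_search_cv(X, y, grid, k=5):
    """Score every (lr, l2) pair by mean k-fold accuracy; return best and all."""
    folds = k_fold_indices(len(X), k)
    results = []
    for lr in grid["lr"]:
        for l2 in grid["l2"]:
            scores = []
            for held_out in folds:
                held = set(held_out)
                train = [i for i in range(len(X)) if i not in held]
                model = train_logreg(
                    [X[i] for i in train], [y[i] for i in train], lr, l2
                )
                scores.append(
                    accuracy(model, [X[i] for i in held_out], [y[i] for i in held_out])
                )
            results.append(({"lr": lr, "l2": l2}, sum(scores) / k))
    return max(results, key=lambda r: r[1]), results

def make_toy_loans(n=200, seed=42):
    # Synthetic stand-in for loan data: two features, binary "default" label.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        X.append([x1, x2])
        y.append(1 if rng.random() < sigmoid(2.0 * x1 - 1.0 * x2) else 0)
    return X, y

X, y = make_toy_loans()
grid = {"lr": [0.1, 0.5], "l2": [0.0, 0.01, 0.1]}  # assumed illustrative grid
(best_params, best_score), all_results = grid_search_cv(X, y, grid, k=5)
print(best_params, round(best_score, 3))
```

Each hyperparameter combination is scored only on folds it was not trained on, so the selected setting reflects held-out performance rather than training fit. The abstract also weighs training time, which this sketch does not measure.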