Imbalanced credit risk prediction based on data fillers and modified balanced random forest improved by Bayesian optimization

Hongyu Zhang, Zhenjun Ye
Journal: Advances in Economic Development and Management Research
Publication date: 2024-04-18
Publication type: Journal Article
DOI: 10.61935/aedmr.2.1.2024.p115 (https://doi.org/10.61935/aedmr.2.1.2024.p115)
Citations: 0

Abstract

Because of the distribution characteristics of financial big data, credit risk prediction models often face problems such as imbalanced class distributions and difficult data preprocessing, and high-accuracy models often come at the cost of low computational efficiency. This paper therefore constructs a complete imbalanced credit risk prediction model, BO-PBRF, and improves the underlying algorithm to address these common problems in financial data. In the data preprocessing stage, two missing-value fillers are generated from the original data so that new data can be processed in the same way later. In the modeling stage, we improve the balanced random forest algorithm so that the model not only handles imbalanced data sets but also runs faster, making it better suited to the rapid growth of financial big data. In addition, we incorporate Bayesian optimization into model building to further improve accuracy, especially in predicting defaulted loans. To verify the effectiveness of the proposed model, we use real-world credit data in an empirical study and compare the model with previous models. The experimental results show that, among the compared models, the proposed model achieves the best prediction performance on default data.
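The abstract does not specify how the two missing-value fillers or the modified balanced random forest are built, so the following Python sketch only illustrates the general ideas under stated assumptions. It assumes one filler for numeric columns (the training median) and one for categorical columns (the training mode), both fitted once and reused on new data, and it shows the class-balanced bootstrap that a standard balanced random forest draws for each tree. The function names, column names, and toy records are hypothetical; tree fitting and Bayesian optimization are left out.

```python
import random
from collections import Counter
from statistics import median


def fit_fillers(rows, numeric_cols, categorical_cols):
    """Learn fill values from training rows (dicts); None marks a missing value.

    Assumption: median for numeric columns, most frequent value for
    categorical columns. The paper's actual fillers may differ.
    """
    fillers = {}
    for col in numeric_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        fillers[col] = median(observed)
    for col in categorical_cols:
        observed = [r[col] for r in rows if r[col] is not None]
        fillers[col] = Counter(observed).most_common(1)[0][0]
    return fillers


def apply_fillers(rows, fillers):
    """Return copies of rows with missing values replaced by learned fill values."""
    return [
        {col: (fillers[col] if val is None and col in fillers else val)
         for col, val in row.items()}
        for row in rows
    ]


def balanced_bootstrap(labels, rng):
    """Indices for one tree, assuming binary labels: a bootstrap of the
    minority class plus an equally sized sample (with replacement) from
    the majority class."""
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    majority_idx = [i for i, y in enumerate(labels) if y != minority]
    n = len(minority_idx)
    return rng.choices(minority_idx, k=n) + rng.choices(majority_idx, k=n)


# Hypothetical toy loan records: default == 1 is the rare class.
train = [
    {"income": 52.0, "purpose": "car", "default": 0},
    {"income": None, "purpose": "car", "default": 0},
    {"income": 38.0, "purpose": None, "default": 0},
    {"income": 61.0, "purpose": "home", "default": 0},
    {"income": 45.0, "purpose": "car", "default": 0},
    {"income": 20.0, "purpose": "home", "default": 1},
    {"income": 25.0, "purpose": "car", "default": 1},
]
fillers = fit_fillers(train, ["income"], ["purpose"])
train_filled = apply_fillers(train, fillers)

# New applications reuse the fillers fitted on the training data.
new_filled = apply_fillers([{"income": None, "purpose": None}], fillers)

# One balanced sample, as would be drawn for a single tree.
labels = [r["default"] for r in train_filled]
sample = balanced_bootstrap(labels, random.Random(0))
sample_counts = Counter(labels[i] for i in sample)
```

Fitting the fillers once on the training data means new loan applications are imputed with the same values, without refitting. In a full model, each tree would be trained on one such balanced sample, so that defaulted and non-defaulted loans carry equal weight during training.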