Prediction of Credit Risk based on Logistic Regression and Random Forest technique
Xin Yang
Proceedings of the 7th International Conference on Cyber Security and Information Engineering, 2022-09-23
https://doi.org/10.1145/3558819.3565138
With the growing demand for bank loans, the probability of non-performing loans, that is, loan defaults, has also risen sharply. We design a machine learning algorithm to address this problem; it can reduce loan risk and improve service efficiency, especially when the data are imbalanced. First, we train a random forest model on historical bank loan data and associated data from other financial institutions. Second, we revise the random-forest-based classification algorithm for imbalanced data and tune the feature extraction methods. Third, the results show that the machine learning risk prediction algorithm outperforms traditional statistical algorithms. In addition, we use the random forest algorithm to measure the impact of each data feature, which makes it possible to identify the features with the greatest influence on the predicted outcome and allows more accurate loan risk assessment in the financial sector.
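The abstract singles out imbalanced data as the main difficulty but does not describe how the classifier was revised. The sketch below is therefore not the paper's method. It only illustrates, under stated assumptions, why imbalance matters in default prediction. The portfolio (950 repaid loans, 50 defaults) is hypothetical, the helper names `balanced_class_weights` and `default_class_metrics` are introduced here for illustration, and the weighting rule n_samples / (n_classes * class_count) is the common "balanced" heuristic, assumed rather than taken from the paper.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def default_class_metrics(y_true, y_pred, positive=1):
    """Return accuracy, plus precision and recall for the positive (default) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when no positives are predicted or present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical portfolio: 950 repaid loans (label 0) and 50 defaults (label 1).
y_true = [0] * 950 + [1] * 50

# A degenerate model that never predicts a default.
always_repaid = [0] * len(y_true)

weights = balanced_class_weights(y_true)
baseline = default_class_metrics(y_true, always_repaid)

print("class weights:", weights)
print("always-repaid baseline:", baseline)
```

The degenerate model is right on every repaid loan and wrong on every default, so its accuracy looks high while its recall on defaults is zero. This is why accuracy alone is a poor yardstick for comparing a random forest with a traditional statistical model on such data. The computed weights make each default count far more than each repaid loan, which is one common way to make a classifier pay attention to the minority class.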