Research on borrower's credit classification of P2P network loan based on LightGBM algorithm
Sen Zhang, Yuping Hu, Zhuoyi Tan
Int. J. Embed. Syst., published 2019-09-20
DOI: 10.1504/ijes.2019.102434 (https://doi.org/10.1504/ijes.2019.102434)
Citations: 5
Abstract
Classifying borrowers by credit is the main way to reduce the credit risk of P2P online loans effectively. This paper applies the LightGBM algorithm, whose advantage is high data-classification accuracy. Feature engineering is used to extract, select and reconstruct features from the original data. One-hot encoding re-encodes the discretised feature indicators, and z-score normalisation standardises the continuous variables. All feature indicators are then re-ranked by contribution and reduced in dimension with PCA, and the effective feature indicators are retained for training and testing. Finally, sample imbalance and model parameter optimisation are addressed with ten-fold cross-validation. Simulation results show that the LightGBM model has good stability, good fitting ability and high classification prediction accuracy.
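The preprocessing steps named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation, which the abstract does not give. The function names (`one_hot`, `z_score`, `stratified_folds`) and the toy data are hypothetical, and stratifying the ten folds by class label is an assumption about how cross-validation might be arranged for imbalanced samples. PCA and the LightGBM training itself are left out.

```python
import math
import random
from collections import defaultdict


def one_hot(values):
    """One-hot encode categorical values; columns follow sorted category order."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return categories, rows


def z_score(values):
    """Scale a continuous feature to zero mean and unit population std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n  # constant feature: nothing to scale
    return [(v - mean) / std for v in values]


def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, dealing each class out round-robin.

    Assumption: stratification is used so that every fold keeps roughly
    the same class ratio as the full (imbalanced) sample.
    """
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


# Toy data, made up purely for illustration.
purpose = ["car", "house", "car", "education"]
income = [3000.0, 5000.0, 7000.0, 9000.0]
labels = [0] * 90 + [1] * 10  # imbalanced: 10% positive class

categories, encoded = one_hot(purpose)
scaled = z_score(income)
folds = stratified_folds(labels, k=10)
```

In each round of ten-fold cross-validation, one fold would serve as the test set and the other nine as the training set. Candidate model parameters can then be compared by their average score across the ten rounds.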