Prediction of Bank Performance Using Machine Learning Classifiers Optimized by Genetic Algorithm
Authors: Ummey Hany Ainan, Md. Nur-E-Arefin
Published in: 2022 International Conference on Advancement in Electrical and Electronic Engineering (ICAEEE), 2022-02-24
DOI: https://doi.org/10.1109/icaeee54957.2022.9836523
Bank performance reflects how a bank's assets are utilized to enable it to accomplish its targets. Economic development depends heavily on how well banks function. In the past, statistical approaches were used to predict bank performance; nowadays, Machine Learning (ML) approaches are used in the banking sector for better accuracy. In this work, three well-known ML classifiers, Random Forest (RF), Support Vector Machine (SVM), and Logistic Regression (LR), are used to predict bank performance. The dataset consists of 50 Turkish banks, 30 American banks, and 20 European banks, described by 24 performance indicators measured from 2010 to 2020. The CAMEL technique is applied to this dataset to derive ratings for the banks. The Genetic Algorithm (GA) plays a vital role in this study: it is used both as an optimizer and as a feature selector. Finally, the models are evaluated with and without feature selection, as well as with and without optimization. SVM with optimization but without feature selection achieves the best result among all models, with 97.06% test accuracy. In contrast, LR with feature selection but without optimization achieves 80.21% test accuracy, the lowest in the study.
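The abstract does not describe how the GA is configured, so the following is only a minimal sketch of GA-based feature selection under stated assumptions, not the authors' method. Each candidate solution is a bit mask over the features, and its fitness is the validation accuracy of a classifier trained on the selected features, minus a small penalty per feature. The synthetic data, the nearest-centroid classifier (a dependency-free stand-in for RF, SVM, or LR), the penalty weight, and all function names and GA settings are hypothetical choices made for illustration.

```python
import random

def make_toy_data(n_samples=120, n_features=8, n_informative=3, seed=0):
    """Synthetic two-class data (assumption): only the first
    n_informative features differ between the classes."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        X.append([rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y

def centroid_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a nearest-centroid classifier using only the
    features whose bit is set in mask (stand-in for RF/SVM/LR)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in cols]
    correct = 0
    for x, t in zip(X_test, y_test):
        pred = min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(cols, centroids[c])))
        correct += pred == t
    return correct / len(y_test)

def ga_select_features(fitness, n_features, pop_size=20, generations=15,
                       mutation_rate=0.1, seed=0):
    """Evolve bit masks with truncation selection, one-point
    crossover, bit-flip mutation, and elitism."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]      # keep the fitter half
        children = [ranked[0][:]]              # elitism: copy the best mask
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]          # bit-flip mutation
            children.append(child)
        population = children
        best = max(population + [best], key=fitness)
    return best

X, y = make_toy_data()
X_train, y_train, X_val, y_val = X[:80], y[:80], X[80:], y[80:]

def fitness(mask):
    # Validation accuracy minus a small (assumed) penalty per selected feature.
    return centroid_accuracy(X_train, y_train, X_val, y_val, mask) - 0.01 * sum(mask)

best_mask = ga_select_features(fitness, n_features=8)
print("selected features:", [j for j, bit in enumerate(best_mask) if bit])
print("fitness:", round(fitness(best_mask), 3))
```

The same loop could in principle serve as the "optimizer" mentioned in the abstract by encoding classifier hyperparameters in the chromosome instead of (or alongside) the feature mask; how the study actually encodes them is not stated in the abstract.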