Credit Card Fraud Detection in Financial Transactions Using Data Mining Techniques
R. H. Alwan, Murtadha M. Hamad, O. Dawood
2021 7th International Conference on Contemporary Information Technology and Mathematics (ICCITM), published 2021-08-25
https://doi.org/10.1109/ICCITM53167.2021.9677867
Every year, fraudulent credit card transactions cause losses of billions of dollars. Developing effective fraud detection algorithms is critical to reducing these losses, and detection systems increasingly rely on advanced data mining approaches. Designing such algorithms is difficult because the data distribution is unstable and the classes are highly imbalanced, while the detection system must still classify a very large number of transactions. This paper proposes a system for detecting fraud in financial transactions using four data mining models: logistic regression, random forest, naïve Bayes, and support vector machine. The work proceeds in two basic steps. The first step uses the European cardholder dataset, which contains 284,807 transactions, split into two groups: approximately 199,365 transactions (about 70%) for training the models and the remaining 85,442 or so (about 30%) for testing them. Because this dataset is highly imbalanced, the SMOTE technique is used to transform it into a balanced one. The second step prepares the data, applies a correlation function to the training dataset, and then runs the models on it. The results are compared using evaluation metrics to show which model is best at detecting fraud. From these results, it is concluded that the random forest classifier performs best, achieving an accuracy of 99.15% on the test data.
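The SMOTE balancing step mentioned in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' code: the function name `smote`, the toy two-feature data, and the choice to oversample only the training portion are assumptions made for illustration. In practice a library implementation such as `imblearn.over_sampling.SMOTE` from imbalanced-learn would typically be used. The idea is that each synthetic fraud sample is placed at a random point on the line segment between a real fraud sample and one of its nearest fraud neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the rare class.
    This helper is a hypothetical illustration, not the paper's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy training set (assumed data): 20 legitimate points and 4 fraud points.
rng = random.Random(42)
legit = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
fraud = [[rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)] for _ in range(4)]

# Oversample the fraud class until both classes have the same size.
new_fraud = smote(fraud, n_new=len(legit) - len(fraud), k=3)
X_train = legit + fraud + new_fraud
y_train = [0] * len(legit) + [1] * (len(fraud) + len(new_fraud))
```

Oversampling only the training portion, as assumed here, keeps synthetic points out of the test set so that the reported test metrics reflect real transactions; the abstract does not state which portion the authors balanced.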