Author: Hanbin Zou
Published in: Proceedings of the 2021 6th International Conference on Intelligent Information Technology
Publication date: 2021-02-25
DOI: https://doi.org/10.1145/3460179.3460186
Analysis of Best Sampling Strategy in Credit Card Fraud Detection Using Machine Learning
With the growing use of credit cards, credit card fraud has become a major issue in the finance business. Fraudulent credit card transactions cause billions of dollars in losses every year. The best strategy for estimating these losses and detecting fraud remains an open question, because public data are scarce owing to confidentiality concerns and companies routinely do not disclose the amount they lose to fraud. Another problem in credit card fraud detection is that fraud patterns change rapidly, which requires fraud detection to move from a reactive to a proactive approach. At the same time, there is intense and widespread interest in applying machine learning to detection and analysis. In this regard, implementing efficient fraud detection algorithms with machine-learning techniques is key to reducing these losses and to assisting fraud investigators. This article aims to provide some answers by focusing on two crucial issues in credit card fraud detection: 1) how to deal with class imbalance in the dataset by sampling the data with SMOTE, Adaptive Synthetic Sampling (ADASYN), and Borderline-SMOTE; 2) which machine learning method, among random forest, gradient boosting, logistic regression, and XGBoost, achieves higher accuracy in the prediction model when applied to a current public credit card dataset.
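To make the sampling step concrete, the following is a minimal sketch of the core idea behind SMOTE: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. It is not the paper's implementation. The function name `smote`, its parameters, and the toy data are assumptions made for illustration, and only the Python standard library is used. ADASYN and Borderline-SMOTE build on the same interpolation but differ in which minority samples they favour when generating new points.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of base (excluding base), by Euclidean distance
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy data (made up): 4 fraud rows (minority) against 12 legitimate rows (majority).
fraud = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
legit = [(float(x), float(y)) for x in range(4, 8) for y in range(4, 7)]

# Oversample the minority class until both classes have the same size.
new_fraud = smote(fraud, n_synthetic=len(legit) - len(fraud), k=3)
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), len(legit))
```

In practice, oversampling of this kind is normally applied to the training split only, so that the test set keeps the original class distribution.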