Navin Kasa, Andrew Dahbura, Charishma Ravoori, Stephen Adams
2019 Systems and Information Engineering Design Symposium (SIEDS), published 2019-04-01
DOI: 10.1109/SIEDS.2019.8735623 (https://doi.org/10.1109/SIEDS.2019.8735623)
Citation count: 12 (Semantic Scholar)
Improving Credit Card Fraud Detection by Profiling and Clustering Accounts
Credit card fraud can cost banks billions of dollars annually, which gives financial institutions a strong incentive to develop fast, effective, and dynamic fraud detection systems. This paper addresses credit card fraud detection through a semi-supervised approach in which clusters of account profiles are created and used to build classifiers. Accounts are profiled based on their behavioral trends and clustered into similar groups. The groups are then identified as distinct customer segments based on purchase characteristics such as amount, frequency, or distance. Random forest and XGBoost classifiers are trained on the entire sample and compared against classifiers trained at the transaction level within each cluster. The overall weighted performance of the classifiers trained at the cluster level does not significantly exceed that of the classifiers trained on the full sample. However, clustering does identify meaningful groups of account holders whose fraud rates vary from cluster to cluster. In addition, classifiers trained on some clusters improve significantly on the baseline, whereas classifiers for other clusters fall short of it. The best-performing classifier also differs from cluster to cluster, which highlights the potential for developing new classifiers that may perform well on clusters whose current models underperform.
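The bookkeeping behind the abstract (profiling accounts from their transactions, comparing fraud rates across clusters, and combining per-cluster scores into one weighted figure) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the transactions are invented toy data, the profile features (`mean_amount`, `n_transactions`), the cluster names, and the cluster assignment are hypothetical, and weighting each cluster by its transaction count is only one plausible reading of "overall weighted performance".

```python
from collections import defaultdict
from statistics import mean

# Toy transactions invented for illustration only (not data from the paper).
# Each row: (account_id, amount, is_fraud)
transactions = [
    ("A", 20.0, 0), ("A", 30.0, 0), ("A", 25.0, 1),
    ("B", 500.0, 0), ("B", 700.0, 0),
    ("C", 15.0, 0), ("C", 35.0, 0), ("C", 10.0, 0),
]


def profile_accounts(rows):
    """Aggregate transactions into one behavioral profile per account."""
    amounts_by_account = defaultdict(list)
    for account_id, amount, _ in rows:
        amounts_by_account[account_id].append(amount)
    return {
        account_id: {"mean_amount": mean(amounts), "n_transactions": len(amounts)}
        for account_id, amounts in amounts_by_account.items()
    }


def fraud_rate_by_cluster(rows, cluster_of):
    """Fraction of fraudulent transactions within each cluster."""
    totals = defaultdict(int)
    frauds = defaultdict(int)
    for account_id, _, is_fraud in rows:
        cluster = cluster_of[account_id]
        totals[cluster] += 1
        frauds[cluster] += is_fraud
    return {cluster: frauds[cluster] / totals[cluster] for cluster in totals}


def weighted_score(score_by_cluster, size_by_cluster):
    """Average per-cluster scores, weighting each cluster by its size."""
    total = sum(size_by_cluster.values())
    weighted_sum = sum(
        score_by_cluster[cluster] * size_by_cluster[cluster]
        for cluster in score_by_cluster
    )
    return weighted_sum / total


profiles = profile_accounts(transactions)

# Hypothetical assignment; in practice a clustering algorithm run on the
# account profiles would produce it.
cluster_of = {"A": "low_spend", "C": "low_spend", "B": "high_spend"}

rates = fraud_rate_by_cluster(transactions, cluster_of)
```

Given per-cluster classifier scores from any evaluation metric, `weighted_score(scores, sizes)` produces the single number that would be compared against the score of a classifier trained on the full sample.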