Pattern Mining for Anomaly Detection in Graphs: Application to Fraud in Public Procurement

Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD (Conference). Pub Date: 2023-06-19. DOI: 10.48550/arXiv.2306.10857

Lucas Potin, R. Figueiredo, Vincent Labatut, C. Largeron

Citations: 1

Abstract

In the context of public procurement, several indicators called red flags are used to estimate fraud risk. They are computed from certain contract attributes and therefore depend on the contract and award notices being filled in properly. However, these attributes are very often missing in practice, which prevents the red flags from being computed. Traditional fraud detection approaches focus on tabular data only, considering each contract separately, and are therefore very sensitive to this issue. In this work, we adopt a graph-based method that leverages relations between contracts to compensate for the missing attributes. We propose PANG (Pattern-Based Anomaly Detection in Graphs), a general supervised framework relying on pattern extraction to detect anomalous graphs in a collection of attributed graphs. Notably, it is able to identify induced subgraphs, a type of pattern widely overlooked in the literature. When benchmarked on standard datasets, its predictive performance is on par with state-of-the-art methods, with the additional advantage of being explainable. These experiments also reveal that induced patterns are more discriminative on certain datasets. When PANG is applied to public procurement data, its predictions are superior to those of other methods, and it identifies subgraph patterns that are characteristic of fraud-prone situations, thereby making it possible to better understand fraudulent behavior.
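The distinction between general and induced subgraph patterns mentioned in the abstract can be illustrated with a small sketch. This is not the authors' PANG implementation: it is a brute-force matcher on tiny labeled graphs, and the names `make_graph`, `occurs`, and `featurize` are hypothetical helpers introduced only for illustration. It assumes a pattern occurs as a general subgraph when all of its edges are present in the target, and as an induced subgraph when, in addition, the matched nodes share no extra edges. The resulting binary vectors are one plausible way to hand pattern occurrences to a supervised classifier.

```python
from itertools import permutations


def make_graph(labels, edges):
    """Build an undirected labeled graph as (node -> label dict, set of edges)."""
    return dict(labels), {frozenset(e) for e in edges}


def occurs(pattern, graph, induced):
    """Brute-force test of whether `pattern` occurs in `graph`.

    With induced=False, every pattern edge must exist in the graph.
    With induced=True, the matched nodes must also have no extra edges.
    Illustrative only: exponential cost, suitable for tiny graphs.
    """
    p_labels, p_edges = pattern
    g_labels, g_edges = graph
    p_nodes = list(p_labels)
    for image in permutations(g_labels, len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        # Node labels must agree under the candidate mapping.
        if any(p_labels[u] != g_labels[mapping[u]] for u in p_nodes):
            continue
        ok = True
        for i, u in enumerate(p_nodes):
            for v in p_nodes[i + 1:]:
                in_p = frozenset((u, v)) in p_edges
                in_g = frozenset((mapping[u], mapping[v])) in g_edges
                if in_p and not in_g:
                    ok = False  # a pattern edge is missing in the graph
                if induced and in_g and not in_p:
                    ok = False  # extra edge violates the induced condition
        if ok:
            return True
    return False


def featurize(graphs, patterns, induced):
    """One binary row per graph: 1 if the pattern occurs, else 0."""
    return [[int(occurs(p, g, induced)) for p in patterns] for g in graphs]


# Toy data (hypothetical): every node carries the same label "a".
path = make_graph({"x": "a", "y": "a", "z": "a"}, [("x", "y"), ("y", "z")])
triangle = make_graph(
    {0: "a", 1: "a", 2: "a"}, [(0, 1), (1, 2), (0, 2)]
)

general_features = featurize([triangle, path], [path, triangle], induced=False)
induced_features = featurize([triangle, path], [path, triangle], induced=True)
```

In this toy case the 3-node path occurs in the triangle as a general subgraph but not as an induced one, so the induced feature vectors separate the two graphs more sharply than the general ones.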
