Medical insurance plays a vital role in modern society, yet organized healthcare fraud causes billions of dollars in annual losses, severely harming the sustainability of the social welfare system. Existing works mostly focus on detecting individual fraudulent entities or claims and ignore hidden conspiracy patterns. Hence, they face severe challenges in tackling organized fraud. In this paper, we propose RDPGL, a novel Risk Diffusion-based Parallel Graph Learning approach, to fight medical insurance criminal gangs. In particular, we first leverage a heterogeneous graph attention network to encode the local context from the beneficiary-provider graph. Then, we devise a community-aware risk diffusion model to infer the global context of organized fraud behaviors from the claim-claim relation graph. The local and global representations are concatenated in parallel and trained simultaneously in an end-to-end manner. Our approach is extensively evaluated on a real-world medical insurance dataset. The experimental results demonstrate the superiority of our proposed approach, which detects more organized fraud claims with relatively high precision than state-of-the-art baselines.
{"title":"Fighting against Organized Fraudsters Using Risk Diffusion-based Parallel Graph Neural Network","authors":"Jiacheng Ma, Fan Li, Rui Zhang, Zhikang Xu, Dawei Cheng, Ouyang Yi, Ruihui Zhao, Jianguang Zheng, Yefeng Zheng, Changjun Jiang","doi":"10.24963/ijcai.2023/681","DOIUrl":"https://doi.org/10.24963/ijcai.2023/681","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"21 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"114569275"}
Over recent decades, sequential decision-making tasks have mostly been tackled with expert systems and reinforcement learning. However, these methods still do not generalize well enough to solve new tasks at a low cost. In this article, we discuss a novel paradigm, named large decision models, that leverages Transformer-based sequence models to tackle decision-making tasks. Starting from offline reinforcement learning scenarios, early attempts demonstrate that sequence modeling methods can be applied to train an effective policy given sufficient expert trajectories. When the sequence model is scaled up, generalization across a variety of tasks and fast adaptation to new tasks have been observed, which has the potential to enable agents to achieve artificial general intelligence for sequential decision-making in the near future.
{"title":"Large Decision Models","authors":"Weinan Zhang","doi":"10.24963/ijcai.2023/808","DOIUrl":"https://doi.org/10.24963/ijcai.2023/808","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"56 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"114847326"}
Local clustering aims at extracting a local structure inside a graph without the necessity of knowing the entire graph structure. As the local structure is usually small compared to the entire graph, one can think of it as a compressive sensing problem in which the indices of the target cluster can be thought of as a sparse solution to a linear system. In this paper, we apply this idea based on two pioneering works under the same framework and propose a new semi-supervised local clustering approach using only a few labeled nodes. Our approach improves on the existing works by taking the entire graph as the initial cut and hence overcomes a major limitation of those works, namely the low quality of the initial cut. Extensive experimental results on various datasets demonstrate the effectiveness of our approach.
{"title":"Graph-based Semi-supervised Local Clustering with Few Labeled Nodes","authors":"Zhaiming Shen, M. Lai, Sheng Li","doi":"10.24963/ijcai.2023/466","DOIUrl":"https://doi.org/10.24963/ijcai.2023/466","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"41 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"117351765"}
The anonymity of online networks makes tackling fraud increasingly costly. Thanks to the superiority of graph representation learning, graph-based fraud detection has made significant progress in recent years. However, evolving fraud strategies produce more advanced scams that are harder to detect. One common strategy is synergistic camouflage, which combines multiple means to deceive others. Existing methods mostly investigate the differences between relations for individual frauds, neglecting the correlation among multi-relation fraudulent behaviors. In this paper, we design several statistics to validate the existence of synergistic camouflage by fraudsters by exploring the correlation among multi-relation interactions. From the multi-relation perspective, we find two distinctive features of fraudulent behaviors, i.e., alienation and marginalization. Based on this finding, we propose COFRAUD, a correlation-aware fraud detection model, which innovatively incorporates synergistic camouflage into fraud detection. It captures the correlation among multi-relation fraudulent behaviors. Experimental results on two public datasets demonstrate that COFRAUD achieves significant improvements over state-of-the-art methods.
{"title":"Don't Ignore Alienation and Marginalization: Correlating Fraud Detection","authors":"Yilong Zang, Ruimin Hu, Zheng Wang, Danni Xu, Jia Wu, Dengshi Li, Junhang Wu, Lingfei Ren","doi":"10.24963/ijcai.2023/551","DOIUrl":"https://doi.org/10.24963/ijcai.2023/551","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"15 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"116304923"}
The graph width-measure twin-width has recently attracted great attention because of its solving power and generality. Many prominent NP-hard problems are tractable on graphs of bounded twin-width if a certificate for the twin-width bound is provided as an input. Bounded twin-width subsumes other prominent structural restrictions such as bounded treewidth and bounded rank-width. Computing such a certificate is itself NP-hard, already for twin-width 4, and the only known implemented algorithm for twin-width computation is based on a SAT encoding. In this paper, we propose two new algorithmic approaches for computing twin-width that significantly improve the state of the art. Firstly, we develop a SAT encoding that is far more compact than the known encoding and consequently scales to larger graphs. Secondly, we propose a new Branch & Bound algorithm for twin-width that, on many graphs, is significantly faster than the SAT encoding. It utilizes a sophisticated caching system for partial solutions. Both algorithmic approaches are based on new conceptual insights into twin-width computation, including the reordering of contractions.
{"title":"Computing Twin-width with SAT and Branch & Bound","authors":"André Schidler, Stefan Szeider","doi":"10.24963/ijcai.2023/224","DOIUrl":"https://doi.org/10.24963/ijcai.2023/224","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"123 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"123585585"}
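To make the notion of a twin-width certificate concrete, here is a minimal sketch of how the width of a contraction sequence can be checked, using the standard definition of twin-width via red edges in trigraphs. It is not the SAT encoding or the Branch & Bound algorithm of the paper: the function names and the adjacency-dictionary input format are assumptions made for illustration, and the exhaustive search is only feasible for very small graphs.

```python
from itertools import combinations


def contraction_sequence_width(adj, sequence):
    """Return the maximum red degree seen while applying `sequence`.

    adj: dict mapping each vertex to the set of its neighbours (undirected).
    sequence: list of pairs (u, v); each step merges v into u.
    """
    black = {x: set(ns) for x, ns in adj.items()}
    red = {x: set() for x in adj}
    width = 0
    for u, v in sequence:
        # The edge between the two merged vertices disappears.
        black[u].discard(v)
        black[v].discard(u)
        red[u].discard(v)
        red[v].discard(u)
        for w in list(black):
            if w in (u, v):
                continue
            both_black = w in black[u] and w in black[v]
            touched = (w in black[u] or w in black[v]
                       or w in red[u] or w in red[v])
            # w loses any edge to the vertex that is being removed.
            black[w].discard(v)
            red[w].discard(v)
            # Unless w is a black neighbour of both, any adjacency turns red.
            if touched and not both_black:
                black[u].discard(w)
                black[w].discard(u)
                red[u].add(w)
                red[w].add(u)
        del black[v]
        del red[v]
        width = max(width, max(len(r) for r in red.values()))
    return width


def twin_width_bruteforce(adj):
    """Exhaustively try every contraction sequence (tiny graphs only)."""
    def search(vertices, prefix):
        if len(vertices) == 1:
            return contraction_sequence_width(adj, prefix)
        return min(
            search([x for x in vertices if x != v], prefix + [(u, v)])
            for u, v in combinations(vertices, 2)
        )
    return search(sorted(adj), [])


# Path a - b - c - d, a small graph that is not a cograph.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(contraction_sequence_width(path, [("a", "b"), ("a", "c"), ("a", "d")]))
print(contraction_sequence_width(path, [("a", "d"), ("a", "b"), ("a", "c")]))
print(twin_width_bruteforce(path))
```

On this path, contracting from one end keeps the red degree at 1, whereas merging the two endpoints first creates a vertex of red degree 2, which illustrates why the order of contractions matters.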
Yan Zhou, Bo Dong, Yuanfeng Wu, Wentao Zhu, Geng Chen, Yanning Zhang
Dichotomous image segmentation (DIS) has a wide range of real-world applications and has gained increasing research attention in recent years. In this paper, we propose to tackle DIS with informative frequency priors. Our model, called FP-DIS, stems from the fact that prior knowledge in the frequency domain can provide valuable cues to identify fine-grained object boundaries. Specifically, we propose a frequency prior generator that jointly utilizes a fixed filter and learnable filters to extract informative frequency priors. Before embedding the frequency priors into the network, we first harmonize the multi-scale side-out features to reduce their heterogeneity. This is achieved by our feature harmonization module, which is based on a gating mechanism to harmonize the grouped features. Finally, we propose a frequency prior embedding module to embed the frequency priors into multi-scale features through an adaptive modulation strategy. Extensive experiments on the benchmark dataset, DIS5K, demonstrate that our FP-DIS outperforms state-of-the-art methods by a large margin in terms of key evaluation metrics.
{"title":"Dichotomous Image Segmentation with Frequency Priors","authors":"Yan Zhou, Bo Dong, Yuanfeng Wu, Wentao Zhu, Geng Chen, Yanning Zhang","doi":"10.24963/ijcai.2023/202","DOIUrl":"https://doi.org/10.24963/ijcai.2023/202","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"33 1-2 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"123606200"}
Accurate credit rating of Interbank assets is essential for a healthy financial environment and substantial economic development. However, individual participants tend to provide manipulated information in order to attack the rating model and obtain a higher score, which may cause serious adverse effects on the economic system, such as the 2008 global financial crisis. To this end, we propose a novel selective-aware graph neural network model (SA-GNN) for defending against Interbank credit rating attacks. In particular, we first simulate the process of manipulating rating information with structural and feature poisoning attacks. Then we build a selective-aware defense graph neural model to adaptively prioritize the poisoned training data with Bernoulli distribution similarities. Finally, we optimize the model with weighted penalization on the objective function so that the model can distinguish the attackers. Extensive experiments on our collected real-world Interbank dataset, with over 20 thousand banks and their relations, demonstrate the superior performance of our proposed method in preventing credit rating attacks compared with the state-of-the-art baselines.
{"title":"Preventing Attacks in Interbank Credit Rating with Selective-aware Graph Neural Network","authors":"Junyi Liu, Dawei Cheng, Changjun Jiang","doi":"10.24963/ijcai.2023/675","DOIUrl":"https://doi.org/10.24963/ijcai.2023/675","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"143 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"122054368"}
Most systems integrating data-driven machine learning with knowledge-driven reasoning rely on a specifically designed knowledge base to enable efficient symbolic inference. However, it can be cumbersome for non-expert end users to prepare such a knowledge base in real tasks. Recent years have witnessed the success of large-scale knowledge graphs, which could be ideal domain knowledge resources for real-world machine learning tasks. However, these large-scale knowledge graphs usually contain much information that is irrelevant to a specific learning task. Moreover, they often contain a certain degree of noise. Existing methods can hardly make use of them because large-scale probabilistic logical inference is usually intractable. To address these problems, we present ABductive Learning with Knowledge Graph (ABL-KG), which can automatically mine logic rules from knowledge graphs during learning, using a knowledge forgetting mechanism to filter out irrelevant information. Meanwhile, these rules can form a logic program that enables efficient joint optimization of the machine learning model and logic inference within the Abductive Learning (ABL) framework. Experiments on four different tasks show that ABL-KG can automatically extract useful rules from large-scale and noisy knowledge graphs, and significantly improve the performance of machine learning with only a handful of labeled data.
{"title":"Enabling Abductive Learning to Exploit Knowledge Graph","authors":"Yu-Xuan Huang, Zequn Sun, Guang-pu Li, Xiaobin Tian, Wang-Zhou Dai, Wei Hu, Yuan Jiang, Zhi-Hua Zhou","doi":"10.24963/ijcai.2023/427","DOIUrl":"https://doi.org/10.24963/ijcai.2023/427","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"97 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"116992551"}
Markus Brill, E. Markakis, Georgios Papasotiropoulos, Jannik Peters
We consider a multi-issue election setting over a set of possibly interdependent issues with the goal of achieving proportional representation of the views of the electorate. To this end, we employ a proportionality criterion recently suggested in the literature that guarantees fair representation for all groups of voters of sufficient size. For this criterion, there exist rules that perform well in the case where all the issues have a binary domain and are independent of each other. In particular, this has been shown for Proportional Approval Voting (PAV) and for the Method of Equal Shares (MES). In this paper, we go two steps further: we generalize these guarantees to issues with a non-binary domain, and, most importantly, we consider extensions to elections with dependencies among issues, where we identify restrictions that lead to analogous results. To achieve this, we define appropriate generalizations of PAV and MES to handle conditional ballots. In addition to proportionality considerations, we also examine the computational properties of the conditional version of MES. Our findings indicate that the conditional case poses additional challenges and differs significantly from the unconditional one, both in terms of proportionality guarantees and computational complexity.
{"title":"Proportionality Guarantees in Elections with Interdependent Issues","authors":"Markus Brill, E. Markakis, Georgios Papasotiropoulos, Jannik Peters","doi":"10.24963/ijcai.2023/282","DOIUrl":"https://doi.org/10.24963/ijcai.2023/282","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"37 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"117139788"}
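As a point of reference for the unconditional setting with binary, independent issues, the following is a minimal sketch of PAV scoring. It assumes that each voter states one preferred value per issue and that a voter's satisfaction is the number of issues decided in their favour; this ballot format, the function names, and the brute-force search over outcomes are illustrative assumptions and do not reproduce the conditional-ballot generalizations defined in the paper.

```python
from fractions import Fraction
from itertools import product


def harmonic(k):
    """Return 1 + 1/2 + ... + 1/k as an exact fraction (0 for k = 0)."""
    return sum(Fraction(1, j) for j in range(1, k + 1))


def pav_score(outcome, ballots):
    """Sum of harmonic numbers of each voter's number of agreeing issues."""
    total = Fraction(0)
    for ballot in ballots:
        agreements = sum(1 for o, b in zip(outcome, ballot) if o == b)
        total += harmonic(agreements)
    return total


def pav_outcome(ballots, num_issues):
    """Return an outcome maximizing the PAV score (exponential search)."""
    return max(product((0, 1), repeat=num_issues),
               key=lambda outcome: pav_score(outcome, ballots))


# Two voters want 1 on every issue, one voter wants 0 on every issue.
ballots = [(1, 1, 1), (1, 1, 1), (0, 0, 0)]
best = pav_outcome(ballots, 3)
print(best, pav_score(best, ballots))
```

In this example the outcome that sides with the majority on all three issues scores 11/3, while an outcome that gives the single dissenting voter one of the three issues scores 4, so the harmonic weighting awards the minority a share roughly proportional to its size.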
Thomas Eiter, Tobias Geibinger, N. Higuera, J. Oetsch
Visual Question Answering (VQA) is a well-known problem for which deep learning is key. This poses a challenge for explaining answers to questions, all the more so if advanced notions like contrastive explanations (CEs) are to be provided. The latter explain why an answer has been reached in contrast to a different one and are attractive as they focus on the reasons necessary to flip a query answer. We present a CE framework for VQA that uses a neurosymbolic VQA architecture which disentangles perception from reasoning. Once the reasoning part is provided as a logical theory, we use answer-set programming, in which CE generation can be framed as an abduction problem. We validate our approach on the CLEVR dataset, which we extend with more sophisticated questions to further demonstrate the robustness of the modular architecture. While we achieve top performance compared to related approaches, we can also produce CEs for explanation, model debugging, and validation tasks, showing the versatility of the declarative approach to reasoning.
{"title":"A Logic-based Approach to Contrastive Explainability for Neurosymbolic Visual Question Answering","authors":"Thomas Eiter, Tobias Geibinger, N. Higuera, J. Oetsch","doi":"10.24963/ijcai.2023/408","DOIUrl":"https://doi.org/10.24963/ijcai.2023/408","PeriodicalId":394530,"journal":{"name":"International Joint Conference on Artificial Intelligence","volume":"51 1","pages":"0"},"publicationDate":"2023-08-01","publicationTypes":"Journal Article","platform":"Semanticscholar","paperid":"125942209"}