Counterfactual Causal-Effect Intervention for Interpretable Medical Visual Question Answering.

IEEE Transactions on Medical Imaging. Pub Date: 2024-07-09. DOI: 10.1109/TMI.2024.3425533

Linqin Cai, Haodu Fang, Nuoying Xu, Bo Ren


Abstract

Medical Visual Question Answering (VQA-Med) is a challenging task that involves answering clinical questions about medical images. However, most current VQA-Med methods ignore the causal relationship between specific lesion or abnormality features and answers, and they fail to provide accurate explanations for their decisions. To explore the interpretability of VQA-Med, this paper proposes CCIS-MVQA, a novel VQA-Med model based on a counterfactual causal-effect intervention strategy. The model consists of a modified ResNet for image feature extraction, a GloVe decoder for question feature extraction, a bilinear attention network for vision and language feature fusion, and an interpretability generator that produces the explanations and prediction results. CCIS-MVQA introduces a layer-wise relevance propagation method to generate counterfactual samples automatically. It also applies counterfactual causal reasoning throughout the training phase to enhance interpretability and generalization. Extensive experiments on three benchmark datasets show that CCIS-MVQA outperforms state-of-the-art methods. Visualization results are provided to analyze the interpretability and performance of CCIS-MVQA.
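
The abstract says counterfactual samples are generated from layer-wise relevance propagation but does not describe the procedure. The sketch below illustrates one plausible reading only: assuming relevance scores per image region have already been computed, the k most relevant regions are masked out to form a counterfactual input. The function names, the top-k rule, the zero mask value, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def top_k_indices(relevance: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k highest scores, in ascending index order."""
    # Sort by descending score; ties go to the lower index.
    order = sorted(range(len(relevance)), key=lambda i: (-relevance[i], i))
    return sorted(order[:k])


def make_counterfactual(
    region_features: Sequence[Sequence[float]],
    relevance: Sequence[float],
    k: int,
    mask_value: float = 0.0,
) -> List[List[float]]:
    """Hypothetical helper: mask the k most relevant regions of an image.

    The relevance scores are assumed to come from a relevance propagation
    pass that is not implemented here.
    """
    if len(region_features) != len(relevance):
        raise ValueError("one relevance score per region is required")
    if not 0 <= k <= len(relevance):
        raise ValueError("k must be between 0 and the number of regions")
    masked = set(top_k_indices(relevance, k))
    return [
        [mask_value] * len(feat) if i in masked else list(feat)
        for i, feat in enumerate(region_features)
    ]


# Toy example: four regions with 2-d features and made-up relevance scores.
features = [[0.2, 0.1], [0.9, 0.8], [0.4, 0.3], [0.7, 0.6]]
relevance = [0.05, 0.60, 0.10, 0.25]
counterfactual = make_counterfactual(features, relevance, k=2)
print(counterfactual)
```

In a training loop, a model's answer on the original features could then be compared with its answer on the masked features. A large change would suggest that the prediction depends on the masked regions. Whether CCIS-MVQA uses such a comparison is not stated in the abstract.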



