Distilling knowledge in causal inference for unbiased visual question answering

Proceedings of the 2nd ACM International Conference on Multimedia in Asia Pub Date : 2021-03-07 DOI:10.1145/3444685.3446256

Yonghua Pan, Zechao Li, Liyan Zhang, Jinhui Tang

{"title":"Distilling knowledge in causal inference for unbiased visual question answering","authors":"Yonghua Pan, Zechao Li, Liyan Zhang, Jinhui Tang","doi":"10.1145/3444685.3446256","DOIUrl":null,"url":null,"abstract":"Current Visual Question Answering (VQA) models mainly explore the statistical correlations between answers and questions, which fail to capture the relationship between the visual information and answers. The performance dramatically decreases when the distribution of handled data is different from the training data. Towards this end, this paper proposes a novel unbiased VQA model by exploring the Casual Inference with Knowledge Distillation (CIKD) to reduce the influence of bias. Specifically, the causal graph is first constructed to explore the counterfactual causality and infer the casual target based on the causal effect, which well reduces the bias from questions and obtain answers without training. Then knowledge distillation is leveraged to transfer the knowledge of the inferred casual target to the conventional VQA model. It makes the proposed method enable to handle both the biased data and standard data. To address the problem of the bad bias from the knowledge distillation, the ensemble learning is introduced based on the hypothetical bias reason. Experiments are conducted to show the performance of the proposed method. The significant improvements over the state-of-the-art methods on the VQA-CP v2 dataset well validate the contributions of this work.","PeriodicalId":119278,"journal":{"name":"Proceedings of the 2nd ACM International Conference on Multimedia in Asia","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2nd ACM International Conference on Multimedia in Asia","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3444685.3446256","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 9

Abstract

Current Visual Question Answering (VQA) models mainly explore the statistical correlations between answers and questions, which fail to capture the relationship between the visual information and answers. The performance dramatically decreases when the distribution of handled data is different from the training data. Towards this end, this paper proposes a novel unbiased VQA model by exploring the Casual Inference with Knowledge Distillation (CIKD) to reduce the influence of bias. Specifically, the causal graph is first constructed to explore the counterfactual causality and infer the casual target based on the causal effect, which well reduces the bias from questions and obtain answers without training. Then knowledge distillation is leveraged to transfer the knowledge of the inferred casual target to the conventional VQA model. It makes the proposed method enable to handle both the biased data and standard data. To address the problem of the bad bias from the knowledge distillation, the ensemble learning is introduced based on the hypothetical bias reason. Experiments are conducted to show the performance of the proposed method. The significant improvements over the state-of-the-art methods on the VQA-CP v2 dataset well validate the contributions of this work.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

为无偏视觉问答提取因果推理中的知识

目前的视觉问答(Visual Question answer, VQA)模型主要探索答案与问题之间的统计相关性，未能捕捉到视觉信息与答案之间的关系。当处理数据的分布与训练数据不同时，性能会显著下降。为此，本文提出了一种新的无偏VQA模型，通过探索带有知识蒸馏的随机推理(CIKD)来减少偏差的影响。具体而言，首先构建因果图，探索反事实因果关系，根据因果效应推断偶然目标，很好地减少了问题的偏差，无需训练即可获得答案。然后利用知识蒸馏将推断出的随机目标的知识转移到传统的VQA模型中。这使得该方法能够同时处理有偏差数据和标准数据。为了解决知识蒸馏中存在的不良偏差问题，在假设偏差原因的基础上引入了集成学习。实验证明了该方法的有效性。在VQA-CP v2数据集上对最先进方法的显著改进很好地验证了本工作的贡献。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Proceedings of the 2nd ACM International Conference on Multimedia in Asia

自引率

0.00%

发文量

期刊最新文献

Storyboard relational model for group activity recognition Objective object segmentation visual quality evaluation based on pixel-level and region-level characteristics Multiplicative angular margin loss for text-based person search Distilling knowledge in causal inference for unbiased visual question answering A large-scale image retrieval system for everyday scenes