Generating and Evaluating Explanations of Attended and Error-Inducing Input Regions for VQA Models

Arijit Ray, Michael Cogswell, Xiaoyu Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas

Applied AI Letters, published 2021-03-26. DOI: 10.22541/au.162464902.28050142/v1
Citations: 1
Abstract
Attention maps, a popular heatmap-based explanation method for Visual Question Answering (VQA), are intended to help users understand a model by highlighting the portions of the image and question the model uses to infer its answer. However, we find that current attention-map visualizations often mislead users: they point to relevant regions even when the model produces an incorrect answer. We therefore propose Error Maps, which clarify such failures by highlighting the image regions where the model is prone to err. Error Maps can indicate when a correctly attended region is nonetheless processed incorrectly and leads to a wrong answer, improving users' understanding of those cases. To evaluate these new explanations, we further introduce a proxy metric that simulates how users interpret explanations, estimating how helpful an explanation is for judging model correctness. Finally, our user studies show that the new explanations help users judge model correctness better than baseline explanations by an expected 30%, and that our proxy helpfulness metrics correlate strongly (ρ > 0.97) with how well users can predict model correctness.
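
The abstract does not describe how Error Maps or the proxy helpfulness metric are computed. As a rough illustration only, the sketch below (all maps, the simulated user, and the data are hypothetical stand-ins, not the authors' implementation) shows one way a simulated user's confidence could be derived from the overlap between an attention map and an error map, and how that confidence could be rank-correlated with model correctness as a proxy for an explanation's helpfulness.

```python
# Minimal sketch, NOT the authors' implementation: all maps and the simulated
# user below are synthetic stand-ins used only to illustrate the shape of a
# proxy helpfulness computation.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)


def normalized_map(shape=(7, 7)):
    """Random non-negative heatmap that sums to 1 (stand-in for a real map)."""
    m = rng.random(shape)
    return m / m.sum()


def simulated_user_confidence(attention_map, error_map):
    """Toy simulated user: trusts the answer less when the attended regions
    overlap regions flagged as error-prone by the (hypothetical) error map."""
    overlap = float((attention_map * error_map).sum())
    return 1.0 - overlap


# Synthetic "dataset": 100 VQA instances with 7x7 attention and error maps,
# plus a synthetic correctness label for each model answer.
n = 100
attention_maps = [normalized_map() for _ in range(n)]
error_maps = [normalized_map() for _ in range(n)]
model_correct = (rng.random(n) < 0.6).astype(float)

# Proxy helpfulness: rank-correlate the simulated user's confidence with the
# model's actual correctness; a helpful explanation should separate the two.
confidence = [simulated_user_confidence(a, e)
              for a, e in zip(attention_maps, error_maps)]
rho, _ = spearmanr(confidence, model_correct)
print(f"proxy helpfulness (rank correlation): {rho:.3f}")
```

On these random stand-ins the correlation is near zero by construction; the sketch only illustrates the structure of such a metric, not the reported ρ > 0.97.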