UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA

AACL Bioflux (Q3, Environmental Science) | Pub Date: 2022-08-19 | DOI: 10.48550/arXiv.2208.09316

Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, H. Saad, Leonardo F. R. Ribeiro, Iryna Gurevych


Citations: 2

Abstract

Question Answering (QA) systems are increasingly deployed in applications where they support real-world decisions. However, state-of-the-art models rely on deep neural networks, which are difficult for humans to interpret. Inherently interpretable models or post hoc explainability methods can help users comprehend how a model arrives at its prediction and, if successful, increase their trust in the system. Furthermore, researchers can leverage these insights to develop new methods that are more accurate and less biased. In this paper, we introduce SQuARE v2, the new version of SQuARE, which provides an explainability infrastructure for comparing models using methods such as saliency maps and graph-based explanations. While saliency maps are useful for inspecting the importance of each input token for the model’s prediction, graph-based explanations from external Knowledge Graphs enable users to verify the reasoning behind the model's prediction. In addition, we provide multiple adversarial attacks to compare the robustness of QA models. With these explainability methods and adversarial attacks, we aim to facilitate research on trustworthy QA models. SQuARE is available at https://square.ukp-lab.de.
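To make the idea of a token-level saliency map concrete, here is a minimal sketch. The abstract does not say how SQuARE v2 computes its saliency maps, so this example swaps in leave-one-out occlusion, one simple model-agnostic way to score token importance. The scorer `toy_score`, its weights `TOY_WEIGHTS`, and the function `occlusion_saliency` are hypothetical names invented for illustration and are not part of SQuARE.

```python
import math

# Hypothetical toy model: hand-set token weights stand in for a trained QA scorer.
# These values are assumptions made for illustration only.
TOY_WEIGHTS = {"capital": 1.5, "france": 2.0, "of": 0.1}
MASK = "[MASK]"


def toy_score(tokens):
    """Return a confidence in (0, 1): sigmoid of the summed token weights minus a bias."""
    z = sum(TOY_WEIGHTS.get(t.lower(), 0.0) for t in tokens) - 3.0
    return 1.0 / (1.0 + math.exp(-z))


def occlusion_saliency(tokens, score_fn=toy_score):
    """Leave-one-out saliency: the drop in score when each token is masked in turn."""
    base = score_fn(tokens)
    saliencies = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [MASK] + tokens[i + 1:]
        saliencies.append(base - score_fn(occluded))
    return saliencies


if __name__ == "__main__":
    question = "What is the capital of France".split()
    scores = occlusion_saliency(question)
    # Print tokens from most to least important for the toy prediction.
    for token, value in sorted(zip(question, scores), key=lambda p: -p[1]):
        print(f"{token:>8s}  {value:+.3f}")
```

Tokens whose removal lowers the score the most are ranked highest, which is the kind of per-token importance a saliency map visualizes. With a real QA model, `score_fn` would presumably return the model's confidence in its predicted answer.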

