Better Explain Transformers by Illuminating Important Information
Linxin Song, Yan Cui, Ao Luo, Freddy Lecue, Irene Li
DOI: 10.48550/arXiv.2401.09972 · Findings, 21(3), pp. 2048-2062 · Published 2024-01-18
Transformer-based models excel at various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings. Prior methods explain Transformers by using raw gradients and attention as token attribution scores, where non-relevant information is often taken into account during explanation computation, resulting in confusing attributions. In this work, we propose highlighting important information and eliminating irrelevant information via a refined information flow built on top of layer-wise relevance propagation (LRP). Specifically, we identify syntactic and positional heads as important attention heads and focus on the relevance obtained from these heads. Experimental results demonstrate that irrelevant information does distort output attribution scores and should therefore be masked during explanation computation. Compared to eight baselines on both classification and question-answering datasets, our method consistently outperforms them, improving explanation metrics by 3% to 33%. Our anonymous code repository is available at: https://anonymous.4open.science/r/MLRP-E676/
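To make the core idea concrete, the sketch below illustrates (in a hypothetical, simplified form, not the paper's actual implementation) how one might aggregate token-to-token relevance while masking out non-important attention heads. The per-head relevance tensor, the boolean `important_heads` mask, and the rollout-style aggregation are all assumptions for illustration; the paper's refined information flow, head-selection criteria, and LRP rules differ in detail.

```python
# Conceptual sketch: keep only "important" attention heads when aggregating
# per-head relevance maps across layers. Assumes a hypothetical input of shape
# (num_layers, num_heads, seq_len, seq_len), e.g. produced by an LRP backward
# pass, and a per-layer boolean mask marking syntactic/positional heads.
import numpy as np

def aggregate_important_relevance(head_relevance: np.ndarray,
                                  important_heads: np.ndarray) -> np.ndarray:
    """Roll out relevance across layers, keeping only important heads.

    head_relevance:  (L, H, T, T) per-head relevance maps.
    important_heads: (L, H) boolean mask of heads to keep.
    Returns a (T, T) token-to-token attribution map.
    """
    num_layers, num_heads, seq_len, _ = head_relevance.shape
    # Start from the identity: each token initially attributes to itself.
    rollout = np.eye(seq_len)
    for layer in range(num_layers):
        mask = important_heads[layer].astype(float)  # (H,)
        kept = mask.sum()
        if kept == 0:
            continue  # no important heads in this layer; skip it
        # Zero out irrelevant heads, then average the remaining relevance maps.
        layer_rel = (head_relevance[layer] * mask[:, None, None]).sum(0) / kept
        # Add a residual (identity) term and renormalize rows, as is common
        # in rollout-style aggregation across Transformer layers.
        layer_rel = layer_rel + np.eye(seq_len)
        layer_rel = layer_rel / layer_rel.sum(axis=-1, keepdims=True)
        rollout = layer_rel @ rollout
    return rollout

# Toy usage: random relevance for a 2-layer, 4-head model over 6 tokens,
# keeping the first two heads in each layer.
rel = np.abs(np.random.rand(2, 4, 6, 6))
keep = np.zeros((2, 4), dtype=bool)
keep[:, :2] = True
attribution = aggregate_important_relevance(rel, keep)
print(attribution.shape)  # (6, 6); row i gives token i's attribution over tokens
```

The sketch only conveys the masking intuition: relevance from heads deemed unimportant never enters the aggregation, so it cannot distort the final token attribution scores.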