{"title":"Having Your Cake and Eating it Too: Training Neural Retrieval for Language Inference without Losing Lexical Match","authors":"Vikas Yadav, Steven Bethard, M. Surdeanu","doi":"10.1145/3397271.3401311","DOIUrl":null,"url":null,"abstract":"We present a study on the importance of information retrieval (IR) techniques for both the interpretability and the performance of neural question answering (QA) methods. We show that the current state-of-the-art transformer methods (like RoBERTa) encode poorly simple information retrieval (IR) concepts such as lexical overlap between query and the document. To mitigate this limitation, we introduce a supervised RoBERTa QA method that is trained to mimic the behavior of BM25 and the soft-matching idea behind embedding-based alignment methods. We show that fusing the simple lexical-matching IR concepts in transformer techniques results in improvement a) of their (lexical-matching) interpretability, b) retrieval performance, and c) the QA performance on two multi-hop QA datasets. We further highlight the lexical-chasm gap bridging capabilities of transformer methods by analyzing the attention distributions of the supervised RoBERTa classifier over the context versus lexically-matched token pairs.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3397271.3401311","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
We present a study on the importance of information retrieval (IR) techniques for both the interpretability and the performance of neural question answering (QA) methods. We show that the current state-of-the-art transformer methods (like RoBERTa) encode poorly simple information retrieval (IR) concepts such as lexical overlap between query and the document. To mitigate this limitation, we introduce a supervised RoBERTa QA method that is trained to mimic the behavior of BM25 and the soft-matching idea behind embedding-based alignment methods. We show that fusing the simple lexical-matching IR concepts in transformer techniques results in improvement a) of their (lexical-matching) interpretability, b) retrieval performance, and c) the QA performance on two multi-hop QA datasets. We further highlight the lexical-chasm gap bridging capabilities of transformer methods by analyzing the attention distributions of the supervised RoBERTa classifier over the context versus lexically-matched token pairs.