忠实拨号:信息寻求对话的忠实基准

IF 4.2 1区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Transactions of the Association for Computational Linguistics Pub Date : 2022-04-22 DOI:10.1162/tacl_a_00529

Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, E. Ponti, Siva Reddy

{"title":"忠实拨号:信息寻求对话的忠实基准","authors":"Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, E. Ponti, Siva Reddy","doi":"10.1162/tacl_a_00529","DOIUrl":null,"url":null,"abstract":"Abstract The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources. However, dialogue systems often produce unsupported utterances, a phenomenon known as hallucination. To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark. We observe that FaithDial is more faithful than WoW while also maintaining engaging conversations. We show that FaithDial can serve as training signal for: i) a hallucination critic, which discriminates whether an utterance is faithful or not, and boosts the performance by 12.8 F1 score on the BEGIN benchmark compared to existing datasets for dialogue coherence; ii) high-quality dialogue generation. We benchmark a series of state-of-the-art models and propose an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness based on several automated metrics. Further, we find that the benefits of FaithDial generalize to zero-shot transfer on other datasets, such as CMU-Dog and TopicalChat. Finally, human evaluation reveals that responses generated by models trained on FaithDial are perceived as more interpretable, cooperative, and engaging.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1473-1490"},"PeriodicalIF":4.2000,"publicationDate":"2022-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"39","resultStr":"{\"title\":\"FaithDial: A Faithful Benchmark for Information-Seeking Dialogue\",\"authors\":\"Nouha Dziri, Ehsan Kamalloo, Sivan Milton, Osmar Zaiane, Mo Yu, E. Ponti, Siva Reddy\",\"doi\":\"10.1162/tacl_a_00529\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources. However, dialogue systems often produce unsupported utterances, a phenomenon known as hallucination. To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark. We observe that FaithDial is more faithful than WoW while also maintaining engaging conversations. We show that FaithDial can serve as training signal for: i) a hallucination critic, which discriminates whether an utterance is faithful or not, and boosts the performance by 12.8 F1 score on the BEGIN benchmark compared to existing datasets for dialogue coherence; ii) high-quality dialogue generation. We benchmark a series of state-of-the-art models and propose an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness based on several automated metrics. Further, we find that the benefits of FaithDial generalize to zero-shot transfer on other datasets, such as CMU-Dog and TopicalChat. Finally, human evaluation reveals that responses generated by models trained on FaithDial are perceived as more interpretable, cooperative, and engaging.\",\"PeriodicalId\":33559,\"journal\":{\"name\":\"Transactions of the Association for Computational Linguistics\",\"volume\":\"10 1\",\"pages\":\"1473-1490\"},\"PeriodicalIF\":4.2000,\"publicationDate\":\"2022-04-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"39\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Transactions of the Association for Computational Linguistics\",\"FirstCategoryId\":\"98\",\"ListUrlMain\":\"https://doi.org/10.1162/tacl_a_00529\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Transactions of the Association for Computational Linguistics","FirstCategoryId":"98","ListUrlMain":"https://doi.org/10.1162/tacl_a_00529","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 39

摘要

摘要信息寻求对话的目标是用基于知识来源的自然语言话语来回应寻求者的询问。然而，对话系统经常会产生未经支持的话语，这种现象被称为幻觉。为了缓解这种行为，我们采用了一种以数据为中心的解决方案，并通过在维基百科向导（WoW）基准中编辑幻觉反应，创建了FaithDial，这是一个无幻觉对话的新基准。我们观察到FaithDial比魔兽世界更忠实，同时也保持着引人入胜的对话。我们表明，FaithDial可以作为以下方面的训练信号：i）幻觉评论家，它可以区分话语是否忠实，并在对话连贯性方面，与现有数据集相比，在BEGIN基准上的表现提高了12.8F1分；ii）高质量的对话生成。我们对一系列最先进的模型进行了基准测试，并提出了一个辅助对比目标，该目标基于几个自动化指标实现了最高水平的忠实性和抽象性。此外，我们发现FaithDial的好处可以推广到其他数据集上的零样本传输，如CMU-Dog和TopicalChat。最后，人类评估显示，在FaithDial上训练的模型产生的反应被认为更具可解释性、合作性和参与性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

FaithDial: A Faithful Benchmark for Information-Seeking Dialogue

Abstract The goal of information-seeking dialogue is to respond to seeker queries with natural language utterances that are grounded on knowledge sources. However, dialogue systems often produce unsupported utterances, a phenomenon known as hallucination. To mitigate this behavior, we adopt a data-centric solution and create FaithDial, a new benchmark for hallucination-free dialogues, by editing hallucinated responses in the Wizard of Wikipedia (WoW) benchmark. We observe that FaithDial is more faithful than WoW while also maintaining engaging conversations. We show that FaithDial can serve as training signal for: i) a hallucination critic, which discriminates whether an utterance is faithful or not, and boosts the performance by 12.8 F1 score on the BEGIN benchmark compared to existing datasets for dialogue coherence; ii) high-quality dialogue generation. We benchmark a series of state-of-the-art models and propose an auxiliary contrastive objective that achieves the highest level of faithfulness and abstractiveness based on several automated metrics. Further, we find that the benefits of FaithDial generalize to zero-shot transfer on other datasets, such as CMU-Dog and TopicalChat. Finally, human evaluation reveals that responses generated by models trained on FaithDial are perceived as more interpretable, cooperative, and engaging.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

相关文献

二甲双胍通过HDAC6和FoxO3a转录调控肌肉生长抑制素诱导肌肉萎缩

IF 8.9 1区医学Journal of Cachexia, Sarcopenia and MusclePub Date : 2021-11-02 DOI: 10.1002/jcsm.12833

Min Ju Kang, Ji Wook Moon, Jung Ok Lee, Ji Hae Kim, Eun Jeong Jung, Su Jin Kim, Joo Yeon Oh, Sang Woo Wu, Pu Reum Lee, Sun Hwa Park, Hyeon Soo Kim

具有疾病敏感单倍型的非亲属供体脐带血移植后的1型糖尿病

IF 3.2 3区医学Journal of Diabetes InvestigationPub Date : 2022-11-02 DOI: 10.1111/jdi.13939

Kensuke Matsumoto, Taisuke Matsuyama, Ritsu Sumiyoshi, Matsuo Takuji, Tadashi Yamamoto, Ryosuke Shirasaki, Haruko Tashiro

封面:蛋白质组学分析确定IRSp53和fastin是PRV输出和直接细胞-细胞传播的关键

IF 3.4 4区生物学ProteomicsPub Date : 2019-12-02 DOI: 10.1002/pmic.201970201

Fei-Long Yu, Huan Miao, Jinjin Xia, Fan Jia, Huadong Wang, Fuqiang Xu, Lin Guo

来源期刊

Transactions of the Association for Computational Linguistics Multiple-

CiteScore

32.60

自引率

4.60%

发文量

审稿时长

8 weeks

期刊介绍： The highly regarded quarterly journal Computational Linguistics has a companion journal called Transactions of the Association for Computational Linguistics. This open access journal publishes articles in all areas of natural language processing and is an important resource for academic and industry computational linguists, natural language processing experts, artificial intelligence and machine learning investigators, cognitive scientists, speech specialists, as well as linguists and philosophers. The journal disseminates work of vital relevance to these professionals on an annual basis.