Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem

Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Sebastian Möller, Vera Schmitt
{"title":"Cross-Refine: Improving Natural Language Explanation Generation by Learning in Tandem","authors":"Qianli Wang, Tatiana Anikina, Nils Feldhus, Simon Ostermann, Sebastian Möller, Vera Schmitt","doi":"arxiv-2409.07123","DOIUrl":null,"url":null,"abstract":"Natural language explanations (NLEs) are vital for elucidating the reasoning\nbehind large language model (LLM) decisions. Many techniques have been\ndeveloped to generate NLEs using LLMs. However, like humans, LLMs might not\nalways produce optimal NLEs on first attempt. Inspired by human learning\nprocesses, we introduce Cross-Refine, which employs role modeling by deploying\ntwo LLMs as generator and critic, respectively. The generator outputs a first\nNLE and then refines this initial explanation using feedback and suggestions\nprovided by the critic. Cross-Refine does not require any supervised training\ndata or additional training. We validate Cross-Refine across three NLP tasks\nusing three state-of-the-art open-source LLMs through automatic and human\nevaluation. We select Self-Refine (Madaan et al., 2023) as the baseline, which\nonly utilizes self-feedback to refine the explanations. Our findings from\nautomatic evaluation and a user study indicate that Cross-Refine outperforms\nSelf-Refine. Meanwhile, Cross-Refine can perform effectively with less powerful\nLLMs, whereas Self-Refine only yields strong results with ChatGPT.\nAdditionally, we conduct an ablation study to assess the importance of feedback\nand suggestions. Both of them play an important role in refining explanations.\nWe further evaluate Cross-Refine on a bilingual dataset in English and German.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computation and Language","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07123","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Natural language explanations (NLEs) are vital for elucidating the reasoning behind large language model (LLM) decisions. Many techniques have been developed to generate NLEs using LLMs. However, like humans, LLMs might not always produce optimal NLEs on first attempt. Inspired by human learning processes, we introduce Cross-Refine, which employs role modeling by deploying two LLMs as generator and critic, respectively. The generator outputs a first NLE and then refines this initial explanation using feedback and suggestions provided by the critic. Cross-Refine does not require any supervised training data or additional training. We validate Cross-Refine across three NLP tasks using three state-of-the-art open-source LLMs through automatic and human evaluation. We select Self-Refine (Madaan et al., 2023) as the baseline, which only utilizes self-feedback to refine the explanations. Our findings from automatic evaluation and a user study indicate that Cross-Refine outperforms Self-Refine. Meanwhile, Cross-Refine can perform effectively with less powerful LLMs, whereas Self-Refine only yields strong results with ChatGPT. Additionally, we conduct an ablation study to assess the importance of feedback and suggestions. Both of them play an important role in refining explanations. We further evaluate Cross-Refine on a bilingual dataset in English and German.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
交叉定义:通过串联学习改进自然语言解释生成
自然语言解释(NLE)对于阐明大型语言模型(LLM)决策背后的推理至关重要。目前已经开发了许多技术来使用 LLM 生成 NLE。然而,与人类一样,LLM 也不一定能在第一次尝试时生成最佳的 NLE。受人类学习过程的启发,我们引入了 Cross-Refine,它通过部署两个 LLM 分别作为生成器和批判器来进行角色建模。生成器输出第一个 NLE,然后利用批评者提供的反馈和建议完善这个初始解释。Cross-Refine 不需要任何监督训练数据或额外的训练。通过自动和人工评估,我们使用三种最先进的开源 LLM 在三个 NLP 任务中验证了 Cross-Refine。我们选择Self-Refine(Madaan等人,2023年)作为基线,它只利用自我反馈来完善解释。我们的自动评估和用户研究结果表明,Cross-Refine 优于 Self-Refine。同时,Cross-Refine 可以有效地使用功能较弱的LLM,而 Self-Refine 只有在使用 ChatGPT 时才能产生强大的效果。我们还在英语和德语的双语数据集上对 Cross-Refine 进行了进一步评估。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
LLMs + Persona-Plug = Personalized LLMs MEOW: MEMOry Supervised LLM Unlearning Via Inverted Facts Extract-and-Abstract: Unifying Extractive and Abstractive Summarization within Single Encoder-Decoder Framework Development and bilingual evaluation of Japanese medical large language model within reasonably low computational resources Human-like Affective Cognition in Foundation Models
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1