FEDS-ICL: Enhancing translation ability and efficiency of large language model by optimizing demonstration selection

IF 7.4 · CAS Tier 1 (Management) · JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS · Information Processing & Management · Pub Date: 2024-07-03 · DOI: 10.1016/j.ipm.2024.103825
Shaolin Zhu, Leiyu Pan, Deyi Xiong
{"title":"FEDS-ICL: Enhancing translation ability and efficiency of large language model by optimizing demonstration selection","authors":"Shaolin Zhu,&nbsp;Leiyu Pan,&nbsp;Deyi Xiong","doi":"10.1016/j.ipm.2024.103825","DOIUrl":null,"url":null,"abstract":"<div><p>Large language models (LLMs) that exhibit a remarkable ability by in-context learning (ICL) with bilingual demonstrations have been recognized as a potential solution for machine translation. However, the process of selecting these demonstrations from vast datastores is notoriously time-consuming and inefficient. Moreover, the strategies for designing effective in-context demonstrations are not well-established. To address these critical gaps, we introduce a novel Fast and Effective approach for Demonstration Selection in-Context learning (FEDS-ICL) tailored to LLMs. Our method is designed to mainly enhance the efficiency and accuracy of translation of LLMs. Our approach revolutionizes demonstration selection by designing new product quantization technique that rapidly extracts neighboring target tokens from a strategically curated subset of sentences. This method significantly deviates from the conventional exhaustive search across entire datastores, leading to a remarkable increase in speed. Furthermore, FEDS-ICL pioneers an innovative template design for in-context demonstrations, specifically crafted to amplify the translation capabilities of multilingual LLMs. In experiments, we compare our FEDS-ICL with various existing methods on across diverse language pairs on ten different LLMs. The results reveal an up to 2.1-fold increase in selection speed and an impressive enhancement in translation accuracy, outperforming existing baselines by up to 2.0 BLEU points at least on ten different LLMs. The ablation study show the proposed product quantization and multi-view demonstration can effectively enhance the efficiency and accuracy of LLMs in machine translation. The analysis on robustness of FEDS-ICL shows that the incorporation of a greater number of demonstrations can lead a positive correlation between the quantity of contextually rich demonstrations and the translation quality of LLMs. These advancements position FEDS-ICL as a transformative methodology in the domain of machine translation and pattern analysis, marking a significant leap towards more efficient and precise machine translation.</p></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4000,"publicationDate":"2024-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457324001845","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Large language models (LLMs) that exhibit remarkable in-context learning (ICL) ability with bilingual demonstrations have been recognized as a potential solution for machine translation. However, selecting these demonstrations from vast datastores is notoriously time-consuming and inefficient. Moreover, strategies for designing effective in-context demonstrations are not well established. To address these critical gaps, we introduce a novel Fast and Effective approach for Demonstration Selection in in-Context Learning (FEDS-ICL) tailored to LLMs, designed primarily to enhance the efficiency and accuracy of LLM translation. Our approach revolutionizes demonstration selection by designing a new product quantization technique that rapidly extracts neighboring target tokens from a strategically curated subset of sentences. This departs significantly from the conventional exhaustive search across entire datastores, leading to a remarkable increase in speed. Furthermore, FEDS-ICL pioneers an innovative template design for in-context demonstrations, specifically crafted to amplify the translation capabilities of multilingual LLMs. In experiments, we compare FEDS-ICL with various existing methods across diverse language pairs on ten different LLMs. The results reveal up to a 2.1-fold increase in selection speed and an impressive enhancement in translation accuracy, outperforming existing baselines by up to 2.0 BLEU points across the ten LLMs. The ablation study shows that the proposed product quantization and multi-view demonstrations effectively enhance the efficiency and accuracy of LLMs in machine translation. The robustness analysis of FEDS-ICL shows a positive correlation between the number of contextually rich demonstrations and the translation quality of LLMs. These advancements position FEDS-ICL as a transformative methodology in the domain of machine translation and pattern analysis, marking a significant leap towards more efficient and precise machine translation.
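To make the two core ideas concrete, below is a minimal, hypothetical Python sketch of how product-quantization-based demonstration retrieval and prompt assembly could look, using the FAISS library. This is not the authors' implementation: the function names, the choice of FAISS's IndexPQ, the sentence-level retrieval (the paper retrieves neighboring target tokens), and the simple English-to-German template are all illustrative assumptions.

```python
# A minimal, hypothetical sketch of PQ-based demonstration selection.
# Not the authors' code: function names, the FAISS IndexPQ choice, and the
# prompt template are illustrative assumptions, not the paper's method.
import numpy as np
import faiss  # pip install faiss-cpu


def build_pq_index(source_embeddings: np.ndarray,
                   n_subquantizers: int = 8,
                   bits_per_code: int = 8) -> faiss.IndexPQ:
    """Train a product-quantization index over embeddings of the curated
    subset of source sentences (FAISS expects float32 input; the subset
    should be large enough, e.g. >= 256 vectors, to train the codebooks)."""
    xb = np.ascontiguousarray(source_embeddings, dtype=np.float32)
    index = faiss.IndexPQ(xb.shape[1], n_subquantizers, bits_per_code)
    index.train(xb)  # learn the per-subspace codebooks
    index.add(xb)    # store each vector as a short compressed code
    return index


def select_demonstrations(query_embedding: np.ndarray,
                          index: faiss.IndexPQ,
                          pairs: list[tuple[str, str]],
                          k: int = 4) -> list[tuple[str, str]]:
    """Retrieve the k nearest bilingual pairs as in-context demonstrations."""
    xq = np.ascontiguousarray(query_embedding, dtype=np.float32).reshape(1, -1)
    _, ids = index.search(xq, k)
    return [pairs[i] for i in ids[0]]


def format_prompt(demos: list[tuple[str, str]], source: str) -> str:
    """Assemble a simple translation prompt from the selected demonstrations;
    the paper's multi-view template is more elaborate and not reproduced here."""
    blocks = [f"English: {src}\nGerman: {tgt}" for src, tgt in demos]
    blocks.append(f"English: {source}\nGerman:")
    return "\n\n".join(blocks)
```

In this sketch, the PQ index compresses each embedding into a few bytes of codes, so the nearest-neighbor scan over the curated subset is far cheaper than an exhaustive search over the full datastore, which is the kind of efficiency gain the abstract describes.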

Source Journal

Information Processing & Management (Engineering & Technology - Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles per year: 276
Review time: 39 days
About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.
Latest articles in this journal
- ME3A: A Multimodal Entity Entailment framework for multimodal Entity Alignment
- Hierarchical multi-label text classification of tourism resources using a label-aware dual graph attention network
- Impact of economic and socio-political risk factors on sovereign credit ratings
- Higher-order structure based node importance evaluation in directed networks
- Membership inference attacks via spatial projection-based relative information loss in MLaaS