RUIE: Retrieval-based Unified Information Extraction using Large Language Model

Xincheng Liao, Junwen Duan, Yixi Huang, Jianxin Wang
{"title":"RUIE: Retrieval-based Unified Information Extraction using Large Language Model","authors":"Xincheng Liao, Junwen Duan, Yixi Huang, Jianxin Wang","doi":"arxiv-2409.11673","DOIUrl":null,"url":null,"abstract":"Unified information extraction (UIE) aims to complete all information\nextraction tasks using a single model or framework. While previous work has\nprimarily focused on instruction-tuning large language models (LLMs) with\nconstructed datasets, these methods require significant computational resources\nand struggle to generalize to unseen tasks. To address these limitations, we\npropose RUIE (Retrieval-based Unified Information Extraction), a framework that\nleverages in-context learning to enable rapid generalization while reducing\ncomputational costs. The key challenge in RUIE is selecting the most beneficial\ndemonstrations for LLMs to effectively handle diverse IE tasks. To achieve\nthis, we integrate LLM preferences for ranking candidate demonstrations and\ndesign a keyword-enhanced reward model to capture fine-grained relationships\nbetween queries and demonstrations. We then train a bi-encoder retriever for\nUIE through contrastive learning and knowledge distillation. To the best of our\nknowledge, RUIE is the first trainable retrieval framework for UIE.\nExperimental results on 8 held-out datasets demonstrate RUIE's effectiveness in\ngeneralizing to unseen tasks, with average F1-score improvements of 19.22 and\n3.13 compared to instruction-tuning methods and other retrievers, respectively.\nFurther analysis confirms RUIE's adaptability to LLMs of varying sizes and the\nimportance of its key components.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computation and Language","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11673","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Unified information extraction (UIE) aims to complete all information extraction tasks using a single model or framework. While previous work has primarily focused on instruction-tuning large language models (LLMs) with constructed datasets, these methods require significant computational resources and struggle to generalize to unseen tasks. To address these limitations, we propose RUIE (Retrieval-based Unified Information Extraction), a framework that leverages in-context learning to enable rapid generalization while reducing computational costs. The key challenge in RUIE is selecting the most beneficial demonstrations for LLMs to effectively handle diverse IE tasks. To achieve this, we integrate LLM preferences for ranking candidate demonstrations and design a keyword-enhanced reward model to capture fine-grained relationships between queries and demonstrations. We then train a bi-encoder retriever for UIE through contrastive learning and knowledge distillation. To the best of our knowledge, RUIE is the first trainable retrieval framework for UIE. Experimental results on 8 held-out datasets demonstrate RUIE's effectiveness in generalizing to unseen tasks, with average F1-score improvements of 19.22 and 3.13 compared to instruction-tuning methods and other retrievers, respectively. Further analysis confirms RUIE's adaptability to LLMs of varying sizes and the importance of its key components.
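
The abstract gives only the high-level recipe, not implementation details. Below is a minimal, self-contained sketch of that recipe as described: a bi-encoder demonstration retriever trained with in-batch contrastive learning (InfoNCE) plus knowledge distillation from LLM preference scores, then used to assemble an in-context-learning prompt. Everything here is an illustrative assumption — the toy hashing featurizer, the dimensions, the loss weighting, and the example data are not RUIE's actual code.

```python
# Illustrative sketch only: bi-encoder retriever trained with contrastive
# learning + distillation from LLM preference scores, loosely following the
# recipe the abstract describes. Not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, VOCAB = 64, 1000  # assumed toy sizes

def featurize(text: str) -> torch.Tensor:
    """Toy bag-of-hashed-tokens featurizer (stand-in for a real text encoder)."""
    v = torch.zeros(VOCAB)
    for tok in text.lower().split():
        v[hash(tok) % VOCAB] += 1.0
    return v

class BiEncoder(nn.Module):
    """Separate towers for queries and candidate demonstrations."""
    def __init__(self):
        super().__init__()
        self.q_enc = nn.Linear(VOCAB, EMB_DIM)
        self.d_enc = nn.Linear(VOCAB, EMB_DIM)

    def score(self, q: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between L2-normalized embeddings.
        qe = F.normalize(self.q_enc(q), dim=-1)
        de = F.normalize(self.d_enc(d), dim=-1)
        return qe @ de.T  # [num_queries, num_demos]

def train_step(model, opt, queries, demos, llm_scores, tau=0.05, alpha=0.5):
    """One step: InfoNCE with in-batch negatives + KL distillation from
    LLM preference scores (a stand-in for the reward-model signal)."""
    sims = model.score(queries, demos) / tau
    targets = torch.arange(queries.size(0))          # demo i pairs with query i
    nce = F.cross_entropy(sims, targets)             # contrastive term
    kd = F.kl_div(F.log_softmax(sims, dim=-1),       # distillation term
                  F.softmax(llm_scores / tau, dim=-1),
                  reduction="batchmean")
    loss = alpha * nce + (1 - alpha) * kd
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

if __name__ == "__main__":
    model = BiEncoder()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    q_texts = ["Barack Obama was born in Hawaii.",
               "Apple acquired Beats in 2014."]
    d_texts = ["Text: Obama visited Paris. | NER: Obama=PER, Paris=LOC",
               "Text: Google bought YouTube. | RE: (Google, acquired, YouTube)"]
    Q = torch.stack([featurize(t) for t in q_texts])
    D = torch.stack([featurize(t) for t in d_texts])
    # Hypothetical LLM preference matrix (e.g., how much each demonstration
    # helps the LLM answer each query) used as the distillation target.
    llm_pref = torch.tensor([[2.0, 0.1], [0.3, 1.8]])
    for _ in range(50):
        train_step(model, opt, Q, D, llm_pref)
    # Retrieve the best demonstration per query and assemble an ICL prompt.
    with torch.no_grad():
        best = model.score(Q, D).argmax(dim=-1)
    prompt = d_texts[int(best[0])] + "\nText: " + q_texts[0] + " | NER:"
    print(prompt)
```

A bi-encoder is the natural choice here because demonstration embeddings can be precomputed and indexed once, so selecting demonstrations for a new query is a single embedding pass plus a nearest-neighbor lookup, which keeps inference cheap across the pooled IE tasks.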