{"title":"Multi-LoRA continual learning based instruction tuning framework for universal information extraction","authors":"Yu Jin, Jie Liu, Shaowei Chen","doi":"10.1016/j.knosys.2024.112750","DOIUrl":null,"url":null,"abstract":"<div><div>Universal information extraction (Universal IE) aims to develop one model capable of solving multiple IE target tasks. Previous works have enhanced extraction performance of target tasks through auxiliary tasks. However, there are still limitations in terms of learning strategies. From one aspect, joint learning-based universal IE approaches, which simply mix auxiliary tasks with target tasks, fail to enable the model to master basic knowledge from auxiliary tasks before learning target tasks. From another aspect, continual learning-based universal IE approaches, which sequentially update all the model parameters on auxiliary tasks and target tasks, tend to cause catastrophic forgetting. In this study, we design a multi-LoRA continual learning-based instruction fine-tuning framework for universal IE. Specifically, we design unique LoRA modules for learning auxiliary tasks and target tasks. We first freeze pre-trained weights and update additional parameters on auxiliary tasks through one LoRA module. Subsequently, we keep the weights frozen and further adjust parameters through another LoRA module to adapt the model to the target tasks. Finally, we merge the frozen weights with learned weights, thereby enabling the model to better leverage the acquired abilities during the inference phase. Therefore, our model masters basic extraction abilities before learning target tasks and does not forget this basic knowledge during the target learning process. Moreover, we regard extraction, classification, and recognition as basic abilities and further design auxiliary tasks based on these basic abilities. Experimental results on 37 datasets across 3 tasks show that our approach reaches state-of-the-art performance.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"308 ","pages":"Article 112750"},"PeriodicalIF":7.2000,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705124013844","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Universal information extraction (Universal IE) aims to develop a single model capable of solving multiple IE target tasks. Previous works have enhanced the extraction performance of target tasks through auxiliary tasks. However, there are still limitations in terms of learning strategies. On the one hand, joint learning-based universal IE approaches, which simply mix auxiliary tasks with target tasks, fail to let the model master basic knowledge from auxiliary tasks before learning target tasks. On the other hand, continual learning-based universal IE approaches, which sequentially update all the model parameters on auxiliary tasks and then target tasks, tend to cause catastrophic forgetting. In this study, we design a multi-LoRA continual learning-based instruction fine-tuning framework for universal IE. Specifically, we design separate LoRA modules for learning auxiliary tasks and target tasks. We first freeze the pre-trained weights and update additional parameters on auxiliary tasks through one LoRA module. Subsequently, we keep the weights frozen and further adjust parameters through another LoRA module to adapt the model to the target tasks. Finally, we merge the frozen weights with the learned weights, enabling the model to better leverage the acquired abilities during the inference phase. Therefore, our model masters basic extraction abilities before learning target tasks and does not forget this basic knowledge during target-task learning. Moreover, we regard extraction, classification, and recognition as basic abilities and design auxiliary tasks based on them. Experimental results on 37 datasets across 3 tasks show that our approach achieves state-of-the-art performance.
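The staged procedure described above lends itself to a short illustration. The following is a minimal, hypothetical PyTorch sketch of the two-adapter idea: a single frozen linear layer with one LoRA pair trained in stage 1 (auxiliary tasks) and a second pair trained in stage 2 (target tasks), then both folded into the frozen weights for inference. All class and method names here are our own illustration; the abstract does not specify the paper's exact architecture, ranks, or scaling.

import torch
import torch.nn as nn

class MultiLoRALinear(nn.Module):
    """A frozen linear layer with two LoRA adapters:
    one for auxiliary tasks, one for target tasks."""
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pre-trained weights stay frozen
        self.base.bias.requires_grad_(False)
        self.scale = alpha / rank
        # Adapter trained first, on auxiliary tasks.
        self.aux_down = nn.Linear(in_features, rank, bias=False)
        self.aux_up = nn.Linear(rank, out_features, bias=False)
        # Adapter trained second, on target tasks.
        self.tgt_down = nn.Linear(in_features, rank, bias=False)
        self.tgt_up = nn.Linear(rank, out_features, bias=False)
        # Standard LoRA init: zero the up-projections so each adapter
        # starts as a no-op update.
        nn.init.zeros_(self.aux_up.weight)
        nn.init.zeros_(self.tgt_up.weight)

    def forward(self, x):
        out = self.base(x)
        out = out + self.scale * self.aux_up(self.aux_down(x))
        out = out + self.scale * self.tgt_up(self.tgt_down(x))
        return out

    def set_stage(self, stage):
        """Stage 1: train only the auxiliary adapter.
        Stage 2: freeze it and train only the target adapter."""
        aux = (stage == 1)
        for p in (*self.aux_down.parameters(), *self.aux_up.parameters()):
            p.requires_grad_(aux)
        for p in (*self.tgt_down.parameters(), *self.tgt_up.parameters()):
            p.requires_grad_(not aux)

    @torch.no_grad()
    def merge(self):
        """Fold both adapters into the frozen weights for inference,
        then zero them so the forward pass stays consistent."""
        self.base.weight += self.scale * (self.aux_up.weight @ self.aux_down.weight)
        self.base.weight += self.scale * (self.tgt_up.weight @ self.tgt_down.weight)
        nn.init.zeros_(self.aux_up.weight)
        nn.init.zeros_(self.tgt_up.weight)

A typical (hypothetical) training flow would be: layer.set_stage(1) and train on auxiliary-task data; layer.set_stage(2) and train on target-task data with the auxiliary adapter frozen; finally layer.merge() before inference. Keeping the two adapters disjoint is what prevents target-task updates from overwriting the basic abilities learned in stage 1.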
Journal Information:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.