Learning how to transfer: A lifelong domain knowledge distillation framework for continual MRC

Intelligent Systems with Applications, Pub Date: 2025-03-08, DOI: 10.1016/j.iswa.2025.200497

Songze Li, Zhijing Wu, Runmin Cao, Xiaohan Zhang, Yifan Wang, Hua Xu, Kai Gao

Volume 26, Article 200497. Journal Article. https://www.sciencedirect.com/science/article/pii/S2667305325000237


Abstract

Machine Reading Comprehension (MRC) has attracted wide attention in recent years because it reflects how well a machine understands human language. Benefiting from a growing number of large-scale benchmarks and from pre-trained language models, many MRC models have achieved remarkable success and have even exceeded human performance. However, real-world MRC systems need to learn incrementally from a continuous data stream over time without access to previously seen data, a setting referred to as continual MRC. Learning a new domain incrementally without catastrophically forgetting previous knowledge is a great challenge. This paper proposes MK-MRC (an extension of MA-MRC), a continual MRC framework with uncertainty-aware fixed Memory and lifelong domain Knowledge distillation. MK-MRC is a memory-replay-based method in which a fixed-size memory buffer stores a small number of samples from previous domains and is updated with an uncertainty-aware strategy when new domain data arrives. For incremental learning, MK-MRC exploits the domain adaptation and transfer relationship between the memory and the new domain data through several domain knowledge distillation strategies.
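The fixed-size memory with an uncertainty-aware update can be pictured with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the class name `FixedMemory`, the use of predictive entropy as the uncertainty score, the equal per-domain quota, and the rule of keeping the most uncertain samples are all hypothetical choices, since the abstract does not state how uncertainty is measured or how slots are allocated.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


class FixedMemory:
    """Hypothetical fixed-size replay memory shared by all seen domains."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.by_domain = {}  # domain name -> list of (uncertainty, sample)

    def update(self, domain, samples, probs_per_sample):
        """Add samples from a domain, then shrink every domain to its quota.

        Assumption: uncertainty is the entropy of the model's predicted
        distribution, and each domain keeps its most uncertain samples.
        """
        scored = [(entropy(p), s) for s, p in zip(samples, probs_per_sample)]
        self.by_domain.setdefault(domain, []).extend(scored)
        # Assumption: the buffer is split evenly across the domains seen so far.
        quota = self.capacity // len(self.by_domain)
        for items in self.by_domain.values():
            items.sort(key=lambda pair: pair[0], reverse=True)
            del items[quota:]

    def samples(self):
        """Return all stored samples, grouped by domain in arrival order."""
        return [s for items in self.by_domain.values() for _, s in items]

    def __len__(self):
        return sum(len(items) for items in self.by_domain.values())
```

In this sketch the total size never exceeds `capacity`: when a second domain arrives, the first domain's share is cut from the whole buffer to half of it, so older domains keep fewer but still some samples for replay.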

Compared with MA-MRC, MK-MRC introduces additional strategies to strengthen continual learning, such as data augmentation and dedicated task-related knowledge distillation. Experimental results show that MK-MRC consistently improves over strong baselines and has substantial incremental learning ability without catastrophic forgetting under four continual span-extractive and multiple-choice MRC settings.
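For readers unfamiliar with knowledge distillation, the sketch below shows the generic temperature-softened form of the loss, in which a student model is trained to match a teacher's output distribution. It is not the paper's domain or task-related distillation objective, which the abstract does not specify; the function names, the default temperature of 2.0, and the reading of the logits as scores over answer options or span positions are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation loss; the logits could be scores over answer
    options or span positions (an assumption for this example).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# The loss is zero when the student reproduces the teacher and positive otherwise.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In a continual setting, the teacher would typically be the model trained on earlier domains and the student the model being adapted to the new domain, so minimizing this term discourages the student from drifting away from earlier behavior.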
