Utilizing Inherent Bias for Memory Efficient Continual Learning: A Simple and Robust Baseline

Image and Vision Computing (IF 4.2; Zone 3, Computer Science; JCR Q2, Computer Science, Artificial Intelligence) Pub Date: 2024-09-27 DOI: 10.1016/j.imavis.2024.105288

Neela Rahimi, Ming Shao

Image and Vision Computing, Volume 151, Article 105288. https://www.sciencedirect.com/science/article/pii/S0262885624003937

Citations: 0

Abstract

Learning from continuously evolving data is critical in real-world applications. This type of learning, known as Continual Learning (CL), aims to assimilate new information without compromising performance on prior knowledge. However, learning new information biases the network towards recent observations, resulting in a phenomenon known as catastrophic forgetting. The difficulty increases in Online Continual Learning (OCL) scenarios, where models are allowed only a single pass over the data. Existing OCL approaches that rely on replaying exemplar sets are not only memory-intensive on large-scale datasets but also raise security concerns. While recent dynamic network models address memory concerns, they often yield computationally demanding, over-parameterized solutions with limited generalizability. To address this longstanding problem, we propose a novel OCL approach termed “Bias Robust online Continual Learning (BRCL).” BRCL retains all intermediate models generated. These models inherently exhibit a preference for recently learned classes. To leverage this property for enhanced performance, we devise a strategy we describe as ‘utilizing bias to counteract bias.’ This method involves developing an inference function that capitalizes on the inherent bias of each model towards its recent tasks. Furthermore, we integrate a model consolidation technique that aligns the first layers of these models, focusing in particular on similar feature representations. This process reduces the memory requirement, ensuring a low memory footprint. Although the methodology is kept simple so that it can be extended to various frameworks, extensive experiments reveal a notable performance edge over leading methods on key benchmarks, bringing continual learning closer to matching offline training. (Source code will be made publicly available upon the publication of this paper.)
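The abstract does not spell out the inference function, so the following is only a minimal sketch of one way the idea of ‘utilizing bias to counteract bias’ could be read. It assumes (this is not stated in the source) that the snapshot saved after task t is most reliable on the classes introduced in task t, and therefore keeps only those scores from each snapshot. The names `bias_aware_predict`, `snapshot_0`, `snapshot_1`, and `task_classes` are hypothetical, and the toy scoring functions stand in for real networks.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical types: a "snapshot" is the model saved after finishing one task.
Scores = Dict[int, float]  # class id -> score
Snapshot = Callable[[Sequence[float]], Scores]


def bias_aware_predict(
    x: Sequence[float],
    snapshots: List[Snapshot],
    task_classes: List[List[int]],
) -> int:
    """Predict a class by trusting each snapshot only on its newest classes.

    Assumption (not from the paper): snapshot t is biased towards, and most
    reliable on, the classes introduced in task t.
    """
    if len(snapshots) != len(task_classes):
        raise ValueError("need one class list per snapshot")
    combined: Scores = {}
    for snapshot, classes in zip(snapshots, task_classes):
        scores = snapshot(x)
        for c in classes:
            combined[c] = scores[c]  # keep only this snapshot's "own" classes
    return max(combined, key=combined.get)


def snapshot_0(x: Sequence[float]) -> Scores:
    # Toy model saved after task 0; it only knows classes 0 and 1.
    return {0: x[0], 1: 1.0 - x[0]}


def snapshot_1(x: Sequence[float]) -> Scores:
    # Toy model saved after task 1; it favours classes 2 and 3 and
    # scores the old classes poorly, mimicking forgetting.
    return {0: 0.05, 1: 0.05, 2: 0.6 * x[1], 3: 0.6 * (1.0 - x[1])}


if __name__ == "__main__":
    x = [0.9, 0.5]  # toy input that truly belongs to class 0
    latest_only = max(snapshot_1(x), key=snapshot_1(x).get)
    combined = bias_aware_predict(x, [snapshot_0, snapshot_1], [[0, 1], [2, 3]])
    print("latest snapshot alone:", latest_only)
    print("bias-aware combination:", combined)
```

In this toy setting the latest snapshot alone picks one of its recent classes, while the combination recovers the old class. The sketch assumes scores from different snapshots are directly comparable, which real models may not satisfy without calibration, and it keeps every snapshot separate instead of sharing the aligned first layers that the abstract credits for the low memory footprint.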


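Source journal

Image and Vision Computing (Engineering & Technology; Engineering, Electrical & Electronic)

CiteScore: 8.50
Self-citation rate: 8.50%
Articles published: 143
Review time: 7.8 months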

About the journal: Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.