Unsupervised continual learning by cross-level, instance-group and pseudo-group discrimination with hard attention

IF 3.7 | Zone 3, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Journal of Computational Science | Pub Date: 2025-02-18 | DOI: 10.1016/j.jocs.2025.102535

Ankit Malviya, Sayak Dhole, Chandresh Kumar Maurya

Journal of Computational Science, volume 86, Article 102535. Publication type: Journal Article. https://www.sciencedirect.com/science/article/pii/S1877750325000122

Citations: 0

Abstract

Extensive work has been done in supervised continual learning (SCL), in which models adapt to changing distributions using labeled data while mitigating catastrophic forgetting. However, this setting diverges from real-world scenarios where labeled data is scarce or non-existent. Unsupervised continual learning (UCL) has emerged to bridge this gap. Previous research has explored unsupervised continuous feature learning that uses rehearsal to alleviate catastrophic forgetting. Although these techniques are effective, they may not be feasible where storing training data is impractical. Moreover, rehearsal techniques can suffer from representation drift and overfitting, particularly when the buffer is small. To address these drawbacks, we employ parameter isolation to mitigate forgetting. Specifically, we use task-specific hard attention to prevent updates to parameters that are important for previous tasks. In contrastive learning, the loss tends to degrade when the diversity of negative samples decreases. Therefore, we incorporate instance-to-instance similarity into contrastive learning through both direct instance grouping and cross-level discrimination with local instance groups as well as local pseudo-instance groups. The masked model learns features using cross-level discrimination, which naturally clusters similar data in the representation space. Extensive experiments demonstrate that the proposed approach outperforms current state-of-the-art (SOTA) baselines by significant margins while exhibiting minimal or nearly zero forgetting and without requiring any rehearsal buffer. Additionally, the model learns distinct task boundaries.
It achieves overall average task-incremental and class-incremental learning (TIL and CIL) accuracies of 76.79% and 62.96%, respectively, with nearly zero forgetting, across standard datasets for varying task sequences ranging from 5 to 100. This surpasses SOTA baselines, which reach only 74.28% and 60.68%, respectively, in the UCL setting while experiencing substantial forgetting of around 4% or more. Moreover, our approach achieves performance nearly comparable to the SCL baseline and even surpasses it on some standard datasets, with forgetting reduced from about 14.51% to nearly zero.
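
The abstract describes task-specific hard attention that blocks updates to parameters important for earlier tasks. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes sigmoid gates over per-unit task embeddings, a running element-wise maximum as the cumulative mask, and a gradient scaled by one minus the smaller of the two unit masks a weight connects. All function names, the `scale` value, and the toy embeddings are hypothetical.

```python
import numpy as np

def hard_attention(embedding, scale):
    """Near-binary gate per unit: sigmoid(scale * embedding)."""
    return 1.0 / (1.0 + np.exp(-scale * embedding))

def update_cumulative(cumulative, task_mask):
    """A unit stays protected once any earlier task has claimed it."""
    return np.maximum(cumulative, task_mask)

def mask_gradient(grad, cum_out, cum_in):
    """Scale the gradient of an (out, in) weight matrix.

    A weight is frozen only when both the output unit and the
    input unit it connects are claimed by earlier tasks.
    """
    protect = np.minimum(cum_out[:, None], cum_in[None, :])
    return grad * (1.0 - protect)

# Toy layer with 3 input units and 2 output units (assumed sizes).
cum_in = np.zeros(3)
cum_out = np.zeros(2)

# Hypothetical embeddings learned on the first task.
mask_in = hard_attention(np.array([5.0, -5.0, 5.0]), scale=50.0)
mask_out = hard_attention(np.array([5.0, -5.0]), scale=50.0)

cum_in = update_cumulative(cum_in, mask_in)
cum_out = update_cumulative(cum_out, mask_out)

# Gradient arriving while training a later task.
grad = np.ones((2, 3))
masked = mask_gradient(grad, cum_out, cum_in)
print(masked)
```

In this toy case only the weights linking claimed input units 0 and 2 to claimed output unit 0 receive a zero gradient; every other weight remains free for later tasks. Because protection comes from masking rather than replaying stored samples, no rehearsal buffer is involved.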


Source journal

Journal of Computational Science COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS-COMPUTER SCIENCE, THEORY & METHODS

CiteScore

5.50

Self-citation rate

3.00%

Articles published

227

Review time

41 days

Journal description: Computational Science is a rapidly growing multi- and interdisciplinary field that uses advanced computing and data analysis to understand and solve complex problems. It has reached a level of predictive capability that now firmly complements the traditional pillars of experimentation and theory. Recent advances in experimental techniques, such as detectors, on-line sensor networks and high-resolution imaging, have opened up new windows into physical and biological processes at many levels of detail. The resulting data explosion allows for detailed data-driven modeling and simulation. This new discipline in science combines computational thinking, modern computational methods, devices and collateral technologies to address problems far beyond the scope of traditional numerical methods. Computational science typically unifies three distinct elements:
• Modeling, Algorithms and Simulations (e.g., numerical and non-numerical, discrete and continuous);
• Software developed to solve science (e.g., biological, physical, and social), engineering, medicine, and humanities problems;
• Computer and information science that develops and optimizes the advanced system hardware, software, networking, and data management components (e.g., problem solving environments).