连续学习的受限正交梯度投影

AI Open Pub Date : 2023-01-01 DOI:10.1016/j.aiopen.2023.08.010

Zeyuan Yang , Zonghan Yang , Yichen Liu , Peng Li , Yang Liu

{"title":"连续学习的受限正交梯度投影","authors":"Zeyuan Yang , Zonghan Yang , Yichen Liu , Peng Li , Yang Liu","doi":"10.1016/j.aiopen.2023.08.010","DOIUrl":null,"url":null,"abstract":"<div><p>Continual learning aims to avoid catastrophic forgetting and effectively leverage learned experiences to master new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, resulting in high computational costs. Thus, it remains a challenge whether we can improve forward knowledge transfer for gradient projection approaches <em>using a fixed network architecture</em>. In this work, we propose the Restricted Orthogonal Gradient prOjection (ROGO) framework. The basic idea is to adopt a restricted orthogonal constraint allowing parameters optimized in the direction oblique to the whole frozen space to facilitate forward knowledge transfer while consolidating previous knowledge. Our framework requires neither data buffers nor extra parameters. Extensive experiments have demonstrated the superiority of our framework over several strong baselines. We also provide theoretical guarantees for our relaxing strategy.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 98-110"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Restricted orthogonal gradient projection for continual learning\",\"authors\":\"Zeyuan Yang , Zonghan Yang , Yichen Liu , Peng Li , Yang Liu\",\"doi\":\"10.1016/j.aiopen.2023.08.010\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Continual learning aims to avoid catastrophic forgetting and effectively leverage learned experiences to master new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, resulting in high computational costs. Thus, it remains a challenge whether we can improve forward knowledge transfer for gradient projection approaches <em>using a fixed network architecture</em>. In this work, we propose the Restricted Orthogonal Gradient prOjection (ROGO) framework. The basic idea is to adopt a restricted orthogonal constraint allowing parameters optimized in the direction oblique to the whole frozen space to facilitate forward knowledge transfer while consolidating previous knowledge. Our framework requires neither data buffers nor extra parameters. Extensive experiments have demonstrated the superiority of our framework over several strong baselines. We also provide theoretical guarantees for our relaxing strategy.</p></div>\",\"PeriodicalId\":100068,\"journal\":{\"name\":\"AI Open\",\"volume\":\"4 \",\"pages\":\"Pages 98-110\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AI Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666651023000128\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Open","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666651023000128","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

持续学习旨在避免灾难性的遗忘，并有效地利用所学经验来掌握新知识。现有的梯度投影方法对新任务的优化空间施加了严格的约束，以最大限度地减少干扰，这同时阻碍了正向知识转移。为了解决这个问题，最近的方法在不断增长的网络中重用冻结的参数，导致计算成本高。因此，我们是否可以使用固定网络架构改进梯度投影方法的前向知识转移仍然是一个挑战。在这项工作中，我们提出了限制正交梯度投影（ROGO）框架。其基本思想是采用限制正交约束，允许在倾斜于整个冻结空间的方向上优化参数，以便于在巩固先前知识的同时向前转移知识。我们的框架既不需要数据缓冲区，也不需要额外的参数。大量的实验已经证明了我们的框架相对于几个强大的基线的优越性。我们也为我们的放松策略提供了理论保障。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Restricted orthogonal gradient projection for continual learning

Continual learning aims to avoid catastrophic forgetting and effectively leverage learned experiences to master new knowledge. Existing gradient projection approaches impose hard constraints on the optimization space for new tasks to minimize interference, which simultaneously hinders forward knowledge transfer. To address this issue, recent methods reuse frozen parameters with a growing network, resulting in high computational costs. Thus, it remains a challenge whether we can improve forward knowledge transfer for gradient projection approaches using a fixed network architecture. In this work, we propose the Restricted Orthogonal Gradient prOjection (ROGO) framework. The basic idea is to adopt a restricted orthogonal constraint allowing parameters optimized in the direction oblique to the whole frozen space to facilitate forward knowledge transfer while consolidating previous knowledge. Our framework requires neither data buffers nor extra parameters. Extensive experiments have demonstrated the superiority of our framework over several strong baselines. We also provide theoretical guarantees for our relaxing strategy.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊