Flexible task abstractions emerge in linear networks with fast and bounded units.

ArXiv Pub Date: 2025-01-16

Kai Sandbrink, Jan P Bauer, Alexandra M Proca, Andrew M Saxe, Christopher Summerfield, Ali Hummos

Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11774440/pdf/

Citations: 0

Abstract

Animals survive in dynamic environments that change at arbitrary timescales, but such shifts in the data distribution are a challenge for neural networks. To adapt to change, neural systems may change a large number of parameters, which is a slow process that involves forgetting past information. In contrast, animals leverage distribution changes to segment their stream of experience into tasks and associate them with internal task abstractions. Animals can then respond flexibly by selecting the appropriate task abstraction. However, how such flexible task abstractions may arise in neural systems remains unknown. Here, we analyze a linear gated network where the weights and gates are jointly optimized via gradient descent, but with neuron-like constraints on the gates including a faster timescale, nonnegativity, and bounded activity. We observe that the weights self-organize into modules specialized for the tasks or subtasks encountered, while the gating layer forms unique representations that switch in the appropriate weight modules (task abstractions). We analytically reduce the learning dynamics to an effective eigenspace, revealing a virtuous cycle: fast-adapting gates drive weight specialization by protecting previous knowledge, while weight specialization in turn increases the update rate of the gating layer. Task switching in the gating layer accelerates as a function of curriculum block size and task training, mirroring key findings in cognitive neuroscience. We show that the discovered task abstractions support generalization through both task and subtask composition, and we extend our findings to a non-linear network switching between two tasks. Overall, our work offers a theory of cognitive flexibility in animals as arising from joint gradient descent on synaptic weights and neural gating in a neural network architecture.
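The setup described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `train_gated_linear`, the two random linear "teacher" tasks, the learning rates, and the use of simple clipping to [0, 1] to enforce nonnegative, bounded gates are all assumptions made for illustration. The only ingredients taken from the abstract are a linear network whose output is a gated sum of weight modules, joint gradient descent on weights and gates, a faster timescale for the gates, and a blocked curriculum.

```python
import numpy as np

def train_gated_linear(n_blocks=4, block_len=200, d_in=4, d_out=3,
                       n_paths=2, lr_w=0.01, lr_c=0.05, seed=0):
    """Toy linear gated network y = sum_p c[p] * W[p] @ x (hypothetical sketch)."""
    rng = np.random.default_rng(seed)
    # Two random linear teacher maps stand in for two tasks (assumption).
    teachers = [rng.normal(size=(d_out, d_in)) for _ in range(2)]
    W = 0.1 * rng.normal(size=(n_paths, d_out, d_in))  # weight modules
    c = np.full(n_paths, 0.5)                          # gates
    losses, gate_history = [], []
    for block in range(n_blocks):
        W_star = teachers[block % 2]        # tasks alternate in blocks
        for _ in range(block_len):
            x = rng.normal(size=d_in)
            y_target = W_star @ x
            path_out = W @ x                # shape (n_paths, d_out)
            y = c @ path_out                # gated sum, shape (d_out,)
            err = y - y_target
            losses.append(0.5 * float(err @ err))
            # Gradients of 0.5 * ||y - y_target||^2.
            grad_c = path_out @ err                           # (n_paths,)
            grad_W = c[:, None, None] * np.outer(err, x)      # (n_paths, d_out, d_in)
            W -= lr_w * grad_W
            # Gates use a larger step size (faster timescale); clipping keeps
            # them nonnegative and bounded. Clipping is an assumed stand-in
            # for the constraints mentioned in the abstract.
            c = np.clip(c - lr_c * grad_c, 0.0, 1.0)
            gate_history.append(c.copy())
    return np.array(losses), np.array(gate_history), W

losses, gates, W = train_gated_linear()
print("mean loss, first 50 steps:", losses[:50].mean())
print("mean loss, end of first block:", losses[150:200].mean())
print("final gates:", gates[-1])
```

Plotting `gates` over time against the block boundaries is one way to inspect whether the gates in this toy version come to track the active task; whether and how quickly that happens depends on the assumed hyperparameters.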


