Learning Fixed Points of Recurrent Neural Networks by Reparameterizing the Network Model

Neural Computation · IF 2.1 · Zone 4 (Computer Science) · JCR Q3 (Computer Science, Artificial Intelligence) · Pub Date: 2024-07-19 · DOI: 10.1162/neco_a_01681

Vicky Zhu;Robert Rosenbaum

Neural Computation, vol. 36, no. 8, pp. 1568-1600. Not open access. Source: https://ieeexplore.ieee.org/document/10661267/

Citations: 0

Abstract

In computational neuroscience, recurrent neural networks are widely used to model neural activity and learning. In many studies, fixed points of recurrent neural networks are used to model neural responses to static or slowly changing stimuli, such as visual cortical responses to static visual stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. In parallel, training fixed points is a central topic in the study of deep equilibrium models in machine learning. A natural approach is to use gradient descent on the Euclidean space of weights. We show that this approach can lead to poor learning performance due in part to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We demonstrate that these learning rules avoid singularities and learn more effectively than standard gradient descent. The new learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should be expected to follow the negative Euclidean gradient of synaptic weights.
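The abstract's point about singularities can be illustrated with a small sketch. It assumes a linear recurrent model whose fixed point satisfies r = W r + x, so that r = (I - W)^{-1} x; the abstract does not state the model, so this form is an assumption of the example. The reparameterization A = (I - W)^{-1}, the squared-error loss, and all function names are likewise illustrative choices, not the authors' learning rules. Under these assumptions, the Euclidean gradient with respect to W grows without bound as I - W approaches a singular matrix, whereas the loss is an ordinary quadratic in A.

```python
import numpy as np

def fixed_point(W, x):
    """Fixed point of the assumed linear model r = W r + x, i.e. r = (I - W)^{-1} x."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - W, x)

def loss(r, y):
    """Squared-error loss evaluated at the fixed point (an assumed choice)."""
    return 0.5 * np.sum((r - y) ** 2)

def euclidean_grad_W(W, x, y):
    """Gradient of the loss with respect to W, obtained by implicit differentiation.

    With A = (I - W)^{-1} and r = A x, dr = A dW r, so dL/dW = A^T (r - y) r^T.
    """
    n = W.shape[0]
    A = np.linalg.inv(np.eye(n) - W)
    r = A @ x
    return A.T @ np.outer(r - y, r)

def train_reparameterized(W, x, y, lr=0.1, steps=200):
    """Gradient descent on A = (I - W)^{-1} instead of on W (illustrative only).

    In terms of A the loss is 0.5 * ||A x - y||^2, whose gradient is (r - y) x^T.
    The weights are recovered at the end as W = I - A^{-1}.
    """
    n = W.shape[0]
    A = np.linalg.inv(np.eye(n) - W)
    for _ in range(steps):
        r = A @ x
        A -= lr * np.outer(r - y, x)
    return np.eye(n) - np.linalg.inv(A)

if __name__ == "__main__":
    # One-dimensional case: the gradient with respect to W grows as w approaches 1.
    x1, y1 = np.array([1.0]), np.array([2.0])
    for w in (0.0, 0.9, 0.99):
        g = euclidean_grad_W(np.array([[w]]), x1, y1)
        print(w, float(np.abs(g[0, 0])))

    # Two-dimensional case: train in the reparameterized coordinates.
    x2, y2 = np.array([1.0, 0.0]), np.array([2.0, 1.0])
    W_new = train_reparameterized(np.zeros((2, 2)), x2, y2)
    print(loss(fixed_point(W_new, x2), y2))
```

In this sketch the update in A contracts the fixed-point error by a factor of 1 - lr * ||x||^2 per step, so the step size must satisfy lr < 2 / ||x||^2; the sketch also assumes A stays invertible so that W can be recovered.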


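Source journal

Neural Computation (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 6.30

Self-citation rate: 3.40%

Articles published: (not given)

Review time: 3.0 months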

Journal description: Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.