Claus Metzner;Marius E. Yamakou;Dennis Voelkl;Achim Schilling;Patrick Krauss
{"title":"Quantifying and Maximizing the Information Flux in Recurrent Neural Networks","authors":"Claus Metzner;Marius E. Yamakou;Dennis Voelkl;Achim Schilling;Patrick Krauss","doi":"10.1162/neco_a_01651","DOIUrl":null,"url":null,"abstract":"Free-running recurrent neural networks (RNNs), especially probabilistic models, generate an ongoing information flux that can be quantified with the mutual information I[x→(t),x→(t+1)] between subsequent system states x→. Although previous studies have shown that I depends on the statistics of the network's connection weights, it is unclear how to maximize I systematically and how to quantify the flux in large systems where computing the mutual information becomes intractable. Here, we address these questions using Boltzmann machines as model systems. We find that in networks with moderately strong connections, the mutual information I is approximately a monotonic transformation of the root-mean-square averaged Pearson correlations between neuron pairs, a quantity that can be efficiently computed even in large systems. Furthermore, evolutionary maximization of I[x→(t),x→(t+1)] reveals a general design principle for the weight matrices enabling the systematic construction of systems with a high spontaneous information flux. Finally, we simultaneously maximize information flux and the mean period length of cyclic attractors in the state-space of these dynamical networks. Our results are potentially useful for the construction of RNNs that serve as short-time memories or pattern generators.","PeriodicalId":54731,"journal":{"name":"Neural Computation","volume":"36 3","pages":"351-384"},"PeriodicalIF":2.7000,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Computation","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10535083/","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Free-running recurrent neural networks (RNNs), especially probabilistic models, generate an ongoing information flux that can be quantified with the mutual information I[x→(t),x→(t+1)] between subsequent system states x→. Although previous studies have shown that I depends on the statistics of the network's connection weights, it is unclear how to maximize I systematically and how to quantify the flux in large systems where computing the mutual information becomes intractable. Here, we address these questions using Boltzmann machines as model systems. We find that in networks with moderately strong connections, the mutual information I is approximately a monotonic transformation of the root-mean-square averaged Pearson correlations between neuron pairs, a quantity that can be efficiently computed even in large systems. Furthermore, evolutionary maximization of I[x→(t),x→(t+1)] reveals a general design principle for the weight matrices enabling the systematic construction of systems with a high spontaneous information flux. Finally, we simultaneously maximize information flux and the mean period length of cyclic attractors in the state-space of these dynamical networks. Our results are potentially useful for the construction of RNNs that serve as short-time memories or pattern generators.
期刊介绍:
Neural Computation is uniquely positioned at the crossroads between neuroscience and TMCS and welcomes the submission of original papers from all areas of TMCS, including: Advanced experimental design; Analysis of chemical sensor data; Connectomic reconstructions; Analysis of multielectrode and optical recordings; Genetic data for cell identity; Analysis of behavioral data; Multiscale models; Analysis of molecular mechanisms; Neuroinformatics; Analysis of brain imaging data; Neuromorphic engineering; Principles of neural coding, computation, circuit dynamics, and plasticity; Theories of brain function.