Cooperativity, Information Gain, and Energy Cost During Early LTP in Dendritic Spines
Jan Karbowski and Paulina Urban
Neural Computation 36(2): 271-311, January 18, 2024. DOI: 10.1162/neco_a_01632

We investigate the mutual relationship between information and energy during the early phase of LTP induction and maintenance in a large-scale system of mutually coupled dendritic spines, with discrete internal states and probabilistic dynamics, within the framework of nonequilibrium stochastic thermodynamics. In order to analyze this computationally intractable stochastic multidimensional system, we introduce a pair approximation, which allows us to reduce the spine dynamics to a manageable, lower-dimensional system of closed equations. We find that the rates of information gain and energy attain their maximal values during the initial period of LTP (i.e., during stimulation) and afterward relax back to their low baseline values, in contrast to the memory trace, which lasts much longer. This suggests that the learning phase is much more energy demanding than the memory phase. We show that positive correlations between neighboring spines increase both the duration of the memory trace and the energy cost of LTP, but the memory time per invested energy increases dramatically for very strong, positive synaptic cooperativity, suggesting a beneficial role of synaptic clustering for memory duration. In contrast, information gain after LTP is largest for negative correlations, and the energy efficiency of that information generally declines with increasing synaptic cooperativity. We also find that dendritic spines can use sparse representations for encoding long-term information, as both the energetic and structural efficiencies of retained information and its lifetime exhibit maxima at low fractions of stimulated synapses during LTP. Moreover, such efficiencies drop significantly as the number of spines increases. In general, our stochastic thermodynamics approach provides a unifying framework for studying, from first principles, information encoding and its energy cost during learning and memory in stochastic systems of interacting synapses.

Emergence of Universal Computations Through Neural Manifold Dynamics
Joan Gort
Neural Computation 36(2): 227-270, January 18, 2024. DOI: 10.1162/neco_a_01631

There is growing evidence that many forms of neural computation may be implemented by low-dimensional dynamics unfolding at the population scale. However, neither the connectivity structure nor the general capabilities of these embedded dynamical processes are currently understood. In this work, the two most common formalisms of firing-rate models are evaluated using tools from analysis, topology, and nonlinear dynamics in order to address these questions. It is shown that low-rank structured connectivities predict the formation of invariant and globally attracting manifolds in all these models. The dynamics arising in these manifolds are proved to be topologically equivalent across the considered formalisms. This letter also shows that under the low-rank hypothesis, the flows emerging in neural manifolds, including input-driven systems, are universal, which broadens previous findings. It explores how low-dimensional orbits can support the production of continuous sets of muscular trajectories, the implementation of central pattern generators, and the storage of memory states. These dynamics can robustly simulate any Turing machine over arbitrary bounded memory strings, virtually endowing rate models with the power of universal computation. In addition, the letter shows how the low-rank hypothesis predicts the parsimonious correlation structure observed in cortical activity. Finally, it discusses how this theory could provide a useful tool from which to study neuropsychological phenomena using mathematical methods.

Q&A Label Learning
Kota Kawamoto and Masato Uchida
Neural Computation 36(2): 312-349, January 18, 2024. DOI: 10.1162/neco_a_01633

Assigning labels to instances is crucial for supervised machine learning. In this letter, we propose a novel annotation method, Q&A labeling, which involves a question generator that asks questions about the labels of the instances to be assigned and an annotator that answers the questions and assigns the corresponding labels to the instances. We derived a generative model of labels assigned according to two Q&A labeling procedures that differ in the way questions are asked and answered. We showed that in both procedures, the derived model is partially consistent with that assumed in previous studies. The main distinction of this study from previous ones lies in the fact that the label generative model was not assumed but, rather, derived based on the definition of a specific annotation method, Q&A labeling. We also derived a loss function to evaluate the classification risk of ordinary supervised machine learning using instances assigned Q&A labels and evaluated the upper bound of the classification error. The results indicate statistical consistency in learning with Q&A labels.

Efficient Decoding of Large-Scale Neural Population Responses With Gaussian-Process Multiclass Regression
C. Daniel Greenidge, Benjamin Scholl, Jacob L. Yates, and Jonathan W. Pillow
Neural Computation 36(2): 175-226, January 18, 2024. DOI: 10.1162/neco_a_01630

Neural decoding methods provide a powerful tool for quantifying the information content of neural population codes and the limits imposed by correlations in neural activity. However, standard decoding methods are prone to overfitting and scale poorly to high-dimensional settings. Here, we introduce a novel decoding method to overcome these limitations. Our approach, the gaussian process multiclass decoder (GPMD), is well suited to decoding a continuous low-dimensional variable from high-dimensional population activity and provides a platform for assessing the importance of correlations in neural population codes. The GPMD is a multinomial logistic regression model with a gaussian process prior over the decoding weights. The prior includes hyperparameters that govern the smoothness of each neuron's decoding weights, allowing automatic pruning of uninformative neurons during inference. We provide a variational inference method for fitting the GPMD to data, which scales to hundreds or thousands of neurons and performs well even in data sets with more neurons than trials. We apply the GPMD to recordings from primary visual cortex in three species: monkey, ferret, and mouse. Our decoder achieves state-of-the-art accuracy on all three data sets and substantially outperforms independent Bayesian decoding, showing that knowledge of the correlation structure is essential for optimal decoding in all three species.

Modeling the Role of Contour Integration in Visual Inference
Salman Khan, Alexander Wong, and Bryan Tripp
Neural Computation 36(1): 33-74, December 12, 2023. DOI: 10.1162/neco_a_01625. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10534913

Under difficult viewing conditions, the brain's visual system uses a variety of recurrent modulatory mechanisms to augment feedforward processing. One resulting phenomenon is contour integration, which occurs in the primary visual (V1) cortex and strengthens neural responses to edges if they belong to a larger smooth contour. Computational models have contributed to an understanding of the circuit mechanisms of contour integration, but less is known about its role in visual perception. To address this gap, we embedded a biologically grounded model of contour integration in a task-driven artificial neural network and trained it using a gradient-descent variant. We used this model to explore how brain-like contour integration may be optimized for high-level visual objectives as well as its potential roles in perception. When the model was trained to detect contours in a background of random edges, a task commonly used to examine contour integration in the brain, it closely mirrored the brain in terms of behavior, neural responses, and lateral connection patterns. When trained on natural images, the model enhanced weaker contours and distinguished whether two points lay on the same versus different contours. The model learned robust features that generalized well to out-of-training-distribution stimuli. Surprisingly, and in contrast with the synthetic task, a parameter-matched control network without recurrence performed the same as or better than the model on the natural-image tasks. Thus, a contour integration mechanism is not essential to perform these more naturalistic contour-related tasks. Finally, the best performance in all tasks was achieved by a modified contour integration model that did not distinguish between excitatory and inhibitory neurons.

The Limiting Dynamics of SGD: Modified Loss, Phase-Space Oscillations, and Anomalous Diffusion
Daniel Kunin, Javier Sagastuy-Brena, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, and Daniel L. K. Yamins
Neural Computation 36(1): 151-174, December 12, 2023. DOI: 10.1162/neco_a_01626

In this work, we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD). As observed previously, long after performance has converged, networks continue to move through parameter space by a process of anomalous diffusion in which distance traveled grows as a power law in the number of gradient updates with a nontrivial exponent. We reveal an intricate interaction among the hyperparameters of optimization, the structure in the gradient noise, and the Hessian matrix at the end of training that explains this anomalous diffusion. To build this understanding, we first derive a continuous-time model for SGD with finite learning rates and batch sizes as an underdamped Langevin equation. We study this equation in the setting of linear regression, where we can derive exact, analytic expressions for the phase-space dynamics of the parameters and their instantaneous velocities from initialization to stationarity. Using the Fokker-Planck equation, we show that the key ingredient driving these dynamics is not the original training loss but rather the combination of a modified loss, which implicitly regularizes the velocity, and probability currents that cause oscillations in phase space. We identify qualitative and quantitative predictions of this theory in the dynamics of a ResNet-18 model trained on ImageNet. Through the lens of statistical physics, we uncover a mechanistic origin for the anomalous limiting dynamics of deep neural networks trained with SGD. Understanding the limiting dynamics of SGD, and its dependence on various important hyperparameters like batch size, learning rate, and momentum, can serve as a basis for future work that turns these insights into algorithmic gains.

Performance Evaluation of Matrix Factorization for fMRI Data
Yusuke Endo and Koujin Takeda
Neural Computation 36(1): 128-150, December 12, 2023. DOI: 10.1162/neco_a_01628

A hypothesis in the study of the brain is that sparse coding is realized in the information representation of external stimuli, which has recently been confirmed experimentally for visual stimuli. However, unlike in specific functional regions, sparse coding in information processing across the whole brain has not been sufficiently clarified. In this study, we investigate the validity of sparse coding in the whole human brain by applying various matrix factorization (MF) methods to functional magnetic resonance imaging data of neural activities. The results support the sparse coding hypothesis for information representation in the whole human brain: features extracted by sparse MF methods, namely sparse principal component analysis (SparsePCA) or the method of optimal directions (MOD) under a high-sparsity setting, or by an approximately sparse MF method, fast independent component analysis (FastICA), classify external visual stimuli more accurately than features from the nonsparse MF method or from sparse MF methods under a low-sparsity setting.

Cocaine Use Prediction With Tensor-Based Machine Learning on Multimodal MRI Connectome Data
Anru R. Zhang, Ryan P. Bell, Chen An, Runshi Tang, Shana A. Hall, Cliburn Chan, Kareem Al-Khalil, and Christina S. Meade
Neural Computation 36(1): 107-127, December 12, 2023. DOI: 10.1162/neco_a_01623

This letter considers the use of machine learning algorithms for predicting cocaine use based on magnetic resonance imaging (MRI) connectomic data. The study used functional MRI (fMRI) and diffusion MRI (dMRI) data collected from 275 individuals, parcellated into 246 regions of interest (ROIs) using the Brainnetome atlas. After preprocessing, the data sets were transformed into tensor form. We developed a tensor-based unsupervised machine learning algorithm to reduce the size of the data tensor from 275 (individuals) × 2 (fMRI and dMRI) × 246 (ROIs) × 246 (ROIs) to 275 (individuals) × 2 (fMRI and dMRI) × 6 (clusters) × 6 (clusters). This was achieved by applying the high-order Lloyd algorithm to group the ROI data into six clusters. Features were extracted from the reduced tensor and combined with demographic features (age, gender, race, and HIV status). The resulting data set was used to train a CatBoost model using subsampling and nested cross-validation, which achieved a prediction accuracy of 0.857 for identifying cocaine users. The model was also compared with alternative models, and its feature importances were examined. Overall, this study highlights the potential of tensor-based machine learning for predicting cocaine use from MRI connectomic data and presents a promising approach for identifying individuals at risk of substance abuse.

Synchronization and Clustering in Complex Quadratic Networks
Anca Rǎdulescu, Danae Evans, Amani-Dasia Augustin, Anthony Cooper, Johan Nakuci, and Sarah Muldoon
Neural Computation 36(1): 75-106, December 12, 2023. DOI: 10.1162/neco_a_01624

Synchronization and clustering are well studied in the context of networks of oscillators, such as neuronal networks. However, these phenomena are notoriously difficult to approach mathematically in natural, complex networks. Here, we aim to understand them in a canonical framework, using complex quadratic node dynamics coupled in networks that we call complex quadratic networks (CQNs). We review previously defined extensions of the Mandelbrot and Julia sets for networks, focusing on the behavior of the node-wise projections of these sets and on describing the phenomena of node clustering and synchronization. One aspect of our work consists of exploring ties between a network's connectivity and its ensemble dynamics by identifying mechanisms that lead to clusters of nodes exhibiting identical or different Mandelbrot sets. Based on our preliminary analytical results (obtained primarily in two-dimensional networks), we propose that clustering is strongly determined by the network connectivity patterns, with the geometry of these clusters further controlled by the connection weights. Here, we first explore this relationship further, using examples of synthetic networks of increasing size (from 3 to 5 to 20 nodes). We then illustrate the potential practical implications of synchronization in an existing set of whole-brain, tractography-based networks obtained from 197 human subjects using diffusion tensor imaging. Understanding how these concepts apply to CQNs contributes to our understanding of universal principles in dynamic networks and may help extend theoretical results to natural, complex systems.

Active Predictive Coding: A Unifying Neural Model for Active Perception, Compositional Learning, and Hierarchical Planning
Rajesh P. N. Rao, Dimitrios C. Gklezakos, and Vishwas Sathish
Neural Computation 36(1): 1-32, December 12, 2023. DOI: 10.1162/neco_a_01627

There is growing interest in predictive coding as a model of how the brain learns through predictions and prediction errors. Predictive coding models have traditionally focused on sensory coding and perception. Here we introduce active predictive coding (APC) as a unifying model for perception, action, and cognition. The APC model addresses important open problems in cognitive science and AI, including (1) how we learn compositional representations (e.g., part-whole hierarchies for equivariant vision) and (2) how we solve large-scale planning problems, which are hard for traditional reinforcement learning, by composing complex state dynamics and abstract actions from simpler dynamics and primitive actions. By using hypernetworks, self-supervised learning, and reinforcement learning, APC learns hierarchical world models by combining task-invariant state transition networks and task-dependent policy networks at multiple abstraction levels. We illustrate the applicability of the APC model to active visual perception and hierarchical planning. Our results represent, to our knowledge, the first proof-of-concept demonstration of a unified approach to addressing the part-whole learning problem in vision, the nested reference frames learning problem in cognition, and the integrated state-action hierarchy learning problem in reinforcement learning.