Unsupervised discovery of the shared and private geometry in multi-view data

Sai Koukuntla, Joshua B. Julian, Jesse C. Kaminsky, Manuel Schottdorf, David W. Tank, Carlos D. Brody, Adam S. Charles
arXiv:2408.12091 · arXiv - QuanBio - Neurons and Cognition · published 2024-08-22

Abstract

Modern applications often leverage multiple views of a subject of study. Within neuroscience, there is growing interest in large-scale simultaneous recordings across multiple brain regions. Understanding the relationship between views (e.g., the neural activity in each recorded region) can reveal fundamental principles about the characteristics of each representation and about the system. However, existing methods to characterize such relationships either lack the expressivity required to capture complex nonlinearities, describe only sources of variance that are shared between views, or discard geometric information that is crucial to interpreting the data. Here, we develop a nonlinear neural network-based method that, given paired samples of high-dimensional views, disentangles the low-dimensional shared and private latent variables underlying these views while preserving intrinsic data geometry. Across multiple simulated and real datasets, we demonstrate that our method outperforms competing methods. Using simulated populations of lateral geniculate nucleus (LGN) and V1 neurons, we demonstrate our model's ability to discover interpretable shared and private structure across different noise conditions. On a dataset of unrotated and corresponding but randomly rotated MNIST digits, we recover private latents for the rotated view that encode rotation angle regardless of digit class and place the angle representation on a 1-D manifold, while shared latents encode digit class but not rotation angle. Applying our method to simultaneous Neuropixels recordings of hippocampus and prefrontal cortex while mice run on a linear track, we discover a low-dimensional shared latent space that encodes the animal's position. We propose our approach as a general-purpose method for finding succinct and interpretable descriptions of paired datasets in terms of disentangled shared and private latent variables.
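The shared/private decomposition the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's method (which is a nonlinear neural network; its architecture is not specified in this abstract): it generates two views from a common shared latent plus view-specific private latents via hypothetical linear mixing, then uses classical CCA as a linear proxy to show that only the shared dimensions correlate across views.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
z_shared = rng.normal(size=(n, 2))  # latent common to both views
z_priv_a = rng.normal(size=(n, 2))  # latent unique to view A
z_priv_b = rng.normal(size=(n, 2))  # latent unique to view B

# Each high-dimensional view mixes its shared and private latents.
# Linear mixing with small observation noise, purely for illustration.
W_as, W_ap = rng.normal(size=(2, 20)), rng.normal(size=(2, 20))
W_bs, W_bp = rng.normal(size=(2, 20)), rng.normal(size=(2, 20))
view_a = z_shared @ W_as + z_priv_a @ W_ap + 0.1 * rng.normal(size=(n, 20))
view_b = z_shared @ W_bs + z_priv_b @ W_bp + 0.1 * rng.normal(size=(n, 20))

def cca_correlations(X, Y):
    """Canonical correlations between two centered views,
    computed as singular values of the product of their
    orthonormalized (QR) bases."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    qx, _ = np.linalg.qr(X)
    qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(qx.T @ qy, compute_uv=False)

corrs = cca_correlations(view_a, view_b)
# The two shared latent dimensions yield canonical correlations near 1;
# the private dimensions, being independent across views, yield small ones.
print(corrs[:4])
```

Note that a purely correlational tool like CCA recovers only the shared subspace and, being linear, cannot preserve the nonlinear geometry the paper emphasizes; the private latents and manifold structure are exactly what the proposed method adds on top of this baseline picture.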