Unsupervised discovery of the shared and private geometry in multi-view data

Sai Koukuntla, Joshua B. Julian, Jesse C. Kaminsky, Manuel Schottdorf, David W. Tank, Carlos D. Brody, Adam S. Charles

arXiv:2408.12091 (2024-08-22)
Modern applications often leverage multiple views of a subject of study.
Within neuroscience, there is growing interest in large-scale simultaneous
recordings across multiple brain regions. Understanding the relationship
between views (e.g., the neural activity recorded in each region) can reveal
fundamental principles about the characteristics of each representation and
about the system. However, existing methods to characterize such relationships
either lack the expressivity required to capture complex nonlinearities,
describe only sources of variance that are shared between views, or discard
geometric information that is crucial to interpreting the data. Here, we
develop a nonlinear neural network-based method that, given paired samples of
high-dimensional views, disentangles low-dimensional shared and private latent
variables underlying these views while preserving intrinsic data geometry.
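
The abstract does not specify the architecture or objective, but the model class it describes can be sketched roughly as a pair of autoencoders whose latents are split into shared and private blocks, with an alignment term that pushes view-specific variance into the private latents. The minimal PyTorch sketch below is an illustration under those assumptions, not the authors' implementation; all module names, dimensionalities, and loss weights are hypothetical, and the geometry-preserving component of the method is omitted.

import torch
import torch.nn as nn

def mlp(d_in, d_out, d_hidden=128):
    # Small nonlinear map used for every encoder/decoder below.
    return nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU(),
                         nn.Linear(d_hidden, d_out))

class SharedPrivateAE(nn.Module):
    def __init__(self, d_x, d_y, d_shared=2, d_private=2):
        super().__init__()
        # Each view gets one encoder for shared and one for private latents.
        self.enc_sx, self.enc_px = mlp(d_x, d_shared), mlp(d_x, d_private)
        self.enc_sy, self.enc_py = mlp(d_y, d_shared), mlp(d_y, d_private)
        # Each view is reconstructed from its shared + private latents.
        self.dec_x = mlp(d_shared + d_private, d_x)
        self.dec_y = mlp(d_shared + d_private, d_y)

    def forward(self, x, y):
        sx, px = self.enc_sx(x), self.enc_px(x)
        sy, py = self.enc_sy(y), self.enc_py(y)
        x_hat = self.dec_x(torch.cat([sx, px], dim=-1))
        y_hat = self.dec_y(torch.cat([sy, py], dim=-1))
        return sx, sy, px, py, x_hat, y_hat

def loss_fn(model, x, y, align_weight=1.0):
    sx, sy, px, py, x_hat, y_hat = model(x, y)
    recon = ((x - x_hat) ** 2).mean() + ((y - y_hat) ** 2).mean()
    # Paired samples should have matching shared latents; this pushes
    # view-specific variability into the private latents.
    align = ((sx - sy) ** 2).mean()
    return recon + align_weight * align

In a formulation like this, disentanglement emerges from the interplay of reconstruction and alignment terms; the paper's stated contribution additionally preserves the intrinsic geometry of the data, which would require a further regularizer not reconstructed here.
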
Across multiple simulated and real datasets, we demonstrate that our method
outperforms competing methods. Using simulated populations of lateral
geniculate nucleus (LGN) and V1 neurons, we demonstrate our model's ability to
discover interpretable shared and private structure across different noise
conditions. On a dataset of unrotated and corresponding but randomly rotated
MNIST digits, we recover private latents for the rotated view that encode
rotation angle regardless of digit class and place the angle representation
on a one-dimensional manifold, while shared latents encode digit class but not rotation
angle. Applying our method to simultaneous Neuropixels recordings of
hippocampus and prefrontal cortex while mice run on a linear track, we discover
a low-dimensional shared latent space that encodes the animal's position. We
propose our approach as a general-purpose method for finding succinct and
interpretable descriptions of paired datasets in terms of disentangled shared
and private latent variables.
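
For concreteness, the paired views in the MNIST experiment above can be constructed as follows. The authors' exact preprocessing, angle range, and sampling are not stated in the abstract, so this is one plausible construction for illustration only.

import torch
from torchvision import datasets, transforms
from torchvision.transforms import functional as TF

mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())

def paired_views(img):
    # View 1: the original digit. View 2: the same digit rotated by a
    # random angle. Shared latents should then capture digit identity,
    # while the rotated view's private latents should capture the angle.
    angle = float(torch.empty(1).uniform_(0.0, 360.0))
    return img.flatten(), TF.rotate(img, angle).flatten(), angle

x_view, y_view, angle = paired_views(mnist[0][0])  # each view is a 784-d vector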