Decomposing Gaussians with Unknown Covariance

Ameer Dharamshi, Anna Neufeld, Lucy L. Gao, Jacob Bien, Daniela Witten
arXiv - STAT - Methodology. Published 2024-09-17. DOI: arxiv-2409.11497 (https://doi.org/arxiv-2409.11497)
Citation count: 0

Abstract

Common workflows in machine learning and statistics rely on the ability to partition the information in a data set into independent portions. Recent work has shown that this may be possible even when conventional sample splitting is not (e.g., when the number of samples $n=1$, or when observations are not independent and identically distributed). However, the approaches that are currently available to decompose multivariate Gaussian data require knowledge of the covariance matrix. In many important problems (such as in spatial or longitudinal data analysis, and graphical modeling), the covariance matrix may be unknown and even of primary interest. Thus, in this work we develop new approaches to decompose Gaussians with unknown covariance. First, we present a general algorithm that encompasses all previous decomposition approaches for Gaussian data as special cases, and can further handle the case of an unknown covariance. It yields a new and more flexible alternative to sample splitting when $n>1$. When $n=1$, we prove that it is impossible to partition the information in a multivariate Gaussian into independent portions without knowing the covariance matrix. Thus, we use the general algorithm to decompose a single multivariate Gaussian with unknown covariance into dependent parts with tractable conditional distributions, and demonstrate their use for inference and validation. The proposed decomposition strategy extends naturally to Gaussian processes. In simulation and on electroencephalography data, we apply these decompositions to the tasks of model selection and post-selection inference in settings where alternative strategies are unavailable.
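To illustrate why existing decompositions need the covariance matrix, here is a minimal sketch of a standard known-covariance split. It is not the algorithm proposed in the paper. It assumes $X \sim N(\mu, \Sigma)$ with $\Sigma$ known, draws independent noise $Z \sim N(0, \Sigma)$, and forms $X^{(1)} = X + Z$ and $X^{(2)} = X - Z$. Since $\mathrm{Cov}(X + Z, X - Z) = \Sigma - \Sigma = 0$ and the two parts are jointly Gaussian, they are independent. The function name `split_known_cov` and the values of `mu` and `sigma` are chosen purely for illustration.

```python
import numpy as np


def split_known_cov(x, sigma, rng):
    """Split each row of x ~ N(mu, sigma) into two independent Gaussian parts.

    Illustrative only: the noise must be drawn with the SAME covariance
    as the data, so sigma has to be known in advance.
    """
    n, p = x.shape
    # Independent noise Z ~ N(0, sigma), one draw per observation.
    z = rng.multivariate_normal(np.zeros(p), sigma, size=n)
    return x + z, x - z


rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])                    # hypothetical mean
sigma = np.array([[2.0, 0.6], [0.6, 1.0]])    # hypothetical known covariance

# Many replicates, used only to check independence empirically.
draws = rng.multivariate_normal(mu, sigma, size=20000)
x1, x2 = split_known_cov(draws, sigma, rng)

# Empirical cross-covariance between the two parts; should be near zero.
cross_cov = (x1 - x1.mean(axis=0)).T @ (x2 - x2.mean(axis=0)) / (len(draws) - 1)
print(np.round(cross_cov, 3))
```

If the noise were drawn with a covariance other than $\Sigma$, the cross-covariance would no longer cancel and the two parts would be dependent, which is the difficulty the abstract describes for the unknown-covariance setting.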
Latest articles in this journal
Poisson approximate likelihood compared to the particle filter
Optimising the Trade-Off Between Type I and Type II Errors: A Review and Extensions
Bias Reduction in Matched Observational Studies with Continuous Treatments: Calipered Non-Bipartite Matching and Bias-Corrected Estimation and Inference
Forecasting age distribution of life-table death counts via α-transformation
Probability-scale residuals for event-time data