Sparse Functional Principal Component Analysis in High Dimensions

IF 1.5, Zone 3 (Mathematics), Q2 STATISTICS & PROBABILITY, Statistica Sinica, Pub Date: 2023-01-01, DOI: 10.5705/ss.202020.0445

Xiaoyu Hu, Fang Yao


Citations: 3

Abstract



Functional principal component analysis (FPCA) is a fundamental tool and has attracted increasing attention in recent decades, while existing methods are restricted to data with a single or finite number of random functions (much smaller than the sample size $n$). In this work, we focus on high-dimensional functional processes where the number of random functions $p$ is comparable to, or even much larger than, $n$. Such data are ubiquitous in various fields such as neuroimaging analysis, and cannot be properly modeled by existing methods. We propose a new algorithm, called sparse FPCA, which is able to model principal eigenfunctions effectively under sensible sparsity regimes. While sparsity assumptions are standard in multivariate statistics, they have not been investigated in the complex context where not only is $p$ large, but each variable itself is also an intrinsically infinite-dimensional process. The sparsity structure motivates a thresholding rule that is easy to compute without nonparametric smoothing by exploiting the relationship between univariate orthonormal basis expansions and multivariate Karhunen-Loève (K-L) representations. We investigate the theoretical properties of the resulting estimators, and illustrate the performance with simulated and real data examples.
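To make the thresholding idea concrete, the following is a minimal sketch and not the authors' algorithm. It assumes every curve is observed without gaps on a common, equally spaced midpoint grid on [0, 1], uses a cosine basis, and replaces the paper's thresholding rule with a simple stand-in: keep the `n_keep` variables whose basis coefficients have the largest total variance. The names `cosine_basis`, `sparse_fpca_sketch`, `n_basis` and `n_keep` are hypothetical and introduced only for illustration.

```python
import numpy as np

def cosine_basis(n_basis, grid):
    """Orthonormal cosine basis on [0, 1], evaluated on grid: shape (n_basis, T)."""
    basis = np.ones((n_basis, grid.size))
    for k in range(1, n_basis):
        basis[k] = np.sqrt(2.0) * np.cos(k * np.pi * grid)
    return basis

def sparse_fpca_sketch(X, grid, n_basis=5, n_keep=10, n_components=1):
    """Illustrative sparse FPCA on X of shape (n, p, T): n samples of p curves.

    Assumption: grid is an equally spaced midpoint grid on [0, 1].
    """
    n, p, T = X.shape
    basis = cosine_basis(n_basis, grid)
    dt = grid[1] - grid[0]

    # Univariate basis coefficients via a Riemann sum: shape (n, p, n_basis).
    coef = (X @ basis.T) * dt
    coef = coef - coef.mean(axis=0)

    # Stand-in thresholding step (an assumption, not the paper's rule):
    # rank variables by the total sample variance of their coefficients.
    energy = coef.var(axis=0).sum(axis=1)
    keep = np.sort(np.argsort(energy)[::-1][:n_keep])

    # Ordinary PCA on the coefficients of the retained variables only.
    Z = coef[:, keep, :].reshape(n, -1)
    evals, evecs = np.linalg.eigh(Z.T @ Z / n)
    order = np.argsort(evals)[::-1][:n_components]

    # Map each eigenvector back to a p-variate eigenfunction on the grid;
    # variables that were thresholded out stay identically zero.
    eigfun = np.zeros((n_components, p, T))
    for j, idx in enumerate(order):
        loadings = evecs[:, idx].reshape(len(keep), n_basis)
        eigfun[j, keep] = loadings @ basis
    return evals[order], eigfun, keep
```

The sketch only illustrates why no nonparametric smoothing is needed in this setting: once each curve is reduced to a few basis coefficients, selection and eigen-decomposition act on an ordinary coefficient matrix, and the eigenvectors are mapped back to functions through the same basis.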


Source journal

Statistica Sinica (Mathematics: Statistics & Probability)

CiteScore: 2.10

Self-citation rate: 0.00%

Articles published: (not given)

Review time: 10.5 months

About the journal: Statistica Sinica aims to meet the needs of statisticians in a rapidly changing world. It provides a forum for the publication of innovative work of high quality in all areas of statistics, including theory, methodology and applications. The journal encourages the development and principled use of statistical methodology that is relevant for society, science and technology.
