An eigenvector-assisted estimation framework for signal-plus-noise matrix models

IF 2.4 · Zone 2 (Mathematics) · JCR Q2 (Biology) · Biometrika · Pub Date: 2023-09-19 · DOI: 10.1093/biomet/asad058
Fangzheng Xie, Dingbo Wu

Abstract

In this paper, we develop an eigenvector-assisted estimation framework for a collection of signal-plus-noise matrix models arising in high-dimensional statistics and many applications. The framework is built upon a novel asymptotically unbiased estimating equation using the leading eigenvectors of the data matrix. However, the estimator obtained by directly solving the estimating equation can be numerically unstable in practice and lacks robustness against model misspecification. We propose instead to use the quasi-posterior distribution obtained by exponentiating a criterion function whose maximizer coincides with the estimating equation estimator. The proposed framework can incorporate heteroskedastic variance information without requiring complete specification of the sampling distribution, and it is robust to potential misspecification of the distribution of the noise matrix. Computationally, the quasi-posterior distribution can be obtained via a Markov chain Monte Carlo sampler, which exhibits greater numerical stability than some existing optimization-based estimators and makes uncertainty quantification straightforward. Under mild regularity conditions, we establish the large-sample properties of the quasi-posterior distributions. In particular, the quasi-posterior credible sets have the correct frequentist nominal coverage probability provided that the criterion function is carefully selected. The validity and usefulness of the proposed framework are demonstrated through the analysis of synthetic datasets and the real-world ENZYMES network datasets.
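To make the idea of a quasi-posterior concrete, the following is a minimal toy sketch, not the authors' estimating equation or criterion function. It assumes a rank-one model A = theta * u u^T + E with symmetric Gaussian noise of known scale, uses the leading eigenvector of A to form a simple quadratic criterion, and draws from the density proportional to exp(criterion) under a flat prior with a random-walk Metropolis sampler. All function names, the criterion, the scale `scale2`, and the tuning values are hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n, theta, sigma, rng):
    """Draw A = theta * u u^T + E, with E symmetric Gaussian noise (toy model)."""
    u = np.ones(n) / np.sqrt(n)              # unit-norm signal direction
    g = rng.normal(scale=sigma, size=(n, n))
    e = (g + g.T) / np.sqrt(2.0)             # symmetric; off-diagonal variance sigma^2
    return theta * np.outer(u, u) + e

def leading_eigenpair(a):
    """Largest eigenvalue of a symmetric matrix and its unit eigenvector."""
    vals, vecs = np.linalg.eigh(a)           # eigenvalues in ascending order
    return vals[-1], vecs[:, -1]

def criterion(theta, stat, scale2):
    """Toy criterion: maximized at theta = stat, where stat = v^T A v."""
    return -0.5 * (stat - theta) ** 2 / scale2

def quasi_posterior_draws(stat, scale2, init, n_iter, step, rng):
    """Random-walk Metropolis targeting exp(criterion) with a flat prior."""
    draws = np.empty(n_iter)
    cur, cur_val = init, criterion(init, stat, scale2)
    for t in range(n_iter):
        prop = cur + step * rng.normal()
        prop_val = criterion(prop, stat, scale2)
        # Accept with probability min(1, exp(prop_val - cur_val)).
        if np.log(rng.uniform()) < prop_val - cur_val:
            cur, cur_val = prop, prop_val
        draws[t] = cur
    return draws

rng = np.random.default_rng(0)
a = simulate(n=100, theta=200.0, sigma=1.0, rng=rng)
lam, v = leading_eigenpair(a)
stat = float(v @ a @ v)                      # equals lam up to rounding
draws = quasi_posterior_draws(stat, scale2=2.0, init=lam,
                              n_iter=6000, step=1.5, rng=rng)
kept = draws[1000:]                          # discard burn-in
lo, hi = np.quantile(kept, [0.025, 0.975])   # 95% quasi-posterior credible interval
print(f"spectral estimate {lam:.2f}, interval ({lo:.2f}, {hi:.2f})")
```

In this toy setting the quasi-posterior is Gaussian and centred at the maximizer of the criterion, so the sampler is unnecessary; it is included only to show how credible intervals fall out of the draws. Whether such intervals have correct frequentist coverage depends on how the criterion is scaled, which is the point the abstract makes about selecting the criterion function carefully.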
Source journal: Biometrika (category: Biology)
CiteScore: 5.50
Self-citation rate: 3.70%
Articles published: 56
Review time: 6-12 weeks
About the journal: Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.