ENTRYWISE EIGENVECTOR ANALYSIS OF RANDOM MATRICES WITH LOW EXPECTED RANK.

Annals of Statistics | Impact factor: 3.2 | Zone 1 (Mathematics) | JCR Q1 (Statistics & Probability) | Pub Date: 2020-06-01 | Epub Date: 2020-07-17 | DOI: 10.1214/19-aos1854
Emmanuel Abbe, Jianqing Fan, Kaizheng Wang, Yiqiao Zhong
Annals of Statistics, vol. 48, no. 3, pp. 1452-1474 (Journal Article). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8046180/pdf/nihms-1053828.pdf

Abstract

Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact recovery in the stochastic block model without any trimming or cleaning steps. The key is a first-order approximation of eigenvectors under the $\ell_\infty$ norm: $u_k \approx \frac{A u_k^*}{\lambda_k^*}$, where $\{u_k\}$ and $\{u_k^*\}$ are eigenvectors of a random matrix $A$ and its expectation $\mathbb{E}A$, respectively. The fact that the approximation is both tight and linear in $A$ facilitates sharp comparisons between $u_k$ and $u_k^*$. In particular, it allows for comparing the signs of $u_k$ and $u_k^*$ even if $\|u_k - u_k^*\|_\infty$ is large. The results are further extended to perturbations of eigenspaces, yielding new $\ell_\infty$-type bounds for synchronization ($\mathbb{Z}_2$-spiked Wigner model) and noisy matrix completion.
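
As a rough numerical illustration of the first-order approximation stated in the abstract, the sketch below simulates a balanced two-block stochastic block model and compares the second eigenvector of A with both the population eigenvector and the linear proxy A u_k^* / lambda_k^*. Everything specific here is an assumption made for illustration and is not taken from the paper: the parameters n, p, q, the helper names, and the choice to allow self-loops so that E A is exactly rank 2 with a closed-form second eigenpair.

```python
import numpy as np


def sbm_adjacency(n, p, q, rng):
    """Sample a symmetric two-block SBM adjacency matrix.

    Assumption for illustration: self-loops are allowed, so that
    E[A] is exactly the rank-2 block matrix of edge probabilities.
    """
    labels = np.repeat([1.0, -1.0], n // 2)           # balanced communities
    probs = np.where(np.outer(labels, labels) > 0, p, q)
    bern = (rng.random((n, n)) < probs).astype(float)
    upper = np.triu(bern)                             # keep diagonal and upper part
    return upper + np.triu(upper, 1).T, labels        # mirror to make A symmetric


def second_eigenpair(M):
    """Return the second-largest eigenvalue of symmetric M and its eigenvector."""
    vals, vecs = np.linalg.eigh(M)                    # eigenvalues in ascending order
    return vals[-2], vecs[:, -2]


rng = np.random.default_rng(0)
n, p, q = 600, 0.5, 0.1                               # illustrative values only
A, labels = sbm_adjacency(n, p, q, rng)

# Population eigenpair of E[A] for the balanced model: u* = labels / sqrt(n),
# lambda* = n (p - q) / 2.
u_star = labels / np.sqrt(n)
lam_star = n * (p - q) / 2

lam, u = second_eigenpair(A)
if u @ u_star < 0:                                    # resolve the global sign ambiguity
    u = -u

linear_proxy = A @ u_star / lam_star                  # first-order approximation of u

err_direct = np.max(np.abs(u - u_star))               # ||u - u*||_inf
err_linear = np.max(np.abs(u - linear_proxy))         # ||u - A u*/lambda*||_inf
exact_recovery = bool(np.array_equal(np.sign(u), labels))

print(f"||u - u*||_inf          = {err_direct:.5f}")
print(f"||u - A u*/lam*||_inf   = {err_linear:.5f}")
print(f"signs recover labels    = {exact_recovery}")
```

In this dense, well-separated setting one would expect the distance to the linear proxy to be noticeably smaller than the distance to u_k^*, and the signs of u_k to match the community labels. The sketch only illustrates the statement; it does not reproduce the paper's analysis or its sparse regime.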

Source journal
Annals of Statistics (Mathematics: Statistics & Probability)
CiteScore: 9.30
Self-citation rate: 8.90%
Articles published: 119
Review time: 6-12 weeks
Journal description: The Annals of Statistics aims to publish research papers of the highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied and interdisciplinary statistics. Many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics and in substantive scientific fields.