Minimax optimal regression over Sobolev spaces via Laplacian Eigenmaps on neighbourhood graphs

IF 1.4 | Zone 4, Mathematics | Q2, MATHEMATICS, APPLIED | Information and Inference-A Journal of the Ima | Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad034

Alden Green, Sivaraman Balakrishnan, Ryan J Tibshirani


Citations: 4


Minimax optimal regression over Sobolev spaces via Laplacian Eigenmaps on neighbourhood graphs

Abstract: In this paper, we study the statistical properties of Principal Components Regression with Laplacian Eigenmaps (PCR-LE), a method for non-parametric regression based on Laplacian Eigenmaps (LE). PCR-LE works by projecting a vector of observed responses ${\textbf Y} = (Y_1,\ldots,Y_n)$ onto a subspace spanned by certain eigenvectors of a neighbourhood graph Laplacian. We show that PCR-LE achieves minimax rates of convergence for random design regression over Sobolev spaces. Under sufficient smoothness conditions on the design density $p$, PCR-LE achieves the optimal rates for both estimation (where the optimal rate in squared $L^2$ norm is known to be $n^{-2s/(2s + d)}$) and goodness-of-fit testing ($n^{-4s/(4s + d)}$). We also consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s + m)}$) and testing ($n^{-4s/(4s + m)}$) rates of convergence. Interestingly, these rates are almost always much faster than the known rates of convergence of graph Laplacian eigenvectors to their population-level limits; in other words, for this problem, regression with estimated features appears to be much easier, statistically speaking, than estimating the features themselves. We support these theoretical results with empirical evidence.
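The projection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the graph construction, the Laplacian normalization, or which eigenvectors are used, so the choices below are assumptions made for illustration only: an unweighted neighbourhood graph with a hypothetical `radius` parameter, the unnormalized Laplacian $L = D - W$, and the eigenvectors with the `n_components` smallest eigenvalues. The function name `pcr_le_fit` is likewise hypothetical and not taken from the paper.

```python
import numpy as np


def pcr_le_fit(X, Y, radius, n_components):
    """Project Y onto the span of graph-Laplacian eigenvectors.

    Assumptions (not stated in the abstract): unweighted radius graph,
    unnormalized Laplacian, eigenvectors of the smallest eigenvalues.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Pairwise squared Euclidean distances between design points.
    diffs = X[:, None, :] - X[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    # Adjacency: connect distinct points that lie within `radius`.
    W = (sq_dists <= radius ** 2).astype(float)
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W (symmetric, PSD).
    L = np.diag(W.sum(axis=1)) - W

    # eigh returns eigenvalues in ascending order with orthonormal
    # eigenvectors as columns; keep the first n_components columns.
    _, eigvecs = np.linalg.eigh(L)
    V = eigvecs[:, :n_components]

    # Orthogonal projection of the responses onto span(V).
    return V @ (V.T @ Y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(200, 1))
    Y = np.sin(2 * np.pi * X[:, 0]) + 0.3 * rng.normal(size=200)
    fitted = pcr_le_fit(X, Y, radius=0.1, n_components=8)
    print(fitted.shape)
```

In this sketch `n_components` plays the role of a smoothing parameter: keeping every eigenvector reproduces the responses exactly, while keeping only the first one (on a connected graph) returns their mean at every design point.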


Source journal

Information and Inference-A Journal of the Ima Multiple-

CiteScore

3.90

Self-citation rate

0.00%

Number of publications