
Latest Publications in Annals of Statistics

Post-selection inference via algorithmic stability
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2303
Tijana Zrnic, Michael I. Jordan
When the target of statistical inference is chosen in a data-driven manner, the guarantees provided by classical theories vanish. We propose a solution to the problem of inference after selection by building on the framework of algorithmic stability, in particular its branch with origins in the field of differential privacy. Stability is achieved via randomization of selection and it serves as a quantitative measure that is sufficient to obtain nontrivial post-selection corrections for classical confidence intervals. Importantly, the underpinnings of algorithmic stability translate directly into computational efficiency—our method computes simple corrections for selective inference without recourse to Markov chain Monte Carlo sampling.
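As a rough illustration of the randomized-selection idea described in the abstract, the sketch below selects the largest sample mean after adding Gaussian noise and then reports an additively widened confidence interval. The noise scale `tau` and the correction `delta` are illustrative assumptions, not the paper's calibrated stability correction.

```python
# Minimal sketch (not the authors' exact procedure): randomized selection of the
# largest sample mean, followed by an additively widened confidence interval.
import numpy as np

rng = np.random.default_rng(0)

def select_and_infer(X, sigma=1.0, tau=0.5):
    """X: (n, k) array of n observations on k candidate means."""
    n, k = X.shape
    means = X.mean(axis=0)
    # Randomize the selection step: add Gaussian noise before taking the argmax.
    j = int(np.argmax(means + rng.normal(scale=tau, size=k)))
    # Classical 95% CI half-width for the selected coordinate ...
    half = 1.96 * sigma / np.sqrt(n)
    # ... plus an additive post-selection correction (illustrative choice only).
    delta = sigma * np.sqrt(2 * np.log(k)) / (tau * n)
    return j, (means[j] - half - delta, means[j] + half + delta)

X = rng.normal(size=(200, 10))
print(select_and_infer(X))
```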
Citations: 2
Statistical inference on a changing extreme value dependence structure
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2314
Holger Drees
We analyze the extreme value dependence of independent, not necessarily identically distributed multivariate regularly varying random vectors. More specifically, we propose estimators of the spectral measure locally at some time point and of the spectral measures integrated over time. The uniform asymptotic normality of these estimators is proved under suitable nonparametric smoothness and regularity assumptions. We then use the process convergence of the integrated spectral measure to devise consistent tests for the null hypothesis that the spectral measure does not change over time.
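The following toy sketch conveys the idea of a local-in-time empirical spectral (angular) measure: localize around a time point with a kernel, keep observations with large norm, and record their angles. It is my own simplification with arbitrary kernel, threshold, and bandwidth choices, not the paper's estimator.

```python
# Illustrative local angular measure for bivariate data (assumptions mine).
import numpy as np

def local_angular_measure(X, times, t0, bandwidth=0.1, quantile=0.95):
    """X: (n, d) observations; times: values in [0, 1]; returns weighted angles."""
    norms = np.linalg.norm(X, axis=1)
    threshold = np.quantile(norms, quantile)
    # Epanechnikov-type kernel weights localizing around time t0.
    u = (times - t0) / bandwidth
    w = np.maximum(1 - u**2, 0.0)
    keep = norms > threshold
    angles = X[keep] / norms[keep, None]          # points on the unit sphere
    weights = w[keep] / max(w[keep].sum(), 1e-12)
    return angles, weights                        # a weighted sample from the angular measure

rng = np.random.default_rng(1)
n = 2000
times = np.linspace(0, 1, n)
X = rng.standard_t(df=3, size=(n, 2)) * (1 + times)[:, None]  # mild nonstationarity
print(local_angular_measure(X, times, t0=0.5)[0][:3])
```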
Citations: 2
Bridging factor and sparse models
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2304
Jianqing Fan, Ricardo Masini, Marcelo C. Medeiros
Factor and sparse models are widely used to impose a low-dimensional structure in high dimensions. However, they are seemingly mutually exclusive. We propose a lifting method that combines the merits of these two models in a supervised learning methodology that allows for efficiently exploring all the information in high-dimensional datasets. The method is based on a flexible model for high-dimensional panel data with observable and/or latent common factors and idiosyncratic components. The model is called the factor-augmented regression model. It includes principal components and sparse regression as specific models, significantly weakens the cross-sectional dependence, and facilitates model selection and interpretability. The method consists of several steps and a novel test for (partial) covariance structure in high dimensions to infer the remaining cross-sectional dependence at each step. We develop the theory for the model and demonstrate the validity of the multiplier bootstrap for testing a high-dimensional (partial) covariance structure. A simulation study and applications support the theory.
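A minimal sketch of the factor-plus-sparse idea under assumed simplifications: PCA factors first, then a lasso on the estimated idiosyncratic part. The dimensions, penalty level, and the two-stage fit are illustrative choices rather than the paper's procedure.

```python
# Rough sketch of a factor-augmented sparse regression (assumed workflow).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(2)
n, p, r = 400, 100, 3
F = rng.normal(size=(n, r))                        # latent common factors
Lambda = rng.normal(size=(p, r))
U = rng.normal(scale=0.5, size=(n, p))             # idiosyncratic components
X = F @ Lambda.T + U
beta = np.zeros(p); beta[:5] = 1.0                 # sparse signal on the idiosyncratic part
y = F @ np.array([1.0, -1.0, 0.5]) + U @ beta + rng.normal(scale=0.5, size=n)

pca = PCA(n_components=r).fit(X)
F_hat = pca.transform(X)                           # estimated factors
U_hat = X - pca.inverse_transform(F_hat)           # estimated idiosyncratic part

stage1 = LinearRegression().fit(F_hat, y)          # regression on the factors
stage2 = Lasso(alpha=0.05).fit(U_hat, y - stage1.predict(F_hat))
print("nonzero idiosyncratic coefficients:", np.flatnonzero(stage2.coef_))
```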
Citations: 0
Projected state-action balancing weights for offline reinforcement learning
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2302
Jiayi Wang, Zhengling Qi, Raymond K. W. Wong
Off-policy evaluation is considered a fundamental and challenging problem in reinforcement learning (RL). This paper focuses on value estimation of a target policy based on pre-collected data generated from a possibly different policy, under the framework of infinite-horizon Markov decision processes. Motivated by the recently developed marginal importance sampling method in RL and the covariate balancing idea in causal inference, we propose a novel estimator with approximately projected state-action balancing weights for the policy value estimation. We obtain the convergence rate of these weights, and show that the proposed value estimator is asymptotically normal under technical conditions. In terms of asymptotics, our results scale with both the number of trajectories and the number of decision points at each trajectory. As such, consistency can still be achieved with a limited number of subjects when the number of decision points diverges. In addition, we develop a necessary and sufficient condition for establishing the well-posedness of the operator that relates to the nonparametric Q-function estimation in the off-policy setting, which characterizes the difficulty of Q-function estimation and may be of independent interest. Numerical experiments demonstrate the promising performance of our proposed estimator.
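As a very loose sketch of the balancing-weights idea (not the paper's projected estimator; the feature map, ridge term, and weight normalization are assumptions made for illustration), one can choose weights so that the weighted logged features match a target-policy feature average and then average the rewards with those weights.

```python
# Highly simplified balancing-weights off-policy value estimate (assumed form).
import numpy as np

def balancing_weight_ope(phi_logged, phi_target_mean, rewards, ridge=1e-3):
    """phi_logged: (n, d) features of logged (state, action) pairs;
    phi_target_mean: (d,) feature average under the target policy."""
    n, d = phi_logged.shape
    # Solve min_w ||phi_logged.T @ w / n - phi_target_mean||^2 + ridge * ||w||^2.
    A = phi_logged @ phi_logged.T / n**2 + ridge * np.eye(n)
    b = phi_logged @ phi_target_mean / n
    w = np.linalg.solve(A, b)
    return float(w @ rewards / n)            # weighted average of logged rewards

rng = np.random.default_rng(3)
n, d = 500, 5
phi_logged = rng.normal(size=(n, d))
phi_target_mean = rng.normal(size=d)
rewards = phi_logged @ np.ones(d) + rng.normal(scale=0.1, size=n)
print(balancing_weight_ope(phi_logged, phi_target_mean, rewards))
```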
Citations: 6
A cross-validation framework for signal denoising with applications to trend filtering, dyadic CART and beyond
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2283
Anamitra Chaudhuri, Sabyasachi Chatterjee
This paper formulates a general cross-validation framework for signal denoising. The general framework is then applied to nonparametric regression methods such as trend filtering and dyadic CART. The resulting cross-validated versions are shown to attain nearly the same rates of convergence as are known for the optimally tuned analogues. There were no previous theoretical analyses of cross-validated versions of trend filtering or dyadic CART. To illustrate the generality of the framework, we also propose and study cross-validated versions of two fundamental estimators: lasso for high-dimensional linear regression and singular value thresholding for matrix estimation. Our general framework is inspired by the ideas in Chatterjee and Jafarov (2015) and is potentially applicable to a wide range of estimation methods which use tuning parameters.
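A toy rendering of the cross-validation recipe with a deliberately simple denoiser (a moving average standing in for trend filtering or dyadic CART): fit candidate denoisers on the odd-indexed samples, score them on the even-indexed samples by interpolation, and pick the tuning parameter. All concrete choices below are assumptions for illustration.

```python
# Toy odd/even-split cross-validation for signal denoising (not the paper's procedure).
import numpy as np

def moving_average(y, k):
    kernel = np.ones(k) / k
    return np.convolve(y, kernel, mode="same")

def cv_window(y, windows=(3, 5, 9, 17, 33)):
    idx = np.arange(len(y))
    odd, even = idx[1::2], idx[::2]
    best, best_err = None, np.inf
    for k in windows:
        fit = moving_average(y[odd], k)          # denoise the odd half
        pred = np.interp(even, odd, fit)         # interpolate onto the even half
        err = np.mean((pred - y[even]) ** 2)
        if err < best_err:
            best, best_err = k, err
    return best

rng = np.random.default_rng(4)
x = np.linspace(0, 1, 500)
y = np.sin(4 * np.pi * x) + rng.normal(scale=0.3, size=x.size)
print("selected window:", cv_window(y))
```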
Citations: 0
Relaxing the i.i.d. assumption: Adaptively minimax optimal regret via root-entropic regularization
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2315
Blair Bilodeau, Jeffrey Negrea, Daniel M. Roy
We consider prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set. This semi-adversarial setting includes (at the extremes) the classical i.i.d. setting, when the unknown constraint set is restricted to be a singleton, and the unconstrained adversarial setting, when the constraint set is the set of all distributions. The Hedge algorithm—long known to be minimax (rate) optimal in the adversarial regime—was recently shown to be simultaneously minimax optimal for i.i.d. data. In this work, we propose to relax the i.i.d. assumption by seeking adaptivity at all levels of a natural ordering on constraint sets. We provide matching upper and lower bounds on the minimax regret at all levels, show that Hedge with deterministic learning rates is suboptimal outside of the extremes and prove that one can adaptively obtain minimax regret at all levels. We achieve this optimal adaptivity using the follow-the-regularized-leader (FTRL) framework, with a novel adaptive regularization scheme that implicitly scales as the square root of the entropy of the current predictive distribution, rather than the entropy of the initial predictive distribution. Finally, we provide novel technical tools to study the statistical performance of FTRL along the semi-adversarial spectrum.
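The sketch below runs exponential weights (Hedge) with a learning rate tied to the entropy of the current prediction. This is only a loose stand-in for the paper's root-entropic FTRL regularization; the specific scaling `sqrt(entropy / t)` is my assumption, not the authors' scheme.

```python
# Hedge with an entropy-adapted learning rate (illustrative variant, assumptions mine).
import numpy as np

def hedge_root_entropy(losses):
    """losses: (T, K) per-round expert losses in [0, 1]; returns the learner's total loss."""
    T, K = losses.shape
    cum = np.zeros(K)                  # cumulative expert losses
    total = 0.0
    eta = np.sqrt(np.log(K))           # initial learning rate
    for t in range(1, T + 1):
        w = np.exp(-eta * (cum - cum.min()))
        p = w / w.sum()                # current predictive distribution
        total += float(p @ losses[t - 1])
        cum += losses[t - 1]
        # Adapt the learning rate via the entropy of the current prediction
        # (illustrative stand-in for the root-entropic regularizer).
        entropy = -np.sum(p * np.log(np.clip(p, 1e-12, None)))
        eta = np.sqrt(max(entropy, 1e-12) / t)
    return total

rng = np.random.default_rng(5)
losses = rng.uniform(size=(1000, 10))
print(hedge_root_entropy(losses), losses.sum(axis=0).min())  # learner vs. best expert
```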
Citations: 7
Graphical models for nonstationary time series
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/22-aos2205
Sumanta Basu, Suhasini Subba Rao
We propose NonStGM, a general nonparametric graphical modeling framework, for studying dynamic associations among the components of a nonstationary multivariate time series. It builds on the framework of Gaussian graphical models (GGM) and stationary time series graphical models (StGM) and complements existing works on parametric graphical models based on change point vector autoregressions (VAR). Analogous to StGM, the proposed framework captures conditional noncorrelations (both intertemporal and contemporaneous) in the form of an undirected graph. In addition, to describe the more nuanced nonstationary relationships among the components of the time series, we introduce the new notion of conditional nonstationarity/stationarity and incorporate it within the graph. This can be used to search for small subnetworks that serve as the “source” of nonstationarity in a large system. We explicitly connect conditional noncorrelation and stationarity between and within components of the multivariate time series to zero and Toeplitz embeddings of an infinite-dimensional inverse covariance operator. In the Fourier domain, conditional stationarity and noncorrelation relationships in the inverse covariance operator are encoded with a specific sparsity structure of its integral kernel operator. We show that these sparsity patterns can be recovered from finite-length time series by nodewise regression of discrete Fourier transforms (DFT) across different Fourier frequencies. We demonstrate the feasibility of learning NonStGM structure from data using simulation studies.
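A crude illustration of the nodewise DFT-regression idea (the penalty level, threshold, and toy data-generating model are my assumptions; this is not the authors' full NonStGM procedure): regress each component's stacked real and imaginary DFT coefficients on those of the other components with a lasso and read off candidate edges.

```python
# Nodewise lasso regression of DFT coefficients to suggest graph edges (illustrative).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, d = 1024, 4
X = rng.normal(size=(n, d))
X[:, 1] += 0.8 * X[:, 0]                       # components 0 and 1 are dependent

dft = np.fft.rfft(X, axis=0)                   # (n_freq, d) complex DFT coefficients
Z = np.vstack([dft.real, dft.imag])            # stack real and imaginary parts

edges = set()
for j in range(d):
    others = [k for k in range(d) if k != j]
    fit = Lasso(alpha=50.0).fit(Z[:, others], Z[:, j])
    for k, coef in zip(others, fit.coef_):
        if abs(coef) > 1e-8:
            edges.add(tuple(sorted((j, k))))
print("suggested edges:", edges)
```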
Citations: 2
Learning low-dimensional nonlinear structures from high-dimensional noisy data: An integral operator approach
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2306
Xiucai Ding, Rong Ma
We propose a kernel-spectral embedding algorithm for learning low-dimensional nonlinear structures from noisy and high-dimensional observations, where the data sets are assumed to be sampled from a nonlinear manifold model and corrupted by high-dimensional noise. The algorithm employs an adaptive bandwidth selection procedure which does not rely on prior knowledge of the underlying manifold. The obtained low-dimensional embeddings can be further utilized for downstream purposes such as data visualization, clustering and prediction. Our method is theoretically justified and practically interpretable. Specifically, for a general class of kernel functions, we establish the convergence of the final embeddings to their noiseless counterparts when the dimension grows polynomially with the size, and characterize the effect of the signal-to-noise ratio on the rate of convergence and phase transition. We also prove the convergence of the embeddings to the eigenfunctions of an integral operator defined by the kernel map of some reproducing kernel Hilbert space capturing the underlying nonlinear structures. Our results hold even when the dimension of the manifold grows with the sample size. Numerical simulations and analysis of real data sets show the superior empirical performance of the proposed method, compared to many existing methods, on learning various nonlinear manifolds in diverse applications.
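A bare-bones kernel-spectral embedding sketch, using a generic median-distance bandwidth in place of the paper's adaptive bandwidth selection; all constants and the toy manifold below are illustrative assumptions.

```python
# Generic kernel-spectral embedding recipe (not the paper's adaptive procedure).
import numpy as np

def kernel_spectral_embedding(X, n_components=2):
    sq = np.sum(X**2, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2 * X @ X.T          # squared pairwise distances
    h = np.median(np.sqrt(np.maximum(D2, 0)))             # simple bandwidth heuristic
    K = np.exp(-D2 / (2 * h**2))                          # Gaussian kernel matrix
    vals, vecs = np.linalg.eigh(K)                        # ascending eigenvalues
    return vecs[:, -n_components:] * np.sqrt(np.maximum(vals[-n_components:], 0))

rng = np.random.default_rng(7)
theta = rng.uniform(0, 2 * np.pi, size=300)
circle = np.c_[np.cos(theta), np.sin(theta)]              # 1-dim manifold in 2-d
X = np.hstack([circle, np.zeros((300, 48))]) + 0.05 * rng.normal(size=(300, 50))
print(kernel_spectral_embedding(X).shape)
```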
Citations: 0
Noisy linear inverse problems under convex constraints: Exact risk asymptotics in high dimensions
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2301
Qiyang Han
In the standard Gaussian linear measurement model Y = Xμ0 + ξ ∈ R^m with a fixed noise level σ > 0, we consider the problem of estimating the unknown signal μ0 under a convex constraint μ0 ∈ K, where K is a closed convex set in R^n. We show that the risk of the natural convex constrained least squares estimator (LSE) μ̂(σ) can be characterized exactly in high-dimensional limits, by that of the convex constrained LSE μ̂_K^seq in the corresponding Gaussian sequence model at a different noise level. Formally, we show that ‖μ̂(σ) − μ0‖² / (n r_n²) → 1 in probability, where r_n² > 0 solves the fixed-point equation E‖μ̂_K^seq(√((r_n² + σ²)/(m/n))) − μ0‖² = n r_n². This characterization holds (uniformly) for risks r_n² in the maximal regime that ranges from constant order all the way down to essentially the parametric rate, as long as a certain necessary nondegeneracy condition is satisfied for μ̂(σ). The precise risk characterization reveals a fundamental difference between noiseless (or low noise limit) and noisy linear inverse problems in terms of the sample complexity for signal recovery. A concrete example is given by the isotonic regression problem: While exact recovery of a general monotone signal requires m ≫ n^{1/3} samples in the noiseless setting, consistent signal recovery in the noisy setting requires as few as m ≫ log n samples. Such a discrepancy occurs when the low and high noise risk behaviors of μ̂_K^seq differ significantly. In statistical language, this occurs when μ̂_K^seq estimates 0 at a faster “adaptation rate” than the slower “worst-case rate” for general signals. Several other examples, including nonnegative least squares and generalized Lasso (in constrained forms), are also worked out to demonstrate the concrete applicability of the theory in problems of different types. The proof relies on a collection of new analytic and probabilistic results concerning estimation error, log likelihood ratio test statistics and degrees of freedom associated with μ̂_K^seq, regarded as stochastic processes indexed by the noise level. These results are of independent interest in and of themselves.
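For concreteness, here is a minimal numerical sketch of the convex constrained LSE in one of the worked examples, nonnegative least squares. The problem sizes, signal, and noise level are arbitrary illustration choices, and no attempt is made to solve the fixed-point equation.

```python
# Constrained LSE over K = {mu >= 0} in the Gaussian linear measurement model.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(8)
m, n, sigma = 300, 100, 1.0
X = rng.normal(size=(m, n))
mu0 = np.zeros(n); mu0[:10] = 2.0                 # nonnegative true signal
Y = X @ mu0 + sigma * rng.normal(size=m)

mu_hat, _ = nnls(X, Y)                            # nonnegative least squares estimator
print("squared error per coordinate:", np.sum((mu_hat - mu0) ** 2) / n)
```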
Citations: 3
Universality of regularized regression estimators in high dimensions
Tier 1 (Mathematics) | Q1 STATISTICS & PROBABILITY | Pub Date: 2023-08-01 | DOI: 10.1214/23-aos2309
Qiyang Han, Yandi Shen
The Convex Gaussian Min–Max Theorem (CGMT) has emerged as a prominent theoretical tool for analyzing the precise stochastic behavior of various statistical estimators in the so-called high-dimensional proportional regime, where the sample size and the signal dimension are of the same order. However, a well-recognized limitation of the existing CGMT machinery rests in its stringent requirement on the exact Gaussianity of the design matrix, therefore rendering the obtained precise high-dimensional asymptotics largely a specific Gaussian theory in various important statistical models. This paper provides a structural universality framework for a broad class of regularized regression estimators that is particularly compatible with the CGMT machinery. Here, universality means that if a “structure” is satisfied by the regression estimator μ̂_G for a standard Gaussian design G, then it will also be satisfied by μ̂_A for a general non-Gaussian design A with independent entries. In particular, we show that with a good enough ℓ_∞ bound for the regression estimator μ̂_A, any “structural property” that can be detected via the CGMT for μ̂_G also holds for μ̂_A under a general design A with independent entries. As a proof of concept, we demonstrate our new universality framework in three key examples of regularized regression estimators: the Ridge, Lasso and regularized robust regression estimators, where new universality properties of risk asymptotics and/or distributions of regression estimators and other related quantities are proved. As a major statistical implication of the Lasso universality results, we validate inference procedures using the degrees-of-freedom adjusted debiased Lasso under general design and error distributions. We also provide a counterexample, showing that universality properties for regularized regression estimators do not extend to general isotropic designs. The proof of our universality results relies on new comparison inequalities for the optimum of a broad class of cost functions and Gordon's max–min (or min–max) costs, over arbitrary structure sets subject to ℓ_∞ constraints. These results may be of independent interest and broader applicability.
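The sketch below gives a quick empirical illustration of design universality for the Lasso, comparing matched Gaussian and Rademacher designs. The dimensions, sparsity, penalty, and noise level are illustrative assumptions rather than settings taken from the paper.

```python
# Empirical comparison of Lasso risk under Gaussian vs. Rademacher designs (illustrative).
import numpy as np
from sklearn.linear_model import Lasso

def lasso_risk(design_sampler, n=400, p=200, reps=20, alpha=0.1, seed=0):
    rng = np.random.default_rng(seed)
    beta = np.zeros(p); beta[:10] = 1.0            # sparse signal
    risks = []
    for _ in range(reps):
        A = design_sampler(rng, n, p)
        y = A @ beta + 0.5 * rng.normal(size=n)
        b = Lasso(alpha=alpha, fit_intercept=False).fit(A, y).coef_
        risks.append(np.sum((b - beta) ** 2))
    return float(np.mean(risks))

gaussian = lambda rng, n, p: rng.normal(size=(n, p))
rademacher = lambda rng, n, p: rng.choice([-1.0, 1.0], size=(n, p))
print("Gaussian design:  ", lasso_risk(gaussian))
print("Rademacher design:", lasso_risk(rademacher))
```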
Citations: 0