
Electronic Journal of Statistics: Latest Articles

Least sum of squares of trimmed residuals regression
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2164
Yijun Zuo, Hanwen Zuo
In the famous least sum of trimmed squares (LTS) estimator [21], residuals are first squared and then trimmed. In this article, we first trim residuals, using a depth trimming scheme, and then square the remaining residuals. The estimator that minimizes the sum of trimmed and squared residuals is called an LST estimator. Not only is the LST a robust alternative to the classic least sum of squares (LS) estimator, it also has a high finite sample breakdown point and can resist, asymptotically, up to 50% contamination without breakdown, in sharp contrast to the 0% of the LS estimator. The population version of the LST is Fisher consistent, and the sample version is strongly and root-n consistent and asymptotically normal. We propose approximate algorithms for computing the LST and test them on synthetic and real data sets. Despite being approximate, one of the algorithms computes the LST estimator quickly with relatively small variances, in contrast to the famous LTS estimator. Thus, evidence suggests the LST serves as a robust alternative to the LS estimator and is feasible even in high-dimensional data sets with contamination and outliers.
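The difference in the order of operations between LTS and LST can be sketched in a few lines of Python. This is only an illustration: the abstract says "a depth trimming scheme" without spelling it out, so the median/MAD outlyingness rule, the cutoff `alpha`, and the function names below are assumptions, not the authors' implementation.

```python
from statistics import median


def lts_objective(residuals, h):
    """LTS: square every residual first, then keep the h smallest squares."""
    squared = sorted(r * r for r in residuals)
    return sum(squared[:h])


def lst_objective(residuals, alpha):
    """LST-style: trim residuals first, then square the ones that remain.

    Assumption: a residual is trimmed when |r - median| / MAD exceeds alpha.
    The abstract does not state the trimming rule; this one is illustrative.
    """
    med = median(residuals)
    mad = median(abs(r - med) for r in residuals)
    if mad == 0:
        # Degenerate case: at least half the residuals equal the median.
        kept = [r for r in residuals if r == med]
    else:
        kept = [r for r in residuals if abs(r - med) / mad <= alpha]
    return sum(r * r for r in kept)


# Four small residuals and one gross outlier.
residuals = [0.1, -0.2, 0.3, -0.1, 8.0]
print(lts_objective(residuals, h=4))
print(lst_objective(residuals, alpha=3.0))
```

In LTS the number of retained residuals `h` is fixed in advance, whereas in this sketch the number retained depends on how many residuals fall within the outlyingness cutoff.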
Citations: 5
Corrigendum to “Maximum likelihood estimation in logistic regression models with a diverging number of covariates”
IF 1.1 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/12-EJS731
Hua Liang, Pang Du
Binary data with high-dimensional covariates have become more and more common in many disciplines. In this paper we consider maximum likelihood estimation for logistic regression models with a diverging number of covariates. Under mild conditions we establish the asymptotic normality of the maximum likelihood estimate when the number of covariates p goes to infinity with the sample size n in the order of p = o(n). This remarkably improves the existing results, which only allow p to grow in an order of o(n^α) with α ∈ [1/5, 1/2] [12, 14]. A major innovation in our proof is the use of the injective function. AMS 2000 subject classifications: Primary 62F12; secondary 62J12.
Citations: 14
Sieve estimation of semiparametric accelerated mean models with panel count data
IF 1.1 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2128
Xiangbin Hu, Wen Su, Xingqiu Zhao
Citations: 0
Nonregular designs from Paley’s Hadamard matrices: Generalized resolution, projectivity and hidden projection property
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2148
Guanzhou Chen, Chenlu Shi, Boxin Tang
Nonregular designs are attractive, as compared with regular designs, not just because they have flexible run sizes but also because of their performance in terms of generalized resolution, projectivity, and hidden projection property. In this paper, we conduct a comprehensive study on three classes of designs that are obtained from Paley’s two constructions of Hadamard matrices. In terms of generalized resolution, we complete the study of Shi and Tang [15] on strength-two designs by adding results on strength-three designs. In terms of projectivity and hidden projection property, our results substantially expand those of Bulutoglu and Cheng [2]. For the purpose of practical applications, we conduct an extensive search of minimum G-aberration designs from those with maximum generalized resolutions, and results are obtained for strength-two designs with 36, 44, 48, 52, 60, 64, 96 and 128 runs and strength-three designs with 72, 88 and 120 runs.
Citations: 0
Uncertainty quantification for sparse spectral variational approximations in Gaussian process regression
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2155
Dennis Nieman, Botond Szabo, Harry van Zanten
We investigate the frequentist guarantees of the variational sparse Gaussian process regression model. In the theoretical analysis, we focus on the variational approach with spectral features as inducing variables. We derive guarantees and limitations for the frequentist coverage of the resulting variational credible sets. We also derive sufficient and necessary lower bounds for the number of inducing variables required to achieve minimax posterior contraction rates. The implications of these results are demonstrated for different choices of priors. In a numerical analysis we consider a wider range of inducing variable methods and observe similar phenomena beyond the scope of our theoretical findings.
Citations: 2
Regression analysis of mixed sparse synchronous and asynchronous longitudinal covariates with varying-coefficient models
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2175
Congmin Liu, Zhuowei Sun, Hongyuan Cao
We consider varying-coefficient models for mixed synchronous and asynchronous longitudinal covariates, where asynchronicity refers to the misalignment of longitudinal measurement times within an individual. We propose three different methods of parameter estimation and inference. The first method is a one-step approach that estimates non-parametric regression functions for synchronous and asynchronous longitudinal covariates simultaneously. The second method is a two-step approach in which synchronous longitudinal covariates are regressed with the longitudinal response by centering the synchronous longitudinal covariates first and, in the second step, the residuals from the first step are regressed with asynchronous longitudinal covariates. The third method is the same as the second method except that in the first step, we omit the asynchronous longitudinal covariate and include a non-parametric intercept in the regression analysis of synchronous longitudinal covariates and the longitudinal response. We further construct simultaneous confidence bands for the non-parametric regression functions to quantify the overall magnitude of variation. Extensive simulation studies provide numerical support for the theoretical findings. The practical utility of the methods is illustrated on a dataset from the ADNI study.
Citations: 0
Towards optimal doubly robust estimation of heterogeneous causal effects
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2157
Edward H. Kennedy
Heterogeneous effect estimation is crucial in causal inference, with applications across medicine and social science. Many methods for estimating conditional average treatment effects (CATEs) have been proposed, but there are gaps in understanding if and when such methods are optimal. This is especially true when the CATE has nontrivial structure (e.g., smoothness or sparsity). Our work contributes in several ways. First, we study a two-stage doubly robust CATE estimator and give a generic error bound, which yields rates faster than much of the literature. We apply the bound to derive error rates in smooth nonparametric models, and give sufficient conditions for oracle efficiency. Along the way we give a general error bound for regression with estimated outcomes; this is the second main contribution. The third contribution is aimed at understanding the fundamental statistical limits of CATE estimation. To that end, we propose and study a local polynomial adaptation of double-residual regression. We show that this estimator can be oracle efficient under even weaker conditions, and we conjecture that they are minimal in a minimax sense. We go on to give error bounds in the non-trivial regime where oracle rates cannot be achieved. Some finite-sample properties are explored with simulations.
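The two-stage structure mentioned in the abstract can be illustrated with a small Python sketch. It assumes the standard augmented inverse-probability-weighted (AIPW) pseudo-outcome as the quantity built in the first stage; the abstract does not give the formula, so the function name, argument names, and the specific pseudo-outcome form are assumptions made for illustration. In a second stage, these pseudo-outcomes would be regressed on covariates with any regression method.

```python
def dr_pseudo_outcome(y, a, propensity, mu0, mu1):
    """AIPW-style pseudo-outcome for a single unit (illustrative).

    y          : observed outcome
    a          : treatment indicator, 0 or 1
    propensity : estimated P(A = 1 | X), strictly between 0 and 1
    mu0, mu1   : estimated outcome regressions E[Y | X, A = 0] and E[Y | X, A = 1]
    """
    if not 0.0 < propensity < 1.0:
        raise ValueError("propensity must lie strictly between 0 and 1")
    mu_a = mu1 if a == 1 else mu0
    # Inverse-probability weight that corrects the plug-in difference mu1 - mu0.
    weight = (a - propensity) / (propensity * (1.0 - propensity))
    return weight * (y - mu_a) + (mu1 - mu0)


# Hypothetical first-stage estimates for three units: (y, a, propensity, mu0, mu1).
units = [
    (3.0, 1, 0.5, 1.0, 2.0),
    (1.0, 0, 0.5, 1.0, 2.0),
    (2.5, 1, 0.8, 0.5, 2.0),
]
pseudo = [dr_pseudo_outcome(*u) for u in units]
print(pseudo)
```

The pseudo-outcome combines the plug-in contrast `mu1 - mu0` with a weighted residual, which is what makes the second-stage regression robust to error in either the propensity or the outcome model.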
Citations: 194
Asymptotic normality of a change plane estimator in fixed dimension with near-optimal rate
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2144
Debarghya Mukherjee, Moulinath Banerjee, Debasri Mukherjee, Ya’acov Ritov
Linear thresholding models postulate that the conditional distribution of a response variable in terms of covariates differs on the two sides of a (typically unknown) hyperplane in the covariate space. A key goal in such models is to learn about this separating hyperplane. Exact likelihood or least squares methods to estimate the thresholding parameter involve an indicator function, which makes them difficult to optimize; they are therefore often tackled by using a surrogate loss that uses a smooth approximation to the indicator. In this paper, we demonstrate that the resulting estimator is asymptotically normal with a near-optimal rate of convergence, n^{-1} up to a log factor, in both classification and regression thresholding models. This is substantially faster than the currently established convergence rates of smoothed estimators for similar models in the statistics and econometrics literature. We also present a real-data application of our approach to an environmental data set where CO2 emission is explained in terms of a separating hyperplane defined through per-capita GDP and urban agglomeration.
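The idea of replacing the indicator with a smooth approximation can be sketched as follows. The abstract does not say which smooth function is used, so the logistic function, the `bandwidth` parameter, and the function names here are assumptions chosen for illustration only.

```python
import math


def hard_indicator(x, theta):
    """Exact indicator 1{x . theta > 0}: piecewise constant in theta."""
    dot = sum(xi * ti for xi, ti in zip(x, theta))
    return 1.0 if dot > 0 else 0.0


def smooth_indicator(x, theta, bandwidth):
    """Logistic surrogate for 1{x . theta > 0} (illustrative choice).

    Smaller bandwidth values bring the surrogate closer to the hard
    indicator while keeping it differentiable in theta.
    """
    z = sum(xi * ti for xi, ti in zip(x, theta)) / bandwidth
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # Numerically stable branch for negative z (avoids overflow in exp).
    e = math.exp(z)
    return e / (1.0 + e)


x = [1.0, 2.0]
theta = [0.5, 0.25]
for bw in (1.0, 0.1, 0.01):
    print(bw, smooth_indicator(x, theta, bw), hard_indicator(x, theta))
```

Because the surrogate is differentiable, gradient-based optimizers can be applied to the resulting loss, which is not possible with the piecewise-constant indicator.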
Citations: 0
Testing linear operator constraints in functional response regression with incomplete response functions
Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2177
Yeonjoo Park, Kyunghee Han, Douglas G. Simpson
Hypothesis testing procedures are developed to assess linear operator constraints in function-on-scalar regression when incomplete functional responses are observed. The approach enables statistical inferences about the shape and other aspects of the functional regression coefficients within a unified framework encompassing three incomplete sampling scenarios: (i) partially observed response functions as curve segments over random sub-intervals of the domain, (ii) discretely observed functional responses with additive measurement errors, and (iii) the composition of the former two scenarios, where partially observed response segments are observed discretely with measurement error. The latter scenario has been little explored to date, although such structured data is increasingly common in applications. For statistical inference, deviations from the constraint space are measured via the integrated L^2-distance between estimates from the constrained and unconstrained model spaces. Large sample properties of the proposed test procedure are established, including the consistency, asymptotic distribution, and local power of the test statistic. The finite sample power and level of the proposed test are investigated in a simulation study covering a variety of scenarios. The proposed methodologies are illustrated by applications to U.S. obesity prevalence data, analyzing the functional shape of its trends over time, and motion analysis in a study of automotive ergonomics.
Citations: 0
Spectrum inference for replicated spatial locally time-harmonizable time series
IF 1.1 | Zone 4, Mathematics | Q3 STATISTICS & PROBABILITY | Pub Date: 2023-01-01 | DOI: 10.1214/23-ejs2130
J. Aston, D. Dehay, A. Dudek, Jean-Marc Freyermuth, Dénes Szűcs, Lincoln J. Colling
In this paper, we develop tools for statistical inference on replicated realizations of spatiotemporal processes that are locally time-harmonizable. Our method estimates both the rescaled spatial time-varying Loève-spectrum and the spatial time-varying dual-frequency coherence function under realistic modeling assumptions. We construct confidence intervals for these parameters of interest using the Circular Block Bootstrap method and prove its consistency. We illustrate the application of our methodology on a dataset arising from an experiment in neuropsychology. From EEG recordings, our method allows studying the dynamic functional connectivity within the brain associated to visual working memory performance.
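For readers unfamiliar with the Circular Block Bootstrap, the following Python sketch shows the generic resampling step for a single univariate series. How the paper adapts the scheme to replicated spatiotemporal data is not described in the abstract, so this is only the textbook version; the function name and arguments are illustrative.

```python
import math
import random


def circular_block_bootstrap(series, block_length, rng=None):
    """Draw one circular block bootstrap resample of the same length.

    Blocks of block_length consecutive observations are drawn with
    uniformly random start points; indices wrap around the end of the
    series, so every observation is equally likely to be included.
    """
    if rng is None:
        rng = random.Random()
    n = len(series)
    if n == 0 or block_length < 1:
        raise ValueError("need a non-empty series and block_length >= 1")
    n_blocks = math.ceil(n / block_length)
    resampled = []
    for _ in range(n_blocks):
        start = rng.randrange(n)
        for offset in range(block_length):
            resampled.append(series[(start + offset) % n])
    # Truncate in case n is not a multiple of block_length.
    return resampled[:n]


series = list(range(10))
sample = circular_block_bootstrap(series, block_length=3, rng=random.Random(0))
print(sample)
```

Repeating the resampling many times and recomputing the statistic of interest on each resample yields an empirical distribution from which percentile confidence intervals can be read off.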
Citations: 0