
Electronic Journal of Statistics: Latest Articles

Adaptive threshold-based classification of sparse high-dimensional data
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs1998
T. Pavlenko, N. Stepanova, Lee Thompson
Abstract: We revisit the problem of designing an efficient binary classifier in a challenging high-dimensional framework. The model under study assumes some local dependence structure among feature variables represented by a block-diagonal covariance matrix with a growing number of blocks of an arbitrary, but fixed size. The blocks correspond to non-overlapping independent groups of strongly correlated features. To assess the relevance of a particular block in predicting the response, we introduce a measure of “signal strength” pertaining to each feature block. This measure is then used to specify a sparse model of our interest. We further propose a threshold-based feature selector which operates as a screen-and-clean scheme integrated into a linear classifier: the data is subject to screening and hard threshold cleaning to filter out the blocks that contain no signals. Asymptotic properties of the proposed classifiers are studied when the sample size n depends on the number of feature blocks b, and the sample size goes to infinity with b at a slower rate than b. The new classifiers, which are fully adaptive to unknown parameters of the model, are shown to perform asymptotically optimally in a large part of the classification region. The numerical study confirms good analytical properties of the new classifiers that compare favorably to the existing threshold-based procedure used in a similar context.
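The screen-and-clean idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' procedure: it assumes identity covariance within each block, consecutive blocks of equal size, and a user-supplied threshold, and the names `block_scores`, `hard_threshold_blocks`, and `classify` are hypothetical. Each block is scored by the squared norm of the class mean difference on that block, blocks that do not exceed the threshold are zeroed out, and the surviving differences act as the weights of a linear rule.

```python
from typing import List, Sequence


def block_scores(mean_diff: Sequence[float], block_size: int) -> List[float]:
    """Score each consecutive block by the squared norm of its mean difference.

    Assumption (not from the paper): identity covariance inside every block.
    """
    if block_size <= 0 or len(mean_diff) % block_size != 0:
        raise ValueError("feature count must be a positive multiple of block_size")
    return [
        sum(d * d for d in mean_diff[i:i + block_size])
        for i in range(0, len(mean_diff), block_size)
    ]


def hard_threshold_blocks(
    mean_diff: Sequence[float], block_size: int, threshold: float
) -> List[float]:
    """Keep a block unchanged if its score exceeds the threshold, else zero it."""
    scores = block_scores(mean_diff, block_size)
    cleaned: List[float] = []
    for b, score in enumerate(scores):
        block = list(mean_diff[b * block_size:(b + 1) * block_size])
        cleaned.extend(block if score > threshold else [0.0] * block_size)
    return cleaned


def classify(
    x: Sequence[float],
    mean0: Sequence[float],
    mean1: Sequence[float],
    weights: Sequence[float],
) -> int:
    """Linear rule: predict class 1 if w . (x - midpoint) is positive, else 0."""
    s = sum(
        w * (xi - (m0 + m1) / 2.0)
        for w, xi, m0, m1 in zip(weights, x, mean0, mean1)
    )
    return 1 if s > 0 else 0
```

With class means `mean0` and `mean1`, the weights would be obtained as `hard_threshold_blocks([b - a for a, b in zip(mean0, mean1)], block_size, threshold)`. In the paper the threshold is chosen adaptively; here it is simply an input.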
Citations: 0
Optimal estimation of the supremum and occupation times of a self-similar Lévy process
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/21-ejs1928
J. Ivanovs, M. Podolskij
In this paper we present new theoretical results on optimal estimation of certain random quantities based on high frequency observations of a Lévy process. More specifically, we investigate the asymptotic theory for the conditional mean and conditional median estimators of the supremum/infimum of a linear Brownian motion and a strictly stable Lévy process. Another contribution of our article is the conditional mean estimation of the local time and the occupation time of a linear Brownian motion. We demonstrate that the new estimators are considerably more efficient compared to the classical estimators studied in e.g. [6, 14, 29, 30, 38]. Furthermore, we discuss pre-estimation of the parameters of the underlying models, which is required for practical implementation of the proposed statistics. MSC2020 subject classifications: Primary 62M05, 62G20, 60F05; secondary 62G15, 60G18, 60G51.
Citations: 3
Minimal σ-field for flexible sufficient dimension reduction
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs1999
Hanmin Guo, Lin Hou, Y. Zhu
Sufficient Dimension Reduction (SDR) has become an important tool for mitigating the curse of dimensionality in high dimensional regression analysis. Recently, Flexible SDR (FSDR) has been proposed to extend SDR by finding lower dimensional projections of transformed explanatory variables. The dimensions of the projections, however, cannot fully represent the extent of data reduction FSDR can achieve. As a consequence, optimality and other theoretical properties of FSDR are currently not well understood. In this article, we propose to use the σ-field associated with the projections, together with their dimensions, to fully characterize FSDR, and refer to the σ-field as the FSDR σ-field. We further introduce the concept of minimal FSDR σ-field and consider FSDR projections with the minimal σ-field optimal. Under some mild conditions, we show that the minimal FSDR σ-field exists, attaining the lowest dimensionality at the same time. To estimate the minimal FSDR σ-field, we propose a two-stage procedure called the Generalized Kernel Dimension Reduction (GKDR) method and partially establish its consistency property under weak conditions. Extensive simulation experiments demonstrate that the GKDR method can effectively find the minimal FSDR σ-field and outperform other existing methods. The application of GKDR to a real life air pollution data set sheds new light on the connections between atmospheric conditions and air quality. MSC2020 subject classifications: Primary 62B05; secondary 62J02.
Citations: 0
Estimation of partially conditional average treatment effect by double kernel-covariate balancing
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs2000
Jiayi Wang, R. K. Wong, Shu Yang, K. C. G. Chan
Citations: 3
Isotonic regression for elicitable functionals and their Bayes risk
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs2034
Anja Mühlemann, Johanna F. Ziegel
Citations: 0
Regularized high dimension low tubal-rank tensor regression
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs2004
S. Roy, G. Michailidis
Tensor regression models are of emerging interest in diverse fields of social and behavioral sciences, including neuroimaging analysis, neural networks, image processing and so on. Recent theoretical advancements of tensor decomposition have facilitated significant development of various tensor regression models. The focus of most of the available literature has been on the Canonical Polyadic (CP) decomposition and its variants for the regression coefficient tensor. A CP decomposed coefficient tensor enables estimation with relatively small sample size, but it may not always capture the underlying complex structure in the data. In this work, we leverage the recently developed concept of tubal rank and develop a tensor regression model, wherein the coefficient tensor is decomposed into two components: a low tubal rank tensor and a structured sparse one. We first address the issue of identifiability of the two components comprising the coefficient tensor and subsequently develop a fast and scalable Alternating Minimization algorithm to solve the convex regularized program. Further, we provide finite sample error bounds under high dimensional scaling for the model parameters. The performance of the model is assessed on synthetic data and is also used in an application involving data from an intelligent tutoring platform.
Citations: 2
LAMN property for multivariate inhomogeneous diffusions with discrete observations
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs2049
N. Tran, H. Ngo
Citations: 0
Testing subspace restrictions in the presence of high dimensional nuisance parameters
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs2058
Alessio Sancetta
Citations: 0
Concentration inequalities for non-causal random fields
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs1992
Rémy Garnier, Raphael Langhendries
Concentration inequalities are widely used for analyzing machine learning algorithms. However, current concentration inequalities cannot be applied to some of the most popular deep neural networks, notably in natural language processing. This is mostly due to the non-causal nature of the data involved, in the sense that each data point depends on other neighboring data points. In this paper, a framework for modeling non-causal random fields is provided and a Hoeffding-type concentration inequality is obtained for this framework. The proof of this result relies on a local approximation of the non-causal random field by a function of a finite number of i.i.d. random variables.
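For orientation, the classical Hoeffding inequality for independent bounded random variables is stated below. It is textbook background for what "Hoeffding-type" refers to, and it is not the inequality obtained in the paper, which concerns dependent (non-causal) random fields.

```latex
% Classical Hoeffding inequality (independent case, background only).
% X_1, \dots, X_n independent, a_i \le X_i \le b_i almost surely,
% S_n = \sum_{i=1}^n X_i.
\[
  \mathbb{P}\bigl( \lvert S_n - \mathbb{E} S_n \rvert \ge t \bigr)
  \;\le\; 2 \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^n (b_i - a_i)^2} \right),
  \qquad t > 0 .
\]
```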
Citations: 4
Estimation of the variance matrix in bivariate classical measurement error models
IF 1.1 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-01-01 DOI: 10.1214/22-ejs1996
Elif Kekeç, I. Van Keilegom
The presence of measurement errors is a ubiquitously faced problem and plenty of work has been done to overcome this when a single covariate is mismeasured under a variety of conditions. However, in practice, it is possible that more than one covariate is measured with error. When measurements are taken by the same device, the errors of these measurements are likely correlated. In this paper, we present a novel approach to estimate the covariance matrix of classical additive errors in the absence of validation data or auxiliary variables when two covariates are subject to measurement error. Our method assumes these errors to be following a bivariate normal distribution. We show that the variance matrix is identifiable under certain conditions on the support of the error-free variables and propose an estimation method based on an expansion of Bernstein polynomials. To investigate the performance of the proposed estimation method, the asymptotic properties of the estimator are examined and a diverse set of simulation studies is conducted. The estimated matrix is then used by the simulation-extrapolation (SIMEX) algorithm to reduce the bias caused by measurement error in logistic regression models. Finally, the method is demonstrated using data from the Framingham Heart Study.
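The SIMEX step mentioned in the abstract can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's setup: it uses a single mismeasured covariate, simple linear regression instead of logistic regression, a known error standard deviation `sigma_u`, and a quadratic extrapolant. The function names `ols_slope`, `fit_quadratic`, and `simex_slope` are hypothetical. The idea is to add extra noise of variance `lam * sigma_u**2` to the observed covariate, record how the naive estimate changes with `lam`, and extrapolate the fitted curve back to `lam = -1`, which corresponds to no measurement error.

```python
import math
import random
from typing import List, Sequence


def ols_slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def fit_quadratic(lams: Sequence[float], vals: Sequence[float]) -> List[float]:
    """Least-squares (c0, c1, c2) for c0 + c1*lam + c2*lam**2 via normal equations."""
    p = [sum(l ** k for l in lams) for k in range(5)]
    m = [
        [p[i + j] for j in range(3)] + [sum(v * l ** i for l, v in zip(lams, vals))]
        for i in range(3)
    ]
    # Gaussian elimination with partial pivoting on the 3x4 augmented matrix.
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * coef[j] for j in range(i + 1, 3))
        coef[i] = (m[i][3] - tail) / m[i][i]
    return coef


def simex_slope(
    w: Sequence[float],
    y: Sequence[float],
    sigma_u: float,
    lams: Sequence[float] = (0.0, 0.5, 1.0, 1.5, 2.0),
    n_rep: int = 50,
    seed: int = 0,
) -> float:
    """SIMEX estimate of the slope when w = x + u with u ~ N(0, sigma_u**2)."""
    rng = random.Random(seed)
    means: List[float] = []
    for lam in lams:
        reps = 1 if lam == 0 else n_rep  # no added noise at lam = 0
        ests = []
        for _ in range(reps):
            # Simulation step: inflate the error variance by a factor (1 + lam).
            w_lam = [wi + math.sqrt(lam) * sigma_u * rng.gauss(0.0, 1.0) for wi in w]
            ests.append(ols_slope(w_lam, y))
        means.append(sum(ests) / len(ests))
    # Extrapolation step: evaluate the fitted quadratic at lam = -1.
    c0, c1, c2 = fit_quadratic(lams, means)
    return c0 - c1 + c2
```

In the paper's setting the error covariance is not known and is replaced by the estimated matrix; here `sigma_u` stands in for that input. The quadratic extrapolant is a common default choice, and other extrapolation functions could be substituted.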
Citations: 0