{"title":"Adaptive threshold-based classification of sparse high-dimensional data","authors":"T. Pavlenko, N. Stepanova, Lee Thompson","doi":"10.1214/22-ejs1998","DOIUrl":"https://doi.org/10.1214/22-ejs1998","url":null,"abstract":"Abstract: We revisit the problem of designing an efficient binary classifier in a challenging high-dimensional framework. The model under study assumes some local dependence structure among feature variables, represented by a block-diagonal covariance matrix with a growing number of blocks of an arbitrary but fixed size. The blocks correspond to non-overlapping, independent groups of strongly correlated features. To assess the relevance of a particular block in predicting the response, we introduce a measure of “signal strength” pertaining to each feature block. This measure is then used to specify a sparse model of interest. We further propose a threshold-based feature selector that operates as a screen-and-clean scheme integrated into a linear classifier: the data are subjected to screening and hard-threshold cleaning to filter out the blocks that contain no signal. Asymptotic properties of the proposed classifiers are studied when the sample size n depends on the number of feature blocks b and goes to infinity with b, at a slower rate than b. The new classifiers, which are fully adaptive to the unknown parameters of the model, are shown to perform asymptotically optimally in a large part of the classification region. The numerical study confirms the good analytical properties of the new classifiers, which compare favorably to an existing threshold-based procedure used in a similar context.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47611113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
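The screen-and-clean scheme this abstract describes can be illustrated with a minimal sketch. This is not the authors' procedure: the per-block "signal strength" statistic, the threshold value, and the equal-block-size layout below are simplified assumptions made for illustration only.

```python
import numpy as np

def screen_and_clean(X0, X1, block_size, threshold):
    """Hard-threshold block screening: keep a feature block only when its
    standardized between-class mean difference exceeds the threshold.
    Simplified illustration, not the paper's exact statistic."""
    diff = X1.mean(axis=0) - X0.mean(axis=0)           # per-feature mean difference
    pooled_sd = np.sqrt((X0.var(axis=0) + X1.var(axis=0)) / 2) + 1e-12
    t = diff / pooled_sd                               # standardized per-feature signal
    b = len(t) // block_size
    blocks = t[:b * block_size].reshape(b, block_size)
    strength = np.sqrt((blocks ** 2).sum(axis=1))      # crude block "signal strength"
    keep = strength > threshold                        # hard-threshold cleaning
    return np.repeat(keep, block_size)                 # boolean mask over features

rng = np.random.default_rng(0)
p, k = 40, 4                                           # 10 independent blocks of size 4
X0 = rng.normal(0.0, 1.0, size=(100, p))
X1 = rng.normal(0.0, 1.0, size=(100, p))
X1[:, :k] += 2.0                                       # signal only in the first block
mask = screen_and_clean(X0, X1, block_size=k, threshold=3.0)
print(mask[:k].all(), mask[k:].any())                  # signal block kept, noise blocks dropped
```

A linear classifier would then be fit on `X0[:, mask]` and `X1[:, mask]` only, which is the "integrated into a linear classifier" step of the abstract.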
{"title":"Optimal estimation of the supremum and occupation times of a self-similar Lévy process","authors":"J. Ivanovs, M. Podolskij","doi":"10.1214/21-ejs1928","DOIUrl":"https://doi.org/10.1214/21-ejs1928","url":null,"abstract":"In this paper we present new theoretical results on optimal estimation of certain random quantities based on high frequency observations of a Lévy process. More specifically, we investigate the asymptotic theory for the conditional mean and conditional median estimators of the supremum/infimum of a linear Brownian motion and a strictly stable Lévy process. Another contribution of our article is the conditional mean estimation of the local time and the occupation time of a linear Brownian motion. We demonstrate that the new estimators are considerably more efficient compared to the classical estimators studied in e.g. [6, 14, 29, 30, 38]. Furthermore, we discuss pre-estimation of the parameters of the underlying models, which is required for practical implementation of the proposed statistics. MSC2020 subject classifications: Primary 62M05, 62G20, 60F05; secondary 62G15, 60G18, 60G51.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43013455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
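The classical estimator this abstract improves upon — the running maximum of the discrete observations — is downward biased for the supremum, which is what leaves room for more efficient conditional-mean estimators. A small simulation (an illustrative sketch only; the fine grid serves as a proxy for the true supremum, and all constants are arbitrary) makes the bias visible:

```python
import numpy as np

rng = np.random.default_rng(1)
n_fine, n_coarse, reps = 10_000, 100, 500
bias = []
for _ in range(reps):
    dW = rng.normal(0.0, np.sqrt(1.0 / n_fine), size=n_fine)
    W = np.concatenate([[0.0], np.cumsum(dW)])         # fine-grid Brownian path on [0, 1]
    true_sup = W.max()                                 # proxy for the true supremum
    coarse = W[:: n_fine // n_coarse]                  # n_coarse high-frequency observations
    bias.append(coarse.max() - true_sup)               # classical estimator: discrete max
print(np.mean(bias))                                   # negative: the discrete max under-shoots
```

Because the coarse observations are a subset of the fine grid, the classical estimate never exceeds the proxy supremum, so every replication contributes a non-positive error.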
{"title":"Minimal σ-field for flexible sufficient dimension reduction","authors":"Hanmin Guo, Lin Hou, Y. Zhu","doi":"10.1214/22-ejs1999","DOIUrl":"https://doi.org/10.1214/22-ejs1999","url":null,"abstract":"Sufficient Dimension Reduction (SDR) has become an important tool for mitigating the curse of dimensionality in high-dimensional regression analysis. Recently, Flexible SDR (FSDR) has been proposed to extend SDR by finding lower-dimensional projections of transformed explanatory variables. The dimensions of the projections, however, cannot fully represent the extent of data reduction FSDR can achieve. As a consequence, optimality and other theoretical properties of FSDR are currently not well understood. In this article, we propose to use the σ-field associated with the projections, together with their dimensions, to fully characterize FSDR, and refer to this σ-field as the FSDR σ-field. We further introduce the concept of the minimal FSDR σ-field and consider FSDR projections with the minimal σ-field optimal. Under some mild conditions, we show that the minimal FSDR σ-field exists and attains the lowest dimensionality at the same time. To estimate the minimal FSDR σ-field, we propose a two-stage procedure called the Generalized Kernel Dimension Reduction (GKDR) method and partially establish its consistency under weak conditions. Extensive simulation experiments demonstrate that the GKDR method can effectively find the minimal FSDR σ-field and outperform other existing methods. The application of GKDR to a real-life air pollution data set sheds new light on the connections between atmospheric conditions and air quality. MSC2020 subject classifications: Primary 62B05; secondary 62J02.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46459761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimation of partially conditional average treatment effect by double kernel-covariate balancing","authors":"Jiayi Wang, R. K. Wong, Shu Yang, K. C. G. Chan","doi":"10.1214/22-ejs2000","DOIUrl":"https://doi.org/10.1214/22-ejs2000","url":null,"abstract":"","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49422993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Isotonic regression for elicitable functionals and their Bayes risk","authors":"Anja Mühlemann, Johanna F. Ziegel","doi":"10.1214/22-ejs2034","DOIUrl":"https://doi.org/10.1214/22-ejs2034","url":null,"abstract":"","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44276319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Regularized high dimension low tubal-rank tensor regression","authors":"S. Roy, G. Michailidis","doi":"10.1214/22-ejs2004","DOIUrl":"https://doi.org/10.1214/22-ejs2004","url":null,"abstract":"Tensor regression models are of emerging interest in diverse fields of the social and behavioral sciences, including neuroimaging analysis, neural networks, and image processing. Recent theoretical advancements in tensor decomposition have facilitated significant development of various tensor regression models. The focus of most of the available literature has been on the Canonical Polyadic (CP) decomposition and its variants for the regression coefficient tensor. A CP-decomposed coefficient tensor enables estimation with relatively small sample sizes, but it may not always capture the underlying complex structure in the data. In this work, we leverage the recently developed concept of tubal rank and develop a tensor regression model wherein the coefficient tensor is decomposed into two components: a low tubal-rank tensor and a structured sparse one. We first address the issue of identifiability of the two components comprising the coefficient tensor and subsequently develop a fast and scalable Alternating Minimization algorithm to solve the convex regularized program. Further, we provide finite-sample error bounds under high-dimensional scaling for the model parameters. The performance of the model is assessed on synthetic data and in an application involving data from an intelligent tutoring platform.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46715850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LAMN property for multivariate inhomogeneous diffusions with discrete observations","authors":"N. Tran, H. Ngo","doi":"10.1214/22-ejs2049","DOIUrl":"https://doi.org/10.1214/22-ejs2049","url":null,"abstract":"","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46884058","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Testing subspace restrictions in the presence of high dimensional nuisance parameters","authors":"Alessio Sancetta","doi":"10.1214/22-ejs2058","DOIUrl":"https://doi.org/10.1214/22-ejs2058","url":null,"abstract":"","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42335792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concentration inequalities for non-causal random fields","authors":"Rémy Garnier, Raphael Langhendries","doi":"10.1214/22-ejs1992","DOIUrl":"https://doi.org/10.1214/22-ejs1992","url":null,"abstract":"Concentration inequalities are widely used for analyzing machine learning algorithms. However, current concentration inequalities cannot be applied to some of the most popular deep neural networks, notably in natural language processing. This is mostly due to the non-causal nature of the data involved, in the sense that each data point depends on other, neighboring data points. In this paper, a framework for modeling non-causal random fields is provided, and a Hoeffding-type concentration inequality is obtained for this framework. The proof of this result relies on a local approximation of the non-causal random field by a function of a finite number of i.i.d. random variables.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49352696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
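The Hoeffding-type bound this abstract generalizes can be checked numerically in its classical i.i.d. form, P(|S_n/n − μ| ≥ t) ≤ 2 exp(−2nt²) for variables bounded in [0, 1]. This sketch shows only the baseline inequality, not the paper's non-causal random-field extension; the sample sizes and threshold are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
n, t, reps = 200, 0.1, 20_000
X = rng.uniform(0.0, 1.0, size=(reps, n))              # i.i.d. variables bounded in [0, 1]
dev = np.abs(X.mean(axis=1) - 0.5)                     # |empirical mean - true mean|
empirical = (dev >= t).mean()                          # Monte Carlo tail probability
hoeffding = 2 * np.exp(-2 * n * t ** 2)                # two-sided Hoeffding bound
print(empirical, hoeffding)                            # empirical tail never exceeds the bound
```

The non-causal setting of the paper is precisely where this i.i.d. argument breaks down: when each data point depends on its neighbors, the independence assumption behind the exponential bound no longer holds, which motivates the local i.i.d. approximation used in the proof.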
{"title":"Estimation of the variance matrix in bivariate classical measurement error models","authors":"Elif Kekeç, I. Van Keilegom","doi":"10.1214/22-ejs1996","DOIUrl":"https://doi.org/10.1214/22-ejs1996","url":null,"abstract":"The presence of measurement errors is a ubiquitous problem, and plenty of work has been done to overcome it when a single covariate is mismeasured under a variety of conditions. In practice, however, more than one covariate may be measured with error. When measurements are taken by the same device, the errors of these measurements are likely correlated. In this paper, we present a novel approach to estimate the covariance matrix of classical additive errors in the absence of validation data or auxiliary variables when two covariates are subject to measurement error. Our method assumes that these errors follow a bivariate normal distribution. We show that the variance matrix is identifiable under certain conditions on the support of the error-free variables and propose an estimation method based on an expansion of Bernstein polynomials. To investigate the performance of the proposed estimation method, the asymptotic properties of the estimator are examined and a diverse set of simulation studies is conducted. The estimated matrix is then used by the simulation-extrapolation (SIMEX) algorithm to reduce the bias caused by measurement error in logistic regression models. Finally, the method is demonstrated using data from the Framingham Heart Study.","PeriodicalId":49272,"journal":{"name":"Electronic Journal of Statistics","volume":" ","pages":""},"PeriodicalIF":1.1,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47033673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
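The SIMEX algorithm the abstract applies works in two steps: simulate progressively noisier versions of the mismeasured covariate at error-inflation levels λ, fit the naive estimator at each level, then extrapolate the fitted trend back to λ = −1 (no measurement error). The sketch below is a textbook univariate linear-regression version with the error variance assumed known — not the paper's bivariate logistic setting, where that variance is itself estimated:

```python
import numpy as np

rng = np.random.default_rng(3)
n, beta, sigma_u = 2000, 1.0, 0.8
x = rng.normal(size=n)
y = beta * x + rng.normal(scale=0.3, size=n)
w = x + rng.normal(scale=sigma_u, size=n)              # covariate observed with error

def naive_slope(w, y):
    """Ordinary least-squares slope, ignoring measurement error (attenuated)."""
    return np.cov(w, y)[0, 1] / np.var(w)

lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
slopes = []
for lam in lambdas:                                    # simulation step: inflate the error
    sims = [naive_slope(w + rng.normal(scale=np.sqrt(lam) * sigma_u, size=n), y)
            for _ in range(50)]
    slopes.append(np.mean(sims))
coef = np.polyfit(lambdas, slopes, 2)                  # quadratic extrapolant in lambda
simex = np.polyval(coef, -1.0)                         # extrapolation step: lambda -> -1
print(naive_slope(w, y), simex)                        # naive attenuated; SIMEX closer to beta
```

The naive slope is attenuated toward zero by roughly the factor σx²/(σx² + σu²), and the extrapolated SIMEX estimate recovers much, though not all, of that loss — which is why a good estimate of the error (co)variance, the subject of this paper, matters for the method in practice.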