Pub Date: 2023-10-18 DOI: 10.1007/s00362-023-01501-5
Panxu Yuan, Yinfei Kong, Gaorong Li
Title: FDR control and power analysis for high-dimensional logistic regression via StabKoff (Statistical Papers)
Pub Date: 2023-10-07 DOI: 10.1007/s00362-023-01485-2
Muhammad Qasim
Abstract: In this article, a Stein-type weighted limited information maximum likelihood (LIML) estimator is proposed. It is based on a weighted average of the ordinary least squares (OLS) and LIML estimators, with weights inversely proportional to the Hausman test statistic. The asymptotic distribution of the proposed estimator is derived by means of local-to-exogenous asymptotic theory. In addition, the asymptotic risk of the Stein-type LIML estimator is calculated, and it is shown that the risk is strictly smaller than the risk of the LIML estimator under certain conditions. A Monte Carlo simulation and an empirical application to a green patent dataset from Nordic countries are used to demonstrate the superiority of the Stein-type LIML estimator over the OLS, two-stage least squares, LIML and combined estimators when the number of instruments is large.
Title: A weighted average limited information maximum likelihood estimator (Statistical Papers)
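The weighting idea in this abstract can be illustrated with a minimal sketch. The exact weight used in the paper is not given in this listing, so the form below (weight on OLS equal to `tau / hausman_stat`, capped at 1) and the constant `tau` are assumptions chosen only to show weights that shrink as the Hausman statistic grows. The coefficient vectors are made-up numbers, not results from the paper.

```python
def stein_type_combination(beta_ols, beta_liml, hausman_stat, tau):
    """Weighted average of two coefficient vectors.

    Assumed form (not taken from the paper): the weight on OLS is
    tau / hausman_stat, capped at 1, so it is inversely proportional
    to the Hausman statistic whenever the cap is not active.
    """
    if hausman_stat <= 0:
        weight_ols = 1.0
    else:
        weight_ols = min(1.0, tau / hausman_stat)
    return [
        weight_ols * b_ols + (1.0 - weight_ols) * b_liml
        for b_ols, b_liml in zip(beta_ols, beta_liml)
    ]


# Hypothetical coefficient estimates, for illustration only.
beta_ols = [1.20, -0.40]
beta_liml = [0.90, -0.10]

# Small Hausman statistic: little evidence of endogeneity, lean on OLS.
weak_evidence = stein_type_combination(beta_ols, beta_liml, hausman_stat=0.5, tau=2.0)

# Large Hausman statistic: strong evidence of endogeneity, lean on LIML.
strong_evidence = stein_type_combination(beta_ols, beta_liml, hausman_stat=40.0, tau=2.0)

print(weak_evidence, strong_evidence)
```

With a large statistic the combined estimate sits close to LIML, and with a small one it collapses to OLS, which is the qualitative behaviour the abstract describes.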
Pub Date: 2023-10-07 DOI: 10.1007/s00362-023-01493-2
Astrid Jourdan
Abstract: Uniform designs are widely used for experiments with mixtures. The uniformity of the design points is usually evaluated with a discrepancy criterion. In this paper, we propose a new criterion to measure the deviation between the design point distribution and a Dirichlet distribution. The support of the Dirichlet distribution is the set of d-dimensional vectors whose entries are real numbers in the interval [0,1] and whose coordinates sum to 1. This support is suitable for mixture experiments. Depending on its parameters, the Dirichlet distribution allows symmetric or asymmetric, uniform or more concentrated point distributions. The difference between the empirical and the target distributions is evaluated with the Kullback–Leibler divergence. We use two methods to estimate the divergence: the plug-in estimate and the nearest-neighbor estimate. The resulting two criteria are used to build space-filling designs for mixture experiments. In the particular case of the flat Dirichlet distribution, both criteria lead to uniform designs. They are compared to existing uniformity criteria. The advantage of the new criteria is that they allow target distributions other than the uniform and they are fast to compute.
Title: Space-filling designs with a Dirichlet distribution for mixture experiments (Statistical Papers)
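The support described in this abstract (entries in [0,1] that sum to 1) and the effect of the Dirichlet parameters can be illustrated with a short sketch. It draws points by normalising independent Gamma variates, a standard construction, and compares a flat Dirichlet with a more concentrated one. The helper names, sample sizes and parameter values are illustrative assumptions. The paper's Kullback–Leibler criteria are not implemented here.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one point from Dirichlet(alpha) by normalising Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]


def on_simplex(point, tol=1e-9):
    """Check the mixture constraint: entries in [0, 1] and coordinates summing to 1."""
    return all(0.0 <= x <= 1.0 for x in point) and abs(sum(point) - 1.0) <= tol


def spread(points):
    """Mean squared distance of the points from the simplex centroid."""
    d = len(points[0])
    centre = 1.0 / d
    return sum(sum((x - centre) ** 2 for x in p) for p in points) / len(points)


rng = random.Random(0)

# Flat Dirichlet (all parameters equal to 1): uniform over the simplex.
flat = [sample_dirichlet([1.0, 1.0, 1.0], rng) for _ in range(200)]

# Larger equal parameters: points concentrate around the centroid.
concentrated = [sample_dirichlet([20.0, 20.0, 20.0], rng) for _ in range(200)]

print(spread(flat), spread(concentrated))
```

Every sampled point satisfies the mixture constraint by construction, and the concentrated sample has a visibly smaller spread around the centroid than the flat one.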
Pub Date: 2023-10-07 DOI: 10.1007/s00362-023-01500-6
Yongshuai Chen, Wenwen Guo, Hengjian Cui
Title: On the test of covariance between two high-dimensional random vectors (Statistical Papers)
Pub Date: 2023-10-03 DOI: 10.1007/s00362-023-01499-w
Shuangzhe Liu, Götz Trenkler, Tõnu Kollo, Dietrich von Rosen, Oskar Maria Baksalary
Title: Professor Heinz Neudecker and matrix differential calculus (Statistical Papers)
Pub Date: 2023-09-29 DOI: 10.1007/s00362-023-01497-y
Alberto Lanconelli, Christopher S. A. Lauria
Abstract: We consider the problem of tracking an unknown time-varying parameter that characterizes the probabilistic evolution of a sequence of independent observations. To this aim, we propose a stochastic gradient descent-based recursive scheme in which the log-likelihood of the observations acts as a time-varying gain function. We prove convergence in mean-square error in a suitable neighbourhood of the unknown time-varying parameter and illustrate the details of our findings in the case where data are generated from distributions belonging to the exponential family.
Title: Maximum Likelihood With a Time Varying Parameter (Statistical Papers)
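The kind of recursion this abstract describes can be sketched for the simplest exponential-family case. The example assumes unit-variance Gaussian observations whose mean drifts along a slow sine wave, and a constant step size. The drift pattern, step size, seed and helper names are illustrative assumptions, not details from the paper. A running mean is included only as a baseline that does not adapt to the drift.

```python
import math
import random


def track_gaussian_mean(observations, step, start=0.0):
    """Stochastic gradient ascent on the N(theta, 1) log-likelihood.

    The gradient of log p(x | theta) with respect to theta is (x - theta),
    so each observation nudges the estimate by step * (x - estimate).
    """
    estimate = start
    path = []
    for x in observations:
        estimate += step * (x - estimate)
        path.append(estimate)
    return path


def mse(estimates, targets, last):
    """Mean squared tracking error over the final `last` time points."""
    pairs = list(zip(estimates, targets))[-last:]
    return sum((e - t) ** 2 for e, t in pairs) / len(pairs)


rng = random.Random(1)
n = 5000

# Assumed slowly varying true parameter and independent observations around it.
truth = [math.sin(2.0 * math.pi * t / 2000.0) for t in range(n)]
data = [theta + rng.gauss(0.0, 1.0) for theta in truth]

tracked = track_gaussian_mean(data, step=0.05)

# Baseline: running mean, which averages over the whole history.
running = []
total = 0.0
for t, x in enumerate(data, start=1):
    total += x
    running.append(total / t)

mse_tracked = mse(tracked, truth, 1000)
mse_running = mse(running, truth, 1000)
print(mse_tracked, mse_running)
```

The constant step keeps the estimate within a neighbourhood of the moving target rather than converging to a point, which is the sense of convergence the abstract refers to. A smaller step reduces noise but increases lag behind the drift.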
Pub Date: 2023-09-29 DOI: 10.1007/s00362-023-01492-3
Matteo Farnè, Angelos Vouldis
Abstract: This paper presents a methodology, called ROBOUT, to identify outliers conditional on a high-dimensional noisy information set. In particular, ROBOUT is able to identify observations with outlying conditional mean or variance when the dataset contains multivariate outliers in or besides the predictors, multicollinearity, and a large variable dimension compared to the sample size. ROBOUT entails a pre-processing step, a preliminary robust imputation procedure that prevents anomalous instances from corrupting predictor recovery, a selection stage of the statistically relevant predictors (through cross-validated LASSO-penalized Huber loss regression), the estimation of a robust regression model based on the selected predictors (via MM regression), and a criterion to identify conditional outliers. We conduct a comprehensive simulation study in which the proposed algorithm is tested under a wide range of perturbation scenarios. The combination of LASSO-penalized Huber loss and MM regression turns out to be the best in terms of conditional outlier detection under the above-described perturbed conditions, also compared to existing integrated methodologies like Sparse Least Trimmed Squares and Robust Least Angle Regression. Furthermore, the proposed methodology is applied to a granular supervisory banking dataset collected by the European Central Bank, in order to model the total assets of euro area banks.
Title: ROBOUT: a conditional outlier detection methodology for high-dimensional data (Statistical Papers)
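The selection stage named in this abstract relies on a LASSO-penalized Huber loss. A minimal sketch of that objective is given below, using the textbook Huber loss and an L1 penalty. The tuning constant 1.345, the penalty weight, the averaging over observations and the toy data are assumptions for illustration. This is not the ROBOUT implementation and it does not perform the cross-validation or the MM regression step.

```python
def huber_loss(residual, delta=1.345):
    """Quadratic for small residuals, linear beyond delta."""
    size = abs(residual)
    if size <= delta:
        return 0.5 * residual * residual
    return delta * (size - 0.5 * delta)


def lasso_huber_objective(X, y, beta, lam, delta=1.345):
    """Average Huber loss of the linear fit plus an L1 penalty on beta."""
    fit = 0.0
    for row, target in zip(X, y):
        prediction = sum(x * b for x, b in zip(row, beta))
        fit += huber_loss(target - prediction, delta)
    return fit / len(y) + lam * sum(abs(b) for b in beta)


# Toy data: y follows the first predictor, except for one gross outlier.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [1.0, 0.0, 1.0, 50.0]
beta = [1.0, 0.0]

robust_value = lasso_huber_objective(X, y, beta, lam=0.1)

# The same fit under squared-error loss, where the outlier dominates.
squared_value = sum(
    0.5 * (t - sum(x * b for x, b in zip(row, beta))) ** 2 for row, t in zip(X, y)
) / len(y)

print(robust_value, squared_value)
```

Because the Huber loss grows only linearly in large residuals, a single outlying response contributes far less to the objective than it would under squared error, which is why it is a natural choice when outliers could otherwise distort predictor selection.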
Pub Date: 2023-09-29 DOI: 10.1007/s00362-023-01496-z
Jan Beran, Frieder Droullier
Abstract: We consider INAR(1) processes modulated by an unobserved strongly dependent 0-1 process. The observed process exhibits zero inflation and long memory. A simple method is proposed for estimating the INAR-parameters without modelling the unobserved modulating process. Asymptotic results for the estimators are derived, and a zero-inflation test is introduced. Asymptotic rejection regions and asymptotic power under long-memory alternatives are derived. A small simulation study illustrates the asymptotic results.
Title: On strongly dependent zero-inflated INAR(1) processes (Statistical Papers)
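The structure described in this abstract, an INAR(1) count process multiplied by an unobserved 0-1 process, can be sketched as a small simulation. The INAR(1) part uses binomial thinning with Poisson innovations. As a loudly stated simplification, the modulating process here is a persistent two-state Markov chain, which has short memory and therefore does not reproduce the strong dependence studied in the paper. All parameter values and helper names are illustrative assumptions.

```python
import math
import random


def poisson_draw(lam, rng):
    """Knuth's multiplication method for Poisson draws; adequate for small lam."""
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def binomial_thinning(alpha, count, rng):
    """Each of `count` units survives independently with probability alpha."""
    return sum(1 for _ in range(count) if rng.random() < alpha)


def simulate_modulated_inar1(n, alpha, lam, stay_prob, rng):
    """Simulate X_t = alpha o X_{t-1} + eps_t and the observed Y_t = Z_t * X_t."""
    x = poisson_draw(lam / (1.0 - alpha), rng)  # stationary Poisson start
    z = 1
    xs, zs, ys = [], [], []
    for _ in range(n):
        x = binomial_thinning(alpha, x, rng) + poisson_draw(lam, rng)
        if rng.random() > stay_prob:
            z = 1 - z  # assumed Markov switching, NOT a long-memory process
        xs.append(x)
        zs.append(z)
        ys.append(z * x)
    return xs, zs, ys


rng = random.Random(2)
xs, zs, ys = simulate_modulated_inar1(n=2000, alpha=0.5, lam=1.0, stay_prob=0.95, rng=rng)

zero_share_latent = xs.count(0) / len(xs)
zero_share_observed = ys.count(0) / len(ys)
print(zero_share_latent, zero_share_observed)
```

The observed series has at least as many zeros as the latent INAR(1) series because every period with the modulating process at 0 is recorded as a zero, which is the zero-inflation mechanism the abstract refers to.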