Pub Date: 2025-03-01 | Epub Date: 2024-08-22 | DOI: 10.1016/j.jspi.2024.106227
Zikun Qin, Malay Ghosh
The paper considers a multiple testing problem for multivariate normal means under sparsity. First, the Bayes risk of the multivariate Bayes oracle is derived. Then, a hierarchical Bayesian approach is taken with global–local shrinkage priors, where the global parameter is either treated as a tuning parameter or given a specific prior. The method is shown to attain the asymptotically Bayes optimal under sparsity (ABOS) property. Finally, an empirical Bayes procedure is proposed that involves estimating the global shrinkage parameter; this approach is also shown to enjoy the ABOS property.
{"title":"Bayes oracle property of multiple tests of multivariate normal means under sparsity","authors":"Zikun Qin, Malay Ghosh","doi":"10.1016/j.jspi.2024.106227","DOIUrl":"10.1016/j.jspi.2024.106227","url":null,"abstract":"<div><p>The paper considers a multiple testing problem of multivariate normal means under sparsity. First, the Bayes risk of the multivariate Bayes oracle is derived. Then, a hierarchical Bayesian approach is taken with global–local shrinkage priors, where the global parameter is either treated as a tuning parameter or is given a specific prior. The method is shown to attain an asymptotic Bayes optimal under sparsity (ABOS) property. Finally, an empirical Bayes procedure is proposed which involves estimation of the global shrinkage parameter. The approach is also shown to lead to the ABOS property.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106227"},"PeriodicalIF":0.8,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142088421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-03-01 | Epub Date: 2024-08-10 | DOI: 10.1016/j.jspi.2024.106221
Feng Yu, Archer Yi Yang, Teng Zhang
We propose a new algorithm for solving the graph-fused lasso (GFL), a regularized model that operates under the assumption that the signal tends to be locally constant over a predefined graph structure. The proposed method applies a novel decomposition of the objective function within the alternating direction method of multipliers (ADMM) algorithm. While ADMM has been widely used for fused lasso problems, existing works such as the network lasso decompose the objective function into a loss component and a total variation penalty component. In contrast, based on the graph matching technique from graph theory, we propose a new decomposition that separates the objective function into two components, one consisting of the loss function plus part of the total variation penalty, and the other consisting of the remaining total variation penalty. We derive an exact convergence rate for the proposed algorithm by developing a general theory of the local convergence of ADMM. Compared with the network lasso algorithm, our algorithm has a faster exact linear convergence rate (although of the same order as that of the network lasso). It also enjoys a smaller computational cost per iteration and thus converges faster overall in most numerical examples.
{"title":"A graph decomposition-based approach for the graph-fused lasso","authors":"Feng Yu , Archer Yi Yang , Teng Zhang","doi":"10.1016/j.jspi.2024.106221","DOIUrl":"10.1016/j.jspi.2024.106221","url":null,"abstract":"<div><p>We propose a new algorithm for solving the graph-fused lasso (GFL), a regularized model that operates under the assumption that the signal tends to be locally constant over a predefined graph structure. The proposed method applies a novel decomposition of the objective function for the alternating direction method of multipliers (ADMM) algorithm. While ADMM has been widely used in fused lasso problems, existing works such as the network lasso decompose the objective function into the loss function component and the total variation penalty component. In contrast, based on the graph matching technique in graph theory, we propose a new method of decomposition that separates the objective function into two components, where one component is the loss function plus part of the total variation penalty, and the other component is the remaining total variation penalty. We develop an exact convergence rate of the proposed algorithm by developing a general theory on the local convergence of ADMM. Compared with the network lasso algorithm, our algorithm has a faster exact linear convergence rate (although in the same order as for the network lasso). It also enjoys a smaller computational cost per iteration, thus converges overall faster in most numerical examples.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106221"},"PeriodicalIF":0.8,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142096052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper focuses on drawing statistical inference based on a novel variant of maxima or minima nomination sampling (NS) designs. These sampling designs are useful for obtaining more representative sample units from the tails of the population distribution using available auxiliary ranking information. However, one common difficulty in performing NS in practice is that the researcher cannot obtain a nominated sample unless he/she uniquely determines the sample unit with the highest or the lowest rank in each set. To overcome this problem, a variant of NS, called partial nomination sampling, is proposed, in which the researcher is allowed to declare that two or more units are tied in rank whenever he/she cannot identify the sample unit with the highest or the lowest rank. Based on this sampling design, two asymptotically unbiased estimators of the cumulative distribution function are developed, using maximum likelihood and moment-based approaches, and their asymptotic normality is established. Several numerical studies show that the proposed estimators have higher relative efficiencies than their counterparts under simple random sampling (SRS) in analyzing either the upper or the lower tail of the parent distribution. The procedures are then applied to a real dataset from the Third National Health and Nutrition Examination Survey (NHANES III) to estimate the prevalence of osteoporosis among adult women aged 50 and over. It is shown that, in certain circumstances, the proposed techniques require only one-third of the sample size needed under SRS to achieve the desired precision, resulting in a considerable reduction in time and cost compared to the standard SRS method.
{"title":"Statistical inference from partially nominated sets: An application to estimating the prevalence of osteoporosis among adult women","authors":"Zeinab Akbari Ghamsari , Ehsan Zamanzade , Majid Asadi","doi":"10.1016/j.jspi.2024.106214","DOIUrl":"10.1016/j.jspi.2024.106214","url":null,"abstract":"<div><p>This paper focuses on drawing statistical inference based on a novel variant of maxima or minima nomination sampling (NS) designs. These sampling designs are useful for obtaining more representative sample units from the tails of the population distribution using the available auxiliary ranking information. However, one common difficulty in performing NS in practice is that the researcher cannot obtain a nominated sample unless he/she uniquely determines the sample unit with the highest or the lowest rank in each set. To overcome this problem, a variant of NS, which is called partial nomination sampling, is proposed, in which the researcher is allowed to declare that two or more units are tied in the ranks whenever he/she cannot find the sample unit with the highest or the lowest rank. Based on this sampling design, two asymptotically unbiased estimators are developed for the cumulative distribution function, which is obtained using maximum likelihood and moment-based approaches, and their asymptotic normalities are proved. Several numerical studies have shown that the proposed estimators have higher relative efficiencies than their counterparts in simple random sampling in analyzing either the upper or the lower tail of the parent distribution. The procedures that we developed are then implemented on a real dataset from the Third National Health and Nutrition Examination Survey (NHANES III) to estimate the prevalence of osteoporosis among adult women aged 50 and over. It is shown that in certain circumstances, the techniques that we have developed require only one-third of the sample size needed in SRS to achieve the desired precision. This results in a considerable reduction in time and cost compared to the standard SRS method.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106214"},"PeriodicalIF":0.8,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141937766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-03-01 | Epub Date: 2024-08-08 | DOI: 10.1016/j.jspi.2024.106222
Andrea Bratsberg, Magne Thoresen, Abhik Ghosh
Generalized linear mixed models are powerful tools for analyzing clustered data, where the unknown parameters are classically (and most commonly) estimated by maximum likelihood and restricted maximum likelihood procedures. However, since likelihood-based procedures are known to be highly sensitive to outliers, M-estimators have become popular as a means of obtaining robust estimates under possible data contamination. In this paper, we prove that for sufficiently smooth general loss functions defining the M-estimators in generalized linear mixed models, the tail probability of the deviation between the estimated and the true regression coefficients has an exponential bound. This implies an exponential rate of consistency of these M-estimators under appropriate assumptions, generalizing the existing exponential consistency results from univariate to multivariate responses. We further illustrate this theoretical result for two special cases, the maximum likelihood estimator and the robust minimum density power divergence estimator (a popular model-based M-estimator), in the settings of linear and logistic mixed models, comparing the theoretical rate with the empirical rate of convergence through simulation studies.
{"title":"Exponential consistency of M-estimators in generalized linear mixed models","authors":"Andrea Bratsberg , Magne Thoresen , Abhik Ghosh","doi":"10.1016/j.jspi.2024.106222","DOIUrl":"10.1016/j.jspi.2024.106222","url":null,"abstract":"<div><p>Generalized linear mixed models are powerful tools for analyzing clustered data, where the unknown parameters are classically (and most commonly) estimated by the maximum likelihood and restricted maximum likelihood procedures. However, since the likelihood-based procedures are known to be highly sensitive to outliers, M-estimators have become popular as a means to obtain robust estimates under possible data contamination. In this paper, we prove that for sufficiently smooth general loss functions defining the M-estimators in generalized linear mixed models, the tail probability of the deviation between the estimated and the true regression coefficients has an exponential bound. This implies an exponential rate of consistency of these M-estimators under appropriate assumptions, generalizing the existing exponential consistency results from univariate to multivariate responses. We have illustrated this theoretical result further for the special examples of the maximum likelihood estimator and the robust minimum density power divergence estimator, a popular example of model-based M-estimators, in the settings of linear and logistic mixed models, comparing it with the empirical rate of convergence through simulation studies.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"235 ","pages":"Article 106222"},"PeriodicalIF":0.8,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S037837582400079X/pdfft?md5=852e7e6dbe375fd6c8f548a7fe669070&pid=1-s2.0-S037837582400079X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141990800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-07-05 | DOI: 10.1016/j.jspi.2024.106211
Kiran Iftikhar, Manzoor Khan, Jake Olivier
In repeated measurements, regression to the mean (RTM) is the tendency of subjects with observed extreme values to move closer to the mean when measured a second time. Not accounting for RTM can lead to incorrect decisions, such as when natural variation is wrongly attributed to the effect of a treatment or intervention. A strategy for addressing RTM is to decompose the total effect, the expected difference in paired random variables conditional on the first being in the tail of its distribution, into RTM and unbiased treatment effects. The unbiased treatment effect can then be estimated by subtraction. Formulae are available in the literature to quantify RTM for Poisson-distributed data, which are constrained by mean–variance equivalence, but there are many real-life examples of overdispersed count data that are not well approximated by the Poisson. The negative binomial can be viewed as an explicitly overdispersed Poisson process in which the Poisson intensity is drawn from a gamma distribution. In this study, the truncated bivariate negative binomial distribution is used to decompose the total effect into RTM and treatment effects. Maximum likelihood and method-of-moments estimators are developed for the total, RTM, and treatment effects. A simulation study investigates the properties of these estimators and compares them with those developed under the Poisson assumption. Data on the incidence of dengue cases reported from 2007 to 2017 are used to estimate the total, RTM, and treatment effects.
{"title":"Regression to the mean for overdispersed count data","authors":"Kiran Iftikhar , Manzoor Khan , Jake Olivier","doi":"10.1016/j.jspi.2024.106211","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106211","url":null,"abstract":"<div><p>In repeated measurements, regression to the mean (RTM) is a tendency of subjects with observed extreme values to move closer to the mean when measured a second time. Not accounting for RTM could lead to incorrect decisions such as when observed natural variation is incorrectly attributed to the effect of a treatment/intervention. A strategy for addressing RTM is to decompose the <em>total effect</em>, the expected difference in paired random variables conditional on the first being in the tail of its distribution, into regression to the mean and unbiased treatment effects. The unbiased treatment effect can then be estimated by subtraction. Formulae are available in the literature to quantify RTM for Poisson distributed data which are constrained by mean–variance equivalence, although there are many real life examples of overdispersed count data that are not well approximated by the Poisson. The negative binomial can be considered an explicit overdispersed Poisson process where the Poisson intensity is chosen from a gamma distribution. In this study, the truncated bivariate negative binomial distribution is used to decompose the total effect formulae into RTM and treatment effects. Maximum likelihood estimators (MLE) and method of moments estimators are developed for the total, RTM, and treatment effects. A simulation study is carried out to investigate the properties of the estimators and compare them with those developed under the assumption of the Poisson process. Data on the incidence of dengue cases reported from 2007 to 2017 are used to estimate the total, RTM, and treatment effects.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106211"},"PeriodicalIF":0.8,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141606665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-05-28 | DOI: 10.1016/j.jspi.2024.106194
Anne-Marie Toparkus, Rafael Weißbach
In studies of lifetimes, the population occasionally contains statistical units that were born before data collection started. Left-truncated units are those that died before this start. For all other units, the age at the study start is often recorded, and we aim to test whether this second measurement is independent of the genuine measure of interest, the lifetime. Our basic model of dependence is the one-parameter Gumbel–Barnett copula. For simplicity, the marginal distribution of the lifetime is assumed to be exponential, and for the age at study start, namely the distribution of birth dates, we assume a uniform distribution. Also for simplicity, and to fit our application, we assume that units that die after our study period are also truncated. Using a result from point process theory, we can approximate the truncated sample by a Poisson process and thereby derive its likelihood. Identification, consistency, and the asymptotic distribution of the maximum likelihood estimator are derived. Testing for positive truncation dependence must include the hypothetical independence, which coincides with the boundary of the copula's parameter space. By non-standard theory, the maximum likelihood estimator of the exponential and copula parameters is distributed as a mixture of a two-dimensional and a one-dimensional normal distribution. For the proof, the third parameter, the unobservable sample size, is profiled out. An interesting finding is that viewing the data as a truncated sample differs from viewing it as a simple sample from the truncated population, but not by much. The application is to 55 thousand double-truncated lifetimes of German businesses that closed down over the period 2014 to 2016. The likelihood attains its maximum for the copula parameter at the boundary of the parameter space, so that the p-value of the test is 0.5. Life expectancy does not increase relative to the year of foundation. Using a Farlie–Gumbel–Morgenstern copula, which models both positive and negative dependence, we find that the life expectancy of German enterprises even decreases significantly over time. A simulation under the conditions of the application suggests that the tests retain the nominal level and have good power.
{"title":"Testing truncation dependence: The Gumbel–Barnett copula","authors":"Anne-Marie Toparkus, Rafael Weißbach","doi":"10.1016/j.jspi.2024.106194","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106194","url":null,"abstract":"<div><p>In studies on lifetimes, occasionally, the population contains statistical units that are born before the data collection has started. Left-truncated are units that deceased before this start. For all other units, the age at the study start often is recorded and we aim at testing whether this second measurement is independent of the genuine measure of interest, the lifetime. Our basic model of dependence is the one-parameter Gumbel–Barnett copula. For simplicity, the marginal distribution of the lifetime is assumed to be Exponential and for the age-at-study-start, namely the distribution of birth dates, we assume a Uniform. Also for simplicity, and to fit our application, we assume that units that die later than our study period, are also truncated. As a result from point process theory, we can approximate the truncated sample by a Poisson process and thereby derive its likelihood. Identification, consistency and asymptotic distribution of the maximum-likelihood estimator are derived. Testing for positive truncation dependence must include the hypothetical independence which coincides with the boundary of the copula’s parameter space. By non-standard theory, the maximum likelihood estimator of the exponential and the copula parameter is distributed as a mixture of a two- and a one-dimensional normal distribution. For the proof, the third parameter, the unobservable sample size, is profiled out. An interesting result is, that it differs to view the data as truncated sample, or, as simple sample from the truncated population, but not by much. The application are 55 thousand double-truncated lifetimes of German businesses that closed down over the period 2014 to 2016. The likelihood has its maximum for the copula parameter at the parameter space boundary so that the <span><math><mi>p</mi></math></span>-value of test is 0.5. The life expectancy does not increase relative to the year of foundation. Using a Farlie–Gumbel–Morgenstern copula, which models positive and negative dependence, finds that life expectancy of German enterprises even decreases significantly over time. A simulation under the condition of the application suggests that the tests retain the nominal level and have good power.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106194"},"PeriodicalIF":0.9,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S037837582400051X/pdfft?md5=a5bc737bb68bd11a1a31f4aeb333c40e&pid=1-s2.0-S037837582400051X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141240222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-06-05 | DOI: 10.1016/j.jspi.2024.106197
Lazaros Kanellopoulos, Konstadinos Politis
We study some properties of and relations between stochastic orders and aging classes related to the Laplace transform. In particular, we show that the NBU_Lt class of distributions is closed under convolution. We also obtain results for the ratio of the derivatives of the Laplace transforms of two distributions.
{"title":"Some results for stochastic orders and aging properties related to the Laplace transform","authors":"Lazaros Kanellopoulos, Konstadinos Politis","doi":"10.1016/j.jspi.2024.106197","DOIUrl":"10.1016/j.jspi.2024.106197","url":null,"abstract":"<div><p>We study some properties and relations for stochastic orders and aging classes related to the Laplace transform. In particular, we show that the NBU<span><math><msub><mrow></mrow><mrow><mtext>Lt</mtext></mrow></msub></math></span> class of distributions is closed under convolution. We also obtain results for the ratio of derivatives of the Laplace transform between two distributions.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106197"},"PeriodicalIF":0.9,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141403038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-05-15 | DOI: 10.1016/j.jspi.2024.106191
Richeng Hu, Ngai-Hang Chan, Rongmao Zhang
A self-normalized approach for testing the stationarity of a d-dimensional random field is considered in this paper. Because the discrete Fourier transforms (DFTs) at the fundamental frequencies of a second-order stationary random field are asymptotically uncorrelated (see Bandyopadhyay and Subba Rao, 2017), one can construct a stationarity test based on the sample covariance of the DFTs. Such a test is usually inferior, however, because it involves an overestimated scale parameter that leads to low size and power. To circumvent this shortcoming, this paper proposes two self-normalized statistics, based on the extreme value and the partial sum of the sample covariances of the DFTs. Under certain regularity conditions, the proposed tests are shown to converge to functionals of Brownian motion. Simulations and a data analysis demonstrate the outstanding performance of the proposed tests.
{"title":"Self-normalized inference for stationarity of irregular spatial data","authors":"Richeng Hu , Ngai-Hang Chan , Rongmao Zhang","doi":"10.1016/j.jspi.2024.106191","DOIUrl":"10.1016/j.jspi.2024.106191","url":null,"abstract":"<div><p>A self-normalized approach for testing the stationarity of a <span><math><mi>d</mi></math></span>-dimensional random field is considered in this paper. Because the discrete Fourier transforms (DFT) at fundamental frequencies of a second-order stationary random field are asymptotically uncorrelated (see Bandyopadhyay and Subba Rao, 2017), one can construct a stationarity test based on the sample covariance of the DFTs. Such a test is usually inferior because it involves an overestimated scale parameter that leads to low size and power. To circumvent this shortcoming, this paper proposes two self-normalized statistics based on extreme value and partial sum of the sample covariance of the DFTs. Under certain regularity conditions, it is shown that the proposed tests converge to functionals of Brownian motion. Simulations and a data analysis demonstrate the outstanding performance of the proposed tests.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106191"},"PeriodicalIF":0.9,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141046356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-06-08 | DOI: 10.1016/j.jspi.2024.106199
Jaehoan Kim, Junyong Park, Hoyoung Park
Linear discriminant analysis (LDA) is a typical method for classification problems with large dimensions and small samples. There are various types of LDA methods, based on different estimators of the covariance matrices and mean vectors. In this paper, we consider shrinkage methods based on a non-parametric approach. For the precision matrix, methods based on a sparsity structure or on data splitting are examined. For the estimation of the mean vectors, Non-parametric Empirical Bayes (NPEB) and Non-parametric Maximum Likelihood Estimation (NPMLE) methods, also known as f-modeling and g-modeling, respectively, are adopted. The performance of linear discriminant rules based on combined estimation strategies for the covariance matrix and mean vectors is analyzed. In particular, the study presents a theoretical result on the performance of the NPEB method and compares it with previous studies. Simulation studies with various covariance matrix and mean vector structures are conducted to evaluate the methods discussed in this paper. Furthermore, real-data examples such as gene expression and EEG data are also presented.
{"title":"High dimensional discriminant rules with shrinkage estimators of the covariance matrix and mean vector","authors":"Jaehoan Kim , Junyong Park , Hoyoung Park","doi":"10.1016/j.jspi.2024.106199","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106199","url":null,"abstract":"<div><p>Linear discriminant analysis (LDA) is a typical method for classification problems with large dimensions and small samples. There are various types of LDA methods that are based on the different types of estimators for the covariance matrices and mean vectors. In this paper, we consider shrinkage methods based on a non-parametric approach. For the precision matrix, methods based on the sparsity structure or data splitting are examined. Regarding the estimation of mean vectors, Non-parametric Empirical Bayes (NPEB) methods and Non-parametric Maximum Likelihood Estimation (NPMLE) methods, also known as <span><math><mi>f</mi></math></span>-modeling and <span><math><mi>g</mi></math></span>-modeling, respectively, are adopted. The performance of linear discriminant rules based on combined estimation strategies of the covariance matrix and mean vectors are analyzed in this study. Particularly, the study presents a theoretical result on the performance of the NPEB method and compares it with previous studies. Simulation studies with various covariance matrices and mean vector structures are conducted to evaluate the methods discussed in this paper. Furthermore, real data examples such as gene expressions and EEG data are also presented.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106199"},"PeriodicalIF":0.9,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141422982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | Epub Date: 2024-06-05 | DOI: 10.1016/j.jspi.2024.106188
Michael Kohler, Sophie Langer
Convolutional neural networks (CNNs) trained with cross-entropy loss have proven extremely successful in classifying images. In recent years, much work has been done to improve the theoretical understanding of neural networks. Nevertheless, this understanding remains limited when the networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper, we aim to fill this gap by analysing the rate of convergence of the excess risk of a CNN classifier trained with cross-entropy loss. Under suitable assumptions on the smoothness and structure of the a posteriori probability, it is shown that these classifiers achieve a rate of convergence that is independent of the dimension of the image. These rates are in line with practical observations about CNNs.
{"title":"Statistical theory for image classification using deep convolutional neural network with cross-entropy loss under the hierarchical max-pooling model","authors":"Michael Kohler , Sophie Langer","doi":"10.1016/j.jspi.2024.106188","DOIUrl":"https://doi.org/10.1016/j.jspi.2024.106188","url":null,"abstract":"<div><p>Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to also improve the theoretical understanding of neural networks. Nevertheless, it seems limited when these networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper, we aim to fill this gap by analysing the rate of the excess risk of a CNN classifier trained by cross-entropy loss. Under suitable assumptions on the smoothness and structure of the a posteriori probability, it is shown that these classifiers achieve a rate of convergence which is independent of the dimension of the image. These rates are in line with the practical observations about CNNs.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106188"},"PeriodicalIF":0.9,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0378375824000454/pdfft?md5=68a8b5f0ef9e0563ac8f09f8ca152533&pid=1-s2.0-S0378375824000454-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141422984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}