Journal of Statistical Planning and Inference: Latest Articles

Bayes oracle property of multiple tests of multivariate normal means under sparsity
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-03-01. Epub Date: 2024-08-22. DOI: 10.1016/j.jspi.2024.106227
Zikun Qin, Malay Ghosh

The paper considers a multiple testing problem of multivariate normal means under sparsity. First, the Bayes risk of the multivariate Bayes oracle is derived. Then, a hierarchical Bayesian approach is taken with global–local shrinkage priors, where the global parameter is either treated as a tuning parameter or given a specific prior. The method is shown to attain the asymptotic Bayes optimality under sparsity (ABOS) property. Finally, an empirical Bayes procedure is proposed which involves estimation of the global shrinkage parameter. This approach is also shown to lead to the ABOS property.

Journal of Statistical Planning and Inference, Volume 235, Article 106227.
Citations: 0
A graph decomposition-based approach for the graph-fused lasso
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-03-01. Epub Date: 2024-08-10. DOI: 10.1016/j.jspi.2024.106221
Feng Yu, Archer Yi Yang, Teng Zhang

We propose a new algorithm for solving the graph-fused lasso (GFL), a regularized model that assumes the signal tends to be locally constant over a predefined graph structure. The proposed method applies a novel decomposition of the objective function for the alternating direction method of multipliers (ADMM) algorithm. While ADMM has been widely used in fused lasso problems, existing works such as the network lasso decompose the objective function into the loss function component and the total variation penalty component. In contrast, based on the graph matching technique in graph theory, we propose a new method of decomposition that separates the objective function into two components, where one component is the loss function plus part of the total variation penalty, and the other component is the remaining total variation penalty. We derive the exact convergence rate of the proposed algorithm by developing a general theory on the local convergence of ADMM. Compared with the network lasso algorithm, our algorithm has a faster exact linear convergence rate (although of the same order as that of the network lasso). It also has a smaller computational cost per iteration and thus converges faster overall in most numerical examples.
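
To make the idea of a matching-based split concrete, here is a minimal sketch. It is not the authors' algorithm. It assumes a squared-error loss and uses a simple greedy maximal matching; the helper names `greedy_matching` and `split_objective` are hypothetical. The abstract does not say how the paper selects its matching or how the ADMM updates are carried out.

```python
def gfl_objective(y, beta, edges, lam):
    """Graph-fused lasso objective with an assumed squared-error loss."""
    loss = 0.5 * sum((yi - bi) ** 2 for yi, bi in zip(y, beta))
    tv = sum(abs(beta[i] - beta[j]) for i, j in edges)
    return loss + lam * tv


def greedy_matching(edges):
    """Split edges into a matching (no shared vertices) and the remaining edges."""
    used = set()
    matched, rest = [], []
    for i, j in edges:
        if i not in used and j not in used:
            matched.append((i, j))
            used.update((i, j))
        else:
            rest.append((i, j))
    return matched, rest


def split_objective(y, beta, edges, lam):
    """Return (f, g): f = loss + penalty on matched edges, g = penalty on the rest."""
    matched, rest = greedy_matching(edges)
    loss = 0.5 * sum((yi - bi) ** 2 for yi, bi in zip(y, beta))
    f = loss + lam * sum(abs(beta[i] - beta[j]) for i, j in matched)
    g = lam * sum(abs(beta[i] - beta[j]) for i, j in rest)
    return f, g


# Chain graph 0-1-2-3: the greedy matching picks edges (0, 1) and (2, 3).
y = [1.0, 2.0, 4.0, 3.0]
beta = [1.0, 1.5, 3.0, 3.0]
edges = [(0, 1), (1, 2), (2, 3)]
f, g = split_objective(y, beta, edges, lam=0.5)
total = gfl_objective(y, beta, edges, lam=0.5)  # equals f + g
```

Because matched edges share no vertex, the first component in this sketch separates into independent subproblems of at most two variables each, which is one plausible reason such a split could be cheap to handle inside ADMM.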

Journal of Statistical Planning and Inference, Volume 235, Article 106221.
Citations: 0
Statistical inference from partially nominated sets: An application to estimating the prevalence of osteoporosis among adult women
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-03-01. Epub Date: 2024-07-26. DOI: 10.1016/j.jspi.2024.106214
Zeinab Akbari Ghamsari, Ehsan Zamanzade, Majid Asadi

This paper focuses on drawing statistical inference based on a novel variant of maxima or minima nomination sampling (NS) designs. These sampling designs are useful for obtaining more representative sample units from the tails of the population distribution using the available auxiliary ranking information. However, one common difficulty in performing NS in practice is that the researcher cannot obtain a nominated sample unless he/she uniquely determines the sample unit with the highest or the lowest rank in each set. To overcome this problem, a variant of NS called partial nomination sampling is proposed, in which the researcher is allowed to declare that two or more units are tied in the ranks whenever he/she cannot find the sample unit with the highest or the lowest rank. Based on this sampling design, two asymptotically unbiased estimators of the cumulative distribution function are developed using maximum likelihood and moment-based approaches, and their asymptotic normality is proved. Several numerical studies show that the proposed estimators have higher relative efficiencies than their counterparts in simple random sampling (SRS) when analyzing either the upper or the lower tail of the parent distribution. The procedures that we developed are then applied to a real dataset from the Third National Health and Nutrition Examination Survey (NHANES III) to estimate the prevalence of osteoporosis among adult women aged 50 and over. It is shown that in certain circumstances, the techniques that we have developed require only one-third of the sample size needed in SRS to achieve the desired precision. This results in a considerable reduction in time and cost compared to the standard SRS method.
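
As background for the design described above, the following sketch illustrates classical maxima nomination sampling under perfect ranking, not the partial (tie-allowing) variant the paper proposes. It relies on the standard fact that the maximum of k independent units has distribution function F(t)^k, so a moment-type estimate of F(t) is the k-th root of the empirical CDF of the nominated maxima. The function names are hypothetical, and the Uniform(0, 1) population is chosen only for illustration.

```python
import random


def maxima_nomination_sample(draw, n_sets, set_size, rng):
    """Keep the largest unit from each of n_sets sets (perfect ranking assumed)."""
    return [max(draw(rng) for _ in range(set_size)) for _ in range(n_sets)]


def cdf_from_maxima(sample, t, set_size):
    """Moment-type estimate of F(t): invert P(max <= t) = F(t) ** set_size."""
    frac = sum(1 for x in sample if x <= t) / len(sample)
    return frac ** (1.0 / set_size)


rng = random.Random(0)
sample = maxima_nomination_sample(lambda r: r.random(), n_sets=2000, set_size=3, rng=rng)
# For Uniform(0, 1) the true value is F(0.9) = 0.9.
estimate = cdf_from_maxima(sample, t=0.9, set_size=3)
```

Since the nominated units concentrate in the upper tail, most of the sample carries information about large quantiles, which is the intuition behind the efficiency gains over SRS reported for tail analysis.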

Journal of Statistical Planning and Inference, Volume 235, Article 106214.
Citations: 0
Exponential consistency of M-estimators in generalized linear mixed models
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-03-01. Epub Date: 2024-08-08. DOI: 10.1016/j.jspi.2024.106222
Andrea Bratsberg, Magne Thoresen, Abhik Ghosh

Generalized linear mixed models are powerful tools for analyzing clustered data, where the unknown parameters are classically (and most commonly) estimated by the maximum likelihood and restricted maximum likelihood procedures. However, since the likelihood-based procedures are known to be highly sensitive to outliers, M-estimators have become popular as a means to obtain robust estimates under possible data contamination. In this paper, we prove that for sufficiently smooth general loss functions defining the M-estimators in generalized linear mixed models, the tail probability of the deviation between the estimated and the true regression coefficients has an exponential bound. This implies an exponential rate of consistency of these M-estimators under appropriate assumptions, generalizing the existing exponential consistency results from univariate to multivariate responses. We have illustrated this theoretical result further for the special examples of the maximum likelihood estimator and the robust minimum density power divergence estimator, a popular example of model-based M-estimators, in the settings of linear and logistic mixed models, comparing it with the empirical rate of convergence through simulation studies.

Journal of Statistical Planning and Inference, Volume 235, Article 106222. Open access PDF: https://www.sciencedirect.com/science/article/pii/S037837582400079X/pdfft?md5=852e7e6dbe375fd6c8f548a7fe669070&pid=1-s2.0-S037837582400079X-main.pdf
Citations: 0
Regression to the mean for overdispersed count data
IF 0.8, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-01-01. Epub Date: 2024-07-05. DOI: 10.1016/j.jspi.2024.106211
Kiran Iftikhar, Manzoor Khan, Jake Olivier

In repeated measurements, regression to the mean (RTM) is a tendency of subjects with observed extreme values to move closer to the mean when measured a second time. Not accounting for RTM could lead to incorrect decisions such as when observed natural variation is incorrectly attributed to the effect of a treatment/intervention. A strategy for addressing RTM is to decompose the total effect, the expected difference in paired random variables conditional on the first being in the tail of its distribution, into regression to the mean and unbiased treatment effects. The unbiased treatment effect can then be estimated by subtraction. Formulae are available in the literature to quantify RTM for Poisson distributed data which are constrained by mean–variance equivalence, although there are many real life examples of overdispersed count data that are not well approximated by the Poisson. The negative binomial can be considered an explicit overdispersed Poisson process where the Poisson intensity is chosen from a gamma distribution. In this study, the truncated bivariate negative binomial distribution is used to decompose the total effect formulae into RTM and treatment effects. Maximum likelihood estimators (MLE) and method of moments estimators are developed for the total, RTM, and treatment effects. A simulation study is carried out to investigate the properties of the estimators and compare them with those developed under the assumption of the Poisson process. Data on the incidence of dengue cases reported from 2007 to 2017 are used to estimate the total, RTM, and treatment effects.
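
The gamma-mixed Poisson view of the negative binomial gives a simple closed form that illustrates RTM. The sketch below is a simplification and not the paper's truncated bivariate formulae. It assumes two counts that are conditionally independent Poisson given a shared gamma intensity, no treatment effect, and conditioning on a single observed first count `x` rather than on the first count lying in the tail. The function and parameter names are hypothetical.

```python
def expected_second_count(x, shape, scale):
    """E[X2 | X1 = x] when X1, X2 | lam ~ Poisson(lam) and lam ~ Gamma(shape, scale).

    The posterior of lam given X1 = x is Gamma(shape + x, rate = 1/scale + 1),
    whose mean is (shape + x) * scale / (1 + scale).
    """
    return (shape + x) * scale / (1.0 + scale)


def rtm_effect(x, shape, scale):
    """Expected drop x - E[X2 | X1 = x] with no treatment; equals (x - shape*scale) / (1 + scale)."""
    return x - expected_second_count(x, shape, scale)


# Population mean is shape * scale = 3; an observed count of 10 is extreme.
drop = rtm_effect(10, shape=2.0, scale=1.5)
```

Under these assumptions, a first count at the population mean has zero expected change, while counts above the mean are expected to fall on remeasurement even without any intervention. A treatment effect would then be whatever change remains after this expected drop is subtracted.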

Journal of Statistical Planning and Inference, Volume 234, Article 106211.
Citations: 0
Testing truncation dependence: The Gumbel–Barnett copula
IF 0.9, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-01-01. Epub Date: 2024-05-28. DOI: 10.1016/j.jspi.2024.106194
Anne-Marie Toparkus, Rafael Weißbach

In studies on lifetimes, the population occasionally contains statistical units that are born before the data collection has started. Units that died before this start are left-truncated. For all other units, the age at the study start is often recorded, and we aim at testing whether this second measurement is independent of the genuine measure of interest, the lifetime. Our basic model of dependence is the one-parameter Gumbel–Barnett copula. For simplicity, the marginal distribution of the lifetime is assumed to be exponential, and for the age at study start, namely the distribution of birth dates, we assume a uniform distribution. Also for simplicity, and to fit our application, we assume that units that die after the end of our study period are also truncated. As a result from point process theory, we can approximate the truncated sample by a Poisson process and thereby derive its likelihood. Identification, consistency and the asymptotic distribution of the maximum likelihood estimator are derived. Testing for positive truncation dependence must include the hypothetical independence, which coincides with the boundary of the copula's parameter space. By non-standard theory, the maximum likelihood estimator of the exponential and the copula parameter is distributed as a mixture of a two- and a one-dimensional normal distribution. For the proof, the third parameter, the unobservable sample size, is profiled out. An interesting result is that it makes a difference, though not a large one, whether the data are viewed as a truncated sample or as a simple sample from the truncated population. The application consists of 55 thousand double-truncated lifetimes of German businesses that closed down over the period 2014 to 2016. The likelihood has its maximum for the copula parameter at the parameter space boundary, so that the p-value of the test is 0.5. The life expectancy does not increase relative to the year of foundation. Using a Farlie–Gumbel–Morgenstern copula, which models positive and negative dependence, we find that the life expectancy of German enterprises even decreases significantly over time. A simulation under the conditions of the application suggests that the tests retain the nominal level and have good power.
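
For readers unfamiliar with the Gumbel–Barnett family, the sketch below evaluates its distribution function in the common textbook parametrization C(u, v) = u v exp(-theta ln u ln v) with theta in [0, 1]. This is an assumption: the paper may parametrize or orient the copula differently. The function name is hypothetical. The point of the example is that theta = 0, the boundary of the parameter space, gives the independence copula u v.

```python
import math


def gumbel_barnett_cdf(u, v, theta):
    """Gumbel-Barnett copula C(u, v) = u * v * exp(-theta * ln(u) * ln(v)).

    Textbook parametrization with 0 <= theta <= 1 (assumed here);
    theta = 0 reduces to the independence copula u * v.
    """
    if not (0.0 < u <= 1.0 and 0.0 < v <= 1.0):
        raise ValueError("u and v must lie in (0, 1]")
    if not (0.0 <= theta <= 1.0):
        raise ValueError("theta must lie in [0, 1]")
    return u * v * math.exp(-theta * math.log(u) * math.log(v))


independent = gumbel_barnett_cdf(0.3, 0.7, theta=0.0)  # equals 0.3 * 0.7
dependent = gumbel_barnett_cdf(0.3, 0.7, theta=1.0)    # smaller than 0.3 * 0.7
```

Because the null hypothesis of independence sits on the boundary theta = 0, the usual interior-point asymptotics do not apply, which is consistent with the non-standard mixture limit mentioned in the abstract.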

Journal of Statistical Planning and Inference, Volume 234, Article 106194. Open access PDF: https://www.sciencedirect.com/science/article/pii/S037837582400051X/pdfft?md5=a5bc737bb68bd11a1a31f4aeb333c40e&pid=1-s2.0-S037837582400051X-main.pdf
Citations: 0
Some results for stochastic orders and aging properties related to the Laplace transform
IF 0.9, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-01-01. Epub Date: 2024-06-05. DOI: 10.1016/j.jspi.2024.106197
Lazaros Kanellopoulos, Konstadinos Politis

We study some properties and relations for stochastic orders and aging classes related to the Laplace transform. In particular, we show that the NBU_Lt class of distributions is closed under convolution. We also obtain results for the ratio of derivatives of the Laplace transform between two distributions.

Journal of Statistical Planning and Inference, Volume 234, Article 106197.
Citations: 0
Self-normalized inference for stationarity of irregular spatial data
IF 0.9, Zone 4 (Mathematics), Q3 STATISTICS & PROBABILITY. Pub Date: 2025-01-01. Epub Date: 2024-05-15. DOI: 10.1016/j.jspi.2024.106191
Richeng Hu, Ngai-Hang Chan, Rongmao Zhang

A self-normalized approach for testing the stationarity of a d-dimensional random field is considered in this paper. Because the discrete Fourier transforms (DFT) at fundamental frequencies of a second-order stationary random field are asymptotically uncorrelated (see Bandyopadhyay and Subba Rao, 2017), one can construct a stationarity test based on the sample covariance of the DFTs. Such a test is usually inferior because it involves an overestimated scale parameter that leads to low size and power. To circumvent this shortcoming, this paper proposes two self-normalized statistics based on extreme value and partial sum of the sample covariance of the DFTs. Under certain regularity conditions, it is shown that the proposed tests converge to functionals of Brownian motion. Simulations and a data analysis demonstrate the outstanding performance of the proposed tests.

本文考虑采用自归一化方法来测试 d 维随机场的静止性。由于二阶静止随机场基频的离散傅里叶变换(DFT)近似不相关(见 Bandyopadhyay 和 Subba Rao,2017 年),因此可以根据 DFT 的样本协方差构建静止性检验。这种检验通常效果较差,因为它涉及到一个被高估的尺度参数,导致规模和功率都较低。为了规避这一缺陷,本文提出了两种基于 DFT 样本协方差极值和偏和的自归一化统计量。在一定的正则条件下,本文证明了所提出的检验收敛于布朗运动的函数。模拟和数据分析证明了所提检验的卓越性能。
{"title":"Self-normalized inference for stationarity of irregular spatial data","authors":"Richeng Hu ,&nbsp;Ngai-Hang Chan ,&nbsp;Rongmao Zhang","doi":"10.1016/j.jspi.2024.106191","DOIUrl":"10.1016/j.jspi.2024.106191","url":null,"abstract":"<div><p>A self-normalized approach for testing the stationarity of a <span><math><mi>d</mi></math></span>-dimensional random field is considered in this paper. Because the discrete Fourier transforms (DFT) at fundamental frequencies of a second-order stationary random field are asymptotically uncorrelated (see Bandyopadhyay and Subba Rao, 2017), one can construct a stationarity test based on the sample covariance of the DFTs. Such a test is usually inferior because it involves an overestimated scale parameter that leads to low size and power. To circumvent this shortcoming, this paper proposes two self-normalized statistics based on extreme value and partial sum of the sample covariance of the DFTs. Under certain regularity conditions, it is shown that the proposed tests converge to functionals of Brownian motion. Simulations and a data analysis demonstrate the outstanding performance of the proposed tests.</p></div>","PeriodicalId":50039,"journal":{"name":"Journal of Statistical Planning and Inference","volume":"234 ","pages":"Article 106191"},"PeriodicalIF":0.9,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141046356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
High dimensional discriminant rules with shrinkage estimators of the covariance matrix and mean vector
IF 0.9 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2025-01-01 Epub Date: 2024-06-08 DOI: 10.1016/j.jspi.2024.106199
Jaehoan Kim, Junyong Park, Hoyoung Park

Linear discriminant analysis (LDA) is a typical method for classification problems with large dimensions and small samples. There are various types of LDA methods, based on different estimators of the covariance matrices and mean vectors. In this paper, we consider shrinkage methods based on a non-parametric approach. For the precision matrix, methods based on the sparsity structure or data splitting are examined. For the mean vectors, Non-parametric Empirical Bayes (NPEB) methods and Non-parametric Maximum Likelihood Estimation (NPMLE) methods, also known as f-modeling and g-modeling, respectively, are adopted. The performance of linear discriminant rules based on combined estimation strategies for the covariance matrix and mean vectors is analyzed. In particular, the study presents a theoretical result on the performance of the NPEB method and compares it with previous studies. Simulation studies with various covariance matrices and mean vector structures are conducted to evaluate the methods discussed in this paper. Real data examples, such as gene expression and EEG data, are also presented.
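To make the idea of a shrinkage-based discriminant rule concrete, here is a minimal Python sketch of LDA in a large-dimension, small-sample setting. It swaps in a simple linear shrinkage of the pooled covariance toward a scaled identity with a fixed intensity `alpha`. This is not the NPEB, NPMLE, sparsity-based, or data-splitting estimators the abstract refers to, and the function names, `alpha`, and the simulated data are assumptions for illustration only.

```python
import numpy as np

def shrinkage_lda_fit(X0, X1, alpha=0.5):
    """Fit a two-class linear discriminant rule with a shrunken covariance.
    alpha in [0, 1] is a fixed, hand-picked shrinkage intensity."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    centered = np.vstack([X0 - mu0, X1 - mu1])
    S = centered.T @ centered / (len(centered) - 2)    # pooled covariance
    p = S.shape[0]
    target = np.trace(S) / p * np.eye(p)               # scaled identity
    S_shrunk = (1.0 - alpha) * S + alpha * target      # invertible if alpha > 0
    w = np.linalg.solve(S_shrunk, mu1 - mu0)           # discriminant direction
    b = -0.5 * w @ (mu0 + mu1)                         # midpoint threshold
    return w, b

def shrinkage_lda_predict(X, w, b):
    """Assign class 1 when the linear score is positive, else class 0."""
    return (X @ w + b > 0).astype(int)

# Simulated example with dimension p larger than the training sample size.
rng = np.random.default_rng(1)
p, n_train, n_test = 100, 20, 1000
mu1_true = np.full(p, 0.6)

X0 = rng.standard_normal((n_train, p))
X1 = rng.standard_normal((n_train, p)) + mu1_true
w, b = shrinkage_lda_fit(X0, X1, alpha=0.5)

X_test = np.vstack([
    rng.standard_normal((n_test, p)),
    rng.standard_normal((n_test, p)) + mu1_true,
])
y_test = np.repeat([0, 1], n_test)
accuracy = np.mean(shrinkage_lda_predict(X_test, w, b) == y_test)
print(accuracy)
```

With p = 100 and only 40 training points the pooled sample covariance is singular, so some form of regularization is needed before it can be inverted. The fixed `alpha` here is the crudest choice; data-driven estimators of the covariance and the means are the subject of the paper.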

Citations: 0
Statistical theory for image classification using deep convolutional neural network with cross-entropy loss under the hierarchical max-pooling model
IF 0.9 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2025-01-01 Epub Date: 2024-06-05 DOI: 10.1016/j.jspi.2024.106188
Michael Kohler, Sophie Langer

Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to improve the theoretical understanding of neural networks. Nevertheless, this understanding remains limited when the networks are trained with cross-entropy loss, mainly because the target function is unbounded. In this paper, we aim to fill this gap by analysing the rate of the excess risk of a CNN classifier trained with cross-entropy loss. Under suitable assumptions on the smoothness and structure of the a posteriori probability, it is shown that these classifiers achieve a rate of convergence which is independent of the dimension of the image. These rates are in line with the practical observations about CNNs.
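The unboundedness mentioned in this abstract can be illustrated with a standard fact about binary cross-entropy (logistic) loss, which is assumed here to be what the authors have in mind: at a point where the a posteriori probability of the positive class is eta, the score minimizing the expected loss is the logit log(eta / (1 - eta)), which grows without bound as eta approaches 0 or 1. The Python sketch below checks this numerically on a grid; the function names and the grid are illustrative choices.

```python
import numpy as np

def logistic_risk(f, eta):
    """Expected logistic loss of a real-valued score f at a point where
    P(Y = +1 | x) = eta, with labels coded as -1 / +1."""
    return eta * np.log1p(np.exp(-f)) + (1.0 - eta) * np.log1p(np.exp(f))

def logit(eta):
    """Closed-form minimizer of the pointwise logistic risk."""
    return np.log(eta / (1.0 - eta))

# Minimize the pointwise risk over a fine grid of candidate scores.
grid = np.linspace(-15.0, 15.0, 30001)
minimizers = {}
for eta in (0.5, 0.9, 0.999):
    minimizers[eta] = grid[np.argmin(logistic_risk(grid, eta))]
    print(eta, minimizers[eta], logit(eta))
```

The grid minimizer tracks the logit and moves further from zero as eta gets closer to 1. A target function of this kind is unbounded whenever the a posteriori probability can get arbitrarily close to 0 or 1.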

Citations: 0