Journal of Statistical Planning and Inference: Latest Articles

Estimation and group-feature selection in sparse mixture-of-experts with diverging number of parameters
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-07-01 Epub Date: 2024-11-19 DOI: 10.1016/j.jspi.2024.106250
Abbas Khalili, Archer Yi Yang, Xiaonan Da
Mixture-of-experts models provide flexible statistical models for a wide range of regression (supervised learning) problems. In many modern applications a large number of covariates (features) are available, yet only a small subset of them is useful in explaining a response variable of interest. This calls for a feature selection device. In this paper, we present new group-feature selection and estimation methods for sparse mixture-of-experts models when the number of features can be nearly comparable to the sample size. We prove the consistency of the methods in both parameter estimation and feature selection. We implement the methods using a modified EM algorithm combined with a proximal gradient method, which results in a convenient closed-form parameter update in the M-step of the algorithm. We examine the finite-sample performance of the methods through simulations, and demonstrate their application in a real data example exploring relationships in body measurements.
Volume 237, Article 106250. Citations: 0
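The abstract above mentions a proximal gradient step that gives a closed-form M-step update. As a rough illustration only, and not the authors' algorithm, the proximal operator of a group-lasso-type penalty has the well-known closed form of group soft-thresholding. The function names, the step size `step`, and the penalty level `lam` are assumptions made for this sketch.

```python
import math

def group_soft_threshold(beta_group, threshold):
    """Proximal operator of threshold * ||beta_group||_2 (group soft-thresholding)."""
    norm = math.sqrt(sum(b * b for b in beta_group))
    if norm <= threshold:
        # The whole group is set to zero, i.e. the feature group is dropped.
        return [0.0] * len(beta_group)
    scale = 1.0 - threshold / norm
    return [scale * b for b in beta_group]

def proximal_gradient_step(beta_group, grad_group, step, lam):
    """One update: plain gradient step, then group soft-thresholding (hypothetical helper)."""
    moved = [b - step * g for b, g in zip(beta_group, grad_group)]
    return group_soft_threshold(moved, step * lam)

shrunk = group_soft_threshold([3.0, 4.0], 1.0)   # norm 5, scaled by 0.8
dropped = group_soft_threshold([0.3, 0.4], 1.0)  # norm 0.5 <= 1, group removed
```

Because the update acts on a whole coefficient group at once, a feature is either kept with all of its coefficients shrunk or removed entirely, which is what makes the selection group-wise.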
Semi-parametric empirical likelihood inference on quantile difference between two samples with length-biased and right-censored data
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-07-01 Epub Date: 2024-11-14 DOI: 10.1016/j.jspi.2024.106249
Li Xun, Xin Guan, Yong Zhou
Exploring quantile differences between two populations at various probability levels offers valuable insights into their distinctions, which are essential for practical applications such as assessing treatment effects. However, estimating these differences can be challenging due to the complex data often encountered in clinical trials. This paper assumes that right-censored data and length-biased right-censored data originate from two populations of interest. We propose an adjusted smoothed empirical likelihood (EL) method for inferring quantile differences and establish the asymptotic properties of the proposed estimators. Under mild conditions, we demonstrate that the adjusted log-EL ratio statistics asymptotically follow the standard chi-squared distribution. We construct confidence intervals for the quantile differences using both normal and chi-squared approximations and develop a likelihood ratio test for these differences. The performance of our proposed methods is illustrated through simulation studies. Finally, we present a case study utilizing Oscar award nomination data to demonstrate the application of our method.
Volume 237, Article 106249. Citations: 0
Outcome dependent subsampling divide and conquer in generalized linear models for massive data
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-07-01 Epub Date: 2024-12-04 DOI: 10.1016/j.jspi.2024.106253
Jie Yin, Jieli Ding, Changming Yang
To overcome the constraints that limited computing power imposes on the processing of massive datasets, we propose an outcome dependent subsampling divide and conquer strategy in this paper. The proposed strategy can process data on multiple blocks in parallel and concentrate the computing resources of each block on the regions with the most information. We develop a distributed statistical inference method and propose a computationally efficient algorithm for generalized linear models with massive data. The proposed method only needs to preserve some summary statistics from each data block and then uses them to directly construct the proposed estimator. The asymptotic properties of the proposed method are established. Simulation studies and real data analysis are conducted to illustrate the merits of the proposed method.
Volume 237, Article 106253. Citations: 0
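The idea of keeping only summary statistics from each data block can be sketched in the simplest possible setting. The example below uses ordinary least squares with one covariate, where pooled block summaries reproduce the full-data fit exactly. It is an assumption-laden toy, not the paper's outcome dependent subsampling estimator for generalized linear models. The names `block_summary` and `combine_and_fit` are hypothetical.

```python
def block_summary(xs, ys):
    """Summary statistics of one data block for simple linear regression."""
    return {
        "n": len(xs),
        "sx": sum(xs),
        "sy": sum(ys),
        "sxx": sum(x * x for x in xs),
        "sxy": sum(x * y for x, y in zip(xs, ys)),
    }

def combine_and_fit(summaries):
    """Pool block summaries and solve the normal equations for (intercept, slope)."""
    keys = ("n", "sx", "sy", "sxx", "sxy")
    total = {k: sum(s[k] for s in summaries) for k in keys}
    n, sx, sy, sxx, sxy = (total[k] for k in keys)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope

# Two blocks drawn from the exact line y = 1 + 2x; each block could live on a different machine.
blocks = [
    ([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]),
    ([3.0, 4.0], [7.0, 9.0]),
]
intercept, slope = combine_and_fit([block_summary(xs, ys) for xs, ys in blocks])
```

Only five numbers per block are communicated, so the raw data never need to be gathered in one place.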
Maximum likelihood estimation of short panel autoregressive models with flexible form of fixed effects
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-07-01 Epub Date: 2024-12-18 DOI: 10.1016/j.jspi.2024.106252
Kazuhiko Hayakawa, Boyan Yin
This paper proposes the maximum likelihood (ML) estimator for a short panel autoregressive model with a flexible form of observed factors as well as unknown interactive fixed effects. We show that the ML estimator is consistent and asymptotically normally distributed as the number of cross-sectional units increases with the number of time periods being fixed. It should be noted that this asymptotic result holds uniformly for the autoregressive coefficient less than, equal to, or greater than one, in sharp contrast to existing estimators. Monte Carlo simulation results show that the ML estimator has desirable finite sample properties.
Volume 237, Article 106252. Citations: 0
Marginally constrained nonparametric Bayesian inference through Gaussian processes
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-07-01 Epub Date: 2024-12-30 DOI: 10.1016/j.jspi.2024.106261
Bingjing Tang, Vinayak Rao
Nonparametric Bayesian models are used routinely as flexible and powerful models of complex data. In many situations, an applied scientist may have additional informative beliefs about the data distribution of interest, for instance, the distribution of its mean or of a subset of its components. These beliefs will often not be compatible with the nonparametric prior. An important challenge is then to incorporate this partial prior belief into nonparametric Bayesian models. In this paper, we are motivated by settings where practitioners have additional distributional information about a subset of the coordinates of the observations being modeled. Our approach links this problem to that of conditional density modeling. Our main idea is a novel constrained Bayesian model, based on a perturbation of a parametric distribution with a transformed Gaussian process prior on the perturbation function. We develop a corresponding posterior sampling method based on data augmentation. We illustrate the efficacy of our proposed constrained nonparametric Bayesian model in a variety of real-world scenarios including modeling environmental and earthquake data.
Volume 237, Article 106261. Citations: 0
Deterministic construction methods for asymmetrical uniform designs
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-07-01 Epub Date: 2024-12-28 DOI: 10.1016/j.jspi.2024.106262
Liuping Hu, Kashinath Chatterjee, Jianhui Ning, Hong Qin
Asymmetrical (mixed-level) uniform designs are useful for both computer and physical experiments. However, constructing these designs is often challenging due to their complex asymmetrical structure. In this paper, we propose novel methods for constructing uniform designs with mixed two-, three-, and four/nine-levels. Our construction methods are deterministic, allowing us to circumvent the complexity associated with stochastic algorithms. We evaluate uniformity using the wrap-around L2- and Lee discrepancies. We establish useful analytic relationships between uniformity and aberration, and derive new general lower bounds for discrepancies that are tighter than those currently available in the literature. These new benchmarks can effectively measure the uniformity of asymmetrical designs. Additionally, we provide examples demonstrating the efficacy of our construction methods and the relevance of the newly obtained lower bounds. Finally, through simulations, we show that the designs produced using our methods perform well in constructing statistical surrogate models.
Volume 237, Article 106262. Citations: 0
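For readers unfamiliar with the uniformity measures named in the abstract above, the squared wrap-around L2-discrepancy of n points in the s-dimensional unit cube has the standard closed form WD^2 = -(4/3)^s + (1/n^2) * sum over i, j of prod over k of [3/2 - |x_ik - x_jk| (1 - |x_ik - x_jk|)]. The sketch below evaluates it directly. Mapping level l of a q-level factor to the midpoint (2l + 1) / (2q) is a common convention assumed here, and both function names are hypothetical. The paper's lower bounds and construction methods are not reproduced.

```python
def wraparound_l2_discrepancy_sq(points):
    """Squared wrap-around L2-discrepancy of points in the unit cube [0, 1)^s."""
    n = len(points)
    s = len(points[0])
    total = 0.0
    for xi in points:
        for xj in points:
            prod = 1.0
            for a, b in zip(xi, xj):
                d = abs(a - b)
                prod *= 1.5 - d * (1.0 - d)
            total += prod
    return -((4.0 / 3.0) ** s) + total / (n * n)

def levels_to_unit_cube(design, level_counts):
    """Map integer levels 0..q-1 of each factor to cell midpoints (2*level + 1) / (2*q)."""
    return [
        [(2 * lv + 1) / (2 * q) for lv, q in zip(row, level_counts)]
        for row in design
    ]

# A mixed-level toy design: one two-level factor and one three-level factor (full factorial).
design = [(a, b) for a in range(2) for b in range(3)]
wd_sq = wraparound_l2_discrepancy_sq(levels_to_unit_cube(design, [2, 3]))
```

Smaller values indicate a more uniform spread of the design points, so the quantity can be used to compare candidate designs of the same size.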
A family of discrete maximum-entropy distributions
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-05-01 Epub Date: 2024-10-01 DOI: 10.1016/j.jspi.2024.106243
David J. Hessen
In this paper, a family of maximum-entropy distributions with general discrete support is derived. Members of the family are distinguished by the number of specified non-central moments. In addition, a subfamily of discrete symmetric distributions is defined. Attention is paid to maximum likelihood estimation of the parameters of any member of the general family. It is shown that the parameters of any special case with infinite support can be estimated using a conditional distribution given a finite subset of the total support. In an empirical data example, the procedures proposed are demonstrated.
Volume 236, Article 106243. Citations: 0
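As background for the abstract above: maximizing entropy subject to constraints on the first m non-central moments gives, by a standard Lagrangian argument, a probability mass function proportional to exp(theta_1 x + ... + theta_m x^m) on the support. The sketch below evaluates such a pmf on a finite support. The parameterization and the name `maxent_pmf` are assumptions for illustration, and the paper's exact family may be written differently.

```python
import math

def maxent_pmf(support, thetas):
    """pmf proportional to exp(sum_k thetas[k-1] * x**k) on a finite support."""
    weights = [
        math.exp(sum(th * x ** (k + 1) for k, th in enumerate(thetas)))
        for x in support
    ]
    normalizer = sum(weights)
    return [w / normalizer for w in weights]

def moment(support, pmf, order):
    """Non-central moment of the given order under the pmf."""
    return sum(p * x ** order for x, p in zip(support, pmf))

support = [0, 1, 2, 3]
uniform = maxent_pmf(support, [0.0])       # no tilt: uniform on the support
tilted = maxent_pmf(support, [0.5, -0.2])  # two specified moments, hypothetical values
```

With all parameters equal to zero the distribution is uniform, which is the maximum-entropy distribution when no moments are specified.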
On schematic orthogonal arrays of high strength
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-05-01 Epub Date: 2024-09-04 DOI: 10.1016/j.jspi.2024.106230
Rong Yan, Shanqi Pang, Jing Wang, Mengqian Chen
Schematic orthogonal arrays are closely related to association schemes. Which orthogonal arrays are schematic, and how to classify them, is an open problem proposed by Hedayat et al. (1999). Using Hamming distances, this paper presents some general methods for constructing schematic symmetric and mixed orthogonal arrays of high strength. As applications of these methods, we construct association schemes and many new schematic orthogonal arrays, including several infinite classes of such arrays. Some examples are provided to illustrate the construction methods. The paper gives a partial solution to the problem posed by Hedayat et al. (1999) for symmetric and mixed orthogonal arrays of high strength.
Volume 236, Article 106230. Citations: 0
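Two notions in the abstract above are easy to make concrete: the strength of an orthogonal array and the Hamming distance between its runs. The sketch below checks strength by brute force and lists pairwise distances for a small classical array. It does not test the schematic property itself, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations, product

def hamming_distance(row_a, row_b):
    """Number of positions in which two runs differ."""
    return sum(a != b for a, b in zip(row_a, row_b))

def has_strength(array, levels, t):
    """True if every choice of t columns contains each level combination equally often."""
    n_rows = len(array)
    for cols in combinations(range(len(array[0])), t):
        n_combos = 1
        for c in cols:
            n_combos *= levels[c]
        if n_rows % n_combos != 0:
            return False
        counts = Counter(tuple(row[c] for c in cols) for row in array)
        target = n_rows // n_combos
        all_combos = product(*(range(levels[c]) for c in cols))
        if any(counts[combo] != target for combo in all_combos):
            return False
    return True

# Four runs, three two-level factors: a classical orthogonal array of strength 2.
oa = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
distances = sorted(hamming_distance(a, b) for a, b in combinations(oa, 2))
```

The `levels` argument takes one entry per column, so the same check applies to mixed orthogonal arrays whose factors have different numbers of levels.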
Zero-inflated multivariate tobit regression modeling
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-05-01 Epub Date: 2024-09-03 DOI: 10.1016/j.jspi.2024.106229
Becky Tang, Henry A. Frye, John A. Silander Jr., Alan E. Gelfand
A frequent challenge encountered in real-world applications is data having a high proportion of zeros. Focusing on ecological abundance data, much attention has been given to zero-inflated count data. Models for non-negative continuous abundance data with an excess of zeros are rarely discussed. Work presented here considers the creation of a point mass at zero through a left-censoring approach or through a hurdle approach. We incorporate both mechanisms to capture the analog of zero-inflation for count data. Additionally, primary attention has been given to univariate zero-inflated modeling (e.g., single species), whereas data often arise jointly (e.g., a collection of species). With multivariate abundance data, a key issue is to capture dependence among the species at a site, both in terms of positive abundance as well as absence. Therefore, our contribution is a model for multivariate zero-inflated continuous data that are non-negative. Working in a Bayesian framework, we discuss the issue of separating the two sources of zeros and offer model comparison metrics for multivariate zero-inflated data. In an application, we model the total biomass for five tree species obtained from plots established in the Forest Inventory Analysis database in the Northeast region of the United States.

Volume 236, Article 106229. Citations: 0
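The two sources of zeros described in the abstract above, a hurdle and left-censoring, can be illustrated with a univariate simulation. This is a deliberately simplified sketch: the paper's model is multivariate and Bayesian, whereas the function below, its name, and its parameters are assumptions chosen only to show how the two mechanisms combine.

```python
import random

def simulate_zero_inflated_tobit(n, mean, sd, p_hurdle_zero, seed=0):
    """Draw n non-negative values with zeros from a hurdle and from left-censoring."""
    rng = random.Random(seed)
    values = []
    sources = []
    for _ in range(n):
        if rng.random() < p_hurdle_zero:
            values.append(0.0)           # zero from the hurdle (absence)
            sources.append("hurdle")
        else:
            latent = rng.gauss(mean, sd)
            if latent <= 0.0:
                values.append(0.0)       # zero from left-censoring of the latent value
                sources.append("censored")
            else:
                values.append(latent)    # positive abundance is observed as is
                sources.append("positive")
    return values, sources

values, sources = simulate_zero_inflated_tobit(n=1000, mean=0.5, sd=1.0, p_hurdle_zero=0.3)
```

In observed data only `values` would be available, which is why separating the two kinds of zeros is a modeling question rather than a bookkeeping one.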
Effect of dimensionality on convergence rates of kernel ridge regression estimator
IF 0.8 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2025-05-01 Epub Date: 2024-08-26 DOI: 10.1016/j.jspi.2024.106228
Kwan-Young Bak, Woojoo Lee
Despite the curse of dimensionality, kernel ridge regression often exhibits good performance in practical applications, even when the dimension is moderately large. However, it has been shown that kernel ridge regression cannot be free from the curse of dimensionality. Until now, the literature on kernel ridge regression has suggested that the gap between theory and practice in relation to dimensionality has not narrowed. In this study, we first investigate when dimensionality does not significantly affect the convergence rate of kernel ridge regression. Specifically, we study the convergence rate of the L2 and L∞ risks for the kernel ridge estimator, with a focus on the reproducing kernel Hilbert space (RKHS) generated by a product kernel. We show that the univariate optimal convergence rate, up to a logarithmic factor, in L2 and L∞ risks can be achieved by controlling the size of the RKHS. The result of a numerical study confirms our theoretical findings.
Volume 236, Article 106228. Citations: 0