
Journal of Statistical Planning and Inference: Latest Articles

Nonparametric estimation of the quantiles of the conditional residual lifetime distribution
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2026-01-06, DOI: 10.1016/j.jspi.2026.106373
Steven Abrams, Paul Janssen, Noël Veraverbeke
In medical research, interest often lies in studying either the association between an event time T1 and a continuous covariate T2, or the association between two event times T1 and T2, where event times are typically subject to (right) censoring. Although the strength of dependence between such random variables can be expressed in terms of global and local association measures, it is also of interest to study alternative quantities such as percentiles of the residual lifetime distribution of T1, conditional on T2 taking values in a given interval. In this paper, we extend existing methods for estimating quantiles of the conditional residual lifetime distribution to allow a more flexible classification of subjects into subgroups based on their respective T2-values. More specifically, we propose two estimators, under one-component censoring and under univariate censoring respectively, and provide a detailed study of their finite-sample performance. We demonstrate the use of these estimators on two medical datasets: (1) monoclonal gammopathy of undetermined significance, and (2) overall mortality in Danish twin members.
Volume 243, Article 106373. Citations: 0
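To make the target quantity in this abstract concrete, here is a minimal sketch of an empirical p-quantile of the residual lifetime T1 - t among subjects with T1 > t and T2 in [a, b]. It assumes fully observed (uncensored) data, which is a simplification the paper's estimators do not need; the function name and the toy data are hypothetical.

```python
import math

def residual_lifetime_quantile(pairs, t, a, b, p):
    """Empirical p-quantile of T1 - t given T1 > t and a <= T2 <= b.

    Assumes no censoring (a simplification; the paper handles right censoring).
    """
    residuals = sorted(t1 - t for t1, t2 in pairs if t1 > t and a <= t2 <= b)
    if not residuals:
        raise ValueError("no subjects at risk in the requested subgroup")
    # Smallest order statistic whose empirical CDF value is at least p.
    k = max(1, math.ceil(p * len(residuals)))
    return residuals[k - 1]

# Hypothetical (T1, T2) pairs.
pairs = [(5.0, 1.0), (8.0, 2.0), (12.0, 2.5), (3.0, 2.2), (20.0, 9.0)]
median_residual = residual_lifetime_quantile(pairs, t=4.0, a=0.0, b=3.0, p=0.5)
```

With censored data, the empirical distribution used here would have to be replaced by a censoring-aware estimator, which is where the paper's contribution lies.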
Robust estimation with Latin Hypercube Sampling: A Central Limit Theorem for Z-estimators
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2026-01-06, DOI: 10.1016/j.jspi.2026.106374
Faouzi Hakimi
Latin Hypercube Sampling (LHS) is a widely used stratified sampling method in computer experiments. In this work, we extend existing convergence results for the sample mean under LHS to the broader class of Z-estimators — estimators defined as the zeros of a sample mean function. We derive the asymptotic variance of these estimators and demonstrate that it is smaller when using LHS compared to traditional independent and identically distributed sampling. Furthermore, we establish a Central Limit Theorem for Z-estimators under LHS, providing a theoretical foundation for its improved efficiency.
Volume 243, Article 106374. Citations: 0
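The stratification behind Latin Hypercube Sampling can be sketched in a few lines. The example below is an illustration under stated assumptions, not the paper's construction: it draws one point per equal-width stratum in every coordinate and then evaluates the simplest Z-estimator, the sample mean, which is the zero of the function mapping theta to the sum of g(x_i) - theta. The function names are hypothetical.

```python
import random

def latin_hypercube(n, d, rng):
    """Return n points in [0, 1)^d with one point per stratum in every dimension."""
    columns = []
    for _ in range(d):
        strata = list(range(n))
        rng.shuffle(strata)
        # Place one uniform draw inside each of the n equal-width strata.
        columns.append([(s + rng.random()) / n for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n)]

def z_estimate_mean(points, g):
    """Z-estimator solving sum(g(x) - theta) = 0, i.e. the sample mean of g."""
    values = [g(x) for x in points]
    return sum(values) / len(values)

rng = random.Random(0)
sample = latin_hypercube(8, 2, rng)
theta_hat = z_estimate_mean(sample, lambda x: x[0] + x[1])
```

For an additive integrand such as x[0] + x[1], stratification pins each coordinate mean inside a narrow band around 0.5, so theta_hat cannot stray far from the true value 1; an i.i.d. sample of the same size carries no such guarantee.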
A class of mixed-level amplified designs and their space-filling properties
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-12-19, DOI: 10.1016/j.jspi.2025.106372
Zuohang Kang, Zujun Ou
With the increasing complexity of experimental scenarios, large mixed-level designs are urgently needed. A class of mixed-level designs is constructed through amplification, which enlarges both the run size and the number of factors of the initial design. The space-filling properties of amplified designs are discussed under the generalized minimum aberration criterion, the wordlength enumerator, and the maximin L2-distance criterion, and an attainable upper bound on the maximin L2-distance and a lower bound on the wordlength enumerator for amplified designs are obtained. Numerical examples demonstrate that the construction method for amplified designs is simple and effective, and it is recommended for high-dimensional statistical problems and large-scale experiments.
Volume 243, Article 106372. Citations: 0
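The maximin L2-distance criterion mentioned in this abstract ranks designs by their smallest pairwise Euclidean distance between runs: the larger that minimum, the better the runs fill the space. A minimal sketch follows, with hypothetical function names and two toy 4-run candidates on a 3-level, 2-factor grid; it does not implement the amplification construction itself.

```python
import itertools
import math

def min_l2_distance(design):
    """Smallest pairwise Euclidean distance between runs of a design."""
    return min(math.dist(u, v) for u, v in itertools.combinations(design, 2))

def best_by_maximin(designs):
    """Pick the candidate whose minimum pairwise distance is largest."""
    return max(designs, key=min_l2_distance)

clustered = [(0, 0), (0, 1), (1, 0), (1, 1)]  # runs packed into one corner
spread = [(0, 0), (0, 2), (2, 0), (2, 2)]     # runs pushed to the grid corners
chosen = best_by_maximin([clustered, spread])
```

The brute-force pairwise scan costs O(n^2) distance evaluations, which is fine for small designs but is one reason analytical bounds such as those in the paper are useful for large ones.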
Homogeneity testing under finite mixtures of multivariate Poisson distributions
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-11-27, DOI: 10.1016/j.jspi.2025.106369
Guanfu Liu, Yuejiao Fu
Finite mixtures of multivariate Poisson (FMMP) distributions have wide applications in the real world. Testing for homogeneity under FMMP models is important; however, as far as we know, there is no generic solution to this problem. In this paper, we propose an EM-test for homogeneity under FMMP models to fill this gap. We establish the strong consistency of the maximum likelihood estimator of the mixing distribution while relaxing two conditions required in the existing literature. The null limiting distribution of the proposed test is studied and, based on this limiting distribution, a resampling procedure is constructed to approximate the p-value of the test. The loss of strong identifiability for the multivariate Poisson distribution poses a significant challenge in deriving the null limiting distribution. Finally, simulation studies and a real-data analysis demonstrate the good performance of the proposed test.
Volume 243, Article 106369. Citations: 0
On deriving Liouville process from Liouville distribution and its application in nonparametric Bayesian inference
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-11-26, DOI: 10.1016/j.jspi.2025.106368
Sadegh Chegini, Mahmoud Zarepour
The Liouville distribution, a generalization of the Dirichlet distribution, serves as a well-known conjugate prior for the multinomial distribution. Just as the Dirichlet process is derived from the finite-dimensional Dirichlet distribution, it is natural and important to introduce and derive a Liouville process in a similar manner. We introduce a discrete random probability measure constructed from a random vector following a Liouville distribution and subsequently derive its weak limit to define our proposed Liouville process. The resulting process is a spike-and-slab process, where the Dirichlet process serves as the slab and a single point from its mean acts as the spike. These two components are linearly combined using a random weight generated from the Liouville distribution. By using the Liouville process as a prior on the space of probability measures, we derive the corresponding posterior process as well as the predictive distribution.
Volume 243, Article 106368. Citations: 0
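The spike-and-slab structure described in this abstract can be pictured with a rough sketch: a truncated stick-breaking draw from a Dirichlet process plays the slab, a single draw from the base measure plays the spike, and a random weight mixes the two. The mixing weight below is a Beta(2, 2) placeholder chosen purely for illustration; it is an assumption and not the Liouville-derived weight of the paper. All function names and parameter values are hypothetical.

```python
import random

def stick_breaking_dp(alpha, base_sampler, truncation, rng):
    """Truncated stick-breaking draw from a Dirichlet process DP(alpha, base)."""
    atoms, weights, remaining = [], [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)
        atoms.append(base_sampler(rng))
        weights.append(remaining * v)
        remaining *= 1.0 - v
    # Assign the leftover stick to a final atom so the weights sum to one.
    atoms.append(base_sampler(rng))
    weights.append(remaining)
    return atoms, weights

def spike_and_slab_measure(alpha, base_sampler, truncation, rng):
    """Mix a DP draw (slab) with a point mass at one base-measure draw (spike)."""
    atoms, weights = stick_breaking_dp(alpha, base_sampler, truncation, rng)
    w = rng.betavariate(2.0, 2.0)  # placeholder weight, NOT the Liouville-derived law
    spike = base_sampler(rng)
    return atoms + [spike], [w * q for q in weights] + [1.0 - w]

rng = random.Random(1)
atoms, probs = spike_and_slab_measure(2.0, lambda r: r.gauss(0.0, 1.0), 50, rng)
```

The result is a discrete random probability measure: `atoms` lists its support points and `probs` the matching masses, with the last entry carrying the spike.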
Semiparametric tests for Lorenz dominance based on density ratio model
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-11-12, DOI: 10.1016/j.jspi.2025.106361
Weiwei Zhuang, Weiqi Yang, Wenchen Liao, Yukun Liu
Lorenz dominance is a fundamental tool for assessing whether wealth or income disparity is greater in one population than another. Based on the well-established density ratio model, we propose a new semiparametric test for Lorenz dominance. We show that the limiting distribution of the proposed test statistic is the supremum of a Gaussian process. To facilitate practical application, we devise a bootstrap procedure to calculate the p-value and establish its theoretical validity. Our simulation studies demonstrate that the proposed test correctly controls the Type I error and outperforms its competitors in terms of statistical power. Finally, we apply the test to compare salary distributions among higher education employees in Ohio from 2011 to 2015.
Volume 242, Article 106361. Citations: 0
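For readers unfamiliar with Lorenz dominance, the following sketch computes empirical Lorenz curves and compares them pointwise on a grid. It adopts the convention that sample a dominates sample b when the curve of a lies on or above that of b everywhere, meaning a is the more equal distribution; conventions differ across the literature, so this is an assumption. The comparison is purely descriptive and is not the paper's semiparametric test, which accounts for sampling variability. Function names and data are hypothetical.

```python
def lorenz(values, p):
    """Empirical Lorenz curve L(p): income share held by the poorest fraction p."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    pos = p * n
    k = min(int(pos), n)
    cum = sum(xs[:k])
    if k < n:
        cum += (pos - k) * xs[k]  # linear interpolation within the next observation
    return cum / total

def lorenz_dominates(a, b, grid=100, tol=1e-12):
    """True if L_a(p) >= L_b(p) on an evenly spaced grid of p in [0, 1]."""
    return all(lorenz(a, i / grid) >= lorenz(b, i / grid) - tol for i in range(grid + 1))

equal_pay = [2, 2, 2, 2]
skewed_pay = [1, 1, 1, 5]
result = lorenz_dominates(equal_pay, skewed_pay)
```

In the skewed sample the poorest half holds a quarter of total pay, against exactly half in the equal sample, so the equal sample's curve lies above throughout.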
Self-weighted estimation for nonstationary processes with infinite variance GARCH errors
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-11-08, DOI: 10.1016/j.jspi.2025.106360
Yuze Yuan, Shuyu Liu, Rongmao Zhang
Zhang and Chan (2021) considered the augmented Dickey–Fuller (ADF) test for a unit root process with linear noise driven by generalized autoregressive conditional heteroskedasticity (GARCH), and showed that the ADF test may perform even worse than the Dickey–Fuller test. The main reason is that, for infinite-variance GARCH noise, the parameters of the lag terms in the ADF regression cannot be estimated consistently by least squares estimation (LSE). In this paper, we propose a self-weighted least squares estimation (SWLSE) procedure to solve this problem, together with a new SWLSE-based unit root test. It is shown that the SWLSE is consistent, and that the proposed test statistic converges to a functional of a stable process and a Brownian motion and performs well in terms of size and power. A simulation study is conducted to evaluate the performance of our procedure, and a real-world illustrative example is provided.
Volume 242, Article 106360. Citations: 0
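The general idea of self-weighting is to downweight observations whose regressors are extreme, so that heavy tails do not dominate the least squares sums. The sketch below shows this for a single lagged regressor. It is a generic illustration only: the weight function is a hypothetical choice and is not the weighting scheme of the paper, which targets the lag terms of the ADF regression.

```python
def weighted_ar1_coefficient(y, weight):
    """Weighted least-squares slope in the regression of y[t] on y[t-1].

    `weight` maps the lagged value y[t-1] to a positive weight; a weight
    that shrinks for large |y[t-1]| downweights extreme observations.
    """
    num = sum(weight(y[t - 1]) * y[t - 1] * y[t] for t in range(1, len(y)))
    den = sum(weight(y[t - 1]) * y[t - 1] ** 2 for t in range(1, len(y)))
    return num / den

series = [1.0, 2.0, 4.0, 8.0]  # toy series with y[t] = 2 * y[t-1] exactly
ols = weighted_ar1_coefficient(series, lambda x: 1.0)
swls = weighted_ar1_coefficient(series, lambda x: 1.0 / (1.0 + abs(x)) ** 2)
```

On this noise-free toy series both estimators recover the slope 2; the two differ only when noise is present, where the weighted version limits the influence of large lagged values.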
Mixed latent graphical models with mixed measurement error and misclassification in variables
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-11-01, DOI: 10.1016/j.jspi.2025.106359
Yu Shi, Grace Y. Yi
Graphical models are powerful tools for characterizing conditional dependence structures among variables with complex relationships. Although many methods have been developed under the graphical modeling framework, their validity often hinges on the quality of the data. A fundamental assumption in most existing approaches is that all variables are measured precisely, an assumption frequently violated in practice. In many applications, mismeasurement of mixed discrete and continuous variables is a common challenge. In this paper, we address error-contaminated data involving both continuous and discrete variables by proposing a mixed latent Gaussian copula graphical measurement error model. To perform inference, we develop a simulation-based expectation–maximization procedure that explicitly accounts for mismeasurement effects. We further introduce a computationally efficient refinement to reduce the computational burden. Asymptotic properties of the proposed estimator are established, and its finite-sample performance is evaluated through numerical studies.
Volume 242, Article 106359. Citations: 0
General sliced minimum aberration designs for multi-platform experiments
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-10-30, DOI: 10.1016/j.jspi.2025.106357
Yuliang Zhou, Qianqian Zhao, Shengli Zhao
Sliced designs are widely used in multi-platform experiments. A sliced design contains several sub-designs, separated by the levels of the sliced factor, and each sub-design is assigned to one platform. In some experimental scenarios, it is necessary to consider the optimality of both the sub-designs and the complete sliced design; such sliced designs are referred to as general sliced (GS) designs. To construct optimal GS designs for such scenarios, we propose the general sliced effect hierarchy principle (GSEHP). Based on the GSEHP, we introduce the general sliced minimum aberration (GSMA) criterion and choose GSMA designs as the optimal GS designs when the sliced factor and the design factors are equally important. Some GSMA designs with 32 and 64 runs are tabulated. Additionally, we present a practical example to illustrate the application of GSMA designs in guiding webpage-setting strategies on two platforms.
Volume 242, Article 106357. Citations: 0
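The relationship between a sliced design and its sub-designs can be shown with a small sketch: the runs are grouped by the level of the sliced factor, and each group, with that factor dropped, is the sub-design for one platform. The example uses a full 2^3 factorial for illustration only; it does not implement the GSMA criterion, and the function name is hypothetical.

```python
import itertools
from collections import defaultdict

def slice_design(design, slice_column):
    """Split a design into sub-designs according to the level of the sliced factor."""
    slices = defaultdict(list)
    for run in design:
        rest = run[:slice_column] + run[slice_column + 1:]
        slices[run[slice_column]].append(rest)
    return dict(slices)

# Full 2^3 factorial; column 0 plays the role of the sliced (platform) factor.
full = list(itertools.product((0, 1), repeat=3))
sub_designs = slice_design(full, slice_column=0)
```

Here each of the two platforms receives a full 2^2 factorial in the remaining factors. With fractional designs the sub-designs can differ in quality, which is why a criterion covering both levels is needed.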
Robust and consistent model evaluation criteria in high-dimensional regression
IF 0.8, Zone 4 (Mathematics), Q3 Statistics & Probability, Pub Date: 2025-10-28, DOI: 10.1016/j.jspi.2025.106358
Sumito Kurata, Kei Hirose
Most regularization methods, such as the LASSO, have one or more regularization parameters, and selecting the value of the regularization parameter is essentially equivalent to selecting a model. Thus, to obtain a model suitable for the data and phenomenon, we need to determine an adequate value of the regularization parameter. To determine the regularization parameter in the linear regression model, information criteria such as the AIC and BIC are often applied; however, it has been pointed out that these criteria are sensitive to outliers and tend not to perform well in high-dimensional settings. Outliers generally have a negative effect not only on estimation but also on model selection; consequently, it is important to employ a selection method that is robust against outliers. In addition, when the number of explanatory variables is quite large, most conventional criteria are prone to selecting unnecessary explanatory variables. In this paper, we propose model evaluation criteria based on a statistical divergence that are robust in both parametric estimation and model selection, obtained by applying the quasi-Bayesian procedure. Owing to a precise approximation, our proposed criteria achieve selection consistency even in high-dimensional settings while remaining robust. We also investigate the conditions for establishing robustness and consistency, and provide an example of a divergence and penalty term that achieve the desired properties. Finally, we report the results of numerical examples verifying that the proposed criteria perform robust and consistent variable selection compared with conventional selection methods.
大多数正则化方法(如LASSO)都有一个(或多个)正则化参数,选择正则化参数的值本质上等于选择一个模型。因此,为了获得适合于数据和现象的模型,我们需要确定一个适当的正则化参数值。对于线性回归模型中正则化参数的确定,我们通常采用AIC和BIC等信息准则,但已有研究指出,这些准则对异常值敏感,在高维环境下往往表现不佳。异常值不仅对估计有负面影响,而且对模型选择也有负面影响,因此,采用对异常值具有鲁棒性的选择方法非常重要。此外,当解释变量的数量相当大时,大多数常规标准容易选择不必要的解释变量。本文应用拟贝叶斯过程,提出了基于统计散度的模型评价准则,该准则在参数估计和模型选择上都具有较好的鲁棒性。我们提出的标准即使在高维环境下,由于精确的近似,也能实现选择一致性,同时具有鲁棒性。我们还研究了建立鲁棒性和一致性的条件,并提供了一个适当的散度和惩罚项的例子,可以达到期望的性质。最后给出了一些数值算例,验证了所提出的准则与传统的选择方法相比具有鲁棒性和一致性。
Sumito Kurata, Kei Hirose. Robust and consistent model evaluation criteria in high-dimensional regression. Journal of Statistical Planning and Inference, vol. 242, Article 106358. Published 2025-10-28. DOI: 10.1016/j.jspi.2025.106358