
Journal of Statistical Planning and Inference: Latest Articles

Maximum Projection Gini Correlation (MaGiC) for mixed categorical and numerical data
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-04-24 · DOI: 10.1016/j.jspi.2025.106294
Hong Xiao, Radhakrishna Adhikari, Yixin Chen, Xin Dang
We propose a projection correlation to measure the dependence between a multivariate numerical variable and a categorical variable. The projection correlation, defined as the maximum of the Gini correlations (hence MaGiC) between the categorical variable and the univariate projections of the multivariate vector, is nonparametric and, intuitively, produces a high coefficient when the two variables are dependent and zero when they are independent. We show that MaGiC is nested: it is non-decreasing as features are added to the numerical vector, and it remains unchanged if the additional numerical features are independent of the categorical variable and the original features. We establish $\sqrt{n}$-consistency of the sample projection correlation. A powerful $K$-sample test can be carried out via the MaGiC-based independence test. Compared with related correlation definitions for multivariate variables, MaGiC also admits a faster implementation, with computational complexity $O(mn(d + \log n))$, where $d$ is the dimension of the numerical variable, $n$ is the sample size, and $m$ is the number of projections performed, as opposed to $O(dn^2)$ for the Gini correlation. We demonstrate these properties through simulation and application to real datasets.
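The abstract's "maximum over univariate projections" idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the univariate categorical Gini correlation has the form (Δ − Σ_k p_k Δ_k)/Δ, where Δ is the Gini mean difference of the projected variable and Δ_k is that of category k, and it approximates the maximum by trying `m` random unit directions. The function names and the random-direction search are hypothetical choices made for this sketch.

```python
import numpy as np

def gini_mean_difference(x):
    """Sample Gini mean difference (average |x_i - x_j| over pairs), via sorting."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if n < 2:
        return 0.0
    # Order-statistic form: 2 / (n (n - 1)) * sum_i (2i - n - 1) x_(i)
    weights = 2.0 * np.arange(1, n + 1) - n - 1.0
    return 2.0 * np.dot(weights, x) / (n * (n - 1))

def gini_correlation(x, y):
    """Assumed form of the categorical Gini correlation for a univariate x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    total = gini_mean_difference(x)
    if total == 0.0:
        return 0.0
    within = 0.0
    for label in np.unique(y):
        group = x[y == label]
        within += (group.size / x.size) * gini_mean_difference(group)
    return (total - within) / total

def magic_random_projections(X, y, m=200, seed=0):
    """Approximate the maximum projection Gini correlation with m random directions."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    directions = rng.standard_normal((m, X.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    projections = X @ directions.T  # n x m matrix, costs O(mnd)
    # Each column is sorted inside gini_correlation, O(n log n) per projection
    # when the number of categories is small.
    return max(gini_correlation(projections[:, j], y) for j in range(m))

# Hypothetical usage: two groups that differ only in the first coordinate.
rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 50)
features = rng.standard_normal((100, 3))
features[:, 0] += 3.0 * labels
score = magic_random_projections(features, labels, m=100)
```

Projecting costs O(mnd) and sorting each projection costs O(n log n), which is consistent with the O(mn(d + log n)) complexity quoted in the abstract. A random search only gives a lower bound on the true maximum; the paper may use a different optimization strategy.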
Volume 239, Article 106294 · Citations: 0
M-procedures robust to structural changes detection under strong mixing heavy-tailed time series models
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-04-24 · DOI: 10.1016/j.jspi.2025.106295
Hao Jin, Jiating Hu, Ling Zhu, Shiyu Tian, Si Zhang
Many change-point tests rely on least squares estimation, which can be biased when the observations come from heavy-tailed processes. The aim of this paper is to construct a ratio-type test based on M-estimation, which avoids the long-range variance estimation and is robust for detecting structural change in strong mixing series with heavy tails. The proposed test consisting of M-procedures is more widely applicable in that it allows processes in the domain of attraction of a stable law with index $\kappa \in (0,2)$, not only $(1,2)$. Under some regularity conditions, the asymptotic distribution under the null hypothesis of no change is a functional of a Brownian motion, and the divergence rate under the alternative hypothesis is also provided. Furthermore, the convergence rate of a ratio-type change-point estimator is established. A simulation study shows no distortion in empirical sizes and satisfactory empirical power. Finally, two applications to real examples are presented.
Volume 239, Article 106295 · Citations: 0
Pursuing sparsity and homogeneity for multi-source high-dimensional current status data
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-04-23 · DOI: 10.1016/j.jspi.2025.106293
Xin Ye, Yanyan Liu
Nowadays, current status data with high-dimensional predictors are prevalent in observational studies. However, for a single study, the high dimensionality and the presence of censoring pose substantial challenges to statistical analysis with limited sample size. Although integrative analysis has been widely regarded as an effective strategy to improve the estimation, the source-level heterogeneity has to be carefully addressed. In this paper, we propose an integrative analysis method for multi-source high-dimensional current status data, which can simultaneously identify the homogeneity/heterogeneity structure and select important variables. We prove that the proposed approach attains consistency in estimation, sparsity recovery, and the pursuit of homogeneity. Extensive simulation studies have been carried out to assess the finite sample performance of the proposed method. A real data analysis of multi-source ovarian cancer recurrence studies further demonstrates its practical applicability.
Volume 239, Article 106293 · Citations: 0
Neighborhood VAR: Efficient estimation of multivariate timeseries with neighborhood information
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-31 · DOI: 10.1016/j.jspi.2025.106277
Zhihao Hu, Shyam Ranganathan, Yang Shao, Xinwei Deng
Vector autoregression (VAR) models are popular in modeling multivariate time series in data sciences and other areas. When the number of time series is large, the number of parameters in the VAR model increases dramatically, posing great challenges for proper model estimation and inference. In this work, we propose a so-called neighborhood vector autoregression (NVAR) model to efficiently analyze large-dimensional multivariate time series. We assume that the time series have underlying neighborhood relationships, e.g., spatial or network, among them based on the inherent setting of the problem. When this neighborhood information is available or can be summarized using a distance matrix, we demonstrate that our proposed NVAR method provides a computationally efficient and theoretically sound estimation of model parameters. The performance of the proposed method is compared with other existing approaches in both simulation studies and a real-data application in environmental science.
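To make the parameter growth concrete: an unrestricted VAR(p) on k series has k × k × p autoregressive coefficients (ignoring intercepts and the error covariance). The sketch below counts them and shows one way a distance matrix could restrict which coefficients are free. The thresholding rule, the `radius` parameter, and the function names are assumptions made for illustration; the abstract does not say how NVAR actually uses the neighborhood information.

```python
import numpy as np

def var_coefficient_count(k, p):
    """Autoregressive coefficients in an unrestricted VAR(p) with k series."""
    return k * k * p

def neighborhood_mask(distance, radius):
    """Boolean k x k mask: True where series j lies within `radius` of series i."""
    distance = np.asarray(distance, dtype=float)
    return distance <= radius

# Hypothetical example: 5 sites on a line with unit spacing, lag order p = 2.
positions = np.arange(5.0)
distance = np.abs(positions[:, None] - positions[None, :])
mask = neighborhood_mask(distance, radius=1.0)

full = var_coefficient_count(5, p=2)   # every series may depend on every other
restricted = int(mask.sum()) * 2       # only self and immediate neighbors, per lag
```

With these toy numbers the mask is tridiagonal, so the restricted model keeps 13 coefficients per lag instead of 25. The gap widens quickly as k grows, since the full count is quadratic in k while a fixed-radius neighborhood count is roughly linear.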
Volume 239, Article 106277 · Citations: 0
Inference on linear quantile regression with dyadic data
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-26 · DOI: 10.1016/j.jspi.2025.106292
Hongqi Chen
This paper focuses on developing a robust inference procedure for the linear quantile regression estimator in the context of dyadic data structures. We investigate the asymptotic distribution of the quantile regression estimator under dependency structures arising from shared nodes in both undirected and directed networks. We establish consistency results for the covariance matrix estimator and provide asymptotic distributions for the associated t-statistic and Wald statistic, particularly in both univariate and joint hypothesis testing scenarios. To showcase the effectiveness of our proposed method, we present numerical simulations and an empirical application using international trade data. Our results demonstrate the excellent performance of the robust t-statistic and Wald statistic in quantile regression inference with dyadic data.
Volume 239, Article 106292 · Citations: 0
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-19 · DOI: 10.1016/j.jspi.2025.106291
Michael Kohler, Adam Krzyżak, Benjamin Walter
Image classification based on over-parametrized convolutional neural networks with a global average-pooling layer is considered. The weights of the network are learned by gradient descent. A bound on the rate of convergence of the difference between the misclassification risk of the newly introduced convolutional neural network estimate and the minimal possible value is derived.
Volume 239, Article 106291 · Citations: 0
On misspecification in cusp-type change-point models
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-13 · DOI: 10.1016/j.jspi.2025.106290
O.V. Chernoyarov, S. Dachian, Yu.A. Kutoyants
The problem of parameter estimation from i.i.d. observations of an inhomogeneous Poisson process is considered under misspecification. The model is that of a Poissonian signal observed in the presence of homogeneous Poissonian noise. The intensity function of the process is supposed to have a cusp-type singularity at the change-point (the unknown moment of arrival of the signal), while the supposed (theoretical) and the real (observed) levels of the signal are different. The asymptotic properties of the (pseudo) MLE are described. It is shown that the estimator converges to the value minimizing the Kullback–Leibler divergence, that the normalized error of estimation converges to some limit distribution, and that its polynomial moments also converge.
Volume 239, Article 106290 · Citations: 0
Estimation and testing for varying-coefficient single-index quantile regression models
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-11 · DOI: 10.1016/j.jspi.2025.106289
Hui Ding, Mei Yao, Riquan Zhang, Zhenglong Zhang, Hanbing Zhu
In this paper we propose varying-coefficient single-index quantile regression models, which include most existing quantile regression models as special cases. We adopt a B-spline basis approximation to estimate the nonparametric components and use the “delete-one-component” method to construct the check loss function. Under some mild conditions, we establish the asymptotic theory of the proposed estimators for both the parametric and nonparametric components. Moreover, we propose a rank-score-based test to examine whether the varying-coefficient functions are constant. The finite-sample performance of the proposed estimation method is illustrated by simulation studies and an empirical analysis of two real datasets.
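For readers unfamiliar with the check loss mentioned above, here is a minimal sketch of the standard quantile check function ρ_τ(u) = u(τ − 1{u < 0}). It is not the paper's estimator (no B-splines, single index, or varying coefficients); it only shows that minimizing the summed check loss over a constant recovers a sample quantile. The helper `quantile_by_check_loss` is a hypothetical name introduced for this example.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check function: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def quantile_by_check_loss(data, tau):
    """Return the data point minimizing the total check loss (brute force, O(n^2))."""
    data = np.asarray(data, dtype=float)
    # A minimizer of the summed check loss is always attained at an observation.
    losses = [check_loss(data - c, tau).sum() for c in data]
    return float(data[int(np.argmin(losses))])

# Hypothetical usage: the tau = 0.5 minimizer is the sample median.
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
median = quantile_by_check_loss(sample, tau=0.5)
```

The asymmetry of the loss is what targets a quantile rather than the mean: with τ = 0.25, a positive residual of 2 costs 0.5 while a negative residual of −2 costs 1.5.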
Volume 239, Article 106289 · Citations: 0
Fixed-budget optimal designs for multi-fidelity computer experiments
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-04 · DOI: 10.1016/j.jspi.2025.106286
Gecheng Chen, Rui Tuo
This work focuses on the design of multi-fidelity computer experiments. We consider the autoregressive Gaussian process model proposed by Kennedy and O’Hagan (2000) and the optimal nested design that maximizes the prediction accuracy subject to a budget constraint. An approximate solution is identified through the idea of multi-level approximation and recent error bounds for Gaussian process regression. The proposed (approximately) optimal designs admit a simple analytical form. We prove that, to achieve the same prediction accuracy, the proposed optimal multi-fidelity design requires a much lower computational cost than any single-fidelity design in the asymptotic sense. Numerical studies confirm this theoretical assertion.
Volume 239, Article 106286 · Citations: 0
Nonparametric regression with predictors missing at random and the scale depending on auxiliary covariates
IF 0.8 · Zone 4 (Mathematics) · Q3 Statistics & Probability · Pub Date: 2025-03-01 · DOI: 10.1016/j.jspi.2025.106278
Tian Jiang
Nonparametric regression with missing at random (MAR) predictors, univariate regression component of interest, and the scale function depending on both the predictor and auxiliary covariates, is considered. The asymptotic theory suggests that both heteroscedasticity and MAR mechanism affect the sharp constant of the minimax mean integrated squared error (MISE) convergence. We propose a data-driven procedure adaptive to the missing mechanism and unknown smoothness of the estimated regression function. The estimator preserves the optimal convergence rate and can achieve sharp minimaxity when predictors are missing completely at random (MCAR).
Volume 239, Article 106278 · Citations: 0