Pub Date: 2025-01-01, Epub Date: 2025-08-15, DOI: 10.1007/s11749-025-00983-9
Annika Betken, Giorgio Micali, Johannes Schmidt-Hieber
The ordinal patterns of a fixed number of consecutive values in a time series are the spatial ordering of these values. Counting how often a specific ordinal pattern occurs in a time series provides important insights into the properties of the time series. In this work, we prove the asymptotic normality of the relative frequency of ordinal patterns for time series with linear increments. Moreover, we apply ordinal patterns to detect changes in the distribution of a time series.
Title: Ordinal pattern-based change point detection. Journal: Test, vol. 34, no. 4, pp. 927-980. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12634733/pdf/
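To make the notion of an ordinal pattern in the entry above concrete, here is a minimal Python sketch that counts relative pattern frequencies in a sliding window. The function names and the encoding (time indices sorted by value, ties broken by time order) are assumptions chosen for illustration; the paper may use a different but equivalent convention, and nothing here implements its change point test.

```python
from collections import Counter
from itertools import permutations


def ordinal_pattern(window):
    """Encode a window by the time indices of its values, sorted by value.

    Ties are broken by time order because Python's sort is stable.
    """
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def pattern_frequencies(series, order=3):
    """Relative frequency of every ordinal pattern of length `order`."""
    n_windows = len(series) - order + 1
    if n_windows < 1:
        raise ValueError("series is shorter than the pattern length")
    counts = Counter(
        ordinal_pattern(series[t:t + order]) for t in range(n_windows)
    )
    # Report all order! patterns, including those that never occur.
    return {
        pattern: counts[pattern] / n_windows
        for pattern in permutations(range(order))
    }


freqs = pattern_frequencies([1.0, 3.0, 2.0, 4.0], order=3)
print(freqs[(0, 2, 1)], freqs[(1, 0, 2)])  # prints 0.5 0.5
```

A change in the distribution of a series would show up as a shift in these frequencies between segments; how to turn that into a formal test is the subject of the paper and is not sketched here.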
Pub Date: 2024-09-04, DOI: 10.1007/s11749-024-00941-x
Da Chen, Linlin Dai, Yichuan Zhao
The correlation coefficient is fundamental in advanced statistical analysis. However, traditional methods of calculating correlation coefficients can be biased due to the existence of confounding variables. Such confounding variables could act in an additive or multiplicative fashion. For the additive model, previous research has proposed residual-based estimation of correlation coefficients, and the powerful tool of empirical likelihood (EL) has been used to construct confidence intervals for the correlation coefficient. However, the methods so far only perform well when sample sizes are large. With small samples, the coverage probability of EL, for instance, can be below 90% at the 95% confidence level. On the basis of previous research, we propose new methods of interval estimation for the correlation coefficient using jackknife empirical likelihood, mean jackknife empirical likelihood and adjusted jackknife empirical likelihood. For better performance with small sample sizes, we also propose mean adjusted empirical likelihood. The simulation results show the best performance with mean adjusted jackknife empirical likelihood when the sample sizes are as small as 25. Real data analyses are used to illustrate the proposed approach.
Title: Jackknife empirical likelihood for the correlation coefficient with additive distortion measurement errors. Journal: Test.
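The jackknife step behind jackknife empirical likelihood in the entry above can be sketched as follows. This is a simplified illustration under stated assumptions: it uses the plain Pearson correlation on undistorted data, ignores the residual-based correction for additive distortion that the paper studies, and stops at the pseudo-values (empirical likelihood would then be applied to their mean). The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def jackknife_pseudo_values(x, y):
    """Pseudo-values n * r - (n - 1) * r_(-i) of the correlation coefficient."""
    n = len(x)
    full = pearson(x, y)
    pseudo = []
    for i in range(n):
        # Leave observation i out and recompute the statistic.
        loo = pearson(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
pv = jackknife_pseudo_values(x, y)
print(sum(pv) / len(pv))  # jackknife estimate of the correlation
```

The pseudo-values behave approximately like independent observations whose mean estimates the correlation, which is what makes a standard empirical likelihood for a mean applicable to them.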
Pub Date: 2024-08-31, DOI: 10.1007/s11749-024-00945-7
Dimitrios Bagkavos, Montserrat Guillen, Jens P. Nielsen
The present research provides two methodological advances, simulation evidence and a real data analysis, all contributing to the area of local linear survival function estimation and bandwidth selection. The first contribution is the development of a double smoothed local linear survival function estimator which admits an arbitrary number of covariates and the analytic establishment of its asymptotic properties. The second contribution is the efficient implementation of the estimator in practice. This is achieved by developing an automatic plug-in smoothing parameter selector which optimizes the estimator’s performance in all coordinate directions. The traditional problem of vectorization of higher-order derivatives, which leads to increasingly intractable matrix algebraic expressions, is addressed here by introducing an alternative vectorization that exploits the analytic relationships between the functionals involved. This yields simpler, tractable expressions that are efficient in terms of computing time and greatly facilitate the implementation of the rule in practice. The analytic study of the rule’s rate of convergence shows that, in contrast to the traditional cross-validation approach, the proposed bandwidth selector remains functional even for a large number of covariates. The benefits of all methodological advances are illustrated with the analysis of a motivating real-world dataset on credit risk.
Title: Nonparametric conditional survival function estimation and plug-in bandwidth selection with multiple covariates. Journal: Test.
Pub Date: 2024-08-26, DOI: 10.1007/s11749-024-00944-8
Tizheng Li, Yuping Wang
Conventional higher-order spatial autoregressive models assume that regression coefficients are constant over space, which is overly restrictive and unrealistic in applications. In this paper, we introduce a higher-order spatial autoregressive varying coefficient model in which regression coefficients are allowed to change smoothly over space, which enables us to simultaneously explore different types of spatial dependence and spatial heterogeneity of the regression relationship. We propose a semi-parametric generalized method of moments estimator for the proposed model and derive asymptotic properties of the resulting estimators. Moreover, we propose a testing method to detect spatial heterogeneity of the regression relationship. Simulation studies show that the proposed estimation and testing methods perform quite well in finite samples. Finally, the Boston house price data are analyzed to demonstrate the proposed model and its estimation and testing methods.
Title: Higher-order spatial autoregressive varying coefficient model: estimation and specification test. Journal: Test.
Pub Date: 2024-08-24, DOI: 10.1007/s11749-024-00946-6
Chengxin Wu, Nengxiang Ling, Philippe Vieu, Guoliang Fan
In this paper, we study composite quantile estimation for the partially functional linear regression model with randomly censored responses. Concretely, we adopt inverse probability weighting and estimate the weights from the survival function of the censoring variable using the Kaplan–Meier, Breslow and local Kaplan–Meier methods, respectively. Then, we construct the weighted composite quantile estimators for the slope function and the scalar parameters of the model. Furthermore, large sample properties, such as the convergence rates of the estimators for the slope function and scalar parameters as well as the asymptotic distribution of the estimators for the scalar parameters, are obtained under some mild conditions. In addition, we propose a computationally simple resampling technique to approximate the distribution of the parametric estimators of the model, and establish interval estimates for the scalar parameters. Finally, the finite sample performance of the model and the estimation method is illustrated by simulation studies and a real data analysis, which show that both the model and the estimation methods are effective.
Title: Composite quantile estimation in partially functional linear regression model with randomly censored responses. Journal: Test.
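The inverse probability weighting step in the entry above can be illustrated with a small sketch. It covers only the plain Kaplan–Meier variant (not the Breslow or local Kaplan–Meier versions), uses no covariates, and assumes there are no ties among the observed times. The function names and the weight form, delta_i / (n * G(Y_i-)), are assumptions made for illustration rather than the paper's exact definitions.

```python
def censoring_survival_left(times, events, t):
    """Kaplan-Meier estimate of G(t-) = P(C >= t) for the censoring time C.

    `events[i]` is 1 if the response was observed and 0 if it was censored,
    so censored observations play the role of "events" for G.
    Assumes no ties among `times`.
    """
    g = 1.0
    for s, d in zip(times, events):
        if d == 0 and s < t:
            at_risk = sum(1 for u in times if u >= s)
            g *= 1.0 - 1.0 / at_risk
    return g


def ipw_weights(times, events):
    """Weights delta_i / (n * G(Y_i-)); censored observations get weight 0."""
    n = len(times)
    return [
        d / (n * censoring_survival_left(times, events, t))
        for t, d in zip(times, events)
    ]


times = [1.0, 2.0, 3.0, 4.0]
events = [1, 0, 1, 1]  # the second response is censored
print(ipw_weights(times, events))
```

Uncensored observations that occur after a censoring time receive larger weights, which compensates for the mass lost to censoring; a weighted quantile loss would then be summed with these weights.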
Pub Date: 2024-08-16, DOI: 10.1007/s11749-024-00942-w
Panagiotis Papastamoulis, Fotios S. Milienos
Estimating model parameters of a general family of cure models is always a challenging task, mainly due to flatness and multimodality of the likelihood function. In this work, we propose a fully Bayesian approach in order to overcome these issues. Posterior inference is carried out by constructing a Metropolis-coupled Markov chain Monte Carlo (MCMC) sampler, which combines Gibbs sampling for the latent cure indicators and Metropolis–Hastings steps with Langevin diffusion dynamics for parameter updates. The main MCMC algorithm is embedded within a parallel tempering scheme by considering heated versions of the target posterior distribution. The simulation study demonstrates that the proposed algorithm freely explores the multimodal posterior distribution and produces robust point estimates, while outperforming maximum likelihood estimation via the Expectation–Maximization algorithm. A by-product of our Bayesian implementation is control of the False Discovery Rate when classifying items as cured or not. Finally, the proposed method is illustrated on a real dataset on recidivism for offenders released from prison; the event of interest is whether the offender was re-incarcerated after probation or not.
Title: Bayesian inference and cure rate modeling for event history data. Journal: Test.
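The parallel tempering idea in the entry above (running heated versions of the target and exchanging states between chains) can be sketched on a toy problem. Everything below is an assumption made for illustration: the bimodal one-dimensional target, the inverse temperatures `betas`, and the random-walk Metropolis update, which stands in for the Gibbs and Langevin-type updates described in the abstract.

```python
import math
import random


def log_target(x):
    """Log density (up to a constant) of an equal mixture of N(-3, 1) and N(3, 1)."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def swap_log_ratio(beta_i, beta_j, logp_i, logp_j):
    """Log acceptance ratio for exchanging the states of two tempered chains."""
    return (beta_i - beta_j) * (logp_j - logp_i)


def parallel_tempering(n_iter, betas=(1.0, 0.5, 0.1), step=1.0, seed=0):
    """Return the draws of the cold chain (beta = 1), which targets pi itself."""
    rng = random.Random(seed)
    states = [0.0 for _ in betas]
    cold_samples = []
    for _ in range(n_iter):
        # Within-chain update: random-walk Metropolis on pi(x) ** beta.
        for k, beta in enumerate(betas):
            proposal = states[k] + rng.gauss(0.0, step)
            log_alpha = beta * (log_target(proposal) - log_target(states[k]))
            if rng.random() < math.exp(min(0.0, log_alpha)):
                states[k] = proposal
        # Between-chain update: propose swapping one random adjacent pair.
        k = rng.randrange(len(betas) - 1)
        log_alpha = swap_log_ratio(
            betas[k], betas[k + 1],
            log_target(states[k]), log_target(states[k + 1]),
        )
        if rng.random() < math.exp(min(0.0, log_alpha)):
            states[k], states[k + 1] = states[k + 1], states[k]
        cold_samples.append(states[0])
    return cold_samples


samples = parallel_tempering(5000)
print(min(samples), max(samples))
```

The hotter chains see a flattened target and cross between modes more easily; accepted swaps pass those moves down to the cold chain, which is the mechanism that helps with multimodal posteriors.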
Pub Date: 2024-07-30, DOI: 10.1007/s11749-024-00939-5
Bin Du, Xiumin Liu, Junlong Zhao
Hypothesis testing for a mean vector is a classical problem in data analysis but has been highly underinvestigated in distributed frameworks where samples of size $n$ are located on $k$ local sites. This paper focuses on the one-sample mean test, proposing synthesized test statistics with a much lower communication cost than the centralized Hotelling $T^2$ test. For the homogeneous case, where data on different local sites are independent and identically distributed, the efficiency of our proposed test is comparable to that of the centralized one, and much better than that of the test constructed from the divide and conquer method. In addition, three heterogeneous cases are considered, where the distributions of the data on local sites can be different. Heterogeneous cases are much more challenging because the local sample means and covariance matrices may be inconsistent estimators. We construct communication-efficient testing procedures for heterogeneous cases, and the power of the proposed test statistics is comparable to that of the centralized one under some conditions. Simulation results verify the effectiveness of the proposed testing procedures.
Title: Extended Hotelling $T^2$ test in distributed frameworks. Journal: Test.
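For reference, the centralized one-sample Hotelling statistic that the entry above uses as its benchmark is $T^2 = n(\bar{x} - \mu_0)^\top S^{-1}(\bar{x} - \mu_0)$, where $S$ is the sample covariance matrix. The sketch below computes it in plain Python with a small linear solver; the helper names are hypothetical, and the code does not implement the communication-efficient distributed statistics proposed in the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def hotelling_t2(X, mu0):
    """One-sample Hotelling T^2 for rows of X against the mean vector mu0."""
    n, p = len(X), len(X[0])
    mean = [sum(row[j] for row in X) / n for j in range(p)]
    # Unbiased sample covariance matrix (divisor n - 1).
    S = [
        [sum((row[j] - mean[j]) * (row[k] - mean[k]) for row in X) / (n - 1)
         for k in range(p)]
        for j in range(p)
    ]
    diff = [mean[j] - mu0[j] for j in range(p)]
    w = solve(S, diff)  # S^{-1} (mean - mu0) without forming the inverse
    return n * sum(d * wi for d, wi in zip(diff, w))


X = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
print(hotelling_t2(X, [1.0, 0.0]))
```

Computing this statistic centrally requires all raw observations, or at least every site's mean and covariance matrix, in one place, which is the communication cost the distributed procedures aim to reduce.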
Pub Date: 2024-07-21, DOI: 10.1007/s11749-024-00940-y
Xing Li, Yujing Shao, Lei Wang
To balance the robustness of quantile regression and the effectiveness of expectile regression, we consider $L_p$-quantile regression models with large-scale data and develop a unified optimal subsampling method to downsize the data volume and reduce the computational burden. For low-dimensional $L_p$-quantile regression models, two optimal subsampling probabilities based on the A- and L-optimality criteria are first proposed. For the preconceived low-dimensional parameter in high-dimensional $L_p$-quantile regression models, a novel optimal subsampling decorrelated score function is proposed to mitigate the effect of nuisance parameter estimation, and then two optimal decorrelated score subsampling probabilities are provided. The asymptotic properties of the two optimal subsample estimators are established. The finite-sample performance of the proposed estimators is studied through simulations, and an application to the Beijing Air Quality dataset is also presented.
Title: Optimal subsampling for $L_p$-quantile regression via decorrelated score. Journal: Test.
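The loss behind $L_p$-quantiles in the entry above is commonly written as $\rho_{\tau,p}(u) = |\tau - \mathbf{1}\{u < 0\}|\,|u|^p$, which reduces to the quantile check loss for $p = 1$ and to the expectile loss for $p = 2$. The sketch below evaluates this loss and finds a sample $L_p$-quantile by a simple one-dimensional search. It is an illustration only: the function names are hypothetical, there are no covariates, and neither subsampling nor the decorrelated score is implemented.

```python
def lp_quantile_loss(u, tau, p):
    """Asymmetric power loss |tau - 1{u < 0}| * |u| ** p."""
    weight = (1.0 - tau) if u < 0 else tau
    return weight * abs(u) ** p


def sample_lp_quantile(data, tau, p, n_steps=200):
    """Minimize the empirical loss over [min(data), max(data)] by ternary search.

    The empirical loss is convex in the location for p >= 1, which is the
    property the search relies on.
    """
    def risk(theta):
        return sum(lp_quantile_loss(x - theta, tau, p) for x in data)

    lo, hi = min(data), max(data)
    for _ in range(n_steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if risk(m1) < risk(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


data = [1.0, 2.0, 3.0, 10.0]
# With p = 2 and tau = 0.5 the L_p-quantile is the sample mean (here 4.0).
print(sample_lp_quantile(data, tau=0.5, p=2))
# A value of p between 1 and 2 interpolates between quantile and expectile.
print(sample_lp_quantile(data, tau=0.5, p=1.5))
```

Choosing $p$ between 1 and 2 trades the outlier resistance of quantiles against the smoothness and efficiency of expectiles, which is the balance the abstract refers to.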
Pub Date: 2024-07-12, DOI: 10.1007/s11749-024-00935-9
Li Cai, Lei Jin, Jiuzhou Miao, Suojin Wang
Single-index models are important and popular semiparametric models, as they can handle the problem of the “curse of dimensionality” and enjoy the flexibility of nonparametric modeling and the interpretability of parametric modeling. Most existing methods for single-index models are sensitive to outliers or heavy-tailed distributions because they use the least squares criterion. An oracle-efficient M-estimator is proposed for single-index models, and a smooth simultaneous confidence band is constructed by treating the index coefficients as nuisance parameters. Under general assumptions it is shown that the M-estimator for the nonparametric link function, based on any $\sqrt{n}$-consistent coefficient index parameter estimators, is oracle-efficient. This means that it is uniformly as efficient as the infeasible one obtained by M-regression using the true single-index coefficient parameters. As a result, the asymptotic distribution of the maximal deviation between the M-type kernel estimator and the true link function is derived, and an asymptotically accurate simultaneous confidence band is established as a global inference tool for the link function. The proposed method generalizes the desirable uniform convergence property of ordinary least squares to the M-estimation. Meanwhile, it is a general approach that allows any $\sqrt{n}$-consistent coefficient parameter estimators to be applied in the procedure to make global inferences for the link function. Simulation studies with commonly encountered sample sizes are reported to support the theoretical findings. These numerical results show certain desirable robustness properties against heavy-tailed errors and outliers. As an illustration, the proposed method is applied to the analysis of a car purchasing dataset.
Title: Oracle-efficient M-estimation for single-index models with a smooth simultaneous confidence band. Journal: Test.
Pub Date: 2024-07-01, DOI: 10.1007/s11749-024-00933-x
Šárka Hudecová, Marie Hušková, Simos G. Meintanis
We propose a goodness-of-fit test for a class of count time series models with covariates which includes the Poisson autoregressive model with covariates (PARX) as a special case. The test criteria are derived from a specific characterization of the conditional probability generating function, and the test statistic is formulated as a weighted $L_2$ norm of the corresponding sample counterpart. The asymptotic properties of the proposed test statistic are provided under the null hypothesis as well as under specific alternatives. A bootstrap version of the test is explored in a Monte Carlo study and illustrated on a real data set on road safety.
Title: Specifications tests for count time series models with covariates. Journal: Test.