This paper focuses on specification testing for functional linear quantile regression models. A nonparametric test statistic is proposed based on the orthogonality of the residual and its conditional expectation. Under mild assumptions, the proposed statistic is proved to be asymptotically standard normal under the null hypothesis and to tend to infinity under the alternative hypothesis. The asymptotic power of the test is also presented for some local alternative hypotheses. The test is easy to implement, and simulations show that it is powerful even for small sample sizes. A real data example using the Capital Bikeshare data is presented for illustration.
{"title":"A consistent specification test for functional linear quantile regression models","authors":"Lili Xia, Zhongzhan Zhang, Gongming Shi","doi":"10.4310/22-sii754","DOIUrl":"https://doi.org/10.4310/22-sii754","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743411","RegionNum":4,"RegionCategory":"Mathematics"}
We develop a new selection model for nonignorable missing values in multivariate categorical response variables by assuming that the response variables and their missingness can be summarized into categorical latent variables. Our proposed model contains two categorical latent variables. One latent variable summarizes the response patterns while the other describes the response variables’ missingness. Our selection model is an alternative to other incomplete data methods when the incomplete data mechanism is nonignorable. We conduct simulation studies to evaluate the performance of the proposed method and analyze the General Social Survey 2018 data to illustrate its use.
{"title":"A latent class selection model for categorical response variables with nonignorably missing data","authors":"Jung Wun Lee, Ofer Harel","doi":"10.4310/22-sii753","DOIUrl":"https://doi.org/10.4310/22-sii753","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743414","RegionNum":4,"RegionCategory":"Mathematics"}
In this paper, we consider robust estimation for a class of partially linear spatial autoregressive models. By combining empirical likelihood and composite quantile regression methods, we propose a robust empirical likelihood estimation procedure. Under some regularity conditions, the proposed empirical log-likelihood ratio is proved to be asymptotically chi-squared, and the convergence rate of the estimator of the nonparametric component is also derived. Simulation studies are conducted to further illustrate the performance of the proposed method, and the results show that the proposed method is more robust.
{"title":"Composite quantile regression based robust empirical likelihood for partially linear spatial autoregressive models","authors":"Peixin Zhao, Suli Cheng, Xiaoshuang Zhou","doi":"10.4310/22-sii764","DOIUrl":"https://doi.org/10.4310/22-sii764","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743418","RegionNum":4,"RegionCategory":"Mathematics"}
Pub Date: 2024-07-19 DOI: 10.4310/sii.2024.v17.n4.a7
Yuanyao Tan, Xialing Wen, Wei Liang, Ying Yan
There has been growing attention to covariate adjustment for objective and efficient treatment effect estimation in randomized clinical trials. In this paper, we propose a weighting approach, based on the empirical likelihood method, to extract covariate information in randomized clinical trials with possible missingness in the outcomes. Multiple regression models are imposed to delineate the missing data mechanism and the covariate-outcome relationship, respectively. We demonstrate that the proposed estimator is suitable for objective inference of treatment effects. Theoretically, we prove that the proposed approach is multiply robust and semiparametrically efficient. We conduct simulations and a real data study to compare it with existing methods.
{"title":"Empirical likelihood-based weighted estimation of average treatment effects in randomized clinical trials with missing outcomes","authors":"Yuanyao Tan, Xialing Wen, Wei Liang, Ying Yan","doi":"10.4310/sii.2024.v17.n4.a7","DOIUrl":"https://doi.org/10.4310/sii.2024.v17.n4.a7","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743415","RegionNum":4,"RegionCategory":"Mathematics"}
The promotion time cure model, also called the bounded cumulative hazards (BCH) model, was proposed as an alternative to mixture cure models. In the present paper, this model is modified to provide a class of cure rate models based on a non-homogeneous Poisson process (NHPP). The properties of this class are studied. When censored observations are present, distinguishing censored individuals from the cured group leads to identifiability issues for the members of this class. These identifiability issues are investigated, and a few members of the class are provided. Simulation results are given for an example of the NHPP cure rate model with exponentiated intensity and exponential baseline. The application of the model is illustrated using the E1684 real data, from a study that included 284 patients in the Eastern Cooperative Oncology Group (ECOG) phase III clinical trial.
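For readers unfamiliar with the starting point, the standard promotion time (BCH) formulation can be written out as below. This is a textbook-style sketch, not the paper's NHPP extension; the symbols N, theta, F and S_pop are notation assumed here and do not appear in the abstract.

```latex
% Assumed notation: N = number of latent competing causes, N ~ Poisson(theta);
% F = proper cdf of the i.i.d. promotion times; S_pop = population survival function.
\begin{align*}
S_{\mathrm{pop}}(t) &= \sum_{n \ge 0} \Pr(N = n)\,\{1 - F(t)\}^{n}
  = \sum_{n \ge 0} \frac{e^{-\theta}\theta^{n}}{n!}\,\{1 - F(t)\}^{n}
  = \exp\{-\theta F(t)\}, \\
H_{\mathrm{pop}}(t) &= -\log S_{\mathrm{pop}}(t) = \theta F(t) \le \theta, \\
\lim_{t \to \infty} S_{\mathrm{pop}}(t) &= e^{-\theta} > 0 .
\end{align*}
```

The bounded cumulative hazard gives the model its name, and the positive limit is the cure fraction. A subject censored late in follow-up and a cured subject both show no event, which is the source of the identifiability question the abstract raises.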
{"title":"Modeling and identifiability of non-homogenous Poisson process cure rate model","authors":"Soorya Surendren, Asha Gopalakrishnan, Anup Dewanji","doi":"10.4310/22-sii763","DOIUrl":"https://doi.org/10.4310/22-sii763","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743417","RegionNum":4,"RegionCategory":"Mathematics"}
In this paper, we develop a class of corrected post-model selection estimation methods to identify important explanatory variables in the parametric component of high-dimensional partially linear spatial autoregressive models with measurement errors. Compared with existing methods, the proposed method adds a step that re-estimates the parameters of the selected model after model selection. We show that the post-model selection estimator performs at least as well as the Lasso penalty estimator by establishing theorems on its model selection and estimation properties. Extensive simulation studies evaluate the finite sample performance of the proposed method and show its superiority over other methods. As an empirical illustration, we apply the proposed model and method to two real data sets.
{"title":"Variable selection and estimation for high-dimensional partially linear spatial autoregressive models with measurement errors","authors":"Zhensheng Huang, Shuyu Meng, Linlin Zhang","doi":"10.4310/22-sii758","DOIUrl":"https://doi.org/10.4310/22-sii758","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743412","RegionNum":4,"RegionCategory":"Mathematics"}
In many situations of interest, it is common to observe positive responses measured under several assessment conditions within the same subjects. Usually, such a scenario implies positive skewness in the response distributions, along with within-subject dependency. It is known that neglecting these features can lead to misleading inference. In this paper we extend the beta prime regression model for modeling asymmetric positive data, while taking into account the dependence structure. We consider a useful predictor for modeling a suitable transformation of the mean, along with a homogeneous covariance structure. The proposed model is an interesting competitor to the flexible Tweedie regression models, which include distributions such as the Gamma and Inverse Gaussian. Furthermore, residual analysis and influence diagnostic tools are proposed. A Monte Carlo experiment is conducted to evaluate the performance of the proposed methodology under small and moderate sample sizes, along with suitable discussions. The methodology is illustrated with the analysis of a real longitudinal dataset. An R package was developed to allow practitioners to use the methodology described in this paper.
{"title":"Flexible quasi-beta prime regression models for dependent continuous positive data","authors":"João Freitas, Juvêncio Nobre, Caio Azevedo","doi":"10.4310/22-sii762","DOIUrl":"https://doi.org/10.4310/22-sii762","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743416","RegionNum":4,"RegionCategory":"Mathematics"}
Graphical models have long been studied in statistics as a tool for inferring conditional independence relationships among a large set of random variables. Most existing works in graphical modeling focus on cases where the data are Gaussian or mixed and the variables are linearly dependent. In this paper, we propose a double regression method for learning graphical models in the high-dimensional nonlinear and non-Gaussian setting, and prove that the proposed method is consistent under mild conditions. The proposed method works by performing a series of nonparametric conditional independence tests. The conditioning set of each test is reduced via a double regression procedure in which a model-free sure independence screening procedure or a sparse deep neural network can be employed. The numerical results indicate that the proposed method works well for high-dimensional nonlinear and non-Gaussian data.
{"title":"A double regression method for graphical modeling of high-dimensional nonlinear and non-Gaussian data","authors":"Siqi Liang, Faming Liang","doi":"10.4310/22-sii756","DOIUrl":"https://doi.org/10.4310/22-sii756","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743413","RegionNum":4,"RegionCategory":"Mathematics"}
In Bayesian model selection and hypothesis testing, choosing suitable prior distributions is an important problem that calls for caution. More often than not, objective Bayesian analyses use noninformative priors such as Jeffreys priors. However, since these noninformative priors are often improper, the Bayes factor associated with them is not well-defined. To circumvent this indeterminacy, the Bayes factor can be corrected by intrinsic and fractional methods. These adjusted Bayes factors are asymptotically equivalent to the ordinary Bayes factors calculated with proper priors, called intrinsic priors. In this article, we derive intrinsic priors for testing the point null hypothesis under a zero-inflated Poisson distribution. Extensive simulation studies are performed to support the theoretical results on asymptotic equivalence, and two real datasets are analyzed to illustrate the methodology developed in this paper.
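To make the indeterminacy concrete, the zero-inflated Poisson distribution and the Bayes factor can be written as below. This is a sketch in assumed notation (omega for the zero-inflation probability, lambda for the Poisson mean, c_i for the arbitrary constant of an improper prior); the abstract does not state which parameter the point null fixes, so none is specified here.

```latex
% Assumed notation: omega = zero-inflation probability, lambda = Poisson mean,
% f_i = sampling density and pi_i = prior under model M_i, i = 0, 1.
\begin{align*}
\Pr(X = 0) &= \omega + (1 - \omega)\,e^{-\lambda}, \\
\Pr(X = x) &= (1 - \omega)\,\frac{e^{-\lambda}\lambda^{x}}{x!}, \qquad x = 1, 2, \dots, \\
B_{10}(\mathbf{x}) &= \frac{\int f_1(\mathbf{x} \mid \theta_1)\,\pi_1(\theta_1)\,d\theta_1}
                          {\int f_0(\mathbf{x} \mid \theta_0)\,\pi_0(\theta_0)\,d\theta_0}, \\
\pi_i(\theta_i) = c_i\,h_i(\theta_i) \text{ improper}
  &\;\Longrightarrow\; B_{10}(\mathbf{x}) \text{ is defined only up to the arbitrary factor } c_1 / c_0 .
\end{align*}
```

Intrinsic and fractional Bayes factors remove this arbitrary ratio by using part of the data, or a fraction of the likelihood, to turn the improper priors into proper ones before the comparison is made.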
{"title":"Default Bayesian testing for the zero-inflated Poisson distribution","authors":"Yewon Han, Haewon Hwang, Hon Keung Ng, Seong Kim","doi":"10.4310/22-sii750","DOIUrl":"https://doi.org/10.4310/22-sii750","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743410","RegionNum":4,"RegionCategory":"Mathematics"}
Modern statistical analyses often encounter massive datasets with heavy-tailed distributions. For such massive datasets, traditional estimation methods can hardly be used to estimate the extreme value index directly. To address this issue, we propose a subsampling-based method. Specifically, multiple subsamples are drawn from the whole dataset by simple random subsampling with replacement. Based on each subsample, an approximate maximum likelihood estimator is computed. The resulting estimators are then averaged to form a more accurate one. Under appropriate regularity conditions, we show theoretically that the proposed estimator is consistent and asymptotically normal. With the help of the estimated extreme value index, we can consistently estimate high-level quantiles and tail probabilities of a heavy-tailed random variable. Extensive simulation experiments are provided to demonstrate the promising performance of our method. A real data analysis is also presented for illustration.
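The subsample-then-average scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hill estimator stands in for the paper's approximate maximum likelihood estimator, the data are synthetic Pareto draws, and the function names and tuning values (`n_subsamples`, `subsample_size`, `k`) are hypothetical.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest values.

    Assumption: used here as a stand-in for the paper's approximate MLE.
    """
    top = sorted(sample, reverse=True)[: k + 1]
    threshold = top[k]  # the (k+1)-th largest observation
    return sum(math.log(x / threshold) for x in top[:k]) / k


def subsampled_evi(data, n_subsamples, subsample_size, k, rng):
    """Average the per-subsample estimates over simple random subsamples
    drawn with replacement from the full dataset."""
    estimates = [
        hill_estimator(rng.choices(data, k=subsample_size), k)
        for _ in range(n_subsamples)
    ]
    return sum(estimates) / len(estimates)


rng = random.Random(0)
# Synthetic heavy-tailed data: Pareto with shape 2, so the true index is 1/2.
data = [rng.paretovariate(2.0) for _ in range(100_000)]
gamma_hat = subsampled_evi(data, n_subsamples=20, subsample_size=2_000, k=200, rng=rng)
print(f"estimated extreme value index: {gamma_hat:.3f}")
```

Each subsample is small enough to sort cheaply, so the full dataset never has to be sorted at once; averaging reduces the variance of the individual estimates.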
{"title":"Estimating extreme value index by subsampling for massive datasets with heavy-tailed distributions","authors":"Yongxin Li, Liujun Chen, Deyuan Li, Hansheng Wang","doi":"10.4310/22-sii749","DOIUrl":"https://doi.org/10.4310/22-sii749","PeriodicalId":51230,"journal":{"name":"Statistics and Its Interface"},"PeriodicalIF":0.8,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141743408","RegionNum":4,"RegionCategory":"Mathematics"}