Pub Date: 2023-11-08; DOI: 10.1080/02331888.2023.2278042
Sisheng Liu, Richard Charnigo
Abstract: Estimation of a function or its derivatives via nonparametric regression requires selection of one or more tuning parameters. In the present work, we propose a tuning parameter selection criterion called DCp for nonparametric derivative estimation in random design. Our criterion is general in that it can be applied with any nonparametric estimation method that is linear in the observed outcomes. Charnigo et al. [A generalized Cp criterion for derivative estimation. Technometrics. 2011;53(3):238–253] had proposed a GCp criterion for a similar purpose, assuming fixed covariate values and constant error variance. Here we consider the setting with random design and non-constant error variance, since the covariate values will not generally be fixed and equally spaced in real data applications. We justify DCp in this setting both theoretically and by simulation. We also illustrate the use of DCp with two economics data sets.
Keywords: Nonparametric derivative estimation; empirical derivative; tuning parameter selection; random covariate; heteroskedasticity
Acknowledgments: We gratefully acknowledge the coding work from Charnigo et al. [3], since some of the R code for our simulation study was adapted from their work. We thank the associate editor and two anonymous peer reviewers for constructive suggestions.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: Sisheng Liu's research is supported by the Scientific Research Fund of Hunan Provincial Education Department [grant number 22B0037].
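To make the phrase "linear in the observed outcomes" concrete, here is a minimal sketch of tuning parameter selection for a linear smoother. It uses the classical Cp score for estimating the function itself, not the DCp or GCp criteria of the abstract, whose exact forms are not given here. The Nadaraya–Watson smoother with a Gaussian kernel, the function names, and the assumption that the error variance `sigma2` is known are all illustrative choices.

```python
import math


def nw_weights(x0, xs, h):
    """Nadaraya-Watson weights at x0 with a Gaussian kernel and bandwidth h.

    The fitted value at x0 is sum(w_j * y_j), so the estimator is linear in y.
    """
    kernel = [math.exp(-0.5 * ((x0 - xi) / h) ** 2) for xi in xs]
    total = sum(kernel)
    return [k / total for k in kernel]


def cp_score(xs, ys, h, sigma2):
    """Classical Cp for a linear smoother: RSS/n + 2 * sigma2 * trace(L) / n.

    L is the smoother matrix whose i-th row is nw_weights(xs[i], xs, h).
    sigma2 is assumed known here (an illustrative simplification).
    """
    n = len(xs)
    rss = 0.0
    trace = 0.0
    for i, xi in enumerate(xs):
        w = nw_weights(xi, xs, h)
        fitted = sum(wj * yj for wj, yj in zip(w, ys))
        rss += (ys[i] - fitted) ** 2
        trace += w[i]  # diagonal entry L_ii
    return rss / n + 2.0 * sigma2 * trace / n


def select_bandwidth(xs, ys, candidates, sigma2):
    """Return the candidate bandwidth with the smallest Cp score."""
    return min(candidates, key=lambda h: cp_score(xs, ys, h, sigma2))
```

The penalty term grows with the trace of the smoother matrix, so very small bandwidths, which nearly interpolate the data, are discouraged once the noise variance is non-negligible.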
Title: Tuning parameter selection for nonparametric derivative estimation in random design
Journal: Statistics
Pub Date: 2023-11-08; DOI: 10.1080/02331888.2023.2280072
Quinn Forzley, Shakhawat Hossain, Shahedul A. Khan
Abstract: This paper considers the pretest and shrinkage estimation methods for estimating regression parameters of the generalized log-logistic proportional hazard (PH) model. This model is a simple extension of the log-logistic model, which is closed under the PH relationship. The generalized log-logistic PH model also has attributes similar to those of the Weibull model. We consider this model for right-censored data when some parameters shrink to a restricted subspace. This subspace information on the parameters is used to shrink the unrestricted model estimates toward the restricted model estimates. We then optimally combine the unrestricted and restricted estimates in order to define pretest and shrinkage estimators. Although this estimation procedure may increase bias, it also reduces the overall mean squared error. The efficacy of the proposed model and estimation techniques is shown using a simulation study as well as an application to real data. We also compare the performance of generalized log-logistic, Weibull, and Cox PH models for unimodal and increasing hazards. The shrinkage estimator poses less risk than the maximum likelihood estimator when the shrinkage dimension exceeds two; this is shown through simulation and real data applications.
Keywords: Generalized log-logistic distribution; Weibull distribution; Cox proportional hazard model; maximum likelihood; Monte Carlo simulation; shrinkage and pretest estimators
2020 Mathematics Subject Classification: 62N02
Acknowledgements: The authors are thankful to the editor, associate editor, and two referees for their valuable and insightful comments, which have significantly enhanced the quality of this article.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: This research work was partially supported by NSERC through Discovery Grants to S Hossain (#419428) and SA Khan (#368532).
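The way unrestricted and restricted estimates are combined can be sketched as follows. The abstract does not state the formulas, so this assumes the standard textbook pretest and Stein-type (positive-part) shrinkage forms; the function names, the test statistic `test_stat` (for example a likelihood ratio statistic), and the externally supplied `critical_value` are hypothetical.

```python
def pretest_estimator(beta_u, beta_r, test_stat, critical_value):
    """Pretest estimator: keep the restricted estimate unless the test rejects.

    critical_value is supplied by the caller (e.g. a chi-square quantile).
    """
    if test_stat <= critical_value:
        return list(beta_r)
    return list(beta_u)


def shrinkage_estimator(beta_u, beta_r, test_stat, k, positive_part=True):
    """Stein-type shrinkage: beta_r + (1 - (k - 2) / T) * (beta_u - beta_r).

    k is the number of restrictions (the shrinkage dimension) and must exceed
    two; T = test_stat must be positive. With positive_part=True the weight is
    truncated at zero so the estimate never overshoots the restricted value.
    """
    if k <= 2:
        raise ValueError("shrinkage dimension k must exceed 2")
    if test_stat <= 0:
        raise ValueError("test statistic must be positive")
    weight = 1.0 - (k - 2) / test_stat
    if positive_part:
        weight = max(weight, 0.0)
    return [r + weight * (u - r) for u, r in zip(beta_u, beta_r)]
```

A large test statistic (strong evidence against the restriction) gives a weight near one and returns essentially the unrestricted estimate; a small one pulls the estimate toward the restricted model.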
Title: Generalized log-logistic proportional hazard model: a non-penalty shrinkage approach
Journal: Statistics
Pub Date: 2023-11-06; DOI: 10.1080/02331888.2023.2278034
Hui Jiang, Guangyu Yang, Mingming Yu
In this paper, we consider the normalized least squares estimator of the parameter in a mildly stationary first-order autoregressive (AR(1)) model with dependent errors which are modeled as a mildly stationary AR(1) process. By martingale methods, we establish the moderate deviations for the least squares estimators of the regressor and error, which can be applied to understand the near-integrated second order autoregressive processes. As an application, we also obtain the moderate deviations for the Durbin-Watson statistic.
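For readers unfamiliar with the two statistics named in this abstract, the following sketch computes the ordinary least squares estimator of an AR(1) coefficient (no intercept) and the Durbin-Watson statistic from a residual series. These are the textbook definitions; the normalization and the mildly stationary asymptotics studied in the paper are not reproduced, and the function names are illustrative.

```python
def ar1_least_squares(x):
    """Least squares estimate of theta in x_t = theta * x_{t-1} + error."""
    numerator = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    denominator = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return numerator / denominator


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by their sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator
```

Values of the Durbin-Watson statistic near 2 are consistent with uncorrelated residuals, while values near 0 or 4 point to positive or negative first-order autocorrelation, respectively.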
Title: Moderate deviations for the mildly stationary autoregressive model with dependent errors
Journal: Statistics
Pub Date: 2023-11-03; DOI: 10.1080/02331888.2023.2269588
Ling Peng, Xiangyong Tan, Peiwen Xiao, Zeinab Rizk, Xiaohui Liu
Abstract: Trace regression has received a lot of attention due to its ability to account for matrix-type covariates, including panel data, images, and genomic microarrays as special cases. However, most of its existing research focuses on the case of mean regression. In this paper, we consider the expectile trace regression, which can provide a more diversified picture of the regression relationship at different expectiles, via the low-rank and group sparsity regularization. The upper bound for the statistical rate of convergence of the regularized estimator is established under some mild conditions. Some simulations, as well as a real data example, are also provided to illustrate the finite sample performance of the developed expectile trace regression.
Keywords: Expectile trace regression; low-rank; upper bound; convergence rate; matrix-type covariates
2020 Mathematics Subject Classifications: 62J99; 62H12
Acknowledgements: The authors thank one anonymous referee and the associate editor for their valuable comments, which have led to many improvements to this paper.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: Ling Peng's research was supported by the NSF of China (Grant No. 12201259), Jiangxi Provincial NSF (Grant No. 20224BAB211008), and the Science & Technology research project of the Education Department of Jiangxi Province (Grant No. GJJ2200537). Xiangyong Tan's research was supported by the NSF of China (Grant No. 12201260), Jiangxi Provincial NSF (Grant No. 20212BAB211010), and China Postdoctoral Science Foundation (2022M711425). Xiaohui Liu's research is supported by NSF of China (Grant No. 11971208), the National Social Science Foundation of China (21&ZD152), and the Outstanding Youth Fund Project of the Science and Technology Department of Jiangxi Province (No. 20224ACB211003).
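Expectile regression replaces the squared loss of mean regression with an asymmetrically weighted squared loss. The sketch below shows that loss and computes a plain sample expectile by iteratively reweighted averaging. It illustrates the loss only, not the regularized trace regression estimator of the paper; the function names and the stopping rule are illustrative assumptions.

```python
def expectile_loss(u, tau):
    """Asymmetric squared loss |tau - 1(u < 0)| * u^2 for 0 < tau < 1."""
    weight = tau if u >= 0 else 1.0 - tau
    return weight * u * u


def sample_expectile(y, tau, tol=1e-12, max_iter=1000):
    """Minimise sum(expectile_loss(y_i - m, tau)) over m.

    The first-order condition says m is a weighted mean of y with weight tau
    above m and 1 - tau below m, so we iterate that fixed-point equation.
    """
    m = sum(y) / len(y)  # tau = 0.5 gives the ordinary mean
    for _ in range(max_iter):
        weights = [tau if v > m else 1.0 - tau for v in y]
        new_m = sum(w * v for w, v in zip(weights, y)) / sum(weights)
        if abs(new_m - m) <= tol:
            return new_m
        m = new_m
    return m
```

Varying `tau` between 0 and 1 traces out different parts of the response distribution, which is the "more diversified picture" the abstract refers to.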
Title: Expectile trace regression via low-rank and group sparsity regularization
Journal: Statistics
Pub Date: 2023-10-10; DOI: 10.1080/02331888.2023.2268314
Mengmei Xi, Chunhua Wang, Xuejun Wang
Abstract: In this paper, we primarily focus on the edge frequency polygon estimator of $f(x)$, which represents the probability density function of a sequence of φ-mixing random variables $\{X_i, i \ge 1\}$. We establish the uniformly strong consistency and the convergence rate of asymptotic normality for the edge frequency polygon estimator under suitable conditions. Notably, the convergence rate achieves $O(n^{-1/6})$, which is more precise compared to the corresponding rate mentioned in the existing literature. Additionally, we present simulation studies to validate the theoretical results.
Keywords: Berry–Esseen bounds; uniformly strong consistency; density function; edge frequency polygon estimator
Mathematical Subject Classifications: 60E05; 62G20
Acknowledgements: The authors are most grateful to the Editor and anonymous referee for carefully reading the manuscript and valuable suggestions which helped in improving an earlier version of this paper.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: Supported by the National Social Science Foundation of China (22BTJ059), the National Natural Science Foundation of China (12201600), and the Natural Science Foundation of Anhui Province (2108085MA06), and the Postdoctoral Science Foundation of China (2022M713056).
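As a rough illustration of the estimator discussed in this abstract, the sketch below builds an edge frequency polygon under its usual description: at each histogram bin edge the estimate is the average of the two adjacent bin heights, and values in between are obtained by linear interpolation. The abstract does not define the estimator, so this definition, the function name, and the `origin` parameter are assumptions.

```python
import math
from collections import Counter


def edge_frequency_polygon(data, bin_width, origin=0.0):
    """Return a function x -> density estimate.

    Bin k covers [origin + k * bin_width, origin + (k + 1) * bin_width).
    """
    n = len(data)
    counts = Counter(math.floor((v - origin) / bin_width) for v in data)

    def height(k):
        # Histogram density height of bin k.
        return counts.get(k, 0) / (n * bin_width)

    def edge_value(k):
        # Value at the left edge of bin k: average of bins k - 1 and k.
        return 0.5 * (height(k - 1) + height(k))

    def density(x):
        position = (x - origin) / bin_width
        k = math.floor(position)
        frac = position - k  # relative position inside bin k
        return (1.0 - frac) * edge_value(k) + frac * edge_value(k + 1)

    return density
```

Because the estimate is piecewise linear with edge values that average neighbouring bin heights, it integrates to one just as the underlying histogram does.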
Title: Uniformly strong consistency and the rates of asymptotic normality for the edge frequency polygons
Journal: Statistics
Pub Date: 2023-09-25; DOI: 10.1080/02331888.2023.2262665
Hengrui Luo, Steven N. MacEachern, Mario Peruggia
Abstract: Topological data analysis (TDA) allows us to explore the topological features of a dataset. Among topological features, lower dimensional ones have recently drawn the attention of practitioners in mathematics and statistics due to their potential to aid the discovery of low dimensional structure in a data set. However, lower dimensional features are usually challenging to detect based on finite samples and using TDA methods that ignore the probabilistic mechanism that generates the data. In this paper, lower dimensional topological features occurring as zero-density regions of density functions are introduced and thoroughly investigated. Specifically, we consider sequences of coverings for the support of a density function in which the coverings consist of balls with shrinking radii. We show that, when these coverings satisfy certain sufficient conditions as the sample size goes to infinity, we can detect lower dimensional, zero-density regions with increasingly higher probability while guarding against false detection. We supplement the theoretical developments with a discussion of simulated experiments that elucidate the behaviour of the methodology for different choices of the tuning parameters that govern the construction of the covering sequences and characterize the asymptotic results.
Keywords: Topological data analysis; covering construction; zero-density regions
Acknowledgments: We thank the anonymous referee, whose comments greatly improved the article. We thank the AE for helpful comments and handling.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: This material is based upon work supported by the National Science Foundation [grant numbers DMS-1613110, DMS-2015552, and SES-1921523].
Title: Asymptotics of lower dimensional zero-density regions
Journal: Statistics
Pub Date: 2023-09-03; DOI: 10.1080/02331888.2023.2256948
Angelo Alcaraz, Gabriela Ciuperca
The paper focuses on the automatic selection of the grouped explanatory variables in a high-dimensional model when the model error is asymmetric. After introducing the model and notation, we define the adaptive group LASSO expectile estimator, for which we prove the oracle properties: sparsity and asymptotic normality. Afterwards, the results are generalized by considering the asymmetric $L_q$-norm loss function. The theoretical results are obtained in several cases with respect to the number of variable groups. This number can be fixed or dependent on the sample size $n$, with the possibility that it is of the same order as $n$. Note that these new estimators allow us to consider weaker assumptions on the data and on the model errors than the usual ones. A simulation study demonstrates the competitive performance of the proposed penalized expectile regression, especially when the sample size is close to the number of explanatory variables and the model errors are asymmetric. An application to air pollution data is considered.
Title: Automatic selection by penalized asymmetric $L_q$-norm in a high-dimensional model with grouped variables
Journal: Statistics
Pub Date: 2023-09-03; DOI: 10.1080/02331888.2023.2260037
Alexandre Berred, Alexei Stepanov
Abstract: In this work, we investigate spacings based on order statistics obtained from continuous distribution functions. At the beginning of the paper, we present distributional results for spacings and a method of classification of distributions according to their tails. Then we use this method to derive asymptotic results for spacings. By applying some special versions of the Borel–Cantelli lemma, we obtain their strong limit results. At the end of the paper, we present some illustrative examples.
Keywords: Order statistics; spacings; limit results
MSC2020-Mathematical Sciences Classification System: 60; 62
Acknowledgments: The authors are deeply indebted to the two anonymous Reviewers for their interesting comments and remarks. The work of the second author was supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-02-2021-1748).
Disclosure statement: No potential conflict of interest was reported by the author(s).
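The objects studied in this abstract are easy to compute: the spacings of a sample are the differences between consecutive order statistics. A minimal sketch, with an illustrative function name:

```python
def spacings(sample):
    """Return X_(2) - X_(1), ..., X_(n) - X_(n-1) for the sorted sample."""
    ordered = sorted(sample)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]
```

For a sample of size n this yields n - 1 non-negative spacings whose sum is the sample range. The last entries are the upper-extreme spacings, whose behaviour depends on the tail of the underlying distribution.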
Title: On asymptotic properties of spacings
Journal: Statistics
Pub Date: 2023-09-03; DOI: 10.1080/02331888.2023.2258429
Lei Yang, Yongzhao Shao
Abstract: Multi-centre studies are increasingly used for borrowing strength from multiple research groups to obtain reproducible study findings. Regression analysis is widely used for analysing multi-group studies; however, some of the regression predictors are nonlinear and/or often measured with batch effects. Also, the group compositions are potentially heterogeneous across different centres. The conventional pooled data analysis can cause biased regression estimates. This paper proposes an integrated partially linear regression model (IPLM) to account for predictor nonlinearity, general batch effects, group composition heterogeneity, and potential measurement error in covariates simultaneously. A local linear regression-based approach is employed to estimate the nonlinear component, and a regularization procedure is introduced to identify the predictors' effects. The IPLM-based method has estimation consistency and variable-selection consistency. Moreover, it has a fast computing algorithm, and its effectiveness is supported by simulation studies. A multi-centre Alzheimer's disease research project is provided to illustrate the proposed IPLM-based analysis.
Keywords: Multi-centre study; data harmonization; partially linear regression model; general batch effects; group composition heterogeneity
Acknowledgements: The authors would like to thank the reviewers and the associate editor for careful reading and for many constructive suggestions. The authors would like to thank Drs. Mony de Leon, Ricardo Osorio, and Elizabeth Pirraglia for sharing with us the NYU Alzheimer's disease data sets used in Section 5 for the illustration of our proposed model and analysis. The NYU study data are available from figshare (https://figshare.com/s/16d233d4822b810bcd9b, DOI: 10.6084/m9.figshare.5758554). One part of the data used in the preparation of the example in Section 5 of this article was obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/data-samples/access-data/). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the design, analysis or writing of this report. A complete list of ADNI investigators is at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: This research was partially supported by the United States National Institutes of Health grants (NIA grants P30AG066512, P01AG060882, NCI grants P50CA225450, P30CA016087) and Centers for Disease Control and Prevention (CDC) grant U01OH012486.
Integrated partially linear model for multi-centre studies with heterogeneity and batch effect in covariates
Lei Yang, Yongzhao Shao
Statistics. Pub Date: 2023-09-03. DOI: 10.1080/02331888.2023.2258429 (https://doi.org/10.1080/02331888.2023.2258429)
Pub Date: 2023-09-03. DOI: 10.1080/02331888.2023.2258249
Dong Han, Fugee Tsung, Jinguo Xian
Abstract: This article proposes a method for optimizing a control chart (sequential test) to detect an abnormal change in a finite, or even small, sequence of samples when both the change-point and the post-change probability distribution are unknown. We introduce a performance measure that evaluates the detection performance of a control chart for a given charting statistic, and we construct an optimal control chart under this measure. The optimization method is illustrated by numerical simulations of three optimized control charts (Shewhart, CUSUM and EWMA) and by a real data example.
Keywords: Optimization of control chart; change-point detection; finite samples
MSC 2010 Subject Classifications: Primary 62L10; secondary 62L15
Acknowledgments: We sincerely thank the two reviewers for their valuable comments on the manuscript.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: This work is supported by RGC Competitive Earmarked Research Grants and the National Natural Science Foundation of China (11531001).
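For readers unfamiliar with the charting statistics named in the abstract above, the following sketch shows the standard textbook recursions for a one-sided CUSUM chart and a two-sided EWMA chart. It is not the optimized chart construction proposed in the article; the function names and the parameters `slack`, `threshold`, `weight`, and `limit` are assumptions chosen for illustration.

```python
def cusum_alarm_time(samples, target_mean, slack, threshold):
    """Textbook one-sided (upper) CUSUM chart.

    Recursion: S_t = max(0, S_{t-1} + x_t - target_mean - slack).
    Returns the first 1-based time t with S_t > threshold, or None.
    """
    s = 0.0
    for t, x in enumerate(samples, start=1):
        s = max(0.0, s + x - target_mean - slack)
        if s > threshold:
            return t
    return None


def ewma_alarm_time(samples, target_mean, weight, limit):
    """Textbook two-sided EWMA chart with a fixed control limit.

    Recursion: Z_t = weight * x_t + (1 - weight) * Z_{t-1}, Z_0 = target_mean.
    Returns the first 1-based time t with |Z_t - target_mean| > limit, or None.
    """
    z = target_mean
    for t, x in enumerate(samples, start=1):
        z = weight * x + (1.0 - weight) * z
        if abs(z - target_mean) > limit:
            return t
    return None


if __name__ == "__main__":
    # Short deterministic sequence: the mean shifts from 0 to 1 after time 10.
    data = [0.0] * 10 + [1.0] * 10
    print(cusum_alarm_time(data, target_mean=0.0, slack=0.5, threshold=2.0))
    print(ewma_alarm_time(data, target_mean=0.0, weight=0.5, limit=0.8))
```

Both functions return `None` when no alarm is raised within the finite sequence, which matters in the finite-sample setting the abstract describes, since a chart may reach the end of the data without signalling.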
Title: An optimization method for change-point monitoring in finite samples sequence. Journal: Statistics. DOI link: https://doi.org/10.1080/02331888.2023.2258249