Pub Date : 2023-11-27 DOI: 10.1080/10485252.2023.2284896
Carina Beering, Anne Leucht
We provide a functional central limit theorem for a broad class of smooth functions for possibly non-causal multivariate linear processes with time-varying coefficients. Since the limiting processe...
Title : A bootstrap functional central limit theorem for time-varying linear processes
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-11-27 DOI: 10.1080/10485252.2023.2284897
John O'Quigley
The log-rank test can be viewed as nonparametric from the standpoint of a series of 2×2 tables, as semi-parametric from the standpoint of the proportional hazards model and as parametric from the v...
Title : Integrated log-rank test
Journal : Journal of Nonparametric Statistics
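The abstract above describes the log-rank test as nonparametric when seen as a series of 2×2 tables. As an illustration only, here is a minimal Python sketch of the standard two-sample log-rank statistic built that way. It is not taken from the paper; the function name `logrank_statistic` and the argument layout (parallel lists of times, event indicators, and 0/1 group labels) are assumptions made for this example.

```python
def logrank_statistic(times, events, groups):
    """Two-sample log-rank chi-square statistic (1 degree of freedom).

    times  : observed times (event or censoring)
    events : 1 if the time is an event, 0 if censored
    groups : 0 or 1, the group label of each subject

    At every distinct event time a 2x2 table (group x event / no event)
    is formed among the subjects still at risk. Observed minus expected
    events in group 1 and the hypergeometric variances are summed.
    """
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Subjects at risk just before time t, overall and in group 1.
        n = sum(1 for s in times if s >= t)
        n1 = sum(1 for s, g in zip(times, groups) if s >= t and g == 1)
        # Events at time t, overall and in group 1.
        d = sum(1 for s, e in zip(times, events) if s == t and e)
        d1 = sum(1 for s, e, g in zip(times, events, groups)
                 if s == t and e and g == 1)
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:  # the variance term is undefined for a single subject
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return observed_minus_expected ** 2 / variance


# Hypothetical toy data: four uncensored subjects alternating between groups.
stat = logrank_statistic([1, 2, 3, 4], [1, 1, 1, 1], [1, 0, 1, 0])
```

Under the null hypothesis of equal hazards the statistic is usually compared with a chi-square distribution with one degree of freedom. The sketch assumes at least one table contributes a positive variance; otherwise the final division fails.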
Pub Date : 2023-11-10 DOI: 10.1080/10485252.2023.2280003
Alejandro Cholaquidis, Ricardo Fraiman, Manuel Hernández-Banadik
Abstract: New continuous-time models and statistical methods have been developed to estimate some sets related to animal movement, such as the home-range and the core-area among others, when the information of the trajectory is provided by a GPS. Because data transfer costs and GPS battery life are practical constraints, the experimental designer must make critical sampling decisions to maximise information. We introduce the on–off sampling scheme, where the GPS is alternately on and off. This scheme is already used in practice but with insufficient statistical theoretical support. We prove the consistency of home-range estimators with an underlying reflected diffusion model under this sampling method. The same rate of convergence is achieved as in the case where the GPS is always on for the whole experiment. This is illustrated by a simulation study and real data. We also provide estimators of the stationary distribution, its level sets and the drift function.
Keywords: Home-range estimation; reflected Brownian motion with drift; stationary distribution; level set estimation
2010 Mathematics Subject Classifications: 62M20; 62G20; 60J70
Acknowledgments: We thank Dr. Stephen Blake, of the Max Planck Institute for Ornithology, for facilitating access to the data set that was used in this paper. The data that support the findings of this study are openly available in Movebank at https://www.movebank.org/cms/webapp?gwt_fragment=page=studies,path=study1818825, reference number 1818825. We thank the editor and three referees for their constructive comments, which significantly improved the present version of the manuscript.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: This work was supported by grants from ANII (Agencia Nacional de Investigación e Innovación) [grant numbers POSNAC20191157608, FCE120191156054].
Title : Home-range estimation under a restricted sample scheme
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-11-10 DOI: 10.1080/10485252.2023.2280016
Mingyue Du, Mengzhu Yu
Abstract: Multivariate interval-censored failure time data occur when a failure time study involves several related failure times of interest and only interval-censored observations are available for each of them. Although a great deal of literature has been established for their regression analysis, there does not seem to exist an approach that applies to the situation where there exist both a cured subgroup and informative censoring, the focus of this paper. For the problem, a class of semiparametric transformation non-mixture cure models is presented and a two-step estimation procedure is proposed. For the implementation of the proposed method, an EM algorithm is developed. Numerical results suggest that the proposed method works well for practical situations and an application is provided.
Keywords: Informative censoring; multivariate interval-censored data; non-mixture cure model; transformation model
Acknowledgements: The authors wish to thank the Editor, Prof. Wenbin Lu, the Associate Editor and two reviewers for their helpful comments and suggestions that greatly improved the paper. The R code for the implementation of the proposed method is available from the second author upon request.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Title : Regression analysis of multivariate interval-censored failure time data with a cured subgroup and informative censoring
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-11-09 DOI: 10.1080/10485252.2023.2280004
Xuejun Wang, Xi Chen, Tien-Chung Hu, Andrei Volodin
Abstract: In this article, the complete f-moment convergence for m-asymptotic negatively associated random variables is investigated. As applications, we establish the strong consistency of the least squares estimator in the simple linear errors-in-variables models and the complete consistency of the estimator in the semiparametric regression model based on m-asymptotic negatively associated errors. We also give some simulations to assess the finite sample performance of the theoretical results.
Keywords: m-Asymptotic negatively associated random variables; complete f-moment convergence; consistency; errors-in-variables models; semiparametric regression models
Mathematics Subject Classifications: 60F15; 62G20
Acknowledgments: The authors are most grateful to the Editor and anonymous referee for carefully reading the manuscript and for valuable suggestions which helped in improving an earlier version of this paper.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: Supported by the National Social Science Foundation of China (22BTJ059).
Title : Complete f-moment convergence for m-asymptotic negatively associated random variables and related statistical applications
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-11-09 DOI: 10.1080/10485252.2023.2280022
Christian Hirsch, Johannes Krebs, Claudia Redenbach
Abstract: Motivated by the rapidly increasing relevance of virtual material design in the domain of materials science, it has become essential to assess whether topological properties of stochastic models for a spatial tessellation are in accordance with a given dataset. Recently, tools from topological data analysis such as the persistence diagram have made it possible to reach profound insights in a variety of application contexts. In this work, we establish the asymptotic normality of a variety of test statistics derived from a tessellation-adapted refinement of the persistence diagram. Since in applications it is common to work with tessellation data subject to interactions, we establish our main results for Voronoi and Laguerre tessellations whose generators form a Gibbs point process. We elucidate how these conceptual results can be used to derive goodness-of-fit tests, and then investigate their power in a simulation study. Finally, we apply our testing methodology to a tessellation describing real foam data.
Keywords: Tessellation; topological data analysis; goodness-of-fit; persistence diagram
2010 Mathematics Subject Classifications: 60K35; 60F10; 82C22
Acknowledgments: We thank the two anonymous referees for their careful reading of the manuscript. Their comments and suggestions substantially improved the quality of the presentation. We thank Anne Jung (Helmut Schmidt University Hamburg) for providing the foam sample and Christian Jung (RPTU Kaiserslautern-Landau) for computing the Laguerre approximation.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: Johannes Krebs was partially supported by the German Research Foundation (DFG), Grant Number KR-4977/2-1.
Title : Persistent homology based goodness-of-fit tests for spatial tessellations
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-11-02 DOI: 10.1080/10485252.2023.2277260
Yafan Guo, Derek S. Young
Abstract: Tolerance intervals in regression allow the user to quantify, with a specified degree of confidence, bounds for a specified proportion of the sampled population when conditioned on a set of covariate values. While methods are available for tolerance intervals in fully-parametric regression settings, the construction of tolerance intervals for nonparametric regression models has been treated in a limited capacity. This paper fills this gap and develops likelihood-based approaches for the construction of pointwise one-sided and two-sided tolerance intervals for nonparametric regression models. A numerical approach is also presented for constructing simultaneous tolerance intervals. An appealing facet of this work is that the resulting methodology is consistent with what is done for fully-parametric regression tolerance intervals. Extensive coverage studies are presented, which demonstrate very good performance of the proposed methods. The proposed tolerance intervals are calculated and interpreted for analyses involving a fertility dataset and a triceps measurement dataset.
Keywords: Bootstrap; boundary effects; coverage probabilities; k-factor; smoothing spline
AMS Subject Classifications: 62G08; 62G15
Acknowledgments: We thank the University of Kentucky Center for Computational Sciences and Information Technology Services Research Computing for their support and use of the Lipscomb Compute Cluster and associated research computing resources. The authors are also thankful to the Associate Editor and two reviewers who provided numerous insightful comments that improved the overall quality of this work.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Data availability statement: The fertility data are available at the HFC's website https://www.fertilitydata.org/cgi-bin/data.php. The triceps data are available in the R package MultiKink (Wan and Zhong 2020), and can be accessed by typing data(triceps).
Title : Approximate tolerance intervals for nonparametric regression models
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-10-27 DOI: 10.1080/10485252.2023.2275056
Minggen Lu, Chin-Shang Li, Karla D. Wagner
Abstract: We develop a practical and computationally efficient penalised estimation approach for partially linear additive models to zero-inflated binary outcome data. To facilitate estimation, B-splines are employed to approximate unknown nonparametric components. A two-stage iterative expectation-maximisation (EM) algorithm is proposed to calculate penalised spline estimates. The large-sample properties such as the uniform convergence and the optimal rate of convergence for functional estimators, and the asymptotic normality and efficiency for regression coefficient estimators are established. Further, two variance-covariance estimation approaches are proposed to provide reliable Wald-type inference for regression coefficients. We conducted an extensive Monte Carlo study to evaluate the numerical properties of the proposed penalised methodology and compare it to the competing spline method [Li and Lu. ‘Semiparametric Zero-Inflated Bernoulli Regression with Applications’, Journal of Applied Statistics, 49, 2845–2869]. The methodology is further illustrated by an egocentric network study.
Keywords: Additive Bernoulli regression; B-spline; EM algorithm; penalised estimation; zero-inflated
AMS Subject Classifications: 62G05; 62G20; 62G08
Acknowledgments: The authors are grateful to the Editor, the Associate Editor, and two reviewers for their useful comments and constructive suggestions which led to significant improvement in the revised manuscript.
Disclosure statement: No potential conflict of interest was reported by the author(s).
Funding: This research was partially supported by the National Institute on Drug Abuse (NIDA) of the National Institutes of Health under Award Number R01DA038185.
Title : Penalised estimation of partially linear additive zero-inflated Bernoulli regression models
Journal : Journal of Nonparametric Statistics
Pub Date : 2023-10-18 DOI: 10.1080/10485252.2023.2270079
Kin Yap Cheung, Stephen M. S. Lee
Abstract: We propose a new method for variable selection and prediction under a nonparametric regression setting, where a covariate may be missing either because its value is hidden from the observer or because it is inapplicable to the particular subject being observed. Despite its practical relevance, the problem has received little attention in the literature and its solutions are largely non-existent. Our proposal hinges on the construction of a modified Nadaraya–Watson estimator of the conditional mean regression function, with its bandwidths regularised to select variables and its weights adapted to accommodate different types of missingness. The method allows for information sharing across different missing data patterns without affecting consistency of the estimator. Unlike other conventional methods such as those based on imputations or likelihoods, our method requires only mild assumptions on the model and the missingness mechanism. For prediction we focus on finding relevant variables for predicting mean responses, conditional on covariate vectors subject to a given type of missingness. Our theoretical and numerical results show that the new method is consistent in variable selection and yields better prediction accuracy compared to existing methods.
Keywords: Nadaraya–Watson estimator; missing data; nonparametric regression; variable selection
Disclosure statement: No potential conflict of interest was reported by the author(s).
Title : A modified Nadaraya–Watson procedure for variable selection and nonparametric prediction with missing data
Journal : Journal of Nonparametric Statistics
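For readers unfamiliar with the estimator this entry builds on, the following is a minimal Python sketch of the classical Nadaraya–Watson estimator with a product Gaussian kernel and one bandwidth per covariate. It is not the modified procedure of the paper: it does no bandwidth regularisation and has no handling of missing covariates. The function name `nadaraya_watson` and its arguments are assumptions made for this illustration.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidths):
    """Classical Nadaraya-Watson estimate of E[Y | X = x0].

    x0         : query point, a list of covariate values
    xs         : list of observed covariate vectors
    ys         : list of observed responses
    bandwidths : one positive bandwidth h_j per covariate

    The estimate is a kernel-weighted average of the responses, with
    weights given by a product Gaussian kernel centred at x0.
    """
    numerator = 0.0
    denominator = 0.0
    for x, y in zip(xs, ys):
        # Squared scaled distance; a very large h_j makes covariate j
        # contribute almost nothing to the weight.
        scaled = sum(((a - b) / h) ** 2 for a, b, h in zip(x0, x, bandwidths))
        weight = math.exp(-0.5 * scaled)
        numerator += weight * y
        denominator += weight
    return numerator / denominator


# Hypothetical toy data: two covariates, the second given a huge bandwidth.
xs = [[0.0, 5.0], [1.0, -3.0]]
ys = [0.0, 2.0]
estimate = nadaraya_watson([0.5, 0.0], xs, ys, bandwidths=[1.0, 1e9])
```

The toy call hints at why bandwidths can act as a variable selection device: with a very large bandwidth the second covariate barely changes the weights, so the fit depends essentially on the first covariate alone. How the paper regularises the bandwidths is not stated in the abstract, so that link is offered here only as general intuition.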
Pub Date : 2023-10-13 DOI: 10.1080/10485252.2023.2266050
Natalia M. Markovich, Igor V. Rodionov
We propose a new threshold selection method for nonparametric estimation of the extremal index of stochastic processes. The discrepancy method was originally proposed as a data-driven smoothing tool for estimating a probability density function; here it is modified to select the threshold parameter of an extremal index estimator. A modification of the discrepancy statistic based on the Cramér–von Mises–Smirnov (C–M–S) statistic $\omega^2$ is calculated from the $k$ largest order statistics instead of the entire sample. Its asymptotic distribution as $k\to\infty$ is proved to coincide with the $\omega^2$-distribution, and its quantiles are used as discrepancy values. The convergence rate of an extremal index estimate coupled with the discrepancy method is derived. The discrepancy method is used for automatic threshold selection for the intervals and K-gaps estimators, and it may be applied to other estimators of the extremal index. The performance of our method is evaluated on simulated and real data examples.
Keywords: Cramér–von Mises–Smirnov statistic; discrepancy method; extremal index; nonparametric estimation; threshold selection
AMS Subject Classification: 62G32
Disclosure statement: No potential conflict of interest was reported by the author(s).
Notes
1 The connection between Equation (1), $\omega_n^2 = n\int_{-\infty}^{\infty}\left(F_n(x)-F(x)\right)^2\,dF(x)$, and Equation (2), $\hat{\omega}_n^2(h) = \sum_{i=1}^{n}\left(\hat{F}_h(X_{i,n}) - \frac{i-0.5}{n}\right)^2 + \frac{1}{12n}$, can be found in Markovich (2007, p. 81).
2 Theoretically, events $\{T_i = 1\}$ are allowed. In practice, such cases related to single inter-arrival times between consecutive exceedances are meaningless.
3 The modification $\left(\hat{\omega}_n^2 - 0.4/n + 0.6/n^2\right)\left(1 + 1/n\right)$ of the classical statistic in Equation (2) eliminates the dependence of the percentage points of the C–M–S statistic on the sample size (Stephens 1974). For $n > 40$ it changes the statistic by less than one percent. One can use the modification with regard to $\tilde{\omega}_L^2(\hat{\theta})$ for finite $L$ due to the closeness of its distribution to the limit distribution of the C–M–S statistic by Theorem 3.2.
Funding: The work of N. M. Markovich in Sections 1, 2, 4 and 5 was supported by the Russian Science Foundation (grant number 22-21-00177). The work of I. V. Rodionov in Section 3 and the proofs in Markovich and Rodionov (2022) was performed at the Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences with the support of the Russian Science Foundation (grant number 21-71-00035).
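The computational form of the C–M–S statistic in Note 1 and the sample-size correction in Note 3 can be illustrated with a short Python sketch. This is not the authors' procedure: the paper evaluates a kernel estimate of the distribution function on the largest order statistics, whereas the sketch below assumes a fully specified CDF passed in by the caller and uses the whole sample. The function names `cvm_statistic`, `stephens_modification` and `uniform_cdf` are hypothetical and chosen only for illustration.

```python
def cvm_statistic(sample, cdf):
    """Computational form: sum_i (F(X_(i)) - (i - 0.5)/n)^2 + 1/(12 n).

    `cdf` is an assumed, fully specified distribution function; the paper
    uses a kernel estimate with bandwidth h in its place.
    """
    xs = sorted(sample)  # order statistics X_(1) <= ... <= X_(n)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must be non-empty")
    total = sum((cdf(x) - (i - 0.5) / n) ** 2
                for i, x in enumerate(xs, start=1))
    return total + 1.0 / (12.0 * n)


def stephens_modification(w2, n):
    """Sample-size correction (w2 - 0.4/n + 0.6/n^2) * (1 + 1/n)."""
    return (w2 - 0.4 / n + 0.6 / n ** 2) * (1.0 + 1.0 / n)


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1], used as a toy hypothesis."""
    return min(1.0, max(0.0, x))


if __name__ == "__main__":
    data = [0.05, 0.21, 0.36, 0.48, 0.52, 0.67, 0.80, 0.93]
    w2 = cvm_statistic(data, uniform_cdf)
    print("statistic:", w2)
    print("modified: ", stephens_modification(w2, len(data)))
```

For a statistic value of 0.5 and n = 50, the correction gives about 0.5021, a relative change of roughly 0.4 percent, which is in line with the remark in Note 3 that the modification matters little once n exceeds 40.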
{"title":"Threshold selection for extremal index estimation","authors":"Natalia M. Markovich, Igor V. Rodionov","doi":"10.1080/10485252.2023.2266050","DOIUrl":"https://doi.org/10.1080/10485252.2023.2266050","url":null,"abstract":"ABSTRACTWe propose a new threshold selection method for nonparametric estimation of the extremal index of stochastic processes. The discrepancy method was proposed as a data-driven smoothing tool for estimation of a probability density function. Now it is modified to select a threshold parameter of an extremal index estimator. A modification of the discrepancy statistic based on the Cramér–von Mises–Smirnov statistic ω2 is calculated by k largest order statistics instead of an entire sample. Its asymptotic distribution as k→∞ is proved to coincide with the ω2-distribution. Its quantiles are used as discrepancy values. The convergence rate of an extremal index estimate coupled with the discrepancy method is derived. The discrepancy method is used as an automatic threshold selection for the intervals and K-gaps estimators. It may be applied to other estimators of the extremal index. The performance of our method is evaluated by simulated and real data examples.KEYWORDS: Cramér–von Mises–Smirnov statisticdiscrepancy methodextremal indexnonparametric estimationthreshold selectionAMS SUBJECT CLASSIFICATION:: 62G32 Disclosure statementNo potential conflict of interest was reported by the author(s).Notes1 The connection between (Equation1(1) ωn2=n∫−∞∞(Fn(x)−F(x))2dF(x)(1) ) and (Equation2(2) ω^n2(h)=∑i=1n(F^h(Xi,n)−i−0.5n)2+112n(2) ) can be found in Markovich (Citation2007, p. 81).2 Theoretically, events {Ti=1} are allowed. 
In practice, such cases related to single inter-arrival times between consecutive exceedances are meaningless.3 The modification (ω^n2−0.4/n+0.6/n2)(1+1/n) of classical statistic (Equation2(2) ω^n2(h)=∑i=1n(F^h(Xi,n)−i−0.5n)2+112n(2) ) eliminates the dependence of the percentage points of the C–M–S statistic on the sample size (Stephens Citation1974). For n>40 it changes the statistic on less than one percent. One can use the modification with regard to ω~L2(θ^) for finite L due to the closeness of its distribution to the limit distribution of the C–M–S statistic by Theorem 3.2.Additional informationFundingThe work of N.M. Markovich in Sections 1, 2, 4 and 5 was supported by the Russian Science Foundation [grant number 22-21-00177]. The work of I. V. Rodionov in Section 3 and proofs in Markovich and Rodionov (Citation2022) was performed at the Institute for Information Transmission Problems (Kharkevich Institute) of the Russian Academy of Sciences with the support of the Russian Science Foundation (grant No. 21-71-00035).","PeriodicalId":50112,"journal":{"name":"Journal of Nonparametric Statistics","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135805169","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}