
Canadian Journal of Statistics-Revue Canadienne De Statistique: Latest Articles

Acknowledgement of Referees' Services / Remerciements aux membres des jurys
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-18 DOI: 10.1002/cjs.11766
Aeberhard, William H. ETH Zürich Asgharian, Masoud McGill University Bahraoui, Tarik* Université du Québec à Montréal Battey, Heather Imperial College London Bédard, Mylène Université de Montréal Bellhouse, David* University of Western Ontario Berger, Yves* University of Southampton Braekers, Roel Hasselt University Brazzale, Alessandra University of Padova Cai, Song Carleton University Cao, Guanqun Auburn University Casa, Alessandro Free University of Bozen-Bolzano Chatterjee, Kashinath* Visva-Bharati University Chen, Baojiang University of Texas Health Science Center Chen, Guanhua University of Wisconsin-Madison Chen, Sixia University of Oklahoma Health Sciences Center Chen, Yaqing* University of California Davis Cheng, Yu University of Pittsburgh Cheung, Rex San Francisco State University Coia, Vincenzo University of British Columbia Cook, Richard University of Waterloo Csató, László ELKH SZTAKI Dagne, Getachew University of South Florida Dai, Ben Chinese University of Hong Kong
Canadian Journal of Statistics-Revue Canadienne De Statistique, 51(1): 344-349.
Citations: 0
PCA Rerandomization
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-16 DOI: 10.1002/cjs.11765
Hengtao Zhang, Guosheng Yin, Donald B. Rubin

Mahalanobis distance of covariate means between treatment and control groups is often adopted as a balance criterion when implementing a rerandomization strategy. However, this criterion may not work well for high-dimensional cases because it balances all orthogonalized covariates equally. We propose using principal component analysis (PCA) to identify proper subspaces in which Mahalanobis distance should be calculated. Not only can PCA effectively reduce the dimensionality for high-dimensional covariates, but it also provides computational simplicity by focusing on the top orthogonal components. The PCA rerandomization scheme has desirable theoretical properties for balancing covariates and thereby improving the estimation of average treatment effects. This conclusion is supported by numerical studies using both simulated and real examples.
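
The idea in this abstract can be illustrated with a minimal sketch: redraw a balanced assignment until the Mahalanobis distance between group means, computed only on the top principal components, falls below a threshold. The function names, the number of components `k`, and the acceptance threshold are assumptions made for illustration; the paper's actual criterion and tuning may differ.

```python
import numpy as np

def pca_scores(X, k):
    """Project centred covariates onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred matrix: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def mahalanobis_balance(Z, w):
    """Mahalanobis distance between treated and control means of Z."""
    n1, n0 = w.sum(), (1 - w).sum()
    diff = Z[w == 1].mean(axis=0) - Z[w == 0].mean(axis=0)
    cov = np.atleast_2d(np.cov(Z, rowvar=False))
    # Covariance of the mean difference under complete randomization.
    cov_diff = cov * (1.0 / n1 + 1.0 / n0)
    return float(diff @ np.linalg.solve(cov_diff, diff))

def pca_rerandomize(X, k, threshold, rng, max_draws=10_000):
    """Redraw a balanced assignment until the balance criterion is met.

    k and threshold are hypothetical tuning parameters for this sketch.
    """
    n = X.shape[0]
    Z = pca_scores(X, k)
    base = np.zeros(n, dtype=int)
    base[: n // 2] = 1  # half of the units are treated
    for _ in range(max_draws):
        w = rng.permutation(base)
        m = mahalanobis_balance(Z, w)
        if m <= threshold:
            return w, m
    raise RuntimeError("no acceptable assignment found")
```

For example, `pca_rerandomize(X, k=2, threshold=1.0, rng=np.random.default_rng(0))` returns an assignment vector and its achieved distance. Restricting the distance to `k` components keeps the matrix being inverted small even when the number of covariates is large.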

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 5-25. PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11765
Citations: 0
Method of model checking for case II interval-censored data under the additive hazards model
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-16 DOI: 10.1002/cjs.11759
Yanqin Feng, Ming Tang, Jieli Ding

General or case II interval-censored data are commonly encountered in practice. We develop methods for model-checking and goodness-of-fit testing for the additive hazards model with case II interval-censored data. We propose test statistics based on the supremum of the stochastic processes derived from the cumulative sum of martingale-based residuals over time and covariates. We approximate the distribution of the stochastic process via a simulation technique to conduct a class of graphical and numerical techniques for various purposes of model-fitting evaluations. Simulation studies are conducted to assess the finite-sample performance of the proposed method. A real dataset from an AIDS observational study is analyzed for illustration.

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 212-236.
Citations: 0
Subgroup analysis of linear models with measurement error
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-14 DOI: 10.1002/cjs.11763
Yuan Le, Yang Bai, Guoyou Qin

Heterogeneity exists in populations, and people may benefit differently from the same treatments or services. Correctly identifying subgroups corresponding to outcomes such as treatment response plays an important role in data-based decision making. As few discussions exist on subgroup analysis with measurement error, we propose a new estimation method to consider these two components simultaneously under the linear regression model. First, we develop an objective function based on unbiased estimating equations with two repeated measurements and a concave penalty on pairwise differences between coefficients. The proposed method can identify subgroups and estimate coefficients simultaneously when considering measurement error. Second, we derive an algorithm based on the alternating direction method of multipliers algorithm and demonstrate its convergence. Third, we prove that the proposed estimators are consistent and asymptotically normal. The performance and asymptotic properties of the proposed method are evaluated through simulation studies. Finally, we apply our method to data from the Lifestyle Education for Activity and Nutrition study and identify two subgroups, of which one has a significant treatment effect.
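
As a rough illustration of a concave penalty on pairwise differences, the sketch below uses the minimax concave penalty (MCP). The abstract does not say which concave penalty the authors use, so the choice of MCP, the function names, and the default `gamma` are assumptions for illustration only.

```python
import numpy as np

def mcp(t, lam, gamma=3.0):
    """Minimax concave penalty evaluated elementwise at |t| (assumed choice)."""
    t = np.abs(t)
    return np.where(
        t <= gamma * lam,
        lam * t - t**2 / (2.0 * gamma),  # concave part near zero
        0.5 * gamma * lam**2,            # flat once |t| exceeds gamma * lam
    )

def pairwise_penalty(mu, lam, gamma=3.0):
    """Sum of MCP penalties over all pairs of subject-specific coefficients."""
    i, j = np.triu_indices(len(mu), k=1)
    return float(mcp(mu[i] - mu[j], lam, gamma).sum())
```

Because the penalty is flat for large differences, coefficients that belong to different subgroups are not pulled toward each other, while small differences are shrunk to zero, which is what lets such a penalty recover subgroups.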

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 26-42.
Citations: 0
Distributed sequential estimation procedures
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-11 DOI: 10.1002/cjs.11762
Zhuojian Chen, Zhanfeng Wang, Yuan-chin Ivan Chang

Data collected from distributed sources or sites commonly have different distributions or contaminated observations. Active learning procedures allow us to assess data when recruiting new data into model building. Thus, combining several active learning procedures together is a promising idea, even when the collected data set is contaminated. Here, we study how to conduct and integrate several adaptive sequential procedures at a time to produce a valid result via several machines or a parallel-computing framework. To avoid distraction by complicated modelling processes, we use confidence set estimation for linear models to illustrate the proposed method and discuss the approach's statistical properties. We then evaluate its performance using both synthetic and real data. We have implemented our method using Python and made it available through Github at https://github.com/zhuojianc/dsep.

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 271-290.
Citations: 0
Tests of linear hypotheses using indirect information
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-11 DOI: 10.1002/cjs.11760
Andrew McCormack, Peter D. Hoff

In multigroup data settings with small within-group sample sizes, standard F-tests of group-specific linear hypotheses can have low power, particularly if the within-group sample sizes are not large relative to the number of explanatory variables. To remedy this situation, in this article we derive alternative test statistics based on information sharing across groups. Each group-specific test has potentially much larger power than the standard F-test, while still exactly maintaining a target type I error rate if the null hypothesis for the group is true. The proposed test for a given group uses a statistic that has optimal marginal power under a prior distribution derived from the data of the other groups. This statistic approaches the usual F-statistic as the prior distribution becomes more diffuse, but approaches a limiting “cone” test statistic as the prior distribution becomes extremely concentrated. We compare the power and P-values of the cone test to that of the F-test in some high-dimensional asymptotic scenarios. An analysis of educational outcome data is provided, demonstrating empirically that the proposed test is more powerful than the F-test.
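
For reference, the standard within-group F-statistic that serves as the baseline in this abstract can be sketched as follows. This covers only the classical test of a linear hypothesis in a Gaussian linear model, not the proposed information-sharing or cone test; the function name and argument layout are assumptions for illustration.

```python
import numpy as np

def f_statistic(X, y, C, d=None):
    """Classical F-statistic for H0: C @ beta = d in y = X @ beta + noise.

    Returns the statistic with its numerator and denominator degrees of freedom.
    """
    n, p = X.shape
    C = np.atleast_2d(np.asarray(C, dtype=float))
    q = C.shape[0]
    d = np.zeros(q) if d is None else np.asarray(d, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares fit
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)               # unbiased error variance estimate
    xtx_inv = np.linalg.inv(X.T @ X)
    gap = C @ beta - d
    F = gap @ np.linalg.solve(C @ xtx_inv @ C.T, gap) / (q * sigma2)
    return float(F), q, n - p
```

Under the null hypothesis the statistic follows an F distribution with `q` and `n - p` degrees of freedom, so a P-value can be obtained from, for instance, `scipy.stats.f.sf(F, q, n - p)`. When `n - p` is small, as in the small within-group samples described above, the variance estimate is noisy and the test loses power.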

Canadian Journal of Statistics-Revue Canadienne De Statistique, 51(3): 852-876.
Citations: 1
Segment regression model average with multiple threshold variables and multiple structural breaks
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-06 DOI: 10.1002/cjs.11764
Pan Liu, Jialiang Li

We propose a new model averaging approach to investigate segment regression models with multiple threshold variables and multiple structural breaks. We first fit a series of models, each with a single threshold variable and multiple breaks over its domain, using a two-stage change point detection method. Then these models are combined together to produce a weighted ensemble through a frequentist model averaging approach. Consequently, our segment regression model averaging (SRMA) method may help identify complicated subgroups in a heterogeneous study population. A crucial step is to determine the optimal weights in the model averaging, and we follow the familiar non-concave penalty estimation approach. We provide theoretical support for SRMA by establishing the consistency of individual fitted models and estimated weights. Numerical studies are carried out to assess the performance in low- and high-dimensional settings, and comparisons are made between our proposed method and a wide range of existing alternative subgroup estimation methods. Two real economic data examples are analyzed to illustrate our methodology.

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 131-161.
Citations: 0
Variable selection in additive models via hierarchical sparse penalty
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-03 DOI: 10.1002/cjs.11752
Canhong Wen, Anan Chen, Xueqin Wang, Wenliang Pan, for the Alzheimer's Disease Neuroimaging Initiative

As a popular tool for nonlinear models, additive models work efficiently with nonparametric estimation. However, naively applying the existing regularization method can result in misleading outcomes because of the basis sparsity in each variable. In this article, we consider variable selection in additive models via a combination of variable selection and basis selection, yielding a joint selection of variables and basis functions. A novel penalty function is proposed for basis selection to address the hierarchical structure as well as the sparsity assumption. Under some mild conditions, we establish theoretical properties including the support recovery consistency. We also derive the necessary and sufficient conditions for the estimator and develop an efficient algorithm based on it. Our new methodology and results are supported by simulation and real data examples.

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 162-194.
Citations: 0
A tale of two variances
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-02-02 DOI: 10.1002/cjs.11758
Peter McCullagh

We begin by showing that the standard repeated-sampling interpretation of the variance of a parameter estimate in a finite-dimensional parametric model is ambiguous and open to misinterpretation. Three operational interpretations are given, all numerically different in general and all compatible with repeated sampling from the same population with a fixed parameter. One of these is compatible with the standard large-sample calculation based on the inverse Fisher information. The others are not. One interpretation coincides with what Fisher appears to have had in mind in his 1943 derivation of the log-series model for species abundances. The different interpretations help to resolve an apparent contradiction between the Fisherian variance and the inverse-information variance obtained from the Ewens model.

Canadian Journal of Statistics-Revue Canadienne De Statistique, 51(3): 769-779.
Citations: 1
Volatility analysis for the GARCH-Itô model with option data
IF 0.6 Zone 4 (Mathematics) Q3 STATISTICS & PROBABILITY Pub Date: 2023-01-10 DOI: 10.1002/cjs.11746
Huiling Yuan, Yong Zhou, Zhiyuan Zhang, Xiangyu Cui

Low-frequency historical data, high-frequency historical data, and option data are three primary sources that can be used to forecast an underlying security's volatility. In this article, we propose an explicit model integrating the three information sources. Instead of directly using option price data, we extract option-implied volatility from option data and estimate its dynamics. We provide joint quasimaximum likelihood estimators for the parameters and establish their asymptotic properties. Real data examples demonstrate that the proposed model has better out-of-sample volatility forecasting performance than other popular volatility models.

Canadian Journal of Statistics-Revue Canadienne De Statistique, 52(1): 237-270.
Citations: 0