
Canadian Journal of Statistics-Revue Canadienne De Statistique: Latest Articles

Automatic structure recovery for generalized additive models
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-10-18 DOI: 10.1002/cjs.11739
Kai Shen, Yichao Wu

In this article, we propose an automatic structure recovery method for generalized additive models (GAMs) by extending Wu and Stefanski's approach. In a similar vein, the proposed method is based on a local scoring algorithm coupled with local polynomial smoothing, along with a kernel-based variable selection approach. Given a specific degree M, the goal is to identify predictors contributing polynomially at different degrees up to M and predictors that contribute beyond degree M. By focusing on two GAMs, logistic regression and Poisson regression, we illustrate the performance of the proposed method using Monte Carlo simulation studies and two real data examples.

Vol. 51, No. 4, pp. 959-974. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11739
Citations: 0
A high-dimensional inverse norm sign test for two-sample location problems
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-10-17 DOI: 10.1002/cjs.11731
Xifen Huang, Binghui Liu, Qin Zhou, Long Feng

In this article, we focus on the two-sample location testing problem for high-dimensional data, where the data dimension is potentially much larger than the sample sizes. First, we construct a general class of weighted spatial sign tests for the two-sample location problem, which can include some existing high-dimensional nonparametric tests. Then, in this article, we find a locally most powerful test by choosing the inverse norm weight function, named the two-sample inverse norm sign test (tINST). The proposed test can be viewed as an extension of the inverse norm sign test devised for the one-sample problem. We establish the asymptotic properties of the proposed test, which indicate that it is consistent and has greater power than competing tests that belong to the proposed class of weighted spatial sign tests for two-sample location problems. Finally, a large number of numerical investigations and a practical biomedical example demonstrate the power and robustness advantages of the proposed test.

Vol. 51, No. 4, pp. 1004-1033.
Citations: 1
A generalized single-index linear threshold model for identifying treatment-sensitive subsets based on multiple covariates and longitudinal measurements
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-10-17 DOI: 10.1002/cjs.11737
Xinyi Ge, Yingwei Peng, Dongsheng Tu

Identification of a subset of patients who may be sensitive to a specific treatment is an important step towards personalized medicine. We consider the case where the effect of a treatment is assessed by longitudinal measurements, which may be continuous or categorical, such as quality of life scores assessed over the duration of a clinical trial. We assume that multiple baseline covariates, such as age and expression levels of genes, are available, and propose a generalized single-index linear threshold model to identify the treatment-sensitive subset and assess the treatment-by-subset interaction after combining these covariates. Because the model involves an indicator function with unknown parameters, conventional procedures are difficult to apply for inferences of the parameters in the model. We define smoothed generalized estimating equations and propose an inference procedure based on these equations with an efficient spectral algorithm to find their solutions. The proposed procedure is evaluated through simulation studies and an application to the analysis of data from a randomized clinical trial in advanced pancreatic cancer.

Vol. 51, No. 4, pp. 1171-1189.
Citations: 0
Minorize–maximize algorithm for the generalized odds rate model for clustered current status data
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-10-12 DOI: 10.1002/cjs.11733
Tong Wang, Kejun He, Wei Ma, Dipankar Bandyopadhyay, Samiran Sinha

Current status data are widely used in epidemiology and public health, where the only observable information is the random inspection time and the event status at inspection. This article presents a unified methodology to analyze such complex data subject to clustering. Given the random clustering effect, the time to event is assumed to follow a semiparametric generalized odds rate (GOR) model. The nonparametric component of the GOR model is approximated via penalized splines, with a set of knot points that increase with the sample size. The within-subject correlation is accounted for by a random (frailty) effect. For estimation, a novel MM algorithm is developed that allows the separation of the parametric and nonparametric components of the model. This separation makes the problem conducive to applying the Newton–Raphson algorithm that quickly returns the roots. The work is accompanied by a complexity analysis of the algorithm, a rigorous asymptotic proof, and the related semiparametric efficiency of the proposed methodology. The finite sample performance of the proposed method is assessed via simulation studies. Furthermore, the proposed methodology is illustrated via real data analysis on periodontal disease studies accompanied by diagnostic checks to identify influential observations.

Vol. 51, No. 4, pp. 1150-1170.
Citations: 1
Causal inference for multiple treatments using fractional factorial designs
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-10-12 DOI: 10.1002/cjs.11734
Nicole E. Pashley, Marie-Abèle C. Bind

We consider the design and analysis of multi-factor experiments using fractional factorial and incomplete designs within the potential outcome framework. These designs are particularly useful when limited resources make running a full factorial design infeasible. We connect our design-based methods to standard regression methods. We further motivate the usefulness of these designs in multi-factor observational studies, where certain treatment combinations may be so rare that there are no measured outcomes in the observed data corresponding to them. Therefore, conceptualizing a hypothetical fractional factorial experiment instead of a full factorial experiment allows for appropriate analysis in those settings. We illustrate our approach using biomedical data from the 2003–2004 cycle of the National Health and Nutrition Examination Survey to examine the effects of four common pesticides on body mass index.
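As a rough illustration of what a fractional factorial design looks like, the sketch below builds a half fraction of a two-level design for four factors (for instance, four exposures coded as low/high). The function name `half_fraction`, the -1/+1 coding, and the choice of generator (last factor equals the product of the others, i.e. defining relation I = ABCD) are assumptions made for this example only; the abstract does not state which fraction or generator the authors use.

```python
from itertools import product


def half_fraction(num_factors=4):
    """Build a 2^(k-1) half fraction of a two-level factorial design.

    Levels are coded -1 (low) and +1 (high). The first k-1 factors run
    through a full factorial; the last factor is set to their product,
    which gives the defining relation I = (product of all k factors).
    This generator is an illustrative assumption, not the paper's design.
    """
    runs = []
    for base in product((-1, 1), repeat=num_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


design = half_fraction(4)
for run in design:
    print(run)
```

With four factors this yields 8 runs instead of the 16 required by the full factorial. Each factor column is balanced and the main-effect columns are mutually orthogonal, which is what keeps main effects estimable from the reduced set of treatment combinations.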

Vol. 51, No. 2, pp. 444-468.
Citations: 6
A random walk through Canadian contributions on empirical processes and their applications in probability and statistics
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-10-05 DOI: 10.1002/cjs.11730
Miklós Csörgő, Donald A. Dawson, Bouchra R. Nasri, Bruno N. Rémillard

In this article, we present a review of important results and statistical applications obtained or generalized by Canadian pioneers and their collaborators, for empirical processes of independent and identically distributed observations, pseudo-observations, and time series. In particular, we consider weak convergence and strong approximations results, as well as tests for model adequacy such as tests of independence, tests of goodness-of-fit, tests of change point, and tests of serial dependence for time series. We also consider applications of empirical processes of interacting particle systems for the approximation of measure-valued processes.

Vol. 50, No. 4, pp. 1116-1142. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11730
Citations: 0
The EAS approach for graphical selection consistency in vector autoregression models
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-09-27 DOI: 10.1002/cjs.11726
Jonathan P. Williams, Yuying Xie, Jan Hannig

As evidenced by various recent and significant papers within the frequentist literature, along with numerous applications in macroeconomics, genomics, and neuroscience, there continues to be substantial interest in understanding the theoretical estimation properties of high-dimensional vector autoregression (VAR) models. To date, however, while Bayesian VAR (BVAR) models have been developed and studied empirically (primarily in the econometrics literature), there exist very few theoretical investigations of the repeated-sampling properties for BVAR models in the literature, and there exist no generalized fiducial investigations of VAR models. In this direction, we construct methodology via the ε-admissible subsets (EAS) approach for inference based on a generalized fiducial distribution of relative model probabilities over all sets of active/inactive components (graphs) of the VAR transition matrix. We provide a mathematical proof of pairwise and strong graphical selection consistency for the EAS approach for stable VAR(1) models, and demonstrate empirically that it is an effective strategy in high-dimensional settings.

Vol. 51, No. 2, pp. 674-703. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11726
Citations: 8
Unifying genetic association tests via regression: Prospective and retrospective, parametric and nonparametric, and genotype- and allele-based tests
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-09-23 DOI: 10.1002/cjs.11729
Lin Zhang, Lei Sun

Genetic association analysis, which evaluates relationships between genetic markers and complex, heritable traits, is the basis of genome-wide association studies. The many association tests that have been developed can generally be classified as prospective versus retrospective, parametric versus nonparametric, and genotype- versus allele-based. While method classifications are useful, it can be confusing and challenging for practitioners to decide on the “optimal” test to use for their data. We go beyond known differences between some popular association tests and provide new results that show analytical connections between tests, for both population- and family-based study designs.

Vol. 50, No. 4, pp. 1321-1338. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11729
Citations: 1
Efficient multiple change point detection for high-dimensional generalized linear models
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-09-16 DOI: 10.1002/cjs.11721
Xianru Wang, Bin Liu, Xinsheng Zhang, Yufeng Liu, for the Alzheimer's Disease Neuroimaging Initiative

Change point detection for high-dimensional data is an important yet challenging problem for many applications. In this article, we consider multiple change point detection in the context of high-dimensional generalized linear models, allowing the covariate dimension p to grow exponentially with the sample size n. The model considered is general and flexible in the sense that it covers various specific models as special cases. It can automatically account for the underlying data generation mechanism without specifying any prior knowledge about the number of change points. Based on dynamic programming and binary segmentation techniques, two algorithms are proposed to detect multiple change points, allowing the number of change points to grow with n. To further improve the computational efficiency, a more efficient algorithm designed for the case of a single change point is proposed. We present theoretical properties of our proposed algorithms, including estimation consistency for the number and locations of change points as well as consistency and asymptotic distributions for the underlying regression coefficients. Finally, extensive simulation studies and application to the Alzheimer's Disease Neuroimaging Initiative data further demonstrate the competitive performance of our proposed methods.
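To make the binary segmentation idea concrete, here is a deliberately simplified sketch for a univariate sequence with shifts in the mean. It is not the authors' algorithm: their setting is a high-dimensional generalized linear model, whereas this toy uses a plain CUSUM statistic on raw values. The function names, the `threshold` parameter, and the `min_len` guard are all assumptions introduced for illustration.

```python
def cusum_split(x, start, end):
    """Find the best single split of x[start:end] for a mean shift.

    Returns (index, statistic), where index is the first position of the
    right-hand segment, or (None, 0.0) if no split has a positive statistic.
    """
    n = end - start
    total = sum(x[start:end])
    best_k, best_stat = None, 0.0
    left = 0.0
    for k in range(1, n):
        left += x[start + k - 1]
        right = total - left
        # Difference of segment means, scaled as in the usual CUSUM statistic.
        stat = abs(left / k - right / (n - k)) * (k * (n - k) / n) ** 0.5
        if stat > best_stat:
            best_k, best_stat = start + k, stat
    return best_k, best_stat


def binary_segmentation(x, threshold, start=0, end=None, min_len=2):
    """Recursively split x[start:end] while the CUSUM statistic exceeds threshold."""
    if end is None:
        end = len(x)
    if end - start < 2 * min_len:
        return []
    k, stat = cusum_split(x, start, end)
    if k is None or stat < threshold:
        return []
    return (binary_segmentation(x, threshold, start, k, min_len)
            + [k]
            + binary_segmentation(x, threshold, k, end, min_len))


# Noise-free toy signal with mean shifts at positions 20 and 40.
signal = [0.0] * 20 + [5.0] * 20 + [-2.0] * 20
print(binary_segmentation(signal, threshold=3.0))  # prints [20, 40]
```

The recursion finds the strongest split first and then searches each side independently, so the number of change points does not need to be fixed in advance. On noisy data the threshold would have to be calibrated, which this sketch does not attempt.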

Vol. 51, No. 2, pp. 596-629. Open-access PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.11721
Citations: 2
Classified generalized linear mixed model prediction incorporating pseudo-prior information
IF 0.6 Zone 4 Mathematics Q3 STATISTICS & PROBABILITY Pub Date: 2022-09-14 DOI: 10.1002/cjs.11727
Haiqiang Ma, Jiming Jiang

We develop a method of classified mixed model prediction based on generalized linear mixed models that incorporate pseudo-prior information to improve prediction accuracy. We establish consistency of the proposed method both in terms of prediction of the true mixed effect of interest and in terms of correctly identifying the potential class corresponding to the new observations if such a class matching one of the training data classes exists. Empirical results, including simulation studies and real-data validation, fully support the theoretical findings.

Published in Canadian Journal of Statistics-Revue Canadienne De Statistique, 51(2), pp. 580-595, 2022-09-14. DOI: 10.1002/cjs.11727.
Citations: 0