Biometrika: Latest Articles

Testing generalized linear models with high-dimensional nuisance parameter.
IF 2.7, Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-03-01, Epub Date: 2022-04-05, DOI: 10.1093/biomet/asac021
Jinsong Chen, Quefeng Li, Hua Yun Chen

Generalized linear models often have high-dimensional nuisance parameters, as seen in applications such as testing gene-environment interactions or gene-gene interactions. In these scenarios, it is essential to test the significance of a high-dimensional sub-vector of the model's coefficients. Although some existing methods can tackle this problem, they often rely on the bootstrap to approximate the asymptotic distribution of the test statistic, and thus are computationally expensive. Here, we propose a computationally efficient test with a closed-form limiting distribution, which allows the parameter being tested to be either sparse or dense. We show that under certain regularity conditions, the type I error of the proposed method is asymptotically correct, and we establish its power under high-dimensional alternatives. Extensive simulations demonstrate the good performance of the proposed test and its robustness when certain sparsity assumptions are violated. We also apply the proposed method to Chinese famine sample data in order to show its performance when testing the significance of gene-environment interactions.

Citations: 0
Nonparametric estimation of the intensity function of a spatial point process on a Riemannian manifold
Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-02-28, DOI: 10.1093/biomet/asad012
S Ward, H S Battey, E A K Cohen
This paper is concerned with nonparametric estimation of the intensity function of a point process on a Riemannian manifold. It provides a first-order asymptotic analysis of the proposed kernel estimator for Poisson processes, supplemented by empirical work to probe the behaviour in finite samples and under other generative regimes. The investigation highlights the scope for finite-sample improvements by allowing the bandwidth to adapt to local curvature.
Citations: 0
Equivariant Estimation of Fréchet Means
IF 2.7, Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-02-24, DOI: 10.1093/biomet/asad014
A. Mccormack, P. Hoff
The Fréchet mean generalizes the concept of a mean to a metric space setting. In this work we consider equivariant estimation of Fréchet means for parametric models on metric spaces that are Riemannian manifolds. The geometry and symmetry of such a space are partially encoded by its isometry group of distance-preserving transformations. Estimators that are equivariant under the isometry group take into account the symmetry of the metric space. For some models there exists an optimal equivariant estimator, which necessarily performs at least as well as other common equivariant estimators, such as the maximum likelihood estimator or the sample Fréchet mean. We derive the general form of this minimum risk equivariant estimator and in a few cases provide explicit expressions for it. A result for finding the Fréchet mean for distributions with radially decreasing densities is presented and used to find expressions for the minimum risk equivariant estimator. In some models the isometry group is not large enough relative to the parametric family of distributions for there to exist a minimum risk equivariant estimator. In such cases, we introduce an adaptive equivariant estimator that uses the data to select a submodel for which there is a minimum risk equivariant estimator. Simulation results show that the adaptive equivariant estimator performs favourably relative to alternative estimators.
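As a concrete illustration of the sample Fréchet mean mentioned in this abstract, the sketch below computes it on the unit circle, the simplest Riemannian manifold, by minimising the sum of squared geodesic distances over a grid of candidate angles. The function names, the grid search and the toy data are assumptions made for illustration; this is not the minimum risk equivariant estimator derived in the paper.

```python
import math

def geodesic_distance(a, b):
    """Arc-length distance between two angles (in radians) on the unit circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def sample_frechet_mean(angles, grid_size=3600):
    """Grid-search minimiser of the sum of squared geodesic distances.

    Hypothetical helper: a brute-force search, adequate only for a
    one-dimensional manifold such as the circle.
    """
    best_theta, best_cost = 0.0, float("inf")
    for k in range(grid_size):
        theta = 2 * math.pi * k / grid_size
        cost = sum(geodesic_distance(theta, a) ** 2 for a in angles)
        if cost < best_cost:
            best_theta, best_cost = theta, cost
    return best_theta

# Toy data clustered around angle 0.1; the last point wraps past 2*pi.
angles = [0.1, 0.2, 0.3, 2 * math.pi - 0.2]
mean_angle = sample_frechet_mean(angles)
```

The naive arithmetic average of these four numbers is about 1.67, far from the cluster, which is why the geodesic distance rather than the raw difference of angles is used in the cost.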
Citations: 1
Statistical inference for streamed longitudinal data
Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-02-20, DOI: 10.1093/biomet/asad010
Lan Luo, Jingshen Wang, Emily C Hector
Modern longitudinal data, for example from wearable devices, may consist of measurements of biological signals on a fixed set of participants at a diverging number of time-points. Traditional statistical methods are not equipped to handle the computational burden of repeatedly analysing the cumulatively growing dataset each time new data are collected. We propose a new estimation and inference framework for dynamic updating of point estimates and their standard errors along sequentially collected datasets with dependence, both within and between the datasets. The key technique is a decomposition of the extended inference function vector of the quadratic inference function constructed over the cumulative longitudinal data into a sum of summary statistics over data batches. We show how this sum can be recursively updated without the need to access the whole dataset, resulting in a computationally efficient streaming procedure with minimal loss of statistical efficiency. We prove consistency and asymptotic normality of our streaming estimator as the number of data batches diverges, even as the number of independent participants remains fixed. Simulations demonstrate the advantages of our approach over traditional statistical methods that assume independence between data batches. Finally, we investigate the relationship between physical activity and several diseases through analysis of accelerometry data from the National Health and Nutrition Examination Survey.
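The idea of recursively updating a sum of batch-level summary statistics can be illustrated with a much simpler analogue: ordinary least squares for a straight line, where the estimate depends on the data only through a handful of running sums. The class below is a hypothetical sketch under that simplification (independent observations, one covariate); it is not the quadratic inference function procedure described in the abstract.

```python
class StreamingLinearFit:
    """Least-squares line fitted from running sums, one batch at a time."""

    def __init__(self):
        self.n = 0
        self.sx = 0.0   # running sum of x
        self.sy = 0.0   # running sum of y
        self.sxx = 0.0  # running sum of x * x
        self.sxy = 0.0  # running sum of x * y

    def update(self, xs, ys):
        """Fold a new batch into the summaries; earlier batches are not revisited."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def estimate(self):
        """Return (intercept, slope) computed from the stored summaries only."""
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope

fit = StreamingLinearFit()
fit.update([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])  # first batch
fit.update([3.0, 4.0], [7.0, 9.0])            # second batch
intercept, slope = fit.estimate()
```

Memory use stays constant as batches arrive, which is the property the streaming framework exploits; handling dependence within and between batches, as the paper does, requires more than these five sums.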
Citations: 0
Sampling distribution for single-regression Granger causality estimators
Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-02-14, DOI: 10.1093/biomet/asad009
A J Gutknecht, L Barnett
The single-regression Granger–Geweke causality estimator has previously been shown to solve known problems associated with the more conventional likelihood ratio estimator; however, its sampling distribution has remained unknown. We show that, under the null hypothesis of vanishing Granger causality, the single-regression estimator converges to a generalized χ² distribution, which is well approximated by a Γ distribution. We show that this holds too for Geweke’s spectral causality averaged over a given frequency band, and derive explicit expressions for the generalized χ² and Γ-approximation parameters in both cases. We present a Neyman–Pearson test based on the single-regression estimators, and discuss how it may be deployed in empirical scenarios. We outline how our analysis may be extended to the conditional case, point-frequency spectral Granger causality and the important case of state-space Granger causality.
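One common way to approximate a generalized χ² variable by a Γ distribution is to match the first two moments. The sketch below assumes the generalized χ² can be written as a weighted sum of independent χ² variables with one degree of freedom each, so that its mean is the sum of the weights and its variance is twice the sum of the squared weights. The function name and the weights are hypothetical, and the paper's own expressions for the approximation parameters are not given in the abstract and may differ.

```python
def gamma_moment_match(weights):
    """Shape and scale of the Gamma law with the same mean and variance
    as sum_i w_i * X_i, where the X_i are independent chi-square(1)."""
    mean = sum(weights)                            # E[chi2_1] = 1
    variance = 2.0 * sum(w * w for w in weights)   # Var[chi2_1] = 2
    shape = mean * mean / variance
    scale = variance / mean
    return shape, scale

# Sanity check: equal unit weights give an ordinary chi-square with 3
# degrees of freedom, which is exactly Gamma(shape=3/2, scale=2).
shape, scale = gamma_moment_match([1.0, 1.0, 1.0])
```

With unequal weights the match is only approximate, but the fitted shape and scale can be passed to any standard Γ quantile routine to obtain critical values.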
Citations: 1
A subsampling perspective for extending the validity of state-of-the-art bootstraps in the frequency domain
Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-01-30, DOI: 10.1093/biomet/asad006
Haihan Yu, Mark S Kaiser, Daniel J Nordman
Bootstrapping spectral mean statistics has been a notoriously difficult problem over the past 25 years. Many frequency domain bootstraps are valid only for certain time series structures, e.g., linear processes, or for special types of statistics, i.e., ratio statistics, because such bootstraps fail to capture the limiting variance of spectral statistics in general settings. We address this issue with a different form of resampling, namely, subsampling. While not considered previously, subsampling provides consistent variance estimation under much weaker conditions than any existing bootstrap in the frequency domain. Mixing is not used, as is often standard with subsampling. Rather, subsampling can be generally justified under the same conditions needed for original spectral mean statistics to have distributional limits in the first place. This result has impacts for other bootstrap methods. Subsampling then applies to extending the validity of recent state-of-the-art bootstraps in the frequency domain. We nontrivially link subsampling to such bootstraps, which broadens their range, as moment and block assumptions needed for these are cut by more than half. Essentially, state-of-the-art bootstraps then require no more stringent assumptions than those needed for a target limit distribution to exist, which is unusual in the bootstrap world. We also close a gap in the theory of subsampling for time series with distributional approximations, in addition to variance estimation, for frequency domain statistics.
Citations: 0
Correction to: Optimal row-column designs
IF 2.7, Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-01-27, DOI: 10.1093/biomet/asad003
Citations: 0
High Dimensional Analysis of Variance in Multivariate Linear Regression
IF 2.7, Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-01-10, DOI: 10.1093/biomet/asad001
Zhipeng Lou, Xianyang Zhang, Weichi Wu
In this paper, we develop a systematic theory for high-dimensional analysis of variance in multivariate linear regression, where the dimension and the number of coefficients can both grow with the sample size. We propose a new U-type test statistic to test linear hypotheses and establish a high-dimensional Gaussian approximation result under fairly mild moment assumptions. Our general framework and theory can be applied to deal with the classical one-way multivariate analysis of variance and the nonparametric one-way multivariate analysis of variance in high dimensions. To implement the test procedure, we introduce a sample-splitting-based estimator of the second moment of the error covariance and discuss its properties. A simulation study shows that our proposed test outperforms some existing tests in various settings.
Citations: 1
An instrumental variable method for point processes: generalized Wald estimation based on deconvolution
IF 2.7, Zone 2 (Mathematics), Q1 Mathematics, Pub Date: 2023-01-09, DOI: 10.1093/biomet/asad005
Zhichao Jiang, Shizhe Chen, Peng Ding
Point processes are probabilistic tools for modelling event data. While there exists a fast-growing literature studying the relationships between point processes, it remains unexplored how such relationships connect to causal effects. In the presence of unmeasured confounders, parameters from point process models do not necessarily have causal interpretations. We propose an instrumental variable method for causal inference with point process treatment and outcome. We define causal quantities based on potential outcomes and establish nonparametric identification results with a binary instrumental variable. We extend the traditional Wald estimation to deal with point process treatment and outcome, showing that it should be performed after a Fourier transform of the intention-to-treat effects on the treatment and outcome and thus takes the form of deconvolution. We term this generalized Wald estimation and propose an estimation strategy based on well-established deconvolution methods.
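For context, the traditional Wald estimator that this abstract generalizes is, for scalar treatment and outcome with a binary instrument, the ratio of the intention-to-treat effect on the outcome to the intention-to-treat effect on the treatment. The sketch below implements only that classical scalar version on made-up data; the function name and the toy values are assumptions, and the point-process extension via Fourier transform and deconvolution is not reproduced here.

```python
def wald_estimate(z, d, y):
    """Classical Wald estimator with a binary instrument.

    z: instrument values (0 or 1), d: treatment received, y: outcome.
    Returns (mean of y given z=1 minus mean of y given z=0)
    divided by (mean of d given z=1 minus mean of d given z=0).
    """
    def mean(values):
        return sum(values) / len(values)

    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]

    itt_outcome = mean(y1) - mean(y0)      # effect of z on y
    itt_treatment = mean(d1) - mean(d0)    # effect of z on d
    return itt_outcome / itt_treatment

# Toy data: the instrument raises treatment uptake by 0.5 and the outcome by 1.
z = [0, 0, 1, 1]
d = [0, 0, 1, 0]
y = [1.0, 1.0, 3.0, 1.0]
effect = wald_estimate(z, d, y)
```

The ratio is undefined when the instrument has no effect on the treatment, so a practical implementation would guard against a near-zero denominator.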
Citations: 4
Significance testing for canonical correlation analysis in high dimensions.
IF 2.4, Zone 2 (Mathematics), Q2 Biology, Pub Date: 2022-12-01, Epub Date: 2022-11-18, DOI: 10.1093/biomet/asab059
Ian W McKeague, Xin Zhang

We consider the problem of testing for the presence of linear relationships between large sets of random variables based on a post-selection inference approach to canonical correlation analysis. The challenge is to adjust for the selection of subsets of variables having linear combinations with maximal sample correlation. To this end, we construct a stabilized one-step estimator of the Euclidean norm of the canonical correlations maximized over subsets of variables of pre-specified cardinality. This estimator is shown to be consistent for its target parameter and asymptotically normal, provided the dimensions of the variables do not grow too quickly with sample size. We also develop a greedy search algorithm to accurately compute the estimator, leading to a computationally tractable omnibus test for the global null hypothesis that there are no linear relationships between any subsets of variables having the pre-specified cardinality. We further develop a confidence interval that takes the variable selection into account.

Citations: 0