
Biometrika: Latest Articles

Local Bootstrap for Network Data
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-09-09 | DOI: 10.1093/biomet/asae046
Tianhai Zu, Yichen Qin
SUMMARY In network analysis, we frequently need to conduct inference for network parameters based on one observed network. Since the sampling distribution of the statistic is often unknown, we need to rely on the bootstrap. However, due to the complex dependence structure among vertices, existing bootstrap methods often yield unsatisfactory performance, especially under small or moderate sample sizes. To this end, we propose a new network bootstrap procedure, termed local bootstrap, to estimate the standard errors of network statistics. We propose to resample the observed vertices along with their neighbor sets, and reconstruct the edges between the resampled vertices by drawing from the set of edges connecting their neighbor sets. We justify the proposed method theoretically with desirable asymptotic properties for statistics such as motif density, and demonstrate its excellent numerical performance in small and moderate sample sizes. Our method includes several existing methods, such as the empirical graphon bootstrap, as special cases. We investigate the advantages of the proposed methods over the existing methods through the lens of edge randomness, vertex heterogeneity, neighbor set size, which shed some light on the complex issue of network bootstrapping.
Citations: 0
A Simple Bootstrap for Chatterjee's Rank Correlation
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-08-26 | DOI: 10.1093/biomet/asae045
H Dette, M Kroll
SUMMARY We prove that an m out of n bootstrap procedure for Chatterjee's rank correlation is consistent whenever asymptotic normality of Chatterjee's rank correlation can be established. In particular, we prove that m out of n bootstrap works for continuous as well as for discrete data with independent coordinates; furthermore, simulations indicate that it also performs well for discrete data with dependent coordinates, and that it outperforms alternative estimation methods. Consistency of the bootstrap is proved in the Kolmogorov as well as in the Wasserstein distance.
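The idea of an m out of n bootstrap for Chatterjee's rank correlation can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `chatterjee_xi` and `m_out_of_n_se`, the choice of `m`, and the sqrt(m)/sqrt(n) rescaling used to turn the bootstrap spread into a standard error are all assumptions made here. The statistic uses the general (ties-allowed) form of Chatterjee's coefficient, with ties in x broken at random.

```python
import numpy as np


def chatterjee_xi(x, y, rng):
    """Chatterjee's rank correlation, general form that allows ties in y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Sort by x; shuffling first makes the stable sort break x-ties at random.
    shuffle = rng.permutation(n)
    order = shuffle[np.argsort(x[shuffle], kind="stable")]
    y_by_x = y[order]
    y_sorted = np.sort(y)
    # r_i = #{j : y_j <= y_i},  l_i = #{j : y_j >= y_i}
    r = np.searchsorted(y_sorted, y_by_x, side="right")
    l = n - np.searchsorted(y_sorted, y_by_x, side="left")
    denom = 2.0 * np.sum(l * (n - l))
    if denom == 0:  # y is constant, so the coefficient is undefined
        return np.nan
    return 1.0 - n * np.sum(np.abs(np.diff(r))) / denom


def m_out_of_n_se(x, y, m, n_boot, rng):
    """Standard-error sketch: resample m < n pairs with replacement.

    Assumption: sqrt(m) * (xi*_m - xi_n) mimics sqrt(n) * (xi_n - xi),
    so the bootstrap spread is rescaled by 1 / sqrt(n).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    xi_n = chatterjee_xi(x, y, rng)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=m)  # m draws out of n, with replacement
        stats[b] = np.sqrt(m) * (chatterjee_xi(x[idx], y[idx], rng) - xi_n)
    return np.nanstd(stats, ddof=1) / np.sqrt(n)
```

A call such as `m_out_of_n_se(x, y, m=40, n_boot=500, rng=np.random.default_rng(0))` returns one number; how `m` should grow with `n` is a question for the paper itself and is not settled by this sketch.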
Citations: 0
Sensitivity models and bounds under sequential unmeasured confounding in longitudinal studies
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-08-20 | DOI: 10.1093/biomet/asae044
Zhiqiang Tan
Consider sensitivity analysis for causal inference in a longitudinal study with time-varying treatments and covariates. It is of interest to assess the worst-case possible values of counterfactual-outcome means and average treatment effects under sequential unmeasured confounding. We formulate several multi-period sensitivity models to relax the corresponding versions of the assumption of sequential non-confounding. The primary sensitivity model involves only counterfactual outcomes, whereas the joint and product sensitivity models involve both counterfactual covariates and outcomes. We establish and compare explicit representations for the sharp and conservative bounds at the population level through convex optimization, depending only on the observed data. These results provide for the first time a satisfactory generalization from the marginal sensitivity model in the cross-sectional setting.
Citations: 0
Studies in the history of probability and statistics, LI: the first conditional logistic regression
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-08-09 | DOI: 10.1093/biomet/asae038
J A Hanley
Statisticians and epidemiologists generally cite the publications by Prentice & Breslow and by Breslow et al. in 1978 as the first description and use of conditional logistic regression, while economists cite the 1973 book chapter by Nobel laureate McFadden. We describe the until-now-unrecognized use of, and way of fitting, this model in 1934 by Lionel Penrose and Ronald Fisher.
Citations: 0
Skip-sampling: subsampling in the frequency domain
IF 2.4 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-08-08 | DOI: 10.1093/biomet/asae039
T. McElroy, D. Politis
Over the last 35 years, several bootstrap methods for time series have been proposed. Popular time domain methods include the block-bootstrap, the stationary bootstrap, the linear process bootstrap, among others; subsampling for time series is also available, and is closely related to the block-bootstrap. The frequency domain bootstrap has been performed either by resampling the periodogram ordinates or by resampling the ordinates of the discrete Fourier transform. The paper at hand proposes a novel construction of subsampling the discrete Fourier transform ordinates, and investigates its theoretical properties and realm of applicability. Numerical studies show that the new method performs comparably to the frequency domain bootstrap for linear spectral means and ratio statistics, while at the same time yielding significant computational savings as well as numerical stability.
Citations: 0
Robust Covariate-Balancing Method in Learning Optimal Individualized Treatment Regimes
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-07-17 | DOI: 10.1093/biomet/asae036
Canhui Li, Donglin Zeng, Wensheng Zhu
Summary One of the most important problems in precision medicine is to find the optimal individualized treatment rule, which is designed to recommend treatment decisions and maximize overall clinical benefit to patients based on their individual characteristics. Typically, the expected clinical outcome is required to be estimated first, in which an outcome regression model or a propensity score model usually needs to be assumed for most of the existing statistical methods. However, if either model assumption is invalid, the estimated treatment regime is not reliable. In this article, we first define a contrast value function, which is the basis of the study for individualized treatment regimes. Then we construct a hybrid estimator of the contrast value function, by combining two types of estimation methods. We further propose a robust covariate-balancing estimator of the contrast value function by combining the inverse probability weighted method and matching method, which is based on the covariate balancing propensity score proposed by Imai and Ratkovic (2014). Theoretical results show that the proposed estimator is doubly robust, that is, it is consistent if either the propensity score model or the matching is correct. Based on a large number of simulation studies, we demonstrate that the proposed estimator outperforms existing methods. Lastly, the proposed method is illustrated through analysis of the SUPPORT study.
Citations: 0
Causal inference with hidden mediators
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-07-13 | DOI: 10.1093/biomet/asae037
AmirEmad Ghassami, Alan Yang, Ilya Shpitser, Eric Tchetgen Tchetgen
Summary Proximal causal inference was recently proposed as a framework to identify causal effects from observational data in the presence of hidden confounders for which proxies are available. In this paper, we extend the proximal causal inference approach to settings where identification of causal effects hinges upon a set of mediators which are not observed, yet error prone proxies of the hidden mediators are measured. Specifically, (i) we establish causal hidden mediation analysis, which extends classical causal mediation analysis methods for identifying natural direct and indirect effects under no unmeasured confounding to a setting where the mediator of interest is hidden, but proxies of it are available. (ii) We establish a hidden front-door criterion, which extends the classical front-door criterion to allow for hidden mediators for which proxies are available. (iii) We show that the identification of a certain causal effect called population intervention indirect effect remains possible with hidden mediators in settings where challenges in (i) and (ii) might co-exist. We view (i)-(iii) as important steps towards the practical application of front-door criteria and mediation analysis as mediators are almost always measured with error and thus, the most one can hope for in practice is that the measurements are at best proxies of mediating mechanisms. We propose identification approaches for the parameters of interest in our considered models. For the estimation aspect, we propose an influence function-based estimation method and provide an analysis for the robustness of the estimators.
Citations: 0
More Power by Using Fewer Permutations
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-07-10 | DOI: 10.1093/biomet/asae031
Nick W Koning
Summary It is conventionally believed that permutation-based testing methods should ideally use all permutations. We challenge this by showing we can sometimes obtain dramatically more power by using a tiny subgroup. As the subgroup is tiny, this also comes at a much lower computational cost. Moreover, the method remains valid for the same hypotheses. We exploit this to improve the popular permutation-based Westfall & Young MaxT multiple testing method. We analyze the relative efficiency in a Gaussian location model, and find the largest gain in high dimensions.
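To make the notion of testing with a subgroup of permutations concrete, here is a minimal sketch. It uses the cyclic shifts of the pooled sample, chosen only because they are an easy example of a subgroup of all permutations; this is an assumption for illustration and not the subgroup the paper constructs, and nothing here speaks to power. The names `subgroup_permutation_pvalue` and `mean_difference` are hypothetical. The p-value is the share of group elements whose statistic is at least as large as the observed one, with the identity shift included.

```python
import numpy as np


def mean_difference(a, b):
    """Two-sample statistic: mean of the second group minus mean of the first."""
    return b.mean() - a.mean()


def subgroup_permutation_pvalue(x, n_first, statistic):
    """Permutation p-value over the cyclic-shift subgroup of size len(x).

    x holds the pooled sample; the first n_first entries form group one.
    Only len(x) rearrangements are evaluated instead of all len(x)! of them.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    t_obs = statistic(x[:n_first], x[n_first:])
    count = 0
    for k in range(n):  # k = 0 is the identity, so count is at least 1
        shifted = np.roll(x, k)
        if statistic(shifted[:n_first], shifted[n_first:]) >= t_obs:
            count += 1
    return count / n
```

With `x = np.arange(12.0)` and `n_first = 6`, only the identity shift attains the observed statistic, so the function returns 1/12, the smallest p-value this subgroup can produce.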
Citations: 0
Testing Independence for Sparse Longitudinal Data
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-07-08 | DOI: 10.1093/biomet/asae035
Changbo Zhu, Junwen Yao, Jane-Ling Wang
Summary With the advance of science and technology, more and more data are collected in the form of functions. A fundamental question for a pair of random functions is to test whether they are independent. This problem becomes quite challenging when the random trajectories are sampled irregularly and sparsely for each subject. In other words, each random function is only sampled at a few time-points, and these time-points vary with subjects. Furthermore, the observed data may contain noise. To the best of our knowledge, there exists no consistent test in the literature to test the independence of sparsely observed functional data. We show in this work that testing pointwise independence simultaneously is feasible. The test statistics are constructed by integrating pointwise distance covariances (Székely et al., 2007) and are shown to converge, at a certain rate, to their corresponding population counterparts, which characterize the simultaneous pointwise independence of two random functions. The performance of the proposed methods is further verified by Monte Carlo simulations and analysis of real data.
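For readers unfamiliar with the building block named in the abstract, the sample distance covariance of Székely et al. (2007) for two fully observed samples can be sketched as below. This is the plain V-statistic form on complete data; the integration of pointwise distance covariances over time and the handling of sparse, noisy observations described in the abstract are not attempted here. The function names are hypothetical.

```python
import numpy as np


def _centered_distances(z):
    """Pairwise Euclidean distance matrix, double-centered by rows and columns."""
    z = np.asarray(z, dtype=float)
    z = z.reshape(len(z), -1)  # treat a 1-D sample as n points in dimension 1
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=-1)
    return (
        d
        - d.mean(axis=0, keepdims=True)
        - d.mean(axis=1, keepdims=True)
        + d.mean()
    )


def distance_covariance_sq(x, y):
    """Squared sample distance covariance: mean of A_ij * B_ij over all pairs."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    return float(np.mean(a * b))
```

The population version of this quantity is zero exactly when the two variables are independent, which is what makes it a natural ingredient for an independence test.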
Citations: 0
Semiparametric efficiency gains from parametric restrictions on propensity scores
IF 2.7 | Zone 2, Mathematics | Q2 BIOLOGY | Pub Date: 2024-07-06 | DOI: 10.1093/biomet/asae034
Haruki Kono
Summary We explore how much knowing a parametric restriction on propensity scores improves semiparametric efficiency bounds in the potential outcome framework. For stratified propensity scores, considered as a parametric model, we derive explicit formulas for the efficiency gain from knowing how the covariate space is split. Based on these, we find that the efficiency gain decreases as the partition of the stratification becomes finer. For general parametric models, where it is hard to obtain explicit representations of efficiency bounds, we propose a novel framework that enables us to see whether knowing a parametric model is valuable in terms of efficiency even when it is high-dimensional. In addition to the intuitive fact that knowing the parametric model does not help much if it is sufficiently flexible, we discover that the efficiency gain can be nearly zero even though the parametric assumption significantly restricts the space of possible propensity scores.
Citations: 0