
International Journal of Biostatistics: Latest Publications

Empirical Efficiency Maximization: Improved Locally Efficient Covariate Adjustment in Randomized Experiments and Survival Analysis
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2008-05-04 | DOI: 10.2202/1557-4679.1084
D. Rubin, M. J. van der Laan
It has long been recognized that covariate adjustment can increase precision in randomized experiments, even when it is not strictly necessary. Adjustment is often straightforward when a discrete covariate partitions the sample into a handful of strata, but becomes more involved with even a single continuous covariate such as age. As randomized experiments remain a gold standard for scientific inquiry, and the information age facilitates a massive collection of baseline information, the longstanding problem of whether and how to adjust for covariates is likely to engage investigators for the foreseeable future. In the locally efficient estimation approach introduced for general coarsened data structures by James Robins and collaborators, one first fits a relatively small working model, often with maximum likelihood, giving a nuisance parameter fit in an estimating equation for the parameter of interest. The usual advertisement is that the estimator will be asymptotically efficient if the working model is correct, but otherwise will still be consistent and asymptotically Gaussian. However, by applying standard likelihood-based fits to misspecified working models in covariate adjustment problems, one can poorly estimate the parameter of interest. We propose a new method, empirical efficiency maximization, to optimize the working model fit for the resulting parameter estimate. In addition to the randomized experiment setting, we show how our covariate adjustment procedure can be used in survival analysis applications. Numerical asymptotic efficiency calculations demonstrate gains relative to standard locally efficient estimators.
Citations: 101
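Though the empirical-efficiency-maximization fit itself is not reproduced here, a minimal Monte Carlo sketch of the underlying idea — adjusting a randomized-experiment estimate with a working outcome regression — shows the precision gain the abstract refers to. All data-generating values and function names below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n):
    """Randomized trial: baseline covariate W, treatment A ~ Bernoulli(0.5), outcome Y."""
    W = rng.normal(size=n)
    A = rng.binomial(1, 0.5, size=n)
    Y = 1.0 * A + 2.0 * W + rng.normal(size=n)   # true treatment effect = 1.0
    return W, A, Y

def unadjusted(W, A, Y):
    """Difference in arm means, no covariate adjustment."""
    return Y[A == 1].mean() - Y[A == 0].mean()

def adjusted(W, A, Y):
    """Working linear model fit per arm, then average predictions over all W
    (a standard covariate-adjusted estimator; not the EEM-optimized fit)."""
    X = np.column_stack([np.ones_like(W), W])
    preds = []
    for a in (1, 0):
        beta, *_ = np.linalg.lstsq(X[A == a], Y[A == a], rcond=None)
        preds.append(X @ beta)
    return (preds[0] - preds[1]).mean()

reps, results = 2000, []
for _ in range(reps):
    data = simulate(400)
    results.append([unadjusted(*data), adjusted(*data)])
results = np.array(results)
print("Monte Carlo variance, unadjusted:", results[:, 0].var().round(4))
print("Monte Carlo variance, adjusted:  ", results[:, 1].var().round(4))
```

With a strongly predictive baseline covariate, the adjusted estimator's Monte Carlo variance is markedly smaller, which is the precision gain covariate adjustment is meant to deliver.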
Modeling the Effect of a Preventive Intervention on the Natural History of Cancer: Application to the Prostate Cancer Prevention Trial
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-12-28 | DOI: 10.2202/1557-4679.1036
P. Pinsky, Ruth Etzioni, N. Howlader, P. Goodman, I. Thompson
The Prostate Cancer Prevention Trial (PCPT) recently demonstrated a significant reduction in prostate cancer incidence of about 25% among men taking finasteride compared to men taking placebo. However, the effect of finasteride on the natural history of prostate cancer is not well understood. We adapted a convolution model developed by Pinsky (2001) to characterize the natural history of prostate cancer in the presence and absence of finasteride. The model was applied to data from 10,995 men in PCPT who had disease status determined by interim diagnosis of prostate cancer or end-of-study biopsy. Prostate cancer cases were either screen-detected by Prostate-Specific Antigen (PSA), biopsy-detected at the end of the study, or clinically detected, that is, detected by methods other than PSA screening. The hazard ratio (HR) for the incidence of preclinical disease on finasteride versus placebo was 0.42 (95% CI: 0.20-0.58). The progression from preclinical to clinical disease was relatively unaffected by finasteride, with mean sojourn time being 16 years for placebo cases and 18.5 years for finasteride cases (p-value for difference = 0.2). We conclude that finasteride appears to affect prostate cancer primarily by preventing the emergence of new, preclinical tumors with little impact on established, latent disease.
Citations: 5
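As a rough illustration of the two mechanisms the model separates — onset of preclinical disease versus progression to clinical disease — here is a toy simulation (not the Pinsky convolution model) that plugs in the reported hazard ratio of 0.42 and sojourn means of 16 and 18.5 years. The onset rate, trial horizon, and arm size are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, followup = 100_000, 7.0          # men per arm and years of follow-up (illustrative)

def simulate_arm(onset_rate, mean_sojourn):
    """Exponential time to preclinical onset, then an exponential sojourn
    (preclinical -> clinical). An end-of-study biopsy detects latent disease."""
    onset = rng.exponential(1.0 / onset_rate, size=n)
    sojourn = rng.exponential(mean_sojourn, size=n)
    clinical = (onset + sojourn) <= followup          # clinically detected during the trial
    latent = (onset <= followup) & ~clinical          # still preclinical at end-of-study biopsy
    return clinical.mean(), latent.mean()

# Placebo onset rate is arbitrary; the finasteride onset rate is scaled by the
# estimated hazard ratio of 0.42, with sojourn means of 16 vs. 18.5 years.
placebo = simulate_arm(onset_rate=0.02, mean_sojourn=16.0)
finast  = simulate_arm(onset_rate=0.02 * 0.42, mean_sojourn=18.5)

for label, (clin, lat) in [("placebo", placebo), ("finasteride", finast)]:
    print(f"{label:12s} clinically detected: {clin:.3f}  biopsy-detected: {lat:.3f}")
```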
Targeted Maximum Likelihood Learning
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-12-28 | DOI: 10.2202/1557-4679.1043
M. J. van der Laan, D. Rubin
Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter of a given likelihood based density estimator is typically too biased and might not even converge at the parametric rate: that is, the density estimator was targeted to be a good estimator of the density and might therefore result in a poor estimator of a particular smooth functional of the density. In this article we propose a one step (and, by iteration, k-th step) targeted maximum likelihood density estimator which involves 1) creating a hardest parametric submodel with parameter epsilon through the given density estimator with score equal to the efficient influence curve of the pathwise differentiable parameter at the density estimator, 2) estimating epsilon with the maximum likelihood estimator, and 3) defining a new density estimator as the corresponding update of the original density estimator. We show that iteration of this algorithm results in a targeted maximum likelihood density estimator which solves the efficient influence curve estimating equation and thereby yields a locally efficient estimator of the parameter of interest, under regularity conditions. In particular, we show that, if the parameter is linear and the model is convex, then the targeted maximum likelihood estimator is often achieved in the first step, and it results in a locally efficient estimator at an arbitrary (e.g., heavily misspecified) starting density. We also show that the targeted maximum likelihood estimators are now in full agreement with the locally efficient estimating function methodology as presented in Robins and Rotnitzky (1992) and van der Laan and Robins (2003), creating, in particular, algebraic equivalence between the double robust locally efficient estimators using the targeted maximum likelihood estimators as an estimate of its nuisance parameters, and targeted maximum likelihood estimators. In addition, it is argued that the targeted MLE has various advantages relative to the current estimating function based approach. We proceed by providing data driven methodologies to select the initial density estimator for the targeted MLE, thereby providing data adaptive targeted maximum likelihood estimation methodology. We illustrate the method with various worked out examples.
Citations: 749
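The following is a schematic numerical sketch of the one-step targeted update for the average treatment effect with a binary outcome: initial outcome and treatment regressions, a logistic fluctuation along the so-called clever covariate, and a plug-in of the updated fit. The model choices, sample size, and variable names are illustrative assumptions rather than the paper's worked examples.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
n = 5000
W = rng.normal(size=(n, 2))
A = rng.binomial(1, 1 / (1 + np.exp(-0.4 * W[:, 0])))                  # treatment depends on W
Y = rng.binomial(1, 1 / (1 + np.exp(-(-0.5 + A + W[:, 0] - 0.5 * W[:, 1]))))

expit = lambda x: 1 / (1 + np.exp(-x))
logit = lambda p: np.log(p / (1 - p))

# 1) Initial fits: outcome regression Qbar(A, W) and propensity g(W) = P(A=1 | W).
Q_fit = LogisticRegression(max_iter=1000).fit(np.column_stack([A, W]), Y)
g = np.clip(LogisticRegression(max_iter=1000).fit(W, A).predict_proba(W)[:, 1], 0.01, 0.99)
Q1 = Q_fit.predict_proba(np.column_stack([np.ones(n), W]))[:, 1]
Q0 = Q_fit.predict_proba(np.column_stack([np.zeros(n), W]))[:, 1]
QA = np.where(A == 1, Q1, Q0)

# 2) Fluctuation: maximize the Bernoulli likelihood in epsilon along the
#    clever covariate H(A, W) = A/g - (1-A)/(1-g).
H = A / g - (1 - A) / (1 - g)
def negloglik(eps):
    p = np.clip(expit(logit(np.clip(QA, 1e-6, 1 - 1e-6)) + eps * H), 1e-10, 1 - 1e-10)
    return -np.sum(Y * np.log(p) + (1 - Y) * np.log(1 - p))
eps = minimize_scalar(negloglik, bounds=(-1, 1), method="bounded").x

# 3) Targeted plug-in estimate of E[Y(1)] - E[Y(0)] from the updated fit.
Q1_star = expit(logit(np.clip(Q1, 1e-6, 1 - 1e-6)) + eps * (1 / g))
Q0_star = expit(logit(np.clip(Q0, 1e-6, 1 - 1e-6)) + eps * (-1 / (1 - g)))
print("targeted maximum likelihood estimate of the ATE:", round((Q1_star - Q0_star).mean(), 3))
```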
Choice of Monitoring Mechanism for Optimal Nonparametric Functional Estimation for Binary Data
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-09-25 | DOI: 10.2202/1557-4679.1031
N. Jewell, M. J. van der Laan, S. Shiboski
Optimal designs of dose levels in order to estimate parameters from a model for binary response data have a long and rich history. These designs are based on parametric models. Here we consider fully nonparametric models with interest focused on estimation of smooth functionals using plug-in estimators based on the nonparametric maximum likelihood estimator. An important application of the results is the derivation of the optimal choice of the monitoring time distribution function for current status observation of a survival distribution. The optimal choice depends in a simple way on the dose-response function and the form of the functional. The results can be extended to allow dependence of the monitoring mechanism on covariates.
Citations: 1
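To make the current status setting concrete, here is a small simulation in which only a monitoring time and an event indicator are observed, the NPMLE of F is computed by isotonic regression (for binary responses the least-squares isotonic fit coincides with the NPMLE), and a smooth functional — a restricted mean — is estimated by plug-in under two different monitoring-time distributions. The particular monitoring distributions and the functional are illustrative; the optimal-design calculation itself is not implemented.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(3)

def plugin_restricted_mean(monitor_sampler, n=2000, horizon=5.0):
    """Current status data: observe (C, delta = 1{T <= C}); estimate F by the NPMLE
    (isotonic regression of delta on C) and plug in for the restricted mean of T."""
    T = rng.exponential(1.0, size=n)              # true event times (mean 1)
    C = monitor_sampler(n)                        # monitoring times
    delta = (T <= C).astype(float)
    order = np.argsort(C)
    F_hat = IsotonicRegression(y_min=0.0, y_max=1.0).fit_transform(C[order], delta[order])
    grid = np.linspace(0.0, horizon, 500)
    F_grid = np.interp(grid, C[order], F_hat)     # step-function interpolation on a grid
    return (1.0 - F_grid).mean() * horizon        # plug-in restricted mean (Riemann sum)

uniform_monitor = lambda n: rng.uniform(0.0, 5.0, size=n)
early_monitor   = lambda n: rng.exponential(0.5, size=n)   # monitoring concentrated early

for name, sampler in [("uniform monitoring", uniform_monitor), ("early monitoring", early_monitor)]:
    est = np.array([plugin_restricted_mean(sampler) for _ in range(200)])
    print(f"{name:20s} mean estimate {est.mean():.3f}  Monte Carlo sd {est.std():.3f}")
```

The Monte Carlo spread of the plug-in estimator changes with the monitoring distribution, which is the efficiency question the paper's optimal-design results address analytically.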
Application of a Variable Importance Measure Method
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-01-14 | DOI: 10.2202/1557-4679.1013
M. Birkner, M. J. van der Laan
Van der Laan (2005) proposed a targeted method used to construct variable importance measures coupled with respective statistical inference. This technique involves determining the importance of a variable in predicting an outcome. This method can be applied as inverse probability of treatment weighted (IPTW) or double robust inverse probability of treatment weighted (DR-IPTW) estimators. The variance and respective p-value of the estimate are calculated by estimating the influence curve. This article applies the Van der Laan (2005) variable importance measures and corresponding inference to HIV-1 sequence data. In this application, the method is targeted at every codon position. In this data application, protease and reverse transcriptase codon positions on the HIV-1 strand are assessed to determine their respective variable importance, with respect to an outcome of viral replication capacity. We estimate the DR-IPTW W-adjusted variable importance measure for a specified set of potential effect modifiers W. In addition, simulations were performed on two separate datasets to examine the DR-IPTW estimator.
Citations: 0
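The IPTW and DR-IPTW estimators named above follow a generic pattern; the sketch below computes both for the W-adjusted importance of a single binary indicator A (standing in for presence of a mutation at one codon position) on a continuous outcome. The data-generating model and the logistic/linear nuisance fits are illustrative assumptions, not the paper's HIV-1 analysis.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(4)
n = 4000
W = rng.normal(size=(n, 3))                                  # potential effect modifiers / confounders
A = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * W[:, 0] - 0.3 * W[:, 1]))))   # mutation indicator
Y = 0.7 * A + W[:, 0] + 0.5 * W[:, 2] + rng.normal(size=n)   # outcome; true importance = 0.7

# Nuisance fits: propensity g(W) = P(A=1 | W) and outcome regression Qbar(A, W).
g = np.clip(LogisticRegression(max_iter=1000).fit(W, A).predict_proba(W)[:, 1], 0.01, 0.99)
Q_fit = LinearRegression().fit(np.column_stack([A, W]), Y)
Q1 = Q_fit.predict(np.column_stack([np.ones(n), W]))
Q0 = Q_fit.predict(np.column_stack([np.zeros(n), W]))

# IPTW: weight observed outcomes by the inverse probability of the observed A.
iptw = np.mean(A * Y / g) - np.mean((1 - A) * Y / (1 - g))

# DR-IPTW: augment with the outcome regression; consistent if either nuisance fit is correct.
dr = np.mean((A / g - (1 - A) / (1 - g)) * (Y - np.where(A == 1, Q1, Q0)) + Q1 - Q0)

print("IPTW estimate:   ", round(iptw, 3))
print("DR-IPTW estimate:", round(dr, 3))
```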
The Two Sample Problem for Multiple Categorical Variables
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-01-09 | DOI: 10.2202/1557-4679.1019
A. DiRienzo
Comparing two large multivariate distributions is potentially complicated at least for the following reasons. First, some variable/level combinations may have a redundant difference in prevalence between groups in the sense that the difference can be completely explained in terms of lower-order combinations. Second, the total number of variable/level combinations to compare between groups is very large, and likely computationally prohibitive. In this paper, for both the paired and independent sample case, an approximate comparison method is proposed, along with a computationally efficient algorithm, that estimates the set of variable/level combinations that have a non-redundant different prevalence between two populations. The probability that the estimate contains one or more false or redundant differences is asymptotically bounded above by any pre-specified level for arbitrary data-generating distributions. The method is shown to perform well for finite samples in a simulation study, and is used to investigate HIV-1 genotype evolution in a recent AIDS clinical trial.
Citations: 0
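The paper's algorithm and its notion of a non-redundant difference are more elaborate than anything shown here, but a small permutation sketch conveys the core multiplicity problem: compare the prevalence of many variable/level combinations (here only single levels and pairs) between two groups while bounding the family-wise error with the permutation distribution of the maximum discrepancy. The data, labels, and cutoffs are synthetic.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(5)

def all_indicators(X):
    """0/1 indicator for every single-variable level and every pair of levels
    (higher-order combinations are omitted to keep the sketch short)."""
    cols = {}
    p = X.shape[1]
    for j in range(p):
        for lev in np.unique(X[:, j]):
            cols[f"x{j}={lev}"] = (X[:, j] == lev).astype(float)
    for j, k in combinations(range(p), 2):
        for lj in np.unique(X[:, j]):
            for lk in np.unique(X[:, k]):
                cols[f"x{j}={lj},x{k}={lk}"] = ((X[:, j] == lj) & (X[:, k] == lk)).astype(float)
    names = list(cols)
    return names, np.column_stack([cols[name] for name in names])

def max_prevalence_test(X1, X2, n_perm=500):
    """Permutation test of the maximum absolute prevalence difference (FWER control)."""
    names, ind = all_indicators(np.vstack([X1, X2]))    # indicators on the pooled sample
    n1 = len(X1)
    obs = np.abs(ind[:n1].mean(axis=0) - ind[n1:].mean(axis=0))
    null_max = np.empty(n_perm)
    for b in range(n_perm):
        perm_ind = ind[rng.permutation(len(ind))]
        null_max[b] = np.abs(perm_ind[:n1].mean(axis=0) - perm_ind[n1:].mean(axis=0)).max()
    crit = np.quantile(null_max, 0.95)
    return [(name, round(d, 3)) for name, d in zip(names, obs) if d > crit]

# Two groups of sequences coded as categorical arrays (purely synthetic data).
X1 = rng.integers(0, 3, size=(300, 5))
X2 = rng.integers(0, 3, size=(300, 5))
X2[:, 0] = rng.choice([0, 1, 2], size=300, p=[0.6, 0.2, 0.2])   # shift one position
print(max_prevalence_test(X1, X2))
```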
Comparing Distribution Functions via Empirical Likelihood
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-01-04 | DOI: 10.2202/1557-4679.1007
I. McKeague, Yichuan Zhao
This paper develops empirical likelihood based simultaneous confidence bands for differences and ratios of two distribution functions from independent samples of right-censored survival data. The proposed confidence bands provide a flexible way of comparing treatments in biomedical settings, and bring empirical likelihood methods to bear on important target functions for which only Wald-type confidence bands have been available in the literature. The approach is illustrated with a real data example.
Citations: 40
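The empirical likelihood bands themselves require solving a constrained optimization at each time point, which is beyond a short sketch; shown instead is the Wald-type pointwise comparison the paper uses as its benchmark, with a hand-rolled Kaplan-Meier estimator and Greenwood variance applied to synthetic right-censored data.

```python
import numpy as np

rng = np.random.default_rng(6)

def kaplan_meier(time, event, grid):
    """Kaplan-Meier survival estimate and Greenwood variance evaluated on a time grid."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    s, g, step = 1.0, 0.0, {}
    for t in np.unique(time[event == 1]):
        d = np.sum((time == t) & (event == 1))     # events at t
        r = np.sum(time >= t)                      # at risk just before t
        s *= 1 - d / r
        g += d / (r * (r - d)) if r > d else 0.0
        step[t] = (s, s**2 * g)
    out_s = np.ones_like(grid, dtype=float)
    out_v = np.zeros_like(grid, dtype=float)
    for i, t in enumerate(grid):                   # evaluate the step function
        past = [v for u, v in step.items() if u <= t]
        if past:
            out_s[i], out_v[i] = past[-1]
    return out_s, out_v

def simulate(n, rate):
    t = rng.exponential(1.0 / rate, size=n)
    c = rng.exponential(2.0, size=n)               # independent right censoring
    return np.minimum(t, c), (t <= c).astype(int)

grid = np.linspace(0.1, 3.0, 30)
s1, v1 = kaplan_meier(*simulate(300, 1.0), grid)
s2, v2 = kaplan_meier(*simulate(300, 1.4), grid)
diff = s1 - s2
half = 1.96 * np.sqrt(v1 + v2)                     # pointwise Wald-type interval for the difference
print(np.column_stack([grid, diff, diff - half, diff + half])[::6].round(3))
```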
A Regression Model for Dependent Gap Times
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2006-01-01 | DOI: 10.2202/1557-4679.1005
R. Strawderman
A natural choice of time scale for analyzing recurrent event data is the "gap" (or sojourn) time between successive events. In many situations it is reasonable to assume correlation exists between the successive events experienced by a given subject. This paper looks at the problem of extending the accelerated failure time (AFT) model to the case of dependent recurrent event data via intensity modeling. Specifically, the accelerated gap times model of Strawderman (2005), a semiparametric intensity model for independent gap time data, is extended to the case of multiplicative gamma frailty. As argued in Aalen & Husebye (1991), incorporating frailty captures the heterogeneity between subjects and the "hazard" portion of the intensity model captures gap time variation within a subject. Estimators are motivated using semiparametric efficiency theory and lead to useful generalizations of the rank statistics considered in Strawderman (2005). Several interesting distinctions arise in comparison to the Cox-Andersen-Gill frailty model (e.g., Nielsen et al, 1992; Klein, 1992). The proposed methodology is illustrated by simulation and data analysis.
Citations: 2
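A toy simulation makes the modeling premise tangible: a multiplicative gamma frailty shared across a subject's gap times induces within-subject correlation, while a covariate scales the gap-time distribution AFT-style. This only illustrates the data structure, not the paper's estimators; the frailty variance and effect size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
n_subjects, n_gaps, theta, beta = 5000, 3, 1.0, 0.5   # frailty variance theta (illustrative)

x = rng.binomial(1, 0.5, size=n_subjects)             # subject-level covariate
frailty = rng.gamma(shape=1 / theta, scale=theta, size=n_subjects)   # mean 1, variance theta

# Conditional on the frailty, successive gap times are independent exponentials whose
# rate is scaled multiplicatively by the frailty and (AFT-style) by exp(beta * x).
rate = frailty * np.exp(beta * x)
gaps = rng.exponential(1.0 / rate[:, None], size=(n_subjects, n_gaps))

print("corr(gap1, gap2) with frailty variance %.1f: %.3f"
      % (theta, np.corrcoef(gaps[:, 0], gaps[:, 1])[0, 1]))

# Removing the frailty removes the within-subject correlation.
indep_gaps = rng.exponential(1.0 / np.exp(beta * x)[:, None], size=(n_subjects, n_gaps))
print("corr(gap1, gap2) without frailty:          %.3f"
      % np.corrcoef(indep_gaps[:, 0], indep_gaps[:, 1])[0, 1])
```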
History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2005-11-22 | DOI: 10.2202/1557-4679.1003
M. J. van der Laan, M. Petersen, M. Joffe
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a treatment. These models, introduced by Robins, model the marginal distributions of treatment-specific counterfactual outcomes, possibly conditional on a subset of the baseline covariates. Marginal structural models are particularly useful in the context of longitudinal data structures, in which each subject's treatment and covariate history are measured over time, and an outcome is recorded at a final time point. However, the utility of these models for some applications has been limited by their inability to incorporate modification of the causal effect of treatment by time-varying covariates. Particularly in the context of clinical decision making, such time-varying effect modifiers are often of considerable or even primary interest, as they are used in practice to guide treatment decisions for an individual. In this article we propose a generalization of marginal structural models, which we call history-adjusted marginal structural models (HA-MSM). These models allow estimation of adjusted causal effects of treatment, given the observed past, and are therefore more suitable for making treatment decisions at the individual level and for identification of time-dependent effect modifiers. Specifically, a HA-MSM models the conditional distribution of treatment-specific counterfactual outcomes, conditional on the whole or a subset of the observed past up till a time-point, simultaneously for all time-points. Double robust inverse probability of treatment weighted estimators have been developed and studied in detail for standard MSM. We extend these results by proposing a class of double robust inverse probability of treatment weighted estimators for the unknown parameters of the HA-MSM. In addition, we show that HA-MSM provide a natural approach to identifying the dynamic treatment regimen which follows, at each time-point, the history-adjusted (up till the most recent time point) optimal static treatment regimen. We illustrate our results using an example drawn from the treatment of HIV infection.
Citations: 82
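A two-time-point toy example illustrates what a history-adjusted effect and a statically-optimal dynamic regimen mean: when the effect of the second treatment reverses sign with an intermediate covariate, conditioning on the observed past changes the optimal static choice. Because treatment is randomized in this simulation, stratified means suffice; the HA-MSM machinery with inverse-probability weighting is what replaces this step with observational data. All quantities below are invented.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 200_000

W0 = rng.normal(size=n)                      # baseline covariate
A0 = rng.binomial(1, 0.5, size=n)            # first treatment, randomized
L1 = W0 + A0 + rng.normal(size=n)            # intermediate covariate (time-varying effect modifier)
A1 = rng.binomial(1, 0.5, size=n)            # second treatment, randomized
# The effect of A1 on the outcome reverses sign with the intermediate covariate L1.
Y = W0 + 0.5 * A0 + A1 * np.where(L1 > 0.5, 1.0, -1.0) + rng.normal(size=n)

# History-adjusted effect of A1 within strata of the observed past (here just 1{L1 > 0.5});
# because A1 is randomized, a stratified difference in means identifies it.
for label, stratum in [("L1 <= 0.5", L1 <= 0.5), ("L1 >  0.5", L1 > 0.5)]:
    effect = Y[stratum & (A1 == 1)].mean() - Y[stratum & (A1 == 0)].mean()
    best = 1 if effect > 0 else 0
    print(f"history {label}: effect of A1 = {effect:+.3f}  -> statically-optimal choice A1 = {best}")
```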
Score Statistics for Current Status Data: Comparisons with Likelihood Ratio and Wald Statistics
IF 1.2 | Mathematics, Zone 4 | Pub Date: 2005-08-04 | DOI: 10.2202/1557-4679.1001
M. Banerjee, J. Wellner
In this paper we introduce three natural "score statistics" for testing the hypothesis that F(t_0) takes on a fixed value in the context of nonparametric inference with current status data. These three new test statistics have natural interpretations in terms of certain (weighted) L_2 distances, and are also connected to natural "one-sided" scores. We compare these new test statistics with the analogue of the classical Wald statistic and the likelihood ratio statistic introduced in Banerjee and Wellner (2001) for the same testing problem. Under classical "regular" statistical problems the likelihood ratio, score, and Wald statistics all have the same chi-squared limiting distribution under the null hypothesis. In sharp contrast, in this non-regular problem all three statistics have different limiting distributions under the null hypothesis. Thus we begin by establishing the limit distribution theory of the statistics under the null hypothesis, and discuss calculation of the relevant critical points for the test statistics. Once the null distribution theory is known, the immediate question becomes that of power. We establish the limiting behavior of the three types of statistics under local alternatives. We have also compared the power of these five different statistics via a limited Monte-Carlo study. Our conclusions are: (a) the Wald statistic is less powerful than the likelihood ratio and score statistics; and (b) one of the score statistics may have more power than the likelihood ratio statistic for some alternatives.
Citations: 13
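A hedged sketch of the likelihood-ratio-type comparison for H0: F(t_0) = theta_0 with current status data: the unconstrained NPMLE via isotonic regression against a constrained fit built by clipping isotonic fits to the left and right of t_0 at theta_0 (a construction used in the constrained-NPMLE literature, taken here as a working approximation). As the abstract stresses, the null distribution of this statistic is not chi-squared, so no chi-squared calibration is attempted; all numerical choices are illustrative.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(9)
n, t0, theta0 = 1000, 1.0, 0.5          # test H0: F(t0) = theta0 (illustrative values)

# Current status data: monitoring time C, indicator delta = 1{T <= C}.
T = rng.exponential(1.0 / np.log(2), size=n)     # rate log(2), so the true F(1) = 0.5
C = rng.uniform(0.0, 3.0, size=n)
delta = (T <= C).astype(float)
order = np.argsort(C)
C, delta = C[order], delta[order]

def bernoulli_loglik(F, d):
    F = np.clip(F, 1e-10, 1 - 1e-10)
    return np.sum(d * np.log(F) + (1 - d) * np.log(1 - F))

# For binary responses the least-squares isotonic fit coincides with the NPMLE.
iso = lambda x, y: IsotonicRegression(y_min=0.0, y_max=1.0).fit_transform(x, y)

# Unconstrained NPMLE of F at the monitoring times.
F_hat = iso(C, delta)

# Constrained fit under F(t0) = theta0: isotonic fits to the left and right of t0,
# clipped at theta0 (working construction for the constrained NPMLE).
left, right = C <= t0, C > t0
F_con = np.empty(n)
F_con[left] = np.minimum(iso(C[left], delta[left]), theta0)
F_con[right] = np.maximum(iso(C[right], delta[right]), theta0)

lr = 2 * (bernoulli_loglik(F_hat, delta) - bernoulli_loglik(F_con, delta))
print("2 * log likelihood ratio:", round(lr, 3))
# NB: as the paper emphasizes, the null distribution of this statistic is NOT
# chi-squared; critical values come from non-standard limit theory.
```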