International Journal of Biostatistics: Latest Articles

Inference in Epidemic Models without Likelihoods
IF 1.2, Zone 4 (Mathematics), Pub Date: 2009-07-20, DOI: 10.2202/1557-4679.1171
T. McKinley, A. Cook, R. Deardon
Likelihood-based inference for epidemic models can be challenging, in part due to difficulties in evaluating the likelihood. The problem is particularly acute in models of large-scale outbreaks, and unobserved or partially observed data further complicates this process. Here we investigate the performance of Markov Chain Monte Carlo and Sequential Monte Carlo algorithms for parameter inference, where the routines are based on approximate likelihoods generated from model simulations. We compare our results to a gold-standard data-augmented MCMC for both complete and incomplete data. We illustrate our techniques using simulated epidemics as well as data from a recent outbreak of Ebola Haemorrhagic Fever in the Democratic Republic of Congo and discuss situations in which we think simulation-based inference may be preferable to likelihood-based inference.
Citations: 188
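The idea of replacing the likelihood with model simulations, as described in the abstract above, can be illustrated with the simplest member of that family: rejection-based approximate Bayesian computation. This is a minimal sketch and not the authors' MCMC or SMC routines. The Reed-Frost chain-binomial simulator, the uniform prior on `beta`, the final-size summary statistic, and the tolerance are all assumptions chosen for illustration.

```python
import random


def simulate_final_size(beta, n_pop, n_init, rng):
    """Reed-Frost chain-binomial epidemic; returns the total number ever infected."""
    susceptible = n_pop - n_init
    infected = n_init
    total = n_init
    while infected > 0 and susceptible > 0:
        # A susceptible escapes each current infective independently with prob. 1 - beta.
        p_inf = 1.0 - (1.0 - beta) ** infected
        new_cases = sum(1 for _ in range(susceptible) if rng.random() < p_inf)
        susceptible -= new_cases
        total += new_cases
        infected = new_cases
    return total


def abc_rejection(observed, n_pop, n_init, n_draws, tol, rng):
    """Keep prior draws whose simulated final size is within tol of the observed one."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, 0.1)  # assumed prior, for illustration only
        simulated = simulate_final_size(beta, n_pop, n_init, rng)
        if abs(simulated - observed) <= tol:
            accepted.append(beta)
    return accepted


rng = random.Random(1)
observed = simulate_final_size(0.03, 50, 1, rng)  # synthetic "data"
posterior = abc_rejection(observed, 50, 1, 2000, 2, rng)
print(f"observed final size: {observed}, accepted draws: {len(posterior)}")
```

The accepted values of `beta` approximate a posterior sample without the likelihood ever being evaluated. Shrinking `tol` tightens the approximation at the cost of fewer accepted draws.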
A Non-Parametric Approach to Scale Reduction for Uni-Dimensional Screening Scales
IF 1.2, Zone 4 (Mathematics), Pub Date: 2009-01-28, DOI: 10.2202/1557-4679.1094
Xinhua Liu, Zhezhen Jin
To select items from a uni-dimensional scale to create a reduced scale for disease screening, Liu and Jin (2007) developed a non-parametric method based on binary risk classification. When the measure for the risk of a disease is ordinal or quantitative, and possibly subject to random censoring, this method is inefficient because it requires dichotomizing the risk measure, which may cause information loss and sample size reduction. In this paper, we modify Harrell's C-index (1984) such that the concordance probability, used as a measure of the discrimination accuracy of a scale with integer-valued scores, can be estimated consistently when data are subject to random censoring. By evaluating changes in discrimination accuracy with the addition or deletion of items, we can select risk-related items without specifying parametric models. The procedure first removes the least useful items from the full scale, then applies forward stepwise selection to the remaining items to obtain a reduced scale whose discrimination accuracy matches or exceeds that of the full scale. A simulation study shows the procedure to have good finite sample performance. We illustrate the method using a data set of patients at risk of developing Alzheimer's disease, who were administered a 40-item test of olfactory function before their semi-annual follow-up assessment.
Citations: 7
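For readers unfamiliar with the concordance probability mentioned in the abstract above, the following sketch computes the standard (unmodified) Harrell's C-index for right-censored data. It is not the authors' censoring-consistent modification. The function name, the convention that a higher score means higher risk, and the half-credit rule for tied scores are assumptions made for illustration.

```python
def harrell_c(times, events, scores):
    """Standard Harrell's C: share of usable pairs in which the subject with
    the earlier observed event also has the higher risk score."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if subject i had an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties in integer-valued scores get half credit
    return concordant / usable if usable else float("nan")


times = [2, 4, 6, 8]
events = [1, 1, 0, 1]  # 0 marks a censored follow-up time
print(harrell_c(times, events, [9, 7, 5, 3]))  # perfectly concordant scores
print(harrell_c(times, events, [3, 5, 7, 9]))  # perfectly discordant scores
```

Scale reduction as described in the abstract would then compare this kind of index before and after adding or dropping an item.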
Measuring Agreement about Ranked Decision Choices for a Single Subject
IF 1.2, Zone 4 (Mathematics), Pub Date: 2009-01-01, DOI: 10.2202/1557-4679.1113
R. Riffenburgh, P. Johnstone
Introduction. When faced with a medical classification, clinicians often rank-order the likelihood of potential diagnoses, treatment choices, or prognoses as a way to focus on likely occurrences without dropping rarer ones from consideration. To know how well clinicians agree on such rankings might help extend the realm of clinical judgment farther into the purview of evidence-based medicine. If rankings by different clinicians agree better than chance, the order of assignments and their relative likelihoods may justifiably contribute to medical decisions. If the agreement is no better than chance, the ranking should not influence the medical decision. Background. Available rank-order methods measure agreement over a set of decision choices by two rankers or by a set of rankers over two choices (rank correlation methods), or an overall agreement over a set of choices by a set of rankers (Kendall's W), but will not measure agreement about a single decision choice across a set of rankers. Rating methods (e.g., kappa) assign multiple subjects to nominal categories rather than ranking possible choices about a single subject and will not measure agreement about a single decision choice across a set of rankers. Method. In this article, we pose an agreement coefficient A for measuring agreement among a set of clinicians about a single decision choice and compare several potential forms of A. A takes on the value 0 when agreement is random and 1 when agreement is perfect. It is shown that A = 1 - (observed disagreement / maximum disagreement). A particular form of A is recommended and tables of 5% and 10% significant values of A are generated for common numbers of ranks and rankers. Examples. In the selection of potential treatment assignments by a Tumor Board to a patient with a neck mass, there is no significant agreement about any treatment. Another example involves ranking decisions about a proposed medical research protocol by an Institutional Review Board (IRB). The decision to pass a protocol with minor revisions shows agreement at the 5% significance level, adequate for a consistent decision.
Citations: 2
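The ratio form A = 1 - observed disagreement / maximum disagreement from the abstract above can be made concrete with a small sketch. The abstract does not say which disagreement measure the authors recommend, so the one used here (sum of squared deviations of the ranks from their mean) is purely hypothetical, as are the function and argument names. The sketch shows only the ratio structure and makes no attempt to reproduce the paper's recommended form, its chance calibration, or its significance tables.

```python
def agreement_coefficient(ranks, n_ranks):
    """A = 1 - observed / maximum disagreement for one decision choice.

    ranks   : rank (1..n_ranks) given to the choice by each ranker
    n_ranks : number of available rank positions
    Disagreement is measured, hypothetically, as the sum of squared
    deviations of the ranks from their mean.
    """
    n = len(ranks)
    if n < 2 or n_ranks < 2:
        raise ValueError("need at least two rankers and two rank positions")
    mean_rank = sum(ranks) / n
    observed = sum((r - mean_rank) ** 2 for r in ranks)
    # The spread is largest when rankers split as evenly as possible
    # between the lowest rank (1) and the highest rank (n_ranks).
    low = n // 2
    maximum = low * (n - low) * (n_ranks - 1) ** 2 / n
    return 1.0 - observed / maximum


print(agreement_coefficient([1, 1, 1, 1], 4))  # all rankers agree
print(agreement_coefficient([1, 1, 4, 4], 4))  # rankers split between the extremes
```

Under this hypothetical measure the coefficient is 1 when every ranker assigns the same rank and 0 when the rankers are split evenly between the two extreme ranks.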
Empirical Efficiency Maximization: Improved Locally Efficient Covariate Adjustment in Randomized Experiments and Survival Analysis
IF 1.2, Zone 4 (Mathematics), Pub Date: 2008-05-04, DOI: 10.2202/1557-4679.1084
D. Rubin, M. J. van der Laan
It has long been recognized that covariate adjustment can increase precision in randomized experiments, even when it is not strictly necessary. Adjustment is often straightforward when a discrete covariate partitions the sample into a handful of strata, but becomes more involved with even a single continuous covariate such as age. As randomized experiments remain a gold standard for scientific inquiry, and the information age facilitates a massive collection of baseline information, the longstanding problem of if and how to adjust for covariates is likely to engage investigators for the foreseeable future. In the locally efficient estimation approach introduced for general coarsened data structures by James Robins and collaborators, one first fits a relatively small working model, often with maximum likelihood, giving a nuisance parameter fit in an estimating equation for the parameter of interest. The usual advertisement is that the estimator will be asymptotically efficient if the working model is correct, but otherwise will still be consistent and asymptotically Gaussian. However, by applying standard likelihood-based fits to misspecified working models in covariate adjustment problems, one can poorly estimate the parameter of interest. We propose a new method, empirical efficiency maximization, to optimize the working model fit for the resulting parameter estimate. In addition to the randomized experiment setting, we show how our covariate adjustment procedure can be used in survival analysis applications. Numerical asymptotic efficiency calculations demonstrate gains relative to standard locally efficient estimators.
Citations: 101
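The abstract's opening claim, that adjusting for a baseline covariate can increase precision in a randomized experiment, is easy to check by simulation. The sketch below compares an unadjusted difference in means with a classical ANCOVA-style adjustment using a pooled within-arm slope. It is not the empirical efficiency maximization procedure the authors propose, and the data-generating model (one Gaussian covariate, treatment effect 1.0, covariate coefficient 2.0) is an assumption chosen for illustration.

```python
import random
import statistics


def simulate_trial(n, effect, rng):
    """One randomized trial: rows of (treatment, covariate, outcome)."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)             # baseline covariate
        a = 1 if rng.random() < 0.5 else 0  # randomized treatment
        y = effect * a + 2.0 * x + rng.gauss(0.0, 1.0)
        rows.append((a, x, y))
    return rows


def unadjusted(rows):
    """Plain difference in mean outcome between arms."""
    treated = [y for a, _, y in rows if a == 1]
    control = [y for a, _, y in rows if a == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def adjusted(rows):
    """Difference in means corrected for chance covariate imbalance,
    using the pooled within-arm regression slope of outcome on covariate."""
    sxy = sxx = 0.0
    means = {}
    for arm in (0, 1):
        xs = [x for a, x, _ in rows if a == arm]
        ys = [y for a, _, y in rows if a == arm]
        mx, my = statistics.fmean(xs), statistics.fmean(ys)
        means[arm] = (mx, my)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    (mx0, my0), (mx1, my1) = means[0], means[1]
    return (my1 - my0) - slope * (mx1 - mx0)


rng = random.Random(0)
unadj, adj = [], []
for _ in range(300):
    rows = simulate_trial(200, 1.0, rng)
    unadj.append(unadjusted(rows))
    adj.append(adjusted(rows))
var_unadjusted = statistics.variance(unadj)
var_adjusted = statistics.variance(adj)
print(f"variance unadjusted: {var_unadjusted:.4f}, adjusted: {var_adjusted:.4f}")
```

Both estimators target the same treatment effect. The adjusted one varies less across repeated trials because it removes the outcome variation explained by the covariate.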
Modeling the Effect of a Preventive Intervention on the Natural History of Cancer: Application to the Prostate Cancer Prevention Trial
IF 1.2, Zone 4 (Mathematics), Pub Date: 2006-12-28, DOI: 10.2202/1557-4679.1036
P. Pinsky, Ruth Etzioni, N. Howlader, P. Goodman, I. Thompson
The Prostate Cancer Prevention Trial (PCPT) recently demonstrated a significant reduction in prostate cancer incidence of about 25% among men taking finasteride compared to men taking placebo. However, the effect of finasteride on the natural history of prostate cancer is not well understood. We adapted a convolution model developed by Pinsky (2001) to characterize the natural history of prostate cancer in the presence and absence of finasteride. The model was applied to data from 10,995 men in PCPT who had disease status determined by interim diagnosis of prostate cancer or end-of-study biopsy. Prostate cancer cases were either screen-detected by Prostate-Specific Antigen (PSA), biopsy-detected at the end of the study, or clinically detected, that is, detected by methods other than PSA screening. The hazard ratio (HR) for the incidence of preclinical disease on finasteride versus placebo was 0.42 (95% CI: 0.20-0.58). The progression from preclinical to clinical disease was relatively unaffected by finasteride, with mean sojourn time being 16 years for placebo cases and 18.5 years for finasteride cases (p-value for difference = 0.2). We conclude that finasteride appears to affect prostate cancer primarily by preventing the emergence of new, preclinical tumors with little impact on established, latent disease.
Citations: 5
Targeted Maximum Likelihood Learning
IF 1.2, Zone 4 (Mathematics), Pub Date: 2006-12-28, DOI: 10.2202/1557-4679.1043
M. J. van der Laan, D. Rubin
Suppose one observes a sample of independent and identically distributed observations from a particular data generating distribution. Suppose that one is concerned with estimation of a particular pathwise differentiable Euclidean parameter. A substitution estimator evaluating the parameter of a given likelihood based density estimator is typically too biased and might not even converge at the parametric rate: that is, the density estimator was targeted to be a good estimator of the density and might therefore result in a poor estimator of a particular smooth functional of the density. In this article we propose a one step (and, by iteration, k-th step) targeted maximum likelihood density estimator which involves 1) creating a hardest parametric submodel with parameter epsilon through the given density estimator with score equal to the efficient influence curve of the pathwise differentiable parameter at the density estimator, 2) estimating epsilon with the maximum likelihood estimator, and 3) defining a new density estimator as the corresponding update of the original density estimator. We show that iteration of this algorithm results in a targeted maximum likelihood density estimator which solves the efficient influence curve estimating equation and thereby yields a locally efficient estimator of the parameter of interest, under regularity conditions. In particular, we show that, if the parameter is linear and the model is convex, then the targeted maximum likelihood estimator is often achieved in the first step, and it results in a locally efficient estimator at an arbitrary (e.g., heavily misspecified) starting density. We also show that the targeted maximum likelihood estimators are now in full agreement with the locally efficient estimating function methodology as presented in Robins and Rotnitzky (1992) and van der Laan and Robins (2003), creating, in particular, algebraic equivalence between the double robust locally efficient estimators using the targeted maximum likelihood estimators as an estimate of its nuisance parameters, and targeted maximum likelihood estimators. In addition, it is argued that the targeted MLE has various advantages relative to the current estimating function based approach. We proceed by providing data driven methodologies to select the initial density estimator for the targeted MLE, thereby providing data adaptive targeted maximum likelihood estimation methodology. We illustrate the method with various worked out examples.
Citations: 749
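The three steps listed in the abstract above can be written out symbolically. The notation is an assumption, since the abstract gives none: $p_n^{0}$ is the initial density estimator, $D(p)$ the efficient influence curve of the target parameter $\Psi$ at $p$, and $O_1,\dots,O_n$ the observations. The linear fluctuation in step (1) is one common choice of submodel whose score at $\varepsilon = 0$ equals $D(p)$; other submodels with the same score would serve equally well.

```latex
\begin{align*}
\text{(1) submodel:}\quad
  & p_n^{k}(\varepsilon) = \bigl(1 + \varepsilon\, D(p_n^{k})\bigr)\, p_n^{k},
  \qquad
  \left.\frac{d}{d\varepsilon} \log p_n^{k}(\varepsilon)\right|_{\varepsilon = 0} = D(p_n^{k}) \\
\text{(2) MLE of } \varepsilon\text{:}\quad
  & \varepsilon_n^{k} = \arg\max_{\varepsilon} \sum_{i=1}^{n} \log p_n^{k}(\varepsilon)(O_i) \\
\text{(3) update:}\quad
  & p_n^{k+1} = p_n^{k}(\varepsilon_n^{k}) \\
\text{at convergence } (\varepsilon_n^{k} \approx 0)\text{:}\quad
  & \sum_{i=1}^{n} D(p_n^{*})(O_i) = 0,
  \qquad \psi_n^{*} = \Psi(p_n^{*})
\end{align*}
```

The last line follows because a maximizer at $\varepsilon = 0$ sets the derivative of the log-likelihood in step (2) to zero, and that derivative is the empirical sum of the score $D(p_n^{*})$. This is the efficient influence curve estimating equation the abstract refers to.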
Choice of Monitoring Mechanism for Optimal Nonparametric Functional Estimation for Binary Data
IF 1.2, Zone 4 (Mathematics), Pub Date: 2006-09-25, DOI: 10.2202/1557-4679.1031
N. Jewell, M. J. van der Laan, S. Shiboski
Optimal designs of dose levels in order to estimate parameters from a model for binary response data have a long and rich history. These designs are based on parametric models. Here we consider fully nonparametric models with interest focused on estimation of smooth functionals using plug-in estimators based on the nonparametric maximum likelihood estimator. An important application of the results is the derivation of the optimal choice of the monitoring time distribution function for current status observation of a survival distribution. The optimal choice depends in a simple way on the dose-response function and the form of the functional. The results can be extended to allow dependence of the monitoring mechanism on covariates.
Citations: 1
Application of a Variable Importance Measure Method
IF 1.2, Zone 4 (Mathematics), Pub Date: 2006-01-14, DOI: 10.2202/1557-4679.1013
M. Birkner, M. J. van der Laan
Van der Laan (2005) proposed a targeted method used to construct variable importance measures coupled with respective statistical inference. This technique involves determining the importance of a variable in predicting an outcome. This method can be applied as inverse probability of treatment weighted (IPTW) or double robust inverse probability of treatment weighted (DR-IPTW) estimators. The variance and respective p-value of the estimate are calculated by estimating the influence curve. This article applies the Van der Laan (2005) variable importance measures and corresponding inference to HIV-1 sequence data. In this application, the method is targeted at every codon position. In this data application, protease and reverse transcriptase codon positions on the HIV-1 strand are assessed to determine their respective variable importance, with respect to an outcome of viral replication capacity. We estimate the DR-IPTW W-adjusted variable importance measure for a specified set of potential effect modifiers W. In addition, simulations were performed on two separate datasets to examine the DR-IPTW estimator.
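The IPTW idea mentioned in this abstract can be sketched for the simplest case of a single binary variable. This is a minimal illustration, not the authors' implementation: the function name `iptw_effect`, the simulated data, and the assumption that the treatment probabilities `g` are known (rather than estimated) are all hypothetical choices made here. The standard error comes from the sample variance of the estimated influence-curve values, and the p-value uses a normal approximation.

```python
import math
import random


def iptw_effect(a, y, g):
    """IPTW estimate of E[Y(1)] - E[Y(0)] with an influence-curve-based SE.

    a: 0/1 indicators for the variable of interest
    y: outcomes
    g: P(A = 1 | W) for each subject, treated as known in this sketch
    Returns (estimate, standard_error, two_sided_p_value).
    """
    n = len(y)
    # Weighted per-subject contributions; their mean is the IPTW estimate.
    d = [ai * yi / gi - (1 - ai) * yi / (1 - gi)
         for ai, yi, gi in zip(a, y, g)]
    psi = sum(d) / n
    # Estimated influence-curve values are d_i - psi; their sample variance
    # divided by n approximates the variance of the estimator.
    var = sum((di - psi) ** 2 for di in d) / (n * (n - 1))
    se = math.sqrt(var)
    # Two-sided p-value from the normal approximation to psi / se.
    p = math.erfc(abs(psi / se) / math.sqrt(2))
    return psi, se, p


# Simulated example (hypothetical): one confounder W, true effect of A is 2.0.
random.seed(0)
n = 5000
w = [random.random() for _ in range(n)]
g = [0.2 + 0.6 * wi for wi in w]                      # known P(A = 1 | W)
a = [1 if random.random() < gi else 0 for gi in g]
y = [2.0 * ai + wi + random.gauss(0, 1) for ai, wi in zip(a, w)]
psi, se, p = iptw_effect(a, y, g)
```

In the paper's application the same kind of estimate would be computed once per codon position, with the double robust (DR-IPTW) version additionally using a regression model for the outcome; that extension is not shown here.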
Citations: 0
The Two Sample Problem for Multiple Categorical Variables
IF 1.2 | Zone 4, Mathematics | Pub Date: 2006-01-09 | DOI: 10.2202/1557-4679.1019
A. DiRienzo
Comparing two large multivariate distributions is potentially complicated at least for the following reasons. First, some variable/level combinations may have a redundant difference in prevalence between groups in the sense that the difference can be completely explained in terms of lower-order combinations. Second, the total number of variable/level combinations to compare between groups is very large, and likely computationally prohibitive. In this paper, for both the paired and independent sample case, an approximate comparison method is proposed, along with a computationally efficient algorithm, that estimates the set of variable/level combinations that have a non-redundant different prevalence between two populations. The probability that the estimate contains one or more false or redundant differences is asymptotically bounded above by any pre-specified level for arbitrary data-generating distributions. The method is shown to perform well for finite samples in a simulation study, and is used to investigate HIV-1 genotype evolution in a recent AIDS clinical trial.
Citations: 0
Comparing Distribution Functions via Empirical Likelihood
IF 1.2 | Zone 4, Mathematics | Pub Date: 2006-01-04 | DOI: 10.2202/1557-4679.1007
I. McKeague, Yichuan Zhao
This paper develops empirical likelihood based simultaneous confidence bands for differences and ratios of two distribution functions from independent samples of right-censored survival data. The proposed confidence bands provide a flexible way of comparing treatments in biomedical settings, and bring empirical likelihood methods to bear on important target functions for which only Wald-type confidence bands have been available in the literature. The approach is illustrated with a real data example.
Citations: 40