
American Statistician: Latest Articles

Optimizing Sample Size Allocation and Power in a Bayesian Two-Stage Drop-The-Losers Design.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2019-01-01 | Epub Date: 2019-06-24 | DOI: 10.1080/00031305.2019.1610065
Alex Karanevich, Richard Meier, Stefan Graw, Anna McGlothlin, Byron Gajewski

When a researcher desires to test several treatment arms against a control arm, a two-stage adaptive design can be more efficient than a single-stage design where patients are equally allocated to all treatment arms and the control. We see this type of approach in clinical trials as a seamless Phase II - Phase III design. These designs require more statistical support and are less straightforward to plan and analyze than a standard single-stage design. To diminish the barriers associated with a Bayesian two-stage drop-the-losers design, we built a user-friendly point-and-click graphical user interface with R Shiny to aid researchers in planning such designs by allowing them to easily obtain trial operating characteristics, estimate statistical power and sample size, and optimize patient allocation in each stage to maximize power. We assume that endpoints are distributed normally with unknown but common variance between treatments. We recommend this software as an easy way to engage statisticians and researchers in two-stage designs as well as to actively investigate the power of two-stage designs relative to more traditional approaches. The software is freely available at https://github.com/stefangraw/Allocation-Power-Optimizer.
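The planning question in this abstract (how to split patients between stages to maximize power) can be explored with a small simulation. The sketch below is not the authors' Shiny tool or their Bayesian analysis: it assumes a known standard deviation, a control mean of zero, selection of the stage-1 winner by sample mean, and a naive one-sided z-test on pooled data, which ignores selection bias. The names `simulate_trial` and `estimate_power` are hypothetical.

```python
import random
from statistics import NormalDist, mean


def simulate_trial(rng, effects, sd, n1, n2, alpha=0.025):
    """Run one simulated trial; return True if the stage-1 winner beats control."""
    # Stage 1: n1 patients on each treatment arm and on control (control mean is 0).
    arms = [[rng.gauss(e, sd) for _ in range(n1)] for e in effects]
    control = [rng.gauss(0.0, sd) for _ in range(n1)]
    # Drop the losers: keep only the arm with the largest stage-1 sample mean.
    best = max(range(len(effects)), key=lambda k: mean(arms[k]))
    # Stage 2: n2 further patients on the winner and on control.
    winner = arms[best] + [rng.gauss(effects[best], sd) for _ in range(n2)]
    control = control + [rng.gauss(0.0, sd) for _ in range(n2)]
    # One-sided z-test on the pooled data, treating sd as known (a simplification).
    se = sd * (1.0 / len(winner) + 1.0 / len(control)) ** 0.5
    z = (mean(winner) - mean(control)) / se
    return z > NormalDist().inv_cdf(1.0 - alpha)


def estimate_power(effects, n1, n2, sd=1.0, n_sims=2000, seed=1):
    """Monte Carlo estimate of the probability that the trial declares success."""
    rng = random.Random(seed)
    wins = sum(simulate_trial(rng, effects, sd, n1, n2) for _ in range(n_sims))
    return wins / n_sims


if __name__ == "__main__":
    effects = [0.0, 0.2, 0.5]  # assumed true effects of three treatment arms
    # Two allocations with the same total sample size: 4 * n1 + 2 * n2 = 360.
    for n1, n2 in [(30, 120), (60, 60)]:
        print(n1, n2, estimate_power(effects, n1, n2))
```

Comparing the printed estimates across allocations gives a rough feel for the trade-off between picking the right arm in stage 1 and having enough stage-2 data to confirm it.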

Citations: 1
Modified Wilcoxon-Mann-Whitney Test and Power against Strong Null.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2019-01-01 | Epub Date: 2018-05-10 | DOI: 10.1080/00031305.2017.1328375
Youyi Fong, Ying Huang

The Wilcoxon-Mann-Whitney (WMW) test is a popular rank-based two-sample testing procedure for the strong null hypothesis that the two samples come from the same distribution. A modified WMW test, the Fligner-Policello (FP) test, has been proposed for comparing the medians of two populations. A fact that may be underappreciated among some practitioners is that the FP test can also be used to test the strong null, like the WMW test. In this paper we compare the power of the WMW and FP tests for testing the strong null. Our results show that neither test is uniformly better than the other and that there can be substantial differences in power between the two choices. We propose a new, modified WMW test that combines the WMW and FP tests. Monte Carlo studies show that the combined test has good power compared to either the WMW or the FP test. We provide a fast implementation of the proposed test in open-source software. Supplementary materials are available online.
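To make the two building blocks concrete, the sketch below computes the standardized WMW statistic and the Fligner-Policello statistic from placements. It assumes continuous data with no ties and uses large-sample standardization only; it does not implement the combined test proposed in the paper. The function names are hypothetical.

```python
from math import sqrt


def placements(a, b):
    """For each value in a, count the values in b that fall below it."""
    return [sum(1 for v in b if v < u) for u in a]


def wmw_z(x, y):
    """Wilcoxon-Mann-Whitney statistic, standardized under the strong null."""
    m, n = len(x), len(y)
    u = sum(placements(y, x))  # number of (x, y) pairs with x < y
    return (u - m * n / 2.0) / sqrt(m * n * (m + n + 1) / 12.0)


def fp_z(x, y):
    """Fligner-Policello statistic, which uses a placement-based variance."""
    p, q = placements(x, y), placements(y, x)
    p_bar, q_bar = sum(p) / len(p), sum(q) / len(q)
    v1 = sum((pi - p_bar) ** 2 for pi in p)
    v2 = sum((qj - q_bar) ** 2 for qj in q)
    # The denominator is zero when the two samples do not overlap at all.
    return (sum(q) - sum(p)) / (2.0 * sqrt(v1 + v2 + p_bar * q_bar))


if __name__ == "__main__":
    x = [1, 3, 5, 7]
    y = [2, 4, 6, 8, 10]
    print(wmw_z(x, y))  # about 0.98
    print(fp_z(x, y))   # about 1.0
```

The two statistics share a numerator up to a constant and differ in how the variance is estimated, which is why their power can diverge under different alternatives.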

American Statistician, vol. 73, no. 1, pp. 43-49.
Citations: 14
Facilitating the Calculation of the Efficient Score Using Symbolic Computing.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2018-01-01 | Epub Date: 2017-10-30 | DOI: 10.1080/00031305.2017.1392361
Alexander Sibley, Zhiguo Li, Yu Jiang, Yi-Ju Li, Cliburn Chan, Andrew Allen, Kouros Owzar

The score statistic continues to be a fundamental tool for statistical inference. In the analysis of data from high-throughput genomic assays, inference on the basis of the score usually enjoys greater stability, considerably higher computational efficiency, and lends itself more readily to the use of resampling methods than the asymptotically equivalent Wald or likelihood ratio tests. The score function often depends on a set of unknown nuisance parameters which have to be replaced by estimators, but can be improved by calculating the efficient score, which accounts for the variability induced by estimating these parameters. Manual derivation of the efficient score is tedious and error-prone, so we illustrate using computer algebra to facilitate this derivation. We demonstrate this process within the context of a standard example from genetic association analyses, though the techniques shown here could be applied to any derivation, and have a place in the toolbox of any modern statistician. We further show how the resulting symbolic expressions can be readily ported to compiled languages, to develop fast numerical algorithms for high-throughput genomic analysis. We conclude by considering extensions of this approach. The code featured in this report is available online as part of the supplementary material.
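For readers unfamiliar with the term, the efficient score can be written in the standard textbook form below. The notation is an assumption made here and is not taken from the paper: β is the parameter of interest, η the nuisance parameters, U the score components, and I the corresponding blocks of the Fisher information.

```latex
U^{*}_{\beta}(\beta,\eta)
  = U_{\beta}(\beta,\eta)
  - I_{\beta\eta}\, I_{\eta\eta}^{-1}\, U_{\eta}(\beta,\eta),
\qquad
\operatorname{Var}\!\left(U^{*}_{\beta}\right)
  = I_{\beta\beta} - I_{\beta\eta}\, I_{\eta\eta}^{-1}\, I_{\eta\beta}.
```

The subtracted term projects out the part of the score for β that is explained by the nuisance scores. Each block requires derivatives of the log-likelihood, which is the step that computer algebra can automate.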

American Statistician, vol. 72, no. 2, pp. 199-205.
Citations: 1
A Guide to Teaching Data Science.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2018-01-01 | Epub Date: 2018-11-14 | DOI: 10.1080/00031305.2017.1356747
Stephanie C Hicks, Rafael A Irizarry

Demand for data science education is surging and traditional courses offered by statistics departments are not meeting the needs of those seeking training. This has led to a number of opinion pieces advocating for an update to the Statistics curriculum. The unifying recommendation is that computing should play a more prominent role. We strongly agree with this recommendation, but advocate the main priority is to bring applications to the forefront as proposed by Nolan and Speed (1999). We also argue that the individuals tasked with developing data science courses should not only have statistical training, but also have experience analyzing data with the main objective of solving real-world problems. Here, we share a set of general principles and offer a detailed guide derived from our successful experience developing and teaching a graduate-level, introductory data science course centered entirely on case studies. We argue for the importance of statistical thinking, as defined by Wild and Pfannkuch (1999) and describe how our approach teaches students three key skills needed to succeed in data science, which we refer to as creating, connecting, and computing. This guide can also be used for statisticians wanting to gain more practical knowledge about data science before embarking on teaching an introductory course.

American Statistician, vol. 72, no. 4, pp. 382-391.
Citations: 71
How to share data for collaboration.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2018-01-01 | Epub Date: 2018-04-24 | DOI: 10.1080/00031305.2017.1375987
Shannon E Ellis, Jeffrey T Leek

Within the statistics community, a number of guiding principles for sharing data have emerged; however, these principles are not always made clear to collaborators generating the data. To bridge this divide, we have established a set of guidelines for sharing data. In these, we highlight the need to provide raw data to the statistician, the importance of consistent formatting, and the necessity of including all essential experimental information and pre-processing steps carried out to the statistician. With these guidelines we hope to avoid errors and delays in data analysis.

American Statistician, vol. 72, no. 1, pp. 53-57. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7518408/pdf/nihms-1502431.pdf
Citations: 0
A Comparison of Correlation Structure Selection Penalties for Generalized Estimating Equations.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2017-01-01 | Epub Date: 2018-01-11 | DOI: 10.1080/00031305.2016.1200490
Philip M Westgate, Woodrow W Burchett

Correlated data are commonly analyzed using models constructed using population-averaged generalized estimating equations (GEEs). The specification of a population-averaged GEE model includes selection of a structure describing the correlation of repeated measures. Accurate specification of this structure can improve efficiency, whereas the finite-sample estimation of nuisance correlation parameters can inflate the variances of regression parameter estimates. Therefore, correlation structure selection criteria should penalize, or account for, correlation parameter estimation. In this article, we compare recently proposed penalties in terms of their impacts on correlation structure selection and regression parameter estimation, and give practical considerations for data analysts. Supplementary materials for this article are available online.

American Statistician, vol. 71, no. 4, pp. 344-353.
Citations: 17
Efficient Computation of Reduced Regression Models.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2017-01-01 | Epub Date: 2017-02-28 | DOI: 10.1080/00031305.2017.1296375
Stuart R Lipsitz, Garrett M Fitzmaurice, Debajyoti Sinha, Nathanael Hevelone, Edward Giovannucci, Quoc-Dien Trinh, Jim C Hu

We consider settings where it is of interest to fit and assess regression submodels that arise as various explanatory variables are excluded from a larger regression model. The larger model is referred to as the full model; the submodels are the reduced models. We show that a computationally efficient approximation to the regression estimates under any reduced model can be obtained from a simple weighted least squares (WLS) approach based on the estimated regression parameters and covariance matrix from the full model. This WLS approach can be considered an extension to unbiased estimating equations of a first-order Taylor series approach proposed by Lawless and Singhal. Using data from the 2010 Nationwide Inpatient Sample (NIS), a 20% weighted, stratified, cluster sample of approximately 8 million hospital stays from approximately 1000 hospitals, we illustrate the WLS approach when fitting interval censored regression models to estimate the effect of type of surgery (robotic versus nonrobotic surgery) on hospital length-of-stay while adjusting for three sets of covariates: patient-level characteristics, hospital characteristics, and zip-code level characteristics. Ordinarily, standard fitting of the reduced models to the NIS data takes approximately 10 hours; using the proposed WLS approach, the reduced models take seconds to fit.
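The idea of obtaining reduced-model estimates from full-model output alone can be illustrated with the first-order update associated with Lawless and Singhal, which the abstract says the WLS approach extends. The sketch below handles only the simplest case, dropping a single covariate, and is not the authors' implementation. The function name `drop_covariate` and the example numbers are hypothetical.

```python
def drop_covariate(beta, cov, j):
    """Approximate the reduced-model coefficients when coefficient j is set to zero.

    beta: full-model coefficient estimates.
    cov:  estimated covariance matrix of beta (list of rows).
    Each remaining coefficient k is updated as
    beta[k] - cov[k][j] / cov[j][j] * beta[j].
    """
    return [
        b - cov[k][j] / cov[j][j] * beta[j]
        for k, b in enumerate(beta)
        if k != j
    ]


if __name__ == "__main__":
    beta_full = [1.0, 2.0, 0.5]
    cov_full = [
        [4.0, 1.0, 2.0],
        [1.0, 3.0, 1.0],
        [2.0, 1.0, 2.0],
    ]
    # Drop the third covariate without refitting the model.
    print(drop_covariate(beta_full, cov_full, 2))  # [0.5, 1.75]
```

For ordinary least squares this update reproduces the refitted coefficients exactly; for other models it is an approximation whose quality depends on how well the full-model estimates are described by their estimated covariance. No pass over the raw data is needed, which is where the time saving comes from.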

American Statistician, vol. 71, no. 2, pp. 171-176.
Citations: 1
Strategies for Success: Early-Stage Collaborating Biostatistics Faculty in an Academic Health Center.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2017-01-01 | Epub Date: 2017-10-18 | DOI: 10.1080/00031305.2016.1277157
Heidi Spratt, Erin E Fox, Nawar Shara, Madhu Mazumdar

Collaborative biostatistics faculties (CBF) are increasingly valued by academic health centers (AHCs) for their role in increasing success rates of grants and publications, and educating medical students and clinical researchers. Some AHCs have a biostatistics department that consists of only biostatisticians focused on methodological research, collaborative research, and education. Others may have a biostatistics unit within an interdisciplinary department, or statisticians recruited into clinical departments. Within each model, there is also variability in environment, influenced by the chair's background, research focus of colleagues, type of students taught, funding sources, and whether the department is in a medical school or school of public health. CBF appointments may be tenure track or non-tenure, and expectations for promotion may vary greatly depending on the type of department, track, and the AHC. In this article, the authors identify strategies for developing early-stage CBFs in four domains: 1) Influence of department/environment, 2) Skills to develop, 3) Ways to increase productivity, and 4) Ways to document accomplishments. Graduating students and postdoctoral fellows should consider the first domain when choosing a faculty position. Early-stage CBFs will benefit by understanding the requirements of their environment early in their appointment and by modifying the provided progression grid with their chair and mentoring team as needed. Following this personalized grid will increase the chances of a satisfying career with appropriate recognition for academic accomplishments.

American Statistician, vol. 71, no. 3, pp. 220-230.
Citations: 12
The Central Role of Bayes' Theorem for Joint Estimation of Causal Effects and Propensity Scores.
IF 1.8 | Zone 4, Mathematics | Q1 Mathematics | Pub Date: 2016-03-31 | Epub Date: 2015-12-14 | DOI: 10.1080/00031305.2015.1111260
Corwin Matthew Zigler

Although propensity scores have been central to the estimation of causal effects for over 30 years, only recently has the statistical literature begun to consider in detail methods for Bayesian estimation of propensity scores and causal effects. Underlying this recent body of literature on Bayesian propensity score estimation is an implicit discordance between the goal of the propensity score and the use of Bayes theorem. The propensity score condenses multivariate covariate information into a scalar to allow estimation of causal effects without specifying a model for how each covariate relates to the outcome. Avoiding specification of a detailed model for the outcome response surface is valuable for robust estimation of causal effects, but this strategy is at odds with the use of Bayes theorem, which presupposes a full probability model for the observed data that adheres to the likelihood principle. The goal of this paper is to explicate this fundamental feature of Bayesian estimation of causal effects with propensity scores in order to provide context for the existing literature and for future work on this important topic.

American Statistician, vol. 70, no. 1, pp. 47-54. Journal Article. Published 2016-03-31. https://doi.org/10.1080/00031305.2015.1111260
Citations: 40
The p-Value You Can't Buy.
IF 1.8 Zone 4 (Mathematics) Q1 Mathematics Pub Date: 2016-01-02 Epub Date: 2016-03-31 DOI: 10.1080/00031305.2015.1069760
Eugene Demidenko

There is growing frustration with the concept of the p-value. Besides having an ambiguous interpretation, the p-value can be made as small as desired by increasing the sample size, n. The p-value is outdated and does not make sense with big data: Everything becomes statistically significant. The root of the problem with the p-value is in the mean comparison. We argue that statistical uncertainty should be measured on the individual, not the group, level. Consequently, standard deviation (SD), not standard error (SE), error bars should be used to graphically present the data on two groups. We introduce a new measure based on the discrimination of individuals/objects from two groups, and call it the D-value. The D-value can be viewed as the n-of-1 p-value because it is computed in the same way as p while letting n equal 1. We show how the D-value is related to discrimination probability and the area above the receiver operating characteristic (ROC) curve. The D-value has a clear interpretation as the proportion of patients who get worse after the treatment, and as such facilitates to weigh up the likelihood of events under different scenarios. [Received January 2015. Revised June 2015.].
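The abstract's two claims, that the p-value can be driven toward zero by increasing n and that the D-value is "computed in the same way as p while letting n equal 1", can be sketched numerically. The sketch assumes a one-sided two-sample z-test with two equal-sized groups and a common known SD; that setup and the function names are assumptions, and the paper's exact definition of the D-value may differ.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p(mean_diff, sd, n):
    """One-sided z-test p-value for two groups of size n with common known SD.

    The standard error of the difference in means is sd * sqrt(2 / n),
    so it shrinks as n grows and the p-value shrinks with it.
    """
    z = mean_diff / (sd * math.sqrt(2.0 / n))
    return normal_cdf(-z)

def d_value(mean_diff, sd):
    """Assumed reading of the D-value: the same formula with n fixed at 1.

    It compares individuals rather than group means, so it does not
    depend on the sample size.
    """
    return one_sided_p(mean_diff, sd, n=1)

# A small effect (0.2 SD): the p-value falls with n, the D-value stays put.
p_values = {n: one_sided_p(0.2, 1.0, n) for n in (10, 100, 1000, 10000)}
d = d_value(0.2, 1.0)
```

Under these assumptions the D-value for a 0.2 SD difference stays a little below 0.5 however many subjects are recruited, while the p-value at n = 10000 is negligible.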

American Statistician, vol. 70, no. 1, pp. 33-38. Journal Article. Published 2016-01-02. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4867863/pdf/
Citations: 0