
Observational studies: latest articles

Using propensity scores for racial disparities analysis
Pub Date : 2022-09-08 DOI: 10.1353/obs.2023.0005
Fan Li
Abstract: Propensity score plays a central role in causal inference, but its use is not limited to causal comparisons. As a covariate balancing tool, propensity score can be used for controlled descriptive comparisons between groups whose memberships are not manipulable. A prominent example is racial disparities in health care. However, conceptual confusion and hesitation persist about using propensity score in racial disparities studies. In this commentary, we argue that propensity score, possibly combined with other methods, is an effective tool for racial disparities analysis. We describe relevant estimands, target population, and assumptions. In particular, we clarify that a controlled descriptive comparison requires weaker assumptions than a causal comparison. We discuss three common propensity score weighting strategies: overlap weighting, inverse probability weighting, and average treatment effect for the treated weighting. We further describe how to combine weighting with the rank-and-replace adjustment method to produce racial disparity estimates concordant with the Institute of Medicine's definition. The method is illustrated by a re-analysis of the Medical Expenditure Panel Survey data.
Observational studies, vol. 9, no. 1, pp. 59-68. Journal Article.
Citations: 1
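The three weighting strategies named in the abstract above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's code: it assumes a binary group indicator `z` and propensity scores `e` = P(z = 1 | covariates) that have already been estimated elsewhere, and it uses the standard textbook formulas for inverse probability, ATT, and overlap weights. The function names `balancing_weights` and `weighted_difference` and the toy numbers are hypothetical.

```python
def balancing_weights(z, e, scheme):
    """Return one weight per unit for a binary group indicator z (0/1)
    and estimated propensity scores e = P(z = 1 | covariates)."""
    weights = []
    for zi, ei in zip(z, e):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        if scheme == "ipw":        # target: the combined population
            w = 1.0 / ei if zi == 1 else 1.0 / (1.0 - ei)
        elif scheme == "att":      # target: the z = 1 group
            w = 1.0 if zi == 1 else ei / (1.0 - ei)
        elif scheme == "overlap":  # target: the overlap population
            w = 1.0 - ei if zi == 1 else ei
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        weights.append(w)
    return weights


def weighted_difference(y, z, w):
    """Difference in weighted outcome means between z = 1 and z = 0."""
    def wmean(group):
        num = sum(wi * yi for yi, zi, wi in zip(y, z, w) if zi == group)
        den = sum(wi for zi, wi in zip(z, w) if zi == group)
        return num / den
    return wmean(1) - wmean(0)


# Toy data, made up for illustration only.
z = [1, 1, 0, 0]
e = [0.8, 0.5, 0.5, 0.2]
y = [10.0, 8.0, 7.0, 5.0]
for scheme in ("ipw", "att", "overlap"):
    w = balancing_weights(z, e, scheme)
    print(scheme, round(weighted_difference(y, z, w), 3))
```

Each scheme reweights the two groups toward a different target population, which is why the three adjusted differences generally disagree even on the same data. The rank-and-replace step mentioned in the abstract is not shown here.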
Revisiting the Propensity Score's Central Role: Towards Bridging Balance and Efficiency in the Era of Causal Machine Learning
Pub Date : 2022-08-17 DOI: 10.1353/obs.2023.0001
N. Hejazi, M. J. van der Laan
Abstract: About forty years ago, in a now-seminal contribution, Rosenbaum and Rubin (1983) introduced a critical characterization of the propensity score as a central quantity for drawing causal inferences in observational study settings. In the decades since, much progress has been made across several research frontiers in causal inference, notably including the re-weighting and matching paradigms. Focusing on the former and specifically on its intersection with machine learning and semiparametric efficiency theory, we re-examine the role of the propensity score in modern methodological developments. As the contribution of Rosenbaum and Rubin (1983) spurred a focus on the balancing property of the propensity score, we re-examine the degree to which and how this property plays a role in the development of asymptotically efficient estimators of causal effects; moreover, we discuss a connection between the balancing property and efficient estimation in the form of score equations and propose a score test for evaluating whether an estimator achieves empirical balance.
Observational studies, vol. 9, no. 1, pp. 23-34. Journal Article.
Citations: 0
Propensity Score Modeling: Key Challenges When Moving Beyond the No-Interference Assumption
Pub Date : 2022-08-13 DOI: 10.1353/obs.2023.0003
Hyunseung Kang, Chan Park, R. Trane
Abstract: The paper presents some models for the propensity score. Considerable attention is given to a recently popular but relatively under-explored setting in causal inference where the no-interference assumption does not hold. We lay out some key challenges in propensity score modeling under interference and present a few promising models based on existing work on mixed effects models.
Observational studies, vol. 9, no. 1, pp. 43-53. Journal Article.
Citations: 1
Sensitivity Analysis for the Adjusted Mann-Whitney Test with Observational Studies
Pub Date : 2022-06-04 DOI: 10.1353/obs.2022.0002
Maozhu Dai, Weining Shen, H. Stern
Abstract: The Mann-Whitney test is a popular nonparametric test for comparing two samples. It has recently been extended by Satten et al. (2018) to allow testing for the existence of treatment effects in observational studies. Their proposed adjusted Mann-Whitney test relies on the unconfoundedness assumption, which is untestable in practice. It hence becomes important to assess the impact of violating this assumption on the degree to which causal conclusions remain valid. In this paper, we consider a marginal sensitivity analysis framework to address this problem by utilizing a bootstrap approach that provides a sensitivity interval for the estimand with a guaranteed coverage probability as long as the data generating mechanism is included in the set of pre-specified sensitivity models. We develop efficient optimization algorithms for computing the sensitivity interval and further extend our approach to a general class of adjusted multi-sample U-statistics. Simulation studies and two real data applications are discussed to demonstrate the utility of our proposed methodology.
Observational studies, vol. 8, no. 1, pp. 1-29. Journal Article.
Citations: 1
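For orientation, the quantity behind the classical two-sample Mann-Whitney test mentioned in the abstract above can be computed directly from all treated-control pairs. The sketch below shows only the unadjusted statistic; it does not implement the covariate adjustment of Satten et al. (2018) or the sensitivity interval proposed by the authors. The function name and toy samples are hypothetical.

```python
def mann_whitney_probability(treated, control):
    """Estimate P(Y_treated > Y_control) + 0.5 * P(tie) over all pairs."""
    if not treated or not control:
        raise ValueError("both samples must be non-empty")
    score = 0.0
    for yt in treated:
        for yc in control:
            if yt > yc:
                score += 1.0   # treated outcome wins the pair
            elif yt == yc:
                score += 0.5   # ties count as half a win
    return score / (len(treated) * len(control))


# Toy samples, made up for illustration only.
treated = [3.1, 4.0, 5.2]
control = [2.0, 3.1, 4.5]
print(mann_whitney_probability(treated, control))
```

A value of 0.5 corresponds to no tendency for either group to have larger outcomes; in an observational study this raw comparison is confounded, which is the motivation for the adjusted version.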
Evaluation of Language Training Programs in Luxembourg using Principal Stratification
Pub Date : 2022-06-04 DOI: 10.2139/ssrn.3538309
Michela Bia, Alfonso Flores-Lagunes, Andrea Mercatanti
Abstract:In a world increasingly globalized, multiple language skills can create more employment opportunities. Several countries include language training programs in active labor market programs for the unemployed. We analyze the effects of a language training program on the re-employment probability and hourly wages simultaneously, using high-quality administrative data from Luxembourg. We address selection into training with an unconfoundedness assumption and account for the complication that wages are “truncated” by unemployment by adopting a principal stratification framework. Estimation is undertaken with a mixture model likelihood-based approach. To improve inference, we use the individual’s hours worked as a secondary outcome and a stochastic dominance assumption. These two features considerably ameliorate the multimodality problem commonly encountered in mixture models. We also conduct a sensitivity analysis to assess the unconfoundedness assumption. Our results suggest a positive effect (of up to 12.7 percent) of the language training programs on the re-employment probability, but no effects on wages for those who are observed employed regardless of training participation.
Observational studies, vol. 8, no. 1, pp. 1-44. Journal Article.
Citations: 1
gesttools: General Purpose G-Estimation in R
Pub Date : 2022-06-04 DOI: 10.1353/obs.2022.0003
Daniel Tompsett, S. Vansteelandt, O. Dukes, B. D. De Stavola
Abstract: In this paper we present gesttools, a series of general purpose, user-friendly functions with which to perform g-estimation of structural nested mean models (SNMMs) for time-varying exposures and outcomes in R. The package implements the g-estimation methods found in Vansteelandt and Sjolander (2016) and Dukes and Vansteelandt (2018), and is capable of analysing both end-of-study and time-varying outcome data that are either binary or continuous, or exposure variables that are either binary, continuous, or categorical. It also allows for the fitting of SNMMs with time-varying causal effects, effect modification by other variables, or both, as well as support for censored data using inverse weighting. We outline the theory underpinning these methods, as well as describing the SNMMs that can be fitted by the software. The package is demonstrated using simulated and real-world-inspired datasets.
Observational studies, vol. 8, no. 1, pp. 1-28. Journal Article.
Citations: 1
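The idea of g-estimation referred to in the abstract above can be illustrated in its simplest special case: a single binary exposure, a continuous outcome, and a one-parameter additive structural mean model. The sketch below is not the gesttools API (which is an R package and handles time-varying exposures); it assumes the exposure probabilities `e` = P(a = 1 | covariates) are supplied by the user, and the function name `g_estimate` is hypothetical.

```python
def g_estimate(y, a, e):
    """Point-exposure g-estimator for the model Y - psi * A.

    Solves the estimating equation
        sum_i (a_i - e_i) * (y_i - psi * a_i) = 0
    for psi, where e_i is the (assumed known) probability that a_i = 1.
    """
    num = sum((ai - ei) * yi for yi, ai, ei in zip(y, a, e))
    den = sum((ai - ei) * ai for ai, ei in zip(a, e))
    if den == 0:
        raise ValueError("estimating equation has no unique solution")
    return num / den


# Toy data, made up for illustration only.
y = [3.0, 1.0, 4.0, 2.0]
a = [1, 0, 1, 0]
e = [0.5, 0.5, 0.5, 0.5]
print(g_estimate(y, a, e))
```

With a constant exposure probability of 0.5 and equal group sizes, the estimate reduces to the simple difference in means, which is a quick sanity check on the estimating equation.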
Causal Inference Challenges with Interrupted Time Series Designs: An Evaluation of an Assault Weapons Ban in California
Pub Date : 2022-04-26 DOI: 10.1353/obs.0.0001
R. Berk
The interrupted time series design was introduced to social scientists in 1963 by Campbell and Stanley, analysis methods were proposed by Box and Tiao in 1975, and more recent treatments are easily found (Box et al., 2016). Despite its popularity, current results in statistics reveal fundamental oversights in the standard statistical methods employed. Adaptive model selection built into recommended practice causes challenging problems for post-model-selection-inference. What one might call model cherry picking can invalidate conventional statistical inference, statistical tests and confidence intervals with damaging consequences for causal inference. There are technical developments that can correct for these problems, but these remedies raise conceptual difficulties for causal inference when proper estimands are defined. The issues are illustrated with an analysis of the impact of an assault weapons ban on daily handgun sales in California from 1996 through 2018. Statistically valid regression functionals are obtained, but their causal meaning is unclear. Researchers might be best served by interpreting only the sign of such functionals.
Observational studies, vol. 1, no. 1 (pages not listed). Journal Article.
Citations: 1
Posterior Predictive Propensity Scores and p-Values
Pub Date : 2022-02-16 DOI: 10.1353/obs.2023.0015
Peng Ding, Tianyu Guo
Abstract: Rosenbaum and Rubin (1983) introduced the notion of the propensity score and discussed its central role in causal inference with observational studies. Their paper, however, caused a fundamental incoherence with an earlier paper by Rubin (1978), which showed that the propensity score does not play any role in the Bayesian analysis of unconfounded observational studies if the priors on the propensity score and outcome models are independent. Despite the serious efforts made in the literature, it is generally difficult to reconcile these contradicting results. We offer a simple approach to incorporating the propensity score in Bayesian causal inference based on the posterior predictive p-value. To motivate a simple procedure, we focus on the model with the strong null hypothesis of no causal effects for any units whatsoever. Computationally, the proposed posterior predictive p-value equals the classic p-value based on the Fisher randomization test averaged over the posterior predictive distribution of the propensity score. Moreover, using the studentized doubly robust estimator as the test statistic, the proposed p-value inherits the doubly robust property and is also asymptotically valid for testing the weak null hypothesis of zero average causal effect. Perhaps surprisingly, this Bayesianly motivated p-value can have better frequentist finite-sample performance than the frequentist p-value based on the asymptotic approximation, especially when the propensity scores can take extreme values.
Observational studies, vol. 9, no. 1, pp. 18 - 3 (as listed). Journal Article.
Citations: 1
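The computational recipe in the abstract above (a Fisher randomization p-value averaged over draws of the propensity score) can be sketched as follows. This is a simplified illustration, not the authors' procedure: it uses a plain absolute difference in means rather than the studentized doubly robust statistic, it assumes the draws `e_draws` come from some Bayesian propensity score model fitted elsewhere, and all function names are hypothetical.

```python
import random


def abs_mean_difference(y, z):
    """Absolute difference in mean outcome between z = 1 and z = 0."""
    treated = [yi for yi, zi in zip(y, z) if zi == 1]
    control = [yi for yi, zi in zip(y, z) if zi == 0]
    return abs(sum(treated) / len(treated) - sum(control) / len(control))


def frt_pvalue(y, z, e, n_sim, rng):
    """Monte Carlo randomization p-value under the sharp null of no effect,
    drawing each unit's assignment independently as Bernoulli(e_i)."""
    if not all(0.0 < ei < 1.0 for ei in e):
        raise ValueError("propensity scores must lie strictly in (0, 1)")
    observed = abs_mean_difference(y, z)
    extreme = 0
    done = 0
    while done < n_sim:
        z_sim = [1 if rng.random() < ei else 0 for ei in e]
        if sum(z_sim) in (0, len(z_sim)):
            continue  # statistic undefined when one group is empty; redraw
        done += 1
        if abs_mean_difference(y, z_sim) >= observed:
            extreme += 1
    return extreme / n_sim


def averaged_pvalue(y, z, e_draws, n_sim=200, seed=0):
    """Average the randomization p-value over draws of the propensity scores."""
    rng = random.Random(seed)
    pvals = [frt_pvalue(y, z, e, n_sim, rng) for e in e_draws]
    return sum(pvals) / len(pvals)


# Toy data and two hypothetical propensity score draws.
y = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0]
z = [1, 1, 1, 0, 0, 0]
e_draws = [[0.5] * 6, [0.6, 0.6, 0.6, 0.4, 0.4, 0.4]]
print(averaged_pvalue(y, z, e_draws))
```

Under the sharp null the outcomes are fixed, so only the assignment vector is re-drawn; the propensity score enters solely through the assignment mechanism used for the re-draws.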
Melting together prediction and inference
Pub Date : 2021-10-04 DOI: 10.1353/obs.2021.0035
A. Daoud, Devdatt P. Dubhashi
Abstract: In Leo Breiman's influential article "Statistical modeling: the two cultures" he identified two cultures for statistical practices. The data modeling culture (DMC) denotes practices tailored for statistical inference targeting a quantity of interest, [inline-graphic 01]. The algorithmic modeling culture (AMC) refers to practices defining an algorithm, or a machine-learning (ML) procedure, that generates accurate predictions about an outcome of interest, [inline-graphic 02]. Although DMC was the dominant mode, Breiman argued that statisticians should give more attention to AMC. Twenty years later, energized by two revolutions (one in data science and one in causal inference), a hybrid modeling culture (HMC) is rising. HMC fuses the inferential strength of DMC and the predictive power of AMC with the goal of analyzing cause and effect, and thus HMC's quantity of interest is the causal effect, [inline-graphic 03]. In combining inference and prediction, the result of HMC practices is that the distinction between prediction and inference, taken to its limit, melts away. While this hybrid culture does not occupy the default mode of scientific practices, we argue that it offers an intriguing novel path for applied sciences.
Observational studies, vol. 7, no. 1, pp. 1-7. Journal Article.
Citations: 5
Randomization Tests to Assess Covariate Balance When Designing and Analyzing Matched Datasets
Pub Date : 2021-09-09 DOI: 10.1353/obs.2021.0031
Zach Branson
Abstract: Causal analyses for observational studies are often complicated by covariate imbalances among treatment groups, and matching methodologies alleviate this complication by finding subsets of treatment groups that exhibit covariate balance. It is widely agreed upon that covariate balance can serve as evidence that a matched dataset approximates a randomized experiment, but what kind of experiment does a matched dataset approximate? In this work, we develop a randomization test for the hypothesis that a matched dataset approximates a particular experimental design, such as complete randomization, block randomization, or rerandomization. Our test can incorporate any experimental design, and it allows for a graphical display that puts several designs on the same univariate scale, thereby allowing researchers to pinpoint which design, if any, is most appropriate for a matched dataset. After researchers determine a plausible design, we recommend a randomization-based approach for analyzing the matched data, which can incorporate any design and treatment effect estimator. Through simulation, we find that our test can frequently detect violations of randomized assignment that harm inferential results. Furthermore, through simulation and a real application in political science, we find that matched datasets with high levels of covariate balance tend to approximate balance-constrained designs like rerandomization, and analyzing them as such can lead to precise causal analyses. However, assuming a precise design should be done with caution, because it can harm inferential results if there are still substantial biases due to remaining imbalances after matching. Our approach is implemented in the randChecks R package, available on CRAN.
Journal: Observational studies, vol. 7, no. 1, pp. 1-36. Published 2021-09-09. Journal Article. https://doi.org/10.1353/obs.2021.0031
Citations: 10
Journal
Observational studies