Double machine learning and automated confounder selection: A cautionary tale
Paul Hünermund, Beyers Louw, Itamar Caspi
Journal of Causal Inference, published 2021-08-25. DOI: 10.1515/jci-2022-0078
Abstract: Double machine learning (DML) has become an increasingly popular tool for automated variable selection in high-dimensional settings. Although the ability to handle a large number of potential covariates can make selection-on-observables assumptions more plausible, it also increases the risk that endogenous variables are included, which would violate conditional independence. This article demonstrates that DML is very sensitive to the inclusion of only a few “bad controls” in the covariate space. The resulting bias varies with the nature of the theoretical causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way.
Adaptive normalization for IPW estimation
Samir Khan, J. Ugander
Journal of Causal Inference, published 2021-06-14. DOI: 10.1515/jci-2022-0019
Abstract: Inverse probability weighting (IPW) is a general tool in survey sampling and causal inference, used in both Horvitz–Thompson estimators, which normalize by the sample size, and Hájek/self-normalized estimators, which normalize by the sum of the inverse probability weights. In this work, we study a family of IPW estimators, first proposed by Trotter and Tukey in the context of Monte Carlo problems, that are normalized by an affine combination of the sample size and a sum of inverse weights. We show how selecting an estimator from this family in a data-dependent way to minimize asymptotic variance leads to an iterative procedure that converges to an estimator with connections to regression control methods. We refer to such estimators as adaptively normalized estimators. For mean estimation in survey sampling, the adaptively normalized estimator has asymptotic variance that is never worse than the Horvitz–Thompson and Hájek estimators. Going further, we show that adaptive normalization can be used to propose improvements of the augmented IPW (AIPW) estimator, average treatment effect (ATE) estimators, and policy learning objectives. Appealingly, these proposals preserve both the asymptotic efficiency of AIPW and the regret bounds for policy learning with IPW objectives, and deliver consistent finite sample improvements in simulations for all three of mean estimation, ATE estimation, and policy learning.
Causal effect on a target population: A sensitivity analysis to handle missing covariates
B. Colnet, J. Josse, G. Varoquaux, Erwan Scornet
Journal of Causal Inference, published 2021-05-13, pp. 372–414. DOI: 10.1515/jci-2021-0059
Abstract: Randomized controlled trials (RCTs) are often considered the gold standard for estimating causal effects, but they may lack external validity when the population eligible for the RCT differs substantially from the target population. Having at hand a sample of the target population of interest allows us to generalize the causal effect. Identifying the treatment effect in the target population requires covariates that capture all treatment effect modifiers whose distribution is shifted between the two sets. Standard estimators then use weighting (IPSW), outcome modeling (G-formula), or a combination of the two in doubly robust approaches (AIPSW). However, such covariates are often not available in both sets. In this article, after proving L¹-consistency of these three estimators, we compute the expected bias induced by a missing covariate, assuming a Gaussian distribution, a continuous outcome, and a semi-parametric model. Under this setting, we perform a sensitivity analysis for each missing-covariate pattern and compute the sign of the expected bias. We also show that there is no gain in linearly imputing a partially unobserved covariate. Finally, we study the substitution of a missing covariate by a proxy. We illustrate all these results on simulations, as well as on semi-synthetic benchmarks using data from the Tennessee student/teacher achievement ratio (STAR) experiment and a real-world example from critical care medicine.
Precise unbiased estimation in randomized experiments using auxiliary observational data
Johann A. Gagnon-Bartsch, Adam C. Sales, Edward Wu, Anthony F. Botelho, John A. Erickson, Luke W. Miratrix, N. Heffernan
Journal of Causal Inference, published 2021-05-07. DOI: 10.1515/jci-2022-0011
Abstract: Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
Simple yet sharp sensitivity analysis for unmeasured confounding
J. Peña
Journal of Causal Inference, published 2021-04-27, pp. 1–17. DOI: 10.1515/jci-2021-0041
Abstract: We present a method for assessing the sensitivity of the true causal effect to unmeasured confounding. The method requires the analyst to set two intuitive parameters; otherwise, it is assumption free. The method returns an interval that contains the true causal effect and whose bounds are arbitrarily sharp, i.e., practically attainable. We show experimentally that our bounds can be tighter than those obtained by the method of Ding and VanderWeele, which moreover requires setting one more parameter than our method. Finally, we extend our method to bound the natural direct and indirect effects when there are measured mediators and unmeasured exposure–outcome confounding.
Causality and independence in perfectly adapted dynamical systems
Tineke Blom, J. Mooij
Journal of Causal Inference, published 2021-01-28. DOI: 10.1515/jci-2021-0005
Abstract: Perfect adaptation in a dynamical system is the phenomenon that one or more variables have an initial transient response to a persistent change in an external stimulus but revert to their original value as the system converges to equilibrium. With the help of the causal ordering algorithm, one can construct graphical representations of dynamical systems that represent the causal relations between the variables and the conditional independences in the equilibrium distribution. We apply these tools to formulate sufficient graphical conditions for identifying perfect adaptation from a set of first-order differential equations. Furthermore, we give sufficient conditions to test for the presence of perfect adaptation in experimental equilibrium data. We apply this method to a simple model for a protein signalling pathway and test its predictions both in simulations and on real-world protein expression data. We demonstrate that perfect adaptation can lead to misleading orientation of edges in the output of causal discovery algorithms.
Popular IV Designs
Journal of Causal Inference, published 2021-01-26. DOI: 10.12987/9780300255881-031