Double machine learning and automated confounder selection: A cautionary tale
Paul Hünermund, Beyers Louw, Itamar Caspi
Journal of Causal Inference, published 2021-08-25. DOI: 10.1515/jci-2022-0078
Abstract: Double machine learning (DML) has become an increasingly popular tool for automated variable selection in high-dimensional settings. Although the ability to handle a large number of potential covariates can make selection-on-observables assumptions more plausible, it also increases the risk that endogenous variables are included, which would violate conditional independence. This article demonstrates that DML is very sensitive to the inclusion of only a few “bad controls” in the covariate space. The resulting bias varies with the nature of the theoretical causal model, which raises concerns about the feasibility of selecting control variables in a data-driven way.
Adaptive normalization for IPW estimation
Samir Khan, J. Ugander
Journal of Causal Inference, published 2021-06-14. DOI: 10.1515/jci-2022-0019
Abstract: Inverse probability weighting (IPW) is a general tool in survey sampling and causal inference, used in both Horvitz–Thompson estimators, which normalize by the sample size, and Hájek/self-normalized estimators, which normalize by the sum of the inverse probability weights. In this work, we study a family of IPW estimators, first proposed by Trotter and Tukey in the context of Monte Carlo problems, that are normalized by an affine combination of the sample size and a sum of inverse weights. We show how selecting an estimator from this family in a data-dependent way to minimize asymptotic variance leads to an iterative procedure that converges to an estimator with connections to regression control methods. We refer to such estimators as adaptively normalized estimators. For mean estimation in survey sampling, the adaptively normalized estimator has asymptotic variance that is never worse than the Horvitz–Thompson and Hájek estimators. Going further, we show that adaptive normalization can be used to propose improvements of the augmented IPW (AIPW) estimator, average treatment effect (ATE) estimators, and policy learning objectives. Appealingly, these proposals preserve both the asymptotic efficiency of AIPW and the regret bounds for policy learning with IPW objectives, and deliver consistent finite sample improvements in simulations for all three of mean estimation, ATE estimation, and policy learning.
Causal effect on a target population: A sensitivity analysis to handle missing covariates
B. Colnet, J. Josse, G. Varoquaux, Erwan Scornet
Journal of Causal Inference, published 2021-05-13, pp. 372–414. DOI: 10.1515/jci-2021-0059
Abstract: Randomized controlled trials (RCTs) are often considered the gold standard for estimating causal effects, but they may lack external validity when the population eligible for the RCT differs substantially from the target population. Having at hand a sample of the target population of interest allows us to generalize the causal effect. Identifying the treatment effect in the target population requires covariates that capture all treatment effect modifiers whose distribution is shifted between the two sets. Standard estimators then use weighting (IPSW), outcome modeling (G-formula), or a combination of the two in doubly robust approaches (AIPSW). However, such covariates are often not available in both sets. In this article, after proving L¹-consistency of these three estimators, we compute the expected bias induced by a missing covariate, assuming a Gaussian distribution, a continuous outcome, and a semi-parametric model. Under this setting, we perform a sensitivity analysis for each missing-covariate pattern and compute the sign of the expected bias. We also show that there is no gain in linearly imputing a partially unobserved covariate. Finally, we study the substitution of a missing covariate by a proxy. We illustrate all these results on simulations, as well as on semi-synthetic benchmarks using data from the Tennessee student/teacher achievement ratio (STAR) experiment and a real-world example from critical care medicine.
Precise unbiased estimation in randomized experiments using auxiliary observational data
Johann A. Gagnon-Bartsch, Adam C. Sales, Edward Wu, Anthony F. Botelho, John A. Erickson, Luke W. Miratrix, N. Heffernan
Journal of Causal Inference, published 2021-05-07. DOI: 10.1515/jci-2022-0011
Abstract: Randomized controlled trials (RCTs) admit unconfounded design-based inference – randomization largely justifies the assumptions underlying statistical effect estimates – but often have limited sample sizes. However, researchers may have access to big observational data on covariates and outcomes from RCT nonparticipants. For example, data from A/B tests conducted within an educational technology platform exist alongside historical observational data drawn from student logs. We outline a design-based approach to using such observational data for variance reduction in RCTs. First, we use the observational data to train a machine learning algorithm predicting potential outcomes using covariates and then use that algorithm to generate predictions for RCT participants. Then, we use those predictions, perhaps alongside other covariates, to adjust causal effect estimates with a flexible, design-based covariate-adjustment routine. In this way, there is no danger of biases from the observational data leaking into the experimental estimates, which are guaranteed to be exactly unbiased regardless of whether the machine learning models are “correct” in any sense or whether the observational samples closely resemble RCT samples. We demonstrate the method in analyzing 33 randomized A/B tests and show that it decreases standard errors relative to other estimators, sometimes substantially.
Simple yet sharp sensitivity analysis for unmeasured confounding
J. Peña
Journal of Causal Inference, published 2021-04-27, pp. 1–17. DOI: 10.1515/jci-2021-0041
Abstract: We present a method for assessing the sensitivity of the true causal effect to unmeasured confounding. The method requires the analyst to set two intuitive parameters; otherwise, it is assumption free. The method returns an interval that contains the true causal effect and whose bounds are arbitrarily sharp, i.e., practically attainable. We show experimentally that our bounds can be tighter than those obtained by the method of Ding and VanderWeele, which moreover requires setting one more parameter than our method. Finally, we extend our method to bound the natural direct and indirect effects when there are measured mediators and unmeasured exposure–outcome confounding.
Causality and independence in perfectly adapted dynamical systems
Tineke Blom, J. Mooij
Journal of Causal Inference, published 2021-01-28. DOI: 10.1515/jci-2021-0005
Abstract: Perfect adaptation in a dynamical system is the phenomenon that one or more variables have an initial transient response to a persistent change in an external stimulus but revert to their original value as the system converges to equilibrium. With the help of the causal ordering algorithm, one can construct graphical representations of dynamical systems that represent the causal relations between the variables and the conditional independences in the equilibrium distribution. We apply these tools to formulate sufficient graphical conditions for identifying perfect adaptation from a set of first-order differential equations. Furthermore, we give sufficient conditions to test for the presence of perfect adaptation in experimental equilibrium data. We apply this method to a simple model for a protein signalling pathway and test its predictions both in simulations and on real-world protein expression data. We demonstrate that perfect adaptation can lead to misleading orientation of edges in the output of causal discovery algorithms.
Popular IV Designs
Journal of Causal Inference, published 2021-01-26. DOI: 10.12987/9780300255881-031