Abstract Matching in observational studies faces complications when units enroll in treatment on a rolling basis. While each treated unit has a specific time of entry into the study, control units each have many possible comparison, or “pseudo-treatment,” times. Valid inference must account for correlations between repeated measures for a single unit, and researchers must decide how flexibly to match across time and units. We provide three important innovations. First, we introduce a new matched design, GroupMatch with instance replacement, allowing maximum flexibility in control selection. This new design searches over all possible comparison times for each treated-control pairing and is more amenable to analysis than past methods. Second, we propose a block bootstrap approach for inference in matched designs with rolling enrollment and demonstrate that it accounts properly for complex correlations across matched sets in our new design and several other contexts. Third, we develop a falsification test to detect violations of the timepoint agnosticism assumption, which is needed to permit flexible matching across time. We demonstrate the practical value of these tools via simulations and a case study of the impact of short-term injuries on batting performance in major league baseball.
{"title":"Robust inference for matching under rolling enrollment","authors":"Amanda K. Glazer, Samuel D. Pimentel","doi":"10.1515/jci-2022-0055","DOIUrl":"https://doi.org/10.1515/jci-2022-0055","url":null,"abstract":"Abstract Matching in observational studies faces complications when units enroll in treatment on a rolling basis. While each treated unit has a specific time of entry into the study, control units each have many possible comparison, or “pseudo-treatment,” times. Valid inference must account for correlations between repeated measures for a single unit, and researchers must decide how flexibly to match across time and units. We provide three important innovations. First, we introduce a new matched design, GroupMatch with instance replacement, allowing maximum flexibility in control selection. This new design searches over all possible comparison times for each treated-control pairing and is more amenable to analysis than past methods. Second, we propose a block bootstrap approach for inference in matched designs with rolling enrollment and demonstrate that it accounts properly for complex correlations across matched sets in our new design and several other contexts. Third, we develop a falsification test to detect violations of the timepoint agnosticism assumption, which is needed to permit flexible matching across time. We demonstrate the practical value of these tools via simulations and a case study of the impact of short-term injuries on batting performance in major league baseball.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"4 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2022-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79025660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Treatment effect estimates are often available from randomized controlled trials as a single average treatment effect for a certain patient population. Estimates of the conditional average treatment effect (CATE) are more useful for individualized treatment decision-making, but randomized trials are often too small to estimate the CATE. Examples in medical literature make use of the relative treatment effect (e.g. an odds ratio) reported by randomized trials to estimate the CATE using large observational datasets. One approach to estimating these CATE models is by using the relative treatment effect as an offset, while estimating the covariate-specific untreated risk. We observe that the odds ratios reported in randomized controlled trials are not the odds ratios that are needed in offset models because trials often report the marginal odds ratio. We introduce a constraint or a regularizer to better use marginal odds ratios from randomized controlled trials and find that under the standard observational causal inference assumptions, this approach provides a consistent estimate of the CATE. Next, we show that the offset approach is not valid for CATE estimation in the presence of unobserved confounding. We study if the offset assumption and the marginal constraint lead to better approximations of the CATE relative to the alternative of using the average treatment effect estimate from the randomized trial. We empirically show that when the underlying CATE has sufficient variation, the constraint and offset approaches lead to closer approximations to the CATE.
{"title":"Conditional average treatment effect estimation with marginally constrained models","authors":"W. A. van Amsterdam, R. Ranganath","doi":"10.1515/jci-2022-0027","DOIUrl":"https://doi.org/10.1515/jci-2022-0027","url":null,"abstract":"Abstract Treatment effect estimates are often available from randomized controlled trials as a single average treatment effect for a certain patient population. Estimates of the conditional average treatment effect (CATE) are more useful for individualized treatment decision-making, but randomized trials are often too small to estimate the CATE. Examples in medical literature make use of the relative treatment effect (e.g. an odds ratio) reported by randomized trials to estimate the CATE using large observational datasets. One approach to estimating these CATE models is by using the relative treatment effect as an offset, while estimating the covariate-specific untreated risk. We observe that the odds ratios reported in randomized controlled trials are not the odds ratios that are needed in offset models because trials often report the marginal odds ratio. We introduce a constraint or a regularizer to better use marginal odds ratios from randomized controlled trials and find that under the standard observational causal inference assumptions, this approach provides a consistent estimate of the CATE. Next, we show that the offset approach is not valid for CATE estimation in the presence of unobserved confounding. We study if the offset assumption and the marginal constraint lead to better approximations of the CATE relative to the alternative of using the average treatment effect estimate from the randomized trial. We empirically show that when the underlying CATE has sufficient variation, the constraint and offset approaches lead to closer approximations to the CATE.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"19 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2022-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90261544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Statistical power is often a concern for clustered randomized control trials (RCTs) due to variance inflation from design effects and the high cost of adding study clusters (such as hospitals, schools, or communities). While covariate pre-specification can improve power for estimating regression-adjusted average treatment effects (ATEs), further precision gains can be achieved through covariate selection once primary outcomes have been collected. This article uses design-based methods underlying clustered RCTs to develop Lasso methods for the post-hoc selection of covariates for ATE estimation that avoids a lack of transparency and model overfitting. Our focus is on two-stage estimators: in the first stage, Lasso estimation is conducted using data on cluster-level averages or sums, and in the second stage, standard ATE estimators are adjusted for covariates using the first-stage Lasso results. We discuss $l_{1}$ consistency of the estimated Lasso coefficients, asymptotic normality of the ATE estimators, and design-based variance estimation. The nonparametric approach applies to continuous, binary, and discrete outcomes. We present simulation results and demonstrate the method using data from a federally funded clustered RCT testing the effects of school-based programs promoting behavioral health.
{"title":"A Lasso approach to covariate selection and average treatment effect estimation for clustered RCTs using design-based methods","authors":"Peter Z. Schochet","doi":"10.1515/jci-2021-0036","DOIUrl":"https://doi.org/10.1515/jci-2021-0036","url":null,"abstract":"Abstract Statistical power is often a concern for clustered randomized control trials (RCTs) due to variance inflation from design effects and the high cost of adding study clusters (such as hospitals, schools, or communities). While covariate pre-specification can improve power for estimating regression-adjusted average treatment effects (ATEs), further precision gains can be achieved through covariate selection once primary outcomes have been collected. This article uses design-based methods underlying clustered RCTs to develop Lasso methods for the post-hoc selection of covariates for ATE estimation that avoids a lack of transparency and model overfitting. Our focus is on two-stage estimators: in the first stage, Lasso estimation is conducted using data on cluster-level averages or sums, and in the second stage, standard ATE estimators are adjusted for covariates using the first-stage Lasso results. We discuss $l_{1}$ consistency of the estimated Lasso coefficients, asymptotic normality of the ATE estimators, and design-based variance estimation. The nonparametric approach applies to continuous, binary, and discrete outcomes. We present simulation results and demonstrate the method using data from a federally funded clustered RCT testing the effects of school-based programs promoting behavioral health.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"9 1","pages":"494 - 514"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84301133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mediation analysis has been used in many disciplines to explain the mechanism or process that underlies an observed relationship between an exposure variable and an outcome variable via the inclusion of mediators. Decompositions of the total effect (TE) of an exposure variable into effects characterizing mediation pathways and interactions have gained an increasing amount of interest in the last decade. In this work, we develop decompositions for scenarios where two mediators are causally sequential or non-sequential. Current developments in this area have primarily focused on either decompositions without interaction components or with interactions but assuming no causally sequential order between the mediators. We propose a new concept called natural mediated interaction (MI) effect that captures the two-way and three-way interactions for both scenarios and extends the two-way MIs in the literature. We develop a unified approach for decomposing the TE into the effects that are due to mediation only, interaction only, both mediation and interaction, neither mediation nor interaction within the counterfactual framework. Finally, we compare our proposed decomposition to an existing method in a non-sequential two-mediator scenario using simulated data, and illustrate the proposed decomposition for a sequential two-mediator scenario using a real data analysis.
{"title":"Decomposition of the total effect for two mediators: A natural mediated interaction effect framework.","authors":"Xin Gao, Li Li, Li Luo","doi":"10.1515/jci-2020-0017","DOIUrl":"https://doi.org/10.1515/jci-2020-0017","url":null,"abstract":"Mediation analysis has been used in many disciplines to explain the mechanism or process that underlies an observed relationship between an exposure variable and an outcome variable via the inclusion of mediators. Decompositions of the total effect (TE) of an exposure variable into effects characterizing mediation pathways and interactions have gained an increasing amount of interest in the last decade. In this work, we develop decompositions for scenarios where two mediators are causally sequential or non-sequential. Current developments in this area have primarily focused on either decompositions without interaction components or with interactions but assuming no causally sequential order between the mediators. We propose a new concept called natural mediated interaction (MI) effect that captures the two-way and three-way interactions for both scenarios and extends the two-way MIs in the literature. We develop a unified approach for decomposing the TE into the effects that are due to mediation only, interaction only, both mediation and interaction, neither mediation nor interaction within the counterfactual framework. Finally, we compare our proposed decomposition to an existing method in a non-sequential two-mediator scenario using simulated data, and illustrate the proposed decomposition for a sequential two-mediator scenario using a real data analysis.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"10 1","pages":"18-44"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9139468/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10600650","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Residual confounding is a common source of bias in observational studies. In this article, we build upon a series of sensitivity analyses methods for residual confounding developed by Brumback et al. and Chiba whose sensitivity parameters are constructed to quantify deviation from conditional exchangeability, given measured confounders. These sensitivity parameters are combined with the observed data to produce a “bias-corrected” estimate of the causal effect of interest. We provide important generalizations of these sensitivity analyses, by allowing for arbitrary exposures and a wide range of different causal effect measures, through the specification of the target causal effect as a parameter in a generalized linear model with the arbitrary link function. We show how our generalized sensitivity analysis can be easily implemented with standard software, and how its sensitivity parameters can be calibrated against measured confounders. We demonstrate our sensitivity analysis with an application to publicly available data from a cohort study of behavior patterns and coronary heart disease.
{"title":"Sensitivity analysis for causal effects with generalized linear models","authors":"A. Sjölander, E. Gabriel, I. Ciocănea-Teodorescu","doi":"10.1515/jci-2022-0040","DOIUrl":"https://doi.org/10.1515/jci-2022-0040","url":null,"abstract":"Abstract Residual confounding is a common source of bias in observational studies. In this article, we build upon a series of sensitivity analyses methods for residual confounding developed by Brumback et al. and Chiba whose sensitivity parameters are constructed to quantify deviation from conditional exchangeability, given measured confounders. These sensitivity parameters are combined with the observed data to produce a “bias-corrected” estimate of the causal effect of interest. We provide important generalizations of these sensitivity analyses, by allowing for arbitrary exposures and a wide range of different causal effect measures, through the specification of the target causal effect as a parameter in a generalized linear model with the arbitrary link function. We show how our generalized sensitivity analysis can be easily implemented with standard software, and how its sensitivity parameters can be calibrated against measured confounders. We demonstrate our sensitivity analysis with an application to publicly available data from a cohort study of behavior patterns and coronary heart disease.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"60 1","pages":"441 - 479"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73798688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract I thank Judea Pearl for his discussion of my paper and respond to the points he raises. In particular, his attachment to unaugmented directed acyclic graphs has led to a misapprehension of my own proposals. I also discuss the possibilities for developing a non-manipulative understanding of causality.
{"title":"Decision-theoretic foundations for statistical causality: Response to Pearl","authors":"P. Dawid","doi":"10.1515/jci-2022-0056","DOIUrl":"https://doi.org/10.1515/jci-2022-0056","url":null,"abstract":"Abstract I thank Judea Pearl for his discussion of my paper and respond to the points he raises. In particular, his attachment to unaugmented directed acyclic graphs has led to a misapprehension of my own proposals. I also discuss the possibilities for developing a non-manipulative understanding of causality.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"1 1","pages":"296 - 299"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90808820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comment on: “Decision-theoretic foundations for statistical causality”","authors":"I. Shpitser","doi":"10.1515/jci-2021-0056","DOIUrl":"https://doi.org/10.1515/jci-2021-0056","url":null,"abstract":"","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"213 1","pages":"190 - 196"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85875904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract In a recent issue of this journal, Philip Dawid (2021) proposes a framework for causal inference that is based on statistical decision theory and that is, in many aspects, compatible with the familiar framework of causal graphs (e.g., Directed Acyclic Graphs (DAGs)). This editorial compares the methodological features of the two frameworks as well as their epistemological basis.
{"title":"Causation and decision: On Dawid’s “Decision theoretic foundation of statistical causality”","authors":"J. Pearl","doi":"10.1515/jci-2022-0046","DOIUrl":"https://doi.org/10.1515/jci-2022-0046","url":null,"abstract":"Abstract In a recent issue of this journal, Philip Dawid (2021) proposes a framework for causal inference that is based on statistical decision theory and that is, in many aspects, compatible with the familiar framework of causal graphs (e.g., Directed Acyclic Graphs (DAGs)). This editorial compares the methodological features of the two frameworks as well as their epistemological basis.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"20 1","pages":"221 - 226"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87835946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract Randomized controlled trials (RCTs) sometimes test interventions that aim to improve existing services targeted to a subset of individuals identified after randomization. Accordingly, the treatment could affect the composition of service recipients and the offered services. With such bias, intention-to-treat estimates using data on service recipients and nonrecipients may be difficult to interpret. This article develops causal estimands and inverse probability weighting (IPW) estimators for complier populations in these settings, using a generalized estimating equation approach that adjusts the standard errors for estimation error in the IPW weights. While our focus is on more general clustered RCTs, the methods also apply (reduce) to nonclustered RCTs. Simulations show that the estimators achieve nominal confidence interval coverage under the assumed identification conditions. An empirical application demonstrates the methods using data from a large-scale RCT testing the effects of early childhood services on children’s cognitive development scores. An R program for estimation is available for download.
{"title":"Estimating complier average causal effects for clustered RCTs when the treatment affects the service population","authors":"Peter Z. Schochet","doi":"10.1515/jci-2022-0033","DOIUrl":"https://doi.org/10.1515/jci-2022-0033","url":null,"abstract":"Abstract Randomized controlled trials (RCTs) sometimes test interventions that aim to improve existing services targeted to a subset of individuals identified after randomization. Accordingly, the treatment could affect the composition of service recipients and the offered services. With such bias, intention-to-treat estimates using data on service recipients and nonrecipients may be difficult to interpret. This article develops causal estimands and inverse probability weighting (IPW) estimators for complier populations in these settings, using a generalized estimating equation approach that adjusts the standard errors for estimation error in the IPW weights. While our focus is on more general clustered RCTs, the methods also apply (reduce) to nonclustered RCTs. Simulations show that the estimators achieve nominal confidence interval coverage under the assumed identification conditions. An empirical application demonstrates the methods using data from a large-scale RCT testing the effects of early childhood services on children’s cognitive development scores. An R program for estimation is available for download.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"664 1","pages":"300 - 334"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79033986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract The study of causal inference has seen recent momentum in machine learning and artificial intelligence (AI), particularly in the domains of transfer learning, reinforcement learning, automated diagnostics, and explainability (among others). Yet, despite its increasing application to address many of the boundaries in modern AI, causal topics remain absent in most AI curricula. This work seeks to bridge this gap by providing classroom-ready introductions that integrate into traditional topics in AI, suggests intuitive graphical tools for the application to both new and traditional lessons in probabilistic and causal reasoning, and presents avenues for instructors to impress the merit of climbing the “causal hierarchy” to address problems at the levels of associational, interventional, and counterfactual inference. Finally, this study shares anecdotal instructor experiences, successes, and challenges integrating these lessons at multiple levels of education.
{"title":"Causal inference in AI education: A primer","authors":"A. Forney, Scott Mueller","doi":"10.1515/jci-2021-0048","DOIUrl":"https://doi.org/10.1515/jci-2021-0048","url":null,"abstract":"Abstract The study of causal inference has seen recent momentum in machine learning and artificial intelligence (AI), particularly in the domains of transfer learning, reinforcement learning, automated diagnostics, and explainability (among others). Yet, despite its increasing application to address many of the boundaries in modern AI, causal topics remain absent in most AI curricula. This work seeks to bridge this gap by providing classroom-ready introductions that integrate into traditional topics in AI, suggests intuitive graphical tools for the application to both new and traditional lessons in probabilistic and causal reasoning, and presents avenues for instructors to impress the merit of climbing the “causal hierarchy” to address problems at the levels of associational, interventional, and counterfactual inference. Finally, this study shares anecdotal instructor experiences, successes, and challenges integrating these lessons at multiple levels of education.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"2015 1","pages":"141 - 173"},"PeriodicalIF":1.4,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87012898","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}