Novel bounds for causal effects based on sensitivity parameters on the risk difference scale
A. Sjölander, O. Hössjer
Abstract Unmeasured confounding is an important threat to the validity of observational studies. A common way to deal with unmeasured confounding is to compute bounds for the causal effect of interest, that is, a range of values that is guaranteed to include the true effect given the observed data. Recently, bounds have been proposed that are based on sensitivity parameters, which quantify the degree of unmeasured confounding on the risk ratio scale. These bounds can be used to compute an E-value, that is, the degree of confounding required to explain away an observed association on the risk ratio scale. We complement and extend this previous work by deriving analogous bounds based on sensitivity parameters on the risk difference scale. We show that our bounds can also be used to compute an E-value on the risk difference scale. We compare our novel bounds with previous bounds through a real data example and a simulation study.
{"title":"Novel bounds for causal effects based on sensitivity parameters on the risk difference scale","authors":"A. Sjölander, O. Hössjer","doi":"10.1515/jci-2021-0024","DOIUrl":"https://doi.org/10.1515/jci-2021-0024","url":null,"abstract":"Abstract Unmeasured confounding is an important threat to the validity of observational studies. A common way to deal with unmeasured confounding is to compute bounds for the causal effect of interest, that is, a range of values that is guaranteed to include the true effect, given the observed data. Recently, bounds have been proposed that are based on sensitivity parameters, which quantify the degree of unmeasured confounding on the risk ratio scale. These bounds can be used to compute an E-value, that is, the degree of confounding required to explain away an observed association, on the risk ratio scale. We complement and extend this previous work by deriving analogous bounds, based on sensitivity parameters on the risk difference scale. We show that our bounds can also be used to compute an E-value, on the risk difference scale. We compare our novel bounds with previous bounds through a real data example and a simulation study.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"191 1","pages":"190 - 210"},"PeriodicalIF":1.4,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83067085","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Instrumental variable regression via kernel maximum moment loss
Rui Zhang, M. Imaizumi, B. Schölkopf, Krikamol Muandet
Abstract We investigate a simple objective for nonlinear instrumental variable (IV) regression based on a kernelized conditional moment restriction known as a maximum moment restriction (MMR). The MMR objective is formulated by maximizing the interaction between the residual and functions of the instruments that belong to a unit ball in a reproducing kernel Hilbert space. First, this allows us to simplify IV regression into an empirical risk minimization problem, where the risk function depends on the reproducing kernel on the instrument and can be estimated by a U-statistic or V-statistic. Second, on the basis of this simplification, we are able to provide consistency and asymptotic normality results in both parametric and nonparametric settings. Finally, we provide easy-to-use IV regression algorithms with an efficient hyperparameter selection procedure. We demonstrate the effectiveness of our algorithms using experiments on both synthetic and real-world data.
{"title":"Instrumental variable regression via kernel maximum moment loss","authors":"Rui Zhang, M. Imaizumi, B. Scholkopf, Krikamol Muandet","doi":"10.1515/jci-2022-0073","DOIUrl":"https://doi.org/10.1515/jci-2022-0073","url":null,"abstract":"Abstract We investigate a simple objective for nonlinear instrumental variable (IV) regression based on a kernelized conditional moment restriction known as a maximum moment restriction (MMR). The MMR objective is formulated by maximizing the interaction between the residual and the instruments belonging to a unit ball in a reproducing kernel Hilbert space. First, it allows us to simplify the IV regression as an empirical risk minimization problem, where the risk function depends on the reproducing kernel on the instrument and can be estimated by a U-statistic or V-statistic. Second, on the basis this simplification, we are able to provide consistency and asymptotic normality results in both parametric and nonparametric settings. Finally, we provide easy-to-use IV regression algorithms with an efficient hyperparameter selection procedure. We demonstrate the effectiveness of our algorithms using experiments on both synthetic and real-world data.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"1 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2020-10-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83678311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Learning linear non-Gaussian graphical models with multidirected edges
Yiheng Liu, Elina Robeva, Huanqing Wang
Abstract In this article, we propose a new method to learn the underlying acyclic mixed graph of a linear non-Gaussian structural equation model from observational data. We build on an algorithm proposed by Wang and Drton, and we show that one can augment the hidden variable structure of the recovered model by learning multidirected edges rather than only directed and bidirected ones. Multidirected edges appear when more than two of the observed variables have a hidden common cause. We detect the presence of such hidden causes by looking at higher-order cumulants and exploiting the multi-trek rule. Our method recovers the correct structure when the underlying graph is a bow-free acyclic mixed graph with potential multidirected edges.
{"title":"Learning linear non-Gaussian graphical models with multidirected edges","authors":"Yiheng Liu, Elina Robeva, Huanqing Wang","doi":"10.1515/jci-2020-0027","DOIUrl":"https://doi.org/10.1515/jci-2020-0027","url":null,"abstract":"Abstract In this article, we propose a new method to learn the underlying acyclic mixed graph of a linear non-Gaussian structural equation model with given observational data. We build on an algorithm proposed by Wang and Drton, and we show that one can augment the hidden variable structure of the recovered model by learning multidirected edges rather than only directed and bidirected ones. Multidirected edges appear when more than two of the observed variables have a hidden common cause. We detect the presence of such hidden causes by looking at higher order cumulants and exploiting the multi-trek rule. Our method recovers the correct structure when the underlying graph is a bow-free acyclic mixed graph with potential multidirected edges.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"12 1","pages":"250 - 263"},"PeriodicalIF":1.4,"publicationDate":"2020-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84219125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Randomized graph cluster randomization
J. Ugander, Hao Yin
Abstract The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz–Thompson (HT) and Hájek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is often still prohibitively large. In this work, we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different clustering assignments, this design avoids a key problem with GCR, where the network exposure probability of a given node can be exponentially small under a single clustering. We propose two inherently randomized graph decomposition algorithms for use with RGCR designs, randomized 3-net and 1-hop-max, adapted from prior work on multiway graph cut problems and the probabilistic approximation of graph metrics. We also propose weighted extensions of these two algorithms with slight additional advantages. All of these algorithms result in network exposure probabilities that can be estimated efficiently. We derive structure-dependent upper bounds on the variance of the HT estimator of the GATE, depending on the metric structure of the graph driving the interference. Where the best-known such upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable upper bound under RGCR that is instead polynomial in the same parameters. We provide extensive simulations comparing RGCR and GCR designs, observing substantial improvements in GATE estimation in a variety of settings.
{"title":"Randomized graph cluster randomization","authors":"J. Ugander, Hao Yin","doi":"10.1515/jci-2022-0014","DOIUrl":"https://doi.org/10.1515/jci-2022-0014","url":null,"abstract":"Abstract The global average treatment effect (GATE) is a primary quantity of interest in the study of causal inference under network interference. With a correctly specified exposure model of the interference, the Horvitz–Thompson (HT) and Hájek estimators of the GATE are unbiased and consistent, respectively, yet known to exhibit extreme variance under many designs and in many settings of interest. With a fixed clustering of the interference graph, graph cluster randomization (GCR) designs have been shown to greatly reduce variance compared to node-level random assignment, but even so the variance is still often prohibitively large. In this work, we propose a randomized version of the GCR design, descriptively named randomized graph cluster randomization (RGCR), which uses a random clustering rather than a single fixed clustering. By considering an ensemble of many different clustering assignments, this design avoids a key problem with GCR where the network exposure probability of a given node can be exponentially small in a single clustering. We propose two inherently randomized graph decomposition algorithms for use with RGCR designs, randomized 3-net and 1-hop-max, adapted from the prior work on multiway graph cut problems and the probabilistic approximation of (graph) metrics. We also propose weighted extensions of these two algorithms with slight additional advantages. All these algorithms result in network exposure probabilities that can be estimated efficiently. We derive structure-dependent upper bounds on the variance of the HT estimator of the GATE, depending on the metric structure of the graph driving the interference. Where the best-known such upper bound for the HT estimator under a GCR design is exponential in the parameters of the metric structure, we give a comparable upper bound under RGCR that is instead polynomial in the same parameters. We provide extensive simulations comparing RGCR and GCR designs, observing substantial improvements in GATE estimation in a variety of settings.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"77 2-3 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2020-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78140869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Estimating causal effects with the neural autoregressive density estimator
Sergio Garrido, S. Borysov, Jeppe Rich, F. Pereira
Abstract The estimation of causal effects is fundamental in situations where the underlying system will be subject to active interventions. Part of building a causal inference engine is defining how variables relate to each other, that is, defining the functional relationships between variables entailed by the conditional dependencies of the graph. In this article, we deviate from the common assumption of linear relationships in causal models by making use of neural autoregressive density estimators, which we use to estimate causal effects within Pearl’s do-calculus framework. Using synthetic data, we show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables, and that it can provide confidence bands using the non-parametric bootstrap. We also explore scenarios that deviate from the ideal causal effect estimation setting, such as poor data support or unobserved confounders.
{"title":"Estimating causal effects with the neural autoregressive density estimator","authors":"Sergio Garrido, S. Borysov, Jeppe Rich, F. Pereira","doi":"10.1515/jci-2020-0007","DOIUrl":"https://doi.org/10.1515/jci-2020-0007","url":null,"abstract":"Abstract The estimation of causal effects is fundamental in situations where the underlying system will be subject to active interventions. Part of building a causal inference engine is defining how variables relate to each other, that is, defining the functional relationship between variables entailed by the graph conditional dependencies. In this article, we deviate from the common assumption of linear relationships in causal models by making use of neural autoregressive density estimators and use them to estimate causal effects within Pearl’s do-calculus framework. Using synthetic data, we show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables and include confidence bands using the non-parametric bootstrap. We also explore scenarios that deviate from the ideal causal effect estimation setting such as poor data support or unobserved confounders.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"2 1","pages":"211 - 228"},"PeriodicalIF":1.4,"publicationDate":"2020-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86219822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Conditional as-if analyses in randomized experiments
Nicole E. Pashley, Guillaume W. Basse, Luke W. Miratrix
Abstract The injunction to “analyze the way you randomize” has been well known to statisticians since Fisher advocated randomization as the basis of inference. Yet even those convinced by the merits of randomization-based inference seldom follow this injunction to the letter. Bernoulli randomized experiments are often analyzed as completely randomized experiments, and completely randomized experiments are analyzed as if they had been stratified; more generally, it is not uncommon to analyze an experiment as if it had been randomized differently. This article examines the theoretical foundation behind this practice within a randomization-based framework. Specifically, we ask when it is legitimate to analyze an experiment randomized according to one design as if it had been randomized according to some other design. We show that a sufficient condition for this type of analysis to be valid is that the design used for analysis be derived from the original design by an appropriate form of conditioning. We use our theory to justify certain existing methods, question others, and suggest new methodological insights such as conditioning on approximate covariate balance.
{"title":"Conditional as-if analyses in randomized experiments","authors":"Nicole E. Pashley, Guillaume W. Basse, Luke W. Miratrix","doi":"10.1515/jci-2021-0012","DOIUrl":"https://doi.org/10.1515/jci-2021-0012","url":null,"abstract":"Abstract The injunction to “analyze the way you randomize” is well known to statisticians since Fisher advocated for randomization as the basis of inference. Yet even those convinced by the merits of randomization-based inference seldom follow this injunction to the letter. Bernoulli randomized experiments are often analyzed as completely randomized experiments, and completely randomized experiments are analyzed as if they had been stratified; more generally, it is not uncommon to analyze an experiment as if it had been randomized differently. This article examines the theoretical foundation behind this practice within a randomization-based framework. Specifically, we ask when is it legitimate to analyze an experiment randomized according to one design as if it had been randomized according to some other design. We show that a sufficient condition for this type of analysis to be valid is that the design used for analysis should be derived from the original design by an appropriate form of conditioning. We use our theory to justify certain existing methods, question others, and finally suggest new methodological insights such as conditioning on approximate covariate balance.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"92 1","pages":"264 - 284"},"PeriodicalIF":1.4,"publicationDate":"2020-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83798256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Properties of restricted randomization with implications for experimental design
Mattias Nordin, M. Schultzberg
Abstract Recently, there has been increasing interest in heavily restricted randomization designs, which enforce balance on observed covariates in randomized controlled trials. However, when restrictions are strict, there is a risk that the treatment effect estimator will have a very high mean squared error (MSE). In this article, we formalize this risk and propose a novel combinatoric-based approach to describe and address the issue. First, we validate our new approach by re-proving some known properties of complete randomization and restricted randomization. Second, we propose a novel diagnostic measure for restricted designs that uses only the information embedded in the combinatorics of the design. Third, we show that the variance of the MSE of the difference-in-means estimator in a randomized experiment is a linear function of this diagnostic measure. Finally, we identify situations in which restricted designs can lead to an increased risk of a high MSE and discuss how our diagnostic measure can be used to detect such designs. Our results have implications for any restricted randomization design and can be used to evaluate the trade-off between enforcing balance on observed covariates and avoiding overly restrictive designs.
{"title":"Properties of restricted randomization with implications for experimental design","authors":"Mattias Nordin, M. Schultzberg","doi":"10.1515/jci-2021-0057","DOIUrl":"https://doi.org/10.1515/jci-2021-0057","url":null,"abstract":"Abstract Recently, there has been increasing interest in the use of heavily restricted randomization designs which enforce balance on observed covariates in randomized controlled trials. However, when restrictions are strict, there is a risk that the treatment effect estimator will have a very high mean squared error (MSE). In this article, we formalize this risk and propose a novel combinatoric-based approach to describe and address this issue. First, we validate our new approach by re-proving some known properties of complete randomization and restricted randomization. Second, we propose a novel diagnostic measure for restricted designs that only use the information embedded in the combinatorics of the design. Third, we show that the variance of the MSE of the difference-in-means estimator in a randomized experiment is a linear function of this diagnostic measure. Finally, we identify situations in which restricted designs can lead to an increased risk of getting a high MSE and discuss how our diagnostic measure can be used to detect such designs. Our results have implications for any restricted randomization design and can be used to evaluate the trade-off between enforcing balance on observed covariates and avoiding too restrictive designs.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"39 1","pages":"227 - 245"},"PeriodicalIF":1.4,"publicationDate":"2020-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89531588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Decision-theoretic foundations for statistical causality
P. Dawid
Abstract We develop a mathematical and interpretative foundation for the enterprise of decision-theoretic (DT) statistical causality, which is a straightforward way of representing and addressing causal questions. DT reframes causal inference as “assisted decision-making” and aims to understand when, and how, I can make use of external data, typically observational, to help me solve a decision problem by taking advantage of assumed relationships between the data and my problem. The relationships embodied in any representation of a causal problem require deeper justification, which is necessarily context-dependent. Here we clarify the considerations needed to support applications of the DT methodology. Exchangeability considerations are used to structure the required relationships, and a distinction drawn between intention to treat and intervention to treat forms the basis for the enabling condition of “ignorability.” We also show how the DT perspective unifies and sheds light on other popular formalisations of statistical causality, including potential responses and directed acyclic graphs.
{"title":"Decision-theoretic foundations for statistical causality","authors":"P. Dawid","doi":"10.1515/jci-2020-0008","DOIUrl":"https://doi.org/10.1515/jci-2020-0008","url":null,"abstract":"Abstract We develop a mathematical and interpretative foundation for the enterprise of decision-theoretic (DT) statistical causality, which is a straightforward way of representing and addressing causal questions. DT reframes causal inference as “assisted decision-making” and aims to understand when, and how, I can make use of external data, typically observational, to help me solve a decision problem by taking advantage of assumed relationships between the data and my problem. The relationships embodied in any representation of a causal problem require deeper justification, which is necessarily context-dependent. Here we clarify the considerations needed to support applications of the DT methodology. Exchangeability considerations are used to structure the required relationships, and a distinction drawn between intention to treat and intervention to treat forms the basis for the enabling condition of “ignorability.” We also show how the DT perspective unifies and sheds light on other popular formalisations of statistical causality, including potential responses and directed acyclic graphs.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"40 1","pages":"39 - 77"},"PeriodicalIF":1.4,"publicationDate":"2020-04-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79851043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

The variance of causal effect estimators for binary v-structures
Jack Kuipers, G. Moffa
Abstract Adjusting for covariates is a well-established method to estimate the total causal effect of an exposure variable on an outcome of interest. Depending on the causal structure of the mechanism under study, there may be different adjustment sets, equally valid from a theoretical perspective, that lead to identical causal effects. However, in practice, with finite data, estimators built on different sets may display different precision. To investigate the extent of this variability, we consider the simplest non-trivial non-linear model, a v-structure on three nodes with binary data. We explicitly compute and compare the variance of the two possible causal estimators. Further, by going beyond leading-order asymptotics, we show that there are parameter regimes where the set with the asymptotically optimal variance does depend on the edge coefficients, a result that is not captured by the recent leading-order developments for general causal models. As a practical consequence, adjustment set selection needs to account for the relative magnitude of the relationships between variables with respect to the sample size, and cannot rely on purely graphical criteria.
{"title":"The variance of causal effect estimators for binary v-structures","authors":"Jack Kuipers, G. Moffa","doi":"10.1515/jci-2021-0025","DOIUrl":"https://doi.org/10.1515/jci-2021-0025","url":null,"abstract":"Abstract Adjusting for covariates is a well-established method to estimate the total causal effect of an exposure variable on an outcome of interest. Depending on the causal structure of the mechanism under study, there may be different adjustment sets, equally valid from a theoretical perspective, leading to identical causal effects. However, in practice, with finite data, estimators built on different sets may display different precisions. To investigate the extent of this variability, we consider the simplest non-trivial non-linear model of a v-structure on three nodes for binary data. We explicitly compute and compare the variance of the two possible different causal estimators. Further, by going beyond leading-order asymptotics, we show that there are parameter regimes where the set with the asymptotically optimal variance does depend on the edge coefficients, a result that is not captured by the recent leading-order developments for general causal models. As a practical consequence, the adjustment set selection needs to account for the relative magnitude of the relationships between variables with respect to the sample size and cannot rely on purely graphical criteria.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"41 1","pages":"90 - 105"},"PeriodicalIF":1.4,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81036119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A generalized double robust Bayesian model averaging approach to causal effect estimation with application to the study of osteoporotic fractures
D. Talbot, C. Beaudoin
Abstract Analysts often use data-driven approaches to supplement their knowledge when selecting covariates for effect estimation. Multiple variable selection procedures for causal effect estimation have been devised in recent years, but additional developments are still required to adequately address the needs of analysts. We propose a generalized Bayesian causal effect estimation (GBCEE) algorithm to perform variable selection and produce double robust (DR) estimates of causal effects for binary or continuous exposures and outcomes. GBCEE employs a prior distribution that targets the selection of true confounders and predictors of the outcome, yielding unbiased estimation of causal effects with reduced standard errors. The Bayesian machinery allows GBCEE to directly produce inferences for its estimate. In simulations, GBCEE was observed to perform similarly to or outperform DR alternatives. Its ability to directly produce inferences is also an important advantage from a computational perspective. The method is finally illustrated by estimating the effect of meeting physical activity recommendations on the risk of hip or upper-leg fractures among older women in the Study of Osteoporotic Fractures. In this illustration, the 95% confidence interval produced by GBCEE is 61% narrower than that of a DR estimator adjusting for all potential confounders.
{"title":"A generalized double robust Bayesian model averaging approach to causal effect estimation with application to the study of osteoporotic fractures","authors":"D. Talbot, C. Beaudoin","doi":"10.1515/jci-2021-0023","DOIUrl":"https://doi.org/10.1515/jci-2021-0023","url":null,"abstract":"Abstract Analysts often use data-driven approaches to supplement their knowledge when selecting covariates for effect estimation. Multiple variable selection procedures for causal effect estimation have been devised in recent years, but additional developments are still required to adequately address the needs of analysts. We propose a generalized Bayesian causal effect estimation (GBCEE) algorithm to perform variable selection and produce double robust (DR) estimates of causal effects for binary or continuous exposures and outcomes. GBCEE employs a prior distribution that targets the selection of true confounders and predictors of the outcome for the unbiased estimation of causal effects with reduced standard errors. The Bayesian machinery allows GBCEE to directly produce inferences for its estimate. In simulations, GBCEE was observed to perform similarly or to outperform DR alternatives. Its ability to directly produce inferences is also an important advantage from a computational perspective. The method is finally illustrated for the estimation of the effect of meeting physical activity recommendations on the risk of hip or upper-leg fractures among older women in the study of osteoporotic fractures. The 95% confidence interval produced by GBCEE is 61% narrower than that of a DR estimator adjusting for all potential confounders in this illustration.","PeriodicalId":48576,"journal":{"name":"Journal of Causal Inference","volume":"3 1","pages":"335 - 371"},"PeriodicalIF":1.4,"publicationDate":"2020-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78500321","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}