Abstract When estimating a global average treatment effect (GATE) under network interference, units can have widely different relationships to the treatment depending on a combination of the structure of their network neighborhood, the structure of the interference mechanism, and how the treatment was distributed in their neighborhood. In this work, we introduce a sequential procedure to generate and select graph- and treatment-based covariates for GATE estimation under regression adjustment. We show that it is possible to simultaneously achieve low bias and considerably reduce variance with such a procedure. To tackle inferential complications caused by our feature generation and selection process, we introduce a way to construct confidence intervals based on a block bootstrap. We illustrate that our selection procedure and subsequent estimator can achieve good performance in terms of root-mean-square error in several semi-synthetic experiments with Bernoulli designs, comparing favorably to an oracle estimator that takes advantage of regression adjustments for the known underlying interference structure. We apply our method to a real-world experimental dataset with strong evidence of interference and demonstrate that it can estimate the GATE reasonably well without knowing the interference process a priori.
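To make the idea of regression adjustment with a graph- and treatment-based covariate concrete, here is a minimal sketch. It does not reproduce the sequential generation and selection procedure or the block bootstrap from the abstract; it assumes a single hand-picked covariate (the fraction of treated neighbors), a noise-free linear outcome model, and a small ring graph, all of which are illustrative assumptions.

```python
import numpy as np

def fraction_treated_neighbors(adj, z):
    """Share of each unit's neighbors that are treated (0 for isolated units)."""
    deg = adj.sum(axis=1)
    treated = adj @ z
    return np.divide(treated, deg, out=np.zeros(len(z), dtype=float), where=deg > 0)

def gate_regression_adjustment(adj, z, y):
    """Fit y ~ 1 + z + fraction_treated_neighbors by OLS and contrast the
    fitted outcome under global treatment with that under global control."""
    frac = fraction_treated_neighbors(adj, z)
    X = np.column_stack([np.ones(len(z)), z, frac])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    # If every unit has at least one neighbor, global treatment means
    # z = 1 and frac = 1, while global control means both are 0.
    return float(beta[1] + beta[2])

# Hypothetical example: ring of 8 units, each with two neighbors.
n = 8
adj = np.zeros((n, n))
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1.0

z = np.array([1, 1, 1, 0, 0, 1, 0, 0], dtype=float)
# Assumed outcome model: direct effect 2, spillover effect 3, so GATE = 5.
y = 1.0 + 2.0 * z + 3.0 * fraction_treated_neighbors(adj, z)

gate_hat = gate_regression_adjustment(adj, z, y)
```

Because the assumed outcomes are exactly linear in the two regressors, the fit recovers the direct and spillover coefficients; with real data the covariate choice matters, which is the problem the abstract's selection procedure targets.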
Model-based regression adjustment with model-free covariates for network interference. Kevin Han, Johan Ugander. Journal of Causal Inference. Pub Date: 2023-01-01. DOI: 10.1515/jci-2023-0005
Abstract The attributable fraction (population) has attracted much attention from a theoretical perspective and has been used extensively to assess the impact of potential health interventions. However, despite its extensive use, there is much confusion about its concept and calculation methods. In this article, we discuss the concepts of and calculation methods for the attributable fraction and related measures in the counterfactual framework, both with and without stratification by covariates. Generally, the attributable fraction is useful when the exposure of interest has a causal effect on the outcome. However, it is important to understand that this statement applies to the exposed group. Although the target population of the attributable fraction (population) is the total population, the causal effect should be present not in the total population but in the exposed group. As related measures, we discuss the preventable fraction and prevented fraction, which are generally useful when the exposure of interest has a preventive effect on the outcome, and we further propose a new measure called the attributed fraction. We also discuss the causal and preventive excess fractions, and provide notes on vaccine efficacy. Finally, we discuss the relations between the aforementioned six measures and six possible patterns using a conceptual schema.
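As a point of reference for the measures named above, the following sketch computes the classical attributable fraction (population) and attributable fraction (exposed) from risks. These are the standard textbook formulas, not the article's own derivations, and they assume no confounding, so that the observed risk among the unexposed stands in for the counterfactual risk the exposed would have had without exposure. The function names and numbers are hypothetical.

```python
def attributable_fraction_population(p_exposed, risk_exposed, risk_unexposed):
    """(overall risk - risk if nobody were exposed) / overall risk."""
    overall = p_exposed * risk_exposed + (1.0 - p_exposed) * risk_unexposed
    return (overall - risk_unexposed) / overall

def attributable_fraction_exposed(risk_exposed, risk_unexposed):
    """(risk in exposed - risk in unexposed) / risk in exposed, i.e. (RR - 1) / RR."""
    return (risk_exposed - risk_unexposed) / risk_exposed

# Hypothetical inputs: 30% exposed, risk 0.2 if exposed, 0.1 if unexposed (RR = 2).
af_pop = attributable_fraction_population(0.3, 0.2, 0.1)
af_exp = attributable_fraction_exposed(0.2, 0.1)
```

The population version agrees with Levin's formula p(RR - 1) / (1 + p(RR - 1)), here 0.3 / 1.3.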
Attributable fraction and related measures: Conceptual relations in the counterfactual framework. E. Suzuki, E. Yamamoto. Journal of Causal Inference. Pub Date: 2023-01-01. DOI: 10.1515/jci-2021-0068
Steven Siwei Ye, Yanzhen Chen, Oscar Hernan Madrid Padilla
Abstract Statisticians show growing interest in estimating and analyzing heterogeneity in causal effects in observational studies. However, there is usually a trade-off between accuracy and interpretability when developing an estimator for treatment effects, especially when a large number of features are used in estimation. To address this issue, we propose a score-based framework for estimating the conditional average treatment effect (CATE) function in this article. The framework integrates two components: (i) the joint use of propensity and prognostic scores in a matching algorithm to obtain a proxy of the heterogeneous treatment effects for each observation and (ii) nonparametric regression trees to construct an estimator for the CATE function conditioning on the two scores. The method naturally stratifies treatment effects into subgroups over a 2D grid whose axes are the propensity and prognostic scores. We conduct benchmark experiments on multiple simulated datasets and demonstrate clear advantages of the proposed estimator over state-of-the-art methods. We also evaluate empirical performance in real-life settings, using two observational datasets from a clinical trial and a complex social survey, and interpret policy implications following the numerical results.
2D score-based estimation of heterogeneous treatment effects. Steven Siwei Ye, Yanzhen Chen, Oscar Hernan Madrid Padilla. Journal of Causal Inference. Pub Date: 2023-01-01. DOI: 10.1515/jci-2022-0016
Mayleen Cortez-Rodriguez, Matthew Eichhorn, Christina Lee Yu
Abstract Network interference, where the outcome of an individual is affected by the treatment assignment of those in their social network, is pervasive in real-world settings. However, it poses a challenge to estimating causal effects. We consider the task of estimating the total treatment effect (TTE), or the difference between the average outcomes of the population when everyone is treated versus when no one is, under network interference. Under a Bernoulli randomized design, we provide an unbiased estimator for the TTE when network interference effects are constrained to low-order interactions among neighbors of an individual. We make no assumptions on the graph other than bounded degree, allowing for well-connected networks that may not be easily clustered. We derive a bound on the variance of our estimator and show in simulated experiments that it performs well compared with standard estimators for the TTE. We also derive a minimax lower bound on the mean squared error of our estimator, which suggests that the difficulty of estimation can be characterized by the degree of interactions in the potential outcomes model. We also prove that our estimator is asymptotically normal under boundedness conditions on the network degree and potential outcomes model. Central to our contribution is a new framework for balancing model flexibility and statistical complexity as captured by this low-order interactions structure.
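The unbiasedness claim can be illustrated in the simplest low-order case. The sketch below assumes a first-order (linear) potential outcomes model, Y_i(z) = c0_i + sum over neighbors j of C_ij z_j, and uses a Horvitz-Thompson-style weighting under a Bernoulli(p) design. It is an illustration under these assumptions and is not claimed to be the authors' exact estimator; the graph, coefficients, and function names are hypothetical. Unbiasedness is checked by exact enumeration of all assignments rather than by simulation.

```python
import itertools
import numpy as np

def tte_estimate(adj_self, z, y, p):
    """adj_self[i, j] = 1 if j is in i's neighborhood (i itself included).
    Each unit's outcome is weighted by the sum of (z_j - p) / (p (1 - p))
    over its neighborhood."""
    weights = (z - p) / (p * (1.0 - p))
    return float(np.mean(y * (adj_self @ weights)))

def outcomes(c0, C, z):
    """Assumed linear potential outcomes model."""
    return c0 + C @ z

def expected_estimate(adj_self, c0, C, p):
    """Exact expectation of the estimator over all 2^n Bernoulli(p) assignments."""
    n = len(c0)
    total = 0.0
    for bits in itertools.product([0, 1], repeat=n):
        z = np.array(bits, dtype=float)
        prob = float(np.prod(np.where(z == 1, p, 1.0 - p)))
        total += prob * tte_estimate(adj_self, z, outcomes(c0, C, z), p)
    return total

# Hypothetical path graph on 4 units, neighborhoods include the unit itself.
n = 4
adj_self = np.eye(n)
for i in range(n - 1):
    adj_self[i, i + 1] = adj_self[i + 1, i] = 1.0

c0 = np.array([1.0, 2.0, 3.0, 4.0])
C = adj_self * (np.arange(n)[:, None] + 2 * np.arange(n)[None, :] + 1) / 10.0

# Y(all treated) - Y(none treated) averaged over units.
true_tte = float(np.mean(C.sum(axis=1)))
mean_of_estimator = expected_estimate(adj_self, c0, C, p=0.3)
```

The two quantities coincide because E[z_k (z_j - p)] equals p(1 - p) when k = j and 0 otherwise, so each weighted term isolates exactly one coefficient C_ij.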
Exploiting neighborhood interference with low-order interactions under unit randomized design. Mayleen Cortez-Rodriguez, Matthew Eichhorn, Christina Lee Yu. Journal of Causal Inference. Pub Date: 2023-01-01. DOI: 10.1515/jci-2022-0051
Abstract We consider likelihood score-based methods for causal discovery in structural causal models. In particular, we focus on Gaussian scoring and analyze the effect of model misspecification in terms of non-Gaussian error distribution. We present a surprising negative result for Gaussian likelihood scoring in combination with nonparametric regression methods.
On the pitfalls of Gaussian likelihood scoring for causal discovery. Christoph Schultheiss, P. Bühlmann. Journal of Causal Inference. Pub Date: 2022-10-20. DOI: 10.1515/jci-2022-0068
Abstract We propose quantitative probing as a model-agnostic framework for validating causal models in the presence of quantitative domain knowledge. The method is constructed in analogy to the train/test split in correlation-based machine learning. It is consistent with the logic of scientific discovery and enhances current causal validation strategies. The effectiveness of the method is illustrated using Pearl’s sprinkler example, before a thorough simulation-based investigation is conducted. Limits of the technique are identified by studying exemplary failing scenarios, which are furthermore used to propose a list of topics for future research and improvements of the presented version of quantitative probing. A guide for practitioners is included to facilitate the incorporation of quantitative probing in causal modelling applications. The code for integrating quantitative probing into causal analysis, as well as the code for the presented simulation-based studies of the effectiveness of quantitative probing are provided in two separate open-source Python packages.
Quantitative probing: Validating causal models with quantitative domain knowledge. Daniel Grünbaum, M. L. Stern, E. Lang. Journal of Causal Inference. Pub Date: 2022-09-07. DOI: 10.1515/jci-2022-0060
Pub Date: 2022-08-19 DOI: 10.48550/arXiv.2208.09558
Scott Mueller
Abstract Personalized decision making targets the behavior of a specific individual, while population-based decision making concerns a subpopulation resembling that individual. This article clarifies the distinction between the two and explains why the former leads to more informed decisions. We further show that by combining experimental and observational studies, we can obtain valuable information about individual behavior and, consequently, improve decisions over those obtained from experimental studies alone. In particular, we show examples where such a combination discriminates between individuals who can benefit from a treatment and those who cannot – information that would not be revealed by experimental studies alone. We outline areas where this method could be of benefit to both policy makers and individuals involved.
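One known way to combine experimental and observational data is through the Tian and Pearl bounds on the probability of benefit (the probability that an individual responds if treated and does not respond if untreated). The abstract does not say which formulas the article uses, so treating these bounds as the relevant ones is an assumption; the numbers below are hypothetical and chosen only to show the observational data tightening the interval.

```python
def experimental_bounds(p_y_do_x, p_y_do_not_x):
    """Bounds on the probability of benefit from experimental data alone."""
    lower = max(0.0, p_y_do_x - p_y_do_not_x)
    upper = min(p_y_do_x, 1.0 - p_y_do_not_x)
    return lower, upper

def combined_bounds(p_y_do_x, p_y_do_not_x, p_x_y, p_x_noty, p_notx_y, p_notx_noty):
    """Bounds that also use the observational joint distribution of
    treatment choice X and outcome Y."""
    p_y = p_x_y + p_notx_y
    lower = max(0.0,
                p_y_do_x - p_y_do_not_x,
                p_y - p_y_do_not_x,
                p_y_do_x - p_y)
    upper = min(p_y_do_x,
                1.0 - p_y_do_not_x,
                p_x_y + p_notx_noty,
                p_y_do_x - p_y_do_not_x + p_x_noty + p_notx_y)
    return lower, upper

# Hypothetical experimental results: P(y | do(x)) = 0.5, P(y | do(x')) = 0.2.
# Hypothetical observational joint: P(x,y)=0.1, P(x,y')=0.4, P(x',y)=0.2, P(x',y')=0.3.
exp_lo, exp_hi = experimental_bounds(0.5, 0.2)
comb_lo, comb_hi = combined_bounds(0.5, 0.2, 0.1, 0.4, 0.2, 0.3)
```

With these inputs the experimental data alone give an interval of roughly [0.3, 0.5], while adding the observational data narrows it to roughly [0.3, 0.4].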
Personalized decision making – A conceptual introduction. Scott Mueller. Journal of Causal Inference. Pub Date: 2022-08-19. DOI: 10.48550/arXiv.2208.09558
Pub Date: 2022-08-10 DOI: 10.48550/arXiv.2208.05553
Mayleen Cortez, Matthew Eichhorn, C. Yu
Abstract Network interference, where the outcome of an individual is affected by the treatment assignment of those in their social network, is pervasive in real-world settings. However, it poses a challenge to estimating causal effects. We consider the task of estimating the total treatment effect (TTE), or the difference between the average outcomes of the population when everyone is treated versus when no one is, under network interference. Under a Bernoulli randomized design, we provide an unbiased estimator for the TTE when network interference effects are constrained to low-order interactions among neighbors of an individual. We make no assumptions on the graph other than bounded degree, allowing for well-connected networks that may not be easily clustered. We derive a bound on the variance of our estimator and show in simulated experiments that it performs well compared with standard estimators for the TTE. We also derive a minimax lower bound on the mean squared error of our estimator, which suggests that the difficulty of estimation can be characterized by the degree of interactions in the potential outcomes model. We also prove that our estimator is asymptotically normal under boundedness conditions on the network degree and potential outcomes model. Central to our contribution is a new framework for balancing model flexibility and statistical complexity as captured by this low-order interactions structure.
Exploiting neighborhood interference with low-order interactions under unit randomized design. Mayleen Cortez, Matthew Eichhorn, C. Yu. Journal of Causal Inference. Pub Date: 2022-08-10. DOI: 10.48550/arXiv.2208.05553
Abstract A key objective of decomposition analysis is to identify a factor (the “mediator”) contributing to disparities in an outcome between social groups. In decomposition analysis, scholarly interest often centers on estimating how much the disparity (e.g., health disparities between Black women and White men) would be reduced or would remain if we set the mediator (e.g., education) distribution of one social group equal to another. However, causally identifying the disparity reduction and the disparity remaining depends on the assumption of no omitted mediator–outcome confounding, which is not empirically testable. Therefore, we propose a set of sensitivity analyses to assess the robustness of disparity reduction to possible unobserved confounding. We derived general bias formulas for disparity reduction, which can be used beyond a particular statistical model and do not require any functional assumptions. Moreover, the same bias formulas apply with unobserved confounding measured before and after the group status. On the basis of the formulas, we provide sensitivity analysis techniques based on regression coefficients and $R^{2}$ values by extending the existing approaches. The $R^{2}$-based sensitivity analysis offers a straightforward interpretation of sensitivity parameters and a standard way to report the robustness of research findings. Although we introduce sensitivity analysis techniques in the context of decomposition analysis, they can be utilized in any mediation setting based on interventional indirect effects when the exposure is randomized (or conditionally ignorable given covariates).
Sensitivity analysis for causal decomposition analysis: Assessing robustness toward omitted variable bias. S. Park, Suyeon Kang, Chioun Lee, Shujie Ma. Journal of Causal Inference. Pub Date: 2022-05-26. DOI: 10.1515/jci-2022-0031
Abstract Investigating the causal relationship between exposure and time-to-event outcome is an important topic in biomedical research. Previous literature has discussed the potential issues of using hazard ratio (HR) as the marginal causal effect measure due to noncollapsibility. In this article, we advocate using restricted mean survival time (RMST) difference as a marginal causal effect measure, which is collapsible and has a simple interpretation as the difference of area under survival curves over a certain time horizon. To address both measured and unmeasured confounding, a matched design with sensitivity analysis is proposed. Matching is used to pair similar treated and untreated subjects together, which is generally more robust than outcome modeling due to potential misspecifications. Our propensity score matched RMST difference estimator is shown to be asymptotically unbiased, and the corresponding variance estimator is calculated by accounting for the correlation due to matching. Simulation studies also demonstrate that our method has adequate empirical performance and outperforms several competing methods used in practice. To assess the impact of unmeasured confounding, we develop a sensitivity analysis strategy by adapting the E-value approach to matched data. We apply the proposed method to the Atherosclerosis Risk in Communities Study (ARIC) to examine the causal effect of smoking on stroke-free survival.
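The interpretation of RMST as the area under the survival curve up to a time horizon can be sketched as follows. This is a plain Kaplan-Meier computation on unmatched toy data; it omits the propensity score matching, the matching-aware variance estimator, and the E-value sensitivity analysis described in the abstract. The function names and data are hypothetical.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return the distinct event times and the Kaplan-Meier survival
    probability just after each of them (events: 1 = event, 0 = censored)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)
        deaths = np.sum((times == t) & (events == 1))
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)

def rmst(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    event_times, surv = kaplan_meier(times, events)
    keep = event_times < tau
    grid = np.concatenate([[0.0], event_times[keep], [tau]])
    heights = np.concatenate([[1.0], surv[keep]])
    return float(np.sum(heights * np.diff(grid)))

def rmst_difference(treated, control, tau):
    """Each argument is a (times, events) pair; returns RMST(treated) - RMST(control)."""
    return rmst(*treated, tau) - rmst(*control, tau)

# Hypothetical toy data with no censoring, horizon tau = 4.
control = ([1, 2, 3, 4], [1, 1, 1, 1])
treated = ([2, 3, 4, 5], [1, 1, 1, 1])
diff = rmst_difference(treated, control, tau=4.0)
```

Without censoring, RMST reduces to the mean of min(T, tau), which gives a quick sanity check: 2.5 for the control group and 3.25 for the treated group here.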
Matched design for marginal causal effect on restricted mean survival time in observational studies. Zihan Lin, A. Ni, Bo Lu. Journal of Causal Inference. Pub Date: 2022-05-04. DOI: 10.1515/jci-2022-0035