"Regression Discontinuity Designs in the Econometrics Literature" by G. Imbens. Observational Studies 3(1): 147-155. https://doi.org/10.1353/obs.2017.0003

Abstract: Many decades after being introduced by Thistlethwaite and Campbell (1960), regression discontinuity designs have become an important tool for causal inference in the social sciences. Researchers have found the methods to be widely applicable in settings where eligibility or incentives for participation in programs are at least partially regulated. Alongside, and motivated by, the many studies applying regression discontinuity methods, there have been a number of methodological studies improving our understanding, and implementation, of these methods. Here I report on some of the recent advances in the econometrics literature.

{"title":"Observational Studies and Study Designs: An Epidemiologic Perspective","authors":"T. J. Vander Weele","doi":"10.1353/obs.2015.0025","DOIUrl":"https://doi.org/10.1353/obs.2015.0025","url":null,"abstract":"","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"1 1","pages":"223 - 230"},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/obs.2015.0025","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48496744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
"Potential for Bias Inflation with Grouped Data: A Comparison of Estimators and a Sensitivity Analysis Strategy" by M. Scott, Ronli Diakow, J. Hill, and J. Middleton. Observational Studies 4(1): 111-149. https://doi.org/10.1353/obs.2018.0016

Abstract: We are concerned with the unbiased estimation of a treatment effect in the context of non-experimental studies with grouped or multilevel data. When analyzing such data with this goal, practitioners typically include as many predictors (controls) as possible, in an attempt to satisfy ignorability of the treatment assignment. In the multilevel setting with two levels, there are two classes of potential confounders that one must consider, and attempts to satisfy ignorability conditional on just one set would lead to a different treatment effect estimator than attempts to satisfy the other (or both). The three estimators considered in this paper are the so-called “within,” “between,” and OLS estimators. We generate bounds on the potential differences in bias for these competing estimators to inform model selection. Our approach relies on a parametric model for grouped data and omitted confounders and establishes a framework for sensitivity analysis in the two-level modeling context. The method relies on information obtained from parameters estimated under a variety of multilevel model specifications. We characterize the strength of the confounding and corresponding bias using easily interpretable parameters and graphical displays. We apply this approach to data from a multinational educational evaluation study. We demonstrate the extent to which different treatment effect estimators may be robust to potential unobserved individual- and group-level confounding.

"Book review of “Causality in a Social World” by Guanglei Hong" by K. Frank, G. Saw, and Ran Xu. Observational Studies 2(1): 86-89. https://doi.org/10.1353/obs.2016.0001

As the introduction of Guanglei Hong’s Causality in a Social World makes clear, this book would not be necessary if all treatments we wished to study had constant effects through simple mechanisms on independent individuals who were randomly assigned to treatments. While such conditions may hold in some idealized agricultural settings, this is not the phenomenon we encounter in a social-policy-oriented world with human agency. In response, Hong presents a coherent theoretical and empirical framework for estimating causality when people choose their own treatments, when they encounter mediating and moderating effects of treatments, and when they influence others’ choices and outcomes. The book is presented in four large sections: overview, moderation, mediation, and spillover, with a chapter introducing the core ideas in each section (chapters 4, 7, 11, and 14, respectively). Beyond merely consolidating her own foundational work, the book is steeped in deep and historical statistical principles of sampling, propensity score analysis, mediation and moderation, and spillover mechanisms. Ultimately, the book will mark a passageway from underlying statistical principles to a framework that may endure and expand beyond even what Hong anticipates.

"Understanding Regression Discontinuity Designs As Observational Studies" by J. Sekhon and R. Titiunik. Observational Studies 3(1): 174-182. https://doi.org/10.1353/obs.2017.0005

Thistlethwaite and Campbell (1960) proposed to use a “regression-discontinuity analysis” in settings where exposure to a treatment or intervention is determined by an observable score and a fixed cutoff. The type of setting they described, now widely known as the regression discontinuity (RD) design, is one where units receive a score, and a binary treatment is assigned according to a very specific rule. In the simplest case, all units whose score is above a known cutoff are assigned to the treatment condition, and all units whose score is below the cutoff are assigned to the control (i.e., absence of treatment) condition. Thistlethwaite and Campbell insightfully noted that, under appropriate assumptions, the discontinuity in the probability of treatment status induced by such an assignment rule could be leveraged to learn about the effect of the treatment at the cutoff. Their seminal contribution led to what is now one of the most rigorous non-experimental research designs across the social and biomedical sciences. See Cook (2008), Imbens and Lemieux (2008), and Lee and Lemieux (2010) for reviews, and the recent volume edited by Cattaneo and Escanciano (2017) for recent specific applications and methodological developments.

"The non-zero mean SIMEX: Improving estimation in the face of measurement error" by Nabila Parveen, E. Moodie, and B. Brenner. Observational Studies 1(1): 90-123. https://doi.org/10.1353/obs.2015.0005

Abstract: The simulation extrapolation method developed by Cook and Stefanski (1995) is a simulation-based technique for estimating and reducing bias due to additive measurement error, armed only with knowledge of the variance of the measurement error distribution. However, there are many instances in which validation data are not available and measurement error is known not to have mean zero. For example, in assessing phylogenetic cluster size of HIV viruses, cluster size is systematically underestimated, since clustering can only be performed on the viruses of those individuals who have presented for testing. In this setting, it is not possible to obtain validation data; however, using knowledge gleaned from the literature, the distribution of the errors may be estimated. In this work, we extend the simulation extrapolation procedure to accommodate errors with non-zero means, motivated by an interest in determining behavioural correlates of HIV phylogenetic cluster size. We provide theoretical justification for the generalization to the non-zero mean measurement error case, proving its consistency and demonstrating its performance via simulation. We then apply the result to data from the province of Quebec, Canada, to show that findings from a naïve analysis are robust to a substantial range of possible measurement error distributions.

"Application of Propensity Scores to a Continuous Exposure: Effect of Lead Exposure in Early Childhood on Reading and Mathematics Scores" by M. Elliott, Nanhua Zhang, and Dylan S. Small. Observational Studies 1(1): 30-55. https://doi.org/10.1353/obs.2015.0002

Abstract: The estimation of causal effects in observational studies is usually limited by the lack of randomization, which can result in different treatment or exposure groups differing systematically with respect to characteristics that influence outcomes. To remove such systematic differences, methods to “balance” subjects on observed covariates across treatment or exposure levels have been developed over the past three decades. These methods have been developed primarily in settings with binary treatments or exposures. However, in many observational studies the exposures are continuous rather than binary or discrete, and are usually considered as doses of treatment. In this manuscript we consider estimating the causal effect of early childhood lead exposure on youth academic achievement, where the exposure variable, blood lead concentration, can take any value greater than or equal to zero, using three balancing methods: propensity score analysis, non-bipartite matching, and Bayesian regression trees. We find some evidence that the standard logistic regression analysis controlling for age and socioeconomic confounders used in previous analyses (Zhang et al. (2013)) overstates the effect of lead exposure on performance on standardized mathematics and reading examinations; however, significant declines remain, including at doses currently below the recommended exposure levels.

"The Choice of Neighborhood in Regression Discontinuity Designs" by M. D. Cattaneo. Observational Studies 3(1): 134-146. https://doi.org/10.1353/obs.2017.0002

The seminal paper of Thistlethwaite and Campbell (1960) is one of the greatest breakthroughs in program evaluation and causal inference for observational studies. The approach, originally coined “Regression-Discontinuity Analysis” and nowadays widely known as the Regression Discontinuity (RD) design, is likely the most credible and internally valid quantitative approach for the analysis and interpretation of non-experimental data. Early reviews and perspectives on RD designs include Cook (2008), Imbens and Lemieux (2008), and Lee and Lemieux (2010); see also Cattaneo and Escanciano (2017) for a contemporaneous edited volume with more recent overviews, discussions, and references. The key design feature in RD is that units have an observable running variable, score, or index, and are assigned to treatment whenever this variable exceeds a known cutoff. Empirical work in RD designs seeks to compare the response of units just below the cutoff (control group) to the response of units just above (treatment group) to learn about the treatment effects of interest. It is by now generally recognized that the most important task in practice is to select the appropriate neighborhood near the cutoff, that is, to correctly determine which observations near the cutoff will be used. Localizing near the cutoff is crucial because empirical findings can be quite sensitive to which observations are included in the analysis. Several neighborhood selection methods have been developed in the literature depending on the goal (e.g., estimation, inference, falsification, graphical presentation), the underlying assumptions invoked (e.g., parametric specification, continuity/nonparametric specification, local randomization), the parameter of interest (e.g., sharp, fuzzy, kink), and even the specific design (e.g., single-cutoff, multi-cutoff, geographic). We offer a comprehensive discussion of both deprecated and modern neighborhood selection approaches available in the literature, following their historical as well as methodological evolution over the last decades. We focus on the prototypical case of a continuously distributed running variable for the most part, though we also discuss the discrete-valued case towards the end of the discussion. The bulk of the presentation focuses on neighborhood selection for estimation and inference, outlining different methods and approaches according to, roughly speaking, the size of a typical selected neighborhood in each case, going from the largest to the smallest neighborhood. Figure 1 in the paper provides a heuristic summary.

"Book review of “Observation and Experiment: An Introduction to Causal Inference” by Paul R. Rosenbaum" by Dylan S. Small. Observational Studies 3(1): 28-38. https://doi.org/10.1353/obs.2017.0008

The economist Paul Samuelson said, “My belief is that nothing that can be expressed by mathematics cannot be expressed by careful use of literary words.” Paul Rosenbaum brings this perspective to causal inference in his new book Observation and Experiment: An Introduction to Causal Inference (Harvard University Press, 2017). The book is a luminous presentation of concepts and strategies for causal inference with a minimum of technical material. An example of how Rosenbaum explains causal inference in a literary way is his use of a passage from Robert Frost’s poem “The Road Not Taken” to illuminate how causal questions involve comparing potential outcomes under two or more treatments where we can only see one potential outcome.