The Choice of Neighborhood in Regression Discontinuity Designs

M. D. Cattaneo, G. Vazquez-Bare
Observational Studies. DOI: 10.1353/obs.2017.0002 (https://doi.org/10.1353/obs.2017.0002)
Citations: 42

Abstract

The seminal paper of Thistlethwaite and Campbell (1960) is one of the greatest breakthroughs in program evaluation and causal inference for observational studies. Originally coined Regression-Discontinuity Analysis and nowadays widely known as the Regression Discontinuity (RD) design, their method is likely the most credible and internally valid quantitative approach for the analysis and interpretation of non-experimental data. Early reviews and perspectives on RD designs include Cook (2008), Imbens and Lemieux (2008), and Lee and Lemieux (2010); see also Cattaneo and Escanciano (2017) for a contemporaneous edited volume with more recent overviews, discussions, and references. The key design feature in RD is that units have an observable running variable (a score or index) and are assigned to treatment whenever this variable exceeds a known cutoff. Empirical work in RD designs compares the response of units just below the cutoff (control group) with the response of units just above it (treatment group) to learn about the treatment effects of interest. It is by now generally recognized that the most important task in practice is to select the appropriate neighborhood near the cutoff, that is, to determine correctly which observations near the cutoff will be used. Localizing near the cutoff is crucial because empirical findings can be quite sensitive to which observations are included in the analysis. Several neighborhood selection methods have been developed in the literature, depending on the goal (e.g., estimation, inference, falsification, graphical presentation), the underlying assumptions invoked (e.g., parametric specification, continuity/nonparametric specification, local randomization), the parameter of interest (e.g., sharp, fuzzy, kink), and even the specific design (e.g., single-cutoff, multi-cutoff, geographic).
We offer a comprehensive discussion of both deprecated and modern neighborhood selection approaches available in the literature, following their historical as well as methodological evolution over the last decades. We focus for the most part on the prototypical case of a continuously distributed running variable, though we also discuss the discrete-valued case towards the end. The bulk of the presentation focuses on neighborhood selection for estimation and inference, outlining different methods and approaches according to, roughly speaking, the size of a typical selected neighborhood in each case, going from the largest to the smallest neighborhood. Figure 1 provides a heuristic summary, which we
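
To make the sensitivity to the chosen neighborhood concrete, the following is a minimal sketch on simulated data, not a method taken from the paper. Everything in it is an illustrative assumption: the data-generating process (a cubic outcome curve with a true jump of 1.0 at a cutoff of 0), the function names `simulate_sharp_rd`, `diff_in_means`, and `local_linear`, and the half-widths `h`. It compares two simple estimators of the jump at the cutoff using only observations within `h` of it, and it does not implement any of the data-driven neighborhood selectors discussed in the literature.

```python
import numpy as np


def simulate_sharp_rd(n=20000, cutoff=0.0, effect=1.0, seed=0):
    """Simulate a sharp RD: units are treated when the running variable is at or above the cutoff."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n)           # running variable (score)
    treated = x >= cutoff                        # deterministic assignment rule
    # Hypothetical outcome model: smooth in x, plus a jump of `effect` at the cutoff.
    y = 0.5 + 2.0 * x + 3.0 * x**3 + effect * treated + rng.normal(0.0, 0.3, size=n)
    return x, y


def _window_masks(x, cutoff, h):
    """Boolean masks for observations within h of the cutoff, below and above it."""
    below = (x >= cutoff - h) & (x < cutoff)
    above = (x >= cutoff) & (x <= cutoff + h)
    return below, above


def diff_in_means(x, y, cutoff, h):
    """Difference in mean outcomes between the two sides of the window."""
    below, above = _window_masks(x, cutoff, h)
    return y[above].mean() - y[below].mean()


def local_linear(x, y, cutoff, h):
    """Fit a separate line on each side of the window and compare the fits at the cutoff."""
    below, above = _window_masks(x, cutoff, h)

    def value_at_cutoff(mask):
        # np.polyfit returns coefficients from highest degree down: [slope, intercept].
        _, intercept = np.polyfit(x[mask] - cutoff, y[mask], deg=1)
        return intercept

    return value_at_cutoff(above) - value_at_cutoff(below)


x, y = simulate_sharp_rd()
for h in (1.0, 0.5, 0.2, 0.1):
    dm = diff_in_means(x, y, cutoff=0.0, h=h)
    ll = local_linear(x, y, cutoff=0.0, h=h)
    print(f"h={h:.1f}  diff-in-means={dm:.3f}  local-linear={ll:.3f}")
```

Under this assumed outcome curve, the difference in means drifts further from the true jump of 1.0 as `h` grows, because the slope of the curve is absorbed into the comparison, while the local linear fit is less affected at small `h`. Narrower windows reduce this bias but leave fewer observations, so the estimates become noisier.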