Regression discontinuity design and its applications to Science of Science: A survey
Authors: Mei Li, Yang Zhang, Yang Wang
Journal: Journal of Data and Information Science (Warsaw, Poland), vol. 8, no. 1, pp. 43-65
Published: 2023-04-01
DOI: 10.2478/jdis-2023-0008 (https://doi.org/10.2478/jdis-2023-0008)
Abstract
Purpose: With the availability of large-scale scholarly datasets, scientists from various domains hope to understand the mechanisms underlying science, forming a vibrant area of inquiry in the emerging "science of science" field. Because results from the science of science often have strong policy implications, understanding the causal relationships between variables is especially important. However, Regression Discontinuity Design (RDD), the most credible quasi-experimental method among causal inference methods and a highly valuable tool in the empirical toolkit, has not been fully exploited in the science of science. In this paper, we provide a systematic survey of the RDD method and its practical applications in the science of science.

Design/methodology/approach: First, we introduce the basic assumptions, mathematical notation, and the two types of RDD, i.e., sharp and fuzzy RDD. Second, we use the Web of Science and Microsoft Academic Graph datasets to study the evolution and citation patterns of RDD papers. We also provide a systematic survey of applications of RDD methodologies in various scientific domains, as well as in the science of science. Finally, we present a case study that estimates the effect of Head Start Funding Proposals on child mortality.

Findings: RDD was almost neglected for 30 years after it was first introduced in 1960. Afterward, scientists used mathematical and economic tools to develop the RDD methodology. After 2010, RDD methods were widely applied in various domains, including medicine, psychology, political science, and environmental science. However, we also find that the RDD method has not been well developed in science of science research.

Research limitations: This work uses a keyword search to obtain RDD papers, which may miss some related work. Additionally, our work does not aim to develop the rigorous mathematical and technical details of RDD; it focuses instead on its intuitions and applications.

Practical implications: This work proposes how to use the RDD method in science of science research.

Originality/value: This work systematically introduces RDD and calls for greater awareness of its use in the science of science.
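
As a rough illustration of the sharp RDD named in the abstract, the sketch below estimates the jump in an outcome at a cutoff with a local linear regression fitted on each side of the threshold. It is not the authors' implementation or their Head Start case study: the function name `sharp_rdd_estimate`, the bandwidth, and the synthetic data-generating process are all assumptions made for illustration.

```python
import numpy as np


def sharp_rdd_estimate(x, y, cutoff, bandwidth):
    """Local linear estimate of the jump in y at the cutoff (sharp RDD).

    Hypothetical helper for illustration: treatment is assumed to be
    assigned deterministically to units with x >= cutoff.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Center the running variable and keep observations near the cutoff.
    centered = x - cutoff
    window = np.abs(centered) <= bandwidth
    xc, yw = centered[window], y[window]

    # Sharp design: treatment status is fully determined by the cutoff.
    treated = (xc >= 0).astype(float)

    # Columns: intercept, treatment jump, slope, slope change after cutoff.
    design = np.column_stack([np.ones_like(xc), treated, xc, treated * xc])
    coef, *_ = np.linalg.lstsq(design, yw, rcond=None)

    # The coefficient on the treatment indicator is the estimated jump.
    return coef[1]


# Synthetic data (assumed): a linear trend plus a jump of 2.0 at x = 0.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
true_effect = 2.0
y = 1.0 + 0.5 * x + true_effect * (x >= 0) + rng.normal(0.0, 0.3, size=x.size)

estimate = sharp_rdd_estimate(x, y, cutoff=0.0, bandwidth=0.5)
print(f"estimated jump at cutoff: {estimate:.2f}")
```

In a fuzzy RDD, crossing the cutoff only changes the probability of treatment, so the jump in the outcome would additionally be divided by the jump in treatment uptake at the cutoff. In practice, the bandwidth choice and the continuity assumption around the cutoff would also need to be checked.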