Title: Sensitivity Analysis for the Adjusted Mann-Whitney Test with Observational Studies
Authors: Maozhu Dai, Weining Shen, H. Stern
DOI: https://doi.org/10.1353/obs.2022.0002 (Observational Studies, published 2022-06-04)

Abstract: The Mann-Whitney test is a popular nonparametric test for comparing two samples. Satten et al. (2018) recently extended it to test for the existence of treatment effects in observational studies. Their adjusted Mann-Whitney test relies on the unconfoundedness assumption, which is untestable in practice, so it is important to assess how violations of this assumption affect the validity of causal conclusions. In this paper, we consider a marginal sensitivity analysis framework that addresses this problem with a bootstrap approach: it provides a sensitivity interval for the estimand with a guaranteed coverage probability as long as the data-generating mechanism lies in a pre-specified set of sensitivity models. We develop efficient optimization algorithms for computing the sensitivity interval and extend the approach to a general class of adjusted multi-sample U-statistics. Simulation studies and two real data applications demonstrate the utility of the proposed methodology.

Title: Evaluation of Language Training Programs in Luxembourg using Principal Stratification
Authors: Michela Bia, Alfonso Flores-Lagunes, Andrea Mercatanti
DOI: https://doi.org/10.2139/ssrn.3538309 (Observational Studies, published 2022-06-04)

Abstract: In an increasingly globalized world, multiple language skills can create more employment opportunities, and several countries include language training in their active labor market programs for the unemployed. We analyze the effects of a language training program on the re-employment probability and hourly wages simultaneously, using high-quality administrative data from Luxembourg. We address selection into training with an unconfoundedness assumption and account for the complication that wages are "truncated" by unemployment by adopting a principal stratification framework. Estimation uses a likelihood-based mixture-model approach. To improve inference, we use the individual's hours worked as a secondary outcome together with a stochastic dominance assumption; these two features considerably ameliorate the multimodality problem commonly encountered in mixture models. We also conduct a sensitivity analysis of the unconfoundedness assumption. Our results suggest a positive effect (of up to 12.7 percent) of the language training program on the re-employment probability, but no effect on wages for those who would be employed regardless of training participation.

Title: gesttools: General Purpose G-Estimation in R
Authors: Daniel Tompsett, S. Vansteelandt, O. Dukes, B. D. De Stavola
DOI: https://doi.org/10.1353/obs.2022.0003 (Observational Studies, published 2022-06-04)

Abstract: In this paper we present gesttools, a set of general-purpose, user-friendly R functions for g-estimation of structural nested mean models (SNMMs) with time-varying exposures and outcomes. The package implements the g-estimation methods of Vansteelandt and Sjolander (2016) and Dukes and Vansteelandt (2018), and can analyse both end-of-study and time-varying outcomes that are binary or continuous, with exposure variables that are binary, continuous, or categorical. It also allows SNMMs with time-varying causal effects, effect modification by other variables, or both, and supports censored data via inverse weighting. We outline the theory underpinning these methods and describe the SNMMs that can be fitted by the software. The package is demonstrated using simulated and real-world-inspired datasets.

Title: Causal Inference Challenges with Interrupted Time Series Designs: An Evaluation of an Assault Weapons Ban in California
Author: R. Berk
DOI: https://doi.org/10.1353/obs.0.0001 (Observational Studies, published 2022-04-26)

Abstract: The interrupted time series design was introduced to social scientists by Campbell and Stanley in 1963, analysis methods were proposed by Box and Tiao in 1975, and more recent treatments are easily found (Box et al., 2016). Despite its popularity, current results in statistics reveal fundamental oversights in the standard statistical methods employed. The adaptive model selection built into recommended practice creates challenging problems for post-model-selection inference. What one might call model cherry-picking can invalidate conventional statistical inference, including statistical tests and confidence intervals, with damaging consequences for causal inference. Technical developments can correct for these problems, but the remedies raise conceptual difficulties for causal inference once proper estimands are defined. The issues are illustrated with an analysis of the impact of an assault weapons ban on daily handgun sales in California from 1996 through 2018. Statistically valid regression functionals are obtained, but their causal meaning is unclear. Researchers might be best served by interpreting only the sign of such functionals.

Title: Posterior Predictive Propensity Scores and p-Values
Authors: Peng Ding, Tianyu Guo
DOI: https://doi.org/10.1353/obs.2023.0015 (Observational Studies, published 2022-02-16)

Abstract: Rosenbaum and Rubin (1983) introduced the notion of the propensity score and discussed its central role in causal inference with observational studies. Their paper, however, created a fundamental incoherence with an earlier paper by Rubin (1978), which showed that the propensity score plays no role in the Bayesian analysis of unconfounded observational studies if the priors on the propensity score and outcome models are independent. Despite serious efforts in the literature, it remains difficult to reconcile these contradictory results. We offer a simple approach to incorporating the propensity score in Bayesian causal inference based on the posterior predictive p-value. To motivate a simple procedure, we focus on the strong null hypothesis of no causal effect for any unit whatsoever. Computationally, the proposed posterior predictive p-value equals the classic p-value from the Fisher randomization test, averaged over the posterior predictive distribution of the propensity score. Moreover, with the studentized doubly robust estimator as the test statistic, the proposed p-value inherits the doubly robust property and is also asymptotically valid for testing the weak null hypothesis of zero average causal effect. Perhaps surprisingly, this Bayesian-motivated p-value can have better frequentist finite-sample performance than the frequentist p-value based on the asymptotic approximation, especially when propensity scores take extreme values.

Title: Melting together prediction and inference
Authors: A. Daoud, Devdatt P. Dubhashi
DOI: https://doi.org/10.1353/obs.2021.0035 (Observational Studies, published 2021-10-04)

Abstract: In his influential article "Statistical Modeling: The Two Cultures," Leo Breiman identified two cultures of statistical practice. The data modeling culture (DMC) denotes practices tailored for statistical inference targeting a quantity of interest, [inline-graphic 01]. The algorithmic modeling culture (AMC) refers to practices defining an algorithm, or machine-learning (ML) procedure, that generates accurate predictions about an outcome of interest, [inline-graphic 02]. Although DMC was the dominant mode, Breiman argued that statisticians should give more attention to AMC. Twenty years later, energized by two revolutions (one in data science and one in causal inference), a hybrid modeling culture (HMC) is rising. HMC fuses the inferential strength of DMC and the predictive power of AMC with the goal of analyzing cause and effect; thus, HMC's quantity of interest is the causal effect, [inline-graphic 03]. In combining inference and prediction, the result of HMC practices is that the distinction between prediction and inference, taken to its limit, melts away. While this hybrid culture does not occupy the default mode of scientific practice, we argue that it offers an intriguing novel path for the applied sciences.

Title: Randomization Tests to Assess Covariate Balance When Designing and Analyzing Matched Datasets
Author: Zach Branson
DOI: https://doi.org/10.1353/obs.2021.0031 (Observational Studies, published 2021-09-09)

Abstract: Causal analyses of observational studies are often complicated by covariate imbalances among treatment groups, and matching methodologies alleviate this complication by finding subsets of the treatment groups that exhibit covariate balance. It is widely agreed that covariate balance can serve as evidence that a matched dataset approximates a randomized experiment, but what kind of experiment does a matched dataset approximate? In this work, we develop a randomization test for the hypothesis that a matched dataset approximates a particular experimental design, such as complete randomization, block randomization, or rerandomization. Our test can incorporate any experimental design, and it allows for a graphical display that puts several designs on the same univariate scale, letting researchers pinpoint which design, if any, is most appropriate for a matched dataset. After researchers determine a plausible design, we recommend a randomization-based approach for analyzing the matched data, which can incorporate any design and treatment effect estimator. Through simulation, we find that our test can frequently detect violations of randomized assignment that harm inferential results. Furthermore, through simulation and a real application in political science, we find that matched datasets with high levels of covariate balance tend to approximate balance-constrained designs like rerandomization, and analyzing them as such can lead to precise causal analyses. However, assuming a precise design should be done with caution, because it can harm inferential results if substantial biases remain after matching. Our approach is implemented in the randChecks R package, available on CRAN.

Title: Protocol: Evaluating the Effect of ACA Medicaid Expansion on 2015-2018 Mortality Through Matching and Weighting
Authors: Timothy Lycurgus, Charlotte Z. Mann, B. Hansen
DOI: https://doi.org/10.1353/obs.2021.0033 (Observational Studies, published 2021-09-09)

Abstract: Starting in 2014, states received the option to expand Medicaid through the Affordable Care Act (ACA). Many states chose to expand Medicaid, whereas others opted out. In this protocol, we describe a study examining the impact of Medicaid expansion on mortality in the continental United States from 2015 to 2018. We adopt the matching structure of Mann et al. (2021) to estimate causal effects. The protocol outlines both a standard analysis of this policy choice and an analysis employing a novel weighting scheme to examine the effect of Medicaid expansion on those who stand to benefit most from its implementation: individuals newly eligible for Medicaid.

Title: Breiman's Two Cultures: A Perspective from Econometrics
Authors: G. Imbens, S. Athey
DOI: https://doi.org/10.1353/obs.2021.0028 (Observational Studies, published 2021-07-27)

Abstract: Breiman's "Two Cultures" paper painted a picture of two disciplines, data modeling and algorithmic machine learning, both engaged in the analysis of data but talking past each other. Although that may have been true at the time, there is now much interaction between the two. In economics, for example, machine learning algorithms have become valuable and widely appreciated tools for aiding the analysis of economic data, informed by causal/structural economic models.

Title: Reflections on Breiman's Two Cultures of Statistical Modeling
Author: A. Gelman
DOI: https://doi.org/10.1353/obs.2021.0025 (Observational Studies, published 2021-07-27)

Abstract: In his article on the two cultures of statistical modeling, Leo Breiman argued for an algorithmic approach to statistics, exemplified by his pathbreaking research on large regularized models that fit data and have good predictive properties without attempting to capture the true underlying structure. I think Breiman was right about the benefits of open-ended predictive methods for complex modern problems. I also discuss some points of disagreement, notably Breiman's dismissal of Bayesian methods, which I think reflected a misunderstanding on his part: he did not recognize that Bayesian inference can be viewed as regularized prediction and does not rely on an assumption that the fitted model is true. In retrospect, we can learn both from Breiman's deep foresight and from his occasional oversights.