Abstract: We argue that algorithmic models, though powerful and appropriate in some circumstances, rely on just as many tenuous assumptions as parametric probabilistic models; these assumptions, their violations, and the ethical consequences of these violations are simply obscured within a black box. We advocate for a future in which statisticians play a central role in bridging the gap between Breiman's two cultures.
{"title":"Discussion of Breiman's \"Two Cultures\": From Two Cultures to One","authors":"Anna Neufeld, D. Witten","doi":"10.1353/obs.2021.0004","DOIUrl":"https://doi.org/10.1353/obs.2021.0004","url":null,"abstract":"Abstract:We argue that algorithmic models, though powerful and appropriate in some circumstances, rely on just as many tenuous assumptions as parametric probabilistic models; these assumptions, their violations, and the ethical consequences of these violations are simply obscured within a black box. We advocate for a future in which statisticians play a central role in bridging the gap between Breiman's two cultures.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43750897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract: Leo Breiman's article "Statistical Modeling: The Two Cultures" was timely and provocative. He advocated for statisticians to learn about and appreciate a different "culture": an algorithmic approach, as distinct from the familiar, stochastic, data modeling approach to statistics. While we have appreciated and contributed to the algorithmic approach, we have always had a foot in both camps. Here we advocate for a "melting pot", arguing that both approaches have their virtues, sometimes on the same problem.
{"title":"A Melting Pot","authors":"R. Tibshirani, T. Hastie","doi":"10.1353/obs.2021.0012","DOIUrl":"https://doi.org/10.1353/obs.2021.0012","url":null,"abstract":"Abstract:Leo Breiman's article \"Statistical Modeling: The two cultures\" was timely and provocative. He advocated for Statisticians to learn about and appreciate a different \"culture\": an algorithmic approach, as distinct from the familiar, stochastic, data modeling approach to Statistics. While we have appreciated and contributed to the algorithmic approach, we have always had a foot in both camps. Here we advocate for a \"melting pot\", arguing that both approaches have their virtues, sometimes on the same problem.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/obs.2021.0012","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44567597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract: We comment on Leo Breiman's paper "Statistical Modeling: The Two Cultures". We offer some thoughts on prediction from a broader perspective and argue that "aiming for one modern culture" is a highly embracing attempt to address key problems in the data and information sciences.
{"title":"One Modern Culture of Statistics: Comments on Statistical Modeling: The Two Cultures (Breiman, 2001b)","authors":"P. Bühlmann","doi":"10.1353/obs.2021.0020","DOIUrl":"https://doi.org/10.1353/obs.2021.0020","url":null,"abstract":"Abstract:We comment on Leo Breiman's \"Statistical Modeling: The Two Cultures\" paper. We provide some thoughts on prediction from a broader perspective and argue that \"aiming for one modern culture\" is a highly embracing attempt for addressing key problems in data and information sciences.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48758134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract: We offer descriptive and normative standards for the principled pursuit of causal inference. These standards address critiques of both the algorithmic and the data modeling cultures identified in Breiman (2001), and provide a fruitful synthesis of both cultures. We contrast the resulting "cautious causal inference" with overly optimistic methods inspired by the algorithmic data analysis methods prevalent in machine learning, as well as with older approaches to causal modeling that employ overly restrictive parametric models.
{"title":"Causal Modelling: The Two Cultures","authors":"Elizabeth L. Ogburn, I. Shpitser","doi":"10.1353/obs.2021.0006","DOIUrl":"https://doi.org/10.1353/obs.2021.0006","url":null,"abstract":"Abstract:We offer descriptive and normative standards for the principled pursuit of causal inference. These standards address critiques of both the algorithmic and the data modeling cultures identified in (Breiman, 2001), and provide a fruitful synthesis of both cultures. We contrast the resulting \"cautious causal inference\" with overly optimistic methods inspired by algorithmic data analysis methods prevalent in machine learning, as well as older approaches to causal modeling that employ overly restrictive parametric models.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/obs.2021.0006","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43520306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract: Leo Breiman's "Statistical Modeling: The Two Cultures" is a treasure for any statistician who engages with real-world problems. I argue that there is a more fundamental dichotomy between the principles of statistical modeling and the techniques for statistical modeling. Focusing entirely on the techniques in statistical education and research can be dangerous. I join Breiman's call for statistics to return to its roots.
{"title":"Statistical Modeling: Returning to its Roots","authors":"Qingyuan Zhao","doi":"10.1353/obs.2021.0014","DOIUrl":"https://doi.org/10.1353/obs.2021.0014","url":null,"abstract":"Abstract:Leo Breiman's \"Statistical Modeling: The Two Cultures\" is a treasure for any statistician who engages with real world problem. I argue that there is a more fundamental dichotomy between the principles of statistical modeling and the techniques for statistical modeling. Focusing entirely on the techniques in statistical education and research can be dangerous. I join Breiman's call for statistics to return to its roots.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/obs.2021.0014","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42393513","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nikki L B Freeman, John Sperger, Helal El-Zaatari, Anna R Kahkoska, Minxin Lu, Michael Valancius, Arti V Virkud, Tarek M Zikry, Michael R Kosorok
In the twenty years since Dr. Leo Breiman's incendiary paper Statistical Modeling: The Two Cultures was first published, algorithmic modeling techniques have gone from controversial to commonplace in the statistical community. While the widespread adoption of these methods as part of the contemporary statistician's toolkit is a testament to Dr. Breiman's vision, the number of high-profile failures of algorithmic models suggests that Dr. Breiman's final remark that "the emphasis needs to be on the problem and the data" has been less widely heeded. In the spirit of Dr. Breiman, we detail an emerging research community in statistics: data-driven decision support. We assert that realizing the full potential of decision support, both broadly and in the context of precision health, will require a culture of social awareness and accountability in addition to ongoing attention to complex technical challenges.
{"title":"Beyond Two Cultures: Cultural Infrastructure for Data-driven Decision Support.","authors":"Nikki L B Freeman, John Sperger, Helal El-Zaatari, Anna R Kahkoska, Minxin Lu, Michael Valancius, Arti V Virkud, Tarek M Zikry, Michael R Kosorok","doi":"10.1353/obs.2021.0024","DOIUrl":"10.1353/obs.2021.0024","url":null,"abstract":"<p><p>In the twenty years since Dr. Leo Breiman's incendiary paper <i>Statistical Modeling: The Two Cultures</i> was first published, algorithmic modeling techniques have gone from controversial to commonplace in the statistical community. While the widespread adoption of these methods as part of the contemporary statistician's toolkit is a testament to Dr. Breiman's vision, the number of high-profile failures of algorithmic models suggests that Dr. Breiman's final remark that \"the emphasis needs to be on the problem and the data\" has been less widely heeded. In the spirit of Dr. Breiman, we detail an emerging research community in statistics - data-driven decision support. We assert that to realize the full potential of decision support, broadly and in the context of precision health, will require a culture of social awareness and accountability, in addition to ongoing attention towards complex technical challenges.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8802367/pdf/nihms-1773096.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39741992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider an extension of Leo Breiman's thesis from "Statistical Modeling: The Two Cultures" to include a bifurcation of algorithmic modeling, focusing on parametric regressions, interpretable algorithms, and complex (possibly explainable) algorithms.
{"title":"Considerations Across Three Cultures: Parametric Regressions, Interpretable Algorithms, and Complex Algorithms.","authors":"Ani Eloyan, Sherri Rose","doi":"10.1353/obs.2021.0009","DOIUrl":"10.1353/obs.2021.0009","url":null,"abstract":"<p><p>We consider an extension of Leo Breiman's thesis from \"Statistical Modeling: The Two Cultures\" to include a bifurcation of algorithmic modeling, focusing on parametric regressions, interpretable algorithms, and complex (possibly explainable) algorithms.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8415757/pdf/nihms-1732979.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"39387492","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Causation in Action: Some Remarks Attendant to Re-reading Hill (1965)","authors":"Herbert L. Smith","doi":"10.1353/OBS.2020.0007","DOIUrl":"https://doi.org/10.1353/OBS.2020.0007","url":null,"abstract":"","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/OBS.2020.0007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45137848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract: In this paper, we use a two-step approach for heterogeneous subgroup identification with a synthetic data set motivated by the National Study of Learning Mindsets. In the first step, optimal full propensity score matching is used to estimate stratum-specific treatment effects. In the second step, regression trees identify key subgroups based on covariates for which the treatment effect varies. In working with regression trees, we emphasize the role of the cost-complexity tuning parameter, selected through permutation-based Type I error rate studies, in justifying inferential decision-making, which we contrast with graphical and quantitative exploration for future study. Results indicate that the mindset intervention was effective, overall, in improving student achievement. While our exploratory analyses identified XC, C1, and X1 as potential effect modifiers worthy of further study, we find no statistically significant evidence of effect heterogeneity with the exception of urbanicity category XC = 3, and that finding is not robust to the propensity score estimation method.
{"title":"Heterogeneous Subgroup Identification with Observational Data: A Case Study Based on the National Study of Learning Mindsets","authors":"Bryan Keller, Jianshen Chen, Tianyang Zhang","doi":"10.1353/obs.2019.0010","DOIUrl":"https://doi.org/10.1353/obs.2019.0010","url":null,"abstract":"Abstract:In this paper, we use a two-step approach for heterogeneous subgroup identification with a synthetic data set motivated by the National Study of Learning Mindsets. In the first step, optimal full propensity score matching is used to estimate stratum-specific treatment effects. In the second step, regression trees identify key subgroups based on covariates for which the treatment effect varies. In working with regression trees, we emphasize the role of the cost-complexity tuning parameter, selected through permutation-based Type I error rate studies, in justifying inferential decision-making, which we contrast with graphical and quantitative exploration for future study. Results indicate that the mindset intervention was effective, overall, in improving student achievement. While our exploratory analyses identified XC, C1, and X1 as potential effect modifiers worthy of further study, we find no statistically significant evidence of effect heterogeneity with the exception of urbanicity category XC = 3, but the finding is not robust to propensity score estimation method.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/obs.2019.0010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41420738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
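The abstract of the subgroup-identification paper above mentions choosing a tree's complexity parameter through permutation-based Type I error rate studies. The sketch below is one possible reading of that idea, not the authors' procedure: it simplifies the tree to a single split and uses simulated toy data, and all function names, the three-level covariate, and the effect sizes are hypothetical. Shuffling the stratum-level effect estimates breaks any link to the covariate, so the upper quantile of the best null split gain can serve as a threshold that a spurious split exceeds only about `alpha` of the time.

```python
import random
import statistics


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split_gain(effects, covariate):
    """Largest relative SSE reduction from one binary split on `covariate`."""
    total = sse(effects)
    if total == 0.0:
        return 0.0
    best = 0.0
    # Try every cut point except the largest value, so both sides are non-empty.
    for cut in sorted(set(covariate))[:-1]:
        left = [e for e, x in zip(effects, covariate) if x <= cut]
        right = [e for e, x in zip(effects, covariate) if x > cut]
        best = max(best, (total - sse(left) - sse(right)) / total)
    return best


def permutation_threshold(effects, covariate, alpha=0.05, n_perm=500, seed=0):
    """Gain threshold that a split on permuted (null) data exceeds about alpha of the time."""
    rng = random.Random(seed)
    shuffled = list(effects)
    gains = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any covariate-effect association
        gains.append(best_split_gain(shuffled, covariate))
    gains.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return gains[index]


# Simulated toy data (hypothetical): level 3 of the covariate has a larger effect.
data_rng = random.Random(1)
covariate = [data_rng.choice([1, 2, 3]) for _ in range(200)]
effects = [data_rng.gauss(0.3 if x == 3 else 0.0, 0.1) for x in covariate]

threshold = permutation_threshold(effects, covariate)
gain = best_split_gain(effects, covariate)
print("keep split" if gain > threshold else "prune split")
```

In a full analysis the same calibration would presumably be applied to the cost-complexity parameter of a multi-split regression tree; the single-split version is used here only to keep the sketch short.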
{"title":"Comment on Cochran’s “Observational Studies”","authors":"B. Hansen, Adam C. Sales","doi":"10.1353/obs.2015.0017","DOIUrl":"https://doi.org/10.1353/obs.2015.0017","url":null,"abstract":"","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2021-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1353/obs.2015.0017","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48808444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}