Rejoinder: “Co-citation and Co-authorship Networks of Statisticians”

IF 4.6 Q2 MATERIALS SCIENCE, BIOMATERIALS ACS Applied Bio Materials Pub Date : 2022-04-03 DOI:10.1080/07350015.2022.2055358
Pengsheng Ji, Jiashun Jin, Z. Ke, Wanshan Li
{"title":"Rejoinder: “Co-citation and Co-authorship Networks of Statisticians”","authors":"Pengsheng Ji, Jiashun Jin, Z. Ke, Wanshan Li","doi":"10.1080/07350015.2022.2055358","DOIUrl":null,"url":null,"abstract":"We thank David Donoho for very encouraging comments. As always, his penetrating vision and deep thoughts are extremely stimulating. We are glad that he summarizes a major philosophical difference between statistics in earlier years (e.g., the time of Francis Galton) and statistics in our time by just a few words: data-first versus model-first. We completely agree with his comment that “each effort by a statistics researcher to understand a newly available type of data enlarges our field; it should be a primary part of the career of statisticians to cultivate an interest in cultivating new types of datasets, so that new methodology can be discovered and developed”; these are exactly the motivations underlying our (several-year) efforts in collecting, cleaning, and analyzing a large-scale high-quality dataset. We would like to add that both traditions have strengths, and combining the strengths of two sides may greatly help statisticians deal with the so-called crisis of the 21st century in statistics we face today. Let us explain the crisis above first. In the model-first tradition, with a particular application problem in mind, we propose a model, develop a method and justify its optimality by some hard-to-prove theorems, and find a dataset to support the approach. In this tradition, we put a lot of faith on our model and our theory: we hope the model is adequate, and we hope our optimality theory warrants the superiority of our method over others. Modern machine learning literature (especially the recent development of deep learning) provides a different approach to justifying the “superiority” of an approach; we compare the proposed approach with existing approaches by the real data results over a dozen of benchmark datasets. To choose an algorithm for their dataset, a practitioner does not necessarily need warranties from a theorem; a superior performance over many benchmark datasets says it all. To some theoretical statisticians, this is rather disappointing, as they come from a long","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2022-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1080/07350015.2022.2055358","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
引用次数: 1

Abstract

We thank David Donoho for very encouraging comments. As always, his penetrating vision and deep thoughts are extremely stimulating. We are glad that he summarizes a major philosophical difference between statistics in earlier years (e.g., the time of Francis Galton) and statistics in our time by just a few words: data-first versus model-first. We completely agree with his comment that “each effort by a statistics researcher to understand a newly available type of data enlarges our field; it should be a primary part of the career of statisticians to cultivate an interest in cultivating new types of datasets, so that new methodology can be discovered and developed”; these are exactly the motivations underlying our (several-year) efforts in collecting, cleaning, and analyzing a large-scale high-quality dataset. We would like to add that both traditions have strengths, and combining the strengths of two sides may greatly help statisticians deal with the so-called crisis of the 21st century in statistics we face today. Let us explain the crisis above first. In the model-first tradition, with a particular application problem in mind, we propose a model, develop a method and justify its optimality by some hard-to-prove theorems, and find a dataset to support the approach. In this tradition, we put a lot of faith on our model and our theory: we hope the model is adequate, and we hope our optimality theory warrants the superiority of our method over others. Modern machine learning literature (especially the recent development of deep learning) provides a different approach to justifying the “superiority” of an approach; we compare the proposed approach with existing approaches by the real data results over a dozen of benchmark datasets. To choose an algorithm for their dataset, a practitioner does not necessarily need warranties from a theorem; a superior performance over many benchmark datasets says it all. To some theoretical statisticians, this is rather disappointing, as they come from a long
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
复辩状:“统计学家的共同引用和合作网络”
我们感谢大卫·多诺霍非常鼓舞人心的评论。一如既往,他锐利的眼光和深邃的思想极具启发性。我们很高兴他总结了早期统计(例如,弗朗西斯·高尔顿的时代)和我们这个时代的统计之间的主要哲学差异,只有几个字:数据优先与模型优先。我们完全同意他的评论:“统计研究人员为理解一种新的可用数据类型所做的每一次努力都扩大了我们的研究领域;培养培养新型数据集的兴趣应该成为统计学家职业生涯的一个主要部分,这样才能发现和发展新的方法”;这些正是我们(数年)努力收集、清理和分析大规模高质量数据集的动机。我们想补充的是,这两种传统都有各自的优势,将双方的优势结合起来,可能会极大地帮助统计学家应对我们今天面临的所谓21世纪统计危机。让我们先解释一下上述危机。在模型优先的传统中,考虑到特定的应用问题,我们提出了一个模型,开发了一种方法,并通过一些难以证明的定理来证明其最优性,并找到一个数据集来支持该方法。在这个传统中,我们对我们的模型和理论有很大的信心:我们希望模型是足够的,我们希望我们的最优性理论保证我们的方法优于其他方法。现代机器学习文献(尤其是深度学习的最新发展)提供了一种不同的方法来证明一种方法的“优越性”;我们通过十几个基准数据集的真实数据结果将所提出的方法与现有方法进行了比较。为了为他们的数据集选择一种算法,从业者不一定需要定理的保证;优于许多基准数据集的优越性能说明了一切。对于一些理论统计学家来说,这是相当令人失望的,因为他们来自一个漫长的
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
ACS Applied Bio Materials
ACS Applied Bio Materials Chemistry-Chemistry (all)
CiteScore
9.40
自引率
2.10%
发文量
464
期刊最新文献
A Systematic Review of Sleep Disturbance in Idiopathic Intracranial Hypertension. Advancing Patient Education in Idiopathic Intracranial Hypertension: The Promise of Large Language Models. Anti-Myelin-Associated Glycoprotein Neuropathy: Recent Developments. Approach to Managing the Initial Presentation of Multiple Sclerosis: A Worldwide Practice Survey. Association Between LACE+ Index Risk Category and 90-Day Mortality After Stroke.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1