A framework for understanding data science

Michael L Brodie
arXiv - STAT - Other Statistics. Published 2024-02-14. DOI: arxiv-2403.00776 (https://doi.org/arxiv-2403.00776)

Abstract

The objective of this research is to provide a framework with which the data science community can understand, define, and develop data science as a field of inquiry. The framework is based on the classical reference framework (axiology, ontology, epistemology, methodology) that has been used for 200 years to define knowledge discovery paradigms and disciplines in the humanities, sciences, algorithms, and now data science. I augmented it for automated problem-solving with three further components (methods, technology, community). The resulting data science reference framework is used to define the data science knowledge discovery paradigm in terms of the philosophy of data science, addressed in previous papers, and the data science problem-solving paradigm (i.e., the data science method) and the data science problem-solving workflow, both addressed in this paper. The framework is a much-called-for unifying framework for data science, as it contains the components required to define data science. To provide insights into data science, this paper uses the framework to define the emerging, often enigmatic, data science problem-solving paradigm and workflow, and to compare them with their well-understood scientific counterparts, the scientific problem-solving paradigm and workflow.