Seven ways to make a data science project fail

Robert J. Glushko
{"title":"Seven ways to make a data science project fail","authors":"Robert J. Glushko","doi":"10.1016/j.dim.2023.100029","DOIUrl":null,"url":null,"abstract":"<div><p>The rapid emergence of data science as a field has made it a rival or replacement for information science from an industry perspective. In particular, the “big data” meme in data science and a heavy reliance on “black box” technology emphasize the quantity of data used in a project and asks, “what data do we have” rather than “what data do we need to solve our business problems.” This perspective also undermines the perceived importance of domain expertise, user research, data semantics and provenance, and other considerations valued in information science. This article uses a composite (and somewhat caricatured) case study of a data science project and discusses seven ways in which it is destined to fail, and then explains how “good information science” would have prevented or ameliorated them. Data science and information science need to recognize that together they can accomplish more than they can accomplish separately.</p></div>","PeriodicalId":72769,"journal":{"name":"Data and information management","volume":"7 1","pages":"Article 100029"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data and information management","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2543925123000037","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

From an industry perspective, the rapid emergence of data science as a field has made it a rival to, or replacement for, information science. In particular, the “big data” meme in data science and a heavy reliance on “black box” technology emphasize the quantity of data used in a project and ask “what data do we have?” rather than “what data do we need to solve our business problems?” This perspective also undermines the perceived importance of domain expertise, user research, data semantics and provenance, and other considerations valued in information science. This article presents a composite (and somewhat caricatured) case study of a data science project, discusses seven ways in which it is destined to fail, and explains how “good information science” would have prevented or ameliorated each failure. Data science and information science need to recognize that they can accomplish more together than they can separately.
