Seven ways to make a data science project fail
Robert J. Glushko
Data and Information Management, vol. 7, no. 1, Article 100029. Published 2023-03-01.
DOI: 10.1016/j.dim.2023.100029
https://www.sciencedirect.com/science/article/pii/S2543925123000037
Citation count: 0
Abstract
From an industry perspective, the rapid emergence of data science as a field has made it a rival to, or a replacement for, information science. In particular, the “big data” meme in data science and a heavy reliance on “black box” technology emphasize the quantity of data used in a project and ask “what data do we have?” rather than “what data do we need to solve our business problems?” This perspective also undermines the perceived importance of domain expertise, user research, data semantics and provenance, and other considerations valued in information science. This article uses a composite (and somewhat caricatured) case study of a data science project to discuss seven ways in which the project is destined to fail, and then explains how “good information science” would have prevented or ameliorated each of them. Data science and information science need to recognize that together they can accomplish more than they can separately.