The 4+1 Model of Data Science
Rafael C. Alvarado
arXiv - CS - General Literature, 2023-11-13, arXiv:2311.07631

Data Science is a complex and evolving field, but most agree that it can be
defined as a combination of expertise drawn from three broad areas -- computer
science and technology, math and statistics, and domain knowledge -- with the
purpose of extracting knowledge and value from data. Beyond this, the field is
often defined as a series of practical activities ranging from the cleaning and
wrangling of data, to its analysis and use to infer models, to the visual and
rhetorical representation of results to stakeholders and decision-makers. This
essay proposes a model of data science that goes beyond laundry-list
definitions to get at the specific nature of data science and help distinguish
it from adjacent fields such as computer science and statistics. We define data
science as an interdisciplinary field comprising four broad areas of expertise:
value, design, systems, and analytics. A fifth area, practice, integrates the
other four in specific contexts of domain knowledge. We call this the 4+1 model
of data science. Together, these areas belong to every data science project,
even if they are often unconnected and siloed in the academy.