Learning to Restructure Tables Automatically
J. M. Hellerstein
ACM SIGMOD Record, volume 54 3. Published 2024-05-14. Journal Article.
DOI: https://doi.org/10.1145/3665252.3665268
Citations: 0
Abstract
By now, it is widely accepted folk wisdom that "half of the time in any data analysis project is spent wrangling the data". Analytic algorithms and tools, built on the mathematical foundations of matrices and relations, require their data to be lined up in particular rows and columns. In the relational model (known in data science circles as "tidy data"), each row is an independent observation, and each column is a distinct attribute of the phenomenon described by the data. While there are many thorny aspects to data wrangling, perhaps none is more basic than the challenge of getting data reorganized, positionally, into the right form for analysis.
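To make the "tidy data" layout concrete, here is a minimal sketch of one common positional reorganization: turning a "wide" table, where values of one attribute are spread across column headers, into one row per observation. The table contents, the column names, and the `melt` helper are all hypothetical and chosen only for illustration. The transformation is written by hand here and is not the automatic method the paper discusses.

```python
# Hypothetical "wide" table: one row per city, one column per year.
# The year is data, but here it is hidden in the column headers.
wide_rows = [
    {"city": "Oslo", "2022": 5.1, "2023": 5.4},
    {"city": "Lima", "2022": 19.3, "2023": 19.8},
]


def melt(rows, id_column, variable_name, value_name):
    """Emit one output row per (input row, non-id column) pair.

    Illustrative helper (an assumption, not from the source): the header
    of each non-id column becomes a value in `variable_name`, and the
    cell becomes a value in `value_name`.
    """
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the identifier is copied, not unpivoted
            tidy.append(
                {
                    id_column: row[id_column],
                    variable_name: column,
                    value_name: value,
                }
            )
    return tidy


# Each resulting row is one observation: (city, year, mean_temp_c).
tidy_rows = melt(wide_rows, "city", "year", "mean_temp_c")
for tidy_row in tidy_rows:
    print(tidy_row)
```

In the result, every row describes a single city-year observation and every column holds a single attribute, which is the shape the relational model expects.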