{"title":"In Data We Trust","authors":"Rob Kitchin","doi":"10.2307/j.ctv1c9hmnq.9","DOIUrl":null,"url":null,"PeriodicalId":446623,"journal":{"name":"Data Lives","volume":"358 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Lives","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2307/j.ctv1c9hmnq.9","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
This chapter discusses data quality and veracity in open datasets, drawing on examples from the Irish data system: the Residential Property Price Register (RPPR), the Dublin Dashboard project, the TRIPS database, and Irish crime data. Irish crime data has several problems. Crimes are recorded against the police station that handles them rather than the location where they were committed. Crime categorization is not standardized, with some police officers recording the same crimes in slightly different ways, and recording is not always timely. Retrieving data from the crime management software system is also difficult. In addition to errors, every dataset has issues of representativeness, that is, the extent to which the data faithfully represents what it seeks to measure. In generating data, processes of extraction, abstraction, generalization and sampling can introduce measurement error, noise, imprecision and bias. Yet internationally, much work has gone into formulating data-quality guidelines and standards, trying to get those who generate and share data to adhere to them, and promoting the importance of reporting this information to users.