Towards Reliable Data Analyses for Smart Cities
T. Araújo, C. Cappiello, N. P. Kozievitch, Demetrio Gomes Mestre, Carlos Eduardo S. Pires, Monica Vitali
Proceedings of the 21st International Database Engineering & Applications Symposium
Publication date: 2017-07-12
DOI: https://doi.org/10.1145/3105831.3105834
Citation count: 8
Abstract
As cities become green and smart, public information systems are being revamped to adopt digital technologies. Several sources, official or not, can provide information about a city. The availability of multiple sources enables the design of advanced analyses that offer valuable services to both citizens and municipalities. However, such analyses would fail if the data considered were affected by errors and uncertainties: Data Quality is one of the main requirements for the successful exploitation of the available information. This paper highlights the importance of Data Quality evaluation in the context of geographical data sources. Moreover, we describe how the Entity Matching task can provide additional information to refine the quality assessment and, consequently, yield a better evaluation of the reliability of data sources. Data gathered from the public transportation and urban areas of Curitiba, Brazil, are used to show the strengths and effectiveness of the presented approach.