Title: Introducing Context and Context-awareness in Data Integration: Identifying the Problem and a Preliminary Case Study on Informed Consent
Author: C. Debruyne
Venue: Proceedings of the 22nd International Conference on Information Integration and Web-based Applications & Services
Publication date: 2020-11-30
DOI: https://doi.org/10.1145/3428757.3429116
Citation count: 0
Abstract
Data integration is the process of selecting, preprocessing, and transforming data from heterogeneous sources in data-driven projects. This process also requires the most time, effort, and resources. Data integration is so involved because of the many informed decisions one has to make. These decisions are influenced by the complex context of a data-driven project. We argue that using this context could facilitate decision-making and even automate some integration steps. The problem we identify in this paper, however, is that the context of a data-driven project is tacit and therefore not easily accessible to humans, and certainly not to software agents. From the state of the art, we observe that current models represent context in crude and simplistic terms. Furthermore, these context models are built for specific tasks or application domains, such as query optimization or smart homes. The current state of affairs is thus not fit for intelligent data integration. Besides identifying the problem, we postulate that solving it requires two steps: formalizing context and using that context to build context-aware agents. We illustrate this notion of "context-aware data integration" with preliminary results from a use case in the domain of GDPR, specifically the generation of datasets that take informed consent into account.
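To make the idea of consent-aware dataset generation more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that context has already been formalized as simple consent records. The `Consent` class, the `generate_dataset` function, the field names, and the sample data are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Consent:
    """Hypothetical, simplified consent record (an assumption, not the paper's model)."""
    subject_id: str
    purpose: str
    given_on: date
    withdrawn_on: Optional[date] = None

    def covers(self, purpose: str, on: date) -> bool:
        # Consent applies only to the stated purpose and only from the day it was given.
        if self.purpose != purpose or on < self.given_on:
            return False
        # Consent stops applying from the withdrawal date onwards.
        return self.withdrawn_on is None or on < self.withdrawn_on


def generate_dataset(records, consents, purpose, on):
    """Keep only records whose data subject has valid consent for `purpose` on date `on`."""
    allowed = {c.subject_id for c in consents if c.covers(purpose, on)}
    return [r for r in records if r["subject_id"] in allowed]


# Illustrative sample data (fabricated for this sketch only).
records = [
    {"subject_id": "s1", "email": "a@example.org"},
    {"subject_id": "s2", "email": "b@example.org"},
    {"subject_id": "s3", "email": "c@example.org"},
]
consents = [
    Consent("s1", "newsletter", date(2020, 1, 1)),
    Consent("s2", "newsletter", date(2020, 1, 1), withdrawn_on=date(2020, 6, 1)),
    Consent("s3", "analytics", date(2020, 1, 1)),
]

dataset = generate_dataset(records, consents, "newsletter", date(2020, 11, 30))
print([r["subject_id"] for r in dataset])  # prints ['s1']
```

In this sketch, the purpose and the reference date play the role of the explicit context: the same source records yield different datasets depending on why and when the dataset is generated. A fuller treatment would presumably need a richer, formal context model than a flat list of consent records.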