
Latest publications in Advances in Database Technology: Proceedings of the International Conference on Extending Database Technology (EDBT)

Recommending Unanimously Preferred Items to Groups
Karim Benouaret, K. Tan
Due to the pervasiveness of group activities in people’s daily life, group recommendation has attracted a massive research effort in both industry and academia. A fundamental challenge in group recommendation is how to aggregate the preferences of group members to select a set of items maximizing the overall satisfaction of the group; this is the focus of this paper. Specifically, we introduce a dual adjustment aggregation score, which measures the relevance of an item to a group. We then propose a recommendation scheme, termed 𝑘-dual adjustment unanimous skyline, that seeks to retrieve the 𝑘 items with the highest score, while discarding items that are unanimously considered inappropriate. Furthermore, we design and develop algorithms for computing the 𝑘-dual adjustment unanimous skyline efficiently. Finally, we demonstrate both the retrieval effectiveness and the efficiency of our approach through an extensive experimental evaluation on real datasets.
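The abstract does not define the dual adjustment aggregation score, so the following is only a toy sketch of the unanimity idea: an item is treated as unanimously inappropriate when every group member rates it below a threshold, and the surviving items are ranked by mean score. All names and the threshold are hypothetical.

```python
# Toy sketch only: the paper's dual adjustment aggregation score is not
# reproduced here. An item is "unanimously inappropriate" when every
# member rates it below a threshold; survivors are ranked by mean score.

def recommend_top_k(ratings, k, threshold=2.0):
    """ratings: {item: [score per group member]} -> top-k item ids."""
    kept = {item: scores for item, scores in ratings.items()
            if not all(s < threshold for s in scores)}
    mean = {item: sum(s) / len(s) for item, s in kept.items()}
    return sorted(mean, key=mean.get, reverse=True)[:k]

ratings = {
    "a": [5, 4, 5],  # liked by everyone
    "b": [1, 1, 1],  # unanimously below threshold -> discarded
    "c": [4, 2, 3],
}
print(recommend_top_k(ratings, k=2))  # ['a', 'c']
```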
DOI: 10.48786/edbt.2023.29 · pp. 364-377
Citations: 0
Data Narration for the People: Challenges and Opportunities
S. Amer-Yahia, Patrick Marcel, Verónika Peralta
Data narration is the process of telling stories with insights extracted from data. It is an instance of data science [4] where the pipeline focuses on data collection and exploration, answering questions, structuring answers, and finally presenting them to stakeholders [16, 17]. This tutorial reviews the challenges and opportunities of the full and semi-automation of these steps. In doing so, it draws from the extensive literature in data narration, data exploration and data visualization. In particular, we point out key theoretical and practical contributions in each domain such as next-step recommendation and policy learning for data exploration, insight interestingness and evaluation frameworks, and the crafting of data stories for the people who will exploit them. We also identify topics that are still worth investigating, such as the inclusion of different stakeholders’ profiles in designing data pipelines with the goal of providing data narration for all.
DOI: 10.48786/edbt.2023.82 · pp. 855-858
Citations: 0
Multi-Dimensional Data Publishing With Local Differential Privacy
Gaoyuan Liu, Peng Tang, Chengyu Hu, Chongshi Jin, Shanqing Guo
This paper studies the publication of multi-dimensional data with local differential privacy (LDP). This problem raises tremendous challenges in terms of both computational efficiency and data utility. The state-of-the-art solution addresses this problem by first constructing a junction tree (a kind of probabilistic graphical model, PGM) to generate a set of noisy low-dimensional marginals of the input data and then using them to approximate the distribution of the input dataset for synthetic data generation. However, the existing solution has two severe limitations: it calculates a large number of attribute pairs’ marginals to construct the PGM, and it handles the calculation of the marginal distributions of large cliques in the PGM poorly, both of which degrade the quality of the synthetic data. To address these deficiencies, based on the sparseness of the constructed PGM and the divisibility of LDP, we first propose an incremental learning-based PGM construction method, in which we gradually prune the edges (attribute pairs) with weak correlation and allocate more data and privacy budget to the useful edges, thereby improving the model’s accuracy; to this end, we introduce a high-precision data accumulation technique and a low-error edge pruning technique. Second, based on joint distribution decomposition and redundancy elimination, we propose a novel marginal calculation method for the large cliques in the context of LDP. Extensive experiments on real datasets demonstrate that our solution offers desirable data utility.
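For readers unfamiliar with LDP, the basic perturbation primitive that schemes in this space build on can be illustrated with k-ary randomized response, a standard textbook mechanism (not this paper's method): each user reports the true value with probability e^ε / (e^ε + k − 1) and a uniformly random other value otherwise.

```python
import math
import random

def k_rr(value, domain, epsilon, rng=random):
    """k-ary randomized response: keep the true value with probability
    p = e^eps / (e^eps + k - 1); otherwise report a uniformly random
    other value. Each individual report satisfies eps-LDP."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    return rng.choice([v for v in domain if v != value])

random.seed(42)
reports = [k_rr("red", ["red", "green", "blue"], epsilon=1.0) for _ in range(5)]
print(reports)  # each report is one of "red"/"green"/"blue"
```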
DOI: 10.48786/edbt.2023.15 · pp. 183-194
Citations: 6
Joint Source and Schema Evolution: Insights from a Study of 195 FOSS Projects
Panos Vassiliadis, Fation Shehaj, George Kalampokis, A. Zarras
In this paper, we address the problem of the co-evolution of Free Open Source Software projects with the relational schemata that they encompass. We exploit a data set of 195 publicly available schema histories of FOSS projects hosted on Github, for which we locally cloned their respective projects and measured their evolution progress. Our first research question asks what percentage of the projects demonstrate a “hand-in-hand” schema and source code co-evolution. To address this question, we defined synchronicity by allowing a bounded amount of lag between the cumulative evolution of the schema and the entire project. A core finding is that there are all kinds of behaviors with respect to project and schema co-evolution, resulting in only a small number of projects where the evolution of schema and project progresses in sync. Moreover, we discovered that after exceeding a 5-year threshold of project life, schemata gravitate to lower rates of evolution, which practically means that, with time, the schemata stop evolving as actively as they originally did. To answer a second question, on whether evolution comes early in the life of a schema, we measured how often the cumulative progress of schema evolution exceeds the respective progress of source change, as well as the respective progress of time. The results indicate that a large majority of schemata demonstrate early advance of schema change with respect to code evolution, and an even larger majority also demonstrates an advance of schema evolution with respect to time. Third, we asked at which time point in their lives do schemata attain a substantial
DOI: 10.48786/edbt.2023.03 · pp. 27-39
Citations: 3
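The bounded-lag notion of synchronicity in the abstract above can be pictured with a toy check; this is only one possible interpretation (the paper's precise definition may differ), comparing two cumulative-change curves sampled at the same time points.

```python
def in_sync(schema_cum, project_cum, max_lag=0.2):
    """Both arguments are cumulative fractions of total change (0..1),
    sampled at the same time points. 'Hand-in-hand' co-evolution here
    means the two curves never diverge by more than max_lag."""
    return all(abs(s - p) <= max_lag
               for s, p in zip(schema_cum, project_cum))

print(in_sync([0.1, 0.5, 1.0], [0.2, 0.6, 1.0]))  # True
print(in_sync([0.9, 1.0, 1.0], [0.1, 0.4, 1.0]))  # False: schema ran far ahead
```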
Data Provenance for SHACL
Thomas Delva, Maxim Jakubowski
In constraint languages for RDF graphs, such as ShEx and SHACL, constraints on nodes and their properties are known as “shapes”. Using SHACL, we propose in this paper the notion of neighborhood of a node 𝑣 satisfying a given shape in a graph 𝐺. This neighborhood is a subgraph of 𝐺, and provides data provenance of 𝑣 for the given shape. We establish a correctness property for the obtained provenance mechanism, by proving that neighborhoods adhere to the Sufficiency requirement articulated for provenance semantics for database queries. As an additional benefit, neighborhoods allow a novel use of shapes: the extraction of a subgraph from an RDF graph, the so-called shape fragment. We compare shape fragments with SPARQL queries. We discuss implementation strategies for computing neighborhoods, and present initial experiments demonstrating that our ideas are feasible.
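As a loose illustration of the neighborhood idea only (real SHACL shapes are far richer than this), a toy version might treat a shape as a set of required predicates and return the node's triples that witness them; all names here are hypothetical.

```python
# Toy only: a real SHACL shape is much richer. Here a "shape" is just a
# set of predicates a node must have, and the neighborhood is the set of
# the node's triples that witness those predicates.

def neighborhood(graph, node, shape_preds):
    sub = {(s, p, o) for (s, p, o) in graph if s == node and p in shape_preds}
    satisfied = shape_preds <= {p for (_, p, _) in sub}
    return sub if satisfied else None  # None: node does not match the shape

G = {("alice", "name", "Alice"), ("alice", "knows", "bob"),
     ("bob", "name", "Bob")}
print(neighborhood(G, "alice", {"name", "knows"}))  # alice's two triples
print(neighborhood(G, "bob", {"name", "knows"}))    # None (no "knows" triple)
```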
DOI: 10.48786/edbt.2023.23 · pp. 285-297
Citations: 0
Stitcher: Learned Workload Synthesis from Historical Performance Footprints
Chengcheng Wan, Yiwen Zhu, Joyce Cahoon, Wenjing Wang, K. Lin, Sean Liu, Raymond Truong, Neetu Singh, Alexandra Ciortea, Konstantinos Karanasos, Subru Krishnan
Database benchmarking and workload replay have been widely used to drive system design, evaluate workload performance, determine product evolution, and guide cloud migration. However, they both suffer from some key limitations: the former fails to capture the variety and complexity of production workloads; the latter requires access to user data, queries, and machine specifications, rendering it inapplicable in the face of user privacy concerns. Here we introduce our vision of learned workload synthesis to overcome these issues: given the performance profile of a customer workload (e.g., CPU/memory counters), synthesize a new workload that yields the same performance profile when executed on a range of hardware/software configurations. We present Stitcher as a first step towards realizing this vision, which synthesizes workloads by combining pieces from standard benchmarks. We believe that our vision will spark new research avenues in database workload replay.
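One naive way to picture the stitching idea (purely illustrative, not Stitcher's actual algorithm) is a brute-force search for the subset of benchmark pieces whose averaged performance profile best matches a target profile; the benchmark names and profile vectors below are hypothetical.

```python
from itertools import combinations

def stitch(target, pieces):
    """Brute-force: find the subset of benchmark pieces whose averaged
    profile vector is closest (L1 distance) to the target profile."""
    best, best_dist = None, float("inf")
    names = list(pieces)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            avg = [sum(pieces[n][i] for n in combo) / len(combo)
                   for i in range(len(target))]
            dist = sum(abs(a - t) for a, t in zip(avg, target))
            if dist < best_dist:
                best, best_dist = combo, dist
    return best

# Hypothetical (CPU, memory) utilization profiles of benchmark pieces.
pieces = {"tpcc": [0.9, 0.4], "tpch": [0.3, 0.8], "ycsb": [0.5, 0.2]}
print(stitch([0.6, 0.6], pieces))  # ('tpcc', 'tpch')
```

A real system would of course use a learned model rather than exhaustive search, which is exponential in the number of pieces.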
DOI: 10.48786/edbt.2023.33 · pp. 417-423
Citations: 0
Understanding crowd energy consumption behaviors
X. Liu, Xu Cheng, Yanyan Yang, Huan Huo, Yongping Liu, P. S. Nielsen
Understanding crowd behavior is crucial for energy demand-side management. In this paper, we employ the fluid-dynamics concept of potential flow to model the energy demand shift patterns of the crowd in both temporal and spatial dimensions. To facilitate the use of the proposed method, we implement a visual analysis platform that allows users to interactively explore and interpret the shift patterns. The effectiveness of the proposed method will be evaluated through a hands-on experience with a real case study during the conference demonstration.
DOI: 10.48786/edbt.2023.68 · pp. 799-802
Citations: 0
Pushing Edge Computing one Step Further: Resilient and Privacy-Preserving Processing on Personal Devices
Ludovic Javet, N. Anciaux, Luc Bouganim, Léo Lamoureux, P. Pucheral
Can we push Edge computing one step further? This demonstration paper proposes an answer to this question by leveraging the generalization of Trusted Execution Environments at the very edge of the network to enable resilient and privacy-preserving computation on personal devices. Based on preliminary published results, we show that this can drastically change the way distributed processing over personal data is conceived and achieved. The platform presented here demonstrates the pertinence of the approach through execution scenarios integrating heterogeneous secure personal devices.
DOI: 10.48786/edbt.2023.77 · pp. 835-838
Citations: 0
REQUIRED: A Tool to Relax Queries through Relaxed Functional Dependencies
Loredana Caruccio, Stefano Cirillo, V. Deufemia, G. Polese, R. Stanzione
Query relaxation aims to relax the query constraints in order to derive some approximate results when the answer set is small. In this demo paper, we present REQUIRED, an automated, portable, and scalable query relaxation tool leveraging metadata learned from an input dataset. The intuition is to use relationships underlying attribute values to derive a new query whose approximate results still meet the user’s expectations. In particular, REQUIRED exploits relaxed functional dependencies to modify the original query in two different ways: (𝑖) relaxing some query conditions by replacing the equality constraints with ranges and/or collections of admissible values, and (𝑖𝑖) rewriting the original query by replacing some or all the attributes involved in the conditions of the query with attributes related to them. Our demonstration scenarios show that REQUIRED is effective in properly relaxing queries according to the considered strategy.
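The first relaxation strategy, replacing equality constraints with ranges, can be pictured with a toy sketch; the per-attribute tolerances below are hypothetical stand-ins for what a relaxed functional dependency would suggest, and the data is invented.

```python
def relax_conditions(conds, tolerance):
    """Turn each equality condition attr = v into a range
    [v - tol, v + tol] (numeric attributes only)."""
    return {a: (v - tolerance[a], v + tolerance[a]) for a, v in conds.items()}

def run_query(rows, ranges):
    """Return the rows falling inside every relaxed range."""
    return [r for r in rows
            if all(lo <= r[a] <= hi for a, (lo, hi) in ranges.items())]

rows = [{"price": 95, "year": 2019}, {"price": 140, "year": 2021}]
ranges = relax_conditions({"price": 100, "year": 2020},
                          {"price": 10, "year": 1})
print(run_query(rows, ranges))  # [{'price': 95, 'year': 2019}]
```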
DOI: 10.48786/edbt.2023.74 · pp. 823-826
Citations: 0
Efficient Multi-Model Management
Nils Strassenburg, Dominic Kupfer, J. Kowal, T. Rabl
Deep learning models are deployed in an increasing number of industrial domains, such as retail and automotive applications. An instance of a model typically performs one specific task, which is why larger software systems use multiple models in parallel. Given that all models in production software have to be managed, this leads to the problem of managing sets of related models, i.e., multi-model management. Existing approaches perform poorly on this task because they are optimized for saving single large models but not for simultaneously saving a set of related models. In this paper, we explore the space of multi-model management by presenting three optimized approaches: (1) A baseline approach that saves full model representations and minimizes the amount of saved metadata. (2) An update approach that reduces the storage consumption compared to the baseline by saving parameter updates instead of full models. (3) A provenance approach that saves model provenance data instead of model parameters. We evaluate the approaches for the multi-model management use cases of managing car battery cell models and image classification models. Our results show that the baseline outperforms existing approaches for save and recover times by more than an order of magnitude and that more sophisticated approaches reduce the storage consumption by up to 99%.
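The update approach (2) can be pictured with a toy sketch that stores only the parameters that changed relative to the previous model version and reconstructs a version from its predecessor; names are hypothetical and the paper's actual representation is more elaborate.

```python
def param_update(prev, curr, eps=1e-9):
    """Keep only parameters whose value changed vs. the previous version
    (assumes both versions share the same parameter names)."""
    return {k: v for k, v in curr.items() if abs(v - prev[k]) > eps}

def restore(prev, update):
    """Rebuild a full version by applying the stored update to its base."""
    out = dict(prev)
    out.update(update)
    return out

v1 = {"w0": 0.5, "w1": -1.2, "w2": 3.0}
v2 = {"w0": 0.5, "w1": -1.1, "w2": 3.0}
delta = param_update(v1, v2)
print(delta)  # {'w1': -1.1}
assert restore(v1, delta) == v2
```

Storing `delta` instead of `v2` is what trades storage for a (cheap) reconstruction step at load time.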
{"title":"Efficient Multi-Model Management","authors":"Nils Strassenburg, Dominic Kupfer, J. Kowal, T. Rabl","doi":"10.48786/edbt.2023.37","DOIUrl":"https://doi.org/10.48786/edbt.2023.37","url":null,"abstract":"Deep learning models are deployed in an increasing number of industrial domains, such as retail and automotive applications. An instance of a model typically performs one specific task, which is why larger software systems use multiple models in parallel. Given that all models in production software have to be managed, this leads to the problem of managing sets of related models, i.e., multi-model management. Existing approaches perform poorly on this task because they are optimized for saving single large models but not for simultaneously saving a set of related models. In this paper, we explore the space of multi-model management by presenting three optimized approaches: (1) A baseline approach that saves full model representations and minimizes the amount of saved metadata. (2) An update approach that reduces the storage consumption compared to the baseline by saving parameter updates instead of full models. (3) A provenance approach that saves model provenance data instead of model parameters. We evaluate the approaches for the multi-model management use cases of managing car battery cell models and image classification models. Our results show that the baseline outperforms existing approaches for save and recover times by more than an order of magnitude and that more sophisticated approaches reduce the storage consumption by up to 99%.","PeriodicalId":88813,"journal":{"name":"Advances in database technology : proceedings. 
International Conference on Extending Database Technology","volume":"77 1","pages":"457-463"},"PeriodicalIF":0.0,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86764458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
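The update approach described in the abstract, saving parameter deltas instead of repeated full model copies, can be sketched roughly as follows. The function names and the dict-of-parameters representation are illustrative assumptions for this listing, not the paper's actual implementation.

```python
# Illustrative sketch of an "update approach" to multi-model storage:
# keep one full base version of a model's parameters and store each
# later version as a delta holding only the parameters that changed.

def make_delta(base, new):
    """Record only the parameters of `new` that differ from `base`."""
    return {name: value for name, value in new.items()
            if base.get(name) != value}

def apply_delta(base, delta):
    """Rebuild a full parameter set from the base plus one delta."""
    rebuilt = dict(base)
    rebuilt.update(delta)
    return rebuilt

v1 = {"w1": 0.10, "w2": 0.20, "w3": 0.30}   # full base version
v2 = {"w1": 0.10, "w2": 0.25, "w3": 0.30}   # fine-tuned: only w2 changed

delta = make_delta(v1, v2)                  # stores a single parameter
assert apply_delta(v1, delta) == v2         # v2 is fully recoverable
print(delta)
```

When successive versions of related models differ in only a small fraction of their parameters, storing deltas like this is what makes the large storage savings reported in the abstract plausible; the trade-off is that recovering a version may require replaying a chain of deltas.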