
Proceedings. International Database Engineering and Applications Symposium: Latest Articles

SVMAX: a system for secure and valid manipulation of XML data
Pub Date : 2013-10-09 DOI: 10.1145/2513591.2513657
Houari Mahfoud, Abdessamad Imine, M. Rusinowitch
XML views are increasingly used to enforce access control in many applications and commercial database systems. To overcome the overhead of view materialization and maintenance, XML views are necessarily virtual. With this comes the need for answering XML queries posed over virtual views by rewriting them into equivalent queries on the underlying documents. A major concern here is that query rewriting for recursive XML views is still an open problem, and proposed approaches deal only with non-recursive XML views. Moreover, only a small number of works have studied access rights for updates. In this paper, we present SVMAX (Secure and Valid MAnipulation of XML), the first system that supports specification and enforcement of both read and update access policies over arbitrary XML views (recursive or not). SVMAX defines general and expressive models for controlling access to XML data using a significant class of XPath queries and in the presence of the update primitives of the W3C XQuery Update Facility. Furthermore, SVMAX features an additional module enabling efficient validation of XML documents after XQuery primitive updates. The wide use of W3C standards makes SVMAX a useful system that can be easily integrated within commercial database systems, as we will show. We give extensive experimental results, based on real-life DTDs, that show the efficiency and scalability of our system.
Pages: 154-161
Citations: 3
Stream-join revisited in the context of epoch-based SQL continuous query
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351491
Qiming Chen, M. Hsu
The current generation of stream processing systems is generally built separately from the query engine; it thus lacks the expressive power of SQL and incurs significant overhead in data access and movement. This situation has motivated us to leverage the query engine for stream processing. Stream-join is a window operation where the key issue is how to punctuate and pair two or more correlated streams. In this work we tackle this issue in the specific context of query-engine-supported stream processing. We focus on the following problems: a SQL query is definable on bounded relational data but stream data are unbounded; joining multiple streams is a stateful (thus history-sensitive) operation but a SQL query only cares about the current state; further, a relational join typically requires re-scanning a relation in a nested loop, but by nature a stream cannot be re-captured, as reading a stream always gets newly incoming data. To leverage query processing for analyzing unbounded streams, we defined the Epoch-based Continuous Query (ECQ) model, which allows a SQL query to be executed epoch by epoch to process the stream data chunk by chunk. However, unlike multiple one-time queries, an ECQ is a single, continuous query instance across execution epochs, which keeps the continuity of the application state as required by history-sensitive operations such as sliding-window join. To join multiple streams, we further developed techniques to cache one or more consecutive data chunks falling in a sliding window across query execution epochs in the ECQ instance, to allow them to be re-delivered from the cache. In this way, joining multiple streams and self-joining a single stream in a data-chunk-based window or sliding window, with various pairing schemes, are made possible. We extended the PostgreSQL engine to support the proposed approach. Our experience has demonstrated its value.
Pages: 130-138
Citations: 1
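The chunk-caching idea in the stream-join abstract above can be illustrated with a small Python sketch. It is not the authors' PostgreSQL extension: the function name `epoch_join`, the tuple layout and the "join everything currently in the window" pairing scheme are assumptions made for illustration. One generator instance plays the role of the single continuous query, and its two bounded caches carry state across epochs.

```python
from collections import deque


def epoch_join(epochs, window_epochs, key_left, key_right):
    """Join two streams chunk by chunk over a sliding window of epochs.

    epochs: iterable of (left_chunk, right_chunk) pairs, one per epoch.
    window_epochs: number of most recent chunks kept per stream (hypothetical knob).
    Yields, for every epoch, the list of (left, right) pairs whose keys match
    among all tuples currently held in the two caches.
    """
    # The caches survive across epochs because the generator is one
    # long-lived instance, not a fresh query per chunk.
    left_cache = deque(maxlen=window_epochs)
    right_cache = deque(maxlen=window_epochs)
    for left_chunk, right_chunk in epochs:
        left_cache.append(left_chunk)
        right_cache.append(right_chunk)
        # Build a hash index over the cached right-hand chunks, so cached
        # data is "re-delivered" without re-reading the stream.
        index = {}
        for chunk in right_cache:
            for r in chunk:
                index.setdefault(key_right(r), []).append(r)
        yield [
            (l, r)
            for chunk in left_cache
            for l in chunk
            for r in index.get(key_left(l), [])
        ]
```

With `window_epochs=1` this degenerates to a chunk-by-chunk join; larger values give a sliding window in which older chunks are evicted automatically by the bounded deques.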
Resource allocation algorithm for a relational join operator in grid systems
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351492
D. Cokuslu, A. Hameurlain, K. Erciyes, F. Morvan
Grid systems have become very popular during the last decade because of their rapidly increasing computational capabilities. At the same time, advances in different domains have caused an enormous increase in the scale of the manipulated data. This increases the importance of distributed query processing and has led researchers to port their underlying environments onto grid systems. However, the dynamicity, heterogeneity and large-scale characteristics of grid systems pose new problems for the distributed query processing domain. Resource allocation for query processing in grid systems is one of these problems, and it has attracted many researchers' attention. In this paper, we propose a new resource allocation algorithm for one relational join operator in a query, considering the characteristics of grid systems. We provide theoretical analyses of the proposed algorithm and support these analyses with simulations.
Pages: 139-145
Citations: 2
DYMOND: an active system for dynamic vertical partitioning of multimedia databases
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351485
L. Rodríguez-Mazahua, Xiaoou Li, Jair Cervantes, Farid García
In recent years, vertical partitioning techniques have been employed in multimedia databases to achieve efficient retrieval of multimedia objects. These techniques are static because the input to the partitioning process, which includes the queries accessing the database and their frequencies as well as the database schema, is obtained from an earlier analysis stage. This implies that when the system undergoes sufficient changes, a new analysis stage is carried out to re-run the partitioning process. Multimedia databases are accessed by many users simultaneously, so queries and their frequencies tend to change quickly over time. In this context, dynamic vertical partitioning can significantly improve performance. In this paper we present an active system called DYMOND (DYnamic Multimedia ON line Distribution), which performs dynamic vertical partitioning in multimedia databases to improve query performance. Experimental results on benchmark multimedia databases confirm the validity of our system.
Pages: 71-80
Citations: 4
Evolving social data mining and affective analysis methodologies, framework and applications
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351477
A. Vakali
Social networks drive today's opinions and content diffusion. Large-scale, distributed and unpredictable social data streams are produced, and such evolving data production offers the ground for data mining and analysis tasks. Such social data streams embed human reactions and inter-relationships, and affective and emotional analysis has become rather important in today's applications. This work highlights the major data structures and methodologies used in evolving social data mining and proceeds to the relevant affective analysis techniques. A particular framework is outlined along with indicative applications which employ evolving social data analysis with emphasis on the seminal criteria of topic, location and time. Such a mining and analysis overview is beneficial for various scientific and entrepreneurial audiences and communities in the social networking area.
Pages: 1-7
Citations: 8
A stream query language TPQL for anomaly detection in facility management
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351506
Makoto Imamura, S. Takayama, T. Munaka
In facility management for plants and buildings, there is a growing need for facility diagnosis that saves energy or facility management cost by analyzing time series data from the sensors of equipment in facilities. This paper proposes a relation-based stream query language, TPQL (Trend Pattern Query Language), for expressing constraints on time series data for anomaly detection in facilities. The features of TPQL are the following. (1) TPQL introduces a convolution operator into a stream query language in order to describe constraints over a sliding window. A convolution operator that takes a window function as an argument can express various domain-dependent functions that extract features over sliding windows, such as duration constraints and hunting constraints. (2) TPQL introduces a time-interval-based join into the stream query language in order to join time series data with different sampling rates.
Pages: 235-238
Citations: 2
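The two TPQL features named in the abstract above can be mimicked in plain Python to show the general idea. This is not TPQL syntax, which the abstract does not give. The helpers `sliding_apply`, `exceeds_for_whole_window` and `interval_join` are hypothetical names, and the join shown (each slow-rate sample is treated as valid for one sampling period) is only one plausible reading of "time-interval based join".

```python
from collections import deque


def sliding_apply(samples, width, window_fn):
    """Apply window_fn to every full sliding window of `width` samples.

    samples: iterable of (timestamp, value) pairs in time order.
    Yields (timestamp of the newest sample in the window, window_fn result).
    """
    window = deque(maxlen=width)
    for ts, value in samples:
        window.append(value)
        if len(window) == width:
            yield ts, window_fn(list(window))


def exceeds_for_whole_window(threshold):
    """Duration-style constraint: every value in the window is above threshold."""
    return lambda values: all(v > threshold for v in values)


def interval_join(fast, slow, slow_period):
    """Pair each fast-rate sample with the slow-rate sample covering it.

    Assumption: a slow sample taken at time t is valid on [t, t + slow_period).
    Both inputs are lists of (timestamp, value) sorted by timestamp.
    """
    joined = []
    j = 0
    for ts, value in fast:
        # Advance to the latest slow sample that starts at or before ts.
        while j + 1 < len(slow) and slow[j + 1][0] <= ts:
            j += 1
        if slow and slow[j][0] <= ts < slow[j][0] + slow_period:
            joined.append((ts, value, slow[j][1]))
    return joined
```

Passing the window function as an argument keeps the sliding-window machinery generic, which mirrors the role the abstract gives to the convolution operator; other constraints would be expressed as further window functions.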
TEEPA: a timely-aware elastic parallel architecture
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351480
J. Costa, P. Martins, J. Cecílio, P. Furtado
Parallel Shared-Nothing architectures are frequently used to handle large star-schema Data Warehouses (DW). The continuous increase in data volume and the star-schema storage organization introduce severe limitations to scalability due to the well-known parallel join issues and the resulting need to use solutions such as on-the-fly repartitioning of data or intermediate results, or massive replication of large data sets that still need to be joined locally, constraining their ability to deliver fast results. Parallelism may improve query performance; however, some business decisions may require that query results be available in a timely manner, which, even with additional parallelism and significant upgrade costs (both monetary and due to disturbance of normal operations), cannot be guaranteed. We propose a Timely-aware Execution Parallel Architecture (TEEPA) which balances data load and query processing among an elastic set of non-dedicated heterogeneous nodes in order to provide scale-out performance and timely query results. Data is allocated using adaptable storage models to minimize join costs (the major uncertainty factor) which best fit the nodes' capabilities, while preserving a consistent logical view of the star-schema. We present an experimental evaluation of TEEPA and demonstrate its ability to provide timely results.
Pages: 24-31
Citations: 2
A cooperative scheme to aggregate spatio-temporal events in VANETs
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351488
D. Zekri, Bruno Defude, T. Delot
Today, thanks to vehicular networks, drivers may receive useful information produced or relayed by neighboring sensors or vehicles (e.g., the location of an available parking space, of traffic congestion, etc.). In this paper, we address the problem of providing assistance to the driver when no recent information has been received on his/her vehicle. Therefore, we present a cooperative scheme to aggregate, store and exchange these events in order to have a history of past events. This scheme is based on a dedicated spatio-temporal aggregation structure using Flajolet-Martin sketches and deployed on each vehicle. Contrary to existing approaches considering data aggregation in vehicular networks, our main goal here is not to save network bandwidth but rather to extract useful knowledge from previous observations. In this paper, we present our aggregation data structure, the associated exchange protocol and a set of experiments showing the effectiveness of our proposal.
Pages: 100-109
Citations: 3
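The abstract above builds on Flajolet-Martin sketches, so a minimal single-hash version is sketched here in Python to show why they suit cooperative aggregation: two sketches merge with a bitwise OR, and re-adding an already seen event changes nothing. The class name `FMSketch`, the use of SHA-1 as the hash and the event strings are assumptions for illustration; the paper's spatio-temporal structure around the sketches is not reproduced.

```python
import hashlib

PHI = 0.77351  # bias-correction constant from Flajolet and Martin's analysis


class FMSketch:
    """Single-bitmap Flajolet-Martin sketch for approximate distinct counting."""

    def __init__(self, bits=32, bitmap=0):
        self.bits = bits
        self.bitmap = bitmap

    def add(self, item):
        # Hash the item to a `bits`-wide integer (SHA-1 is an arbitrary choice).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big") % (1 << self.bits)
        # Record the position of the least-significant 1 bit of the hash.
        r = self.bits - 1 if h == 0 else (h & -h).bit_length() - 1
        self.bitmap |= 1 << r

    def merge(self, other):
        # The union of two observed event sets is the OR of their bitmaps.
        return FMSketch(self.bits, self.bitmap | other.bitmap)

    def estimate(self):
        # Position of the lowest unset bit drives the cardinality estimate.
        r = 0
        while (self.bitmap >> r) & 1:
            r += 1
        return (2 ** r) / PHI
```

A single bitmap gives a high-variance estimate; practical deployments usually average over many bitmaps. Because merging is idempotent, an event relayed by several vehicles is not counted twice, which is the property that matters when sketches are exchanged between vehicles.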
Incrementally maintaining run-length encoded attributes in column stores
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351493
Abhijeet Mohapatra, M. Genesereth
Run-length encoding is a popular compression scheme which is used extensively to compress the attribute values in column stores. Out-of-order insertion of tuples potentially degrades the compression achieved using run-length encoding and, consequently, the performance of reads. In-place insertions, deletions and updates of tuples in a column store relation with n tuples take O(n) time. The linear cost is typically avoided by amortizing the cost of updates in batches. However, the relation is decompressed and subsequently re-compressed after applying a batch of updates. This leads to added time complexity. We propose a novel indexing scheme called count indexes that supports O(log n) in-place insertions, deletions, updates and lookups on a run-length encoded sequence with n runs. We also show that count indexes efficiently update a batch of tuples, requiring almost constant time per updated tuple. Additionally, we show that count indexes are optimal. We extend count indexes to support O(log n) updates on bitmapped sequences with n values and adapt them to block-based stores.
Pages: 146-154
Citations: 6
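To make the cost described in the abstract above concrete, here is a minimal sketch of the baseline it argues against: an in-place insert into a flat list of runs, which scans the runs linearly. It is not the paper's count index; the names `rle_encode`, `rle_decode` and `rle_insert` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Run = Tuple[str, int]  # (value, run length)


def rle_encode(values: Sequence[str]) -> List[Run]:
    """Collapse consecutive equal values into (value, length) runs."""
    runs: List[Run] = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1] = (v, runs[-1][1] + 1)
        else:
            runs.append((v, 1))
    return runs


def rle_decode(runs: Sequence[Run]) -> List[str]:
    """Expand runs back into the flat list of values."""
    return [v for v, n in runs for _ in range(n)]


def rle_insert(runs: Sequence[Run], pos: int, value: str) -> List[Run]:
    """Insert `value` at 0-based tuple position `pos` with a linear scan.

    The run containing `pos` is split around the new value, then
    adjacent runs holding the same value are merged again.
    """
    total = sum(n for _, n in runs)
    if not 0 <= pos <= total:
        raise IndexError("position out of range")

    pieces: List[Run] = []
    offset = 0       # number of tuples covered by the runs seen so far
    placed = False
    for v, n in runs:
        if not placed and pos <= offset + n:
            left = pos - offset
            pieces += [(v, left), (value, 1), (v, n - left)]
            placed = True
        else:
            pieces.append((v, n))
        offset += n
    if not placed:   # only reached when `runs` is empty
        pieces.append((value, 1))

    merged: List[Run] = []
    for v, n in pieces:
        if n == 0:
            continue  # drop empty halves produced by a split at a run boundary
        if merged and merged[-1][0] == v:
            merged[-1] = (v, merged[-1][1] + n)
        else:
            merged.append((v, n))
    return merged


runs = rle_encode("aaabbb")            # two runs
scattered = rle_insert(runs, 1, "b")   # out-of-order insert splits a run
```

In this sketch the out-of-order insert turns two runs into four, which is the loss of compression the abstract mentions. Per the abstract, a count index replaces the linear scan with an O(log n) lookup; that structure is not reproduced here.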
The QOL approach for optimizing distributed queries without complete knowledge
Pub Date : 2012-08-08 DOI: 10.1145/2351476.2351487
L. Martínez, C. Collet, Christophe Bobineau, Etienne Dublé
This paper concerns the integration of the Case-Based Reasoning (CBR) paradigm in query processing, providing a way to optimize queries when there is no prior knowledge of the queried data sources and certainly no related metadata such as data statistics. Our Query Optimization by Learning (QOL) approach optimizes queries using cases generated from the evaluation of similar past queries. A query case comprises: (i) the query, (ii) the query plan and (iii) the measures (computational resources consumed) of the query plan. The work also concerns the way the CBR process interacts with the query plan generation process. This process uses classical heuristics and makes decisions randomly (e.g. when there are no statistics for join ordering or for the selection of algorithms and routing protocols); it also (re)uses cases (existing query plans) for similar query parts, improving the efficiency of query optimization and evaluation.
Citations: 5
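The three-part query case described in the abstract above can be sketched as a small record plus a retrieval step. Everything beyond the three fields is an assumption made for illustration: the abstract does not say how queries are represented or how similarity is measured, so the token-set representation, the Jaccard similarity, the `threshold` parameter and the names `QueryCase`, `jaccard` and `retrieve` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass
class QueryCase:
    """One case: (i) the query, (ii) its plan, (iii) measured resource use."""
    query: FrozenSet[str]       # assumed representation: relation and predicate tokens
    plan: str                   # assumed representation: a textual plan
    measures: Dict[str, float]  # e.g. elapsed time; the keys are hypothetical


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Set-overlap similarity, used here as a stand-in similarity measure."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retrieve(case_base: List[QueryCase], query: FrozenSet[str],
             threshold: float = 0.5) -> Optional[QueryCase]:
    """Return the most similar past case, or None if nothing is close enough."""
    best = max(case_base, key=lambda c: jaccard(c.query, query), default=None)
    if best is None or jaccard(best.query, query) < threshold:
        return None
    return best


case_base = [
    QueryCase(
        query=frozenset({"orders", "customers", "orders.cid=customers.id"}),
        plan="hash_join(orders, customers)",
        measures={"elapsed_s": 1.8},
    ),
    QueryCase(
        query=frozenset({"parts", "suppliers"}),
        plan="nested_loop(parts, suppliers)",
        measures={"elapsed_s": 4.2},
    ),
]

new_query = frozenset({"orders", "customers", "orders.cid=customers.id",
                       "customers.country='FR'"})
reused = retrieve(case_base, new_query)
```

When `retrieve` returns `None`, an optimizer following the abstract would fall back on classical heuristics or random decisions, then store the evaluated plan and its measures as a new case.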