
Proceedings. 20th International Conference on Data Engineering: Latest Articles

Priority mechanisms for OLTP and transactional Web applications
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320025
David T. McWherter, Bianca Schroeder, A. Ailamaki, Mor Harchol-Balter
Transactional workloads are a hallmark of modern OLTP and Web applications, ranging from electronic commerce and banking to online shopping. Often, the database at the core of these applications is the performance bottleneck. Given the limited resources available to the database, transaction execution times can vary wildly as they compete and wait for critical resources. As the competitor is "only a click away", valuable (high-priority) users must be ensured consistently good performance via QoS and transaction prioritization. This paper analyzes and proposes prioritization for transactional workloads in traditional database systems (DBMS). This work first performs a detailed bottleneck analysis of resource usage by transactional workloads on commercial and noncommercial DBMS (IBM DB2, PostgreSQL, Shore) under a range of configurations. Second, this work implements and evaluates the performance of several preemptive and nonpreemptive DBMS prioritization policies in PostgreSQL and Shore. The primary contributions of this work include (i) understanding the bottleneck resources in transactional DBMS workloads and (ii) a demonstration that prioritization in traditional DBMS can provide 2x-5x improvement for high-priority transactions using simple scheduling policies, without expense to low-priority transactions.
Citations: 109
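The abstract above refers to simple nonpreemptive prioritization policies without spelling one out. As a rough illustration only, the sketch below orders waiting transactions so that high-priority ones are dispatched first and arrival order is preserved within each class. The class name `PriorityTxnQueue` and its methods are hypothetical; this is not the authors' PostgreSQL or Shore implementation, and it ignores which resource (locks, CPU, I/O) is actually being scheduled.

```python
import heapq
import itertools


class PriorityTxnQueue:
    """Hypothetical nonpreemptive queue: high-priority transactions go first."""

    def __init__(self):
        self._heap = []
        # Monotonic counter used as a tie-breaker so that transactions of the
        # same priority class leave the queue in arrival (FIFO) order.
        self._seq = itertools.count()

    def submit(self, txn_id, high_priority):
        rank = 0 if high_priority else 1  # lower rank is dispatched earlier
        heapq.heappush(self._heap, (rank, next(self._seq), txn_id))

    def next_txn(self):
        """Return the next transaction id to run, or None if the queue is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


def drain(queue):
    """Dispatch every waiting transaction and return the dispatch order."""
    order = []
    txn = queue.next_txn()
    while txn is not None:
        order.append(txn)
        txn = queue.next_txn()
    return order
```

A running transaction is never interrupted here; priority only affects which waiting transaction is picked next, which is what distinguishes a nonpreemptive policy from a preemptive one.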
A machine learning approach to rapid development of XML mapping queries
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320004
Atsuyuki Morishima, H. Kitagawa, Akira Matsumoto
We present XLearner, a novel tool that helps the rapid development of XML mapping queries written in XQuery. XLearner is novel in that it learns XQuery queries consistent with given examples (fragments) of intended query results. XLearner combines known learning techniques, incorporates mechanisms to cope with issues specific to the XQuery learning context, and provides a systematic way for the semiautomatic development of queries. We describe the XLearner system, present algorithms for learning various classes of XQuery, show that a minor extension gives the system practical expressive power, and report experimental results demonstrating that XLearner outputs reasonably complicated queries with only a small number of interactions with the user.
Citations: 17
Approximate selection queries over imprecise data
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1319991
Iosif Lazaridis, S. Mehrotra
We examine the problem of evaluating selection queries over imprecisely represented objects. Such objects are used either because they are much smaller in size than the precise ones (e.g., compressed versions of time series), or as imprecise replicas of fast-changing objects across the network (e.g., interval approximations for time-varying sensor readings). It may be impossible to determine whether an imprecise object meets the selection predicate. Additionally, the objects appearing in the output are also imprecise. Retrieving the precise objects themselves (at additional cost) can be used to increase the quality of the reported answer. We allow queries to specify their own answer quality requirements. We show how the query evaluation system may do the minimal amount of work to meet these requirements. Our work makes two important contributions: first, it considers queries with set-based answers, rather than the approximate aggregate queries over numerical data examined in the literature; second, it aims to minimize the combined cost of both data processing and probe operations in a single framework. Thus, we establish that the answer accuracy/performance tradeoff can be realized in a more general setting than previously seen.
Citations: 31
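To make the setting in the abstract above concrete, here is a minimal sketch assuming each imprecise object is an interval `(lo, hi)` around its true value and the predicate is `value > threshold`. An interval is then a certain match, a certain non-match, or undecidable, and undecidable objects can be resolved by a probe that fetches the precise value. The quality requirement used here (`max_maybe`, a cap on unresolved objects) and all function names are assumptions made for illustration; the abstract does not define the paper's quality measures or its cost-minimizing probe strategy.

```python
from typing import Callable, Dict, List, Tuple

Interval = Tuple[float, float]


def classify(iv: Interval, threshold: float) -> str:
    """Classify an interval against the predicate `value > threshold`."""
    lo, hi = iv
    if lo > threshold:
        return "YES"    # every possible value satisfies the predicate
    if hi <= threshold:
        return "NO"     # no possible value satisfies the predicate
    return "MAYBE"      # the interval straddles the threshold


def select_greater(
    objects: Dict[str, Interval],
    threshold: float,
    probe: Callable[[str], float],
    max_maybe: int,
) -> Tuple[List[str], List[str], int]:
    """Return (certain matches, unresolved objects, number of probes used).

    Probes are issued only while more than `max_maybe` objects are
    unresolved, so a looser requirement costs fewer probes.
    """
    yes: List[str] = []
    maybe: List[str] = []
    for oid, iv in objects.items():
        label = classify(iv, threshold)
        if label == "YES":
            yes.append(oid)
        elif label == "MAYBE":
            maybe.append(oid)

    probes = 0
    while len(maybe) > max_maybe:
        oid = maybe.pop()
        probes += 1
        if probe(oid) > threshold:  # precise value settles the question
            yes.append(oid)
    return sorted(yes), sorted(maybe), probes
```

With `max_maybe=0` every uncertain object is probed and the answer is exact; raising it trades answer quality for fewer probes, which is the accuracy/performance tradeoff the abstract describes.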
Detection and correction of conflicting source updates for view maintenance
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320017
Songting Chen, Jun Chen, Xin Zhang, Elke A. Rundensteiner
Data integration over multiple heterogeneous data sources has become increasingly important for modern applications. The integrated data is usually stored in materialized views for high availability and better performance. Such views must be maintained after the data sources change. In a loosely-coupled and dynamic environment, such as the Data Grid, the sources may autonomously change not only their data but also their schema, query capabilities or semantics, which may consequently cause the ongoing view maintenance to fail. We analyze the maintenance errors and classify them into different classes of dependencies. We then propose several dependency detection and correction algorithms to handle these new classes of concurrency. Our techniques are tied neither to specific maintenance algorithms nor to a particular data model. To our knowledge, this is the first complete solution to the view maintenance concurrency problems for both data and schema changes. We have implemented the proposed solutions and experimentally evaluated the impact of anomalies on maintenance performance and trade-offs between different dependency detection algorithms.
Citations: 14
Approximate aggregation techniques for sensor databases
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320018
Jeffrey Considine, Feifei Li, G. Kollios, J. Byers
In the emerging area of sensor-based systems, a significant challenge is to develop scalable, fault-tolerant methods to extract useful information from the data the sensors collect. An approach to this data management problem is the use of sensor database systems, exemplified by TinyDB and Cougar, which allow users to perform aggregation queries such as MIN, COUNT and AVG on a sensor network. Due to power and range constraints, centralized approaches are generally impractical, so most systems use in-network aggregation to reduce network traffic. However, these aggregation strategies become bandwidth-intensive when combined with the fault-tolerant, multipath routing methods often used in these environments. For example, duplicate-sensitive aggregates such as SUM cannot be computed exactly using substantially less bandwidth than explicit enumeration. To avoid this expense, we investigate the use of approximate in-network aggregation using small sketches. Our contributions are as follows: 1) we generalize well-known duplicate-insensitive sketches for approximating COUNT to handle SUM, 2) we present and analyze methods for using sketches to produce accurate results with low communication and computation overhead, and 3) we present an extensive experimental validation of our methods.
Citations: 628
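The abstract above does not name the duplicate-insensitive sketch it builds on. A well-known example of the kind is the Flajolet-Martin bitmap, sketched below under that assumption: each key sets one bit chosen by a hash, sketches are merged with bitwise OR, and so re-delivering the same reading over several routing paths cannot change the result. The `sum_keys` helper shows only a naive way to reduce SUM to COUNT (a reading of `v` becomes `v` distinct keys); it is an illustrative baseline and not necessarily the generalization the paper proposes. All function names are hypothetical.

```python
import hashlib

NUM_BITS = 32  # width of the bitmap and of the hash prefix used


def _rho(key: str) -> int:
    """Index of the least-significant 1 bit in a 32-bit hash of `key`."""
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:4], "big")
    if h == 0:
        return NUM_BITS - 1
    return (h & -h).bit_length() - 1


def sketch(keys) -> int:
    """Build a bitmap sketch; inserting a key twice sets the same bit twice."""
    bitmap = 0
    for k in keys:
        bitmap |= 1 << _rho(k)
    return bitmap


def merge(a: int, b: int) -> int:
    """Combine two sketches; OR is idempotent, commutative, and associative."""
    return a | b


def estimate(bitmap: int) -> float:
    """Estimate the distinct-key count from the lowest unset bit position."""
    r = 0
    while bitmap & (1 << r):
        r += 1
    return (2 ** r) / 0.77351  # Flajolet-Martin correction constant


def sum_keys(sensor_id: str, value: int):
    """Naive SUM-to-COUNT reduction: one distinct key per unit of `value`."""
    return [f"{sensor_id}:{i}" for i in range(value)]
```

A single bitmap gives a coarse estimate with high variance; practical uses average several independent bitmaps. The property that matters for multipath routing is visible in `merge`: combining overlapping partial sketches yields exactly the sketch of the union.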
Peering and querying e-catalog communities
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320076
B. Benatallah, Mohand-Said Hacid, Hye-young Paik, Christophe Rey, F. Toumani
More and more suppliers are offering access to their product or information portals (also called e-catalogs) via the Web. The key issue is how to efficiently integrate and query large, intricate, heterogeneous information sources such as e-catalogs. The traditional data integration approach, where the development of an integrated schema requires the understanding of both structure and semantics of all schemas of sources to be integrated, is hardly applicable because of the dynamic nature and size of the Web. We present WS-CatalogNet: a Web services based data sharing middleware infrastructure whose aim is to enhance the potential of e-catalogs by focusing on the scalability and flexibility of their sharing and access.
Citations: 8
Applications for expression data in relational database systems
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320031
D. Gawlick, Dmitry Lenkov, Aravind Yalamanchi, L. Chernobrod
The support for the expression data type in a relational database system allows storing of conditional expressions as data in database tables and evaluating them using SQL queries. In the context of this new capability, expressions can be interpreted as descriptions, queries, and filters, and this significantly broadens the use of a relational database system to support new types of applications. The paper presents an overview of the expression data type, relates expressions to descriptions, queries, and filters, considers applications pertaining to information distribution, demand analysis, and task assignment, and shows how these applications can be easily supported with improved functionality.
Citations: 11
A flexible infrastructure for gathering XML statistics and estimating query cardinality
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320085
J. Freire, Maya Ramanath, Lingzhi Zhang
A key component of XML data management systems is the result size estimator, which estimates the cardinalities of user queries. Estimated cardinalities are needed in a variety of tasks, including query optimization and cost-based storage design; and they can also be used to give users early feedback about the expected outcome of their queries. In contrast to previously proposed result estimators, which use specialized data structures and estimation algorithms, StatiX uses histograms to uniformly capture both the structural and value skew present in documents. The original version of StatiX was built as a proof of concept. With the goal of making the system publicly available, we have built StatiX++, a new and improved version of StatiX, which extends the original system in significant ways. In this demonstration, we show the key features of StatiX++.
Citations: 1
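The abstract above says StatiX relies on histograms but gives no detail on how they are built or combined. The sketch below shows only the generic idea of histogram-based cardinality estimation for a value range, assuming equi-width buckets and a uniform spread of values within each bucket. The function names are hypothetical, and nothing here reflects StatiX's treatment of structural skew.

```python
def build_histogram(values, lo, hi, num_buckets):
    """Count values into `num_buckets` equal-width buckets over [lo, hi]."""
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # Clamp so that v == hi falls into the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return counts, lo, width


def estimate_range(hist, q_lo, q_hi):
    """Estimate how many values lie in [q_lo, q_hi).

    Each bucket contributes its count scaled by the fraction of the
    bucket that the query range overlaps (uniformity assumption).
    """
    counts, lo, width = hist
    total = 0.0
    for i, c in enumerate(counts):
        b_lo = lo + i * width
        b_hi = b_lo + width
        overlap = max(0.0, min(q_hi, b_hi) - max(q_lo, b_lo))
        total += c * overlap / width
    return total
```

For example, with values `[1, 2, 3, 11, 12, 25]` in three buckets over `[0, 30]`, the range `[5, 15)` is estimated at 2.5 rows although the true count is 2; the gap comes from the uniformity assumption, which is the kind of error skew-aware histograms try to reduce.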
A type-safe object-oriented solution for the dynamic construction of queries
Pub Date : 2004-03-30 DOI: 10.1109/ICDE.2004.1320073
Peter Rosenthal
Many object-oriented applications use large numbers of structurally different database queries. With current technology, writing applications that generate queries at runtime is difficult and error-prone. FROQUE, a framework for object-oriented queries, provides a secure and purely object-oriented solution to access relational databases. As such, it is easy to use for object-oriented programmers, and with the help of object-oriented compilers it guarantees that queries formulated in the object-oriented world result in correct SQL queries at execution time. Thus, FROQUE is an improvement over existing database frameworks such as Apache OJB, the object relational bridge, which are not strongly typed and can lead to runtime errors.
Citations: 0
On the integration of structure indexes and inverted lists
Pub Date : 2004-03-30 DOI: 10.1145/1007568.1007656
R. Kaushik, R. Krishnamurthy, J. Naughton, R. Ramakrishnan
Recently, there has been a great deal of interest in the development of techniques to evaluate path expressions over collections of XML documents. In general, these path expressions contain both structural and keyword components. Several methods have been proposed for processing path expressions over graph/tree-structured XML data. These methods can be classified into two broad classes. The first involves graph traversal where the input query is evaluated by traversing the data graph or some compressed representation. The other class involves information-retrieval style processing using inverted lists. In this framework, structure indexes have been proposed to be used as a substitute for graph traversal. Here, we focus on a subclass of CAS queries consisting of simple path expressions. We study algorithmic issues in integrating structure indexes with inverted lists for the evaluation of these queries, where we rank all documents that match the query and return the top k documents in order of relevance.
Citations: 179
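One way to picture the integration described in the abstract above: the structure index maps a simple path to a set of index-node ids, each posting in a keyword's inverted list carries the index-node id of the element it occurs in, and a path-plus-keyword query filters postings by those ids instead of traversing the data graph. The sketch below follows that reading. The data layout, the function name `top_k`, and the use of raw match counts as the relevance score are all assumptions for illustration; the abstract does not specify the paper's index format or ranking function.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

# Assumed layout: path -> ids of structure-index nodes matching that path.
StructureIndex = Dict[str, Set[int]]
# Assumed layout: keyword -> postings of (document id, structure-index node id).
InvertedLists = Dict[str, List[Tuple[str, int]]]


def top_k(
    path: str,
    keyword: str,
    structure_index: StructureIndex,
    inverted_lists: InvertedLists,
    k: int,
) -> List[str]:
    """Return up to k documents containing `keyword` under `path`.

    Documents are ranked by number of matching occurrences (a stand-in
    relevance score), with ties broken by document id.
    """
    allowed = structure_index.get(path, set())
    hits = Counter(
        doc
        for doc, node_id in inverted_lists.get(keyword, [])
        if node_id in allowed  # structural check via the index, no traversal
    )
    ranked = sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc for doc, _ in ranked[:k]]
```

The structural condition is checked with a set lookup per posting, which is the sense in which a structure index can substitute for graph traversal during inverted-list processing.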