22nd International Conference on Data Engineering (ICDE'06): Latest Papers

Distributed Evaluation of Continuous Equi-join Queries over Large Structured Overlay Networks

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.50

Stratos Idreos, Christos Tryfonopoulos, Manolis Koubarakis

We study the problem of continuous relational query processing in Internet-scale overlay networks realized by distributed hash tables. We concentrate on the case of continuous two-way equi-join queries. Joins are hard to evaluate in a distributed continuous query environment because data from more than one relation is needed, and this data is inserted in the network asynchronously. Each time a new tuple is inserted, the network nodes have to cooperate to check whether this tuple can contribute to the satisfaction of a query when combined with previously inserted tuples. We propose a series of algorithms that initially index queries at network nodes using hashing. Then, they exploit the values of join attributes in incoming tuples to rewrite the given queries into simpler ones, and reindex them in the network where they might be satisfied by existing or future tuples. We present a detailed experimental evaluation in a simulated environment and show that our algorithms are scalable, balance the storage and query processing load, and keep the network traffic low.
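The index-then-rewrite idea in this abstract can be illustrated with a toy, single-process sketch. This is not the authors' algorithm: the overlay is simulated with in-memory lists, and `NUM_NODES`, `node_for`, and `ToyOverlay` are hypothetical names introduced only for illustration. A query is first hashed by relation and attribute; each incoming tuple then turns it into a simpler selection that is hashed by relation, attribute, and value.

```python
import hashlib
from collections import defaultdict

NUM_NODES = 8  # hypothetical overlay size


def node_for(*key):
    """Stand-in for a DHT lookup: hash a key to a node id."""
    digest = hashlib.sha1("|".join(map(str, key)).encode()).hexdigest()
    return int(digest, 16) % NUM_NODES


class ToyOverlay:
    def __init__(self):
        # Per-node storage: original join queries and rewritten selections.
        self.queries = [defaultdict(list) for _ in range(NUM_NODES)]
        self.rewritten = [defaultdict(list) for _ in range(NUM_NODES)]

    def index_query(self, left, right):
        """Index the continuous query left = right, e.g. ("R", "a") = ("S", "b")."""
        for side, other in ((left, right), (right, left)):
            self.queries[node_for(*side)][side].append(other)

    def insert_tuple(self, rel, tup):
        """Insert a tuple (a dict) and return the join results it completes."""
        results = []
        for attr, value in tup.items():
            # Rewritten queries that were waiting for exactly this value.
            key = (rel, attr, value)
            for partner in self.rewritten[node_for(*key)][key]:
                results.append((partner, tup))
            # Rewrite each join on this attribute into a selection on the
            # other relation and reindex it under (relation, attribute, value).
            for other_rel, other_attr in self.queries[node_for(rel, attr)][(rel, attr)]:
                new_key = (other_rel, other_attr, value)
                self.rewritten[node_for(*new_key)][new_key].append(tup)
        return results
```

In this sketch each matching pair is reported once, when the later of the two tuples arrives, regardless of which relation it belongs to.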

Citations: 21

A Complete and Efficient Algebraic Compiler for XQuery

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.6

C. Ré, Jérôme Siméon, M. Fernández

As XQuery nears standardization, more sophisticated XQuery applications are emerging, which often exploit the entire language and are applied to non-trivial XML sources. We propose an algebra and optimization techniques that are suitable for building an XQuery compiler that is complete, correct, and efficient. We describe the compilation rules for the complete language into that algebra and present novel optimization techniques that address the needs of complex queries. These techniques include new query unnesting rewritings and specialized join algorithms that account for XQuery’s complex predicate semantics. The algebra and optimizations are implemented in the Galax XQuery engine, and yield execution plans that are up to three orders of magnitude faster than earlier versions of Galax.

Citations: 84

Efficient Aggregation of Ranked Inputs

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.54

N. Mamoulis, K. Cheng, Man Lung Yiu, D. Cheung

A top-k query combines different rankings of the same set of objects and returns the k objects with the highest combined score according to an aggregate function. We bring to light some key observations, which impose two phases that any top-k algorithm, based on sorted accesses, should go through. Based on them, we propose a new algorithm, which is designed to minimize the number of object accesses, the computational cost, and the memory requirements of top-k search. Adaptations of our algorithm for search variants (exact scores, on-line and incremental search, top-k joins, other aggregate functions, etc.) are also provided. Extensive experiments with synthetic and real data show that, compared to previous techniques, our method accesses fewer objects, while being orders of magnitude faster.
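To make the setting concrete, here is a minimal sketch of top-k search using sorted accesses only, in the style of the classic no-random-access (NRA) scheme rather than the algorithm proposed in the paper. It assumes a sum aggregate, non-negative scores, and that every object appears in every ranking; the function name `top_k_sorted_access` is hypothetical.

```python
def top_k_sorted_access(rankings, k):
    """Return (top-k objects, number of rounds of sorted access used).

    rankings: equal-length lists of (object, score), each sorted by
    descending score, with every object present in every list.
    """
    m = len(rankings)
    max_depth = len(rankings[0])
    seen = {}  # object -> {ranking index: score seen so far}

    def lower_bounds():
        return {o: sum(s.values()) for o, s in seen.items()}

    for depth in range(max_depth):
        last = []  # score at the current depth of each ranking
        for i, ranking in enumerate(rankings):
            obj, score = ranking[depth]
            seen.setdefault(obj, {})[i] = score
            last.append(score)

        lower = lower_bounds()
        # Upper bound: unseen scores can be at most the last score read.
        upper = {o: lower[o] + sum(last[i] for i in range(m) if i not in s)
                 for o, s in seen.items()}
        top = sorted(lower, key=lower.get, reverse=True)[:k]
        if len(top) == k:
            kth = lower[top[-1]]
            rivals = [upper[o] for o in seen if o not in top]
            rivals.append(sum(last))  # bound for objects not seen at all
            if kth >= max(rivals):
                return top, depth + 1

    lower = lower_bounds()
    return sorted(lower, key=lower.get, reverse=True)[:k], max_depth
```

The early return fires once no object outside the current top k can still overtake the k-th candidate, which is why the search may stop before reading every list to the end.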

Citations: 42

Transaction Time Support Inside a Database Engine

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.162

D. Lomet, R. Barga, M. Mokbel, German Shegalov, Rui Wang, Yunyue Zhu

Transaction time databases retain and provide access to prior states of a database. An update "inserts" a new record while preserving the old version. Immortal DB builds transaction time database support into a database engine, not in middleware. It supports "as of" queries, which return the records current at a specified time. It also supports snapshot isolation concurrency control. Versions are stamped with the "clock times" of their updating transactions. The timestamp order agrees with transaction serialization order. Lazy timestamping propagates timestamps to transaction updates after commit. Versions are kept in an integrated storage structure, with historical versions initially stored with current data. Time-splits of pages permit large histories to be maintained, and enable time-based indexing, which is essential for high-performance historical queries. Experiments show that Immortal DB introduces little overhead for accessing recent database states while providing access to past states.
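The semantics of an "as of" query can be sketched in a few lines. This is only an in-memory illustration of the idea that updates append versions and reads pick the latest version at or before a given time; it says nothing about how Immortal DB stores or indexes versions, and `VersionedTable` is a hypothetical name.

```python
import bisect


class VersionedTable:
    """Keeps every version of each record, stamped with its commit time."""

    def __init__(self):
        self._times = {}   # key -> ascending list of commit timestamps
        self._values = {}  # key -> values aligned with _times

    def update(self, key, value, commit_time):
        """An update adds a new version and preserves the old ones."""
        times = self._times.setdefault(key, [])
        values = self._values.setdefault(key, [])
        if times and commit_time <= times[-1]:
            raise ValueError("commit times must increase per key")
        times.append(commit_time)
        values.append(value)

    def as_of(self, key, time):
        """Return the version current at `time`, or None if none existed yet."""
        times = self._times.get(key, [])
        i = bisect.bisect_right(times, time)
        return self._values[key][i - 1] if i else None
```

For example, after `update("acct", 100, 10)` and `update("acct", 150, 20)`, the call `as_of("acct", 15)` sees the version committed at time 10.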

Citations: 83

HiWaRPP - Hierarchical Wavelet-based Retrieval on Peer-to-Peer Network

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.76

M. Lupu, Bei Yu

This paper introduces the use of wavelets for information retrieval in a peer-to-peer environment. To this end, we use a new combination of broadcasting and a hierarchical overlay. Compared to previous approaches, we do not store complete information about the children of a super-peer, nor do we broadcast the queries blindly. We approximate the feature vectors using multiresolution analysis and the discrete wavelet transform. Each peer is represented by a high-dimensional feature vector, and the height of the hierarchy is logarithmic in the dimensionality of this feature vector. Leaf nodes represent real peers, while internal nodes are virtual peers used for routing. Our retrieval method has been tested with both real and synthetic data and shown to be efficient in retrieving relevant information, resulting in good precision and recall on four standard test collections.
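Why the hierarchy height is logarithmic in the dimensionality can be seen from a small sketch of multiresolution approximation. It computes only the averaging half of an unnormalized Haar transform (detail coefficients are omitted), and assumes a power-of-two vector length; how the paper assigns these levels to super-peers is not shown here, and `haar_approximations` is a hypothetical name.

```python
def haar_approximations(vector):
    """Return successively coarser approximations of a feature vector.

    Each level halves the length by averaging adjacent pairs, so a
    vector of length d yields log2(d) coarser levels.
    """
    if not vector or len(vector) & (len(vector) - 1):
        raise ValueError("length must be a positive power of two")
    levels = [list(vector)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([(prev[i] + prev[i + 1]) / 2
                       for i in range(0, len(prev), 2)])
    return levels
```

A vector of length 8 therefore produces three coarser levels of lengths 4, 2, and 1.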

Citations: 3

CLAN: An Algorithm for Mining Closed Cliques from Large Dense Graph Databases

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.34

Jianyong Wang, Zhiping Zeng, Lizhu Zhou

Most previously proposed frequent graph mining algorithms are intended to find the complete set of all frequent, closed subgraphs. However, in many cases only a subset of the frequent subgraphs with a certain topology is of special interest. Thus, the method of mining the complete set of all frequent subgraphs is not suitable for mining these frequent subgraphs of special interest as it wastes considerable computing power and space on uninteresting subgraphs. In this paper we develop a new algorithm, CLAN, to mine the frequent closed cliques, the most coherent structures in the graph setting. By exploring some properties of the clique pattern, we can simplify the canonical label design and the corresponding clique (or subclique) isomorphism testing. Several effective pruning methods are proposed to prune the search space, while the clique closure checking scheme is used to remove the non-closed clique patterns. Our empirical results show that CLAN is very efficient for large dense graph databases with which the traditional graph mining algorithms fail. The novelty of our method is further demonstrated by the application of CLAN in mining highly correlated stocks from large stock market data.
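The notion of a frequent closed clique can be pinned down with a brute-force sketch. This is not CLAN: it enumerates every label subset and so is exponential, with none of the canonical labeling or pruning the abstract mentions. It assumes each graph is a dict mapping a vertex label to the set of its neighbors' labels, that labels are unique within a graph, and that a clique is closed when no larger frequent clique has the same support. All function names are hypothetical.

```python
from itertools import combinations


def is_clique(graph, labels):
    """True if all labels are vertices of graph and pairwise adjacent."""
    return (all(v in graph for v in labels)
            and all(b in graph[a] for a, b in combinations(labels, 2)))


def support(database, labels):
    """Number of graphs in the database that contain the clique."""
    return sum(1 for graph in database if is_clique(graph, labels))


def closed_frequent_cliques(database, min_sup):
    """Map each closed frequent clique (a frozenset of labels) to its support."""
    vertices = sorted(set().union(*database))
    frequent = {}
    for size in range(1, len(vertices) + 1):
        for labels in combinations(vertices, size):
            count = support(database, labels)
            if count >= min_sup:
                frequent[frozenset(labels)] = count
    # Drop cliques that have a proper superset with the same support.
    return {c: s for c, s in frequent.items()
            if not any(c < d and s == t for d, t in frequent.items())}
```

With two graphs that both contain the edge a-b, the single-vertex cliques {a} and {b} are frequent but not closed, because {a, b} has the same support.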

Citations: 70

SQL to XQuery Translation in the AquaLogic Data Services Platform

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.147

Sunil Jigyasu, Sujeet Banerjee, V. Borkar, M. Carey, Kanad Dixit, Anil Malkani, S. Thatte

SQL has long been the standard language for retrieving and manipulating data in relational database systems. XML has become the standard format for data exchange, and XQuery is on its way to becoming the standard language for querying XML data. The BEA AquaLogic Data Services Platform provides a service-oriented, XML-based view of heterogeneous enterprise data sources and allows this view to be queried using XQuery. AquaLogic DSP includes a JDBC driver that connects the old (SQL) world with the new (XML) world via a SQL-to-XQuery translator. This paper outlines the issues related to creating such a driver and details the approach used to translate SQL queries into XQuery expressions. The paper also touches on performance considerations related to handling XML query results in a context where JDBC result sets are the desired output format.

Citations: 17

Merging Source Query Interfaces on Web Databases

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.91

Eduard Constantin Dragut, Wensheng Wu, A. Sistla, Clement T. Yu, W. Meng

Many e-commerce search engines now return information from Web databases. Unlike text search engines, these e-commerce search engines have more complicated user interfaces. Our aim is to automatically construct a natural query user interface that integrates a set of interfaces over a given domain of interest. For example, each airline company has a query interface for ticket reservation, and our system can construct an integrated interface for all these companies. This will permit users to access information uniformly from multiple sources. Each query interface of an e-commerce search engine is designed to help users provide the necessary information. Specifically, (1) related pieces of information such as first name and last name are grouped together and (2) certain hierarchical relationships are maintained. In this paper, we provide an algorithm to compute an integrated interface from query interfaces of the same domain. The integrated query interface can be proved to preserve the above two types of relationships. Experiments on five domains verify our theoretical study.

Citations: 40

Achieving Class-Based QoS for Transactional Workloads

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.11

Bianca Schroeder, Mor Harchol-Balter, A. Iyengar, E. Nahum

Transaction processing systems lie at the core of modern e-commerce applications such as on-line retail stores, banks and airline reservation systems. The economic success of these applications depends on the ability to achieve high user satisfaction, since a single mouse-click is all that it takes a frustrated user to switch to a competitor. Given that system resources are limited and demands are varying, it is difficult to provide optimal performance to all users at all times. However, often transactions can be divided into different classes based on how important they are to the online retailer. For example, transactions initiated by a "big spending" client are more important than transactions from a client that only browses the site. A natural goal then is to ensure short delays for the class of important transactions, while for the less important transactions longer delays are acceptable.

Citations: 83

Partial Selection Query in Peer-to-Peer Databases

22nd International Conference on Data Engineering (ICDE'06)

Pub Date : 2006-04-03 DOI: 10.1109/ICDE.2006.111

F. Kashani, C. Shahabi

In this paper, we propose DBSampler, a query execution mechanism to answer "partial selection" queries in peer-to-peer databases. A partial selection query is an arbitrary selection query that is satisfied with a fraction of the results; it is a universal operation with applications in database tuning, query optimization, and approximate query processing in peer-to-peer databases. DBSampler is based on an epidemic dissemination algorithm. We model the epidemic dissemination as a percolation problem and, by rigorous percolation analysis, tune DBSampler per-query and on-the-fly to answer partial queries correctly and efficiently. We verify the efficiency of DBSampler in terms of query cost and query time via extensive simulation.

Citations: 4
