
Latest publications from the 21st International Conference on Data Engineering (ICDE'05)

A distributed quadtree index for peer-to-peer settings
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.7
E. Tanin, A. Harwood, H. Samet
We describe a distributed quadtree index that enables more powerful access to complex data over P2P networks. It is based on the Chord method. Methods such as Chord have been gaining usage in P2P settings to facilitate exact-match queries; the Chord method maps both data keys and peer addresses onto a common key space. Our work can be applied to higher dimensions, to data types other than spatial data, and to different types of quadtrees. Finally, we can use key-based methods other than Chord as our base P2P routing protocol, and the index still scales well. The index also benefits from the underlying fault-tolerant, hashing-based methods by achieving a good load distribution among many peers. We can seamlessly execute a single query on multiple branches of the index hosted by a dynamic set of peers.
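The central mechanism, hashing quadtree cells onto the same key space that Chord uses for peer addresses so that each cell is owned by some peer, can be illustrated with a small sketch. This is a minimal illustration under assumed conventions (quadrant-path cell codes, SHA-1 hashing, a static peer list), not the authors' implementation.

```python
import hashlib
from bisect import bisect_left

# A minimal sketch of mapping quadtree cells onto a Chord-style key space.
# Cell naming, hashing choices, and the successor lookup are illustrative
# assumptions, not the paper's actual protocol.

RING_BITS = 16
RING_SIZE = 1 << RING_BITS

def chord_key(name: str) -> int:
    """Hash a string (peer address or quadtree cell code) onto the ring."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

def cell_code(x: float, y: float, depth: int) -> str:
    """Quadrant path of the leaf cell containing (x, y) in the unit square."""
    code, x0, y0, size = "", 0.0, 0.0, 1.0
    for _ in range(depth):
        size /= 2
        qx, qy = x >= x0 + size, y >= y0 + size
        code += str((qy << 1) | qx)          # quadrant digit 0..3
        x0, y0 = x0 + size * qx, y0 + size * qy
    return code

class ChordRing:
    """Peers placed on the ring; each cell is stored at its key's successor."""
    def __init__(self, peer_names):
        self.peers = sorted((chord_key(p), p) for p in peer_names)

    def successor(self, key: int) -> str:
        idx = bisect_left(self.peers, (key, ""))
        return self.peers[idx % len(self.peers)][1]

    def responsible_peer(self, cell: str) -> str:
        return self.successor(chord_key(cell))

ring = ChordRing(["peer-a:4000", "peer-b:4000", "peer-c:4000"])
cell = cell_code(0.62, 0.17, depth=4)        # a 4-digit quadrant path
print(cell, "->", ring.responsible_peer(cell))
```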
Citations: 76
Robust identification of fuzzy duplicates
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.125
S. Chaudhuri, Venkatesh Ganti, R. Motwani
Detecting and eliminating fuzzy duplicates is a critical data cleaning task that is required by many applications. Fuzzy duplicates are multiple seemingly distinct tuples that represent the same real-world entity. We propose two novel criteria that enable characterization of fuzzy duplicates more accurately than is possible with existing techniques. Using these criteria, we propose a novel framework for the fuzzy duplicate elimination problem. We show that solutions within the new framework result in better accuracy than earlier approaches. We present an efficient algorithm for solving instantiations within the framework. We evaluate it on real datasets to demonstrate the accuracy and scalability of our algorithm.
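As a point of reference for the problem setting, the sketch below groups records whose string similarity exceeds a threshold. It is a generic baseline under assumed choices (difflib similarity, transitive grouping); the paper's two characterization criteria and its elimination framework are not reproduced here.

```python
from difflib import SequenceMatcher

# A generic fuzzy-duplicate grouping sketch: records are grouped whenever
# their string similarity exceeds a fixed threshold. This is not the paper's
# criteria or framework, only an illustration of the task.

def similar(a: str, b: str, threshold: float = 0.85) -> bool:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def group_duplicates(records):
    groups = []
    for r in records:
        for g in groups:
            if any(similar(r, member) for member in g):
                g.append(r)          # joins the first sufficiently similar group
                break
        else:
            groups.append([r])       # no match: start a new group
    return groups

records = ["ACME Corp., New York", "Acme Corporation, New York", "Widget Ltd, London"]
print(group_duplicates(records))
```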
Citations: 227
TRMeister: a DBMS with high-performance full-text search functions
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.148
Tetsuya Ikeda, Hiroko Mano, Hideo Itoh, Hiroshi Takegawa, Takuya Hiraoka, Shiroh Horibe, Yasushi Ogawa
TRMeister is a DBMS with high-performance full-text search functions. With TRMeister, high-speed full-text search is possible, including high-precision ranking search in addition to Boolean search. High-speed insert and delete are also supported, so full-text search can be used in the same way as other types of database search, where data can be queried immediately after it is inserted. This makes it easy to combine normal attribute search with full-text search and thus to build text search applications easily.
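The convenience described here, full-text search that sees data immediately after insert and composes with ordinary attribute conditions, can be illustrated with a toy sketch. The class and method names below are hypothetical; this is not TRMeister's API.

```python
from collections import defaultdict

# A toy illustration (not TRMeister's API) of combining an attribute filter
# with a full-text index that is searchable immediately after insert.

class TinyTextTable:
    def __init__(self):
        self.rows = {}                      # id -> {"category": ..., "body": ...}
        self.index = defaultdict(set)       # term -> set of row ids

    def insert(self, row_id, category, body):
        self.rows[row_id] = {"category": category, "body": body}
        for term in body.lower().split():
            self.index[term].add(row_id)    # visible to search right away

    def delete(self, row_id):
        row = self.rows.pop(row_id)
        for term in row["body"].lower().split():
            self.index[term].discard(row_id)

    def search(self, term, category=None):
        hits = self.index.get(term.lower(), set())
        if category is not None:            # normal attribute condition
            hits = {i for i in hits if self.rows[i]["category"] == category}
        return sorted(hits)

t = TinyTextTable()
t.insert(1, "news", "full text search inside a DBMS")
t.insert(2, "blog", "ranking search with boolean operators")
print(t.search("search", category="news"))   # -> [1]
```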
Citations: 4
Acceleration technique of snake-shaped regions retrieval method for telematics navigation service system
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.14
M. Tanizaki, K. Maruyama, S. Shimada
Telematics services, which provide traffic information such as route guidance and congestion warnings via a wireless communication network, have recently become widespread. Demand is growing for graphical guidance information in addition to the conventional service that provides text-only guidance. To improve graphical service, we propose a new retrieval method. This method enables fast extraction of map objects within a snake-shaped region (SSR) along a driving route from a geo-spatial database that stores map data without rectangular mesh boundaries. For this retrieval method, we have considered three techniques. The first is based on simplification of the snake-shaped route region through point elimination, and the second is based on reduction of the processing load of the geometrical intersection detection processes; this second technique is accomplished by dividing the snake-shaped region into multiple cells. The third distributes the SSR retrieval result to terminals in multiple deliveries so that navigation processing can start quickly. We have developed a prototype to evaluate the performance of the proposed methods. The prototype provides route guidance information for an actual terminal and uses information taken from United States road maps. Even in an urban area, we managed to provide guidance information for an approximately 200-mile route within 10 seconds. We are convinced that the proposed method can be applied to actual telematics services.
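Two of the ideas in the abstract, thinning the route points before building the snake-shaped region and bucketing map objects into grid cells so that only nearby objects are intersection-tested, can be sketched as follows. Parameter choices and names are illustrative assumptions, not the paper's algorithms.

```python
import math

# Illustrative sketch: thin the route points, then use a coarse grid to limit
# which map objects are distance-tested against the route. Parameters and
# names are assumptions, not the paper's SSR method.

def thin_route(points, min_gap):
    """Drop route points closer than min_gap to the last kept point."""
    kept = [points[0]]
    for p in points[1:]:
        if math.dist(p, kept[-1]) >= min_gap:
            kept.append(p)
    return kept

def grid_cell(p, cell_size):
    return (int(p[0] // cell_size), int(p[1] // cell_size))

def objects_along_route(route, objects, width, cell_size):
    """Return objects within `width` of the (thinned) route, via a grid filter."""
    buckets = {}
    for obj in objects:                       # obj = (name, (x, y))
        buckets.setdefault(grid_cell(obj[1], cell_size), []).append(obj)
    hits = set()
    for rp in thin_route(route, min_gap=width / 2):
        cx, cy = grid_cell(rp, cell_size)
        for dx in (-1, 0, 1):                 # only neighbouring cells are tested
            for dy in (-1, 0, 1):
                for name, pos in buckets.get((cx + dx, cy + dy), []):
                    if math.dist(rp, pos) <= width:
                        hits.add(name)
    return hits

route = [(0, 0), (5, 1), (10, 2), (15, 4)]
pois = [("gas", (6, 2)), ("cafe", (30, 30))]
print(objects_along_route(route, pois, width=3, cell_size=5))   # -> {'gas'}
```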
Citations: 6
Efficient algorithms for pattern matching on directed acyclic graphs
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.56
Li Chen, Amarnath Gupta, M. E. Kurul
Recently graph data models have become increasingly popular in many scientific fields. Efficient query processing over such data is critical. Existing works often rely on index structures that store pre-computed transitive relations to achieve efficient graph matching. In this paper, we present a family of stack-based algorithms to handle path and twig pattern queries for directed acyclic graphs (DAGs) in particular. With the worst-case space cost linearly bounded by the number of edges in the graph, our algorithms achieve a quadratic runtime complexity in the average size of the query variable bindings. This is optimal among the navigation-based graph matching algorithms.
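A minimal sketch of the navigation-based flavor of such matching is shown below: a label-path pattern is matched against a labelled DAG using an explicit stack of partial bindings. The paper's stack-based path and twig algorithms, and the complexity guarantees above, are not reproduced here.

```python
# A minimal sketch of matching a label path (e.g. A/B/C) against a labelled
# DAG with an explicit stack, in the spirit of navigation-based matching.
# The paper's stack-based path/twig algorithms are more involved.

def match_path(dag, labels, pattern):
    """Yield node tuples whose labels follow `pattern` along DAG edges."""
    starts = [n for n in labels if labels[n] == pattern[0]]
    stack = [(n, (n,)) for n in starts]        # (current node, partial binding)
    while stack:
        node, binding = stack.pop()
        if len(binding) == len(pattern):
            yield binding
            continue
        wanted = pattern[len(binding)]
        for child in dag.get(node, []):
            if labels[child] == wanted:
                stack.append((child, binding + (child,)))

dag = {1: [2, 3], 2: [4], 3: [4]}              # node 4 reachable via two parents
labels = {1: "A", 2: "B", 3: "B", 4: "C"}
print(list(match_path(dag, labels, ["A", "B", "C"])))
# two bindings, (1, 2, 4) and (1, 3, 4), in some order
```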
Citations: 4
Improving data accessibility for mobile clients through cooperative hoarding
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.76
K. Y. Lai, Z. Tari, P. Bertók
In this paper, we introduce the concept of cooperative hoarding to reduce the risks of cache misses for mobile clients. Cooperative hoarding takes advantage of group mobility behaviour, combined with peer cooperation in ad-hoc mode, to improve hoard performance. Two cooperative hoarding approaches that take into account clients' access frequencies, connection probabilities and cache size when performing hoarding are proposed. Test results show that the proposed methods significantly improve cache hit ratio and reduce query costs compared to existing approaches.
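A rough sketch of the kind of decision a hoarding scheme makes is given below: candidates are ranked by access frequency weighted by the risk that no connected peer can serve them, under a cache-size budget. The scoring formula is an illustrative assumption, not the paper's cooperative hoarding approach.

```python
# A sketch of choosing what to hoard under a cache-size budget. Each candidate
# is weighted by its access frequency times the chance that no connected peer
# can serve it; the exact scoring is an illustrative assumption, not the
# paper's formulation.

def plan_hoard(candidates, cache_size):
    """candidates: list of (item, size, access_freq, peer_availability)."""
    def benefit(c):
        item, size, freq, peer_avail = c
        miss_risk = 1.0 - peer_avail          # likelihood peers can't serve it
        return freq * miss_risk / size        # benefit per unit of cache
    chosen, used = [], 0
    for item, size, freq, peer_avail in sorted(candidates, key=benefit, reverse=True):
        if used + size <= cache_size:
            chosen.append(item)
            used += size
    return chosen

candidates = [
    ("map_tile_a", 4, 0.9, 0.2),   # hot item, peers rarely reachable
    ("map_tile_b", 4, 0.9, 0.9),   # hot, but peers will likely have it
    ("video_clip", 8, 0.3, 0.1),
]
print(plan_hoard(candidates, cache_size=8))   # -> ['map_tile_a', 'map_tile_b']
```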
Citations: 4
Adaptive processing of top-k queries in XML
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.18
A. Marian, S. Amer-Yahia, Nick Koudas, D. Srivastava
The ability to compute top-k matches to XML queries is gaining importance due to the increasing number of large XML repositories. The efficiency of top-k query evaluation relies on using scores to prune irrelevant answers as early as possible in the evaluation process. In this context, evaluating the same query plan for all answers might be too rigid because, at any time in the evaluation, answers have gone through the same number and sequence of operations, which limits the speed at which scores grow. Therefore, adaptive query processing that permits different plans for different partial matches and maximizes the best scores is more appropriate. In this paper, we propose an architecture and adaptive algorithms for efficiently computing top-k matches to XML queries. Our techniques can be used to evaluate both exact and approximate matches where approximation is defined by relaxing XPath axes. In order to compute the scores of query answers, we extend the traditional tf*idf measure to account for document structure. We conduct extensive experiments on a variety of benchmark data and queries, and demonstrate the usefulness of the adaptive approach for computing top-k queries in XML.
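The generic pruning idea, dropping a partial match as soon as its best achievable score can no longer reach the current top-k, can be sketched as follows. The paper's adaptive per-match planning and its structure-aware tf*idf scoring are not reproduced; the per-predicate score bound here is an assumption.

```python
import heapq

# A generic top-k pruning sketch: a partial match is abandoned as soon as its
# best achievable score cannot beat the current k-th best. The paper's
# adaptive planning and structure-aware tf*idf scoring are not reproduced.

def topk(partial_matches, k, score_step, max_remaining):
    """partial_matches: iterable of (match_id, [predicate, ...])."""
    heap = []                                     # min-heap of (score, match_id)
    for match_id, predicates in partial_matches:
        score, left = 0.0, len(predicates)
        for pred in predicates:
            left -= 1
            score += score_step(pred)
            kth = heap[0][0] if len(heap) == k else 0.0
            if score + left * max_remaining < kth:
                break                             # prune: can no longer reach top-k
        else:
            if len(heap) < k:
                heapq.heappush(heap, (score, match_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, match_id))
    return sorted(heap, reverse=True)

matches = [("m1", [0.9, 0.8]), ("m2", [0.1, 0.2]), ("m3", [0.7, 0.9])]
print(topk(matches, k=2, score_step=lambda p: p, max_remaining=1.0))
```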
Citations: 102
Data stream query processing
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.43
Nick Koudas, D. Srivastava
This tutorial provides a comprehensive and cohesive overview of the key research results in the area of data stream query processing, both for SQL-like and XML query languages.
Citations: 1
Privacy-preserving top-k queries
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.112
Jaideep Vaidya, Chris Clifton
The primary contribution of this paper is a secure method for doing top-k selection from vertically partitioned data. This has particular relevance to privacy-sensitive searches, and meshes well with privacy policies such as k-anonymity. We have demonstrated how secure primitives from the literature can be composed with efficient query processing algorithms, with the result having provable security properties. The paper also shows a trade-off between efficiency and disclosure. It is worth exploring whether one could have a suite of algorithms to optimize these tradeoffs, e.g., algorithms that guarantee k-anonymity with efficiency based on the choice of k rather than the guarantees of secure multiparty computation.
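To make the setting concrete, the sketch below computes top-k over vertically partitioned scores in the plain, non-private way: each party's partial scores are simply summed per record. The paper's contribution is performing this computation securely, without revealing each party's inputs; that protocol is not reproduced here.

```python
from collections import defaultdict

# A plain (non-private) baseline for top-k over vertically partitioned data:
# each party holds a different attribute's score per record id, and the
# combined score is their sum. The paper's secure protocol replaces this
# direct exchange with cryptographic primitives; it is not reproduced here.

def insecure_topk(party_scores, k):
    """party_scores: list of dicts, each mapping record_id -> partial score."""
    totals = defaultdict(float)
    for scores in party_scores:
        for record_id, s in scores.items():
            totals[record_id] += s
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:k]

party_a = {"r1": 0.9, "r2": 0.4, "r3": 0.7}    # scores on party a's attribute
party_b = {"r1": 0.2, "r2": 0.8, "r3": 0.6}    # scores on party b's attribute
print(insecure_topk([party_a, party_b], k=2))   # -> [('r3', ...), ('r2', ...)]
```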
Citations: 77
Mining evolving customer-product relationships in multi-dimensional space
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.88
Xiaolei Li, Jiawei Han, Xiaoxin Yin, Dong Xin
Previous work on mining transactional databases has focused primarily on mining frequent itemsets, association rules, and sequential patterns. However, interesting relationships between customers and items, especially their evolution with time, have not been studied thoroughly. In this paper, we propose a Gaussian transformation-based regression model that captures time-variant relationships between customers and products. Moreover, since it is interesting to discover such relationships in a multi-dimensional space, an efficient method has been developed to compute multi-dimensional aggregates of such curves in a data cube environment. Our experimental results have demonstrated the promise of the approach.
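One way such a time-variant relationship could be modelled is sketched below: a customer's purchase intensity for a product over time is fitted with Gaussian basis functions by least squares. The basis placement and width are illustrative assumptions, not the paper's Gaussian transformation-based model or its cube-level aggregation.

```python
import numpy as np

# A sketch of modelling a time-variant customer-product relationship with
# Gaussian basis functions fitted by least squares. Basis placement and width
# are illustrative assumptions, not the paper's model.

def gaussian_basis(t, centers, width):
    t = np.asarray(t, dtype=float)[:, None]
    return np.exp(-((t - centers[None, :]) ** 2) / (2 * width ** 2))

def fit_intensity(times, purchases, n_basis=4, width=2.0):
    centers = np.linspace(min(times), max(times), n_basis)
    X = gaussian_basis(times, centers, width)
    coeffs, *_ = np.linalg.lstsq(X, np.asarray(purchases, dtype=float), rcond=None)
    return lambda t: gaussian_basis(np.atleast_1d(t), centers, width) @ coeffs

# Monthly purchase counts of one customer for one product over a year.
months = list(range(12))
counts = [0, 0, 1, 3, 5, 6, 5, 3, 1, 0, 0, 0]   # interest peaks mid-year
curve = fit_intensity(months, counts)
print(curve(5.5)[0])                             # predicted intensity mid-year
```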
Citations: 1