
21st International Conference on Data Engineering (ICDE'05): Latest Publications

Odysseus: a high-performance ORDBMS tightly-coupled with IR features
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.95
K. Whang, Min-Jae Lee, Jae-Gil Lee, Min-Soo Kim, Wook-Shin Han
We propose the notion of tight-coupling [K. Whang et al., (1999)] to add new data types into the DBMS engine. In this paper, we introduce the Odysseus ORDBMS and present its tightly-coupled IR features (US patented). We demonstrate a Web search engine capable of managing 20 million Web pages in a non-parallel configuration using Odysseus.
Citations: 30
Reverse nearest neighbors in large graphs
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.124
Man Lung Yiu, D. Papadias, N. Mamoulis, Yufei Tao
A reverse nearest neighbor query returns the data objects that have a query point as their nearest neighbor. Although such queries have been studied quite extensively in Euclidean spaces, there is no previous work in the context of large graphs. In this paper, we propose algorithms and optimization techniques for RNN queries by utilizing some characteristics of networks.
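To make the definition concrete, here is a brute-force sketch in Python: each data object's graph distance to the other objects and to the query point is computed with Dijkstra's algorithm, and an object is returned when the query point is at least as close as any other object. This is not the authors' algorithm, which is designed precisely to avoid this exhaustive work; the adjacency-dict graph and all names are illustrative assumptions.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from `source` in an adjacency-dict graph."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def rnn_query(graph, objects, q):
    """Return the data objects whose nearest neighbor (among objects and q) is q."""
    result = []
    for o in objects:
        dist = dijkstra(graph, o)                      # distances from object o
        d_q = dist.get(q, float("inf"))                # distance from o to the query point
        d_other = min((dist.get(p, float("inf")) for p in objects if p != o),
                      default=float("inf"))            # distance to the closest other object
        if d_q <= d_other:                             # q is o's nearest neighbor
            result.append(o)
    return result

# Tiny undirected example: q lies between data objects a and b.
g = {
    "a": {"q": 1.0, "b": 5.0},
    "b": {"q": 2.0, "a": 5.0},
    "q": {"a": 1.0, "b": 2.0},
}
print(rnn_query(g, ["a", "b"], "q"))  # ['a', 'b']: q is the nearest neighbor of both
```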
Citations: 43
Data privacy through optimal k-anonymization
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.42
R. Bayardo, R. Agrawal
Data de-identification reconciles the demand for release of data for research purposes and the demand for privacy from individuals. This paper proposes and evaluates an optimization algorithm for the powerful de-identification procedure known as k-anonymization. A k-anonymized dataset has the property that each record is indistinguishable from at least k - 1 others. Even simple restrictions of optimized k-anonymity are NP-hard, leading to significant computational challenges. We present a new approach to exploring the space of possible anonymizations that tames the combinatorics of the problem, and develop data-management strategies to reduce reliance on expensive operations such as sorting. Through experiments on real census data, we show the resulting algorithm can find optimal k-anonymizations under two representative cost measures and a wide range of k. We also show that the algorithm can produce good anonymizations in circumstances where the input data or input parameters preclude finding an optimal solution in reasonable time. Finally, we use the algorithm to explore the effects of different coding approaches and problem variations on anonymization quality and performance. To our knowledge, this is the first result demonstrating optimal k-anonymization of a non-trivial dataset under a general model of the problem.
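As a concrete illustration of the property stated above (not of the paper's optimization algorithm), the following Python sketch checks whether a table is k-anonymous over a chosen set of quasi-identifier columns; the column names and generalized sample rows are invented for the example.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[a] for a in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Generalized sample rows: age and zip have already been coarsened.
rows = [
    {"age": "3*", "zip": "021**", "disease": "flu"},
    {"age": "3*", "zip": "021**", "disease": "cold"},
    {"age": "4*", "zip": "021**", "disease": "flu"},
    {"age": "4*", "zip": "021**", "disease": "asthma"},
]
print(is_k_anonymous(rows, ["age", "zip"], k=2))  # True: each record shares its QI values with one other
```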
Citations: 1327
Filter based directory replication and caching
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.67
Apurva Kumar
This paper describes a novel filter-based replication model for lightweight directory access protocol (LDAP) directories. Instead of replicating entire subtrees from the directory information tree (DIT), only entries matching a filter specification are replicated. Advantages of the filter-based replication framework over existing subtree-based mechanisms are demonstrated for a real enterprise directory using real workloads.
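A toy Python sketch of the idea follows: the replica holds only the entries that satisfy a filter rather than an entire subtree. The dict-shaped entries, the conjunctive equality filter, and the helper names are illustrative assumptions; real LDAP filters (and the paper's replication protocol) are considerably richer.

```python
def matches(entry, flt):
    """Evaluate a conjunctive equality filter such as {'ou': 'sales'}."""
    return all(entry.get(attr) == val for attr, val in flt.items())

def filter_based_replica(master_entries, flt):
    """Return the subset of the master's entries that the replica should hold."""
    return [e for e in master_entries if matches(e, flt)]

master = [
    {"dn": "uid=ann,ou=sales,o=acme", "ou": "sales", "l": "NYC"},
    {"dn": "uid=bob,ou=eng,o=acme",   "ou": "eng",   "l": "SFO"},
    {"dn": "uid=cho,ou=sales,o=acme", "ou": "sales", "l": "SFO"},
]
# Only the two sales entries are replicated, not the whole o=acme subtree.
print(filter_based_replica(master, {"ou": "sales"}))
```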
Citations: 3
On the optimal ordering of maps and selections under factorization
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.97
Thomas Neumann, S. Helmer, G. Moerkotte
The query optimizer of a database system is confronted with two aspects when handling user-defined functions (UDFs) in query predicates: the vast differences in evaluation costs between UDFs (and other functions), and multiple calls of the same (expensive) UDF. The former is dealt with by ordering the evaluation of the predicates optimally, the latter by identifying common subexpressions and thereby avoiding costly recomputation. Current approaches order n predicates optimally (neglecting factorization) in O(n log n). Their result may deviate significantly from the optimal solution under factorization. We formalize the problem of finding optimal orderings under factorization and prove that it is NP-hard. Furthermore, we show how to improve on the run time of the brute-force algorithm (which computes all possible orderings) by presenting different enhanced algorithms. Although in the worst case these algorithms obviously still behave exponentially, our experiments demonstrate that for real-life examples their performance is much better.
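The following Python sketch illustrates the search space: it enumerates every ordering of a few expensive predicates and scores each with a simple expected per-tuple cost model in which shared subexpressions are charged only once (factorization). The costs, selectivities, and sharing map are invented for illustration, and the exhaustive enumeration is exactly what the paper's enhanced algorithms try to avoid.

```python
from itertools import permutations

# predicate -> (cost per call, selectivity, common subexpressions it uses)
preds = {
    "p1": (10.0, 0.5, {"udf_A"}),
    "p2": (40.0, 0.1, {"udf_A", "udf_B"}),
    "p3": ( 5.0, 0.8, {"udf_B"}),
}
subexpr_cost = {"udf_A": 30.0, "udf_B": 20.0}

def expected_cost(order):
    """Expected per-tuple cost: later predicates see only surviving tuples."""
    prob, total, cached = 1.0, 0.0, set()
    for p in order:
        cost, sel, uses = preds[p]
        new = uses - cached                                   # subexpressions not yet computed
        total += prob * (cost + sum(subexpr_cost[s] for s in new))
        cached |= new                                         # factorization: compute once
        prob *= sel                                           # fraction of tuples surviving
    return total

best = min(permutations(preds), key=expected_cost)            # brute force over all orders
print(best, round(expected_cost(best), 2))
```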
Citations: 15
Adaptive overlapped declustering: a highly available data-placement method balancing access load and space utilization
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.16
Akitsugu Watanabe, H. Yokota
This paper proposes a new data-placement method named adaptive overlapped declustering, which can be applied to a parallel storage system that uses a value-range-partitioning-based distributed directory and primary-backup data replication, to improve space utilization by balancing access loads. The proposed method reduces the data skew generated by data migration for access-load balancing. While some data-placement methods capable of balancing access load or reducing data skew have been proposed, none satisfies both requirements simultaneously. The proposed method also improves the reliability and availability of the system because it reduces the recovery time for damaged backups after a disk failure. The method achieves this acceleration by reducing network communication and disk I/O. Mathematical analysis shows the efficiency of space utilization under skewed access workloads. Queuing simulations demonstrate that the proposed method halves backup restoration time compared with the traditional chained declustering method.
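For context, the Python sketch below shows the chained-declustering baseline the abstract compares against: key ranges are assigned to nodes, and each range's backup is placed on the next node in the chain. The adaptive overlapped scheme instead subdivides each backup across several nodes and adapts the subrange boundaries to balance post-failure load; that logic is not reproduced here, and the split keys and node count are assumptions.

```python
from bisect import bisect_right

class ChainedDeclustering:
    """Value-range partitioning with chained (next-node) backup placement."""

    def __init__(self, split_keys, num_nodes):
        self.split_keys = split_keys      # upper bounds separating the key ranges
        self.num_nodes = num_nodes

    def partition_of(self, key):
        return bisect_right(self.split_keys, key) % self.num_nodes

    def primary_node(self, key):
        return self.partition_of(key)

    def backup_node(self, key):
        # Chained declustering: the backup of range i lives on node i + 1.
        return (self.partition_of(key) + 1) % self.num_nodes

placement = ChainedDeclustering(split_keys=[100, 200, 300], num_nodes=4)
for k in (42, 150, 250, 350):
    print(k, "primary:", placement.primary_node(k), "backup:", placement.backup_node(k))
```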
Citations: 18
Adaptive process management with ADEPT2
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.17
M. Reichert, S. Rinderle-Ma, U. Kreher, P. Dadam
In the ADEPT project we have been working on the design and implementation of next generation process management software. Based on a conceptual framework for dynamic process changes, on novel process support functions, and on advanced implementation concepts, the developed system enables the realization of adaptive, process-aware information systems (PAIS). Basically, process changes can take place at the type as well as the instance level: changes of single process instances may have to be carried out in an ad-hoc manner and must not affect system robustness and consistency. Process type changes, in turn, must be quickly accomplished in order to adapt the PAIS to business process changes. ADEPT2 offers powerful concepts for modeling, analyzing, and verifying process schemes. Particularly, it ensures schema correctness, like the absence of deadlock-causing cycles or erroneous data flows. This, in turn, constitutes an important prerequisite for dynamic process changes as well. ADEPT2 supports both ad-hoc changes of single process instances and the propagation of process type changes to running instances.
Citations: 187
Corpus-based schema matching
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.39
J. Madhavan, P. Bernstein, A. Doan, A. Halevy
Schema matching is the problem of identifying corresponding elements in different schemas. Discovering these correspondences or matches is inherently difficult to automate. Past solutions have proposed a principled combination of multiple algorithms. However, these solutions sometimes perform rather poorly due to the lack of sufficient evidence in the schemas being matched. In this paper we show how a corpus of schemas and mappings can be used to augment the evidence about the schemas being matched, so they can be matched better. Such a corpus typically contains multiple schemas that model similar concepts and hence enables us to learn variations in the elements and their properties. We exploit such a corpus in two ways. First, we increase the evidence about each element being matched by including evidence from similar elements in the corpus. Second, we learn statistics about elements and their relationships and use them to infer constraints that we use to prune candidate mappings. We also describe how to use known mappings to learn the importance of domain and generic constraints. We present experimental results that demonstrate corpus-based matching outperforms direct matching (without the benefit of a corpus) in multiple domains.
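A toy Python sketch of the augment-then-match intuition: before two elements are compared, each is enriched with tokens from the corpus concept it resembles most, so elements that share no tokens directly can still be matched through the corpus. The tokenizer, the Jaccard score, and the tiny corpus are illustrative assumptions, not the paper's learned models or constraints.

```python
def tokens(name):
    return set(name.lower().split("_"))

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

# Corpus "concepts": groups of element names known from past schemas and
# mappings to model the same thing (entirely made-up sample data).
corpus = {
    "phone":   ["phone", "telephone_no", "contact_phone_number"],
    "address": ["addr", "street_address", "mailing_address"],
}

def augment(name):
    """Pool tokens from the corpus concept that the element matches best."""
    evidence = set(tokens(name))
    best = max(corpus.values(),
               key=lambda names: max(jaccard(tokens(name), tokens(n)) for n in names))
    if max(jaccard(tokens(name), tokens(n)) for n in best) > 0:
        for n in best:
            evidence |= tokens(n)
    return evidence

left, right = "phone", "telephone_no"
print(round(jaccard(tokens(left), tokens(right)), 2))    # 0.0: direct matching finds no evidence
print(round(jaccard(augment(left), augment(right)), 2))  # 1.0: both map to the same corpus concept
```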
Citations: 435
An enhanced query model for soccer video retrieval using temporal relationships
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.20
Shu‐Ching Chen, M. Shyu, Na Zhao
The focal goal of our research is to develop a general framework that can automatically analyze sports video, detect sports events, and ultimately offer an efficient and user-friendly system for sports video retrieval. In our earlier work, a novel multimedia data mining technique was proposed for automatic soccer event extraction by adopting multimodal feature analysis. So far, this framework has been applied to the detection of goal and corner kick events, and the results are quite impressive. Correspondingly, in this work, the detected video events are modeled and effectively stored in the database. A temporal query model is designed to satisfy comprehensive temporal query requirements, and a corresponding graphical query language is developed. These advanced characteristics make our model particularly well suited for searching events in a large-scale video database.
Citations: 20
Exploiting correlated attributes in acquisitional query processing
Pub Date : 2005-04-05 DOI: 10.1109/ICDE.2005.63
A. Deshpande, Carlos Guestrin, W. Hong, S. Madden
Sensor networks and other distributed information systems (such as the Web) must frequently access data that has a high per-attribute acquisition cost, in terms of energy, latency, or computational resources. When executing queries that contain several predicates over such expensive attributes, we observe that it can be beneficial to use correlations to automatically introduce low-cost attributes whose observation will allow the query processor to better estimate the selectivity of these expensive predicates. In particular, we show how to build conditional plans that branch into one or more sub-plans, each with a different ordering for the expensive query predicates, based on the runtime observation of low-cost attributes. We frame the problem of constructing the optimal conditional plan for a given user query and set of candidate low-cost attributes as an optimization problem. We describe an exponential-time algorithm for finding such optimal plans, and describe a polynomial-time heuristic for identifying conditional plans that perform well in practice. We also show how to compactly model the conditional probability distributions needed to identify correlations and build these plans. We evaluate our algorithms against several real-world sensor-network data sets, showing several-times performance increases for a variety of queries versus traditional optimization techniques.
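A minimal Python sketch of such a conditional plan: a low-cost attribute is observed first, and each branch orders the expensive predicates by the classic cost / (1 - selectivity) rank computed from selectivities conditioned on that observation, so each branch evaluates first the predicate most likely to discard the tuple cheaply. The attribute names, costs, and conditional selectivities are invented for illustration.

```python
COSTS = {"audio_detect": 50.0, "image_classify": 120.0}   # expensive acquisition costs

# Selectivity of each expensive predicate conditioned on the cheap attribute.
COND_SEL = {
    ("audio_detect",   "day"):   0.9, ("audio_detect",   "night"): 0.2,
    ("image_classify", "day"):   0.3, ("image_classify", "night"): 0.8,
}

def branch_order(cheap_value):
    """Order expensive predicates by rank = cost / (1 - selectivity | cheap value)."""
    def rank(p):
        sel = COND_SEL[(p, cheap_value)]
        return COSTS[p] / (1.0 - sel) if sel < 1.0 else float("inf")
    return sorted(COSTS, key=rank)

def run_conditional_plan(tup, predicates):
    """Evaluate expensive predicates in the branch-specific order, short-circuiting."""
    for p in branch_order(tup["time_of_day"]):             # cheap attribute already acquired
        if not predicates[p](tup):
            return False
    return True

impls = {"audio_detect": lambda t: t["loudness"] > 0.5,
         "image_classify": lambda t: t["has_bird"]}
sample = {"time_of_day": "night", "loudness": 0.1, "has_bird": True}
print(branch_order("day"), branch_order("night"))          # the two branches order predicates differently
print(run_conditional_plan(sample, impls))                 # False: audio_detect fails first at night
```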
Citations: 138