Proceedings 17th International Conference on Data Engineering: Latest Papers

Counting twig matches in a tree
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914874
Zhiyuan Chen, H. Jagadish, Flip Korn, Nick Koudas, S. Muthukrishnan, R. Ng, D. Srivastava
Describes efficient algorithms for accurately estimating the number of matches of a small node-labeled tree, i.e. a twig, in a large node-labeled tree, using a summary data structure. This problem is of interest for queries on XML and other hierarchical data, to provide query feedback and for cost-based query optimization. Our summary data structure scalably represents approximate frequency information about twiglets (i.e. small twigs) in the data tree. Given a twig query, the number of matches is estimated by creating a set of query twiglets, and combining two complementary approaches: set hashing, used to estimate the number of matches of each query twiglet, and maximal overlap, used to combine the query twiglet estimates into an estimate for the twig query. We propose several estimation algorithms that apply these approaches on query twiglets formed using variations on different twiglet decomposition techniques. We present an extensive experimental evaluation using several real XML data sets, with a variety of twig queries. Our results demonstrate that accurate and robust estimates can be achieved, even with limited space.
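The abstract does not spell out how set hashing is applied to the summary structure. As a rough illustration of the general idea behind set hashing (min-wise hashing), the following Python sketch estimates the resemblance of two sets of integer ids from short signatures. The function names, the hash family and the parameter k are assumptions made for this example, not the paper's data structure.

```python
import random

PRIME = 2_147_483_647  # 2^31 - 1; element ids are assumed to be integers below it


def make_hashes(k, seed=0):
    """Return k (a, b) pairs defining hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(k)]


def signature(items, hashes):
    """Min-hash signature: the smallest hash value of the set under each function."""
    return [min((a * x + b) % PRIME for x in items) for a, b in hashes]


def resemblance(sig_a, sig_b):
    """Fraction of agreeing positions, an estimate of |A n B| / |A u B|."""
    agree = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return agree / len(sig_a)
```

The signatures have a fixed length k regardless of the size of the sets, which is what makes this kind of summary attractive when space is limited.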
Citations: 139
Discovery and application of check constraints in DB2
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914869
Jarek Gryz, Berni Schiefer, Jian Zheng, C. Zuzarte
The traditional role of integrity constraints is to protect the integrity of data, but integrity constraints can and do play other roles in databases; for example, they can be used for query optimization. In this role, they do not need to model the domain; it is sufficient that they describe regularities that are true about the data currently stored in a database. In this paper, we describe two algorithms for finding such regularities (in the syntactic form of check constraints) and discuss some of their applications in DB2. In particular, we show their use in query optimization.
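The abstract does not name the two discovery algorithms. As a hypothetical illustration of the general idea, the Python sketch below derives one simple kind of regularity, a range constraint that holds for the rows currently stored, and uses it to detect predicates that cannot match any row. The function names and the row format are assumptions for this example and are not DB2 interfaces.

```python
def discover_range_constraint(rows, column):
    """Return (low, high) such that low <= row[column] <= high for all current rows."""
    values = [row[column] for row in rows]
    return min(values), max(values)


def predicate_can_match(constraint, op, constant):
    """Return False when 'column op constant' contradicts the range constraint."""
    low, high = constraint
    if op == ">":
        return high > constant
    if op == "<":
        return low < constant
    if op == "=":
        return low <= constant <= high
    raise ValueError(f"unsupported operator: {op}")


rows = [{"age": 18}, {"age": 42}, {"age": 65}]
age_range = discover_range_constraint(rows, "age")
# A query with "age > 70" can be answered as empty without scanning the table.
skip_scan = not predicate_can_match(age_range, ">", 70)
```

Such a constraint describes the stored data rather than the domain, so in this sketch it would have to be rechecked or rediscovered when rows outside the range are inserted.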
Citations: 18
E-business applications for supply chain management: challenges and solutions
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914815
F. Casati, U. Dayal, M. Shan
Supply-chain management is a crucial activity in every company. Surprisingly, today, most of the supply-chain activities are carried out manually, and IT support is often limited to having a set of (disconnected) data repositories. In addition, business-to-business (B2B) communications are performed via phone, fax or e-mail. Increasing the operational efficiency of the supply chain results in huge savings and is the key to remaining competitive or even gaining a competitive advantage. Furthermore, a more efficient supply chain also enables revenue growth, which is often impossible to sustain with the current manual operations. In this paper, we discuss the requirements and challenges for e-business applications that support supply-chain management. Then, we propose an architecture that meets the requirements and enables solutions that deliver results quickly and that evolve with the business and IT environment. Both the requirements and the architecture are the results of several different types of supply-chain automation projects in which we have been involved.
Citations: 17
High-performance, space-efficient, automated object locking
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914825
L. Daynès, G. Czajkowski
Studies the impact of several lock manager designs on the overhead imposed on a persistent programming language by automated object locking. Our study reveals that a lock management method based on lock-state sharing outperforms more traditional lock management designs. Lock-state sharing is a novel lock management method that represents all lock data structures with equal values with a single shared data structure. Sharing the value of locks has numerous benefits: (i) it makes the space consumed by the lock manager small and independent of the number of locks acquired by transactions, (ii) it eliminates the need for expensive bookkeeping of locks by transactions, and (iii) it enables the use of memoization techniques for whole locking operations. These advantages add up to making the release of locks practically free, and the processing of over 99% of the lock requests takes between eight and 14 RISC instructions.
Citations: 11
Duality-based subsequence matching in time-series databases
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914837
Yang-Sae Moon, K. Whang, W. Loh
The authors propose a subsequence matching method, Dual Match, which exploits duality in constructing windows and significantly improves performance. Dual Match divides data sequences into disjoint windows and the query sequence into sliding windows, and is thus the dual of the approach by C. Faloutsos et al. (1994), referred to here as FRM, which divides data sequences into sliding windows and the query sequence into disjoint windows. We formally prove that our dual approach is correct, i.e., it incurs no false dismissal. We also prove that, given the minimum query length, there is a maximum bound on the window size that guarantees correctness of Dual Match, and we discuss the effect of the window size on performance. To avoid the excessive storage space an index would otherwise require, FRM stores minimum bounding rectangles rather than the individual points representing windows, and this causes many false alarms. Dual Match solves this problem by directly storing points, but without incurring excessive storage overhead. Experimental results show that, in most cases, Dual Match provides a large improvement in both false alarms and performance over FRM, given the same amount of storage space. In particular, for low selectivities (less than 10^-4), Dual Match improves performance by up to 430-fold. On the other hand, for high selectivities (more than 10^-2), it shows a very minor degradation (less than 29%). For selectivities in between (10^-4 to 10^-2), Dual Match performs slightly better than FRM. Dual Match is also 4.10 to 25.6 times faster than FRM in building indexes of approximately the same size. Overall, these results indicate that our approach provides a new paradigm in subsequence matching that improves performance significantly in large database applications.
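The window construction described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes Euclidean distance, compares raw windows by a linear scan instead of querying a multidimensional index over transformed points, and the names dist and dual_match are hypothetical. The assertion on the query length reflects the observation that any run of 2w - 1 consecutive positions contains at least one complete disjoint window of size w.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dual_match(data, query, w, eps):
    """Return start offsets s where dist(data[s:s+len(query)], query) <= eps."""
    n, m = len(data), len(query)
    assert m >= 2 * w - 1, "query must be long enough to cover one disjoint window"
    # Disjoint windows of the data sequence (a real system would index these).
    windows = [(i, data[i:i + w]) for i in range(0, n - w + 1, w)]
    candidates = set()
    # Sliding windows of the query sequence.
    for j in range(m - w + 1):
        q_win = query[j:j + w]
        for i, d_win in windows:
            start = i - j  # where the query would begin if these windows align
            if 0 <= start <= n - m and dist(d_win, q_win) <= eps:
                candidates.add(start)
    # Post-processing: discard false alarms by checking the full subsequence.
    return sorted(s for s in candidates if dist(data[s:s + m], query) <= eps)


matches = dual_match([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [4, 5, 6, 7, 8], w=3, eps=0)
```

Because a window-level distance never exceeds the distance of the full subsequences that contain the windows, every true match produces at least one candidate, which is the no-false-dismissal property in this simplified setting.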
Citations: 127
Integrating semi-join-reducers into state-of-the-art query processors
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914872
K. Stocker, Donald Kossmann, R. Braumandl, A. Kemper
Semi-join reducers were introduced in the late 1970s as a means to reduce the communication costs of distributed database systems. Subsequent work in the 1980s showed, however, that semi-join reducers were rarely beneficial for the distributed systems of that time. This paper shows that semi-join reducers can indeed be beneficial in modern client-server or middleware systems, either to reduce communication costs or to better exploit all the resources of a system. Furthermore, we present and evaluate alternative ways to extend state-of-the-art (dynamic programming) query optimizers in order to generate good query plans with semi-join reducers. We present two variants, called Access Root and Join Root, which differ in their implementation complexity, running times and the quality of the plans they produce. We present the results of performance experiments that compare both variants with a traditional query optimizer.
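For readers unfamiliar with the technique, a semi-join reducer can be sketched in Python as follows. This is a generic illustration of the textbook idea, not the Access Root or Join Root optimizer variants; the function names and the dictionary row format are assumptions for this example.

```python
def semi_join_reduce(remote_rows, key, local_keys):
    """Keep only the remote rows whose join key appears in local_keys."""
    return [row for row in remote_rows if row[key] in local_keys]


def join_with_reducer(local_rows, remote_rows, key):
    """Join two relations, shipping only the remote rows that can contribute."""
    # Step 1: project the join column of the local relation; only this is sent.
    local_keys = {row[key] for row in local_rows}
    # Step 2: the remote site filters its relation with the received keys.
    reduced = semi_join_reduce(remote_rows, key, local_keys)
    # Step 3: only the reduced relation travels back and is joined locally.
    by_key = {}
    for row in reduced:
        by_key.setdefault(row[key], []).append(row)
    joined = [{**l, **r} for l in local_rows for r in by_key.get(l[key], [])]
    return joined, len(reduced)


orders = [{"cust": 1, "item": "a"}, {"cust": 3, "item": "b"}]
customers = [{"cust": c, "name": n} for c, n in [(1, "x"), (2, "y"), (3, "z"), (4, "w")]]
joined, shipped = join_with_reducer(orders, customers, "cust")
```

Here only two of the four customer rows are shipped. Whether the saving outweighs the extra round trip for the keys is the kind of trade-off a cost-based optimizer has to weigh.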
Citations: 61
SpinCircuit: a collaborative portal powered by E-speak
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914881
Rabindra Pathak
SpinCircuit is a collaborative portal serving the semiconductor industry. SpinCircuit provides a Web-based environment facilitating B2B collaboration to bring together component manufacturers, component suppliers, contract manufacturers and the design community in the semiconductor space. It is based on E-speak technology from Hewlett-Packard. E-speak provides a secure E-services infrastructure for the creation, composition and discovery of E-services distributed across the Internet.
Citations: 1
Processing queries with expensive functions and large objects in distributed mediator systems
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914817
Luc Bouganim, F. Fabret, F. Porto, P. Valduriez
LeSelect is a mediator system which allows scientists to publish their resources (data and programs) so they can be transparently accessed. The scientists can typically issue queries which access distributed published data and involve the execution of expensive functions (corresponding to programs). Furthermore, the queries can involve large objects, such as images (e.g. archived meteorological satellite data). In this context, the costs of transmitting large objects and invoking expensive functions are the dominant factors of execution time. In this paper, we first propose three query execution techniques which minimize these costs by taking full advantage of the distributed architecture of mediator systems like LeSelect. Then we devise parallel processing strategies for queries including expensive functions. Based on experimentation, we show that it is hard to predict the optimal execution order when dealing with several functions. We propose a new hybrid parallel technique to solve this problem and give some experimental results.
Citations: 22
B+-tree indexes with hybrid row identifiers in Oracle8i
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914846
E. Chong, Souripriya Das, Chuck Freiwald, Jagannathan Srinivasan, Aravind Yalamanchi, M. Jagannath, Anh-Tuan Tran, Ramkumar Krishnan
Most commercial database systems support B+-tree indexes using either physical row identifiers (for example, DB2) or logical row identifiers (for example, NonStop SQL). Physical row identifiers provide fast access to data. However, unlike logical row identifiers, they need to be updated whenever the row moves. This paper describes an alternate approach where hybrid row identifiers are used. A hybrid row identifier consists of two components: a logical component, namely, the primary key of the base table row; and a physical component, namely, the database block address (DBA) of the row. By treating the DBA as a guess regarding where the row may be found, performance comparable to physical B+-tree indexes is attained for valid guess-DBAs. This scheme retains the logical index advantage of avoiding an immediate index update when the base table row moves. Instead, an online utility can be used to lazily fix the invalid guess-DBAs. This scheme has been used to implement B+-tree indexes for Oracle8i index-organized tables (primary B+-tree like structure) which encounter both row movement and table reorganization.
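The lookup path for a hybrid row identifier can be sketched in Python as follows. This is a toy model under stated assumptions, not Oracle code: the Table class, its dictionaries and the fetch function are hypothetical, and a plain dictionary stands in for the primary B+-tree that locates a row by its primary key.

```python
class Table:
    """Toy index-organized table: rows live in blocks and can be found by primary key."""

    def __init__(self):
        self.blocks = {}     # block address -> {primary key: row}
        self.location = {}   # primary key -> block address (stand-in for the primary B+-tree)

    def put(self, dba, key, row):
        """Store a row in block `dba`, moving it out of its old block if it had one."""
        old = self.location.get(key)
        if old is not None:
            del self.blocks[old][key]
        self.blocks.setdefault(dba, {})[key] = row
        self.location[key] = dba


def fetch(table, hybrid_rid):
    """Follow a hybrid row identifier made of (primary key, guess-DBA)."""
    key, guess_dba = hybrid_rid
    row = table.blocks.get(guess_dba, {}).get(key)
    if row is not None:
        return row, "guess hit"       # one block access, like a physical identifier
    # Stale guess: fall back to the logical component, the primary key.
    dba = table.location[key]
    return table.blocks[dba][key], "guess miss"


table = Table()
table.put(10, "k1", {"v": 1})
rid = ("k1", 10)                      # as stored in a secondary index entry
before_move = fetch(table, rid)
table.put(20, "k1", {"v": 1})         # the row moves; the index entry is not updated
after_move = fetch(table, rid)
```

The stale identifier still resolves correctly after the move, only more slowly, which is why the guess can be repaired lazily rather than at the time the row moves.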
Citations: 1
Distinctiveness-sensitive nearest-neighbor search for efficient similarity retrieval of multimedia information
Pub Date : 2001-04-02 DOI: 10.1109/ICDE.2001.914863
Norio Katayama, S. Satoh
Nearest neighbor (NN) search in high dimensional feature space is widely used for similarity retrieval of multimedia information. However, recent research results in the database literature reveal that a curious problem arises in high dimensional space. Since high dimensional space has a high degree of freedom, points can be scattered so that the distances between them show no significant difference. In this case, we can say that the NN is indistinctive because many points exist at similar distances. To make matters worse, indistinctive NNs incur a higher search cost because the search completes only after choosing the NN from plenty of strong candidates. In order to circumvent the harmful effect of indistinctive NNs, the paper presents a new NN search algorithm which determines the distinctiveness of the NN during the search operation. This enables us not only to cut down the search cost but also to distinguish distinctive NNs from indistinctive ones. These advantages are especially beneficial to interactive retrieval systems.
Citations: 43
Journal
Proceedings 17th International Conference on Data Engineering