
Proceedings 14th International Conference on Data Engineering: Latest Papers

Network latency optimizations in distributed database systems
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655816
S. Banerjee, Panos K. Chrysanthis
The advent of high-speed networks will enable the deployment of data-server systems (currently used in LANs) over WANs. The users of these systems will have the same high expectations with respect to performance parameters (such as the transaction throughput, response time and system reliability) as in the case of LANs. Thus, it is important to study the performance of existing distributed database protocols in the new networking environment, identify the performance bottlenecks and develop protocols that are capable of taking advantage of the high-speed networking technology. As a first step, in this paper, we examine the scalability of the server-based two-phase locking (s-2PL) protocol, and discuss three optimizations which allow the s-2PL protocol to be tailored for high-speed WAN environments where the size of the message is less of a concern than the number of rounds of message passing. These optimizations, collectively called the group two-phase locking (g-2PL) protocol, reduce the number of rounds of message passing by grouping lock grants, client-end caching and data migration. In a simulation study, 20-25% improvement in the response time of the g-2PL protocol over that of the s-2PL protocol was observed.
Citations: 9
Data logging: a method for efficient data updates in constantly active RAIDs
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655770
E. Gabber, H. F. Korth
RAIDs (Redundant Arrays of Independent Disks) are a set of disks organized to achieve parallel I/O to multiple disks and to provide tolerance of disk failures. RAIDs offer these advantages at the cost of additional space and additional disk I/O for writes. Previous methods of reducing this I/O overhead suffered from such problems as requiring periods during which data is reorganized and not available, destroying the physical locality of data, or weakening the RAID's fault-tolerance properties. We propose a new method called data logging which reduces the I/O overhead without requiring periodic downtime for reorganization. Instead, incremental maintenance can be performed concurrently with routine processing. This is particularly advantageous in applications requiring "24×7" uptime. Data logging preserves both physical locality of data and RAID fault tolerance. The major cost of our method is a moderate amount of nonvolatile RAM. This paper describes our method, as well as two schemes for efficient encoding of the information that must be stored in nonvolatile RAM.
Citations: 20
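The "additional disk I/O for writes" mentioned in the data logging abstract can be illustrated with a small sketch. It assumes a parity-based layout in the style of RAID 5, which the abstract does not specify, and it shows only the conventional read-modify-write parity update that causes the overhead, not the data logging method itself. The function names `xor_blocks` and `small_write_parity` are hypothetical.

```python
def xor_blocks(*blocks):
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def small_write_parity(old_data, old_parity, new_data):
    """New parity after updating one data block (assumed RAID 5 style).

    The caller must first read old_data and old_parity, then write
    new_data and the returned parity: four disk I/Os for one logical write.
    """
    return xor_blocks(old_parity, old_data, new_data)


# A three-block stripe plus its parity block.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_blocks(*stripe)

# Update the middle block without touching the other data blocks.
new_block = b"\xaa\x55"
new_parity = small_write_parity(stripe[1], parity, new_block)
stripe[1] = new_block
```

The incrementally computed parity matches what a full recomputation over the stripe would give, which is why only the changed block and the parity block need to be read and rewritten.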
Design and implementation of display specification for multimedia answers
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655819
Chitta Baral, G. Gonzalez, Tran Cao Son
We present the design and implementation of a loosely-bound SQL extension that allows users to include high-level display specifications with an SQL query, particularly when dealing with multimedia databases. We describe an architecture that allows a relatively simple implementation of dynamic query browsers using the proposed query language on stand-alone applications or World Wide Web pages. We have already implemented most of our proposed extension.
Citations: 13
High dimensional similarity joins: algorithms and performance evaluation
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655809
Nick Koudas, K. Sevcik
Current data repositories include a variety of data types, including audio, images and time series. State of the art techniques for indexing such data and doing query processing rely on a transformation of data elements into points in a multidimensional feature space. Indexing and query processing then take place in the feature space. We study algorithms for finding relationships among points in multidimensional feature spaces, specifically algorithms for multidimensional joins. Like joins of conventional relations, correlations between multidimensional feature spaces can offer valuable information about the data sets involved. We present several algorithmic paradigms for solving the multidimensional join problem, and we discuss their features and limitations. We propose a generalization of the Size Separation Spatial Join algorithm, named Multidimensional Spatial Join (MSJ), to solve the multidimensional join problem. We evaluate MSJ along with several other specific algorithms, comparing their performance for various dimensionalities on both real and synthetic multidimensional data sets. Our experimental results indicate that MSJ, which is based on space filling curves, consistently yields good performance across a wide range of dimensionalities.
Citations: 121
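The similarity join abstract states that MSJ is based on space filling curves but does not say which curve. As a minimal sketch of the general idea, the following assumes a Z-order (Morton) curve, which maps a multidimensional grid point to a single integer by interleaving the bits of its coordinates. The function name `z_order_key` and its parameters are hypothetical and are not taken from the paper.

```python
def z_order_key(coords, bits):
    """Interleave the low `bits` bits of each coordinate into one integer.

    Assumes non-negative integer grid coordinates; the first coordinate
    contributes the most significant bit at each bit level.
    """
    key = 0
    for bit in range(bits - 1, -1, -1):
        for c in coords:
            key = (key << 1) | ((c >> bit) & 1)
    return key


# Sorting points by key yields a one-dimensional order over the grid.
points = [(2, 1), (0, 0), (3, 3), (1, 0)]
ordered = sorted(points, key=lambda p: z_order_key(p, bits=2))
```

Points that are close on the curve tend to be close in the original space, so a one-dimensional sort can be used to group candidate join partners. The converse does not always hold, which is one reason join algorithms built on such curves need additional care at cell boundaries.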
Fast nearest neighbor search in high-dimensional space
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655779
Stefan Berchtold, Bernhard Ertl, D. Keim, H. Kriegel, T. Seidl
Similarity search in multimedia databases requires an efficient support of nearest neighbor search on a large set of high dimensional points as a basic operation for query processing. As recent theoretical results show, state of the art approaches to nearest neighbor search are not efficient in higher dimensions. In our new approach, we therefore precompute the result of any nearest neighbor search which corresponds to a computation of the Voronoi cell of each data point. In a second step, we store the Voronoi cells in an index structure efficient for high dimensional data spaces. As a result, nearest neighbor search corresponds to a simple point query on the index structure. Although our technique is based on a precomputation of the solution space, it is dynamic, i.e. it supports insertions of new data points. An extensive experimental evaluation of our technique demonstrates the high efficiency for uniformly distributed as well as real data. We obtained a significant reduction of the search time compared to nearest neighbor search in the X-tree (up to a factor of 4).
Citations: 173
A graphical editor for the conceptual design of business rules
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655824
Peter Lang, W. Obermair, W. Kraus, T. Thalhammer
At the conceptual level, business rules are formulated from an external observation perspective according to the event-condition-action structure of rules in active database systems. Situation/activation diagrams homogeneously extend object/behavior diagrams to a graphical notation for the conceptual design of business objects and their associated business rules. Situation diagrams provide a high-level representation of logical events. Activation diagrams specify graphically which activities have to be performed upon some triggering event if an associated condition is satisfied. The developed editor supports both object/behavior diagrams and situation/activation diagrams. The editor performs syntactic consistency checks during the interactive design process. Moreover, by building the logical model in parallel with the diagrams, the editor guarantees that local semantic consistency checks can be performed incrementally, too. The editor clearly separates schema data from pure visualization data describing the location of diagram elements. This separation facilitates the reuse of the generated data for further processing. Both schema data and visualization data are stored in the commercial object-oriented database system GemStone. Alternatively, the data may be stored in a file. The editor has been implemented using VisualWorks and MetaDoME, a framework for building graphical editors with VisualWorks.
Citations: 6
A distribution-based clustering algorithm for mining in large spatial databases
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655795
Xiaowei Xu, M. Ester, H. Kriegel, J. Sander
The problem of detecting clusters of points belonging to a spatial point process arises in many applications. In this paper, we introduce the new clustering algorithm DBCLASD (Distribution-Based Clustering of LArge Spatial Databases) to discover clusters of this type. The results of experiments demonstrate that DBCLASD, contrary to partitioning algorithms such as CLARANS (Clustering Large Applications based on RANdomized Search), discovers clusters of arbitrary shape. Furthermore, DBCLASD does not require any input parameters, in contrast to the clustering algorithm DBSCAN (Density-Based Spatial Clustering of Applications with Noise) requiring two input parameters, which may be difficult to provide for large databases. In terms of efficiency, DBCLASD is between CLARANS and DBSCAN, close to DBSCAN. Thus, the efficiency of DBCLASD on large spatial databases is very attractive when considering its nonparametric nature and its good quality for clusters of arbitrary shape.
Citations: 383
Asynchronous version advancement in a distributed three version database
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655805
H. V. Jagadish, Inderpal Singh Mumick, Michael Rabinovich
We present an efficient protocol for multi-version concurrency control in distributed databases. The protocol creates no more than three versions of any data item, while guaranteeing that: update transactions never interfere with read-only transactions; the version advancement mechanism is completely asynchronous with (both update and read-only) user transactions; and read-only transactions do not acquire locks and do not write control information into the data items being read. This is an improvement over existing multi-versioning schemes for distributed databases, which either require a potentially unlimited number of versions, or require coordination between version advancement and user transactions. Our protocol can be applied in a centralized system also, where the improvement over existing techniques is in reducing the number of versions from four to three. The proposed protocol is valuable in large applications that currently shut off access to the system while managing version advancement manually, but now have a need for automating this process and providing continuous access to the data.
Citations: 8
Efficient retrieval of similar time sequences under time warping
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655778
Byoung-Kee Yi, H. Jagadish, C. Faloutsos
Fast similarity searching in large time sequence databases has typically used Euclidean distance as a dissimilarity metric. However, for several applications, including matching of voice, audio and medical signals (e.g., electrocardiograms), one is required to permit local accelerations and decelerations in the rate of sequences, leading to a popular, field tested dissimilarity metric called the "time warping" distance. From the indexing viewpoint, this metric presents two major challenges: (a) it does not lead to any natural indexable "features", and (b) comparing two sequences requires time quadratic in the sequence length. To address each problem, we propose to use: (a) a modification of the so called "FastMap", to map sequences into points, with little compromise of "recall" (typically zero); and (b) a fast linear test, to help us discard quickly many of the false alarms that FastMap will typically introduce. Using both ideas in cascade, our proposed method achieved up to an order of magnitude speed-up over sequential scanning on both real and synthetic datasets.
Citations: 784
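The time warping abstract notes that comparing two sequences takes time quadratic in the sequence length. A minimal sketch of the textbook dynamic-programming formulation shows where that cost comes from. The absolute difference used as the per-element cost is an assumption, since the abstract does not state the base distance, and the function name `time_warping_distance` is hypothetical. This is not the FastMap-based method the paper proposes.

```python
def time_warping_distance(a, b):
    """Textbook dynamic-programming time warping distance.

    Fills an (len(a) + 1) x (len(b) + 1) table, hence quadratic time.
    The per-element cost |x - y| is an assumed choice.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Either sequence may stall (repeat an element) or both advance.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# A locally slowed-down copy of a sequence is at distance zero.
same_shape = time_warping_distance([1, 2, 3], [1, 2, 2, 3])
```

Unlike Euclidean distance, this measure accepts sequences of different lengths, which is what allows the local accelerations and decelerations described in the abstract.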
Leveraging mediator cost models with heterogeneous data sources
Pub Date : 1998-02-23 DOI: 10.1109/ICDE.1998.655798
Hubert Naacke, G. Gardarin, A. Tomasic
Distributed systems require declarative access to diverse information sources. One approach to solving this heterogeneous distributed database problem is based on mediator architectures. In these architectures, mediators accept queries from users, process them with respect to wrappers, and return answers. Wrappers provide access to the underlying sources. To process queries efficiently, the mediator must optimize the plan used for processing each query. In classical databases, cost-estimate-based query optimization is effective. In heterogeneous distributed databases, cost-estimate-based query optimization is difficult to achieve because the underlying data sources do not export cost information. This paper describes a new method that permits the wrapper programmer to export cost estimates. Describing every cost estimate may be impossible for the wrapper programmer for lack of information, or burdensome because of the amount of information involved. We ease this responsibility by combining the generic cost model of the mediator with specific cost estimates from the wrappers.
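One way to picture the combination described in the abstract is a lookup that prefers a wrapper-exported estimate and falls back to the mediator's generic formula. This is a minimal sketch under that assumption only; the class names, the operator names, and the cost formulas are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical type: maps an input cardinality to an estimated cost.
CostFn = Callable[[int], float]


@dataclass
class Wrapper:
    name: str
    # Operator name -> source-specific formula the wrapper chooses to export.
    # A wrapper may export nothing at all, or only some operators.
    exported_costs: Dict[str, CostFn] = field(default_factory=dict)


@dataclass
class Mediator:
    # Generic per-operator formulas, used when a wrapper exports no estimate.
    generic_costs: Dict[str, CostFn]

    def estimate(self, wrapper: Wrapper, operator: str, rows: int) -> float:
        # Prefer the wrapper's specific estimate; otherwise use the generic model.
        cost_fn = wrapper.exported_costs.get(operator, self.generic_costs[operator])
        return cost_fn(rows)


# Illustrative numbers only.
mediator = Mediator(generic_costs={"scan": lambda n: 1.0 * n,
                                   "select": lambda n: 0.5 * n})
flat_files = Wrapper("flat_files")  # exports no cost information
indexed = Wrapper("indexed_store", exported_costs={"select": lambda n: 10.0})
```

Here `flat_files` is costed entirely by the generic model, while `indexed` overrides only `select`, so the wrapper programmer supplies just the estimates that are known and worth writing down.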
Citations: 73