A performance evaluation of load balancing techniques for join operations on multicomputer database systems

Proceedings of the Eleventh International Conference on Data Engineering; Pub Date: 1995-03-06; DOI: 10.1109/ICDE.1995.380411

K. Hua, Wallapak Tavanapong, H. Young


Citations: 14


Abstract

There has been a wealth of research in the area of parallel join algorithms. Among them, hash-based algorithms are particularly suitable for shared-nothing database systems. The effectiveness of these techniques depends on the uniformity in the distribution of the join attribute values. When this condition is not met, a severe fluctuation may occur among the bucket sizes, causing uneven workload for the processing nodes. Many parallel join algorithms with load balancing capability have been proposed to address this problem. Among them, the sampling and incremental approaches have been shown to provide an improvement over the more conventional methods. The comparison between these two approaches, however, has not been investigated. In this paper, we improve these techniques and implement them on an nCUBE/2 parallel computer to compare their performance. Our study indicates that the sampling technique is the better approach.
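The skew problem and the general idea behind sampling-based balancing can be illustrated with a small sketch. This is not the authors' implementation and does not reproduce their nCUBE/2 experiments; it is a generic illustration under assumptions chosen here: join keys whose frequency falls off as 1/rank, modulo used as a stand-in hash function, 4 nodes, 32 buckets, a 10% random sample, and a greedy largest-first assignment of buckets to nodes. All names and constants are hypothetical.

```python
import random
from collections import Counter

NUM_NODES = 4          # hypothetical number of processing nodes
NUM_BUCKETS = 32       # hypothetical: more hash buckets than nodes
SAMPLE_FRACTION = 0.10 # hypothetical sampling rate


def make_skewed_keys(distinct=1000, scale=2000):
    """Build join keys where the key of rank r occurs scale // (r + 1) times."""
    keys = []
    for rank in range(distinct):
        keys.extend([rank] * (scale // (rank + 1)))
    return keys


def loads_naive(keys, num_nodes):
    """Conventional scheme: one bucket per node, chosen by key modulo num_nodes."""
    loads = [0] * num_nodes
    for k in keys:
        loads[k % num_nodes] += 1
    return loads


def loads_with_sampling(keys, num_nodes, num_buckets, fraction, seed=0):
    """Estimate bucket sizes from a sample, then assign buckets to nodes greedily."""
    rng = random.Random(seed)
    sample = rng.sample(keys, int(len(keys) * fraction))
    estimate = Counter(k % num_buckets for k in sample)

    # Largest estimated bucket first, each onto the currently lightest node.
    est_loads = [0] * num_nodes
    bucket_to_node = {}
    for bucket in sorted(range(num_buckets), key=lambda b: -estimate[b]):
        node = est_loads.index(min(est_loads))
        bucket_to_node[bucket] = node
        est_loads[node] += estimate[bucket]

    # Measure the true load each node receives under that assignment.
    loads = [0] * num_nodes
    for k in keys:
        loads[bucket_to_node[k % num_buckets]] += 1
    return loads


keys = make_skewed_keys()
naive = loads_naive(keys, NUM_NODES)
balanced = loads_with_sampling(keys, NUM_NODES, NUM_BUCKETS, SAMPLE_FRACTION)

mean = len(keys) / NUM_NODES
print("naive loads:   ", naive, "max/mean = %.2f" % (max(naive) / mean))
print("balanced loads:", balanced, "max/mean = %.2f" % (max(balanced) / mean))
```

In this sketch the join finishes only when the most heavily loaded node finishes, so the max/mean ratio is the quantity to watch. A finer bucket count gives the assignment step smaller pieces to work with, although a single dominant key value still bounds what bucket-level assignment alone can achieve.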

