Performance analysis of parallel hash join algorithms on a distributed shared memory machine implementation and evaluation on HP exemplar SPP 1600

M. Nakano, H. Imai, M. Kitsuregawa
{"title":"Performance analysis of parallel hash join algorithms on a distributed shared memory machine implementation and evaluation on HP exemplar SPP 1600","authors":"M. Nakano, H. Imai, M. Kitsuregawa","doi":"10.1109/ICDE.1998.655761","DOIUrl":null,"url":null,"abstract":"The distributed shared memory (DSM) architecture is considered to be one of the most likely parallel computing environment candidate for the near future because of its ease of system scalability and facilitation for parallel programming. However, a naive program based on shared memory execution on a DSM machine often deteriorates performance, because of the overhead involved for maintaining cache coherency particularly with frequent remote memory accesses. We show that careful buffer management of parallel join processing on DSM can produce considerable performance improvements in comparison with a naive implementation. We propose four buffer management strategies for parallel hash join processing on the DSM architecture and actually implement them on the HP Exemplar SPP 1600. The basic strategy is to begin with the hash join algorithm for the shared everything architecture and then to consider the memory locality of DSM by distributing the hash table and data pool buffers among the nodes. The results of four buffering strategies are analyzed in detail. Consequently, we can conclude that, in order to achieve high performance on a DSM machine, our buffer management strategy in which the memory access pattern is extracted and buffers are allocated in the local memory of nodes to minimize memory access cost is very efficient.","PeriodicalId":264926,"journal":{"name":"Proceedings 14th International Conference on Data Engineering","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1998-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 14th International Conference on Data Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.1998.655761","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The distributed shared memory (DSM) architecture is considered to be one of the most likely parallel computing environment candidate for the near future because of its ease of system scalability and facilitation for parallel programming. However, a naive program based on shared memory execution on a DSM machine often deteriorates performance, because of the overhead involved for maintaining cache coherency particularly with frequent remote memory accesses. We show that careful buffer management of parallel join processing on DSM can produce considerable performance improvements in comparison with a naive implementation. We propose four buffer management strategies for parallel hash join processing on the DSM architecture and actually implement them on the HP Exemplar SPP 1600. The basic strategy is to begin with the hash join algorithm for the shared everything architecture and then to consider the memory locality of DSM by distributing the hash table and data pool buffers among the nodes. The results of four buffering strategies are analyzed in detail. Consequently, we can conclude that, in order to achieve high performance on a DSM machine, our buffer management strategy in which the memory access pattern is extracted and buffers are allocated in the local memory of nodes to minimize memory access cost is very efficient.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
并行哈希连接算法在分布式共享内存机上的性能分析,在HP范例spp1600上的实现与评估
分布式共享内存(DSM)体系结构被认为是在不久的将来最有可能的并行计算环境候选之一,因为它易于系统可伸缩性和促进并行编程。然而,在DSM机器上基于共享内存执行的原始程序通常会降低性能,因为维护缓存一致性涉及开销,特别是在频繁的远程内存访问时。我们表明,与简单的实现相比,对DSM上的并行连接处理进行仔细的缓冲区管理可以产生相当大的性能改进。我们提出了四种缓冲管理策略,用于DSM架构上的并行哈希连接处理,并在HP Exemplar SPP 1600上实际实现。基本策略是从共享一切架构的哈希连接算法开始,然后通过在节点之间分布哈希表和数据池缓冲区来考虑DSM的内存局部性。详细分析了四种缓冲策略的效果。因此,我们可以得出结论,为了在DSM机器上实现高性能,我们的缓冲区管理策略非常有效,其中提取内存访问模式并在节点的本地内存中分配缓冲区以最小化内存访问成本。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A distribution-based clustering algorithm for mining in large spatial databases Parallelizing loops in database programming languages Data logging: a method for efficient data updates in constantly active RAIDs Query processing in a video retrieval system Optimizing regular path expressions using graph schemas
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1