Reducing cache misses in hash join probing phase by pre-sorting strategy (abstract only)

Gihwan Oh, Jae-Myung Kim, Woon-Hak Kang, Sang-Won Lee
{"title":"Reducing cache misses in hash join probing phase by pre-sorting strategy (abstract only)","authors":"Gihwan Oh, Jae-Myung Kim, Woon-Hak Kang, Sang-Won Lee","doi":"10.1145/2213836.2213971","DOIUrl":null,"url":null,"abstract":"Recently, several studies on multi-core cache-aware hash join have been carried out [Kim09VLDB, Blanas11SIGMOD]. In particular, the work of Blanas has shown that rather simple no-partitioning hash join can outperform the work of Kim. Meanwhile, the simple but best performing hash join of Blanas still experiences severe cache misses in probing phase. Because the key values of tuples in outer relation are not sorted or clustered, each outer record has different hashed key value and thus accesses the different hash bucket. Since the size of hash table of inner table is usually much larger than that of the CPU cache, it is highly probable that the reference to hash bucket of inner table by each outer record would encounter cache miss. To reduce the cache misses in hash join probing phase, we propose a new join algorithm, Sorted Probing (in short, SP), which pre-sorts the hashed key values of outer table of hash join so that the access to the hash bucket of inner table has strong temporal locality, thus minimizing the cache misses during the probing phase. As an optimization technique of sorting, we used the cache-aware AlphaSort technique, which extracts the key from each record of data set to be sorted and its pointer, and then sorts the pairs of (key, rec_ptr). For performance evaluation, we used two hash join algorithms from Blanas' work, no partitioning(NP) and independent partitioning(IP) in a standard C++ program, provided by Blanas. Also, we implemented the AlphaSort and added it before each probing phase of NP and IP, and we call each algorithm as NP+SP and IP+SP. For syntactic workload, IP+SP outperforms all other algorithms: IP+SP is faster than other altorithms up to 30%.","PeriodicalId":212616,"journal":{"name":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","volume":"91 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2213836.2213971","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Recently, several studies on multi-core, cache-aware hash joins have been carried out [Kim09VLDB, Blanas11SIGMOD]. In particular, the work of Blanas has shown that a rather simple no-partitioning hash join can outperform the algorithm of Kim. Even so, Blanas's simple but best-performing hash join still suffers severe cache misses in the probing phase. Because the key values of tuples in the outer relation are neither sorted nor clustered, each outer record has a different hashed key value and therefore accesses a different hash bucket. Since the hash table built on the inner table is usually much larger than the CPU cache, each outer record's reference to an inner hash bucket is highly likely to incur a cache miss. To reduce cache misses in the hash join probing phase, we propose a new join algorithm, Sorted Probing (SP for short), which pre-sorts the hashed key values of the outer table so that accesses to the hash buckets of the inner table have strong temporal locality, minimizing cache misses during the probing phase. For sorting, we use the cache-aware AlphaSort technique, which extracts from each record of the data set to be sorted its key and a pointer to the record, and then sorts the resulting (key, rec_ptr) pairs. For performance evaluation, we used two hash join algorithms from Blanas's work, no partitioning (NP) and independent partitioning (IP), in a standard C++ program provided by Blanas. We also implemented AlphaSort and added it before the probing phase of NP and IP; we call the resulting algorithms NP+SP and IP+SP. On a synthetic workload, IP+SP outperforms all other algorithms, running up to 30% faster.
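To make the sorted-probing idea concrete, the following C++ sketch illustrates it under stated assumptions; it is not the authors' implementation. The Tuple layout, the modulo hash function, the chained HashTable, and the sorted_probe routine are all hypothetical choices made only to show how sorting lightweight (hash value, record pointer) pairs before probing gives the probe loop temporal locality on the inner table's buckets.

```cpp
// Minimal sketch of Sorted Probing (SP): pre-sort (hash, rec_ptr) pairs of the
// outer table so that consecutive probes touch the same inner hash bucket.
// All data structures here are illustrative assumptions, not the paper's code.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Tuple { uint64_t key; uint64_t payload; };

struct Bucket { std::vector<Tuple> tuples; };   // simple chained bucket

struct HashTable {
    std::vector<Bucket> buckets;
    explicit HashTable(std::size_t n) : buckets(n) {}
    std::size_t hash(uint64_t key) const { return key % buckets.size(); }
    void insert(const Tuple& t) { buckets[hash(t.key)].tuples.push_back(t); }
};

// AlphaSort-style pair: sort small (key, rec_ptr) entries, not whole tuples.
struct ProbePair { std::size_t bucket_idx; const Tuple* rec; };

std::size_t sorted_probe(const HashTable& inner, const std::vector<Tuple>& outer) {
    // 1. Extract the hashed key (bucket index) and a pointer for each outer record.
    std::vector<ProbePair> pairs;
    pairs.reserve(outer.size());
    for (const Tuple& t : outer)
        pairs.push_back({inner.hash(t.key), &t});

    // 2. Pre-sort by bucket index so probes to the same bucket are adjacent,
    //    keeping the bucket cache-resident across consecutive probes.
    std::sort(pairs.begin(), pairs.end(),
              [](const ProbePair& a, const ProbePair& b) {
                  return a.bucket_idx < b.bucket_idx;
              });

    // 3. Probe in sorted order.
    std::size_t matches = 0;
    for (const ProbePair& p : pairs)
        for (const Tuple& cand : inner.buckets[p.bucket_idx].tuples)
            if (cand.key == p.rec->key)
                ++matches;   // a real join would emit the joined tuple here
    return matches;
}
```

In this sketch the only extra work over a plain probe loop is building and sorting the small pairs, which is the trade-off the abstract describes: pay a sorting cost up front to turn random bucket accesses into clustered, cache-friendly ones.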