Cache-conscious scheduling of streaming pipelines on parallel machines with private caches

Kunal Agrawal, Jordyn C. Maglalang, Jeremy T. Fineman
{"title":"Cache-conscious scheduling of streaming pipelines on parallel machines with private caches","authors":"Kunal Agrawal, Jordyn C. Maglalang, Jeremy T. Fineman","doi":"10.1109/HiPC.2014.7116893","DOIUrl":null,"url":null,"abstract":"This paper studies the problem of scheduling a streaming pipeline on a multicore machine with private caches to maximize throughput. The theoretical contribution includes lower and upper bounds in the parallel external-memory model. We show that a simple greedy scheduling strategy is asymptotically optimal with a constant-factor memory augmentation. More specifically, we show that if our strategy has a running time of Q cache misses on a machine with size-M caches, then every “static” scheduling policy must have time at least that of Q(Q) cache misses on a machine with size-M/6 caches. Our experimental study considers the question of whether scheduling based on cache effects is more important than scheduling based on only the number of computation steps. Using synthetic pipelines with a range of parameters, we compare our cache-based partitioning against several other static schedulers that load-balance computation. In most cases, the cache-based partitioning indeed beats the other schedulers, but there are some cases that go the other way. We conclude that considering cache effects is a good idea, but other features of the streaming pipeline are also important.","PeriodicalId":337777,"journal":{"name":"2014 21st International Conference on High Performance Computing (HiPC)","volume":"150 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 21st International Conference on High Performance Computing (HiPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HiPC.2014.7116893","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

This paper studies the problem of scheduling a streaming pipeline on a multicore machine with private caches to maximize throughput. The theoretical contribution includes lower and upper bounds in the parallel external-memory model. We show that a simple greedy scheduling strategy is asymptotically optimal with a constant-factor memory augmentation. More specifically, we show that if our strategy has a running time of Q cache misses on a machine with size-M caches, then every “static” scheduling policy must have time at least that of Q(Q) cache misses on a machine with size-M/6 caches. Our experimental study considers the question of whether scheduling based on cache effects is more important than scheduling based on only the number of computation steps. Using synthetic pipelines with a range of parameters, we compare our cache-based partitioning against several other static schedulers that load-balance computation. In most cases, the cache-based partitioning indeed beats the other schedulers, but there are some cases that go the other way. We conclude that considering cache effects is a good idea, but other features of the streaming pipeline are also important.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
具有私有缓存的并行机器上流管道的缓存意识调度
研究了在多核机器上使用私有缓存实现流管道最大吞吐量的调度问题。理论贡献包括并行外部存储器模型的下界和上界。我们证明了一个简单的贪婪调度策略是渐进最优的,具有恒定因子的内存增加。更具体地说,我们表明,如果我们的策略在一个大小为m的机器上有Q个缓存丢失的运行时间,那么每个“静态”调度策略在一个大小为m /6的机器上必须至少有Q(Q)个缓存丢失的时间。我们的实验研究考虑了基于缓存效果的调度是否比仅基于计算步数的调度更重要的问题。使用具有一系列参数的合成管道,我们将基于缓存的分区与其他几个负载平衡计算的静态调度器进行比较。在大多数情况下,基于缓存的分区确实优于其他调度器,但也有一些情况相反。我们得出结论,考虑缓存效果是一个好主意,但流媒体管道的其他特性也很重要。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Design and evaluation of parallel hashing over large-scale data Scaling graph community detection on the Tilera many-core architecture Cache-conscious scheduling of streaming pipelines on parallel machines with private caches A high performance broadcast design with hardware multicast and GPUDirect RDMA for streaming applications on Infiniband clusters Saving energy by exploiting residual imbalances on iterative applications
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1