SAM: Optimizing Multithreaded Cores for Speculative Parallelism

Maleen Abeydeera, Suvinay Subramanian, M. C. Jeffrey, J. Emer, Daniel Sánchez
Published in: 2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2017-09-01
DOI: 10.1109/PACT.2017.37 (https://doi.org/10.1109/PACT.2017.37)
Citations: 11

Abstract

This work studies the interplay between multithreaded cores and speculative parallelism (e.g., transactional memory or thread-level speculation). These techniques are often used together, yet they have been developed independently. This disconnect causes major performance pathologies: increasing the number of threads per core adds conflicts and wasted work, and puts pressure on speculative execution resources. These pathologies often squander the benefits of multithreading. We present speculation-aware multithreading (SAM), a simple policy that addresses these pathologies. By coordinating instruction dispatch and conflict resolution priorities, SAM focuses execution resources on work that is more likely to commit, avoiding aborts and using speculation resources more efficiently. We design SAM variants for in-order and out-of-order cores. SAM is cheap to implement and makes multithreaded cores much more beneficial on speculative parallel programs. We evaluate SAM on systems with up to 64 SMT cores. With SAM, 8-threaded cores outperform single-threaded cores by 2.33x on average, while a speculation-oblivious policy yields a 1.85x speedup. SAM also reduces wasted work by 52%.
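The abstract does not spell out how dispatch and conflict resolution priorities are coordinated, so the following is only an illustrative sketch under stated assumptions, not the paper's implementation. It assumes each hardware thread carries a conflict-resolution priority in the form of a timestamp, where the lower timestamp wins a conflict and is therefore more likely to commit. The names `HwThread`, `timestamp`, `pick_oblivious`, and `pick_speculation_aware` are hypothetical. The sketch contrasts a speculation-oblivious round-robin dispatcher with one that favors the thread most likely to commit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HwThread:
    tid: int
    timestamp: int  # assumed conflict-resolution priority: lower wins conflicts
    ready: bool     # has an instruction ready to dispatch this cycle


def pick_oblivious(threads: List[HwThread], cycle: int) -> Optional[int]:
    """Speculation-oblivious baseline: round-robin over ready threads."""
    n = len(threads)
    for i in range(n):
        t = threads[(cycle + i) % n]
        if t.ready:
            return t.tid
    return None


def pick_speculation_aware(threads: List[HwThread]) -> Optional[int]:
    """Dispatch from the ready thread that would win a conflict.

    Uses the same ordering for dispatch as for conflict resolution
    (an assumption made for this sketch), breaking ties by thread id.
    """
    ready = [t for t in threads if t.ready]
    if not ready:
        return None
    return min(ready, key=lambda t: (t.timestamp, t.tid)).tid


threads = [
    HwThread(tid=0, timestamp=30, ready=True),
    HwThread(tid=1, timestamp=10, ready=True),
    HwThread(tid=2, timestamp=20, ready=False),
    HwThread(tid=3, timestamp=5, ready=False),
]

print(pick_oblivious(threads, cycle=0))   # round-robin starts at thread 0
print(pick_speculation_aware(threads))    # lowest timestamp among ready threads
```

In this toy setup the round-robin policy dispatches from thread 0 even though thread 1 would win a conflict between them, while the speculation-aware policy spends the dispatch slot on thread 1. Thread 3 has the lowest timestamp overall but is not ready, so neither policy selects it.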