Utility-based acceleration of multithreaded applications on asymmetric CMPs

José A. Joao, M. A. Suleman, O. Mutlu, Y. Patt
Proceedings of the 40th Annual International Symposium on Computer Architecture, published 2013-06-23
DOI: 10.1145/2485922.2485936 (https://doi.org/10.1145/2485922.2485936)
Citations: 93

Abstract

Asymmetric Chip Multiprocessors (ACMPs) are becoming a reality. ACMPs can speed up parallel applications if they can identify and accelerate code segments that are critical for performance. Proposals already exist for using coarse-grained thread scheduling and fine-grained bottleneck acceleration. Unfortunately, there have been no proposals offered thus far to decide which code segments to accelerate in cases where both coarse-grained thread scheduling and fine-grained bottleneck acceleration could have value. This paper proposes Utility-Based Acceleration of Multithreaded Applications on Asymmetric CMPs (UBA), a cooperative software/hardware mechanism for identifying and accelerating the most likely critical code segments from a set of multithreaded applications running on an ACMP. The key idea is a new Utility of Acceleration metric that quantifies the performance benefit of accelerating a bottleneck or a thread by taking into account both the criticality and the expected speedup. UBA outperforms the best of two state-of-the-art mechanisms by 11% for single-application workloads and by 7% for two-application workloads on an ACMP with 52 small cores and 3 large cores.
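
The idea of ranking bottlenecks and threads by a single utility value can be illustrated with a minimal sketch. The abstract does not give the formula, so the code below assumes a hypothetical one: utility is a criticality estimate multiplied by the fraction of the segment's time that a large core would save. The names `Segment`, `utility`, and `pick_for_large_cores`, and all numbers, are illustrative assumptions and not the paper's definitions.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A candidate for acceleration: a bottleneck or a whole thread (hypothetical model)."""
    name: str
    criticality: float  # assumed: share of total run time that shrinks with this segment (0..1)
    speedup: float      # assumed: expected large-core speedup over a small core (>= 1)


def utility(seg: Segment) -> float:
    """Assumed utility: criticality times the fraction of segment time saved."""
    if seg.speedup < 1.0:
        raise ValueError("speedup must be at least 1")
    return seg.criticality * (1.0 - 1.0 / seg.speedup)


def pick_for_large_cores(segments: list[Segment], num_large_cores: int) -> list[str]:
    """Return the names of the highest-utility segments, one per large core."""
    ranked = sorted(segments, key=utility, reverse=True)
    return [s.name for s in ranked[:num_large_cores]]


candidates = [
    Segment("lock_A", criticality=0.5, speedup=2.0),     # fine-grained bottleneck
    Segment("thread_B", criticality=0.8, speedup=1.25),  # coarse-grained lagging thread
    Segment("barrier_C", criticality=0.1, speedup=4.0),
]
chosen = pick_for_large_cores(candidates, num_large_cores=2)
```

Under these assumed inputs, a highly critical thread that gains little from a large core can rank below a less critical bottleneck that speeds up substantially, which is why the metric combines both factors rather than using criticality alone.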