Accelerating decoupled look-ahead via weak dependence removal: A metaheuristic approach

Raj Parihar, Michael C. Huang
DOI: 10.1109/HPCA.2014.6835974
Published in: 2014 IEEE 20th International Symposium on High Performance Computer Architecture (HPCA)
Publication date: 2014-06-19
Cited by: 7

Abstract

Despite the proliferation of multi-core and multi-threaded architectures, exploiting implicit parallelism for a single semantic thread is still a crucial component in achieving high performance. Look-ahead is a tried-and-true strategy in uncovering implicit parallelism, but a conventional, monolithic out-of-order core quickly becomes resource-inefficient when looking beyond a small distance. A more decoupled approach with an independent, dedicated look-ahead thread on a separate thread context can be a more flexible and effective implementation, especially in a multi-core environment. While capable of generating significant performance gains, the look-ahead agent often becomes the new speed limit. Fortunately, the look-ahead thread has no hard correctness constraints and presents new opportunities for optimizations. One such opportunity is to exploit “weak” dependences. Intuitively, not all dependences are equal. Some links in a dependence chain are weak enough that removing them in the look-ahead thread does not materially affect the quality of look-ahead but improves the speed. While there are some common patterns of weak dependences, they cannot be generalized as heuristics for generating better code for the look-ahead thread. A primary reason is that removing a false weak dependence can be exceedingly costly. Nevertheless, a trial-and-error approach can reliably identify opportunities for improving the look-ahead thread and quantify the benefits. A framework based on a genetic algorithm can help search for the right set of changes to the look-ahead thread. In the set of applications where the speed of look-ahead has become the new limit, this method is found to improve the overall system performance by up to 1.48x with a geometric mean of 1.14x over the baseline decoupled look-ahead system, while reducing energy consumption by 11%.
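The abstract's search strategy — encoding each candidate set of instruction removals and evolving toward the set that speeds up the look-ahead thread without degrading look-ahead quality — can be sketched as a simple genetic algorithm over bitmasks. The sketch below is illustrative only: the per-instruction profile, the fitness function, and all parameter values are invented stand-ins, not the paper's actual framework (in the paper, fitness would come from trial runs of the modified look-ahead thread on real hardware or a simulator).

```python
import random

random.seed(42)  # deterministic for reproducibility

# Hypothetical profile for a small look-ahead slice: each entry is
# (cycles_saved_if_removed, lookahead_quality_loss). These numbers are
# made up for illustration; entries with large loss model "false" weak
# dependences whose removal is exceedingly costly.
PROFILE = [(3, 0.01), (1, 0.00), (5, 0.40), (2, 0.02),
           (4, 0.35), (1, 0.01), (6, 0.03), (2, 0.50)]

def fitness(mask):
    """Score a candidate removal set: reward cycles saved, heavily
    penalize removals that materially hurt look-ahead quality."""
    saved = sum(p[0] for bit, p in zip(mask, PROFILE) if bit)
    loss = sum(p[1] for bit, p in zip(mask, PROFILE) if bit)
    return saved - 50.0 * max(0.0, loss - 0.05)  # small loss is tolerated

def evolve(pop_size=20, generations=40, mut_rate=0.1):
    """Elitist GA: keep the best half each generation, refill with
    one-point-crossover children subject to bit-flip mutation."""
    n = len(PROFILE)
    pop = [[random.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, n)         # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < mut_rate) for bit in child]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print("removal mask:", best, "fitness:", fitness(best))
```

Because survivors are carried over unmodified, the best candidate found so far is never lost, mirroring the trial-and-error principle in the abstract: a removal is kept only if, on balance, it speeds up look-ahead more than it degrades it.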