大问题微体系结构的两条缓存线预测

Shu-Lin Hwang, F. Lai
{"title":"大问题微体系结构的两条缓存线预测","authors":"Shu-Lin Hwang, F. Lai","doi":"10.1109/ACAC.2001.903361","DOIUrl":null,"url":null,"abstract":"Modern micro-architectures employ superscalar techniques to enhance system performance. The superscalar microprocessors must fetch at least one instruction cache line at a time to support high issue rate and large amount speculative executions. In this paper, we propose the Grouped Branch Prediction (GBP) that can recognize and predict multiple branches in the same instruction cache line for a wide-issue micro-architecture. Several configurations of the GBP with different group sizes are simulated. The simulation results show that the branch penalty of the group size 4 with 2048-entry is under 0.65 clock cycle. In our design, we choose the two-group scheme with group size 4. This feature achieves an average of 4.9 IPC f (the number of instructions fetched per cycle for a machine front-end). Furthermore, we extend the GBP to achieve two cache lines predictions with two fetch units. The scheme of the 2048-entry 2-group with group size 4 can produce an average of 8.4 IPC f. The performance is approximately 66.5% better than the original 2-group GBPs. The added hardware cost (41.5 k bits) is less than 40%.","PeriodicalId":230403,"journal":{"name":"Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Two cache lines prediction for a wide-issue micro-architecture\",\"authors\":\"Shu-Lin Hwang, F. Lai\",\"doi\":\"10.1109/ACAC.2001.903361\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern micro-architectures employ superscalar techniques to enhance system performance. The superscalar microprocessors must fetch at least one instruction cache line at a time to support high issue rate and large amount speculative executions. In this paper, we propose the Grouped Branch Prediction (GBP) that can recognize and predict multiple branches in the same instruction cache line for a wide-issue micro-architecture. Several configurations of the GBP with different group sizes are simulated. The simulation results show that the branch penalty of the group size 4 with 2048-entry is under 0.65 clock cycle. In our design, we choose the two-group scheme with group size 4. This feature achieves an average of 4.9 IPC f (the number of instructions fetched per cycle for a machine front-end). Furthermore, we extend the GBP to achieve two cache lines predictions with two fetch units. The scheme of the 2048-entry 2-group with group size 4 can produce an average of 8.4 IPC f. The performance is approximately 66.5% better than the original 2-group GBPs. The added hardware cost (41.5 k bits) is less than 40%.\",\"PeriodicalId\":230403,\"journal\":{\"name\":\"Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-01-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ACAC.2001.903361\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 6th Australasian Computer Systems Architecture Conference. ACSAC 2001","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ACAC.2001.903361","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

现代微架构采用超标量技术来提高系统性能。超标量微处理器必须一次至少获取一条指令缓存行,以支持高发放率和大量推测执行。在本文中,我们提出了分组分支预测(GBP),它可以识别和预测同一指令缓存线中的多个分支。模拟了几种不同群大小的GBP结构。仿真结果表明,当分组规模为4且分组个数为2048时,分支惩罚小于0.65时钟周期。在我们的设计中,我们选择两组方案,组大小为4。该特性平均实现4.9 IPC f(机器前端每个周期获取的指令数)。此外,我们扩展了GBP,以实现两个获取单元的两个缓存线预测。该方案包含2048个分组,分组大小为4,平均可以产生8.4 IPC f,性能比原来的2组GBPs提高约66.5%。增加的硬件成本(41.5 k比特)不到40%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Two cache lines prediction for a wide-issue micro-architecture
Modern micro-architectures employ superscalar techniques to enhance system performance. The superscalar microprocessors must fetch at least one instruction cache line at a time to support high issue rate and large amount speculative executions. In this paper, we propose the Grouped Branch Prediction (GBP) that can recognize and predict multiple branches in the same instruction cache line for a wide-issue micro-architecture. Several configurations of the GBP with different group sizes are simulated. The simulation results show that the branch penalty of the group size 4 with 2048-entry is under 0.65 clock cycle. In our design, we choose the two-group scheme with group size 4. This feature achieves an average of 4.9 IPC f (the number of instructions fetched per cycle for a machine front-end). Furthermore, we extend the GBP to achieve two cache lines predictions with two fetch units. The scheme of the 2048-entry 2-group with group size 4 can produce an average of 8.4 IPC f. The performance is approximately 66.5% better than the original 2-group GBPs. The added hardware cost (41.5 k bits) is less than 40%.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
The SawMill framework for virtual memory diversity Stacking them up: a comparison of virtual machines Adaptive interfacing with reconfigurable computers DStride: data-cache miss-address-based stride prefetching scheme for multimedia processors Performance evaluation of a partial retraining scheme for defective multi-layer neural networks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1