An explicitly declared delayed-branch mechanism for a superscalar architecture

Roger Collins, Gordon Steven
{"title":"An explicitly declared delayed-branch mechanism for a superscalar architecture","authors":"Roger Collins,&nbsp;Gordon Steven","doi":"10.1016/0165-6074(94)90016-7","DOIUrl":null,"url":null,"abstract":"<div><p>One of the main obstacles to exploiting the fine-grained parallelism that is available in general-purpose code is the frequency of branches that cause unpredictable changes in the control flow of a program at run-time. Whenever a branch is taken, a performance penalty may be incurred as the processor waits for instructions to be fetched from the branch target stream. RISC processors introduce a delayed-branch mechanism which defines branch delay slots into which code can be scheduled. This strategy allows the processor to be kept busy executing useful instructions while the change of control flow takes place. While the concept of delayed branches can be readily extended to VLIW architectures, it is less clear how it should be incorporated in a superscalar architecture. This paper proposes a general branch-delay mechanism which is suitable for a range of code-compatible superscalar processors and which completely avoids the need to introduce NOPs into the code. This technique was developed as an integral part of the HSP superscalar project. HSP is a superscalar architecture currently being researched at the University of Hertfordshire with the aim of using compile-time instruction scheduling to achieve an order of magnitude speed-up over traditional RISC architectures for a suite of non-numeric benchmark programs.</p></div>","PeriodicalId":100927,"journal":{"name":"Microprocessing and Microprogramming","volume":"40 10","pages":"Pages 677-680"},"PeriodicalIF":0.0000,"publicationDate":"1994-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1016/0165-6074(94)90016-7","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Microprocessing and Microprogramming","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/0165607494900167","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

One of the main obstacles to exploiting the fine-grained parallelism that is available in general-purpose code is the frequency of branches that cause unpredictable changes in the control flow of a program at run-time. Whenever a branch is taken, a performance penalty may be incurred as the processor waits for instructions to be fetched from the branch target stream. RISC processors introduce a delayed-branch mechanism which defines branch delay slots into which code can be scheduled. This strategy allows the processor to be kept busy executing useful instructions while the change of control flow takes place. While the concept of delayed branches can be readily extended to VLIW architectures, it is less clear how it should be incorporated in a superscalar architecture. This paper proposes a general branch-delay mechanism which is suitable for a range of code-compatible superscalar processors and which completely avoids the need to introduce NOPs into the code. This technique was developed as an integral part of the HSP superscalar project. HSP is a superscalar architecture currently being researched at the University of Hertfordshire with the aim of using compile-time instruction scheduling to achieve an order of magnitude speed-up over traditional RISC architectures for a suite of non-numeric benchmark programs.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
超标量体系结构的显式声明延迟分支机制
利用通用代码中可用的细粒度并行性的主要障碍之一是分支的频率,这些分支在运行时会导致程序的控制流发生不可预测的变化。无论何时执行分支,处理器都可能在等待从分支目标流中提取指令时导致性能损失。RISC处理器引入了一种延迟分支机制,该机制定义了可以调度代码的分支延迟时隙。这种策略允许处理器在控制流发生变化时忙于执行有用的指令。虽然延迟分支的概念可以很容易地扩展到VLIW体系结构,但它应该如何融入超标量体系结构还不太清楚。本文提出了一种通用的分支延迟机制,该机制适用于一系列代码兼容的超标量处理器,并且完全避免了在代码中引入NOP的需要。这项技术是作为HSP超标量项目的一个组成部分开发的。HSP是赫特福德大学目前正在研究的一种超标量体系结构,其目的是使用编译时指令调度,为一套非数字基准程序实现比传统RISC体系结构高一个数量级的速度。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Mixing floating- and fixed-point formats for neural network learning on neuroprocessors Subject index to volume 41 (1995/1996) A graphical simulator for programmable logic controllers based on Petri nets A neural network-based replacement strategy for high performance computer architectures Modelling and performance assessment of large ATM switching networks on loosely-coupled parallel processors
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1