Forward semantic: a compiler-assisted instruction fetch method for heavily pipelined processors

MICRO 22. Pub Date: 1989-08-01. DOI: 10.1145/75362.75418
P. Chang, Wen-mei W. Hwu
Citations: 9

Abstract

A new instruction fetch method, forward semantic, is proposed to enable deeply pipelined processors to fetch one useful instruction every cycle. Forward semantic is an improved alternative to delayed branching (with or without squashing), with five major advantages. First, no restriction is imposed on the type of instructions filling the branch slots, which allows a large number of slots to be filled. Second, no modification to the offsets and displacements is necessary when an instruction is copied to fill a branch slot, which simplifies the linker implementation. Third, an interrupted program can resume execution with a single program counter, eliminating the need to reload the instruction pipeline before resuming execution. Fourth, programs compiled with N slots can execute on pipelines requiring K (K ≤ N) slots, which makes compatibility across an architecture family possible. Lastly, the filling of branch slots is totally transparent to code compaction and software interlocking schemes. These advantages combine to provide an efficient instruction fetch mechanism and to eliminate artificial penalties on branch cost. At the cost of 11% static code expansion, forward semantic achieves an instruction fetch cost of 1.2 cycles for pipelines requiring 10 slots for each taken branch. This level of instruction fetch efficiency has never been achieved before with conventional instruction fetch methods. The branch cost is dictated by the accuracy of the compile-time branch prediction rather than by artificial limitations, such as data dependencies, which prevent the slots from being filled. These results are measured from the execution of real UNIX and CAD programs with complex control structures.
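The claim that branch cost depends only on compile-time prediction accuracy can be illustrated with a back-of-the-envelope model. The sketch below is not taken from the paper: it assumes every slot is filled, a correctly predicted branch costs nothing extra, and each mispredicted branch wastes all of its slots. The function name `fetch_cost` and the input values are hypothetical, chosen only for illustration and not measured results.

```python
def fetch_cost(branch_fraction: float, mispredict_rate: float, slots: int) -> float:
    """Average cycles needed to fetch one useful instruction.

    Simplified model (an assumption, not the paper's methodology):
    - every instruction costs one fetch cycle;
    - a correctly predicted branch adds no penalty, because its slots
      hold useful instructions;
    - a mispredicted branch wastes all `slots` fetch cycles.
    """
    if not 0.0 <= branch_fraction <= 1.0:
        raise ValueError("branch_fraction must be in [0, 1]")
    if not 0.0 <= mispredict_rate <= 1.0:
        raise ValueError("mispredict_rate must be in [0, 1]")
    if slots < 0:
        raise ValueError("slots must be non-negative")
    return 1.0 + branch_fraction * mispredict_rate * slots


if __name__ == "__main__":
    # Hypothetical inputs: 20% of instructions are branches, 10% of them
    # are mispredicted at compile time, and the pipeline needs 10 slots.
    cost = fetch_cost(branch_fraction=0.2, mispredict_rate=0.1, slots=10)
    print(f"approximate fetch cost: {cost:.2f} cycles per useful instruction")
```

In this model the penalty term grows linearly with the number of slots, so for a 10-slot pipeline the prediction accuracy dominates the result. Data dependencies do not appear in the formula, which matches the abstract's point that they no longer limit slot filling.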