Forward semantic: a compiler-assisted instruction fetch method for heavily pipelined processors

MICRO 22 Pub Date : 1989-08-01 DOI:10.1145/75362.75418

P. Chang, Wen-mei W. Hwu


Citations: 9

Abstract

A new instruction fetch method, forward semantic, is offered to enable deeply pipelined processors to fetch one useful instruction every cycle. Forward semantic is an improved alternative to delayed branching (with or without squashing), with five major advantages. First, no restriction is imposed on the type of instructions filling the branch slots, which allows a large number of slots to be filled. Second, no modification to the offsets and displacements is necessary when an instruction is copied to fill a branch slot, which simplifies the linker implementation. Third, an interrupted program can resume execution with a single program counter, eliminating the need to reload the instruction pipeline before resuming execution. Fourth, programs compiled with N slots can execute on pipelines requiring K (K ≤ N) slots, which makes family architecture compatibility possible. Lastly, the filling of branch slots is totally transparent to code compaction and software interlocking schemes. These advantages combine to provide an efficient instruction fetch mechanism and to eliminate artificial penalties on branch cost. At the cost of 11% static code expansion, forward semantic achieves an instruction fetch cost of 1.2 cycles for pipelines requiring 10 slots for each taken branch. This level of instruction fetch efficiency has never been achieved before with conventional instruction fetch methods. The branch cost is dictated by the accuracy of the compile-time branch prediction rather than by artificial limitations, such as data dependencies, which prevent the slots from being filled. These results are measured from the execution of real UNIX and CAD programs with complex control structures.
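The claim that branch cost depends only on compile-time prediction accuracy can be made concrete with a back-of-the-envelope model. The sketch below is not taken from the paper: it assumes, for illustration only, that every correctly predicted branch costs nothing extra and that every mispredicted branch wastes all N slots. The function name, its parameters, and the sample values (20% branches, 90% accuracy) are hypothetical and were picked only because they land near the 1.2-cycle figure quoted in the abstract for N = 10.

```python
def fetch_cost_per_instruction(branch_fraction, prediction_accuracy, slots):
    """Average fetch cycles per useful instruction under a simplified model.

    Assumptions (illustrative, not the paper's measured model):
      - a correctly predicted branch adds no extra fetch cycles;
      - a mispredicted branch wastes exactly `slots` fetch cycles.
    """
    if not 0.0 <= branch_fraction <= 1.0:
        raise ValueError("branch_fraction must be in [0, 1]")
    if not 0.0 <= prediction_accuracy <= 1.0:
        raise ValueError("prediction_accuracy must be in [0, 1]")
    if slots < 0:
        raise ValueError("slots must be non-negative")

    # Expected number of wasted cycles contributed by each instruction.
    mispredict_penalty = branch_fraction * (1.0 - prediction_accuracy) * slots
    return 1.0 + mispredict_penalty


if __name__ == "__main__":
    # Hypothetical workload: 20% of instructions are branches,
    # 90% of them predicted correctly at compile time, 10 branch slots.
    cost = fetch_cost_per_instruction(0.2, 0.9, 10)
    print(f"approximate fetch cost: {cost:.2f} cycles per instruction")
```

Under this model the slot count N only multiplies the misprediction term, so a deeper pipeline raises the cost of each wrong prediction but adds nothing when the prediction is right, which is consistent with the abstract's statement that prediction accuracy, not slot-filling restrictions, sets the branch cost.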


