Micro-21 from the program chair

Wen-mei W. Hwu
ACM Sigmicro Newsletter, journal article, published 1989-03-01. DOI: 10.1145/378818.378848 (https://doi.org/10.1145/378818.378848)

Abstract

The instruction queue is a critical component of the proposed microarchitecture, where executable instructions are detected and delivered to the execution unit. This paper clarifies the issue of loading instructions into the instruction queue and evaluates the performance of the different loading schemes.

… paths are identified in complicated UNIX programs so that trace scheduling can be applied effectively. Experimental results are provided for ten UNIX system and CAD programs, all of which exhibit complicated control structure. This is the first paper to address the issue of applying trace scheduling to complicated programs. The work is critical to adapting trace scheduling to RISCs and other upcoming pipelined, parallel microarchitectures.

The CMOS 370 keeps part of its Control Store on chip and the rest off chip. A small on-chip Control Store holds the first two microwords of each microsequence (the targets of conditional branches). A close look reveals that the two-level Control Store structure can be viewed as a programmer-managed target instruction buffer. This structure makes it possible to access one microinstruction per cycle from a large (mostly off-chip) Control Store while still achieving a short cycle time.

Efficient trapping is proposed to support efficient instruction emulation in processors with hardwired control. This makes instruction set design relatively independent of the implementation (hardwired or microprogrammed).

• "Multiple Instruction Issue and Single-Chip Processors," A. Pleszkun and G. Sohi, U. of Wisconsin-Madison. Sometimes issuing multiple instructions is not a win. It would be interesting to measure the effect of compilation support (trace scheduling, register allocation, etc.) on the instruction issue rate. Comparing the results presented in this paper with those presented by the VLIW team, compilation support seems to be critical for issuing multiple instructions per cycle.

The paper discusses the dilemma caused by the interdependence between data routing and code scheduling in ASIC code generation. This issue corresponds closely to the one between code scheduling and register allocation for pipelined and/or wide-instruction architectures. The trend is to consider both factors together during code generation. Dynamic reconfigurability is a very interesting feature of the proposed ASIC paradigm. However, the slow prototype makes one wonder whether a simple microprocessor could be programmed to achieve the same performance for the target applications.

• "Implementing a Prolog Machine with Multiple Functional Units," A. Singhal and Y. Patt, U. C. Berkeley. Parallel unification and execution result in a factor-of-4 speedup over the Berkeley PLM. …