Improving the operation autonomy of SIMD processing elements by using guarded instructions and pseudo branches

M. L. Anido, A. Paar, N. Bagherzadeh
{"title":"Improving the operation autonomy of SIMD processing elements by using guarded instructions and pseudo branches","authors":"M. L. Anido, A. Paar, N. Bagherzadeh","doi":"10.1109/DSD.2002.1115363","DOIUrl":null,"url":null,"abstract":"This paper presents a novel method for improving the operation autonomy of the processing elements (PE) of SIMD-like machines. By combining guarded instructions and pseudo branches it is possible to achieve higher operation autonomy and higher instruction level parallelism than in previous SIMD/ASIMD architectures. The paper shows that it is feasible to avoid most branches and it is also possible to emulate conditional execution on the processing elements, either by using guarded instructions or by using pseudo branches, thus avoiding unnecessary intervention by the array control unit in data-dependant computations. Pseudo branches are used when it is not possible to use guarded instructions. Additionally, they also support the implementation of complex nested if-then-else constructs, improving the execution of irregular dataparallel applications. The paper also shows that the simplicity of the method allows it to be implemented both in fine-grain and coarse-grain SIMD/ASIMD architectures because it does not require significant additional silicon area. Finally, it is shown that pseudo branches can be used to control the power saving of those processing elements that have instructions nullified.","PeriodicalId":330609,"journal":{"name":"Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools","volume":"103 4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2002-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSD.2002.1115363","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 26

Abstract

This paper presents a novel method for improving the operation autonomy of the processing elements (PE) of SIMD-like machines. By combining guarded instructions and pseudo branches it is possible to achieve higher operation autonomy and higher instruction level parallelism than in previous SIMD/ASIMD architectures. The paper shows that it is feasible to avoid most branches and it is also possible to emulate conditional execution on the processing elements, either by using guarded instructions or by using pseudo branches, thus avoiding unnecessary intervention by the array control unit in data-dependant computations. Pseudo branches are used when it is not possible to use guarded instructions. Additionally, they also support the implementation of complex nested if-then-else constructs, improving the execution of irregular dataparallel applications. The paper also shows that the simplicity of the method allows it to be implemented both in fine-grain and coarse-grain SIMD/ASIMD architectures because it does not require significant additional silicon area. Finally, it is shown that pseudo branches can be used to control the power saving of those processing elements that have instructions nullified.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
采用保护指令和伪分支,提高了SIMD处理单元的操作自主性
提出了一种提高类simd机械加工单元操作自主性的新方法。通过结合保护指令和伪分支,可以实现比以前的SIMD/ASIMD体系结构更高的操作自主性和更高的指令级并行性。本文表明,避免大多数分支是可行的,也可以通过使用保护指令或使用伪分支在处理单元上模拟条件执行,从而避免阵列控制单元对数据依赖计算的不必要干预。当不可能使用受保护指令时,使用伪分支。此外,它们还支持复杂的嵌套if-then-else结构的实现,从而改进不规则数据并行应用程序的执行。本文还表明,该方法的简单性使得它可以在细颗粒和粗颗粒SIMD/ASIMD体系结构中实现,因为它不需要显着的额外硅面积。最后,证明了伪分支可以用来控制那些指令无效的处理元素的省电。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Fault latencies of concurrent checking FSMs On the fundamental design gap in terabit per second packet switching Bit-level allocation of multiple-precision specifications Improving mW/MHz ratio in FPGAs pipelined designs Hardware implementation of a memory allocator
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1