减法和乘法除法/平方根实现的面积/性能比较

Peter Soderquist, M. Leeser
{"title":"减法和乘法除法/平方根实现的面积/性能比较","authors":"Peter Soderquist, M. Leeser","doi":"10.1109/ARITH.1995.465366","DOIUrl":null,"url":null,"abstract":"The implementations of division and square root in the FPU's of current microprocessors are based on one of two categories of algorithms. Multiplicative techniques, exemplified by the Newton-Raphson method and Goldschmidt's algorithm, share functionality with the floating-point multiplier. Subtractive methods, such as the many variations of radix-4 SRT, generally use dedicated, parallel hardware. These different approaches give rise to the distinct area and performance characteristics which are explored in this paper. Area comparisons are derived from measurements of commercial and academic hardware implementations. Representative divide/square root implementations are paired with typical add-multiply structures and simulated, using data from current microprocessor and arithmetic coprocessor designs, to obtain performance estimates. The results suggest that subtractive implementations offer a superior balance of area and performance, and stand to benefit most decisively from improvements in technology and growing transistor budgets due to their parallel operation. Multiplicative methods lend themselves best to situations where hardware re-use is mandated due to area or architectural constraints.<<ETX>>","PeriodicalId":332829,"journal":{"name":"Proceedings of the 12th Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"1995-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":"{\"title\":\"An area/performance comparison of subtractive and multiplicative divide/square root implementations\",\"authors\":\"Peter Soderquist, M. Leeser\",\"doi\":\"10.1109/ARITH.1995.465366\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The implementations of division and square root in the FPU's of current microprocessors are based on one of two categories of algorithms. Multiplicative techniques, exemplified by the Newton-Raphson method and Goldschmidt's algorithm, share functionality with the floating-point multiplier. Subtractive methods, such as the many variations of radix-4 SRT, generally use dedicated, parallel hardware. These different approaches give rise to the distinct area and performance characteristics which are explored in this paper. Area comparisons are derived from measurements of commercial and academic hardware implementations. Representative divide/square root implementations are paired with typical add-multiply structures and simulated, using data from current microprocessor and arithmetic coprocessor designs, to obtain performance estimates. The results suggest that subtractive implementations offer a superior balance of area and performance, and stand to benefit most decisively from improvements in technology and growing transistor budgets due to their parallel operation. Multiplicative methods lend themselves best to situations where hardware re-use is mandated due to area or architectural constraints.<<ETX>>\",\"PeriodicalId\":332829,\"journal\":{\"name\":\"Proceedings of the 12th Symposium on Computer Arithmetic\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1995-07-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 12th Symposium on Computer Arithmetic\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ARITH.1995.465366\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 12th Symposium on Computer Arithmetic","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARITH.1995.465366","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

摘要

在当前微处理器的FPU中,除法和平方根的实现基于两类算法中的一种。以Newton-Raphson方法和Goldschmidt算法为例的乘法技术与浮点乘法器共享功能。减法方法,如基数-4 SRT的许多变体,通常使用专用的并行硬件。这些不同的方法产生了不同的区域和性能特征,本文将对此进行探讨。面积比较来自商业和学术硬件实现的测量。代表性的除法/平方根实现与典型的加乘结构配对,并使用当前微处理器和算术协处理器设计的数据进行模拟,以获得性能估计。结果表明,减法实现提供了面积和性能的卓越平衡,并且由于其并行操作而从技术改进和晶体管预算增长中获益最多。乘法方法最适合由于区域或体系结构限制而强制要求硬件重用的情况。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
An area/performance comparison of subtractive and multiplicative divide/square root implementations
The implementations of division and square root in the FPU's of current microprocessors are based on one of two categories of algorithms. Multiplicative techniques, exemplified by the Newton-Raphson method and Goldschmidt's algorithm, share functionality with the floating-point multiplier. Subtractive methods, such as the many variations of radix-4 SRT, generally use dedicated, parallel hardware. These different approaches give rise to the distinct area and performance characteristics which are explored in this paper. Area comparisons are derived from measurements of commercial and academic hardware implementations. Representative divide/square root implementations are paired with typical add-multiply structures and simulated, using data from current microprocessor and arithmetic coprocessor designs, to obtain performance estimates. The results suggest that subtractive implementations offer a superior balance of area and performance, and stand to benefit most decisively from improvements in technology and growing transistor budgets due to their parallel operation. Multiplicative methods lend themselves best to situations where hardware re-use is mandated due to area or architectural constraints.<>
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Reducing the number of counters needed for integer multiplication Cascaded implementation of an iterative inverse-square-root algorithm, with overflow lookahead Application of fast layout synthesis environment to dividers evaluation Design strategies for optimal multiplier circuits Semi-logarithmic number systems
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1