A digit-set-interleaved radix-8 division/square root kernel for double-precision floating point

2010 International Symposium on System on Chip Pub Date : 2010-11-09 DOI:10.1109/ISSOC.2010.5625547

I. Rust, T. Noll

Citations: 8

Abstract

A common and very efficient approach to division and square root is the subtractive SRT algorithm combined with a redundant partial remainder representation such as carry-save. A recently proposed modification of the SRT algorithm for division reduces the number of comparators inside the Quotient Digit Selection Function (QDSF) to the number necessary in a non-redundant implementation and derives partial remainders directly from the comparison results calculated inside the QDSF. This paper shows that the modified approach also applies efficiently to square root operations. A combined radix-8 division and square root kernel for double-precision floating point was synthesized using a 40-nm general-purpose cell library. The implementation has a critical path of only 20.8 fanout-4 inverter delays under worst-case conditions, which is comparable to the 20.0 inverter delays published for a high-speed radix-4 SRT implementation. Furthermore, the proposed algorithm reduces the total area compared to equivalent SRT-based implementations.
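To make the comparator idea concrete, the following is a minimal software sketch of a textbook non-redundant radix-8 digit recurrence for division. It is not the digit-set-interleaved kernel of the paper and it uses no carry-save representation; it only illustrates how seven comparisons of the shifted remainder against multiples of the divisor can both select the quotient digit and supply the next partial remainder. The function names `radix8_divide` and `digits_to_value`, the digit set {0, ..., 7}, and the use of exact rationals are assumptions made for illustration.

```python
from fractions import Fraction

RADIX = 8  # three quotient bits are retired per iteration


def radix8_divide(x, d, digits):
    """Illustrative non-redundant radix-8 recurrence for 0 <= x < d.

    Returns the list of quotient digits (each in 0..7) and the final
    partial remainder w, so that x == d * Q + w / 8**digits.
    """
    x, d = Fraction(x), Fraction(d)
    if not (0 <= x < d):
        raise ValueError("this sketch expects 0 <= x < d")
    w = x
    q_digits = []
    for _ in range(digits):
        shifted = RADIX * w
        # candidates[k] = 8*w - k*d. The sign of each difference plays the
        # role of one comparator output (k = 1..7); candidates[0] is the
        # shifted remainder itself and is never negative.
        candidates = [shifted - k * d for k in range(RADIX)]
        # Select the largest digit whose difference is still non-negative.
        q = max(k for k, c in enumerate(candidates) if c >= 0)
        q_digits.append(q)
        # The next partial remainder is taken directly from the
        # comparison result instead of being recomputed.
        w = candidates[q]
    return q_digits, w


def digits_to_value(q_digits):
    """Convert radix-8 fractional digits to an exact rational value."""
    return sum(Fraction(q, RADIX ** (i + 1)) for i, q in enumerate(q_digits))
```

For example, `radix8_divide(1, 3, 4)` yields the octal digits of 1/3 together with a remainder that satisfies the invariant stated in the docstring. A hardware kernel would evaluate the seven differences in parallel and would typically keep the remainder in a redundant form, which this sequential model does not attempt to capture.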


