Proceedings of the 12th Symposium on Computer Arithmetic: Latest Papers

167 MHz radix-8 divide and square root using overlapped radix-2 stages

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465363

J. Prabhu, G. Zyner

UltraSPARC's IEEE-754 compliant floating point divide and square root implementation is presented. Three overlapping stages of SRT radix-2 quotient selection logic enable an effective radix-8 calculation at 167 MHz while only a single radix-2 quotient selection logic delay is seen in the critical path. Speculative partial remainder and quotient calculation in the main datapath also improves cycle time. The quotient selection logic is slightly modified to prevent the formation of a negative partial remainder for exact results. This saves latency and hardware as the partial remainder no longer needs to be restored before calculating the sticky bit for rounding.
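As a rough illustration of the radix-2 SRT recurrence that the abstract builds on, the following sketch performs textbook SRT division with quotient digits in {-1, 0, 1}. It is not the UltraSPARC design: the function name `srt_radix2_divide`, the exact-rational arithmetic, and the simple comparison against ±1/2 are all assumptions made for clarity, and the overlapped three-stage selection and the modified selection for exact results are not modelled.

```python
from fractions import Fraction

def srt_radix2_divide(x, d, n):
    """Textbook radix-2 SRT division (illustrative, hypothetical helper).

    Requires 1/2 <= d < 1 and |x| <= d. Returns the signed digits,
    the accumulated quotient Q, and the final partial remainder w,
    which satisfy x = d*Q + w / 2**n exactly.
    """
    x, d = Fraction(x), Fraction(d)
    half = Fraction(1, 2)
    assert half <= d < 1 and abs(x) <= d
    w = x                      # partial remainder
    digits = []
    for _ in range(n):
        shifted = 2 * w
        # Quotient digit selection from the shifted partial remainder.
        if shifted >= half:
            q = 1
        elif shifted < -half:
            q = -1
        else:
            q = 0
        w = shifted - q * d    # recurrence: w <- 2w - q*d
        digits.append(q)
    Q = sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
    return digits, Q, w
```

Because the selection only needs a coarse comparison, hardware can evaluate it on a few leading bits of a redundant remainder. Retiring three such digits per cycle is what gives an effective radix of 8.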

Citations: 45

A GaAs IEEE floating point standard single precision multiplier

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465372

S. Cui, N. Burgess, M. Liebelt, K. Eshraghian

This paper presents a GaAs IEEE floating point standard single precision multiplier. A modified carry save array is used in conjunction with Booth's algorithm to reduce the partial product addition and interconnection. A special rounding technique called Trailing-1's Predictor is used to speed up the final addition and rounding. The combination of the fast arithmetic architecture and compact layout style achieves 4 ns multiplication time with 3.5 W power dissipation at 75 °C, giving 14 mW/MHz. The multiplier occupies 2.43 mm by 3.77 mm (excluding pads) and uses 28,000 transistors, giving a density of 3056 transistors/mm² in 0.8-µm GaAs technology.
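To make the role of Booth's algorithm concrete, here is a minimal sketch of radix-4 (modified) Booth recoding, which halves the number of partial products fed to the adder array. The abstract does not state which Booth variant the chip uses, so the radix-4 choice and the helper names `booth_radix4_digits` and `booth_multiply` are assumptions for illustration only.

```python
def booth_radix4_digits(y, bits):
    """Recode a two's-complement integer y of even width `bits`
    into radix-4 digits in {-2, -1, 0, 1, 2}, least significant first."""
    assert bits % 2 == 0
    assert -(1 << (bits - 1)) <= y < (1 << (bits - 1))
    u = y & ((1 << bits) - 1)      # two's-complement bit pattern
    digits = []
    prev = 0                       # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits

def booth_multiply(x, y, bits):
    """Multiply by summing one partial product per radix-4 digit."""
    return sum(d * x * 4 ** i
               for i, d in enumerate(booth_radix4_digits(y, bits)))
```

An 8-bit multiplier operand produces four partial products here instead of eight, which is the reduction in addition and interconnection that the abstract refers to.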

Citations: 9

Sign detection and comparison networks with a small number of transitions

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465376

M. Ercegovac, T. Lang

We present an approach to reducing the average number of signal transitions (T_av) in the design of sign-detection and magnitude-comparison networks. Our approach reduces T_av from 21n/8 (where n is the operand precision in bits) to 4.5 in the case of iterative implementation, and from about n to roughly k + n/2^(k-1) in the tree network implemented with k-bit modules. We also discuss comparison of small numbers. The approach is applicable to other arithmetic problems.

Citations: 12

Exact computation of a sum or difference with applications to argument reduction

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465355

W. Ferguson

Results are presented that identify when the computed value of a sum or difference is exact. The accuracy of an argument reduction algorithm is analyzed using these results. This analysis demonstrates that catastrophic cancellation does not occur in this algorithm's computation of the reduced argument.
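A classical result of this kind is Sterbenz's lemma: if two floating point numbers satisfy y/2 <= x <= 2y, then x - y is computed without rounding error. The sketch below checks this numerically for IEEE double precision. It illustrates the general idea only and is not claimed to be the specific result proved in the paper; the helper name `difference_is_exact` is hypothetical.

```python
from fractions import Fraction
import random

def difference_is_exact(x, y):
    """True if the floating point result of x - y equals the exact difference."""
    return Fraction(x) - Fraction(y) == Fraction(x - y)

def check_sterbenz(trials=10000, seed=0):
    """Sample pairs with y/2 <= x <= 2y and confirm every difference is exact."""
    rng = random.Random(seed)
    for _ in range(trials):
        y = rng.uniform(1.0, 1000.0)
        x = rng.uniform(y / 2, 2 * y)
        if y / 2 <= x <= 2 * y and not difference_is_exact(x, y):
            return False
    return True
```

Outside that range exactness is not guaranteed: for example, `difference_is_exact(1.0, 1e-20)` is false, because the small operand is lost entirely in the rounding.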

Citations: 10

A complex-number multiplier using radix-4 digits

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465373

Belle W. Y. Wei, He Du, Honglu Chen

This paper describes the design of a 16×16 complex-number multiplier developed as part of the arithmetic datapath of a complex-number digital signal processor. The complex-number multiplier internally uses binary signed digits for fast multiplication and compact layout. It employs the traditional three-multiplication scheme while minimizing the logic and delay associated with the three extra pre-multiplication binary additions which that scheme requires. The minimization comes from producing the redundant binary sum for each of the pre-multiplication binary additions with minimal hardware, and then recoding the redundant sums as radix-4 multiplier operands. The radix-4 operands halve the number of summands to be added in each of the three real multiplier units. Furthermore, an additional factor of two reduction in the number of summands is effectuated by our coding scheme for representing binary signed digits. The result is a fast and compact complex-number multiplier.
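The three-multiplication scheme mentioned above trades one real multiplication for extra additions. The sketch below shows one common variant that uses exactly three pre-multiplication additions. Whether this is the precise arrangement used in the paper is an assumption, and the function name `complex_mul_3` is hypothetical.

```python
def complex_mul_3(a, b, c, d):
    """Compute (a + bi) * (c + di) with three real multiplications.

    Returns (real, imag). The three pre-additions feed the three
    multipliers; two post-additions form the result.
    """
    s1 = a + b          # pre-addition 1
    s2 = d - c          # pre-addition 2
    s3 = c + d          # pre-addition 3
    k1 = c * s1         # multiplication 1
    k2 = a * s2         # multiplication 2
    k3 = b * s3         # multiplication 3
    return k1 - k3, k1 + k2
```

For example, `complex_mul_3(3, 4, 5, -6)` returns `(39, 2)`, matching (3 + 4i)(5 - 6i) = 39 + 2i. In hardware the pre-additions sit on the critical path before the multipliers, which is why the paper focuses on producing them in redundant form.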

Citations: 24

Cascaded implementation of an iterative inverse-square-root algorithm, with overflow lookahead

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465369

H. Kwan, R.L. Nelson, E. Swartzlander

We present an unconventional method of computing the inverse of the square root. It implements the equivalent of two iterations of a well-known multiplicative method to obtain 24-bit mantissa accuracy. We implement each "iteration" as a separate logic module and exploit knowledge about the relative error during computation to reduce the size of the implementation. We use overflow lookahead logic to facilitate the exponent computations. No division is required in the entire process. Examples and error analysis are given.
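The abstract does not name the multiplicative method. A plausible candidate is the Newton-Raphson iteration y <- y(3 - x y^2)/2, which needs no division and roughly squares the relative error each step. The sketch below applies two such steps to a table seed. The 128-entry table, the input range [1, 4), and the names `SEED_TABLE` and `rsqrt_two_steps` are all assumptions chosen so that two steps reach about 24 bits, not details taken from the paper.

```python
import math

TABLE_BITS = 7  # assumed seed table size: 2**7 entries over [1, 4)

# Seed table: 1/sqrt at the midpoint of each sub-interval of [1, 4).
SEED_TABLE = [
    1.0 / math.sqrt(1.0 + 3.0 * (i + 0.5) / (1 << TABLE_BITS))
    for i in range(1 << TABLE_BITS)
]

def rsqrt_two_steps(x):
    """Approximate 1/sqrt(x) for 1 <= x < 4 with two Newton-Raphson steps."""
    assert 1.0 <= x < 4.0
    index = int((x - 1.0) * (1 << TABLE_BITS) / 3.0)
    index = min(index, len(SEED_TABLE) - 1)   # guard against rounding at the top edge
    y = SEED_TABLE[index]
    for _ in range(2):
        # Multiplications and a subtraction only; the divide by 2 is a shift in hardware.
        y = y * (3.0 - x * y * y) / 2.0
    return y
```

If the seed has relative error e, one step leaves an error of about 1.5 e^2, so a seed accurate to roughly 7 bits is enough for two steps to pass 24 bits. Knowing the error bound after the first step is what allows the second module to be narrower than a general multiplier.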

Citations: 13

Design strategies for optimal multiplier circuits

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465378

C. Martel, V. Oklobdzija, R. Ravi, P. Stelling

We present new design and analysis techniques for the synthesis of fast parallel multiplier circuits. V.G. Oklobdzija, D. Villeger, and S.S. Lui (1995) suggested a new approach, the three dimensional method (TDM), for partial product reduction tree (PPRT) design that produces multipliers which outperform the current best designs. The goal of TDM is to produce a minimum delay PPRT using full adders. This is done by carefully modelling the relationship of the output delays to the input delays in an adder, and then interconnecting the adders in a globally optimal way. Oklobdzija et al. suggested a good heuristic for finding the optimal PPRT, but no proofs about the performance of this heuristic were given. We provide a formal characterization of optimal PPRT circuits and prove a number of properties about them. For the problem of summing a set of input bits within the minimum delay, we present an algorithm that produces a minimum delay circuit in time linear in the size of the inputs. Our techniques allow us to prove tight lower bounds on multiplier circuit delays. These results are combined to create a program which finds optimal TDM multiplier designs.

Citations: 17

Efficient initial approximation and fast converging methods for division and square root

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465383

Masayuki Ito, N. Takagi, S. Yajima

Efficient initial approximations and fast converging algorithms are important to achieve the desired precision faster at lower hardware cost in multiplicative division and square root. In this paper, a new initial approximation method for division, an accelerated higher order converging division algorithm, and a new square root algorithm are proposed. They are all suitable for implementation on an arithmetic unit where one multiply-accumulate operation can be executed in one cycle. In the case of division, the combination of our initial approximation method and our converging algorithm enables a single iteration of the converging algorithm to produce double-precision quotients. Our new square root algorithm can form double-precision square roots faster, using smaller look-up tables, than the Newton-Raphson method.
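For context, a generic higher order converging reciprocal iteration can be sketched as follows. With error e = 1 - d*r, multiplying r by 1 + e + ... + e^(k-1) leaves an error of exactly e^k, and the polynomial can be evaluated with one multiply-accumulate per term. This is the textbook scheme, not the accelerated algorithm proposed in the paper, and the name `refine_reciprocal` is hypothetical.

```python
from fractions import Fraction

def refine_reciprocal(d, r, order):
    """One order-`order` refinement of r as an approximation to 1/d.

    If e = 1 - d*r, the result r' satisfies 1 - d*r' == e**order.
    Exact rationals are used so the identity can be checked exactly.
    """
    d, r = Fraction(d), Fraction(r)
    e = 1 - d * r
    # Horner evaluation of 1 + e + ... + e**(order-1):
    # each step is a single multiply-accumulate.
    poly = Fraction(1)
    for _ in range(order - 1):
        poly = 1 + e * poly
    return r * poly
```

With order 2 this reduces to the Newton-Raphson step r(2 - d*r). A better initial approximation makes e smaller, so fewer terms or iterations are needed to reach double precision, which is the trade-off the abstract describes.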

Citations: 37

Reducing the number of counters needed for integer multiplication

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465379

R. Owens, R. Bajwa, M. J. Irwin

In this paper we consider the problem of multiplying reasonably small integers using fewer counters than that required by straightforward partial product accumulation. Not surprisingly, the method we use is based on the observation that integer multiplication can be formulated as aperiodic convolution. However, instead of using something like the Fast Fourier Transform to compute the aperiodic convolution, we use what are known as "fast" convolution algorithms. In this way we can construct multipliers for integers as small as eighteen bits which use fewer counters than that required by straightforward partial product accumulation. Because of the perceived "overhead" involved with an aperiodic formulation of integer multiplication, the ability to do this goes somewhat against the conventional wisdom that aperiodic formulation of integer multiplication gains an advantage over a straightforward partial product formulation only for fairly large integers.
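The observation that integer multiplication is an aperiodic (linear) convolution of digit sequences can be sketched directly. The example below splits each operand into fixed-width digits, convolves them, and then weights the results by powers of the base. It uses the plain quadratic convolution rather than any of the fast algorithms studied in the paper, and the 6-bit digit width and function names are assumptions for illustration.

```python
def to_digits(n, base_bits, count):
    """Split a non-negative integer into `count` digits of `base_bits` bits, LSB first."""
    mask = (1 << base_bits) - 1
    return [(n >> (base_bits * i)) & mask for i in range(count)]

def aperiodic_convolution(a, b):
    """Linear convolution: out[k] = sum of a[i] * b[j] over i + j == k."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def multiply_via_convolution(x, y, base_bits=6, count=3):
    """Multiply two integers of at most base_bits*count bits via digit convolution."""
    assert 0 <= x < (1 << (base_bits * count))
    assert 0 <= y < (1 << (base_bits * count))
    c = aperiodic_convolution(to_digits(x, base_bits, count),
                              to_digits(y, base_bits, count))
    # Carry resolution: weight each convolution term by its power of the base.
    return sum(ck << (base_bits * k) for k, ck in enumerate(c))
```

The plain version needs nine digit products for three-digit operands. Fast convolution algorithms compute the same five output terms with fewer products at the cost of extra additions, which is where the saving in counters would come from.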

Citations: 5

Analytic approach for error masking elimination in on-line multipliers

Proceedings of the 12th Symposium on Computer Arithmetic

Pub Date : 1995-07-19 DOI: 10.1109/ARITH.1995.465380

H. Bederr, M. Nicolaidis, A. Guyot

Several systematic design approaches are representative of techniques well adapted to testing sequential circuits (partial and full scan, LSSD, ...). However, in some cases, such as the testing of on-line operators, ad hoc DFT (design for testability) schemes are more suitable. On-line arithmetic is used for high-precision numbers, which results in long operators. The test sequence for a scan design can therefore grow quite large because test values must be shifted in and test responses shifted out, and the test application time becomes prohibitive. Moreover, the arithmetic nature of these operators implies that some errors detected locally are masked before they can be observed at the primary outputs. In this paper we describe an analytic approach to testing on-line multipliers that avoids error masking without adding extra hardware for internal state observability, while maintaining 100% fault coverage. Compared to a DFT approach using parity trees, this method reduces the area overhead from 7% to 1% and the extra pin count from 6 to 3 for the on-line multipliers considered in this paper.

Citations: 1
