[1991] Proceedings 10th IEEE Symposium on Computer Arithmetic: Latest Articles

Specifications for a variable-precision arithmetic coprocessor
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145548
T. E. Hull, M. S. Cohen, C. Hall
The authors have been developing a programming system intended to be especially convenient for scientific computing. Its main features are variable-precision (decimal) floating-point arithmetic and convenient exception handling. The software implementation of the system has evolved over a number of years, and a partial hardware implementation of the arithmetic itself was constructed and used during the early stages of the project. Based on this experience, the authors have developed a set of specifications for an arithmetic coprocessor to support such a system. These specifications are described. An outline of the language features and how they can be used is also provided, to help justify the particular choice of coprocessor specifications. The authors also indicate what other hardware features would be most helpful to the systems programmer, especially for implementation of the exception handling.
Citations: 17
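The pairing of variable-precision decimal arithmetic with exception handling in the abstract above can be illustrated in software with Python's standard `decimal` module. This is only an analogy and not the authors' language or coprocessor interface; `reciprocal_sum` and its parameters are hypothetical names chosen for the sketch.

```python
from decimal import Decimal, DivisionByZero, localcontext


def reciprocal_sum(values, digits):
    """Sum 1/v over `values` using `digits` significant decimal digits.

    Division by zero is trapped and handled by skipping the offending term.
    """
    with localcontext() as ctx:
        ctx.prec = digits                 # working precision for this block only
        ctx.traps[DivisionByZero] = True  # raise an exception instead of returning Infinity
        total = Decimal(0)
        for v in values:
            try:
                total += Decimal(1) / Decimal(v)
            except DivisionByZero:
                continue                  # exception handler: ignore the bad term
        return total
```

Calling `reciprocal_sum([3], 5)` and `reciprocal_sum([3], 10)` runs the same code at two different precisions, which is the kind of convenience the abstract describes.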
Fast division using accurate quotient approximations to reduce the number of iterations
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145559
D. Wong, M. Flynn
A class of iterative integer division algorithms based on lookup-table Taylor-series approximations to the reciprocal is presented. The algorithm iterates by using the reciprocal to find an approximate quotient and then subtracting the quotient multiplied by the divisor from the dividend to find a remaining dividend. Fast implementations can produce an average of either 14 or 27 bits per iteration, depending on whether the basic or advanced version of this method is implemented. Detailed analyses are presented to support the claimed accuracy per iteration. Speed estimates using state-of-the-art ECL (emitter-coupled logic) components show that this method is faster than the Newton-Raphson technique and can produce 53-bit quotients of 53-bit numbers in about 28 or 22 ns for the basic and advanced versions, respectively.
Citations: 84
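The iteration described in the abstract above (approximate the quotient with a reciprocal, subtract quotient times divisor, repeat on the remaining dividend) can be sketched as follows. This is a minimal illustration under stated assumptions: the reciprocal is computed by plain integer division from the top `table_bits` bits of the divisor rather than by the paper's lookup-table Taylor-series method, and `reciprocal_divide` and `table_bits` are hypothetical names.

```python
def reciprocal_divide(dividend, divisor, table_bits=8):
    """Unsigned integer division by repeated approximate-quotient steps.

    Returns (quotient, remainder, iterations).
    """
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects dividend >= 0 and divisor > 0")

    # Keep only the leading bits of the divisor and round up, so that the
    # reciprocal below never overestimates 1/divisor.
    drop = max(divisor.bit_length() - table_bits, 0)
    d_top = (divisor >> drop) + 1
    scale = 2 * table_bits
    recip = (1 << scale) // d_top  # about 2**(scale + drop) / divisor, rounded down

    quotient, remainder, iterations = 0, dividend, 0
    while remainder >= divisor:
        # Approximate quotient of the remaining dividend; never too large.
        q = (remainder * recip) >> (scale + drop)
        if q == 0:
            q = 1  # remainder >= divisor, so at least one more divisor fits
        quotient += q
        remainder -= q * divisor  # the new "remaining dividend"
        iterations += 1
    return quotient, remainder, iterations
```

Because the reciprocal is an underestimate, the remainder never goes negative, and each pass retires roughly as many quotient bits as the reciprocal has accurate bits.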
Application of on-line arithmetic algorithms to the SVD computation: preliminary results
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145568
P. Tu, M. Ercegovac
A scheme for the singular value decomposition (SVD) problem, based on online arithmetic, is discussed. The design, using radix-2 floating-point online operations, implemented in the LSI HCMOS gate-array technology, is compared with a compatible conventional arithmetic implementation. The preliminary results indicate that the proposed online approach achieves a speedup of 2.4-3.2 with respect to the conventional solutions, with 1.3-5.5 more gates and more than 6 times fewer interconnections.
Citations: 8
Delay optimization of carry-skip adders and block carry-lookahead adders
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145552
P. K. Chan, M. Schlag, C. Thomborson, V. Oklobdzija
The worst-case carry propagation delays in carry-skip adders and block carry-lookahead adders depend on how the full adders are grouped structurally into blocks as well as on the number of levels. The authors report a multidimensional dynamic programming paradigm for configuring these two adders to attain minimum latency. Previous methods are applicable only to very limited delay models that do not guarantee a minimum-latency configuration. Under the proposed delay model, critical path delay is calculated taking into account not only the intrinsic gate delays but also the fan-in and fan-out contributions.
Citations: 39
Design and implementation of a floating-point quasi-systolic general purpose CORDIC rotator for high-rate parallel data and signal processing
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145571
A. D. Lange, E. Deprettere
The authors describe the design and implementation of an algorithm and a processor which can be used to accelerate computations in which large numbers of rotations (circular as well as hyperbolic) are involved. The processor is a low-cost high-throughput VLSI implementation of the algorithm. With 10^7 rotations per second, many real-time and interaction-time applications in scientific computation become feasible. The required storage and/or silicon area is low and the execution time is independent of the particular operation performed. Another feature of this CORDIC design is its pipelined architecture and floating-point extension. It is angle-pipelinable at the bit level and has an execution time which is independent of any possible operation that can be executed.
Citations: 22
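For readers unfamiliar with CORDIC, the circular rotation mentioned in the abstract above can be sketched with the textbook shift-and-add iteration. This is not the authors' floating-point quasi-systolic design: it uses ordinary Python floats, covers only the circular mode (not the hyperbolic one), and `cordic_rotate` and `iterations` are hypothetical names.

```python
import math


def cordic_rotate(x, y, angle, iterations=32):
    """Rotate the vector (x, y) by `angle` radians with textbook circular CORDIC.

    Converges for |angle| up to roughly 1.74 rad, the sum of all micro-rotation angles.
    """
    # Every micro-rotation stretches the vector by sqrt(1 + 2**(-2i));
    # the accumulated gain is divided out at the end.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward a zero residual
        # In hardware the factor 2**-i is a right shift by i bits.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * math.atan(2.0 ** -i)
    return x / gain, y / gain
```

The loop performs the same sequence of steps for every input angle, which is consistent with the abstract's remark that execution time does not depend on the operation performed.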
Accurate and monotone approximations of some transcendental functions
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145566
W. Ferguson, T. Brightman
A technique for computing monotonicity-preserving approximations F_a(x) of a function F(x) is presented. This technique involves computing an extra-precise approximation of F(x) that is rounded to produce the value of F_a(x). For example, only a few extra bits of precision are used to make the accurate transcendental functions found on the Cyrix FasMath line of 80387-compatible math coprocessors monotonic.
Citations: 17
A redundant binary Euclidean GCD algorithm
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145563
S. N. Parikh, D. Matula
An efficient implementation of the Euclidean GCD (greatest common divisor) algorithm employing the redundant binary number system is described. The time complexity is O(n), utilizing O(n) 4-2 signed 1-bit adders to determine the GCD of two n-bit integers. The process is similar to that used in SRT division. The efficiency of the algorithm is competitive, to within a small factor, with floating-point division in terms of the number of shift and add/subtract operations. The novelty of the algorithm is based on properties derived from the proposed scheme of normalization of signed bit fractions. The implementation is well suited for systolic hardware design.
Citations: 7
OCAPI: architecture of a VLSI coprocessor for the GCD and the extended GCD of large numbers
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145564
A. Guyot
Various algorithms for finding the greatest common divisor (GCD) and extended GCD of very large integers are explored. In particular, the tradeoff between computation time and area is examined. Two of the algorithms, from which the method for deriving variants is straightforward, are detailed. Then the architecture of a VLSI processor dedicated to GCD as well as multiply, divide, square root, etc. of very large numbers (>600 decimal digits), using an internal radix-2 redundant representation and supporting multiple precision, is described.
Citations: 15
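As background for the entry above, the extended GCD returns, besides g = gcd(a, b), coefficients s and t with a·s + b·t = g. The sketch below is the classical extended Euclidean algorithm on Python's arbitrary-precision integers. It is not one of the hardware-oriented variants the paper details, and `extended_gcd` is a hypothetical name.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and a*s + b*t == g, for a, b >= 0."""
    old_r, r = a, b  # remainder sequence of Euclid's algorithm
    old_s, s = 1, 0  # coefficients of a
    old_t, t = 0, 1  # coefficients of b
    while r != 0:
        q = old_r // r
        # Invariant: old_r == a*old_s + b*old_t and r == a*s + b*t.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t
```

Python integers are unbounded, so the same function works on operands of more than 600 decimal digits, the operand size the abstract cites, although in software rather than in a dedicated processor.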
Representation of numbers in nonclassical numeration systems
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145528
Christiane Frougny
Numeration systems, the bases of which are defined by a linear recurrence with integer coefficients, are considered. Conditions on the recurrence are given under which the function of normalization, which transforms any representation of an integer into the normal one (obtained by the usual algorithm), can be realized by a finite automaton. Addition is a particular case of normalization. The same questions are discussed for the representation of real numbers in basis θ, where θ is a real number > 1. In particular it is shown that, if θ is a Pisot number, then the normalization and the addition in basis θ are computable by a finite automaton.
Citations: 3
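A familiar instance of a numeration system whose basis comes from a linear recurrence is the Fibonacci (Zeckendorf) system, with basis 1, 2, 3, 5, 8, ... The sketch below is an illustration chosen here, not an example taken from the paper: it computes the normal representation with the usual greedy algorithm and normalizes an arbitrary digit string by evaluating it and re-encoding, rather than with the finite automaton the abstract discusses. All function names are hypothetical.

```python
def normal_form(n):
    """Greedy Fibonacci-basis digits of n >= 0, most significant digit first."""
    if n == 0:
        return [0]
    basis = [1, 2]
    while basis[-1] + basis[-2] <= n:
        basis.append(basis[-1] + basis[-2])
    digits = []
    for u in reversed([u for u in basis if u <= n]):
        if u <= n:
            digits.append(1)  # take the largest basis element that still fits
            n -= u
        else:
            digits.append(0)
    return digits


def value(digits):
    """Integer denoted by a digit list (any non-negative digits, not only 0 and 1)."""
    basis = [1, 2]
    while len(basis) < len(digits):
        basis.append(basis[-1] + basis[-2])
    return sum(d * u for d, u in zip(reversed(digits), basis))


def normalize(digits):
    """Map any representation to the normal one by evaluating and re-encoding."""
    return normal_form(value(digits))


def add(a, b):
    """Addition seen as normalization of the digit-wise sum."""
    width = max(len(a), len(b))
    a = [0] * (width - len(a)) + a
    b = [0] * (width - len(b)) + b
    return normalize([x + y for x, y in zip(a, b)])
```

For example, `[0, 1, 1]` (2 + 1) and `[1, 0, 0]` (3) denote the same integer, and `normalize` maps the former to the latter.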
A 160 ns 54 bit CMOS division implementation using self-timing and symmetrically overlapped SRT stages
Pub Date : 1991-06-26 DOI: 10.1109/ARITH.1991.145561
T. Williams, M. Horowitz
A full-custom VLSI chip demonstrates an arithmetic implementation for computing the mantissa of a 54-bit (floating-point double-precision) division operation in 45 ns to 160 ns, depending on the data. The design uses self-timing to avoid the need to partition logic into clock cycles and the need for high-speed clocks. Self-timing allows the circuits to iterate with no overhead over the pure combinational logic delays. It also allows a more efficient symmetric overlapped execution of the SRT stages because of dynamic path ordering. The design has several other performance enhancements, and their effects on the performance are discussed. The total effect of all the performance enhancements provides a factor-of-two increase in performance due to architectural improvements over a straightforward SRT approach.
Citations: 17
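The "straightforward SRT approach" that the abstract above uses as its baseline can be sketched as textbook radix-2 SRT division with quotient digits in {-1, 0, 1}. The code below models only the arithmetic, using exact `Fraction` values. It does not model the chip's self-timed, overlapped stages, and `srt_divide` and `steps` are hypothetical names.

```python
from fractions import Fraction


def srt_divide(x, d, steps=54):
    """Radix-2 SRT division of x by d.

    Assumes 1/2 <= d < 1 and |x| <= d. Returns (q, w) with
    x == q*d + w / 2**steps and |w| <= d, so |x/d - q| <= 2**-steps.
    """
    half = Fraction(1, 2)
    w = Fraction(x)  # partial remainder
    q = Fraction(0)  # quotient accumulated from signed digits
    for j in range(1, steps + 1):
        shifted = 2 * w
        # Digit selection compares against constants only, never against d itself.
        if shifted >= half:
            digit = 1
        elif shifted < -half:
            digit = -1
        else:
            digit = 0
        w = shifted - digit * d
        q += Fraction(digit, 2 ** j)
    return q, w
```

Because the digit set is redundant, a slightly wrong digit choice is corrected by later digits; this is what lets hardware select each digit from a few leading bits of the partial remainder.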