
Proceedings 2003 16th IEEE Symposium on Computer Arithmetic: latest articles

Revisiting SRT quotient digit selection
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207658
Peter Kornerup
The quotient digit selection in the SRT division algorithm is based on a few most significant bits of the remainder and divisor, where the remainder is usually represented in a redundant representation. The number of leading bits needed depends on the quotient radix and digit set, and is usually found by an extensive search, to assure that the next quotient digit can be chosen as valid for all points (remainder, divisor) in a set defined by the truncated remainder and divisor, i.e., an "uncertainty rectangle". We present expressions for the number of bits needed for the truncated remainder and divisor, thus eliminating the need for a search through the truncation parameter space for validation. We also present simple algorithms to properly map truncated negative divisors and remainders into nonnegative values, allowing the quotient selection function only to be defined on the smaller domain of nonnegative values.
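As a minimal illustration of quotient digit selection, the following Python sketch runs a radix-2 SRT recurrence with the digit set {-1, 0, 1}. It is not the paper's construction: it assumes a non-redundant remainder held as an exact `Fraction`, it compares the shifted remainder against the constants ±1/2 rather than against a truncated estimate, and the names `srt_radix2` and `quotient_value` are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def srt_radix2(x, d, n):
    """Run n steps of radix-2 SRT division (illustrative sketch).

    Assumes 1/2 <= x < 1 and 1/2 <= d < 1. Returns (digits, w) such that
    x/2 == Q*d + w * 2**-n, where Q is the value of the signed digits.
    """
    x, d = Fraction(x), Fraction(d)
    if not (HALF <= x < 1 and HALF <= d < 1):
        raise ValueError("operands must lie in [1/2, 1)")
    w = x / 2                      # prescale so that |w| <= d holds initially
    digits = []
    for _ in range(n):
        shifted = 2 * w
        # Selection uses only a comparison with the constants +1/2 and -1/2.
        if shifted >= HALF:
            q = 1
        elif shifted < -HALF:
            q = -1
        else:
            q = 0
        w = shifted - q * d        # recurrence: w_j = 2*w_{j-1} - q_j*d
        digits.append(q)
    return digits, w

def quotient_value(digits):
    """Value of the signed-digit quotient: sum of q_j * 2**-j for j = 1..n."""
    return sum(Fraction(q, 2 ** (j + 1)) for j, q in enumerate(digits))
```

Because each digit is chosen from a comparison with fixed constants, only the leading bits of the shifted remainder matter in this simplified setting. The paper addresses the harder question of how many bits of a truncated redundant remainder and of the divisor are needed in general.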
Citations: 12
"Partially rounded" small-order approximations for accurate, hardware-oriented, table-based methods
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207668
J. Muller
We aim at evaluating elementary and special functions using small tables and small, rectangular, multipliers. To do that, we show how accurate polynomial approximations whose order-1 coefficients are small in size (a few bits only) can be computed. We compare the obtained results with similar work in the recent literature.
Citations: 22
Isolating critical cases for reciprocals using integer factorization
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207673
J. Harrison
One approach to testing and/or proving correctness of a floating-point algorithm computing a function f is based on finding input floating-point numbers a such that the exact result f(a) is very close to a "rounding boundary", i.e. a floating-point number or a midpoint between them. We show how to do this for the reciprocal function by utilizing prime factorizations. We present the method and show examples, as well as making a fairly detailed study of its expected and worst-case behavior. We point out how this analysis of reciprocals can be useful in analyzing certain reciprocal algorithms, and also show how the approach can be trivially adapted to the reciprocal square root function.
Citations: 9
On computing addition related arithmetic operations via controlled transport of charge
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207685
S. Cotofana, C. Lageweg, S. Vassiliadis
We investigate the implementation of basic arithmetic functions, such as addition and multiplication, in single electron tunneling (SET) technology. First, we describe the SET equivalents of Boolean CMOS gates and threshold logic gates. Second, we propose a set of building blocks, which can be utilized for a novel design style, namely arithmetic operations performed by direct manipulation of the location of individual electrons within the system. Using this new set of building blocks, we propose several novel approaches for computing addition related arithmetic operations via the controlled transport of charge (individual electrons). In particular, we prove the following: n-bit addition can be implemented with a depth-2 network built with O(n) circuit elements; n-input parity can be computed with a depth-2 network constructed with O(n) circuit elements and the same applies for n/logn counters; multiple operand addition of m n-bit operands can be implemented with a depth-2 network using O(mn) circuit elements; and finally n-bit multiplication can be implemented with a depth-3 network built with O(n) circuit elements.
Citations: 12
Some optimizations of hardware multiplication by constant matrices
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207656
Nicolas Boullis, A. Tisserand
We present some improvements on the optimization of hardware multiplication by constant matrices. We focus on the automatic generation of circuits that involve constant matrix multiplication (CMM), i.e. multiplication of a vector by a constant matrix. The proposed method, based on number recoding and dedicated common sub-expression factorization algorithms, was implemented in a VHDL generator. The obtained results on several applications have been implemented on FPGAs and compared to previous solutions. Up to 40% area and speed savings are achieved.
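To make the idea of number recoding concrete, here is a small Python sketch that recodes each constant into canonical signed-digit (CSD) form and multiplies a vector by a constant matrix using only shifts, additions, and subtractions. CSD is one well-known recoding chosen here as an assumption; the abstract does not say which recoding the authors use. The sketch also performs no common sub-expression sharing, and the names `csd_digits`, `times_constant`, and `constant_matvec` are hypothetical.

```python
def csd_digits(c):
    """Canonical signed-digit recoding of a non-negative integer.

    Returns digits in {-1, 0, 1}, least significant first, with no two
    adjacent nonzero digits.
    """
    if c < 0:
        raise ValueError("this sketch handles non-negative constants only")
    digits = []
    while c:
        if c & 1:
            digit = 2 - (c & 3)    # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= digit
        else:
            digit = 0
        digits.append(digit)
        c >>= 1
    return digits

def times_constant(x, c):
    """Compute x * c with one shift and one add/subtract per nonzero digit."""
    acc = 0
    for shift, digit in enumerate(csd_digits(c)):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc

def constant_matvec(matrix, vector):
    """Multiply an integer vector by a constant matrix, row by row."""
    return [sum(times_constant(x, c) for c, x in zip(row, vector))
            for row in matrix]
```

For example, the constant 7 is recoded as 8 - 1, so `times_constant(x, 7)` uses one subtraction instead of two additions. A hardware generator would additionally look for sub-expressions shared between rows, which this sketch leaves out.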
Citations: 97
Scaling an RNS number using the core function
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207687
N. Burgess
We introduce a method for extracting the core of a residue number system (RNS) number within the RNS, this affording a new method for scaling RNS numbers. Suppose an RNS comprises a set of coprime moduli, $m_i$, with $\prod m_i = M$. We describe a method for approximately scaling such an RNS number by a subset of the moduli, $\prod m_j = M_J \approx \sqrt{M}$, with the characteristic that all computations are performed using the original moduli and one other nonmaintained short wordlength modulus.
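The notation above can be grounded with a short Python sketch of an RNS and of what scaling by a subset of the moduli is meant to produce. This is a reference computation only: it converts back to an ordinary integer through the Chinese remainder theorem, which is exactly what an in-RNS method such as the one described avoids. The core function itself is not implemented, and the moduli and the names `to_rns`, `from_rns`, and `scale_by_subset` are illustrative assumptions.

```python
from math import prod

def to_rns(x, moduli):
    """Represent x by its residues modulo each (pairwise coprime) modulus."""
    return [x % m for m in moduli]

def from_rns(residues, moduli):
    """Reconstruct x in [0, M) by the Chinese remainder theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)   # modular inverse, Python 3.8+
    return total % M

def scale_by_subset(x, moduli, subset):
    """Reference result of scaling: floor(x / M_J), re-encoded in the RNS."""
    M_J = prod(subset)
    return to_rns(x // M_J, moduli)

# Example moduli: M = 13*14*15*17 = 46410, and M_J = 14*15 = 210 is close
# to sqrt(M), which is about 215.4.
MODULI = (13, 14, 15, 17)
SUBSET = (14, 15)
```

With these values, scaling 12345 by $M_J = 210$ gives 58, whose residues are `[6, 2, 13, 7]`.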
Citations: 39
SRT division algorithms as dynamical systems
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207659
Mark McCann, N. Pippenger
SRT division, as it was discovered in the late 1950s, represented an important improvement in the speed of division algorithms for computers at the time. A variant of SRT division is still commonly implemented in computers today. Although some bounds on the performance of the original SRT division method were obtained, a great many questions remained unanswered. The original version of SRT division is described as a dynamical system. This enables us to bring modern dynamical systems theory, a relatively new development in mathematics, to bear on an older problem. In doing so, we are able to show that SRT division is ergodic, and is even Bernoulli, for all real divisors and dividends. With the Bernoulli property, we are able to use entropy to prove that the natural extensions of SRT division are isomorphic by way of the Kolmogorov-Ornstein theorem. We demonstrate how our methods and results can be applied to a much larger class of division algorithms.
Citations: 4
Revisions to the IEEE 754 standard for floating-point arithmetic
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207667
E. Schwarz
Almost twenty years ago the IEEE 754 binary floating-point standard was adopted. Since then almost every microprocessor, as well as many programming languages, has defined its floating-point arithmetic to be IEEE 754 compliant. From many years of experience in implementing the standard in hardware and writing floating-point programs, there have been numerous suggestions for revisions. All IEEE standards must undergo a review process every 5 years or be dropped as an active standard. In past reviews this standard was extended without much discussion. But finally, in January 2001, an in-depth review was started. A committee was formed, and over the past two years many revisions have been evaluated. The most extensive change to the standard is the adoption of formats for decimal floating-point arithmetic. This proposal creates decimal floating-point data formats for 32, 64, and 128 bits. Decimal floating-point arithmetic provides an exact representation of displayed numbers and provides precise rounding at the decimal radix point. This type of arithmetic is required in financial calculations. Some experts argue that decimal will replace binary due to its ability to represent decimal numbers exactly, while others think that binary will remain the key floating-point format due to its speed of execution and its more regular spacing of intervals. Another once-controversial proposal is the addition of fused multiply-add. This operation causes only one rounding error and, in most implementations, provides twice the performance of separate operations. Other additions to the standard include a quadword format and many predicate functions, such as comparison operators like greater than. Operators for maximum and minimum have also been accepted that, after hours of arguing, now favor a numeric result over a NaN. There are also deletions, such as the single extended and double extended formats. And there are some items that are deleted in one meeting and resurrected in the following meeting, such as signaling NaNs. Over the past two years of committee review many proposals have been discussed. This panel discussion will enlighten the audience about the additions, deletions, and some of the current controversial proposals. The panel will consist of:
• David Hough, Sun Microsystems, Editor of the Standard – Overview
• Mike Cowlishaw, IBM Corp., Decimal Floating-Point Software Advocate
• David Bailey, Lawrence Berkeley Lab., Quadword Precision Advocate
• David Matula, Southern Methodist University, Academics / Industry Consultant
• Eric Schwarz, IBM Corp., Decimal Floating-Point Hardware – Panel Chair
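Two of the claims in this abstract, that fused multiply-add rounds only once and that decimal arithmetic represents decimal values exactly, can be checked with a short Python sketch. The helper `fused_multiply_add` is a hypothetical emulation that uses exact rational arithmetic and a single final conversion to a double; it is not a hardware FMA and not an API defined by the standard. The operand values are chosen for illustration.

```python
from decimal import Decimal
from fractions import Fraction

def fused_multiply_add(a, b, c):
    """Emulate a*b + c with a single rounding, via exact rationals."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# The exact product is 1 - 2**-54, which is not a double. Rounding it first
# gives 1.0, so the separate operations lose the small term entirely.
separate = a * b + c

# The fused form keeps the exact product and rounds once at the end.
fused = fused_multiply_add(a, b, c)

# Decimal fractions such as 0.1 are inexact in binary but exact in decimal.
binary_sum_exact = (0.1 + 0.2 == 0.3)
decimal_sum_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(separate, fused, binary_sum_exact, decimal_sum_exact)
```

Here the separately rounded result is 0.0 while the fused result is -2**-54, and only the decimal sum compares equal to 0.3.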
Citations: 3
Worst cases and lattice reduction
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207672
D. Stehlé, V. Lefèvre, P. Zimmermann
We propose a new algorithm to find worst cases for correct rounding of an analytic function. We first reduce this problem to the real small value problem, i.e. for polynomials with real coefficients. Then we show that this second problem can be solved efficiently, by extending Coppersmith's work on the integer small value problem (for polynomials with integer coefficients) using lattice reduction (D. Coppersmith, 1996; 2001). For floating-point numbers with a mantissa less than $N$, and a polynomial approximation of degree $d$, our algorithm finds all worst cases at distance $< N^{-d^2/(2d+1)}$ from a machine number in time $O(N^{(d+1)/(2d+1)+\varepsilon})$. For $d=2$, this improves on the $O(N^{2/3+\varepsilon})$ complexity from Lefèvre's algorithm (V. Lefèvre, 2000; V. Lefèvre et al., 2001) to $O(N^{3/5+\varepsilon})$. We exhibit some new worst cases found using our algorithm, for double-extended and quadruple precision. For larger $d$, our algorithm can be used to check that there exist no worst cases at distance $< N^{-k}$ in time $O(N^{1/2+O(1/k)})$.
Citations: 27
Hardware implementations of denormalized numbers
Pub Date : 2003-06-15 DOI: 10.1109/ARITH.2003.1207662
E. Schwarz, M. Schmookler, S. D. Trong
Denormalized numbers are the most difficult type of number to implement in floating-point units. They are so complex that some designs have elected to handle them in software rather than hardware. This has resulted in execution times in the tens of thousands of cycles, which has made denormalized numbers useless to programmers. This does not have to happen. With a small amount of additional hardware, denormalized numbers and underflows can be handled close to the speed of normalized numbers. We will summarize the little-known techniques for handling denormalized numbers. Most of the techniques discussed have only been discussed in filed or pending patent applications.
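For readers unfamiliar with the encoding, the Python sketch below shows how a denormalized (subnormal) double is recognized from its bit fields: the biased exponent field is zero while the fraction field is nonzero. It illustrates the number format only, not any of the hardware techniques the paper surveys, and the function name `classify_double` is hypothetical. It assumes the platform's `float` is an IEEE 754 binary64 value with gradual underflow enabled.

```python
import struct
import sys

def classify_double(x):
    """Classify a Python float by its IEEE 754 binary64 bit fields."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exponent = (bits >> 52) & 0x7FF          # 11-bit biased exponent field
    fraction = bits & ((1 << 52) - 1)        # 52-bit fraction field
    if exponent == 0:
        return "zero" if fraction == 0 else "subnormal"
    if exponent == 0x7FF:
        return "infinity" if fraction == 0 else "nan"
    return "normal"

smallest_normal = sys.float_info.min         # 2**-1022
print(classify_double(smallest_normal))      # normal
print(classify_double(smallest_normal / 2))  # subnormal
```

Halving the smallest normal number does not flush to zero; the result is held with reduced precision, which is the gradual underflow behavior that makes these values costly to support in hardware.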
Citations: 25