
2015 IEEE 22nd Symposium on Computer Arithmetic: Latest Papers

Low-Cost Duplicate Multiplication
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.29
Michael B. Sullivan, E. Swartzlander
Rising levels of integration, decreasing component reliabilities, and the ubiquity of computer systems make error protection a growing concern. Meanwhile, the uncertainty of future fault and error modes motivates the design of strong error detection mechanisms that offer fault-agnostic error protection. Current concurrent hardware mechanisms, however, either offer strong error detection coverage at high cost or restrict their coverage to narrow synthetic error models. This paper investigates the potential for duplication using alternate number systems to lower the costs of duplicated multiplication without sacrificing error coverage. Two examples of such low-cost duplication schemes are described and evaluated; it is shown that specialized carry-save or residue number system checking can be used to increase the efficiency of duplicated multiplication.
Citations: 2
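To make the residue checking mentioned in the abstract above concrete, here is a minimal sketch of a generic residue check on a product. It is not the paper's specialized scheme: the modulus 3, the function name `residue_check`, and the software bit-flip fault model are all assumptions chosen for illustration.

```python
def residue_check(a, b, product, modulus=3):
    """Return True if `product` is consistent with a * b modulo `modulus`.

    The checker multiplies only the residues of the operands, which stands in
    for a much narrower duplicate multiplier (hypothetical illustration).
    """
    expected = ((a % modulus) * (b % modulus)) % modulus
    return product % modulus == expected


a, b = 123, 457
good = a * b
print(residue_check(a, b, good))             # the fault-free product passes

# Model a single-bit fault in the product. A flipped bit changes the value by
# +/- 2**k, and 2**k is never divisible by 3, so the mod-3 check catches it.
faulty = good ^ (1 << 5)
print(residue_check(a, b, faulty))
```

A small modulus keeps the checker cheap but lets some multi-bit errors through, which is the cost/coverage trade-off the abstract refers to.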
Reliable Evaluation of the Worst-Case Peak Gain Matrix in Multiple Precision
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.14
Anastasia Volkova, Thibault Hilaire, C. Lauter
The worst-case peak gain (WCPG) of a linear filter is an important measure for the implementation of signal processing algorithms. It is used in the error propagation analysis for filters; thus a reliable evaluation with controlled precision is required. The WCPG is computed as an infinite sum and has matrix powers in each summand. We propose a direct formula for the lower bound on the truncation order of the infinite sum as a function of the desired truncation error. Several multiprecision methods for complex matrix operations are developed and their error analysis performed. A multiprecision matrix powering method is presented. All methods yield a rigorous solution with an absolute error bounded by an a priori given value. The results are illustrated with numerical examples.
Citations: 17
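The infinite sum described in the abstract above can be sketched as follows, assuming the usual state-space form of the WCPG, |D| + sum over k >= 0 of |C A^k B|, with absolute values taken entry by entry. This is a plain double-precision truncation with a caller-chosen order; it has none of the rigorous error bounds or multiprecision arithmetic that the paper is about, and the names `matmul` and `wcpg_truncated` are hypothetical.

```python
def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def wcpg_truncated(a, b, c, d, order):
    """Approximate |D| + sum_{k=0}^{order} |C A^k B| in double precision."""
    total = [[abs(v) for v in row] for row in d]
    power_b = b  # holds A^k B for the current k
    for _ in range(order + 1):
        term = matmul(c, power_b)
        total = [[t + abs(v) for t, v in zip(t_row, v_row)]
                 for t_row, v_row in zip(total, term)]
        power_b = matmul(a, power_b)
    return total


# First-order example: the summands are 0.5**k, so the sum tends to 2.
print(wcpg_truncated([[0.5]], [[1.0]], [[1.0]], [[0.0]], order=60))
```

Choosing `order` by hand is exactly the step the paper replaces with a formula tied to the desired truncation error.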
The Exact Real Arithmetical Algorithm in Binary Continued Fractions
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.20
P. Kurka
The exact real binary arithmetical algorithm is an on-line algorithm which computes the sum, product or ratio of two real numbers to arbitrary precision. The algorithm works in general Moebius number systems which represent real numbers by infinite products of Moebius transformations. We consider a number system of binary continued fractions in which this algorithm is computed faster than in the binary signed system. Moreover, the number system of binary continued fractions circumvents the problem of nonredundancy and slow convergence of continued fractions.
Citations: 2
Efficient Divide-and-Conquer Multiprecision Integer Division
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.19
William Bruce Hart
We present a new divide-and-conquer algorithm for mid-range multiprecision integer division which is typically 20-25% faster than the recent algorithms of Moller and Granlund implemented in the GNU Multi Precision (GMP) library.
Citations: 1
Semi-Automatic Floating-Point Implementation of Special Functions
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.12
C. Lauter, M. Mezzarobba
This work introduces an approach to the computer-assisted implementation of mathematical functions geared toward special functions such as those occurring in mathematical physics. The general idea is to start with an exact symbolic representation of a function and automate as much as possible of the process of implementing it. In order to deal with a large class of special functions, our symbolic representation is an implicit one: the input is a linear differential equation with polynomial coefficients along with initial values. The output is a C program to evaluate the solution of the equation using domain splitting, argument reduction and polynomial approximations in double-precision arithmetic, in the usual style of mathematical libraries. Our generation method combines symbolic-numeric manipulations of linear ODEs with interval-based tools for the floating-point implementation of "black-box" functions. We describe a prototype code generator that can automatically produce implementations on moderately large intervals. Implementations on the whole real line are possible in some cases but require manual tool setup and code integration. Due to this limitation and as some heuristics remain, we refer to our method as "semi-automatic" at this stage. Along with other examples, we present an implementation of the Voigt profile with fixed parameters that may be of independent interest.
Citations: 11
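The idea in the abstract above of describing a function implicitly, by a linear differential equation plus initial values, and deriving a polynomial approximation from it can be illustrated with a toy case. The sketch below assumes the equation y'' + y = 0 with y(0) = 0 and y'(0) = 1 (whose solution is the sine) and derives Taylor coefficients from the recurrence the equation implies. It is not the paper's generator: there is no domain splitting, argument reduction, or rigorous error control, and the function names are hypothetical.

```python
from fractions import Fraction
import math


def sine_taylor_coefficients(count):
    """Exact Taylor coefficients c_0..c_{count-1} of the solution of
    y'' + y = 0 with y(0) = 0, y'(0) = 1."""
    coeffs = [Fraction(0), Fraction(1)]  # initial values y(0), y'(0)
    for k in range(count - 2):
        # Substituting y = sum c_k x^k into y'' + y = 0 gives
        # (k + 2)(k + 1) c_{k+2} + c_k = 0.
        coeffs.append(-coeffs[k] / ((k + 1) * (k + 2)))
    return coeffs[:count]


def horner(coeffs, x):
    """Evaluate the polynomial in double precision with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + float(c)
    return result


coeffs = sine_taylor_coefficients(12)
print(coeffs[3])                               # Fraction(-1, 6)
print(horner(coeffs, 0.5), math.sin(0.5))
```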
Modulo-(2^n - 2^q - 1) Parallel Prefix Addition via Excess-Modulo Encoding of Residues
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.9
Seyed Hamed Fatemi Langroudi, G. Jaberipur
The residue number system t = {2^n - 1, 2^n, 2^n + 1} has been extensively studied towards perfection in realization of efficient parallel prefix modular adders, with (3 + 2 log n)ΔG latency. Many applications, such as digital signal processing, require fast modular operations. However, relying only on t limits the magnitude of n, and accordingly the dynamic range. Therefore, additional mutually prime moduli are required to accommodate a wider dynamic range. On the other hand, the speed of modular arithmetic operations for the additional moduli should be as close as possible to those in t. This could be best met by moduli of the form 2^n - (2^q + 1), with 1 ≤ q ≤ n - 2, such as 2^n - 3 and 2^n - 5. However, the fastest parallel prefix realization of modulo-(2^n - 2^q - 1) adders that we have encountered in the relevant literature claims (7 + 2 log n)ΔG latency. Motivated by the need to reduce the latter, we propose new designs of such adders with (5 + 2 log n)ΔG latency without any penalty in area consumption or power dissipation. The proposed modular addition algorithm entails supplementary representation of residues in [0, 2^q] as [2^n - (2^q + 1), 2^n - 1]. This leads to additional performance efficiency similar to the effect of double zero representation in modulo-(2^n - 1) adders. The aforementioned analytically evaluated speed gain and improvements in other figures of merit are also supported via circuit simulation and synthesis.
Citations: 4
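The arithmetic behind the moduli in the abstract above can be sketched at the integer level. The code below is a behavioural model only, not the paper's parallel prefix circuit: it assumes canonical residues as adder inputs and uses the identity 2^n ≡ 2^q + 1 (mod 2^n - 2^q - 1) to fold the carry back in. The helper `decode_excess` illustrates the redundant encoding in which n-bit words in [2^n - (2^q + 1), 2^n - 1] alias the residues [0, 2^q]. All function names are hypothetical.

```python
def modulus(n, q):
    """The modulus m = 2^n - 2^q - 1."""
    return (1 << n) - (1 << q) - 1


def add_mod(a, b, n, q):
    """Add canonical residues a, b in [0, m - 1] modulo m = 2^n - 2^q - 1."""
    m = modulus(n, q)
    assert 1 <= q <= n - 2 and 0 <= a < m and 0 <= b < m
    s = a + b
    # Since 2^n = m + 2^q + 1, adding 2^q + 1 and then dropping a carry out
    # of bit n is the same as subtracting m.
    t = s + (1 << q) + 1
    return t - (1 << n) if t >= (1 << n) else s


def decode_excess(x, n, q):
    """Map any n-bit word to its canonical residue; words in [m, 2^n - 1]
    are the excess encodings of [0, 2^q]."""
    m = modulus(n, q)
    return x - m if x >= m else x


print(add_mod(9, 7, n=4, q=1))      # (9 + 7) mod 13
print(decode_excess(15, n=4, q=1))  # 15 is the excess encoding of 2
```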
Design and Implementation of an Embedded FPGA Floating Point DSP Block
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.18
M. Langhammer, B. Pasca
This paper describes the architecture and implementation, from the standpoint of both target applications and circuit design, of an FPGA DSP Block that can efficiently support both fixed-point and single-precision (SP) floating-point (FP) arithmetic. Most contemporary FPGAs embed DSP blocks that provide simple multiply-add-based fixed-point arithmetic cores. Current FP arithmetic FPGA solutions make use of these hardened DSP resources, together with embedded memory blocks and soft logic resources; however, larger systems cannot be efficiently implemented due to the routing and soft logic limitations on the devices, resulting in significant area, performance, and power consumption penalties compared to ASIC implementations. In this paper we analyse earlier proposed embedded FP implementations, and show why they are not suitable for a production FPGA. We contrast these against our solution, a unified DSP Block, where (a) the SP FP multiplier is overlaid on the fixed point constructs, (b) the SP FP Adder/Subtracter is integrated as a separate unit, and (c) the multiplier and adder can be combined in a way that is both arithmetically useful and efficient in terms of FPGA routing density and congestion. In addition, a novel way of seamlessly combining any number of DSP Blocks in a low latency structure will be introduced. We will show that this new approach allows a low cost, low power, and high density FP platform on current production 20nm FPGAs. We also describe a future enhancement of the DSP block that can support subnormal numbers.
Citations: 11
The end of numerical error
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.34
J. Gustafson
Summary form only given, as follows. The full paper was not made available as part of this conference proceedings. It is time to overthrow a century of methods based on floating point arithmetic. Current technical computing is based on the acceptance of rounding error using numerical representations that were invented in 1914, and acceptance of sampling error using algorithms designed for a time when transistors were very expensive. By sticking to an antiquated storage format (now codified as an IEEE standard) well into the exascale era, we are wasting power, energy, storage, bandwidth, and programmer effort. The pursuit of exascale floating point is ridiculous, since we do not need to be making 10^18 sloppy rounding errors per second; we need instead to get provable, valid results for the first time, by turning the speed of parallel computers into higher quality answers instead of more junk per second. We introduce the 'unum' (universal number), a superset of IEEE Floating Point, that contains extra metadata fields that actually save storage, yet give more accurate answers that do not round, overflow, or underflow. The potential they offer for improved programmer productivity is enormous. They also provide, for the first time, the hope of a numerical standard that guarantees bitwise identical results across different computer architectures. Unum format is the basis for the 'ubox' method, which redefines what is meant by "high performance" by measuring performance in terms of the knowledge obtained about the answer and not the operations performed per second. Examples are given for practical application to structural analysis, radiation transfer, the n-body problem, linear and nonlinear systems of equations, and Laplace’s equation. This is a fresh approach to scientific computing that allows proper, rigorous representation of real number sets for the first time.
Citations: 5
Reproducible Tall-Skinny QR
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.28
Hong Diep Nguyen, J. Demmel
Reproducibility is the ability to obtain bitwise identical results from different runs of the same program on the same input data, regardless of the available computing resources, or how they are scheduled. Recently, techniques have been proposed to attain reproducibility for BLAS operations, all of which rely on reproducibly computing the floating-point sum and dot product. Nonetheless, a reproducible BLAS library does not automatically translate into a reproducible higher-level linear algebra library, especially when communication is optimized. For instance, for the QR factorization, conventional algorithms such as Householder transformation or Gram-Schmidt process can be used to reproducibly factorize a floating-point matrix by fixing the high-level order of computation, for example column-by-column from left to right, and by using reproducible versions of level-1 BLAS operations such as dot product and 2-norm. In a massively parallel environment, those algorithms have high communication cost due to the need for synchronization after each step. The Tall-Skinny QR algorithm obtains much better performance in massively parallel environments by reducing the number of messages by a factor of n to O(log(P)) where P is the processor count, by reducing the number of reduction operations to O(1). Those reduction operations however are highly dependent on the network topology, in particular the number of computing nodes, and therefore are difficult to implement reproducibly and with reasonable performance. In this paper we present a new technique to reproducibly compute a QR factorization for a tall skinny matrix, which is based on the Cholesky QR algorithm to attain reproducibility as well as to improve communication cost, and the iterative refinement technique to guarantee the accuracy of the computed results. 
Our technique exhibits strong scalability in massively parallel environments, and at the same time can provide results of almost the same accuracy as the conventional Householder QR algorithm unless the matrix is extremely badly conditioned, in which case a warning can be given. Initial experimental results in Matlab show that for not too ill-conditioned matrices whose condition number is smaller than sqrt(1/e) where e is the machine epsilon, our technique runs less than 4 times slower than the built-in Matlab qr() function, and always computes numerically stable results in terms of column-wise relative error.
Citations: 8
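To make the Cholesky QR idea in the abstract above more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`gram`, `cholesky_upper`, `solve_right`, `cholesky_qr2`) are hypothetical, a second Cholesky QR pass stands in for the iterative refinement the abstract mentions, and `math.fsum` (exactly rounded summation on IEEE 754 platforms) is swapped in for the reproducible BLAS summation the paper relies on.

```python
import math

def gram(A):
    """Return G = A^T A. Each entry is summed with math.fsum, so the result
    does not depend on the order in which the rows of A are visited."""
    n = len(A[0])
    return [[math.fsum(row[i] * row[j] for row in A) for j in range(n)]
            for i in range(n)]

def cholesky_upper(G):
    """Return upper-triangular R with R^T R = G.
    Raises ValueError when G is not numerically positive definite,
    which is the 'extremely badly conditioned' case that warrants a warning."""
    n = len(G)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = G[i][i] - math.fsum(R[k][i] ** 2 for k in range(i))
        if d <= 0.0:
            raise ValueError("matrix too ill-conditioned for Cholesky QR")
        R[i][i] = math.sqrt(d)
        for j in range(i + 1, n):
            s = G[i][j] - math.fsum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = s / R[i][i]
    return R

def solve_right(A, R):
    """Return Q = A R^{-1}, solving q R = a for every row a of A."""
    n = len(R)
    Q = []
    for row in A:
        q = [0.0] * n
        for j in range(n):
            partial = math.fsum(q[k] * R[k][j] for k in range(j))
            q[j] = (row[j] - partial) / R[j][j]
        Q.append(q)
    return Q

def cholesky_qr2(A):
    """Two passes of Cholesky QR: A = Q1 R1, then Q1 = Q R2, so A = Q (R2 R1)."""
    R1 = cholesky_upper(gram(A))
    Q1 = solve_right(A, R1)
    R2 = cholesky_upper(gram(Q1))
    Q = solve_right(Q1, R2)
    n = len(R1)
    R = [[math.fsum(R2[i][k] * R1[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    return Q, R

# Usage: factor a small tall-skinny matrix (4 rows, 2 columns).
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.5]]
Q, R = cholesky_qr2(A)
```

Because `math.fsum` returns the correctly rounded sum of its inputs, the Gram matrix, and therefore `R`, should come out bitwise identical when the rows of `A` are reordered. That loosely mimics distributing the rows differently across processors, where only one reduction (the Gram matrix sum) is needed per pass.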
Code Generators for Mathematical Functions
Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.22
Nicolas Brunie, F. D. Dinechin, O. Kupriianova, C. Lauter
A typical floating-point environment includes support for a small set of about 30 mathematical functions, such as exponential, logarithm, trigonometric, and hyperbolic functions. These functions are provided by mathematical software libraries (libm), typically in IEEE 754 single, double, and quad precision. This article suggests replacing this libm paradigm with a more general approach: the on-demand generation of numerical function code, on arbitrary domains and with arbitrary accuracies. First, such code generation opens up the libm function space available to programmers. It may capture a much wider set of functions, and may even capture standard functions on non-standard domains and accuracy/performance points. Second, writing libm code requires fine-tuned instruction selection and scheduling for performance, and sophisticated floating-point techniques for accuracy. Automating this task through code generation improves confidence in the code while enabling better design space exploration, and therefore better time to market, even for the libm functions. This article discusses the new challenges of this paradigm shift, and presents the current state of open-source function code generators available at http://www.metalibm.org/.
Citations: 33
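As a rough illustration of what "on-demand generation of numerical function code, on arbitrary domains and with arbitrary accuracies" could look like, the following toy Python generator emits source code for exp(x) on a caller-chosen interval and error target. It is a sketch under simplifying assumptions, not the Metalibm interface: the name `generate_exp` and its parameters are hypothetical, and a Taylor polynomial with a Lagrange remainder bound stands in for the more sophisticated approximation and error analysis a real generator would use. Rounding errors in the evaluation are not accounted for in the bound.

```python
import math

def generate_exp(name, lo, hi, max_abs_error):
    """Return Python source for a function approximating exp(x) on [lo, hi].

    The polynomial is the Taylor expansion about the interval midpoint; the
    degree is the smallest one whose Lagrange remainder bound
    exp(hi) * radius**(d+1) / (d+1)! is at most max_abs_error.
    """
    mid = 0.5 * (lo + hi)
    radius = 0.5 * (hi - lo)
    degree = 0
    while (math.exp(hi) * radius ** (degree + 1)
           / math.factorial(degree + 1)) > max_abs_error:
        degree += 1
    # Taylor coefficients of exp about mid: exp(mid) / k!
    coeffs = [math.exp(mid) / math.factorial(k) for k in range(degree + 1)]
    # Emit a Horner-scheme evaluation, highest-degree coefficient first.
    lines = [
        f"def {name}(x):",
        f"    t = x - {mid!r}",
        f"    acc = {coeffs[-1]!r}",
    ]
    for c in reversed(coeffs[:-1]):
        lines.append(f"    acc = acc * t + {c!r}")
    lines.append("    return acc")
    return "\n".join(lines) + "\n"

# Usage: generate, compile, and call a specialised exp for [-0.25, 0.25].
source = generate_exp("exp_small", -0.25, 0.25, 1e-9)
namespace = {}
exec(source, namespace)
exp_small = namespace["exp_small"]
print(source)
```

Tightening `max_abs_error` or widening the interval makes the generator pick a higher degree, which is a small-scale version of the accuracy/performance trade-off the abstract describes.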