
17th IEEE Symposium on Computer Arithmetic (ARITH'05): latest papers

New Results on the Distance between a Segment and Z². Application to the Exact Rounding
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.32
V. Lefèvre
This paper presents extensions to Lefèvre's algorithm that computes a lower bound on the distance between a segment and a regular grid Z². This algorithm and, in particular, the extensions are useful in the search for worst cases for the exact rounding of unary elementary functions or base-conversion functions. The proof that is presented is simpler and less technical than the original proof. This paper also gives benchmark results with various optimization parameters, explanations of these results, and an application to base conversion.
Citations: 16
Gal's accurate tables method revisited
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.24
D. Stehlé, P. Zimmermann
Gal's accurate tables algorithm aims at providing an efficient implementation of mathematical functions with correct rounding as often as possible. This method requires an expensive pre-computation of the values taken by the function (or by several related functions) at some distinguished points. Our improvements of Gal's method are two-fold: on the one hand, we describe what is arguably the best set of distinguished values and how it improves the efficiency and accuracy of the function implementation; on the other hand, we give an algorithm which drastically decreases the cost of the pre-computation. These improvements are related to the worst cases for the correct rounding of mathematical functions and to the algorithms for finding them. We demonstrate how the whole method can be put into practice for 2^x and sin x for x ∈ [1/2, 1), in double precision.
Citations: 7
Solving constraints on the invisible bits of the intermediate result for floating-point verification
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.38
M. Aharoni, Sigal Asaf, Ron Maharik, Ilan Nehama, Ilya Nikulshin, A. Ziv
Test generation for datapath floating-point verification involves targeting intricate corner cases, which can often be solved only through complex constraint solving. In the process of calculating the result, we use an intermediate result whose significand comprises a finite number of bits and a sticky bit that is 0 if and only if the intermediate result is exact. We refer to all the bits beyond those represented in the final result as the invisible bits. We deal with corner cases that can only be defined via constraints on the intermediate result. Our work investigates the following problem: given a floating-point operation, and constraints on the invisible bits and the sticky bit, find two inputs for the operation that yield an intermediate result compatible with the constraints. The paper supplies a deterministic solution for addition and subtraction, and probabilistic solutions for multiplication and division. It also discusses the application of these algorithms to the verification of floating-point implementations.
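To make the terminology in this abstract concrete, here is a minimal sketch (not the authors' test generator) that splits an exact positive value into the bits visible in the final result, the invisible bits, and the sticky bit. The function name `split_intermediate`, the default precision `p=24`, and the choice of two invisible bits are illustrative assumptions, not details taken from the paper.

```python
from fractions import Fraction

def split_intermediate(value, p=24, extra=2):
    """Return (visible, invisible, sticky, exponent) for an exact positive value.

    visible   : the p leading significand bits (what the final result can hold)
    invisible : the next `extra` bits of the intermediate significand
    sticky    : 0 if and only if the intermediate result is exact
    All parameter choices here are hypothetical, for illustration only.
    """
    v = Fraction(value)
    assert v > 0
    # Find e with 2**e <= v < 2**(e + 1).
    e = v.numerator.bit_length() - v.denominator.bit_length()
    if Fraction(2) ** e > v:
        e -= 1
    # Scale so that the integer part holds exactly p + extra significand bits.
    scaled = v / Fraction(2) ** e * 2 ** (p + extra - 1)
    kept = scaled.numerator // scaled.denominator
    sticky = int(scaled != kept)
    visible, invisible = divmod(kept, 1 << extra)
    return visible, invisible, sticky, e

# Exact sum 1 + 2**-30: too small to reach the invisible bits, so only sticky is set.
print(split_intermediate(Fraction(1) + Fraction(1, 2**30)))
```

A constraint such as "invisible bits equal to 01 and sticky bit equal to 0" then corresponds to asking for operands whose exact result makes this function return those fields.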
Citations: 8
Arithmetic operations in the polynomial modular number system
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.11
J. Bajard, L. Imbert, T. Plantard
We propose a new number representation and arithmetic for the elements of the ring of integers modulo p. The so-called polynomial modular number system (PMNS) allows for fast polynomial arithmetic and easy parallelization. The most important contribution of this paper is the fundamental theorem of a modular number system, which provides a bound for the coefficients of the polynomials used to represent the set Z_p. However, we also propose a complete set of algorithms to perform the arithmetic operations over a PMNS, which makes this system of practical interest for people concerned about efficient implementation of modular arithmetic.
Citations: 27
High-radix implementation of IEEE floating-point addition
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.26
P. Seidel
We propose a micro-architecture for high-performance IEEE floating-point addition that is based on a (non-redundant) high-radix representation of the floating-point operands. The main improvement of the proposed IEEE FP addition implementation is achieved by avoiding the computation of full alignment and normalization shifts, which impose major delays in conventional implementations of IEEE FP addition. This reduction is achieved at the cost of wider operand interfaces and an increased complexity for IEEE-compliant rounding. We present a detailed discussion of an IEEE FP adder implementation using the proposed high-radix format and explain the specific benefits and challenges of the design.
Citations: 7
Decimal multiplication with efficient partial product generation
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.15
M. A. Erle, E. Schwarz, M. Schulte
Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents a novel design for fixed-point decimal multiplication that utilizes a simple recoding scheme to produce signed-magnitude representations of the operands, thereby greatly simplifying the process of generating partial products for each multiplier digit. The partial products are generated using a digit-by-digit multiplier on a word-by-digit basis, first in a signed-digit form with two digits per position, and then combined via a combinational circuit. As the signed-digit partial products are developed one at a time while traversing the recoded multiplier operand from the least significant digit to the most significant digit, each partial product is added along with the accumulated sum of previous partial products via a signed-digit adder. This work is significantly different from other work employing digit-by-digit multipliers due to the efficiency gained by restricting the range of digits throughout the multiplication process.
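The recoding idea can be illustrated in software. The sketch below is not the paper's circuit: it assumes, purely for illustration, that decimal digits are recoded into the range -5..5, and it uses ordinary integer addition in place of the signed-digit adder. The names `recode` and `multiply` are hypothetical.

```python
def recode(n):
    """Signed digits of n >= 0, least significant first, each in -5..5 (assumed range)."""
    digits = []
    while n:
        n, d = divmod(n, 10)
        if d > 5:          # replace d by d - 10 and carry 1 into the next digit
            d -= 10
            n += 1
        digits.append(d)
    return digits or [0]

def multiply(a, b):
    """Multiply non-negative integers by accumulating word-by-digit partial products."""
    a_digits = recode(a)
    acc = 0
    # Traverse the recoded multiplier from least to most significant digit.
    for i, b_digit in enumerate(recode(b)):
        # Each digit-by-digit product has magnitude at most 25 under this recoding.
        partial = sum(a_digit * b_digit * 10**j for j, a_digit in enumerate(a_digits))
        acc += partial * 10**i
    return acc

print(recode(19))          # [-1, 2], since -1 + 2*10 == 19
print(multiply(987, 654))  # 645498
```

Restricting digits to a small symmetric range is what keeps every digit-by-digit product small; in hardware that limits the number of distinct multiples that must be generated.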
Citations: 132
N-bit unsigned division via n-bit multiply-add
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.31
A. Robison
Integer division on modern processors is expensive compared to multiplication. Previous algorithms for performing unsigned division by an invariant divisor, via reciprocal approximation, suffer in the worst case from a common requirement for (n+1)-bit multiplication, which typically must be synthesized from n-bit multiplication and extra arithmetic operations. This paper presents, and proves, a hybrid of previous algorithms that replaces (n+1)-bit multiplication with a single fused multiply-add operation on n-bit operands, thus reducing any n-bit unsigned division to the upper n bits of a multiply-add, followed by a single right shift. An additional benefit is that the prerequisite calculations are simple and fast. On the Itanium® 2 processor, the technique is advantageous for as few as two quotients that share a common run-time divisor.
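The scheme described in this abstract can be sketched as follows. This is an illustrative reconstruction from the abstract and from standard reciprocal-approximation arguments, not a transcription of the paper's pseudocode; the function names `precompute` and `divide` and the default width `n=32` are assumptions. For a divisor d with s = floor(log2 d), the reciprocal 2^(n+s)/d is either rounded up (addend 0) or rounded down (addend equal to the multiplier), whichever keeps the error small enough, so that both the multiplier and the addend fit in n bits.

```python
def precompute(d, n=32):
    """Return (a, b, s) such that x // d == ((a*x + b) >> n) >> s for all 0 <= x < 2**n."""
    assert 0 < d < 2**n
    s = d.bit_length() - 1              # floor(log2(d))
    if d == 1 << s:
        # Power of two: an all-ones multiplier and addend make the high half equal x.
        return 2**n - 1, 2**n - 1, s
    t, r = divmod(1 << (n + s), d)      # t = floor(2**(n+s) / d), r = remainder
    if d - r <= 1 << s:
        return t + 1, 0, s              # rounding the reciprocal up is accurate enough
    return t, t, s                      # otherwise round down and compensate with b = t

def divide(x, a, b, s, n=32):
    """Upper n bits of the 2n-bit multiply-add a*x + b, followed by one right shift."""
    return ((a * x + b) >> n) >> s

a, b, s = precompute(7)
print(divide(1000, a, b, s))            # 142
```

With n = 32, the divisor 7 takes the round-down branch, which is the case where a plain multiply by the rounded-up reciprocal would need 33 bits.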
Citations: 33
Division by constant for the ST100 DSP microprocessor
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.17
J. Muller, A. Tisserand, B. Dinechin, Christophe Monat
Algorithms for Euclidean (i.e., integer) division by a constant operation are presented. They allow fast computation for some values of the divisor (known at compile time) or also when both quotient and modulus are required. These algorithms are based on the multiply-accumulate instruction and the 40-bit arithmetic available in DSPs such as the ST100 DSP from STMicroelectronics. The results are demonstrated in the case of standard speech coding applications.
Citations: 7
Table lookup structures for multiplicative inverses modulo 2^k
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.43
D. Matula, A. Fit-Florea, M. Thornton
We introduce an inheritance property and related table lookup structures applicable to simplified evaluation of the modular operations "multiplicative inverse", "discrete log", and "exponential residue" in the particular modulus 2^k. Regarding applications, we describe an integer representation system of Benschop for transforming integer multiplications into additions which benefits from our table lookup function evaluation procedures. We focus herein on the multiplicative inverse modulo 2^k to exhibit simplifications in hardware implementations realized from the inheritance property. A table lookup structure given by a bit string that can be interpreted with reference to a binary tree is described and analyzed. Using observed symmetries, the lookup structure size is reduced, allowing a novel direct lookup process for multiplicative inverses for all 16-bit odd integers to be obtained from a table of size less than two KBytes. The 16-bit multiplicative inverse operation is also applicable for providing a seed inverse for obtaining 32/64-bit multiplicative inverses by one/two iterations of a known quadratic refinement algorithm.
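The last sentence of this abstract can be illustrated with the usual Newton (Hensel) refinement step y ← y(2 − xy), which doubles the number of correct low-order bits: if xy = 1 + E with E divisible by 2^j, then x·y(2 − xy) = 1 − E², which is 1 modulo 2^(2j). The sketch below assumes this is the "known quadratic refinement algorithm" meant, and it does not reproduce the paper's lookup table: the 16-bit seed is computed with Python's built-in `pow` (Python 3.8 or later) as a stand-in. The names `refine` and `inverse64` are hypothetical.

```python
def refine(x, y, bits):
    """One refinement step: if x*y == 1 mod 2**(bits // 2), the result inverts x mod 2**bits."""
    mask = (1 << bits) - 1
    return (y * (2 - x * y)) & mask     # masking reduces modulo 2**bits, also for negatives

def inverse64(x):
    """Multiplicative inverse of an odd integer x modulo 2**64."""
    assert x % 2 == 1
    y = pow(x, -1, 1 << 16)             # stand-in for the paper's 16-bit table lookup
    y = refine(x, y, 32)                # one iteration: 16 -> 32 bits
    y = refine(x, y, 64)                # two iterations: 32 -> 64 bits
    return y

x = 0xDEADBEEF
print((x * inverse64(x)) % 2**64)       # 1
```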
Citations: 11
A fast-start method for computing the inverse tangent
Pub Date : 2005-06-27 DOI: 10.1109/ARITH.2005.5
Peter W. Markstein
In a search for an algorithm to compute atan(x) which has both low latency and few floating-point instructions, an interesting variant of familiar trigonometry formulas was discovered that allows argument reduction to commence before any references to tables stored in memory are needed. Low latency makes the method suitable for a closed subroutine, and few floating-point operations make the method advantageous for a software-pipelined implementation.
Citations: 2