2015 IEEE 22nd Symposium on Computer Arithmetic: Latest Articles


External Reviewers ARITH 2021

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2021-06-01 DOI: 10.1109/arith51176.2021.00009

Karim Bigou, Mojtaba Bisheh-Niasar, L. Fiolhais, Rogério Paludo, Hwajeong Seo

Citations: 0

Digit Recurrence Floating-Point Division under HUB Format

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2016-01-01 DOI: 10.1109/ARITH.2016.17

J. Villalba

The Half-Unit-Biased (HUB) format is based on shifting the representation line of binary numbers by half a unit in the last place (ULP). The main feature of this format is that round-to-nearest is carried out by simple truncation, which prevents any carry propagation and saves time and area. Algorithms and architectures have been defined for addition/subtraction and multiplication under this format. Nevertheless, the division operation has not yet been addressed. In this paper we deal with floating-point division under the HUB format, studying the architecture for the digit-recurrence method, including the on-the-fly conversion of the signed-digit quotient.

Keywords: division by digit recurrence, HUB format, on-the-fly conversion
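The claim that round-to-nearest reduces to truncation can be illustrated with a small fixed-point sketch. It assumes the usual reading of the HUB format, in which an n-bit stored word carries an implicit extra least-significant bit equal to 1, so the representable values are the odd multiples of 2^-(n+1). The function name `hub_round` and the use of exact fractions are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import floor


def hub_round(value: Fraction, n: int) -> Fraction:
    """Round `value` to an n-bit HUB fixed-point number by truncation.

    Assumption: a HUB word with n stored fractional bits has an implicit
    (n+1)-th bit equal to 1, so it denotes (2*k + 1) / 2**(n + 1).
    """
    kept = floor(value * 2**n)              # truncation: drop the bits below 2**-n
    return Fraction(2 * kept + 1, 2 ** (n + 1))  # append the implicit 1


if __name__ == "__main__":
    n = 3
    worst = max(
        abs(hub_round(Fraction(k, 256), n) - Fraction(k, 256)) for k in range(256)
    )
    # The worst-case error is half a unit in the last stored place.
    print(worst, Fraction(1, 2 ** (n + 1)))
```

Because every truncated interval [k/2^n, (k+1)/2^n) has its HUB value at the midpoint, no increment (and hence no carry chain) is needed to reach the nearest representable number.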


Citations: 0

Contributions to the Design of Residue Number System Architectures

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.25

Benoît Gérard, J. Kammerer, Nabil Merkiche

The Residue Number System (RNS) is nowadays considered a real alternative to other hardware architectures for handling large-number computations. In this paper we propose algorithmic answers to some of the questions a designer may face when implementing such a solution. More precisely, we investigate the following three problems. First, we propose an efficient method for constructing maximal bases, noting that this problem can be seen as a max-clique problem. Second, we consider reducing the logic gate count when two different bases share the same hardware modules. Again this is linked to graph theory, since it corresponds to finding a maximum weighted matching. Finally, we detail how the DSP blocks in FPGAs can be leveraged to reach higher design frequencies by implementing full computation units inside them.
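For readers unfamiliar with RNS, the sketch below shows the basic idea the abstract builds on: a base is a set of pairwise coprime moduli, multiplication is carried out independently in each channel, and the Chinese Remainder Theorem recovers the integer. The base `[251, 253, 255, 256]` and all function names are illustrative assumptions; the paper's base-construction and hardware-sharing methods are not reproduced here.

```python
from math import gcd, prod


def is_rns_base(moduli):
    """A valid RNS base needs pairwise coprime moduli."""
    return all(
        gcd(a, b) == 1 for i, a in enumerate(moduli) for b in moduli[i + 1:]
    )


def to_rns(x, moduli):
    """Represent x by its residues, one per channel."""
    return [x % m for m in moduli]


def rns_mul(a, b, moduli):
    """Channel-wise multiplication: no carries cross between moduli."""
    return [(x * y) % m for x, y, m in zip(a, b, moduli)]


def from_rns(residues, moduli):
    """Chinese Remainder Theorem reconstruction modulo prod(moduli)."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse (Python 3.8+)
    return total % big_m


if __name__ == "__main__":
    base = [251, 253, 255, 256]  # hypothetical base, chosen for illustration
    assert is_rns_base(base)
    x, y = 123456, 7890
    product = from_rns(rns_mul(to_rns(x, base), to_rns(y, base), base), base)
    print(product == x * y)
```

The result is only exact while the true product stays below the product of the moduli, which is why larger bases are of interest.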


Citations: 9

Calculating in floating sexagesimal place value notation, 4000 years ago

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.33

C. Proust

Summary form only given, as follows. The full paper was not made available as part of this conference proceedings. By the end of the third millennium BCE in Mesopotamia, an innovation of major significance for the history of mathematics occurred: the sexagesimal place value notation. A sophisticated mathematical culture was subsequently developed by masters attached to the scribal schools that flourished in Iraq, Iran and Syria during the first centuries of the second millennium BCE. The best-known aspect of this mathematical culture is the art of solving quadratic problems. The numerical algorithms exploiting the properties of base 60 and the floating notation are less well known. This paper presents some of these algorithms, especially those based on factorization methods.
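A small sketch can make "floating" base-60 notation concrete. It assumes that a floating sexagesimal number is simply a digit string with no recorded position of the units, so that 1, 60 and 1/60 are all written "1". Under that assumption, the reciprocal of a regular number (one whose only prime factors are 2, 3 and 5) is a finite digit string. The function names are hypothetical, and this is a modern illustration rather than a reconstruction of the algorithms discussed in the paper.

```python
def to_floating_sexagesimal(n):
    """Base-60 digits of a positive integer, most significant first,
    with trailing zeros dropped (the notation records no magnitude)."""
    if n <= 0:
        raise ValueError("n must be positive")
    digits = []
    while n:
        n, d = divmod(n, 60)
        digits.append(d)
    digits.reverse()
    while len(digits) > 1 and digits[-1] == 0:
        digits.pop()
    return digits


def is_regular(n):
    """True if n has no prime factors other than 2, 3 and 5."""
    for p in (2, 3, 5):
        while n % p == 0:
            n //= p
    return n == 1


def floating_reciprocal(n):
    """Digits of 1/n in floating sexagesimal notation, for regular n."""
    if n <= 0 or not is_regular(n):
        raise ValueError("only regular numbers have finite reciprocals")
    k = 0
    while 60**k % n:  # smallest power of 60 that n divides
        k += 1
    return to_floating_sexagesimal(60**k // n)


if __name__ == "__main__":
    for n in (2, 8, 9):
        print(n, floating_reciprocal(n))
```

With these definitions the reciprocal of 8 comes out as the digits 7, 30, and multiplying 7;30 back by 8 gives 1 up to a power of 60, which is exactly what the floating notation ignores.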


Citations: 0

New Bit-Level Serial GF(2^m) Multiplication Using Polynomial Basis

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.11

H. El-Razouk, A. Reyhani-Masoleh

The polynomial basis (PB) representation offers efficient hardware realizations of GF(2^m) multipliers. Bit-level serial multiplication over GF(2^m) trades off computational latency for lower silicon area and is hence favored in resource-constrained applications. In such area-critical applications, extra clock cycles may be needed to read the inputs of the multiplication if the data-path has limited capacity. In this paper, we present a new bit-level serial PB multiplication scheme which generates its output bits in parallel after m clock cycles without requiring any preloading of the inputs, for the first time in the open literature. The proposed architecture, referred to as fully-serial-in-parallel-out (FSIPO), is useful for achieving higher throughput in resource-constrained environments if the data-path for entering inputs has limited capacity, especially for large dimensions of the field GF(2^m).
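As background, the following sketch models a conventional MSB-first bit-serial polynomial-basis multiplier, in which one bit of the operand `b` is consumed per loop iteration (one "clock cycle") and the other operand is assumed to be fully loaded. This is the textbook scheme, not the FSIPO architecture proposed in the paper; the function name and the GF(2^8) example polynomial are illustrative choices.

```python
def gf2m_mul_bit_serial(a: int, b: int, poly: int, m: int) -> int:
    """Multiply a and b in GF(2^m) using the polynomial basis.

    `poly` is the irreducible polynomial including its x^m term,
    e.g. 0x11B for x^8 + x^4 + x^3 + x + 1.
    """
    acc = 0
    for i in reversed(range(m)):   # one iteration per serial input bit of b
        acc <<= 1                  # multiply the accumulator by x
        if (acc >> m) & 1:         # degree reached m: reduce modulo poly
            acc ^= poly
        if (b >> i) & 1:           # add a (XOR) when the current bit of b is 1
            acc ^= a
    return acc


if __name__ == "__main__":
    # Worked example from FIPS-197 (AES field): {57} * {83} = {c1}
    print(hex(gf2m_mul_bit_serial(0x57, 0x83, 0x11B, 8)))
```

All m output bits become available together after the m-th iteration, which is the parallel-out behaviour; the serial-in restriction on both operands is what the abstract's scheme adds.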


Citations: 12

An Efficient Softcore Multiplier Architecture for Xilinx FPGAs

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.17

M. Kumm, Shahid Abbas, P. Zipf

This work presents an efficient implementation of a softcore multiplier, i.e., a multiplier architecture which can be efficiently mapped to the slice resources of modern Xilinx FPGAs. Instead of dividing the multiplication into the generation of partial products and their summation using a compressor tree, as done in modern multipliers, an array-like architecture is proposed. Each row of the array generates a partial product which is directly added to the results of previous rows using the fast carry chain. A radix-4 Booth encoding/decoding is used to reduce the I/O count of the partial product generation, which makes it possible to map both the Booth encoder and the decoder into a single 6-input look-up table (LUT). As in a conventional Booth multiplier, this nearly halves the number of rows compared to a ripple-carry array multiplier. In addition, the compressor tree is completely avoided and an efficient and regular structure remains that uses up to 50% fewer slice resources than previous approaches and offers a multiply-accumulate (MAC) operation without extra resources.
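The row-halving effect of radix-4 Booth recoding can be sketched in software. The model below recodes an n-bit two's-complement multiplier into n/2 digits in {-2, -1, 0, 1, 2} and accumulates one partial product per digit, mirroring the "each row adds to the previous rows" structure described above. It is a behavioural illustration only; the function names are assumptions and nothing here models the LUT or carry-chain mapping.

```python
def booth_radix4_digits(y: int, n: int):
    """Recode an n-bit two's-complement integer y (n even) into n/2
    radix-4 Booth digits, least significant first."""
    if n % 2 or not -(1 << (n - 1)) <= y < (1 << (n - 1)):
        raise ValueError("n must be even and y must fit in n bits")
    digits = []
    prev = 0  # y_{-1} = 0
    for i in range(0, n, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)  # d = y_{2k} + y_{2k-1} - 2*y_{2k+1}
        prev = b1
    return digits


def booth_multiply(x: int, y: int, n: int) -> int:
    """Array-style multiply: one partial product (row) per Booth digit."""
    acc = 0
    for k, d in enumerate(booth_radix4_digits(y, n)):
        acc += (d * x) << (2 * k)  # row k is weighted by 4**k
    return acc


if __name__ == "__main__":
    print(booth_radix4_digits(7, 4))   # two rows instead of four
    print(booth_multiply(-37, 91, 8) == -37 * 91)
```

For an 8-bit multiplier this yields four rows instead of eight, at the cost of having to form the multiples -2x, -x, x and 2x in each row.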


Citations: 23

RNS Arithmetic Approach in Lattice-Based Cryptography: Accelerating the "Rounding-off" Core Procedure

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.30

J. Bajard, J. Eynard, Nabil Merkiche, T. Plantard

Residue Number Systems (RNS) are naturally considered an interesting candidate to provide efficient arithmetic for implementations of cryptosystems such as RSA, ECC (Elliptic Curve Cryptography), pairings, etc. More recently, RNS have been used to accelerate fully homomorphic encryption as a form of lattice-based cryptography. In this paper, we present an RNS algorithm for solving the Closest Vector Problem (CVP). This algorithm is particularly efficient for a certain class of lattice bases. It provides a full RNS Babai round-off procedure without any costly conversion into an alternative positional number system such as the Mixed Radix System (MRS). An optimized Cox-Rower architecture adapted to the proposed algorithm is also presented. The main modifications reside in the Rower unit, which now uses only one multiplier. This makes it possible to free two out of three multipliers from the Rower unit by reusing the same one, with an overhead of 3 more cycles per inner reduction. An analysis of the feasibility of an FPGA implementation is also given.
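To show what the round-off procedure computes, here is a minimal two-dimensional sketch of Babai's rounding technique using exact rational arithmetic: express the target in the lattice basis, round each coefficient to the nearest integer, and map back. It deliberately uses ordinary integers and fractions rather than RNS, so it illustrates the mathematical operation only, not the paper's algorithm. The function name and the example basis are assumptions.

```python
from fractions import Fraction


def babai_round_off(basis, target):
    """Approximate closest lattice vector to `target` in the 2-D lattice
    spanned by the rows of `basis`, via Babai's rounding technique."""
    (a, b), (c, d) = basis
    det = a * d - b * c
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")
    t0, t1 = target
    # Solve c0*(a, b) + c1*(c, d) = (t0, t1) exactly (Cramer's rule).
    c0 = Fraction(t0 * d - t1 * c, det)
    c1 = Fraction(t1 * a - t0 * b, det)
    # Round the real coefficients to integers (ties go to the even integer).
    k0, k1 = round(c0), round(c1)
    return (k0 * a + k1 * c, k0 * b + k1 * d)


if __name__ == "__main__":
    basis = ((3, 1), (1, 2))             # hypothetical, reasonably orthogonal basis
    print(babai_round_off(basis, (7, 6)))
```

The quality of the answer depends on how close to orthogonal the basis is, which is consistent with the abstract's remark that the method suits a certain class of bases.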


Citations: 21

Precise and Fast Computation of Elliptic Integrals and Functions

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.15

T. Fukushima

This paper summarizes recent progress on new methods to compute Legendre's complete and incomplete elliptic integrals of all three kinds and the Jacobian elliptic functions. It also reviews entirely new methods to (i) compute the inverse functions of the complete elliptic integrals, (ii) invert a general incomplete elliptic integral numerically, and (iii) evaluate the partial derivatives of the elliptic integrals and functions recursively. To avoid information loss for small parameter and/or characteristic, the associate complete and incomplete elliptic integrals are newly introduced. The main techniques used are (i) piecewise approximation for single-variable functions, and (ii) systematic use of the half- and double-argument transformations and truncated Maclaurin series expansions for the others. The new methods have errors of at most 5 ulps, with no risk of cancellation for small input arguments. They run faster than the existing methods: (i) slightly faster than Bulirsch's procedure for the incomplete elliptic integral of the first kind, (ii) 1.5 times faster than Bulirsch's procedure for Jacobian elliptic functions, (iii) 2.5 times faster than Cody's and Bulirsch's procedures for the complete elliptic integrals, and (iv) 3.5 times faster than Carlson's procedures for the incomplete elliptic integrals of the second and third kind. The Fortran programs are available at https://www.researchgate.net/profile/Toshio_Fukushima/.
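For orientation, the complete elliptic integral of the first kind can be computed with the classical arithmetic-geometric mean (AGM) iteration, K(m) = π / (2·AGM(1, √(1−m))). The sketch below uses that classical method, not the piecewise approximations described in the abstract, and it assumes the parameter convention m = k². The function name `ellipk_agm` and the fixed iteration count are illustrative choices.

```python
import math


def ellipk_agm(m: float) -> float:
    """Complete elliptic integral of the first kind K(m), 0 <= m < 1,
    via the arithmetic-geometric mean (parameter convention m = k**2)."""
    if not 0.0 <= m < 1.0:
        raise ValueError("m must satisfy 0 <= m < 1")
    a, b = 1.0, math.sqrt(1.0 - m)
    # The AGM converges quadratically; 40 steps is far more than needed
    # in double precision and avoids a tolerance-based stopping test.
    for _ in range(40):
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)


if __name__ == "__main__":
    print(ellipk_agm(0.0))   # K(0) equals pi/2
    print(ellipk_agm(0.5))
```

The AGM route is compact but requires a square root per step, which is one reason table- or series-based evaluations such as those surveyed in the paper can be faster.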


Citations: 7

Efficient Modular Exponentiation Based on Multiple Multiplications by a Common Operand

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.24

C. Nègre, T. Plantard, J. Robert

The main operation in RSA encryption/decryption is modular exponentiation, which involves a long sequence of modular squarings and multiplications. In this paper, we propose to improve pairs of modular multiplications AB and AC that have a common operand. To reach this goal we modify the Montgomery modular multiplication in order to share common computations between AB and AC. We extend this idea to reduce the cost of multiple modular multiplications AB_1, ..., AB_ℓ by the same operand A. We then take advantage of these improvements in the Montgomery-ladder and SPA-resistant m-ary exponentiation algorithms. The complexity analysis shows that for an RSA modulus of size 2048 bits, the proposed improvements reduce the number of word operations (ADD and MUL) by 14% for the Montgomery-ladder and by 5%-8% for the m-ary exponentiations. Our implementations show a speed-up of 8%-14% for the Montgomery-ladder and of 1%-8% for the m-ary exponentiations for moduli of size 1024, 2048 and 4048 bits.
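The setting can be sketched with a plain Montgomery multiplication and a Montgomery-ladder exponentiation. Each ladder step performs two multiplications that share an operand (for example r0·r1 and r0·r0), which appears to be the kind of pair AB, AC the abstract refers to. The sketch below does not implement the shared computation itself; it only shows the baseline, with Python big integers standing in for word-level arithmetic. All names are illustrative assumptions.

```python
def mont_mul(a: int, b: int, n: int, k: int, n_prime: int) -> int:
    """Montgomery product a*b*R^-1 mod n, with R = 2**k and a, b < n."""
    mask = (1 << k) - 1
    t = a * b
    m = ((t & mask) * n_prime) & mask  # m = t * (-n^-1) mod R
    u = (t + m * n) >> k               # t + m*n is divisible by R
    return u - n if u >= n else u      # single conditional subtraction


def ladder_pow(base: int, exp: int, n: int) -> int:
    """base**exp mod n for odd n > 1, via the Montgomery ladder."""
    if n % 2 == 0 or n < 3:
        raise ValueError("n must be odd and at least 3")
    k = n.bit_length()
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r     # requires Python 3.8+
    r0 = r % n                         # 1 in Montgomery form
    r1 = (base * r) % n                # base in Montgomery form
    for i in reversed(range(exp.bit_length())):
        if (exp >> i) & 1:
            r0 = mont_mul(r0, r1, n, k, n_prime)  # shares operand r1 ...
            r1 = mont_mul(r1, r1, n, k, n_prime)  # ... with this squaring
        else:
            r1 = mont_mul(r0, r1, n, k, n_prime)  # shares operand r0 ...
            r0 = mont_mul(r0, r0, n, k, n_prime)  # ... with this squaring
    return mont_mul(r0, 1, n, k, n_prime)         # leave Montgomery form


if __name__ == "__main__":
    n = 2**127 - 1
    print(ladder_pow(5, 65537, n) == pow(5, 65537, n))
```

Both branches perform one multiplication and one squaring regardless of the exponent bit, which is the regularity that makes the ladder attractive against simple power analysis.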


Citations: 3

An Automatable Formal Semantics for IEEE-754 Floating-Point Arithmetic

2015 IEEE 22nd Symposium on Computer Arithmetic

Pub Date : 2015-06-22 DOI: 10.1109/ARITH.2015.26

M. Brain, C. Tinelli, Philipp Rümmer, T. Wahl

Automated reasoning tools often provide little or no support for reasoning accurately and efficiently about floating-point arithmetic. As a consequence, software verification systems that use these tools are unable to reason reliably about programs containing floating-point calculations or may give unsound results. These deficiencies are in stark contrast to the increasing awareness that the improper use of floating-point arithmetic in programs can lead to unintuitive and harmful defects in software. To promote coordinated efforts towards building efficient and accurate floating-point reasoning engines, this paper presents a formalization of the IEEE-754 standard for floating-point arithmetic as a theory in many-sorted first-order logic. Benefits include a standardized syntax and unambiguous semantics, allowing tool interoperability and sharing of benchmarks, and providing a basis for automated, formal analysis of programs that process floating-point data.
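One ingredient of any such semantics is an exact mapping from bit patterns to the values they denote. The sketch below gives that mapping for the IEEE-754 binary32 format using exact rationals, and cross-checks it against the host's own decoding. It is an informal illustration in Python, not the first-order theory presented in the paper; the function names and the string tags for special values are assumptions.

```python
import struct
from fractions import Fraction


def decode_binary32(bits: int):
    """Exact value denoted by a 32-bit IEEE-754 binary32 pattern.

    Returns a Fraction for finite values, or "+oo", "-oo", "NaN".
    Note: +0 and -0 both map to Fraction(0), so the sign of zero is lost.
    """
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0xFF:                       # infinities and NaNs
        if fraction:
            return "NaN"
        return "-oo" if sign else "+oo"
    if exponent == 0:                          # zero and subnormals
        magnitude = Fraction(fraction, 2**149)
    else:                                      # normal numbers, hidden bit set
        magnitude = Fraction(2**23 + fraction) * Fraction(2) ** (exponent - 150)
    return -magnitude if sign else magnitude


def host_value(bits: int) -> Fraction:
    """Decode the same finite pattern with the host's float support."""
    return Fraction(struct.unpack(">f", struct.pack(">I", bits))[0])


if __name__ == "__main__":
    pattern = 0x40490FDB  # the binary32 value closest to pi
    print(decode_binary32(pattern) == host_value(pattern))
```

Writing the mapping with exact rationals makes rounding an explicit, separate step, which is the kind of unambiguity a shared formal semantics is meant to provide.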


Citations: 60
