An Efficient Methodology for Binary Logarithmic Computations of Floating-Point Numbers With Normalized Output Within One ulp of Accuracy

IF 3.8 2区计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE IEEE Transactions on Computers Pub Date : 2025-02-19 DOI:10.1109/TC.2025.3543676

Fei Lyu;Yuanyong Luo;Weiqiang Liu

{"title":"An Efficient Methodology for Binary Logarithmic Computations of Floating-Point Numbers With Normalized Output Within One ulp of Accuracy","authors":"Fei Lyu;Yuanyong Luo;Weiqiang Liu","doi":"10.1109/TC.2025.3543676","DOIUrl":null,"url":null,"abstract":"Many studies have focused on the hardware implementation of binary logarithmic computation with fixed-point output. Although their outputs are accurate within 1 ulp (unit in the last place) in fixed-point format, they are far from meeting the accuracy requirement of 1 ulp in floating-point format when the output is close to 0. However, normalized floating-point output that is accurate to within 1-3 ulp is needed in many math libraries (for example, OpenCL, NVIDIA CUDA, and AMD AOCL). To the best of our knowledge, this is the first study to propose a hardware implementation of binary logarithmic computation for floating-point numbers with a normalized output that is accurate to within 1 ulp. Instead of calculating <inline-formula><tex-math>$\\textrm{log}_{2}(1+fi)$</tex-math></inline-formula> (where <inline-formula><tex-math>$\\boldsymbol{fi}$</tex-math></inline-formula> is the fractional part of the floating-point number) directly, the proposed methodology uses two novel objective functions for the polynomial approximation method. The novel objective functions make the significant bits of the outputs move forward to eliminate the necessity for high precision near zero. Compared with the designs of fixed-point binary logarithmic converters, the proposed hardware implementation achieves greater accuracy to meet the requirement of 1 ulp of floating-point format with a 21% extra area consumption.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 5","pages":"1800-1813"},"PeriodicalIF":3.8000,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10892356/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}

引用次数: 0

Abstract

Many studies have focused on the hardware implementation of binary logarithmic computation with fixed-point output. Although their outputs are accurate within 1 ulp (unit in the last place) in fixed-point format, they are far from meeting the accuracy requirement of 1 ulp in floating-point format when the output is close to 0. However, normalized floating-point output that is accurate to within 1-3 ulp is needed in many math libraries (for example, OpenCL, NVIDIA CUDA, and AMD AOCL). To the best of our knowledge, this is the first study to propose a hardware implementation of binary logarithmic computation for floating-point numbers with a normalized output that is accurate to within 1 ulp. Instead of calculating

$\textrm{log}_{2}(1+fi)$

(where

$\boldsymbol{fi}$

is the fractional part of the floating-point number) directly, the proposed methodology uses two novel objective functions for the polynomial approximation method. The novel objective functions make the significant bits of the outputs move forward to eliminate the necessity for high precision near zero. Compared with the designs of fixed-point binary logarithmic converters, the proposed hardware implementation achieves greater accuracy to meet the requirement of 1 ulp of floating-point format with a 21% extra area consumption.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

一种精度以内归一化输出浮点数二进制对数计算的有效方法

许多研究都集中在具有定点输出的二进制对数计算的硬件实现上。虽然它们在定点格式下的输出精度在1 ulp（最后一个单位）以内，但当输出接近于0时，它们远不能满足浮点格式下1 ulp的精度要求。然而，在许多数学库（例如，OpenCL， NVIDIA CUDA和AMD AOCL）中需要精确到1-3 ulp的规范化浮点输出。据我们所知，这是第一个提出浮点数二进制对数计算的硬件实现的研究，其规范化输出精确到1 up以内。该方法不直接计算$\textrm{log}_{2}(1+fi)$（其中$\boldsymbol{fi}$是浮点数的小数部分），而是使用两个新的目标函数进行多项式近似方法。新的目标函数使输出的有效位前移，从而消除了对接近零的高精度的需要。与定点二进制对数转换器的设计相比，所提出的硬件实现达到了更高的精度，满足了1 ulp浮点格式的要求，并增加了21%的面积消耗。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

IEEE Transactions on Computers 工程技术-工程：电子与电气

CiteScore

6.60

自引率

5.40%

发文量

199

审稿时长

6.0 months

期刊介绍： The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.

期刊最新文献

2025 Reviewers List Evaluation of Radiation Resilience, Performance, and Vmin of Sub-3 nm FSFET Based SRAM Arrays Dual-Pronged Deep Learning Preprocessing on Heterogeneous Platforms With CPU, Accelerator and CSD Latency Optimization in Hybrid Memory System for GNNs Fused FP8 Many-Terms Dot Product With Scaling and FP32 Accumulation