Low-cost constant time signed digit selection for most significant bit first multiplication

IF 1.9 · Region 4, Computer Science · JCR Q3, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE · Microprocessors and Microsystems · Pub Date: 2024-11-01 · DOI: 10.1016/j.micpro.2024.105118
Ghassem Jaberipur, Saeid Gorgin, Jeong-A. Lee
Microprocessors and Microsystems, volume 111, Article 105118. Journal Article. https://www.sciencedirect.com/science/article/pii/S0141933124001133
Citations: 0

Abstract

Serial binary multiplication is widely used in digital applications. In particular, left-to-right (aka online) processing of operands enables real-time generation of product digits for immediate use in subsequent online computations (e.g., successive layers of a neural network). In left-to-right arithmetic operations, where a residual is maintained for digit selection, a redundant number system is mandatory for representing the outputs, while the input operands and the residual may be redundant or non-redundant. However, when the input data paths are narrow (e.g., eight bits as in BFloat16), conventional non-redundant representations of the inputs and residual offer some advantages. One example is the immediate and cost-free detection of the residual's sign, which is needed for the next digit selection; redundant numbers do not share this property. Nevertheless, digit selection as practiced in previous realizations, whether with redundant or non-redundant inputs and/or residual, is slow and rather complex. Therefore, in this paper, we offer an imprecise but faster digit selection scheme, with the required correction applied in the next cycle. Analytical evaluations and synthesis of the proposed circuits on an FPGA platform show a 30% speedup and lower cost with respect to both cases, i.e., redundant and non-redundant inputs and residual.
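To make the role of the residual and the signed-digit output concrete, the following is a minimal sketch of a textbook-style radix-2 most-significant-digit-first multiplication. It is not the imprecise selection scheme proposed in the paper: the selection here compares the exact residual against ±1/2. The online delay of 3, the digit set {-1, 0, 1}, the use of exact rational arithmetic, and all names (`online_multiply`, `select`, `sd_value`, `DELTA`) are assumptions made for illustration.

```python
from fractions import Fraction

DELTA = 3  # assumed online delay: cycles before the first useful output digit


def select(v):
    """Exact selection of a signed digit in {-1, 0, 1} by comparing v to +-1/2."""
    if v >= Fraction(1, 2):
        return 1
    if v <= Fraction(-1, 2):
        return -1
    return 0


def online_multiply(x_digits, y_digits):
    """Multiply two fractions given MSB first, one digit of each per cycle.

    x_digits[i] and y_digits[i] have weight 2**-(i+1) and may be plain bits
    {0, 1} or signed digits {-1, 0, 1}. Returns (digits, w): digits[i] is a
    signed product digit of weight 2**-i, and w is the final residual.
    """
    n = len(x_digits)
    assert len(y_digits) == n
    X = Y = w = Fraction(0)  # operand prefixes received so far, and residual
    digits = []
    for k in range(1, n + DELTA + 1):
        xk = x_digits[k - 1] if k <= n else 0  # feed zeros once inputs end
        yk = y_digits[k - 1] if k <= n else 0
        Y_new = Y + Fraction(yk, 2**k)
        # Shift the residual left and add the newly arrived partial products.
        v = 2 * w + (X * yk + Y_new * xk) / 2**DELTA
        p = select(v)
        w = v - p  # stays within [-1/2, 1/2] for this selection rule
        X += Fraction(xk, 2**k)
        Y = Y_new
        if k >= DELTA:  # earlier cycles always select 0, so nothing is emitted
            digits.append(p)
    return digits, w


def sd_value(digits):
    """Value of a signed-digit string whose i-th digit has weight 2**-i."""
    return sum(Fraction(d, 2**i) for i, d in enumerate(digits))


if __name__ == "__main__":
    x = [1, 0, 1, 1]  # 0.1011 in binary = 11/16
    y = [1, 1, 0, 1]  # 0.1101 in binary = 13/16
    digits, w = online_multiply(x, y)
    print(digits, w)
    # The emitted digits plus the scaled residual give the exact product.
    print(sd_value(digits) + w / 2**len(x) == Fraction(11, 16) * Fraction(13, 16))
```

Because the selection may overshoot and later compensate with a negative digit, the output has to be redundant, which is the constraint stated in the abstract. In hardware, the exact comparison in `select` is the costly part that faster selection schemes aim to simplify.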
Source journal
Microprocessors and Microsystems
Category: Engineering Technology - Engineering, Electrical & Electronic
CiteScore: 6.90
Self-citation rate: 3.80%
Articles published: 204
Review time: 172 days
Journal description: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects related to embedded systems hardware. This includes different embedded system hardware platforms, ranging from custom hardware via reconfigurable systems and application-specific processors to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC) and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems, including methodologies, techniques, flows and tools for their design, as well as novel designs of hardware components, falls within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus of this journal, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with emphasis on hardware, are also within the journal's scope.
Latest articles from this journal
Editorial Board
Algorithms for scheduling CNNs on multicore MCUs at the neuron and layer levels
Low-cost constant time signed digit selection for most significant bit first multiplication
An adaptive binary classifier for highly imbalanced datasets on the Edge
Quality-driven design of deep neural network hardware accelerators for low power CPS and IoT applications