ABS: Accumulation Bit-Width Scaling Method for Designing Low-Precision Tensor Core

Impact Factor 2.8 | Zone 2 (Engineering & Technology) | JCR Q2 (Computer Science, Hardware & Architecture) | IEEE Transactions on Very Large Scale Integration (VLSI) Systems | Published: 2024-06-25 | DOI: 10.1109/TVLSI.2024.3414260
Yasong Cao;Mei Wen;Zhongdi Luo;Xin Ju;Haolan Huang;Junzhong Shen;Haiyan Chen
{"title":"ABS: Accumulation Bit-Width Scaling Method for Designing Low-Precision Tensor Core","authors":"Yasong Cao;Mei Wen;Zhongdi Luo;Xin Ju;Haolan Huang;Junzhong Shen;Haiyan Chen","doi":"10.1109/TVLSI.2024.3414260","DOIUrl":null,"url":null,"abstract":"A big gap exists between deep neural network (DNN) applications’ computational demand and the computing power of DNN accelerators. Low-precision floating-point (LP-FP) computation is one of the important means to improve the performance of DNN training and inference. However, the high-precision accumulators are typically applied to summating the dot products during general matrix multiplication (GEMM) in tensor cores (TCs). As the precision of data decreases, the accumulator becomes the main consumer of multiply-accumulate’s (MAC’s) area and power. Reducing the accumulators’ bit-width is of significant importance for improving the area- and energy-efficiency of TCs. There are two main challenges: 1) theoretical support on the floating-point (FP) formats with the lowest bit-width of TC’s accumulators and 2) how to integrate the LP-FP TC in the framework of DNN training and inference to evaluate its benefits. In this article, we propose accumulation bit-width scaling (ABS), a novel ABS method, to guide the design of LP-FP TCs. We 1) implement this method by constructing a novel variance retention ratio (VRR) model to predict the FP format with the minimum bit-width for TC’s accumulator; 2) provide a generator of DNN accelerator based on a systolic-array (SA) TC, supporting many low-precision configurations; and 3) design an LP-FP DNN executing framework that supports software-simulation mode and hardware-accelerator mode to run LP-FP DNN tasks. The experimental results show that the LP-FP TC guided by our ABS method has a maximum reduction of 76.47% and 75.60% in area and power consumption, respectively, compared with the advanced TCs.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":null,"pages":null},"PeriodicalIF":2.8000,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10571370/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

A large gap exists between the computational demand of deep neural network (DNN) applications and the computing power of DNN accelerators. Low-precision floating-point (LP-FP) computation is an important means of improving the performance of DNN training and inference. However, high-precision accumulators are typically used to sum the dot products during general matrix multiplication (GEMM) in tensor cores (TCs). As data precision decreases, the accumulator becomes the main consumer of the multiply-accumulate (MAC) unit's area and power, so reducing the accumulator's bit-width is of great importance for improving the area and energy efficiency of TCs. There are two main challenges: 1) providing theoretical support for the lowest-bit-width floating-point (FP) formats for TC accumulators and 2) integrating the LP-FP TC into a DNN training and inference framework to evaluate its benefits. In this article, we propose accumulation bit-width scaling (ABS), a novel method that guides the design of LP-FP TCs. We 1) implement this method by constructing a novel variance retention ratio (VRR) model that predicts the minimum-bit-width FP format for the TC's accumulator; 2) provide a DNN accelerator generator based on a systolic-array (SA) TC that supports many low-precision configurations; and 3) design an LP-FP DNN execution framework with a software-simulation mode and a hardware-accelerator mode for running LP-FP DNN tasks. The experimental results show that an LP-FP TC guided by our ABS method reduces area and power consumption by up to 76.47% and 75.60%, respectively, compared with state-of-the-art TCs.
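The paper's VRR model and accelerator generator are not reproduced here, but the core intuition behind accumulation bit-width scaling can be illustrated with a small numerical experiment: round the running partial sum of a dot product to a reduced-mantissa FP format after every MAC and observe how the error grows as the accumulator narrows. The sketch below is an illustration only; the helper names (round_to_mantissa, lp_dot), the mantissa widths tested, and the random test data are hypothetical choices and are not taken from the paper.

```python
# Minimal sketch (illustration only, not the paper's ABS/VRR implementation):
# simulate a dot-product accumulator whose partial sum is rounded to a
# reduced-mantissa floating-point format after every multiply-accumulate,
# to show how the accumulator's bit-width affects GEMM accuracy.
import numpy as np

def round_to_mantissa(x: float, mant_bits: int) -> float:
    """Round x to the nearest value with mant_bits explicit mantissa bits
    (plus the implicit leading 1), keeping the fp64 exponent range."""
    if x == 0.0 or not np.isfinite(x):
        return x
    m, e = np.frexp(x)                      # x = m * 2**e with m in [0.5, 1)
    scale = 2.0 ** (mant_bits + 1)          # keep mant_bits + 1 significant bits
    return float(np.ldexp(np.round(m * scale) / scale, e))

def lp_dot(a: np.ndarray, b: np.ndarray, acc_mant_bits: int) -> float:
    """Dot product with exact fp64 products but a running accumulator that is
    rounded to acc_mant_bits mantissa bits after each MAC."""
    acc = 0.0
    for ai, bi in zip(a, b):
        acc = round_to_mantissa(acc + float(ai) * float(bi), acc_mant_bits)
    return acc

rng = np.random.default_rng(0)
n = 4096                                    # inner (accumulation) dimension of a GEMM
a = rng.standard_normal(n)
b = rng.standard_normal(n)

ref = float(a @ b)                          # fp64 reference accumulation
for mant_bits in (23, 10, 7, 5):            # fp32-like mantissa down to very narrow ones
    approx = lp_dot(a, b, mant_bits)
    rel_err = abs(approx - ref) / max(abs(ref), 1e-30)
    print(f"accumulator mantissa bits = {mant_bits:2d}, relative error = {rel_err:.3e}")
```

Lengthening the accumulation dimension or narrowing the accumulator's mantissa makes the relative error grow, which is the trade-off the paper's VRR model addresses when predicting a minimum accumulator format for a given workload.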
Source journal
CiteScore: 6.40
Self-citation rate: 7.10%
Number of articles: 187
Review time: 3.6 months
Journal description: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems have been founded. The editorial board, consisting of international experts, invites original papers which emphasize and merit the novel systems integration aspects of microelectronic systems including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.
Latest articles in this journal:
Sophon: A Time-Repeatable and Low-Latency Architecture for Embedded Real-Time Systems Based on RISC-V
CR-DRAM: Improving DRAM Refresh Energy Efficiency With Inter-Subarray Charge Recycling
A Novel TriNet Architecture for Enhanced Analog IC Design Automation
A Two-Channel Interleaved ADC With Fast-Converging Foreground Time Calibration and Comparison-Based Control Logic
A Post-Bond ILV Test Method in Monolithic 3-D ICs