HuNT: Exploiting Heterogeneous PIM Devices to Design a 3-D Manycore Architecture for DNN Training

Impact factor: 2.7 · Zone 3 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture) · IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems · Pub Date: 2024-11-06 · DOI: 10.1109/TCAD.2024.3444708
Chukwufumnanya Ogbogu;Gaurav Narang;Biresh Kumar Joardar;Janardhan Rao Doppa;Krishnendu Chakrabarty;Partha Pratim Pande
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 11, pp. 3300-3311, 2024. https://ieeexplore.ieee.org/document/10745791/
Citations: 0

Abstract

Processing-in-memory (PIM) architectures have emerged as an attractive computing paradigm for accelerating deep neural network (DNN) training and inference. However, a plethora of PIM devices exists, e.g., resistive random-access memory, ferroelectric field-effect transistor, phase-change memory, MRAM, and static random-access memory, and each of these devices offers advantages and drawbacks in terms of power, latency, area, and nonidealities. A heterogeneous architecture that combines the benefits of multiple devices in a single platform can enable energy-efficient and high-performance DNN training and inference. 3-D integration enables the design of such a heterogeneous architecture, where multiple planar tiers consisting of different PIM devices can be integrated into a single platform. In this work, we propose the HuNT framework, which hunts for (finds) an optimal DNN neural layer mapping and planar tier configurations for a 3-D heterogeneous architecture. Overall, our experimental results demonstrate that the HuNT-enabled 3-D heterogeneous architecture achieves up to $10\times$ and $3.5\times$ improvement with respect to the homogeneous and existing heterogeneous PIM-based architectures, respectively, in terms of energy efficiency (TOPS/W). Similarly, the proposed HuNT-enabled architecture outperforms existing homogeneous and heterogeneous architectures by up to $8\times$ and $2.4\times$, respectively, in terms of compute efficiency (TOPS/mm$^2$) without compromising the final DNN accuracy.
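To make the idea of searching for a layer-to-device mapping concrete, the following is a minimal sketch, not the HuNT algorithm itself. It assumes two hypothetical device types ("A" and "B"), made-up per-operation energy and area costs, and three made-up layers, and it simply enumerates every mapping to pick the one with the highest TOPS/W that fits an area budget. All names and numbers are illustrative placeholders and are not taken from the paper.

```python
from itertools import product

# Hypothetical device parameters (illustrative placeholders, not values from the paper).
DEVICES = {
    "A": {"pj_per_op": 0.5, "mm2_per_gop": 0.4},  # lower energy, larger area
    "B": {"pj_per_op": 1.0, "mm2_per_gop": 0.1},  # higher energy, smaller area
}

# Hypothetical DNN layers as (name, operation count).
LAYERS = [("conv1", 2e9), ("conv2", 4e9), ("fc", 1e9)]


def energy_joules(mapping):
    """Total energy when layer i runs on device mapping[i]."""
    return sum(ops * DEVICES[dev]["pj_per_op"] * 1e-12
               for (_, ops), dev in zip(LAYERS, mapping))


def area_mm2(mapping):
    """Total area under the assumed area-per-giga-operation cost model."""
    return sum(ops / 1e9 * DEVICES[dev]["mm2_per_gop"]
               for (_, ops), dev in zip(LAYERS, mapping))


def tops_per_watt(mapping):
    """Operations per joule equals (operations per second) per watt."""
    total_ops = sum(ops for _, ops in LAYERS)
    return total_ops / energy_joules(mapping) / 1e12


def best_mapping(area_budget_mm2):
    """Exhaustively search all mappings; raises ValueError if none fits the budget."""
    candidates = product(DEVICES, repeat=len(LAYERS))
    feasible = [m for m in candidates if area_mm2(m) <= area_budget_mm2]
    return max(feasible, key=tops_per_watt)


if __name__ == "__main__":
    mapping = best_mapping(area_budget_mm2=2.0)
    for (name, _), dev in zip(LAYERS, mapping):
        print(f"{name} -> device {dev}")
    print(f"TOPS/W: {tops_per_watt(mapping):.2f}")
```

Exhaustive enumeration grows exponentially with the number of layers, so it is only practical for toy cases like this one; the abstract does not state which search method HuNT uses.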
Source journal
CiteScore: 5.60
Self-citation rate: 13.80%
Articles published: 500
Review time: 7 months
Journal description: The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.