Benchmarking DNN Mapping Methods for the in-Memory Computing Accelerators

IF 3.7 · Zone 2, Engineering & Technology · Q2 ENGINEERING, ELECTRICAL & ELECTRONIC · IEEE Journal on Emerging and Selected Topics in Circuits and Systems · Pub Date: 2023-10-31 · DOI: 10.1109/JETCAS.2023.3328864
Yimin Wang; Xuanyao Fong
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 4, pp. 1040-1051. https://ieeexplore.ieee.org/document/10302283/
Citations: 0

Abstract

This paper studies methods for mapping the convolutional workloads of deep neural networks (DNNs) onto the computing hardware of in-memory computing (IMC) architectures. Specifically, we focus on categorizing and benchmarking processing element (PE)-level mapping methods, which have not been investigated in detail for IMC-based architectures. First, we categorize the PE-level mapping methods from the loop-unrolling perspective and discuss their implications for input data reuse and output data reduction. Then, we propose a mapping-oriented architecture that accounts for the input and output datapaths under the various mapping methods. The architecture is evaluated in a 45 nm technology and shows good area efficiency and scalability, providing a hardware substrate for further performance improvements via PE-level mappings. Furthermore, we present an evaluation framework that captures the architecture's behavior and enables extensive benchmarking of mapping methods under various neural network workloads, main memory bandwidths, and digital computing throughputs. The benchmarking results demonstrate significant tradeoffs in the design space and unlock new design possibilities. We present case studies that showcase the preferred mapping methods for the best energy consumption and/or execution time, and demonstrate that a hybrid-mapping scheme improves the minimum execution time by up to 30% on publicly available DNN benchmarks.
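To make the loop-unrolling view a little more concrete, the following is a minimal sketch of how two ways of unrolling a convolution onto IMC arrays could change the array count and the amount of digital partial-sum reduction. It is not the paper's taxonomy, architecture, or evaluation framework. Everything here is an assumption for illustration: the `ConvLayer` type, the function names, a crossbar of `rows` x `cols` cells holding one weight per cell, analog summation along each column, and digital addition of partial sums across arrays that share an output.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConvLayer:
    """Shape of one convolutional layer (hypothetical helper type)."""
    out_channels: int  # K
    in_channels: int   # C
    kernel_h: int      # R
    kernel_w: int      # S


def row_tiles_full_unroll(layer: ConvLayer, rows: int) -> int:
    """Assumed option A: unroll C, R and S together along the array rows."""
    depth = layer.in_channels * layer.kernel_h * layer.kernel_w
    return math.ceil(depth / rows)


def row_tiles_channel_unroll(layer: ConvLayer, rows: int) -> int:
    """Assumed option B: unroll only C along the rows; every kernel
    position (r, s) is placed in its own group of arrays."""
    positions = layer.kernel_h * layer.kernel_w
    return positions * math.ceil(layer.in_channels / rows)


def mapping_cost(layer: ConvLayer, rows: int, cols: int, row_tiles: int) -> dict:
    """Toy cost model: arrays used, digital adds per output, cell utilization."""
    col_tiles = math.ceil(layer.out_channels / cols)  # K unrolled along columns
    arrays = row_tiles * col_tiles
    weights = (layer.out_channels * layer.in_channels
               * layer.kernel_h * layer.kernel_w)
    return {
        "arrays": arrays,
        # Partial sums from different row tiles are merged outside the array.
        "digital_adds_per_output": row_tiles - 1,
        "utilization": weights / (arrays * rows * cols),
    }


layer = ConvLayer(out_channels=64, in_channels=64, kernel_h=3, kernel_w=3)
full = mapping_cost(layer, 128, 128, row_tiles_full_unroll(layer, 128))
per_pos = mapping_cost(layer, 128, 128, row_tiles_channel_unroll(layer, 128))
print("full unroll:   ", full)
print("channel unroll:", per_pos)
```

In this toy model, the full unroll needs fewer arrays and fewer digital additions for the example layer, while the per-position placement leaves more cells idle and shifts more of the output reduction to digital logic. Whether either choice is preferable in practice depends on factors this sketch does not model, such as input data movement, memory bandwidth, and digital throughput.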
Source journal
CiteScore: 8.50
Self-citation rate: 2.20%
Articles published: 86
Journal description: The IEEE Journal on Emerging and Selected Topics in Circuits and Systems is published quarterly and solicits, with particular emphasis on emerging areas, special issues on topics that cover the entire scope of the IEEE Circuits and Systems (CAS) Society, namely the theory, analysis, design, tools, and implementation of circuits and systems, spanning their theoretical foundations, applications, and architectures for signal and information processing.