Benchmarking DNN Mapping Methods for the in-Memory Computing Accelerators
Yimin Wang; Xuanyao Fong
IEEE Journal on Emerging and Selected Topics in Circuits and Systems, vol. 13, no. 4, pp. 1040-1051. Published 2023-10-31.
DOI: 10.1109/JETCAS.2023.3328864
URL: https://ieeexplore.ieee.org/document/10302283/
Impact factor: 3.7 (JCR Q2, Engineering, Electrical & Electronic)
Citation count: 0
Abstract
This paper presents a study of methods for mapping the convolutional workloads in deep neural networks (DNNs) onto the computing hardware of in-memory computing (IMC) architectures. Specifically, we focus on categorizing and benchmarking processing element (PE)-level mapping methods, which have not been investigated in detail for IMC-based architectures. First, we categorize the PE-level mapping methods from the loop-unrolling perspective and discuss their implications for input data reuse and output data reduction. Then, we propose a mapping-oriented architecture that accounts for the input and output datapaths under various mapping methods. The architecture is evaluated in a 45 nm technology and shows good area efficiency and scalability, providing a hardware substrate for further performance improvements via PE-level mappings. Furthermore, we present an evaluation framework that captures the architecture's behavior and enables extensive benchmarking of mapping methods under various neural network workloads, main memory bandwidths, and digital computing throughputs. The benchmarking results demonstrate significant tradeoffs in the design space and unlock new design possibilities. We present case studies that showcase the preferred mapping methods for the best energy consumption and/or execution time, and demonstrate that a hybrid-mapping scheme improves the minimum execution time by up to 30% on publicly available DNN benchmarks.
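To make the loop-unrolling view mentioned in the abstract more concrete, the following is a minimal Python sketch of one commonly described mapping, in which the input-channel, kernel-row, kernel-column, and output-channel loops of a convolution are unrolled spatially onto a single crossbar. It is an illustration under stated assumptions, not the paper's categorization or architecture: the function names, the stride-1/no-padding setting, and the idealized integer arithmetic are all hypothetical choices made for this example.

```python
def conv_loop_nest(x, w):
    """Reference convolution as a six-deep loop nest (stride 1, no padding).

    x: input  as nested lists [C][H][W]
    w: weights as nested lists [M][C][K][K]
    returns y as nested lists [M][H-K+1][W-K+1]
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M, K = len(w), len(w[0][0])
    P, Q = H - K + 1, W - K + 1
    y = [[[0] * Q for _ in range(P)] for _ in range(M)]
    for m in range(M):                      # output channels
        for p in range(P):                  # output rows
            for q in range(Q):              # output columns
                for c in range(C):          # input channels
                    for i in range(K):      # kernel rows
                        for j in range(K):  # kernel columns
                            y[m][p][q] += w[m][c][i][j] * x[c][p + i][q + j]
    return y


def conv_unrolled_crossbar(x, w):
    """Same convolution with the c, i, j and m loops unrolled in space.

    Assumption for illustration: one idealized crossbar with C*K*K rows
    (one per unrolled input element) and M columns (one per output channel).
    Each output pixel then costs one matrix-vector multiply (MVM).
    Returns (y, crossbar_shape, mvm_count).
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    M, K = len(w), len(w[0][0])
    P, Q = H - K + 1, W - K + 1
    # Weight matrix stored in the crossbar: row index (c, i, j), column index m.
    G = [[w[m][c][i][j] for m in range(M)]
         for c in range(C) for i in range(K) for j in range(K)]
    rows = len(G)
    y = [[[0] * Q for _ in range(P)] for _ in range(M)]
    mvm_count = 0
    for p in range(P):          # only the output-pixel loops remain temporal
        for q in range(Q):
            # Input vector driven onto the rows, same (c, i, j) order as G.
            v = [x[c][p + i][q + j]
                 for c in range(C) for i in range(K) for j in range(K)]
            # One MVM: every column reduces its products along the rows.
            for m in range(M):
                y[m][p][q] = sum(G[r][m] * v[r] for r in range(rows))
            mvm_count += 1
    return y, (rows, M), mvm_count
```

In this sketch the output reduction over `c`, `i`, and `j` happens entirely inside one crossbar column, while neighboring output pixels fetch overlapping input windows, so input reuse depends on how the remaining `p` and `q` loops are scheduled. Other unrolling choices would shift that balance between input data reuse and output data reduction.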
About the journal:
The IEEE Journal on Emerging and Selected Topics in Circuits and Systems is published quarterly and solicits, with particular emphasis on emerging areas, special issues on topics that cover the entire scope of the IEEE Circuits and Systems (CAS) Society, namely the theory, analysis, design, tools, and implementation of circuits and systems, spanning their theoretical foundations, applications, and architectures for signal and information processing.