{"title":"3DNN-Xplorer:一个用于异构三维DNN加速器设计空间探索的机器学习框架","authors":"Gauthaman Murali;Min Gyu Park;Sung Kyu Lim","doi":"10.1109/TVLSI.2024.3471496","DOIUrl":null,"url":null,"abstract":"This article presents 3DNN-Xplorer, the first machine learning (ML)-based framework for predicting the performance of heterogeneous 3-D deep neural network (DNN) accelerators. Our ML framework facilitates the design space exploration (DSE) of heterogeneous 3-D accelerators with a two-tier compute-on-memory (CoM) configuration, considering 3-D physical design factors. Our design space encompasses four distinct heterogeneous 3-D integration styles, combining 28- and 16-nm technology nodes for both compute and memory tiers. Using extrapolation techniques with ML models trained on 10-to-256 processing element (PE) accelerator configurations, we estimate the performance of systems featuring 75–16384 PEs, achieving a maximum absolute error of 13.9% (the number of PEs is not continuous and varies based on the accelerator architecture). To ensure balanced tier areas in the design, our framework assumes the same number of PEs or on-chip memory capacity across the four integration styles, accounting for area imbalance resulting from different technology nodes. Our analysis reveals that the heterogeneous 3-D style with 28-nm compute and 16-nm memory is energy-efficient and offers notable energy savings of up to 50% and an 8.8% reduction in runtime compared to other 3-D integration styles with the same number of PEs. Similarly, the heterogeneous 3-D style with 16-nm compute and 28-nm memory is area-efficient and shows up to 8.3% runtime reduction compared to other 3-D styles with the same on-chip memory capacity.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"33 2","pages":"358-370"},"PeriodicalIF":2.8000,"publicationDate":"2024-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"3DNN-Xplorer: A Machine Learning Framework for Design Space Exploration of Heterogeneous 3-D DNN Accelerators\",\"authors\":\"Gauthaman Murali;Min Gyu Park;Sung Kyu Lim\",\"doi\":\"10.1109/TVLSI.2024.3471496\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This article presents 3DNN-Xplorer, the first machine learning (ML)-based framework for predicting the performance of heterogeneous 3-D deep neural network (DNN) accelerators. Our ML framework facilitates the design space exploration (DSE) of heterogeneous 3-D accelerators with a two-tier compute-on-memory (CoM) configuration, considering 3-D physical design factors. Our design space encompasses four distinct heterogeneous 3-D integration styles, combining 28- and 16-nm technology nodes for both compute and memory tiers. Using extrapolation techniques with ML models trained on 10-to-256 processing element (PE) accelerator configurations, we estimate the performance of systems featuring 75–16384 PEs, achieving a maximum absolute error of 13.9% (the number of PEs is not continuous and varies based on the accelerator architecture). To ensure balanced tier areas in the design, our framework assumes the same number of PEs or on-chip memory capacity across the four integration styles, accounting for area imbalance resulting from different technology nodes. Our analysis reveals that the heterogeneous 3-D style with 28-nm compute and 16-nm memory is energy-efficient and offers notable energy savings of up to 50% and an 8.8% reduction in runtime compared to other 3-D integration styles with the same number of PEs. Similarly, the heterogeneous 3-D style with 16-nm compute and 28-nm memory is area-efficient and shows up to 8.3% runtime reduction compared to other 3-D styles with the same on-chip memory capacity.\",\"PeriodicalId\":13425,\"journal\":{\"name\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"volume\":\"33 2\",\"pages\":\"358-370\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-10-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10715720/\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10715720/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
3DNN-Xplorer: A Machine Learning Framework for Design Space Exploration of Heterogeneous 3-D DNN Accelerators
This article presents 3DNN-Xplorer, the first machine learning (ML)-based framework for predicting the performance of heterogeneous 3-D deep neural network (DNN) accelerators. Our ML framework facilitates the design space exploration (DSE) of heterogeneous 3-D accelerators with a two-tier compute-on-memory (CoM) configuration, considering 3-D physical design factors. Our design space encompasses four distinct heterogeneous 3-D integration styles, combining 28- and 16-nm technology nodes for both compute and memory tiers. Using extrapolation techniques with ML models trained on 10-to-256 processing element (PE) accelerator configurations, we estimate the performance of systems featuring 75–16384 PEs, achieving a maximum absolute error of 13.9% (the number of PEs is not continuous and varies based on the accelerator architecture). To ensure balanced tier areas in the design, our framework assumes the same number of PEs or on-chip memory capacity across the four integration styles, accounting for area imbalance resulting from different technology nodes. Our analysis reveals that the heterogeneous 3-D style with 28-nm compute and 16-nm memory is energy-efficient and offers notable energy savings of up to 50% and an 8.8% reduction in runtime compared to other 3-D integration styles with the same number of PEs. Similarly, the heterogeneous 3-D style with 16-nm compute and 28-nm memory is area-efficient and shows up to 8.3% runtime reduction compared to other 3-D styles with the same on-chip memory capacity.
期刊介绍:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems have been founded. The editorial board, consisting of international experts, invites original papers which emphasize and merit the novel systems integration aspects of microelectronic systems including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.