{"title":"ASIC Design of Nanoscale Artificial Neural Networks for Inference/Training by Floating-Point Arithmetic","authors":"Farzad Niknia;Ziheng Wang;Shanshan Liu;Pedro Reviriego;Ahmed Louri;Fabrizio Lombardi","doi":"10.1109/TNANO.2024.3367916","DOIUrl":null,"url":null,"abstract":"Inference and on-chip training of Artificial Neural Networks (ANNs) are challenging computational processes for large datasets; hardware implementations are needed to accelerate this computation, while meeting metrics such as operating frequency, power dissipation and accuracy. In this article, a high-performance ASIC-based design is proposed to implement both forward and backward propagations of multi-layer perceptrons (MLPs) at the nanoscales. To attain a higher accuracy, floating-point arithmetic units for a multiply-and-accumulate (MAC) array are employed in the proposed design; moreover, a hybrid implementation scheme is utilized to achieve flexibility (for networks of different size) and comprehensively low hardware overhead. The proposed design is fully pipelined, and its performance is independent of network size, except for the number of cycles and latency. The efficiency of the proposed nanoscale MLP-based design for inference (as taking place over multiple steps) and training (due to the complex processing in backward propagation by eliminating many redundant calculations) is analyzed. Moreover, the impact of different floating-point precision formats on the final accuracy and hardware metrics under the same design constraints is studied. A comparative evaluation of the proposed MLP design for different datasets and floating-point precision formats is provided. Results show that compared to current schemes found in the technical literatures, the proposed design has the best operating frequency and accuracy with still good latency and energy dissipation.","PeriodicalId":449,"journal":{"name":"IEEE Transactions on Nanotechnology","volume":"23 ","pages":"208-216"},"PeriodicalIF":2.1000,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Nanotechnology","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10440496/","RegionNum":4,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0
Abstract
Inference and on-chip training of Artificial Neural Networks (ANNs) are challenging computational processes for large datasets; hardware implementations are needed to accelerate this computation while meeting metrics such as operating frequency, power dissipation, and accuracy. In this article, a high-performance ASIC-based design is proposed to implement both the forward and backward propagation of multi-layer perceptrons (MLPs) at the nanoscale. To attain a higher accuracy, floating-point arithmetic units are employed for the multiply-and-accumulate (MAC) array in the proposed design; moreover, a hybrid implementation scheme is utilized to achieve flexibility (for networks of different sizes) together with a low overall hardware overhead. The proposed design is fully pipelined, and its performance is independent of network size except for the number of cycles and the latency. The efficiency of the proposed nanoscale MLP-based design is analyzed for both inference (which takes place over multiple steps) and training (whose complex backward-propagation processing is reduced by eliminating many redundant calculations). Moreover, the impact of different floating-point precision formats on the final accuracy and the hardware metrics under the same design constraints is studied. A comparative evaluation of the proposed MLP design for different datasets and floating-point precision formats is provided. Results show that, compared to current schemes in the technical literature, the proposed design achieves the best operating frequency and accuracy while maintaining good latency and energy dissipation.
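To make the MAC-based forward/backward propagation described in the abstract more concrete, the sketch below expresses a single MLP layer as explicit multiply-and-accumulate operations whose floating-point format is selectable. This is a minimal software illustration under assumed choices (NumPy dtypes for the precision formats, a ReLU activation, and the hypothetical helper names mac_forward and mac_backward); it is not the paper's ASIC implementation, which uses dedicated floating-point MAC hardware.

```python
# Illustrative sketch (not the paper's ASIC design): a single MLP layer's
# forward and backward passes written as multiply-and-accumulate (MAC)
# operations, with the floating-point format chosen via a NumPy dtype.
import numpy as np

def mac_forward(x, W, b, dtype=np.float32):
    """Forward pass: each output neuron accumulates x[i] * W[i, :] products."""
    x, W, b = x.astype(dtype), W.astype(dtype), b.astype(dtype)
    acc = b.copy()
    for i in range(W.shape[0]):           # one MAC per input, as a MAC array would perform
        acc += x[i] * W[i]
    return np.maximum(acc, 0), acc        # ReLU output and pre-activation

def mac_backward(x, W, pre_act, grad_out, lr=0.01, dtype=np.float32):
    """Backward pass: propagate the error and update the weights with MACs."""
    grad_pre = grad_out.astype(dtype) * (pre_act > 0).astype(dtype)  # ReLU derivative
    grad_W = np.outer(x, grad_pre).astype(dtype)                     # weight gradients
    grad_x = (W.astype(dtype) @ grad_pre)                            # error for the previous layer
    return W - lr * grad_W, grad_x

# Example: compare two precision formats on the same random data.
rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W = rng.standard_normal((8, 4))
b = np.zeros(4)
for fmt in (np.float16, np.float32):
    y, _ = mac_forward(x, W, b, dtype=fmt)
    print(fmt.__name__, y)
```

Running the two loop iterations shows how a narrower format such as float16 perturbs the accumulated sums relative to float32, which is the kind of accuracy/hardware trade-off the article studies across precision formats.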
Journal Description
The IEEE Transactions on Nanotechnology is devoted to the publication of manuscripts of archival value in the general area of nanotechnology, which is rapidly emerging as one of the fastest growing and most promising new technological developments for the next generation and beyond.