BSTCIM: A Balanced Symmetry Ternary Fully Digital In-MRAM Computing Macro for Energy Efficiency Neural Network
Authors: Zhongzhen Tong; Chenghang Li; Chao Wang; Suteng Zhao; Qianyong Peng; Zhenyu Yan; Siqi Zhang; Daming Zhou; Zhaohao Wang; Xiaoyang Lin; Weisheng Zhao
Journal: IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 71, no. 12, pp. 6114-6127 (published 2024-08-09)
DOI: 10.1109/TCSI.2024.3438553
URL: https://ieeexplore.ieee.org/document/10632213/
Silicon-based traditional binary computing in-memory (TBCIM) architectures are approaching their energy-efficiency and throughput limits owing to the challenges facing Moore’s Law. It is therefore essential to explore architectures based on novel devices and computing paradigms to serve data-centric applications such as artificial intelligence. In this paper, we propose a balanced symmetry ternary (BST) fully digital in-MRAM computing macro (BSTCIM) that uses hybrid voltage-gated spin-orbit torque magnetic tunnel junction (VGSOT-MTJ) and gate-all-around carbon nanotube field-effect transistor (GAA-CNTFET) technology. The overall computation is based on the highest-efficiency multi-bit ternary system. BSTCIM includes a ternary dot product (TDP) unit built from 4 GAA-CNTFETs and 2 VGSOT-MTJs, which performs the TDP operation without complex logic circuits. The multi-bit ternary multiply-and-accumulate (MAC) operation is realized through the proposed ternary adder tree and ternary post adder, which accumulate TDP results within the digital domain and so enable high-accuracy neural network inference. Furthermore, owing to the advantages of BST, ternary signed MAC is easier to perform than in TBCIM macros, which adopt 2’s complement or separate sign-bit calculations. A 288 kb BSTCIM is simulated with 8b-IN, 8b-W, and 20b-OUT; it achieves a throughput of 0.72 TOPS and an energy efficiency of 54.5 TOPS/W at a 0.6 V supply voltage, and 1.15 TOPS and 33.7 TOPS/W at a 0.8 V supply voltage. Moreover, the figure-of-merit of BSTCIM is 1.13–33.6 times higher than that of existing CIM macros.
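The abstract's point that signed MAC is simpler in balanced ternary can be illustrated in software. The sketch below is only an arithmetic illustration, not the paper's circuit: the function names (`to_bst`, `from_bst`, `negate`, `bst_mac`) are hypothetical, and it assumes the usual balanced ternary definition in which each digit (trit) is -1, 0, or +1. Under that assumption the product of two trits is again a single trit, and negating a number is a digit-wise sign flip, so no 2's complement handling or separate sign bit is needed.

```python
from typing import List


def to_bst(n: int) -> List[int]:
    """Encode an integer as balanced ternary trits, least significant first."""
    if n == 0:
        return [0]
    trits = []
    while n != 0:
        r = n % 3          # Python's % yields 0, 1, or 2 even for negative n
        if r == 2:
            r = -1         # represent 2 as (3 - 1): trit -1 plus a carry
        trits.append(r)
        n = (n - r) // 3   # exact division, since n - r is a multiple of 3
    return trits


def from_bst(trits: List[int]) -> int:
    """Decode balanced ternary trits (least significant first) to an integer."""
    return sum(t * 3 ** i for i, t in enumerate(trits))


def negate(trits: List[int]) -> List[int]:
    """Negate a balanced ternary number by flipping the sign of every trit."""
    return [-t for t in trits]


def bst_mac(inputs: List[int], weights: List[int]) -> int:
    """Multiply-accumulate built only from trit-by-trit products and adds."""
    acc = 0
    for x, w in zip(inputs, weights):
        for i, a in enumerate(to_bst(x)):
            for j, b in enumerate(to_bst(w)):
                # a * b is itself in {-1, 0, 1}; 3 ** (i + j) is its weight
                acc += (a * b) * 3 ** (i + j)
    return acc


if __name__ == "__main__":
    print(to_bst(5), to_bst(-5))
    print(bst_mac([3, -4, 7], [-2, 5, 1]))
```

Positive and negative operands go through exactly the same code path here, which is the property the abstract contrasts with binary macros; how the hardware maps trits onto MTJ states and the adder tree is not specified in this text and is not modeled.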
About the journal:
TCAS I publishes regular papers in the field specified by the theory, analysis, design, and practical implementations of circuits, and the application of circuit techniques to systems and to signal processing. Included is the whole spectrum from basic scientific theory to industrial applications. The field of interest covered includes:
- Circuits: Analog, Digital and Mixed Signal Circuits and Systems
- Nonlinear Circuits and Systems, Integrated Sensors, MEMS and Systems on Chip, Nanoscale Circuits and Systems, Optoelectronic Circuits and Systems, Power Electronics and Systems
- Software for Analog-and-Logic Circuits and Systems
- Control aspects of Circuits and Systems