Title: Utilizing MRAMs With Low Resistance and Limited Dynamic Range for Efficient MAC Accelerator
Authors: Sateesh; Kaustubh Chakarwar; Shubham Sahay
Journal: IEEE Open Journal of Nanotechnology, vol. 5, pp. 141-148 (published 2024-11-18)
Journal metrics: impact factor 1.8; JCR Q3 (Materials Science, Multidisciplinary)
DOI: 10.1109/OJNANO.2024.3501293
URL: https://ieeexplore.ieee.org/document/10756528/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10756528
Citation count: 0
Abstract
Recent advances in data mining, machine learning algorithms, and cognitive systems have necessitated the development of neuromorphic processing engines that may enable resource- and computation-intensive applications on Internet-of-Things (IoT) edge devices with unprecedented energy efficiency. Spintronics-based magnetic memory devices can emulate synaptic behavior efficiently and are regarded as one of the most promising candidates for realizing compact, ultra-energy-efficient neural network accelerators. Although ultra-dense magnetic memories with multi-bit (multi-level cell, MLC) capability were proposed recently, their application in hybrid CMOS/non-volatile-memory accelerators is limited by their low dynamic range (memory window) and high cell currents (ON/OFF-state resistances in the ∼kΩ range). In this work, we propose a novel supercell that enables the use of MLC MRAMs in neuromorphic multiply-accumulate (MAC) accelerators. As a proof-of-concept demonstration, we exploit an MLC MRAM based on c-MTJ to realize a highly scalable 2-FinFET-1-MRAM supercell with a large dynamic range, low supercell currents, and high endurance. Furthermore, we perform a comprehensive design exploration of a time-domain MAC accelerator that uses the proposed supercell. Our detailed analysis, using the ASAP7 PDK from ARM for the FinFETs and an experimentally calibrated compact model for the c-MTJ-based MRAM, indicates that an energy efficiency of 87.4 TOPS/W and a throughput of 2.5 TOPS are achievable for a 200×200 MAC operation with 4-bit precision.
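To put the two headline figures in perspective, the following minimal Python sketch derives a few back-of-envelope quantities from them. It is not taken from the paper. It assumes that the 87.4 TOPS/W and 2.5 TOPS figures refer to the same operating point, and that one MAC counts as two operations (one multiply plus one add). The function names and the `OPS_PER_MAC` constant are hypothetical, introduced only for illustration.

```python
# Back-of-envelope quantities implied by the abstract's headline numbers.
# Assumption: efficiency and throughput describe the same operating point.

EFFICIENCY_TOPS_PER_W = 87.4  # energy efficiency quoted in the abstract
THROUGHPUT_TOPS = 2.5         # throughput quoted in the abstract
ROWS, COLS = 200, 200         # MAC array size quoted in the abstract
OPS_PER_MAC = 2               # assumption: multiply + add counted as 2 ops


def implied_power_w(throughput_tops, efficiency_tops_per_w):
    """Average power in watts: (TOPS) / (TOPS per watt)."""
    return throughput_tops / efficiency_tops_per_w


def energy_per_op_j(efficiency_tops_per_w):
    """Energy per operation in joules: 1 / (operations per joule)."""
    return 1.0 / (efficiency_tops_per_w * 1e12)


def array_pass_latency_s(rows, cols, ops_per_mac, throughput_tops):
    """Time for one full rows x cols MAC pass at the quoted throughput."""
    total_ops = rows * cols * ops_per_mac
    return total_ops / (throughput_tops * 1e12)


if __name__ == "__main__":
    power = implied_power_w(THROUGHPUT_TOPS, EFFICIENCY_TOPS_PER_W)
    energy = energy_per_op_j(EFFICIENCY_TOPS_PER_W)
    latency = array_pass_latency_s(ROWS, COLS, OPS_PER_MAC, THROUGHPUT_TOPS)
    print(f"implied power:        {power * 1e3:.1f} mW")
    print(f"energy per operation: {energy * 1e15:.2f} fJ")
    print(f"200x200 pass latency: {latency * 1e9:.1f} ns")
```

Under these assumptions the quoted figures imply an average power of roughly 28.6 mW and about 11.4 fJ per operation. The latency estimate depends directly on the assumed operation count per MAC, so it should be read as indicative only.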