Duy-Thanh Nguyen;Abhiroop Bhattacharjee;Abhishek Moitra;Priyadarshini Panda
{"title":"MCAIMem:用于面积和能效高的片上人工智能存储器的混合 SRAM 和 eDRAM 单元","authors":"Duy-Thanh Nguyen;Abhiroop Bhattacharjee;Abhishek Moitra;Priyadarshini Panda","doi":"10.1109/TVLSI.2024.3439231","DOIUrl":null,"url":null,"abstract":"AI chips commonly employ SRAM memory as buffers for their reliability and speed, which contribute to high performance. However, SRAM is expensive and demands significant area and energy consumption. Previous studies have explored replacing SRAM with emerging technologies, such as nonvolatile memory, which offers fast read memory access and a small cell area. Despite these advantages, nonvolatile memory’s slow write memory access and high write energy consumption prevent it from surpassing SRAM performance in AI applications with extensive memory access requirements. Some research has also investigated embedded dynamic random access memory (eDRAM) as an area-efficient on-chip memory with similar access times as SRAM. Still, refresh power remains a concern, leaving the trade-off among performance, area, and power consumption unresolved. To address this issue, this article presents a novel mixed CMOS cell memory design that balances performance, area, and energy efficiency for AI memory by combining SRAM and eDRAM cells. We consider the proportion ratio of one SRAM and seven eDRAM cells in the memory to achieve area reduction using mixed CMOS cell memory. In addition, we capitalize on the characteristics of deep neural network (DNN) data representation and integrate asymmetric eDRAM cells to lower energy consumption. To validate our proposed MCAIMem solution, we conduct extensive simulations and benchmarking against traditional SRAM. Our results demonstrate that the MCAIMem significantly outperforms these alternatives in terms of area and energy efficiency. Specifically, our MCAIMem can reduce the area by 48% and energy consumption by \n<inline-formula> <tex-math>$3.4\\times $ </tex-math></inline-formula>\n compared with SRAM designs, without incurring any accuracy loss.","PeriodicalId":13425,"journal":{"name":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","volume":"32 11","pages":"2023-2036"},"PeriodicalIF":2.8000,"publicationDate":"2024-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"MCAIMem: A Mixed SRAM and eDRAM Cell for Area and Energy-Efficient On-Chip AI Memory\",\"authors\":\"Duy-Thanh Nguyen;Abhiroop Bhattacharjee;Abhishek Moitra;Priyadarshini Panda\",\"doi\":\"10.1109/TVLSI.2024.3439231\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"AI chips commonly employ SRAM memory as buffers for their reliability and speed, which contribute to high performance. However, SRAM is expensive and demands significant area and energy consumption. Previous studies have explored replacing SRAM with emerging technologies, such as nonvolatile memory, which offers fast read memory access and a small cell area. Despite these advantages, nonvolatile memory’s slow write memory access and high write energy consumption prevent it from surpassing SRAM performance in AI applications with extensive memory access requirements. Some research has also investigated embedded dynamic random access memory (eDRAM) as an area-efficient on-chip memory with similar access times as SRAM. Still, refresh power remains a concern, leaving the trade-off among performance, area, and power consumption unresolved. To address this issue, this article presents a novel mixed CMOS cell memory design that balances performance, area, and energy efficiency for AI memory by combining SRAM and eDRAM cells. We consider the proportion ratio of one SRAM and seven eDRAM cells in the memory to achieve area reduction using mixed CMOS cell memory. In addition, we capitalize on the characteristics of deep neural network (DNN) data representation and integrate asymmetric eDRAM cells to lower energy consumption. To validate our proposed MCAIMem solution, we conduct extensive simulations and benchmarking against traditional SRAM. Our results demonstrate that the MCAIMem significantly outperforms these alternatives in terms of area and energy efficiency. Specifically, our MCAIMem can reduce the area by 48% and energy consumption by \\n<inline-formula> <tex-math>$3.4\\\\times $ </tex-math></inline-formula>\\n compared with SRAM designs, without incurring any accuracy loss.\",\"PeriodicalId\":13425,\"journal\":{\"name\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"volume\":\"32 11\",\"pages\":\"2023-2036\"},\"PeriodicalIF\":2.8000,\"publicationDate\":\"2024-09-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Very Large Scale Integration (VLSI) Systems\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10684121/\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Very Large Scale Integration (VLSI) Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10684121/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
MCAIMem: A Mixed SRAM and eDRAM Cell for Area and Energy-Efficient On-Chip AI Memory
AI chips commonly employ SRAM memory as buffers for their reliability and speed, which contribute to high performance. However, SRAM is expensive and demands significant area and energy consumption. Previous studies have explored replacing SRAM with emerging technologies, such as nonvolatile memory, which offers fast read memory access and a small cell area. Despite these advantages, nonvolatile memory’s slow write memory access and high write energy consumption prevent it from surpassing SRAM performance in AI applications with extensive memory access requirements. Some research has also investigated embedded dynamic random access memory (eDRAM) as an area-efficient on-chip memory with similar access times as SRAM. Still, refresh power remains a concern, leaving the trade-off among performance, area, and power consumption unresolved. To address this issue, this article presents a novel mixed CMOS cell memory design that balances performance, area, and energy efficiency for AI memory by combining SRAM and eDRAM cells. We consider the proportion ratio of one SRAM and seven eDRAM cells in the memory to achieve area reduction using mixed CMOS cell memory. In addition, we capitalize on the characteristics of deep neural network (DNN) data representation and integrate asymmetric eDRAM cells to lower energy consumption. To validate our proposed MCAIMem solution, we conduct extensive simulations and benchmarking against traditional SRAM. Our results demonstrate that the MCAIMem significantly outperforms these alternatives in terms of area and energy efficiency. Specifically, our MCAIMem can reduce the area by 48% and energy consumption by
$3.4\times $
compared with SRAM designs, without incurring any accuracy loss.
期刊介绍:
The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society.
Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chips and wafer fabrication, packaging, testing and systems applications. Generation of specifications, design and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor and process levels.
To address this critical area through a common forum, the IEEE Transactions on VLSI Systems have been founded. The editorial board, consisting of international experts, invites original papers which emphasize and merit the novel systems integration aspects of microelectronic systems including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chips and wafer fabrication, testing and packaging, and systems level qualification. Thus, the coverage of these Transactions will focus on VLSI/ULSI microelectronic systems integration.