MCAIMem: A Mixed SRAM and eDRAM Cell for Area and Energy-Efficient On-Chip AI Memory

IF 2.8 | Zone 2 (Engineering & Technology) | JCR Q2 | COMPUTER SCIENCE, HARDWARE & ARCHITECTURE | IEEE Transactions on Very Large Scale Integration (VLSI) Systems | Pub Date: 2024-09-18 | DOI: 10.1109/TVLSI.2024.3439231
Duy-Thanh Nguyen;Abhiroop Bhattacharjee;Abhishek Moitra;Priyadarshini Panda
Published in IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 32, no. 11, pp. 2023-2036. Journal Article. Source: https://ieeexplore.ieee.org/document/10684121/
Citations: 0

Abstract

AI chips commonly use SRAM for on-chip buffers because its reliability and speed contribute to high performance. However, SRAM is expensive in both area and energy. Previous studies have explored replacing SRAM with emerging technologies such as nonvolatile memory, which offers fast reads and a small cell area. Despite these advantages, nonvolatile memory's slow writes and high write energy prevent it from surpassing SRAM in AI applications with extensive memory access requirements. Some research has also investigated embedded dynamic random access memory (eDRAM) as an area-efficient on-chip memory with access times similar to those of SRAM. Still, refresh power remains a concern, leaving the trade-off among performance, area, and power consumption unresolved. To address this issue, this article presents a novel mixed CMOS cell memory design that balances performance, area, and energy efficiency for AI memory by combining SRAM and eDRAM cells. To reduce area, we consider a ratio of one SRAM cell to seven eDRAM cells in the mixed memory. In addition, we capitalize on the characteristics of deep neural network (DNN) data representation and integrate asymmetric eDRAM cells to lower energy consumption. To validate the proposed MCAIMem solution, we conduct extensive simulations and benchmark it against traditional SRAM. Our results demonstrate that MCAIMem significantly outperforms the SRAM baseline in area and energy efficiency. Specifically, MCAIMem reduces area by 48% and energy consumption by $3.4\times$ compared with SRAM designs, without incurring any accuracy loss.
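The area argument behind the one-SRAM-to-seven-eDRAM ratio can be illustrated with a rough back-of-the-envelope sketch. It assumes that array area scales with the sum of cell areas only (peripheral circuitry, refresh logic, and layout overheads are ignored), and the eDRAM-to-SRAM cell area ratio passed in is a hypothetical placeholder, not a figure from the paper. The function names are illustrative as well.

```python
def mixed_area_ratio(edram_to_sram_cell_area: float,
                     sram_cells: int = 1,
                     edram_cells: int = 7) -> float:
    """Area of a mixed cell group relative to an all-SRAM group of equal size.

    edram_to_sram_cell_area is a hypothetical input: the area of one eDRAM
    cell divided by the area of one SRAM cell. Only cell area is modeled.
    """
    total_cells = sram_cells + edram_cells
    mixed_area = sram_cells * 1.0 + edram_cells * edram_to_sram_cell_area
    return mixed_area / total_cells


def area_reduction(edram_to_sram_cell_area: float) -> float:
    """Fractional area saved versus all-SRAM under the same simplification."""
    return 1.0 - mixed_area_ratio(edram_to_sram_cell_area)


if __name__ == "__main__":
    # 0.5 is a placeholder cell-area ratio chosen only for illustration.
    r = 0.5
    print(f"relative area:  {mixed_area_ratio(r):.4f}")
    print(f"area reduction: {area_reduction(r):.4f}")
```

Under this simplified model, the savings come entirely from the seven eDRAM cells in each group, so the achievable reduction is bounded by how much smaller an eDRAM cell is than an SRAM cell. The 48% figure reported in the abstract comes from the authors' own evaluation, not from this sketch.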
Source journal
CiteScore: 6.40
Self-citation rate: 7.10%
Articles published: 187
Review time: 3.6 months
About the journal: The IEEE Transactions on VLSI Systems is published as a monthly journal under the co-sponsorship of the IEEE Circuits and Systems Society, the IEEE Computer Society, and the IEEE Solid-State Circuits Society. Design and realization of microelectronic systems using VLSI/ULSI technologies require close collaboration among scientists and engineers in the fields of systems architecture, logic and circuit design, chip and wafer fabrication, packaging, testing, and systems applications. Generation of specifications, design, and verification must be performed at all abstraction levels, including the system, register-transfer, logic, circuit, transistor, and process levels. To address this critical area through a common forum, the IEEE Transactions on VLSI Systems was founded. The editorial board, consisting of international experts, invites original papers that emphasize the novel systems integration aspects of microelectronic systems, including interactions among systems design and partitioning, logic and memory design, digital and analog circuit design, layout synthesis, CAD tools, chip and wafer fabrication, testing and packaging, and system-level qualification. Thus, the coverage of these Transactions focuses on VLSI/ULSI microelectronic systems integration.