Power, Area, and Performance Optimization of Standard Cell Memory Arrays Through Controlled Placement

A. Teman, D. Rossi, P. Meinerzhagen, L. Benini, A. Burg
{"title":"Power, Area, and Performance Optimization of Standard Cell Memory Arrays Through Controlled Placement","authors":"A. Teman, D. Rossi, P. Meinerzhagen, L. Benini, A. Burg","doi":"10.1145/2890498","DOIUrl":null,"url":null,"abstract":"Embedded memory remains a major bottleneck in current integrated circuit design in terms of silicon area, power dissipation, and performance; however, static random access memories (SRAMs) are almost exclusively supplied by a small number of vendors through memory generators, targeted at rather generic design specifications. As an alternative, standard cell memories (SCMs) can be defined, synthesized, and placed and routed as an integral part of a given digital system, providing complete design flexibility, good energy efficiency, low-voltage operation, and even area efficiency for small memory blocks. Yet implementing an SCM block with a standard digital flow often fails to exploit the distinct and regular structure of such an array, leaving room for optimization. In this article, we present a design methodology for optimizing the physical implementation of SCM macros as part of the standard design flow. This methodology introduces controlled placement, leading to a structured, noncongested layout with close to 100% placement utilization, resulting in a smaller silicon footprint, reduced wire length, and lower power consumption compared to SCMs without controlled placement. This methodology is demonstrated on SCM macros of various sizes and aspect ratios in a state-of-the-art 28nm fully depleted silicon-on-insulator technology, and compared with equivalent macros designed with the noncontrolled, standard flow, as well as with foundry-supplied SRAM macros. The controlled SCMs provide an average 25% reduction in area as compared to noncontrolled implementations while achieving a smaller size than SRAM macros of up to 1Kbyte. Power and performance comparisons of controlled SCM blocks of a commonly found 256 × 32 (1 Kbyte) memory with foundry-provided SRAMs show greater than 65% and 10% reduction in read and write power, respectively, while providing faster access than their SRAM counterparts, despite being of an aspect ratio that is typically unfavorable for SCMs. In addition, the SCM blocks function correctly with a supply voltage as low as 0.3V, well below the lower limit of even the SRAM macros optimized for low-voltage operation. The controlled placement methodology is applied within a full-chip physical implementation flow of an OpenRISC-based test chip, providing more than 50% power reduction compared to equivalently sized compiled SRAMs under a benchmark application.","PeriodicalId":7063,"journal":{"name":"ACM Trans. Design Autom. Electr. Syst.","volume":"59 1","pages":"59:1-59:25"},"PeriodicalIF":0.0000,"publicationDate":"2016-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Trans. Design Autom. Electr. Syst.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2890498","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 48

Abstract

Embedded memory remains a major bottleneck in current integrated circuit design in terms of silicon area, power dissipation, and performance; however, static random access memories (SRAMs) are almost exclusively supplied by a small number of vendors through memory generators, targeted at rather generic design specifications. As an alternative, standard cell memories (SCMs) can be defined, synthesized, and placed and routed as an integral part of a given digital system, providing complete design flexibility, good energy efficiency, low-voltage operation, and even area efficiency for small memory blocks. Yet implementing an SCM block with a standard digital flow often fails to exploit the distinct and regular structure of such an array, leaving room for optimization. In this article, we present a design methodology for optimizing the physical implementation of SCM macros as part of the standard design flow. This methodology introduces controlled placement, leading to a structured, noncongested layout with close to 100% placement utilization, resulting in a smaller silicon footprint, reduced wire length, and lower power consumption compared to SCMs without controlled placement. This methodology is demonstrated on SCM macros of various sizes and aspect ratios in a state-of-the-art 28nm fully depleted silicon-on-insulator technology, and compared with equivalent macros designed with the noncontrolled, standard flow, as well as with foundry-supplied SRAM macros. The controlled SCMs provide an average 25% reduction in area as compared to noncontrolled implementations while achieving a smaller size than SRAM macros of up to 1Kbyte. Power and performance comparisons of controlled SCM blocks of a commonly found 256 × 32 (1 Kbyte) memory with foundry-provided SRAMs show greater than 65% and 10% reduction in read and write power, respectively, while providing faster access than their SRAM counterparts, despite being of an aspect ratio that is typically unfavorable for SCMs. In addition, the SCM blocks function correctly with a supply voltage as low as 0.3V, well below the lower limit of even the SRAM macros optimized for low-voltage operation. The controlled placement methodology is applied within a full-chip physical implementation flow of an OpenRISC-based test chip, providing more than 50% power reduction compared to equivalently sized compiled SRAMs under a benchmark application.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
通过控制放置的标准单元存储阵列的功率、面积和性能优化
就硅面积、功耗和性能而言,嵌入式存储器仍然是当前集成电路设计的主要瓶颈;然而,静态随机存取存储器(sram)几乎完全由少数供应商通过存储器生成器提供,目标是相当通用的设计规范。作为替代方案,标准单元存储器(scm)可以定义、合成、放置和路由为给定数字系统的一个组成部分,提供完整的设计灵活性、良好的能效、低电压操作,甚至小存储块的面积效率。然而,使用标准数字流实现SCM块通常无法利用这种阵列的独特和规则结构,从而留下了优化的空间。在本文中,我们提出了一种设计方法,用于优化SCM宏的物理实现,作为标准设计流程的一部分。该方法引入了可控布局,实现了结构化、非拥堵布局,布局利用率接近100%,与没有控制布局的scm相比,硅占地面积更小,导线长度更短,功耗更低。该方法在最先进的28nm完全耗尽绝缘体上硅技术的各种尺寸和纵横比的SCM宏上进行了演示,并与使用非控制标准流程设计的等效宏以及代工厂提供的SRAM宏进行了比较。与非控制实现相比,受控scm的面积平均减少了25%,同时实现了比SRAM宏更小的1Kbyte。通常发现的256 × 32 (1 Kbyte)存储器的控制SCM块与代工厂提供的SRAM的功率和性能比较分别显示读写功率降低65%和10%以上,同时提供比SRAM更快的访问,尽管其宽高比通常对SCM不利。此外,单片机模块在电源电压低至0.3V时正常工作,远低于为低压操作优化的SRAM宏的下限。控制放置方法应用于基于openrisc的测试芯片的全芯片物理实现流程中,与基准应用程序下同等大小的编译sram相比,可提供50%以上的功耗降低。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
High-Level Synthesis Implementation of an Embedded Real-Time HEVC Intra Encoder on FPGA for Media Applications Achieving High In Situ Training Accuracy and Energy Efficiency with Analog Non-Volatile Synaptic Devices A Comprehensive Survey of Attacks without Physical Access Targeting Hardware Vulnerabilities in IoT/IIoT Devices, and Their Detection Mechanisms Improving LDPC Decoding Performance for 3D TLC NAND Flash by LLR Optimization Scheme for Hard and Soft Decision Demand-Driven Multi-Target Sample Preparation on Resource-Constrained Digital Microfluidic Biochips
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1