AC-DIMM: associative computing with STT-MRAM

Qing Guo, Xiaochen Guo, Ravi Patel, Engin Ipek, E. Friedman
Published in: Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013-06-23
DOI: 10.1145/2485922.2485939 (https://doi.org/10.1145/2485922.2485939)
Citations: 132

Abstract

With technology scaling, on-chip power dissipation and off-chip memory bandwidth have become significant performance bottlenecks in virtually all computer systems, from mobile devices to supercomputers. An effective way of improving performance in the face of bandwidth and power limitations is to rely on associative memory systems. Recent work on a PCM-based, associative TCAM accelerator shows that associative search capability can reduce both off-chip bandwidth demand and overall system energy. Unfortunately, previously proposed resistive TCAM accelerators have limited flexibility: only a restricted (albeit important) class of applications can benefit from a TCAM accelerator, and the implementation is confined to resistive memory technologies with a high dynamic range (RHigh/RLow), such as PCM. This work proposes AC-DIMM, a flexible, high-performance associative compute engine built on a DDR3-compatible memory module. AC-DIMM addresses the limited flexibility of previous resistive TCAM accelerators by combining two powerful capabilities: associative search and processing in memory. Generality is improved by augmenting a TCAM system with a set of integrated, user-programmable microcontrollers that operate directly on search results, and by architecting the system such that key-value pairs can be co-located in the same TCAM row. A new, bit-serial TCAM array is proposed, which enables the system to be implemented using STT-MRAM. AC-DIMM achieves a 4.2x speedup and a 6.5x energy reduction over a conventional RAM-based system on a set of 13 evaluated applications.
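
To make the combination of associative search and in-memory processing more concrete, the following is a minimal software sketch, not the paper's hardware design. It assumes each row stores a key, a per-row "care" mask (bits outside the mask are don't-care, which gives the ternary behavior), and a value co-located in the same row. The names `Row`, `ternary_match`, and `search_and_reduce` are hypothetical and chosen for illustration only; the reduction callback loosely stands in for a microcontroller operating on search results, and the bit-serial array organization is not modeled.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Row:
    """One illustrative TCAM row: key, ternary care mask, co-located value."""
    key: int    # stored key bits
    care: int   # 1 = bit must match, 0 = don't-care (hypothetical encoding)
    value: int  # value stored alongside the key in the same row


def ternary_match(row: Row, query: int) -> bool:
    """Return True when query agrees with row.key on every cared-about bit."""
    return (row.key ^ query) & row.care == 0


def search_and_reduce(
    rows: Iterable[Row],
    query: int,
    op: Callable[[int, int], int],
    init: int,
) -> int:
    """Search all rows, then fold op over the values of the matching rows.

    In hardware the comparison would happen in every row in parallel;
    this loop is only a sequential software analogy.
    """
    acc = init
    for row in rows:
        if ternary_match(row, query):
            acc = op(acc, row.value)
    return acc


# Hypothetical table: the second row ignores its two low-order key bits.
table = [
    Row(key=0b1010, care=0b1111, value=5),
    Row(key=0b1000, care=0b1100, value=7),
    Row(key=0b0110, care=0b1111, value=11),
]

total = search_and_reduce(table, 0b1010, lambda a, b: a + b, 0)
```

Here only the matching values are combined into a single result, so in a real system only that result, rather than every candidate row, would need to cross the memory bus. That is the intuition behind the bandwidth savings the abstract describes, under the simplifying assumptions stated above.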