Multi-Stage Tunable Approximate Search in Resistive Associative Memory

Mohsen Imani;Abbas Rahimi;Pietro Mercati;Tajana Simunic Rosing
Journal: IEEE Transactions on Multi-Scale Computing Systems, vol. 4, no. 1, pp. 17-29
Published: 2017-02-07
DOI: 10.1109/TMSCS.2017.2665462
URL: https://ieeexplore.ieee.org/document/7845654/
Citations: 22

Abstract

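General-purpose graphics processing units (GPGPUs), as programmable accelerators, improve energy efficiency by integrating a large number of relatively small cores. In this paper, we focus on improving the energy efficiency of these processing cores by integrating an associative memory in which function responses are prestored. The associative memory searches for and recalls function responses for a subset of input values, avoiding execution of the function on the processing core and thereby saving energy. We propose a novel low-energy Resistive Multi-stage Associative Memory (ReMAM) architecture that significantly reduces the energy of a search operation through selective row activation and in-advance precharging. ReMAM splits a search in a ternary content addressable memory (TCAM) into a number of shorter searches in consecutive stages. It then selectively activates TCAM rows at each stage based on the hits of previous stages, which saves energy. The proposed in-advance precharging technique mitigates the delay of the sequential TCAM search and limits the number of precharges to two low-cost steps. ReMAM further applies approximation to selected TCAM blocks to reduce search energy, relaxing the function output at a fine granularity with very low impact on the accuracy of the results. Its multi-stage search makes ReMAM applicable to many applications, such as search engines, sorting, image coding, pattern recognition, query processing, and machine learning. In this work, we show an application of the proposed ReMAM on AMD Southern Islands GPUs. Our experimental evaluation shows that ReMAM reduces GPGPU energy consumption on average by 35 percent in exact mode and by 58 percent in approximate mode, with an average relative error below 10 percent. These energy savings are 1.8x and 1.5x higher than those of state-of-the-art associative memories used in GPGPUs in exact and approximate modes, respectively.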
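The staged search with selective row activation described in the abstract can be illustrated with a small functional model. This is only a sketch under stated assumptions, not the authors' circuit: the function name `staged_search`, the string encoding of TCAM rows (`'0'`, `'1'`, `'x'` for don't-care), the `skip_stages` approximation knob, and the use of compared cells as an energy proxy are all hypothetical choices made for illustration. Precharging and timing are not modeled.

```python
from typing import List, Tuple


def staged_search(rows: List[str], key: str, stage_bits: int,
                  skip_stages: int = 0) -> Tuple[List[int], int]:
    """Functional model of a multi-stage TCAM search.

    rows        -- stored patterns over '0', '1', 'x' ('x' = don't care)
    key         -- search key over '0', '1', same width as every row
    stage_bits  -- number of bits compared in each stage
    skip_stages -- hypothetical approximation knob: the last `skip_stages`
                   stages are not searched at all (assumed here to hold the
                   least significant bits)

    Returns (indices of matching rows, number of cells compared).
    The cell count is a crude stand-in for search energy.
    """
    if any(len(r) != len(key) for r in rows):
        raise ValueError("all rows must have the same width as the key")

    n_stages = (len(key) + stage_bits - 1) // stage_bits  # ceiling division
    active = list(range(len(rows)))  # every row is active in the first stage
    cells_compared = 0

    for stage in range(n_stages - skip_stages):
        lo = stage * stage_bits
        hi = min(lo + stage_bits, len(key))
        # Only rows that hit in all previous stages are activated.
        cells_compared += len(active) * (hi - lo)
        active = [
            i for i in active
            if all(c == 'x' or c == k
                   for c, k in zip(rows[i][lo:hi], key[lo:hi]))
        ]
        if not active:  # nothing left to activate in later stages
            break

    return active, cells_compared


if __name__ == "__main__":
    table = ["10110010", "1011xxxx", "01100101", "10100010"]
    print(staged_search(table, "10110010", stage_bits=8))  # single stage
    print(staged_search(table, "10110010", stage_bits=4))  # two stages
    print(staged_search(table, "10110110", stage_bits=4, skip_stages=1))
```

In this toy table, splitting the 8-bit search into two 4-bit stages compares fewer cells than the single-stage search because two rows are ruled out after the first stage. Skipping the last stage reduces the count further but can return rows that an exact search would reject, which is the energy-versus-accuracy trade-off the abstract refers to.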