Program acceleration using nearest distance associative search

2018 19th International Symposium on Quality Electronic Design (ISQED). Pub Date: 2018-03-13. DOI: 10.1109/ISQED.2018.8357263

M. Imani, Daniel Peroni, T. Simunic


Citations: 4

Abstract


The volume of data generated by computing systems is growing rapidly as they become more interconnected as part of the Internet of Things (IoT). Processing this growing amount of data, such as multimedia, needs to be accelerated using efficient massively parallel processors. Associative memories in the form of look-up tables, working in tandem with processing elements, can reduce energy consumption by eliminating redundant computations. In this paper, we propose a resistive associative unit, called RAU, which performs basic computations approximately and with significantly higher efficiency than traditional processing units. RAU stores high-frequency patterns corresponding to each operation and then retrieves the row nearest in distance to the input data as an approximate output. To avoid a large and energy-intensive RAU, our design adaptively detects inputs with lower frequency and assigns them to precise cores for processing. For each application, our design can adjust the ratio of data processed by the RAU and by the precise cores to ensure computational accuracy. We consider the application of RAU on an AMD Southern Islands GPU, a recent GPGPU architecture. Our experimental evaluation shows that a GPGPU enhanced with RAU can achieve 61% average energy savings and a 2.2× speedup over eight diverse OpenCL applications, while ensuring acceptable quality of computation. The energy-delay product improvement of the enhanced GPGPU is 5.7× and 2.8× higher compared to conventional and state-of-the-art approximate GPGPUs, respectively.
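The lookup-with-fallback idea in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the authors' hardware design: the class name `AssociativeUnit`, the use of Hamming distance as the "nearest distance" metric, and the `max_distance` threshold that decides when an input is sent to the exact ("precise core") path are all hypothetical choices made for illustration. The abstract itself does not specify the metric or the detection mechanism.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple

Pattern = Tuple[int, ...]


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two non-negative integers."""
    return bin(a ^ b).count("1")


class AssociativeUnit:
    """Toy model of a nearest-distance look-up table (hypothetical names)."""

    def __init__(self, op: Callable[..., int], training_inputs: Iterable[Pattern],
                 rows: int, max_distance: int) -> None:
        self.op = op
        # Assumption: inputs farther than this from every row go to the exact path.
        self.max_distance = max_distance
        # Keep only the most frequent operand patterns, with precomputed results.
        frequent = Counter(training_inputs).most_common(rows)
        self.table: Dict[Pattern, int] = {p: op(*p) for p, _ in frequent}
        self.hits = 0      # inputs answered approximately from the table
        self.misses = 0    # inputs computed exactly

    @staticmethod
    def _distance(p: Pattern, q: Pattern) -> int:
        # Assumption: total Hamming distance over all operands.
        return sum(hamming(x, y) for x, y in zip(p, q))

    def compute(self, *operands: int) -> int:
        # Find the stored row closest to the input operands.
        nearest = min(self.table, key=lambda row: self._distance(row, operands))
        if self._distance(nearest, operands) <= self.max_distance:
            self.hits += 1
            return self.table[nearest]   # approximate output
        self.misses += 1
        return self.op(*operands)        # exact fallback


# Usage: a two-row table for integer multiplication.
history = [(3, 4)] * 5 + [(8, 8)] * 3 + [(1, 2)]
unit = AssociativeUnit(lambda a, b: a * b, history, rows=2, max_distance=1)
approx = unit.compute(3, 5)   # one bit away from the stored (3, 4) row
exact = unit.compute(7, 7)    # far from every row, so computed exactly
```

In this model, raising `max_distance` or `rows` shifts more inputs to the approximate path, and lowering them shifts more to the exact path. That loosely mirrors the adjustable ratio between RAU and precise cores described in the abstract, although the real design tunes that ratio per application in hardware.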
