Reconfigurable tiles of computing-in-memory SRAM architecture for scalable vectorization

Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design. Pub Date: 2020-08-10. DOI: 10.1145/3370748.3406550

R. Gauchi, V. Egloff, Maha Kooli, J. Noël, B. Giraud, P. Vivet, S. Mitra, H. Charles


Citations: 7

Abstract


Reconfigurable tiles of computing-in-memory SRAM architecture for scalable vectorization

For big data applications, bringing computation to the memory is expected to drastically reduce data transfers, which can be achieved using recent Computing-In-Memory (CIM) concepts. To address kernels with larger memory data sets, we propose a reconfigurable tile-based architecture composed of Computational-SRAM (C-SRAM) tiles, each enabling arithmetic and logic operations within the memory. The proposed horizontal scalability and vertical data communication are combined to select the optimal vector width for maximum performance. These schemes allow vector-based kernels available on existing SIMD engines to be used on the targeted CIM architecture. For architecture exploration, we propose an instruction-accurate simulation platform using SystemC/TLM to quantify the performance and energy of various kernels. For detailed performance evaluation, the platform is calibrated with data extracted from the placed-and-routed C-SRAM circuit, designed in 22 nm FDSOI technology. Compared to a 512-bit SIMD architecture, the proposed CIM architecture achieves an energy-delay product (EDP) reduction of up to 60× for memory-bound kernels and up to 34× for compute-bound kernels.
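The headline result is stated as an EDP reduction factor. As a minimal sketch of how such a factor is computed, assuming EDP is the usual product of energy and execution time, the Python below compares a baseline run against a proposed run. The `KernelRun` class, the `edp_reduction` helper, and every number are hypothetical illustrations; none of them come from the paper or its simulation platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KernelRun:
    """One simulated kernel execution (hypothetical container, not from the paper)."""
    energy_joules: float
    delay_seconds: float

    @property
    def edp(self) -> float:
        # Energy-delay product: energy multiplied by execution time; lower is better.
        return self.energy_joules * self.delay_seconds


def edp_reduction(baseline: KernelRun, proposed: KernelRun) -> float:
    """Return how many times smaller the proposed EDP is than the baseline EDP."""
    return baseline.edp / proposed.edp


# Hypothetical numbers chosen only to illustrate the arithmetic.
simd = KernelRun(energy_joules=8.0e-6, delay_seconds=4.0e-3)
cim = KernelRun(energy_joules=2.0e-6, delay_seconds=1.0e-3)
reduction = edp_reduction(simd, cim)
print(f"EDP reduction: {reduction:.1f}x")
```

Because EDP multiplies the two quantities, the gains compound: in this made-up example a 4× energy saving combined with a 4× speed-up gives a 16× EDP reduction.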

