An Efficient Compilation of Coarse-Grained Reconfigurable Architectures Utilizing Pre-Optimized Sub-Graph Mappings

2022 30th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP) | Pub Date: 2022-03-01 | DOI: 10.1109/pdp55904.2022.00010

Ayaka Ohwada, Takuya Kojima, H. Amano


Citations: 2


An efficient compilation of coarse-grained reconfigurable architectures utilizing pre-optimized sub-graph mappings

In recent years, IoT devices have become widespread, and energy-efficient coarse-grained reconfigurable architectures (CGRAs) have attracted attention. A CGRA comprises several processing units, called processing elements (PEs), arranged in a two-dimensional array. The operations of the PEs and the interconnections between them are reconfigured to suit the target application, which yields higher energy efficiency than general-purpose processors. An application kernel executed on a CGRA is represented as a data flow graph (DFG), and the CGRA compiler is responsible for mapping the DFG onto the PE array. The mapping algorithm therefore significantly influences the performance and power efficiency of the CGRA as well as the compile time. This paper proposes POCOCO, a compiler framework for CGRAs that can use pre-optimized subgraph mappings. Reusing these mappings reduces the optimization work the compiler must perform. To leverage the subgraph mappings, we extend an existing mapping method based on a genetic algorithm. Experiments on three architectures show that the proposed method reduces the optimization time by 48% on average for the best case among the three architectures.
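The idea of reusing a pre-optimized subgraph mapping inside a genetic search can be illustrated with a toy sketch. This is not the POCOCO implementation: the DFG, the 3x3 PE array, the wire-length cost, the genome encoding, and every name below (`SUB_MAPPING`, `decode`, `evolve`, and so on) are assumptions made for illustration only. The subgraph `{a, b, c}` is placed as one rigid block, so the search only chooses an anchor PE for the block plus one PE for each remaining node, which shrinks the search space.

```python
import random

# Toy DFG: edges between operation nodes (hypothetical example kernel).
DFG_EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
ROWS, COLS = 3, 3  # assumed 3x3 PE array

# Hypothetical pre-optimized placement for the subgraph {a, b, c},
# stored as (row, col) offsets relative to an anchor PE.
SUB_MAPPING = {"a": (0, 0), "b": (0, 1), "c": (1, 0)}
FREE_NODES = ["d", "e"]  # nodes not covered by the subgraph mapping


def wire_length(mapping, edges):
    """Total Manhattan distance over all DFG edges (lower is better)."""
    return sum(
        abs(mapping[u][0] - mapping[v][0]) + abs(mapping[u][1] - mapping[v][1])
        for u, v in edges
    )


def decode(genome):
    """Genome = (anchor PE of the subgraph block, one PE per free node)."""
    anchor, *cells = genome
    mapping = {n: (anchor[0] + dr, anchor[1] + dc)
               for n, (dr, dc) in SUB_MAPPING.items()}
    mapping.update(zip(FREE_NODES, cells))
    return mapping


def is_valid(mapping):
    """Every node sits on a distinct PE inside the array."""
    cells = list(mapping.values())
    in_bounds = all(0 <= r < ROWS and 0 <= c < COLS for r, c in cells)
    return in_bounds and len(set(cells)) == len(cells)


def fitness(genome):
    """Wire length of the decoded mapping; invalid mappings score infinity."""
    mapping = decode(genome)
    return wire_length(mapping, DFG_EDGES) if is_valid(mapping) else float("inf")


def random_pe(rng):
    return (rng.randrange(ROWS), rng.randrange(COLS))


def evolve(seed=0, pop_size=20, generations=30):
    """Small elitist genetic search over (anchor, free-node PEs) genomes."""
    rng = random.Random(seed)
    n_genes = len(FREE_NODES) + 1
    pop = [tuple(random_pe(rng) for _ in range(n_genes)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.choice(parents), rng.choice(parents)
            cut = rng.randrange(1, n_genes)              # one-point crossover
            child = list(p1[:cut] + p2[cut:])
            child[rng.randrange(n_genes)] = random_pe(rng)  # mutate one gene
            children.append(tuple(child))
        pop = parents + children
    best = min(pop, key=fitness)
    return decode(best), fitness(best)


if __name__ == "__main__":
    mapping, cost = evolve()
    print(mapping, cost)
```

Without the block, the genome would need one gene per DFG node (five here); with it, three genes suffice. A real compiler would also have to handle routing, operation legality per PE, and how subgraphs are matched in the DFG, none of which this sketch models.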

