Efficient Coarse-Grained Reconfigurable Array architecture for machine learning applications in space using DARE65T library platform

Microprocessors and Microsystems, vol. 113, Article 105142 (IF 2.6; Zone 4, Computer Science; JCR Q3, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE). Pub Date: 2025-01-14. DOI: 10.1016/j.micpro.2025.105142

Luca Zulberti, Matteo Monopoli, Pietro Nannipieri, Silvia Moranti, Geert Thys, Luca Fanucci


Citations: 0

Abstract

With the increasing use of satellites, rovers, and other space exploration devices, Artificial Intelligence (AI) is also becoming an important tool for space exploration, allowing autonomous decision-making and operations in harsh environments. As a result, there is an increasing demand for reliable and energy-efficient processing platforms in the space industry. Among all processing architectures, Coarse-Grained Reconfigurable Arrays (CGRAs) are becoming popular, particularly in data-intensive applications like machine learning, demonstrating a substantial improvement in the energy efficiency of inference operations while preserving a good degree of versatility. In high-level class space missions, the hardware platforms incorporate radiation-hardened Field Programmable Gate Arrays (FPGAs) and microcontrollers, which do not meet the performance requirements for the aforementioned AI applications. The use of CGRA architectures in space missions is still not widely studied. The main contribution of this work is a comprehensive Design Space Exploration (DSE) activity with our highly parameterized CGRA architecture, exploring the costs associated with various design parameters when targeting AI in the space domain. We evaluated performance, power consumption, and area occupation after synthesis on the radiation-hardened DARE65T standard cell library developed by imec, based on a commercial 65 nm technology process. We characterize different CGRA configurations, comparing them with state-of-the-art solutions used for the acceleration of the AI algorithms. This work highlights Performance, Power, and Area (PPA) results that range from 100 MHz (up to 600 MOps), $2.43 \times 10^{4}\ \mu\text{m}^{2}$ cell area occupation and 0.699 mW power consumption, to 625 MHz (up to 3.75 GOps), $2.43 \times 10^{5}\ \mu\text{m}^{2}$, 46.5 mW. During DSE activity, we highlight the optimal solutions in terms of area efficiency (up to 313.1 GOps/mm$^{2}$) and energy efficiency (up to 289 GOps/W) of each CGRA configuration.
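The figures of merit quoted in the abstract (GOps, GOps/mm², GOps/W) are ratios of peak throughput to clock frequency, cell area, and power. The sketch below shows how such metrics are commonly derived. It is an illustration, not the authors' evaluation flow: the class `PpaPoint`, the helper `implied_ops_per_cycle`, and the example design point are hypothetical. The abstract reports ranges and does not say that the extreme values of frequency, area, and power belong to the same configuration, so the example uses made-up numbers rather than combining the reported extremes.

```python
from dataclasses import dataclass


def implied_ops_per_cycle(peak_mops: float, freq_mhz: float) -> float:
    """Peak operations per clock cycle implied by a throughput/frequency pair."""
    return peak_mops / freq_mhz


@dataclass(frozen=True)
class PpaPoint:
    """One synthesized design point (hypothetical container, not from the paper)."""
    freq_mhz: float       # clock frequency in MHz
    ops_per_cycle: float  # peak operations completed per clock cycle
    area_um2: float       # cell area in square micrometres
    power_mw: float       # power consumption in milliwatts

    @property
    def peak_gops(self) -> float:
        # MHz * ops/cycle gives MOps; divide by 1000 to get GOps.
        return self.freq_mhz * self.ops_per_cycle / 1e3

    @property
    def gops_per_mm2(self) -> float:
        # 1 mm^2 = 1e6 um^2
        return self.peak_gops / (self.area_um2 / 1e6)

    @property
    def gops_per_w(self) -> float:
        # 1 W = 1e3 mW
        return self.peak_gops / (self.power_mw / 1e3)


if __name__ == "__main__":
    # Throughput/frequency pairs quoted in the abstract.
    print(implied_ops_per_cycle(600.0, 100.0))
    print(implied_ops_per_cycle(3750.0, 625.0))

    # Hypothetical design point, chosen only to exercise the formulas.
    point = PpaPoint(freq_mhz=200.0, ops_per_cycle=6.0,
                     area_um2=5.0e4, power_mw=5.0)
    print(point.peak_gops, point.gops_per_mm2, point.gops_per_w)
```

Both throughput/frequency pairs in the abstract work out to 6 operations per cycle (600 MOps at 100 MHz and 3.75 GOps at 625 MHz), which is consistent with peak throughput scaling linearly with clock frequency. That reading is an inference from the quoted numbers, not a statement made in the abstract.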

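Source journal: Microprocessors and Microsystems (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore: 6.90

Self-citation rate: 3.80%

Articles published: 204

Review time: 172 days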

About the journal: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects related to embedded systems hardware. This includes different embedded system hardware platforms, ranging from custom hardware via reconfigurable systems and application-specific processors to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC) and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems, including methodologies, techniques, flows and tools for their design, as well as novel designs of hardware components, falls within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus of this journal, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with emphasis on hardware, are also within the journal's scope.
