Algorithms for scheduling CNNs on multicore MCUs at the neuron and layer levels

Microprocessors and Microsystems (IF 2.6; Zone 4, Computer Science; JCR Q3, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE). Pub Date: 2024-11-01. Epub Date: 2024-10-21. DOI: 10.1016/j.micpro.2024.105107

Petr Dobiáš, Thomas Garbay, Bertrand Granado, Khalil Hachicha, Andrea Pinna

Microprocessors and Microsystems, volume 111, Article 105107. https://www.sciencedirect.com/science/article/pii/S0141933124001029

Citations: 0

Abstract



Convolutional neural networks (CNNs) are progressively deployed on embedded systems, which is challenging because their computational and energy requirements need to be satisfied by devices with limited resources and power supplies. For instance, they can be implemented in the Internet of Things or edge computing, i.e., in applications using low-power and low-performance microcontroller units (MCUs). Monocore MCUs are not tailored to respond to the computational and energy requirements of CNNs due to their limited resources, but a multicore MCU can overcome these limitations. This paper presents an empirical study analysing three algorithms for scheduling CNNs on embedded systems at two different levels (neuron and layer levels) and evaluates their performance in terms of makespan and energy consumption using six neural networks, both in general and in the case of CubeSats. The results show that the SNN algorithm outperforms the other two algorithms (STD and STS) and that scheduling at the layer level significantly reduces the energy consumption. Therefore, embedded systems based on multicore MCUs are suitable for executing CNNs, and they can be used, for example, on board small satellites called CubeSats.
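The two metrics named in the abstract, makespan and energy consumption, can be illustrated with a small sketch. The code below is not the SNN, STD, or STS algorithm from the paper, which the abstract does not describe. It is a generic greedy list scheduler that assumes the neurons of one layer are independent tasks, that a layer must finish before the next one starts, and that each core draws a fixed active or idle power. The function names, task durations, and power values are all hypothetical.

```python
from typing import List, Tuple


def schedule_layer(durations: List[float], n_cores: int) -> List[float]:
    """Assign independent neuron tasks to cores, longest task first.

    Returns the busy time accumulated on each core (hypothetical model).
    """
    loads = [0.0] * n_cores
    for d in sorted(durations, reverse=True):
        # Place the task on the core that is currently least loaded.
        i = loads.index(min(loads))
        loads[i] += d
    return loads


def evaluate(layers: List[List[float]], n_cores: int,
             p_active: float, p_idle: float) -> Tuple[float, float]:
    """Return (makespan, energy) for neuron-level scheduling.

    Layers run one after another; within a layer, neurons run in parallel.
    Energy = active power * busy time + idle power * waiting time per core.
    """
    makespan = 0.0
    energy = 0.0
    for durations in layers:
        loads = schedule_layer(durations, n_cores)
        span = max(loads)  # the layer ends when its slowest core ends
        makespan += span
        for busy in loads:
            energy += p_active * busy + p_idle * (span - busy)
    return makespan, energy


# Hypothetical workload: two layers with per-neuron execution times.
layers = [[2.0, 2.0, 2.0, 2.0], [4.0, 1.0, 1.0]]
for cores in (1, 2):
    print(cores, evaluate(layers, cores, p_active=1.0, p_idle=0.5))
```

In this toy model, adding a core shortens the makespan, but cores that wait at a layer boundary still draw idle power, so the total energy does not necessarily fall. This is one reason the two metrics are evaluated separately.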


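Source journal

Microprocessors and Microsystems (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 6.90

Self-citation rate: 3.80%

Articles published: 204

Review time: 172 days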

About the journal: Microprocessors and Microsystems: Embedded Hardware Design (MICPRO) is a journal covering all design and architectural aspects of embedded systems hardware. This includes embedded system hardware platforms ranging from custom hardware, through reconfigurable systems and application-specific processors, to general-purpose embedded processors. Special emphasis is put on novel complex embedded architectures, such as systems on chip (SoC), systems on a programmable/reconfigurable chip (SoPC), and multi-processor systems on a chip (MPSoC), as well as their memory and communication methods and structures, such as network-on-chip (NoC). Design automation of such systems, including methodologies, techniques, flows, and tools for their design, as well as novel designs of hardware components, falls within the scope of this journal. Novel cyber-physical applications that use embedded systems are also central to this journal. While software is not the main focus of this journal, methods of hardware/software co-design, as well as application restructuring and mapping to embedded hardware platforms that consider the interplay between software and hardware components with an emphasis on hardware, are also within the journal's scope.

Latest articles in this journal

CGR-AI Engine: A scalable CGRA-based processing platform for Artificial Intelligence in space applications

The TEXTAROSSA project: Cool all the Way Down to the Hardware

Analyzing the impact of functional approximation on the resilience of Deep Neural Networks

Efficient associative processing in FPGA

LoLiPoP-IoT: Advancing the energy-efficient Internet of Things