REC: REtime Convolutional layers to fully exploit harvested energy for ReRAM-based CNN accelerators
Authors: Kunyu Zhou, Keni Qiu
Journal: ACM Transactions on Embedded Computing Systems
Publication date: 2024-03-15
DOI: 10.1145/3652593 (https://doi.org/10.1145/3652593)
Citations: 0
Abstract
As the Internet of Things (IoT) increasingly incorporates AI, there is a growing trend to deploy neural network algorithms at the edge, making IoT devices more intelligent. Energy-harvesting IoT devices additionally offer a green, low-carbon footprint, convenient maintenance, and a theoretically unlimited lifetime. However, harvested energy is often unstable, and performance suffers because a fixed load cannot fully utilize it. To address this problem, recent work on ReRAM-based convolutional neural network (CNN) accelerators powered by harvested energy has proposed hardware/software optimizations. However, that work has overlooked the mismatch between the power requirements of different CNN layers and the variation in harvested power.
Motivated by this observation, this paper proposes a novel strategy, called REC, that retimes the convolutional layers of CNN inferences to improve the performance and energy efficiency of energy-harvesting ReRAM-based accelerators. Specifically, at the offline stage, REC defines different power levels to fit the power requirements of different convolutional layers. At runtime, instead of executing the convolutional layers of an inference sequentially, REC retimes their execution timeframes so that each layer is matched to the changing power input. In addition, REC provides a parallel strategy to fully utilize very high power inputs. A case study further shows that REC improves the real-time completion of periodic critical inferences, because it gives critical inferences an opportunity to preempt processing windows with a high power supply. Experimental results show that REC achieves an average performance improvement of 6.1× (up to 16.5×) over the traditional strategy without REC. The case study results show that REC can significantly improve the success rate of real-time completion for periodic critical inferences.
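The abstract does not give the scheduling algorithm itself, so the following is only a minimal illustrative sketch of the general idea of matching layers to a varying power budget, not the authors' method. It assumes several inferences are in flight, that each can only run its next pending layer (data dependency), and that a simple greedy rule fills each time slot's power budget. The `Inference` class, the `schedule_slot` and `run` functions, and all power values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Inference:
    """One in-flight CNN inference; its layers must run in order."""
    name: str
    layer_power: list  # hypothetical per-layer power requirement (arbitrary units)
    next_layer: int = 0

    def done(self) -> bool:
        return self.next_layer >= len(self.layer_power)


def schedule_slot(inferences, available_power):
    """Choose the layers to run in one time slot (greedy, illustrative only).

    Each unfinished inference offers at most its next pending layer.
    Candidates are tried in order of decreasing power requirement, so a
    high-power slot is spent on a demanding layer first and any leftover
    budget is used to run further layers in parallel.
    """
    ready = [inf for inf in inferences if not inf.done()]
    ready.sort(key=lambda inf: inf.layer_power[inf.next_layer], reverse=True)
    chosen = []
    budget = available_power
    for inf in ready:
        need = inf.layer_power[inf.next_layer]
        if need <= budget:
            chosen.append((inf.name, inf.next_layer))
            budget -= need
            inf.next_layer += 1
    return chosen


def run(inferences, power_trace):
    """Replay a harvested-power trace and return the per-slot schedule."""
    return [schedule_slot(inferences, p) for p in power_trace]


if __name__ == "__main__":
    jobs = [Inference("A", [30.0, 80.0, 50.0]),
            Inference("B", [30.0, 80.0, 50.0])]
    trace = [40.0, 35.0, 90.0, 170.0, 60.0]  # made-up harvested power per slot
    for slot, picks in enumerate(run(jobs, trace)):
        print(slot, picks)
```

With this made-up trace, slot 1 offers too little power for inference A's second layer, so the sketch runs the first layer of inference B instead of stalling; slot 3 offers enough power to run two layers in parallel. A strictly sequential executor would idle in slot 1 and leave the surplus in slot 3 unused.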
About the journal:
The design of embedded computing systems, both the software and hardware, increasingly relies on sophisticated algorithms, analytical models, and methodologies. ACM Transactions on Embedded Computing Systems (TECS) aims to present the leading work relating to the analysis, design, behavior, and experience with embedded computing systems.