Jose M. Badia, German Leon, Mario Garcia-Valderas, Jose A. Belloch, Almudena Lindoso, Luis Entrena
Title: Analysing the radiation reliability, performance and energy consumption of low-power SoC through heterogeneous parallelism
Journal: Sustainable Computing: Informatics and Systems, vol. 44, Article 101049
Published: 2024-11-05
DOI: 10.1016/j.suscom.2024.101049
URL: https://www.sciencedirect.com/science/article/pii/S2210537924000945
Citation count: 0
Abstract
This study focuses on the low-power Tegra X1 System-on-Chip (SoC) from the Jetson Nano Developer Kit, which is increasingly used in various environments and tasks. As these SoCs grow in prevalence, it becomes crucial to analyse their computational performance, energy consumption, and reliability, especially for safety-critical applications. A key factor examined in this paper is the SoC’s neutron radiation tolerance. This is explored by subjecting a parallel version of matrix multiplication, which has been offloaded to various hardware components via OpenMP, to neutron irradiation. Through this approach, the authors establish a correlation between the SoC’s reliability and its computational and energy performance. The analysis enables the identification of an optimal workload distribution strategy, considering factors such as execution time, energy efficiency, and system reliability. Experimental results reveal that, while the GPU executes matrix multiplication tasks more rapidly and efficiently than the CPU, using both components only marginally reduces execution time. Interestingly, GPU usage significantly increases the SoC’s critical section, leading to an escalated error rate for both Detected Unrecoverable Errors (DUE) and Silent Data Corruptions (SDC), with the CPU showing a higher average number of affected elements per SDC.
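The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: splitting a matrix multiplication between two workers, and counting how many output elements a silent data corruption affects. It assumes a row-wise split and assumes that an SDC is detected by comparing the output against a fault-free "golden" result. The names `split_matmul`, `gpu_fraction`, and `count_affected_elements` are hypothetical, and plain Python stands in for the OpenMP offloading used in the paper.

```python
def matmul_rows(a, b, row_start, row_end):
    """Return rows [row_start, row_end) of the product a @ b."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(row_start, row_end)
    ]


def split_matmul(a, b, gpu_fraction):
    """Compute a @ b with the rows divided between two workers.

    gpu_fraction is a hypothetical knob in [0, 1]. In an OpenMP program the
    two calls below would run on different devices; here both run on the
    host, one after the other.
    """
    n = len(a)
    split = int(n * gpu_fraction)
    gpu_part = matmul_rows(a, b, 0, split)  # stand-in for the offloaded share
    cpu_part = matmul_rows(a, b, split, n)  # stand-in for the host share
    return gpu_part + cpu_part


def count_affected_elements(result, golden):
    """Count output elements that differ from the golden result (assumed SDC check)."""
    return sum(
        1
        for row_r, row_g in zip(result, golden)
        for x, y in zip(row_r, row_g)
        if x != y
    )
```

With integer inputs the comparison is exact. For floating-point matrices a tolerance would be needed, and the choice of tolerance affects what is counted as a corrupted element.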
About the journal:
Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated peer-reviewed papers.