Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016151
Leonardo Suriano, Alfonso Rodríguez, K. Desnos, M. Pelcat, E. D. L. Torre
New heterogeneous system technologies are flooding the market: over the past years, devices have moved from single CPUs to multi-core platforms featuring CPUs, GPUs and large FPGAs, such as the Xilinx Zynq-7000 or Zynq UltraScale+ MPSoC architectures. In this context, it is important to provide developers with transparent deployment capabilities so that different applications execute efficiently on such complex devices. This paper presents a design flow that combines PREESM, a dataflow-based prototyping framework, with Xilinx SDSoC, an HLS-based framework that automatically generates and manages hardware accelerators. The integration combines the automatic, static task scheduling obtained from PREESM with asynchronous invocations that trigger the parallel execution of multiple hardware accelerators from some of their associated sequential software threads. An image processing application serves as a proof of concept, showing the interoperability of both tools, the level of design automation achieved and, for the resulting computing architecture, good performance scalability with the number of accelerators and software threads.
Title: Analysis of a heterogeneous multi-core, multi-hw-accelerator-based system designed using PREESM and SDSoC
Published in: 2017 12th International Symposium on Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC)
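The asynchronous invocation pattern described in this abstract can be illustrated with a minimal Python sketch. It is not the SDSoC API and not the authors' generated code: `accelerator_kernel`, `process_image` and the pixel-inversion workload are hypothetical stand-ins, and a thread pool plays the role of the set of hardware accelerators. It only shows the general idea of launching several accelerator calls without blocking and synchronising on the results afterwards.

```python
from concurrent.futures import ThreadPoolExecutor


def accelerator_kernel(tile):
    """Hypothetical stand-in for one hardware accelerator: inverts 8-bit pixels."""
    return [255 - pixel for pixel in tile]


def process_image(tiles, num_accelerators):
    """Dispatch every tile asynchronously, then wait for all results in order."""
    with ThreadPoolExecutor(max_workers=num_accelerators) as accelerators:
        # Asynchronous invocation: submit() returns a future immediately,
        # so up to num_accelerators tiles are processed in parallel.
        pending = [accelerators.submit(accelerator_kernel, tile) for tile in tiles]
        # Synchronisation point: result() blocks until that tile is done.
        return [future.result() for future in pending]


if __name__ == "__main__":
    image_tiles = [[0, 255], [10, 20]]  # assumed toy input, two tiles
    print(process_image(image_tiles, num_accelerators=2))
```

Varying `num_accelerators` in such a sketch is one simple way to reason about how throughput might scale with the number of parallel workers, under the assumption that the tiles are independent.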
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016152
R. Domingo, R. Salvador, H. Fabelo, D. Madroñal, S. Ortega, R. Lazcano, E. Juárez, G. Callicó, C. Sanz
Current computational demands require increasing both designer efficiency and system performance per watt. Reconfigurable computing is a broadly accepted solution for implementing efficient accelerators. However, typical HDL methodologies require very specific skills and a considerable amount of designer time. Despite new approaches to high-level synthesis such as OpenCL, the large heterogeneity of today's devices (manycore, CPUs, GPUs, FPGAs) means there is no one-size-fits-all solution, so platform-driven optimization is needed to maximize performance. This paper reviews recent work using the Intel FPGA SDK for OpenCL and its optimization strategies, and evaluates the framework on the design of a hyperspectral image spatial-spectral classifier accelerator. Results are reported for a Cyclone V SoC using the Intel FPGA OpenCL Offline Compiler 16.0 out of the box. Starting from a common baseline C implementation running on the embedded ARM® Cortex®-A9, OpenCL-based synthesis is evaluated by applying different generic and vendor-specific optimizations. Results show that reasonable speedups are obtained on a device with scarce computing and embedded memory resources. A great step seems to have been taken towards effectively raising the abstraction level, but a considerable amount of hardware design skill is still needed.
Title: High-level design using Intel FPGA OpenCL: A hyperspectral imaging spatial-spectral classifier
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016160
E. Abdali, M. Pelcat, F. Berry, J. Diguet, F. Palumbo
An ever larger share of FPGAs supports Dynamic and Partial Reconfiguration (DPR). A reconfigurable point-to-point interconnect (ρ-P2P) is a communication mechanism based on DPR that swaps between different precomputed configurations stored in partial bitstreams. ρ-P2P is intended as a lightweight interconnect that suits reconfigurable systems where a limited number of configurations is desirable. This paper assesses the pros and cons of ρ-P2P in terms of resources and performance as a function of the number of input/output signals, their width and the number of supported configurations. Experimental results, obtained on an Intel Cyclone V FPGA, compare ρ-P2P to a functionally equivalent non-DPR solution called μ-P2P and to a full crossbar. They show that ρ-P2P is indeed lightweight but introduces limitations on operating frequency, memory footprint and reconfiguration time. Nevertheless, ρ-P2P is in general the least resource-intensive of the tested interconnects, except in the trivial case of low numbers of signals and configurations. In particular, an 18 × 18 full crossbar interconnect requires 75% more resources than an equivalent ρ-P2P. Interestingly, this resource difference between ρ-P2P and a full crossbar grows linearly with the interconnect size.
Title: Exploring the performance of partially reconfigurable point-to-point interconnects
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016143
J. Joseph, Lennart Bamberg, Sven Wrieden, Dominik Ermel, A. Ortiz, Thilo Pionteck
New 3D production methods enable heterogeneous integration of dies manufactured in different technology nodes. Asymmetric 3D interconnect architectures (A-3D-IAs) are the communication infrastructure targeting these heterogeneous 3D systems on chip (3D SoCs), for which design methodologies and design tools are still missing. Here, a design method is proposed that follows an incremental approach enabled by high-level models. To this end, we present the first simulator and design framework covering the diverse requirements of A-3D-IAs. This includes an abstract model to estimate the application-specific energy consumption of 2D metal wires and 3D through-silicon vias (TSVs) in an A-3D-IA. The model is validated by circuit simulations in combination with an electromagnetic field solver, which is used to extract the TSV array equivalent circuit. The model operates at a high abstraction level for fast simulations. Nonetheless, for real data stream scenarios it still shows a small maximum error of less than 8%. Additionally, a mathematical description is presented that enables fast evaluation of low-power coding schemes for A-3D-IAs at a high level of abstraction.
Title: Design method for asymmetric 3D interconnect architectures with high level models
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016144
Peter Rouget, Benoît Badrignans, P. Benoit, L. Torres
In recent years, the need for security in embedded devices and data centers has increased sharply. The possible consequences of attacks on this equipment make it a privileged target. In these fields, FPGAs are increasingly used because of their flexibility and their constantly decreasing power consumption and cost: they can embed several hard/soft processors running Linux, which enhances system integration. This paper discusses the security issues related to operating system boot on FPGAs. We show how the early software boot stages can be protected using FPGA built-in security mechanisms and user logic. We assume that external memories can be tampered with through software attacks or board-level attacks. Using open source elements and standard tools, we present and implement a lightweight solution. We show that dynamic reconfiguration has nearly no impact on the usable resources of the FPGA matrix at the end of the boot process.
Title: SecBoot — lightweight secure boot mechanism for Linux-based embedded systems on FPGAs
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016154
Fatemeh Arezoomand, Arghavan Asad, M. Fazeli, M. Fathy, F. Mohammadi
In nano-scale technologies, static power consumption due to leakage current has become a serious issue in the design of SRAM-based on-chip cache memories. To address this issue, non-volatile memory technologies such as STT-RAM (Spin Transfer Torque-RAM) have been proposed as a replacement for SRAM cells because of their near-zero static power consumption and high memory density. Nonetheless, STT-RAMs suffer from failures such as read disturb and limited endurance, as well as from high switching energy. One effective way to decrease the switching energy of STT-RAMs is to reduce their retention time; however, reducing the retention time has a negative impact on the reliability of STT-RAM cells. In this paper, we propose a hybrid cache layer for an embedded 3D chip multiprocessor that employs two types of STT-RAM memory banks, with retention times of 1 s and 10 ms, to provide a beneficial tradeoff between reliability, energy consumption, and performance. To this end, we also propose an optimization model to find the optimal configurations for these two kinds of memory banks. Simulation results obtained with the Gem5 simulator, through comparisons with fully SRAM-based and fully STT-RAM-based caches, show that the proposed hybrid cache consumes significantly less power while offering higher throughput (instructions per cycle) than a fully STT-RAM-based cache.
Title: Energy aware and reliable STT-RAM based cache design for 3D embedded chip-multiprocessors
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016157
Marta Beltrán, Miguel Calvo, Sergio Gonzalez
Different application domains are challenging the still immature access control mechanisms currently used to authenticate and authorize system-on-chip architectures to services deployed locally or in the cloud. These domains include the Internet of Things, Smart Places and Industry 4.0, where different kinds of devices and objects, often poorly physically protected, low-cost and energy-constrained, interact with different kinds of services through lightweight communication protocols. These protocols usually guarantee basic data confidentiality and integrity by securing communication channels with cryptography, but important challenges related to authentication and authorization remain. This work proposes a new system-to-service authentication and authorization mechanism based on the combination of a Physical Unclonable Function (PUF) and two tokens (one devoted to authentication and the other to authorization). The mechanism can work over HTTP or CoAP, relies on federated schemes and is adapted to the specific requirements of this kind of environment. It is validated, and its efficiency and security are evaluated, using a real healthcare case study.
Title: Federated system-to-service authentication and authorization combining PUFs and tokens
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016145
Daniela Genius, L. Apvrille
Massively parallel applications such as telecommunication and video streaming have the particularity that a large proportion of the time is spent accessing communication channels between the tasks, due to contention on the on-chip interconnect. Moreover, the analysis of a given task deployment is often tedious. Thus, we propose to extend an existing easy-to-use system-level design methodology to task farm applications. The contribution first concerns adding relevant SysML modeling elements to take into account application code, hardware platforms and deployment constraints. Secondly, new modeling elements, including access techniques to communication channels, must be given a semantics in order to transform models into a well-defined SystemC virtual prototyping MPSoC platform. A telecommunication application serves as an example.
Title: System-level design for communication-centric task farm applications
Pub Date: 2017-07-12, DOI: 10.1109/ReCoSoC.2017.8016149
Hamidreza Ahmadian, Farzad Nekouei, R. Obermaisser
Adaptivity in terms of fault recovery and energy efficiency, along with mixed-criticality support, is demanded in today's embedded systems. Safety-critical systems are expected to switch between precomputed resource allocations at runtime based on information monitored on the platform. In addition, those systems are expected to adjust their internal behavior in response to changes in the environment while operating at a desired safety level. At the same time, resource requests in such systems can be highly dynamic and data dependent. Aiming to meet a superset of all worst-case demands leads to unaffordable overheads in terms of resource utilization. Hence, efficient resource management mechanisms are required to provide fault recovery and to make the system adaptive to changes in the environment or in the resource requests, while keeping the system in a safe state. This paper introduces a solution for supporting resource management in networks-on-chip that fulfills the requirements of adaptive mixed-criticality systems, and proposes an architecture that establishes fault recovery by switching between precomputed resource allocations based on statistical and diagnostic information.
Title: Fault recovery and adaptation in time-triggered Networks-on-Chips for mixed-criticality systems
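Switching between precomputed resource allocations on the basis of diagnostic information can be sketched as follows. This is only an illustration under stated assumptions, not the architecture from the paper: the allocation names, the slot counts, the diagnostic fields `faulty_links` and `avg_utilization`, and the 0.3 threshold are all hypothetical. The one property the sketch tries to reflect is that every allocation is computed ahead of time and the safety-critical share never shrinks when the system adapts.

```python
# Hypothetical precomputed allocations: time slots per traffic class.
# The safety-critical class keeps the same share in every allocation.
ALLOCATIONS = {
    "nominal":    {"safety_ctrl": 4, "video": 8, "logging": 4},
    "link_fault": {"safety_ctrl": 4, "video": 4, "logging": 0},
    "low_power":  {"safety_ctrl": 4, "video": 2, "logging": 2},
}


def select_allocation(diagnostics):
    """Map monitored information to the name of a precomputed allocation."""
    if diagnostics.get("faulty_links"):
        return "link_fault"  # fault recovery takes priority over energy saving
    if diagnostics.get("avg_utilization", 1.0) < 0.3:  # assumed threshold
        return "low_power"
    return "nominal"


class ResourceManager:
    """Keeps track of the active allocation and switches it on request."""

    def __init__(self):
        self.active = "nominal"

    def update(self, diagnostics):
        # No schedule is computed at runtime: the manager only selects
        # one of the allocations prepared off-line.
        self.active = select_allocation(diagnostics)
        return ALLOCATIONS[self.active]


if __name__ == "__main__":
    manager = ResourceManager()
    print(manager.update({"faulty_links": ["r2->r3"], "avg_utilization": 0.2}))
    print(manager.update({"faulty_links": [], "avg_utilization": 0.2}))
```

Because the runtime decision is a lookup rather than a scheduling computation, the switching logic stays small and its behaviour can be enumerated ahead of time, which is the usual motivation for precomputing allocations in safety-critical settings.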
Pub Date: 2017-07-01, DOI: 10.1109/ReCoSoC.2017.8016162
P. Dziurzański, T. Maka
This paper proposes a technique for determining the current mode in an electronic control unit (ECU) at run-time. We use a decision tree classifier that observes the latest execution times of processes (runnables). When a mode change is detected, runnables are migrated to decrease the number of active cores, leading to considerable energy savings without violating any timing constraints. The proposed approach consists of both off-line and on-line steps, with the more computationally intensive steps performed statically. In the presented automotive use case, the current mode is detected with 100% accuracy while observing the execution time of a single particular runnable. The migration time of systems with dynamic mode detection based on the runnable execution time with various periods is also provided.
Title: Current mode detection in hard real-time automotive applications dedicated to many-core platforms
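Mode detection from the execution time of a single runnable can be sketched with the simplest possible decision tree, a depth-one tree (a stump) with one threshold. The sketch below is an assumption-laden illustration, not the classifier from the paper: the mode names, the training measurements and the helper names `train_stump` and `detect_mode` are hypothetical. It mirrors the off-line/on-line split by learning the threshold once and doing only a comparison at run-time.

```python
from statistics import mean


def train_stump(samples):
    """Off-line step: learn one threshold separating two modes.

    samples is a list of (execution_time_us, mode) pairs with two distinct modes.
    Returns (threshold, mode_below, mode_above).
    """
    modes = sorted({mode for _, mode in samples})
    if len(modes) != 2:
        raise ValueError("this sketch handles exactly two modes")
    average = {m: mean(t for t, mode in samples if mode == m) for m in modes}
    low, high = sorted(modes, key=average.get)
    # Split halfway between the slowest 'low' sample and the fastest 'high' one.
    slowest_low = max(t for t, mode in samples if mode == low)
    fastest_high = min(t for t, mode in samples if mode == high)
    return (slowest_low + fastest_high) / 2, low, high


def detect_mode(stump, execution_time_us):
    """On-line step: a single comparison on the latest execution time."""
    threshold, low, high = stump
    return low if execution_time_us <= threshold else high


# Hypothetical off-line measurements of one runnable, in microseconds.
TRAINING = [(110, "idle"), (120, "idle"), (125, "idle"),
            (290, "cruise"), (300, "cruise"), (310, "cruise")]

if __name__ == "__main__":
    stump = train_stump(TRAINING)
    print(detect_mode(stump, 130), detect_mode(stump, 280))
```

A real classifier would typically use deeper trees and several runnables; the stump is chosen here only because it keeps the run-time cost to one comparison, which matches the idea of pushing the expensive work to the off-line phase.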