Exploiting Predictability in Dynamic Network Communication for Power-Efficient Data Transmission in LTE Radio Systems
Peter Brand, Jonathan Ah Sue, J. Brendel, J. Falk, R. Hasholzner, Jürgen Teich, S. Wildermann
DOI: 10.1145/3078659.3078670
In embedded systems powered by batteries, power is undoubtedly a critical resource, making power management an important topic in the design phase. Even though power management is a heavily researched topic, most approaches focus on improving the way the power manager reacts to outside control events. In this paper, we propose techniques that do not merely react but try to predict these outside control events in advance, thus broadening the capabilities of any employed power manager by allowing for superior transition decisions and even avoiding redundant calculations. We present results on a predictive power management system that couples a classic dynamic power manager with a machine learning subsystem in the context of a mobile device in a Long Term Evolution (LTE) system, with emphasis on evaluating the potential for saving power as well as on handling the induced prediction uncertainty. First, we examine the LTE communication protocol and showcase certain control data that has to be received periodically but may contain no information for the receiver. Finally, we show in a proof of concept based on real LTE traces and hardware simulation that predicting this information can be leveraged to enable a far superior decision process compared to a non-predicting system. Here, we achieve a theoretical best-case power saving of 15 % for an idealized prediction with 100 % accuracy and no additional power consumption.
{"title":"Exploiting Predictability in Dynamic Network Communication for Power-Efficient Data Transmission in LTE Radio Systems","authors":"Peter Brand, Jonathan Ah Sue, J. Brendel, J. Falk, R. Hasholzner, Jürgen Teich, S. Wildermann","doi":"10.1145/3078659.3078670","DOIUrl":"https://doi.org/10.1145/3078659.3078670","url":null,"abstract":"In embedded systems powered by batteries, power is undoubtedly a critical resource making power management an important topic in the design phase. Even though power management is a heavily researched topic, most approaches focus on improving the way the power manager reacts to outside control events. In this paper, we propose techniques that not only react but rather try to predict these outside control events in advance, thus, broadening the capabilities of any employed power manager by allowing for superior transition decisions and even saving redundant calculations. We present results on employing a predictive power management system that couples a classic dynamic power manager with a machine learning subsystem in the context of a mobile device in a Long Term Evolution (LTE) system, with emphasis on evaluating the potential of saving power as well as the handling of the induced prediction uncertainty. First, we examine the LTE communication protocol and showcase certain control data that has to be received periodically, but may contain no information for the receiver. Finally, we show a proof-of-concept based on real LTE traces and hardware simulation, that prediction of this information can be leveraged to allow for a far superior decision process compared to a non-predicting system. Here, we achieve a theoretical best case power saving of 15 % for an idealized prediction with 100 % accuracy and no additional power consumption.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"193 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126046736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TETRiS: a Multi-Application Run-Time System for Predictable Execution of Static Mappings
Andrés Goens, R. Khasanov, J. Castrillón, Marcus Hähnel, Till Smejkal, Hermann Härtig
DOI: 10.1145/3078659.3078663
For embedded system software, it is common to use static mappings of tasks to cores. This becomes considerably more challenging in multi-application scenarios. In this paper, we propose TETRiS, a multi-application run-time system for executing static mappings on heterogeneous system-on-chip architectures. It leverages compile-time information to map and migrate tasks in a fashion that preserves the predictable performance of static mappings while allowing the system to accommodate multiple applications. TETRiS runs on off-the-shelf embedded systems and is Linux-compatible. We embed our approach in a state-of-the-art compiler for multicore systems and evaluate the proposed run-time system on a modern heterogeneous platform using realistic benchmarks. We present two experiments in which execution time and energy consumption are comparable to those obtained with the highly optimized Linux scheduler CFS, while execution-time variance is reduced by a factor of 510 and energy-consumption variance by a factor of 83.
{"title":"TETRiS: a Multi-Application Run-Time System for Predictable Execution of Static Mappings","authors":"Andrés Goens, R. Khasanov, J. Castrillón, Marcus Hähnel, Till Smejkal, Hermann Härtig","doi":"10.1145/3078659.3078663","DOIUrl":"https://doi.org/10.1145/3078659.3078663","url":null,"abstract":"For embedded system software, it is common to use static mappings of tasks to cores. This becomes considerably more challenging in multi-application scenarios. In this paper, we propose TETRiS, a multi-application run-time system for static mappings for heterogeneous system-on-chip architectures. It leverages compile-time information to map and migrate tasks in a fashion that preserves the predictable performance of using static mappings, allowing the system to accommodate multiple applications. TETRiS runs on off-the-shelf embedded systems and is Linux-compatible. We embed our approach in a state-of-the-art compiler for multicore systems and evaluate the proposed run-time system in a modern heterogeneous platform using realistic benchmarks. We present two experiments whose execution time and energy consumptions are comparable to those obtained by the highly-optimized Linux scheduler CFS, and where execution time variance is reduced by a factor of 510, and energy consumption variance by a factor of 83.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124525979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining Dataflow Applications and Real-time Task Sets on Multi-core Platforms
H. Ali, B. Akesson, L. M. Pinho
DOI: 10.1145/3078659.3078671

Future real-time embedded systems will increasingly incorporate mixed application models with timing constraints running on the same multi-core platform. These application models are dataflow applications with timing constraints and traditional real-time applications modelled as independent arbitrary-deadline tasks. Such systems require guarantees that all running applications execute while satisfying their timing constraints. Also, to be cost-efficient in terms of design, they require efficient mapping strategies that maximize the use of system resources to reduce the overall cost. This work proposes an approach to integrate mixed application models (dataflow and traditional real-time applications) with timing requirements on the same multi-core platform. It comprises three main algorithms: 1) Slack-Based Merging, 2) Timing Parameter Extraction, and 3) Communication-Aware Mapping. Together, these three algorithms enable the mapping and scheduling of mixed application models in embedded real-time systems. The complete approach and the three algorithms presented have been validated through proofs and experimental evaluation.
{"title":"Combining Dataflow Applications and Real-time Task Sets on Multi-core Platforms","authors":"H. Ali, B. Akesson, L. M. Pinho","doi":"10.1145/3078659.3078671","DOIUrl":"https://doi.org/10.1145/3078659.3078671","url":null,"abstract":"Future real-time embedded systems will increasingly incorporate mixed application models with timing constraints running on the same multi-core platform. These application models are dataflow applications with timing constraints and traditional real-time applications modelled as independent arbitrary-deadline tasks. These systems require guarantees that all running applications execute satisfying their timing constraints. Also, to be cost-efficient in terms of design, they require efficient mapping strategies that maximize the use of system resources to reduce the overall cost. This work proposes an approach to integrate mixed application models (dataflow and traditional real-time applications) with timing requirements on the same multi-core platform. It comprises three main algorithms: 1) Slack-Based Merging, 2) Timing Parameter Extraction, and 3) Communication-Aware Mapping. Together, these three algorithms play a part in allowing mapping and scheduling of mixed application models in embedded real-time systems. The complete approach and the three algorithms presented have been validated through proofs and experimental evaluation.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"125 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134283656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enabling zero-copy OpenMP offloading on the PULP many-core accelerator
Alessandro Capotondi, A. Marongiu
DOI: 10.1145/3078659.3079071

Many-core heterogeneous designs are nowadays widely available in embedded systems. Initiatives such as the HSA push for a model where the host processor and the accelerator(s) communicate via coherent, Unified Virtual Memory (UVM). In this paper, we describe our experience in porting the OpenMP v4 programming model to a low-end, heterogeneous embedded system based on the PULP many-core accelerator featuring lightweight (software-managed) UVM support. We describe a GCC-based toolchain which enables: i) the automatic generation of host and accelerator binaries from a single, high-level, OpenMP parallel program; ii) the automatic instrumentation of the accelerator program to transparently manage UVM. This enables up to 4x faster execution compared to traditional copy-based offload mechanisms.
{"title":"Enabling zero-copy OpenMP offloading on the PULP many-core accelerator","authors":"Alessandro Capotondi, A. Marongiu","doi":"10.1145/3078659.3079071","DOIUrl":"https://doi.org/10.1145/3078659.3079071","url":null,"abstract":"Many-core heterogeneous designs are nowadays widely available among embedded systems. Initiatives such as the HSA push for a model where the host processor and the accelerator(s) communicate via coherent, Unified Virtual Memory (UVM). In this paper we describe our experience in porting the OpenMP v4 programming model to a low-end, heterogeneous embedded system based on the PULP many-core accelerator featuring lightweight (software-managed) UVM support. We describe a GCC-based toolchain which enables: i) the automatic generation of host and accelerator binaries from a single, high-level, OpenMP parallel program; ii) the automatic instrumentation of the accelerator program to transparently manage UVM. This enables up to 4x faster execution compared to traditional copy-based offload mechanisms.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134511182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stencil Autotuning with Ordinal Regression: Extended Abstract
Biagio Cosenza, J. Durillo, Stefano Ermon, B. Juurlink
DOI: 10.1145/3078659.3078664
The increasing performance of today's computer architectures comes with an unprecedented increase in hardware complexity. Unfortunately, this results in difficult-to-tune software and, consequently, in a gap between the potential peak performance and the actual performance. Automatic tuning is an emerging approach that assists the programmer in managing this complexity. State-of-the-art autotuners are limited, though: they either require long tuning times, e.g., due to iterative searches, or cannot tackle the complexity of the problem due to the limitations of the supervised machine learning (ML) methodologies used. In particular, traditional ML autotuning approaches exploiting classification algorithms (such as neural networks and support vector machines) face difficulties in capturing all features of large search spaces. We propose a new way of performing automatic tuning based on structural learning: the tuning problem is formulated as a version-ranking prediction problem and solved using ordinal regression. We demonstrate its potential on a well-known autotuning problem: stencil computations. We compare state-of-the-art iterative compilation methods with our ordinal regression approach and analyze the quality of the obtained rankings in terms of Kendall rank correlation coefficients.
{"title":"Stencil Autotuning with Ordinal Regression: Extended Abstract","authors":"Biagio Cosenza, J. Durillo, Stefano Ermon, B. Juurlink","doi":"10.1145/3078659.3078664","DOIUrl":"https://doi.org/10.1145/3078659.3078664","url":null,"abstract":"The increasing performance of today's computer architecture comes with an unprecedented augment of hardware complexity. Unfortunately this results in difficult-to-tune software and consequentially in a gap between the potential peak performance and the actual performance. Automatic tuning is an emerging approach that assists the programmer in managing this complexity. State-of-the-art autotuners are limited, though: they either require long tuning times, e.g., due to iterative searches, or cannot tackle the complexity of the problem due to the limitation of the supervised machine learning (ML) methodologies used. In particular, traditional ML autotuning approaches exploiting classification algorithms (such as neural networks and support vector machines) face difficulties in capturing all features of large search spaces. We propose a new way of performing automatic tuning based on structural learning: the tuning problem is formulated as a version ranking prediction modeling and solved using ordinal regression. We demonstrate its potential on a well-known autotuning problem: stencil computations. We compare state-of-the-art iterative compilation methods with our ordinal regression approach and analyze the quality of the obtained ranking in terms of Kendall rank correlation coefficients.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133150349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Numerical Accuracy Improvement by Interprocedural Program Transformation
Nasrine Damouche, M. Martel, Alexandre Chapoutot
DOI: 10.1145/3078659.3078662

Floating-point numbers are used to approximate the exact real numbers in a wide range of domains like numerical simulation, embedded software, etc. However, floating-point numbers are only a finite approximation of real numbers. In practice, this approximation may introduce round-off errors, and these can lead to catastrophic results. To cope with this issue, we have developed a tool which partly corrects these round-off errors and consequently improves the numerical accuracy of computations by automatically transforming programs in a source-to-source manner. Our transformation relies on static analysis by abstract interpretation and operates on pieces of code with assignments, conditionals, and loops. In former work, we focused on the intraprocedural transformation of programs; in this article, we introduce the interprocedural transformation to improve accuracy.
{"title":"Numerical Accuracy Improvement by Interprocedural Program Transformation","authors":"Nasrine Damouche, M. Martel, Alexandre Chapoutot","doi":"10.1145/3078659.3078662","DOIUrl":"https://doi.org/10.1145/3078659.3078662","url":null,"abstract":"Floating-point numbers are used to approximate the exact real numbers in a wide range of domains like numerical simulations, embedded software, etc. However, floating-point numbers are a finite approximation of real numbers. In practice, this approximation may introduce round-off errors and this can lead to catastrophic results. To cope with this issue, we have developed a tool which corrects partly these round-off errors and which consequently improves the numerical accuracy of computations by automatically transforming programs in a source to source manner. Our transformation, relies on static analysis by abstract interpretation and operates on pieces of code with assignments, conditionals and loops. In former work, we have focused on the intraprocedural transformation of programs and, in this article, we introduce the interprocedural transformation to improve accuracy.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122077642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics
Jutta Pirkl, Andreas Becher, Jorge Echavarria, J. Teich, S. Wildermann
DOI: 10.1145/3078659.3078669
Approximate Computing aims at trading off computational accuracy against improvements in performance, resource utilization, and power consumption by exploiting the capability of many applications to tolerate a certain loss of quality. A key issue is that the impact of approximation depends on the input data as well as on user preferences and environmental conditions. In this context, we investigate the concept of self-adaptive image processing that is able to autonomously adapt 2D-convolution filter operators of different accuracy degrees by means of partial reconfiguration on Field-Programmable Gate Arrays (FPGAs). Experimental evaluation shows that the dynamic system is able to better exploit a given error tolerance than any static approximation technique due to its responsiveness to changes in the input data. Additionally, it provides a user control knob to select the desired output quality via a quality-metric threshold at runtime.
{"title":"Self-Adaptive FPGA-Based Image Processing Filters Using Approximate Arithmetics","authors":"Jutta Pirkl, Andreas Becher, Jorge Echavarria, J. Teich, S. Wildermann","doi":"10.1145/3078659.3078669","DOIUrl":"https://doi.org/10.1145/3078659.3078669","url":null,"abstract":"Approximate Computing aims at trading off computational accuracy against improvements regarding performance, resource utilization and power consumption by making use of the capability of many applications to tolerate a certain loss of quality. A key issue is the dependency of the impact of approximation on the input data as well as user preferences and environmental conditions. In this context, we therefore investigate the concept of self-adaptive image processing that is able to autonomously adapt 2D-convolution filter operators of different accuracy degrees by means of partial reconfiguration on Field-Programmable-Gate-Arrays (FPGAs). Experimental evaluation shows that the dynamic system is able to better exploit a given error tolerance than any static approximation technique due to its responsiveness to changes in input data. Additionally, it provides a user control knob to select the desired output quality via the metric threshold at runtime.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"48 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132026471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Robust Mapping of Process Networks to Many-Core Systems using Bio-Inspired Design Centering
G. Hempel, Andrés Goens, J. Castrillón, Josefine Asmus, I. Sbalzarini
DOI: 10.1145/3078659.3078667
Embedded systems are often designed as complex architectures with numerous processing elements. Effectively programming such systems requires parallel programming models, e.g., task-based or dataflow-based models. With these types of models, the mapping of the abstract application model to the existing hardware architecture plays a decisive role and is usually optimized to achieve an ideal resource footprint or a near-minimal execution time. However, when mapping several independent programs to the same platform, resource conflicts can arise. This can be circumvented by remapping some of the tasks of an application, which in turn affects its timing behavior, possibly leading to constraint violations. In this work, we present a novel method to compute mappings that are robust against local task remapping. The underlying method is based on the bio-inspired design-centering algorithm Lp-Adaptation. We evaluate it with several benchmarks on different platforms and show that mappings obtained with our algorithm are indeed robust. In all experiments, our robust mappings tolerated significantly more run-time perturbations without violating constraints than mappings devised with optimization heuristics.
{"title":"Robust Mapping of Process Networks to Many-Core Systems using Bio-Inspired Design Centering","authors":"G. Hempel, Andrés Goens, J. Castrillón, Josefine Asmus, I. Sbalzarini","doi":"10.1145/3078659.3078667","DOIUrl":"https://doi.org/10.1145/3078659.3078667","url":null,"abstract":"Embedded systems are often designed as complex architectures with numerous processing elements. Effectively programming such systems requires parallel programming models e.g. task-based or dataflow-based models. With these types of models, the mapping of the abstract application model to the existing hardware architecture plays a decisive role and is usually optimized to achieve an ideal resource footprint or a near-minimal execution time. However, when mapping several independent programs to the same platform, resource conflicts can arise. This can be circumvented by remapping some of the tasks of an application, which in turn affect its timing behavior, possibly leading to constraint violations. In this work we present a novel method to compute mappings that are robust against local task remapping. The underlying method is based on the bio-inspired design centering algorithm of Lp-Adaptation. We evaluate this with several benchmarks on different platforms and show that mappings obtained with our algorithm are indeed robust. In all experiments, our robust mappings tolerated significantly more run-time perturbations without violating constraints than mappings devised with optimization heuristics","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134019251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Accuracy of Near-Optimal GPU-Based Path Planning for UAVs
D. Palossi, A. Marongiu, L. Benini
DOI: 10.1145/3078659.3079072

Path planning is one of the key functional blocks of any autonomous aerial vehicle (UAV). The goal of a path planner module is to constantly update the route of the vehicle based on information sensed in real time. Given the high computational requirements of this task, heterogeneous many-cores are appealing candidates for its execution. Approximate path computation has proven a promising approach to reduce total execution time at the cost of a slight loss in accuracy. In this work, we study the performance and accuracy of state-of-the-art, near-optimal parallel path planning in combination with program transformations aimed at ensuring efficient use of embedded GPU resources. We propose a profile-based algorithmic variant which boosts GPU execution by up to ≈ 7x while maintaining the accuracy loss below 5%.
{"title":"On the Accuracy of Near-Optimal GPU-Based Path Planning for UAVs","authors":"D. Palossi, A. Marongiu, L. Benini","doi":"10.1145/3078659.3079072","DOIUrl":"https://doi.org/10.1145/3078659.3079072","url":null,"abstract":"Path planning is one of the key functional blocks for any autonomous aerial vehicle (UAV). The goal of a path planner module is to constantly update the route of the vehicle based on information sensed in real-time. Given the high computational requirements of this task, heterogeneous many-cores are appealing candidates for its execution. Approximate path computation has proven a promising approach to reduce total execution time, at the cost of a slight loss in accuracy. In this work we study performance and accuracy of state-of-the-art, near-optimal parallel path planning in combination with program transformations aimed at ensuring efficient use of embedded GPU resources. We propose a profile-based algorithmic variant which boosts GPU execution by up to ≈ 7x, while maintaining the accuracy loss below 5%.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"355 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122763879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Conversion of Simulink Models to SysteMoC Actor Networks
Martín Letras, J. Falk, S. Wildermann, J. Teich
DOI: 10.1145/3078659.3078668

Simulink has gained a lot of acceptance due to its intuitive block-based algorithm design, simulation, and rapid prototyping capabilities for signal processing as well as control applications. However, automatic code generation for heterogeneous architectures is currently not supported by Simulink. In the literature, there exist automatic translation toolchains for generating C or C++ code from Simulink models, which is then used for implementation or validation purposes. But few of them address the generation of models that can be used in well-established Electronic System Level (ESL) design methodologies and tools. In order to address this issue, we present a methodology to extract an executable specification based on Data Flow Graphs (DFGs) from a given Simulink model. Such a specification can then be used by ESL tools to perform a Design Space Exploration (DSE) and generate code for hardware/software partitions directly from the ESL model. In a case study from signal processing, we validate the equivalence of the results of the simulation in Simulink and the results obtained by simulating the DFG fully automatically generated from the Simulink model in the SystemC-based actor language SysteMoC.
{"title":"Automatic Conversion of Simulink Models to SysteMoC Actor Networks","authors":"Martín Letras, J. Falk, S. Wildermann, J. Teich","doi":"10.1145/3078659.3078668","DOIUrl":"https://doi.org/10.1145/3078659.3078668","url":null,"abstract":"Simulink has gained a lot of acceptance due to its intuitive through block-based algorithm design, simulation, and rapid prototyping capabilities for signal processing as well as control applications. However, automatic code generation for heterogeneous architectures is currently not supported by Simulink. In the literature, there exist automatic translation toolchains for generation of C or C++ code from Simulink models, which then are used for implementation or validation purposes. But few of them approach the generation of models that can be used in well-established Electronic System Level (ESL) design methodologies and tools. In order to address this issue, we present a methodology to extract an executable specification based on Data Flow Graphs (DFGs) from a given Simulink model. Such a specification can then be used by ESL tools to perform a Design Space Exploration (DSE) and generate code for hardware/software partitions directly from the ESL model. In a case study from signal processing, we validate the equivalence of the results of the simulation in Simulink and the results obtained by simulation of the DFG fully automatically generated from the Simulink model in the SystemC-based actor language SysteMoC.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124428094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}