Simulink has gained a lot of acceptance due to its intuitive through block-based algorithm design, simulation, and rapid prototyping capabilities for signal processing as well as control applications. However, automatic code generation for heterogeneous architectures is currently not supported by Simulink. In the literature, there exist automatic translation toolchains for generation of C or C++ code from Simulink models, which then are used for implementation or validation purposes. But few of them approach the generation of models that can be used in well-established Electronic System Level (ESL) design methodologies and tools. In order to address this issue, we present a methodology to extract an executable specification based on Data Flow Graphs (DFGs) from a given Simulink model. Such a specification can then be used by ESL tools to perform a Design Space Exploration (DSE) and generate code for hardware/software partitions directly from the ESL model. In a case study from signal processing, we validate the equivalence of the results of the simulation in Simulink and the results obtained by simulation of the DFG fully automatically generated from the Simulink model in the SystemC-based actor language SysteMoC.
由于其直观的基于块的算法设计、仿真和快速原型设计功能,Simulink在信号处理和控制应用中获得了广泛的认可。然而,目前Simulink并不支持异构体系结构的自动代码生成。在文献中,存在用于从Simulink模型生成C或c++代码的自动翻译工具链,然后将其用于实现或验证目的。但是,他们中很少有人接近可以在已建立的电子系统级(ESL)设计方法和工具中使用的模型生成。为了解决这个问题,我们提出了一种从给定的Simulink模型中提取基于数据流图(DFGs)的可执行规范的方法。这样的规范可以被ESL工具用来执行设计空间探索(Design Space Exploration, DSE),并直接从ESL模型生成硬件/软件分区的代码。在信号处理的一个案例研究中,我们验证了在Simulink中仿真的结果与在基于systemc的参与者语言SysteMoC中对Simulink模型完全自动生成的DFG进行仿真的结果是等价的。
{"title":"Automatic Conversion of Simulink Models to SysteMoC Actor Networks","authors":"Martín Letras, J. Falk, S. Wildermann, J. Teich","doi":"10.1145/3078659.3078668","DOIUrl":"https://doi.org/10.1145/3078659.3078668","url":null,"abstract":"Simulink has gained a lot of acceptance due to its intuitive through block-based algorithm design, simulation, and rapid prototyping capabilities for signal processing as well as control applications. However, automatic code generation for heterogeneous architectures is currently not supported by Simulink. In the literature, there exist automatic translation toolchains for generation of C or C++ code from Simulink models, which then are used for implementation or validation purposes. But few of them approach the generation of models that can be used in well-established Electronic System Level (ESL) design methodologies and tools. In order to address this issue, we present a methodology to extract an executable specification based on Data Flow Graphs (DFGs) from a given Simulink model. Such a specification can then be used by ESL tools to perform a Design Space Exploration (DSE) and generate code for hardware/software partitions directly from the ESL model. In a case study from signal processing, we validate the equivalence of the results of the simulation in Simulink and the results obtained by simulation of the DFG fully automatically generated from the Simulink model in the SystemC-based actor language SysteMoC.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124428094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Hot Path SSA (HPSSA) form filled a long-standing void by providing an SSA-like intermediate representation that could weave static program code and run-time profile information in a single data structure, thereby facilitating speculative analyses and optimizations. The original algorithm proposed for the Hot Path SSA construction builds HPSSA over non-SSA programs with interleaved SSA and HPSSA construction passes. In this work, we propose a new algorithm for constructing HPSSA programs from programs in the SSA form. Our new algorithm has the following advantages over the original algorithm: firstly, as all modern compilers have built-in SSA construction passes, it is difficult to incorporate the original algorithm within an existing compiler as it requires the compiler writer to intrude in and retrofit the HPSSA construction stages with the SSA construction pass. Our new algorithm can simply be pipelined next to the SSA construction pass with no modification required to existing passes. Secondly, our new algorithm is more efficient than the original algorithm: the original algorithm needs to process all definitions in the program while our new algorithm processes only the φ-functions---a small fraction of all program instructions. Most importantly, our new algorithm is much simpler than the original algorithm. We have implemented our algorithm in the LLVM compiler framework and evaluated its effectiveness by implementing an ILP driven path-profile guided register allocator.
{"title":"Constructing HPSSA over SSA","authors":"Smriti Jaiswal, P. Hegde, Subhajit Roy","doi":"10.1145/3078659.3078660","DOIUrl":"https://doi.org/10.1145/3078659.3078660","url":null,"abstract":"The Hot Path SSA (HPSSA) form filled a long-standing void by providing an SSA-like intermediate representation that could weave static program code and run-time profile information in a single data structure, thereby facilitating speculative analyses and optimizations. The original algorithm proposed for the Hot Path SSA construction builds HPSSA over non-SSA programs with interleaved SSA and HPSSA construction passes. In this work, we propose a new algorithm for constructing HPSSA programs from programs in the SSA form. Our new algorithm has the following advantages over the original algorithm: firstly, as all modern compilers have built-in SSA construction passes, it is difficult to incorporate the original algorithm within an existing compiler as it requires the compiler writer to intrude in and retrofit the HPSSA construction stages with the SSA construction pass. Our new algorithm can simply be pipelined next to the SSA construction pass with no modification required to existing passes. Secondly, our new algorithm is more efficient than the original algorithm: the original algorithm needs to process all definitions in the program while our new algorithm processes only the φ-functions---a small fraction of all program instructions. Most importantly, our new algorithm is much simpler than the original algorithm. We have implemented our algorithm in the LLVM compiler framework and evaluated its effectiveness by implementing an ILP driven path-profile guided register allocator.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117221541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Juurlink, J. Lucas, Nadjib Mammeri, Martyn Bliss, G. Keramidas, Chrysa Kokkala, A. Richards
The LPGPU2 project is a 30-month-project (Innovation Action) funded by the European Union. Its overall goal is to develop an analysis and visualization framework that enables GPU application developers to improve the performance and power consumption of their applications. To achieve this overall goal, several key objectives need to be achieved. First, several applications (use cases) need to be developed for or ported to low-power GPUs. Thereafter, these applications need to be optimized using the tooling framework. In addition, power measurement devices and power models need to be developed that are 10x more accurate than the state of the art. The project consortium actively promotes open vendor-neutral standards via the Khronos group. This paper briefly reports on the achievements made in the first half of the project, and focuses on the progress made in applications; in power measurement, estimation, and modelling; and in the analysis and visualization tool suite.
{"title":"The LPGPU2 Project: Low-Power Parallel Computing on GPUs: Extended Abstract","authors":"B. Juurlink, J. Lucas, Nadjib Mammeri, Martyn Bliss, G. Keramidas, Chrysa Kokkala, A. Richards","doi":"10.1145/3078659.3078672","DOIUrl":"https://doi.org/10.1145/3078659.3078672","url":null,"abstract":"The LPGPU2 project is a 30-month-project (Innovation Action) funded by the European Union. Its overall goal is to develop an analysis and visualization framework that enables GPU application developers to improve the performance and power consumption of their applications. To achieve this overall goal, several key objectives need to be achieved. First, several applications (use cases) need to be developed for or ported to low-power GPUs. Thereafter, these applications need to be optimized using the tooling framework. In addition, power measurement devices and power models need to be developed that are 10x more accurate than the state of the art. The project consortium actively promotes open vendor-neutral standards via the Khronos group. This paper briefly reports on the achievements made in the first half of the project, and focuses on the progress made in applications; in power measurement, estimation, and modelling; and in the analysis and visualization tool suite.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125971708","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
James Pallister, Steve Kerrison, J. Morse, K. Eder
Safely meeting Worst Case Energy Consumption (WCEC) criteria requires accurate energy modeling of software. We investigate the impact of instruction operand values upon energy consumption in cacheless embedded processors. Existing instruction-level energy models typically use measurements from random input data, providing estimates unsuitable for safe WCEC analysis. We examine probabilistic energy distributions of instructions and propose a model for composing instruction sequences using distributions, enabling WCEC analysis on program basic blocks. The worst case is predicted with statistical analysis. Further, we verify that the energy of embedded benchmarks can be characterised as a distribution, and compare our proposed technique with other methods of estimating energy consumption.
{"title":"Data Dependent Energy Modeling for Worst Case Energy Consumption Analysis","authors":"James Pallister, Steve Kerrison, J. Morse, K. Eder","doi":"10.1145/3078659.3078666","DOIUrl":"https://doi.org/10.1145/3078659.3078666","url":null,"abstract":"Safely meeting Worst Case Energy Consumption (WCEC) criteria requires accurate energy modeling of software. We investigate the impact of instruction operand values upon energy consumption in cacheless embedded processors. Existing instruction-level energy models typically use measurements from random input data, providing estimates unsuitable for safe WCEC analysis. We examine probabilistic energy distributions of instructions and propose a model for composing instruction sequences using distributions, enabling WCEC analysis on program basic blocks. The worst case is predicted with statistical analysis. Further, we verify that the energy of embedded benchmarks can be characterised as a distribution, and compare our proposed technique with other methods of estimating energy consumption.","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122537442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","authors":"","doi":"10.1145/3078659","DOIUrl":"https://doi.org/10.1145/3078659","url":null,"abstract":"","PeriodicalId":240210,"journal":{"name":"Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133316905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}