Modeling shared cache and bus in multi-cores for timing analysis
Sudipta Chattopadhyay, Abhik Roychoudhury, T. Mitra
Timing analysis of concurrent programs running on multi-core platforms is currently an important problem. The key to solving this problem is to accurately model the timing effects of the shared resources in multi-cores, namely the shared cache and the shared bus. In this paper, we provide an integrated timing analysis framework that captures the timing effects of both the shared cache and the shared bus. We also develop a cycle-accurate simulation infrastructure to evaluate the precision of our analysis. Experimental results from a large fragment of in-orbit spacecraft software show that our analysis produces around 20% overestimation compared to simulation results.
{"title":"Modeling shared cache and bus in multi-cores for timing analysis","authors":"Sudipta Chattopadhyay, Abhik Roychoudhury, T. Mitra","doi":"10.1145/1811212.1811220","DOIUrl":"https://doi.org/10.1145/1811212.1811220","url":null,"abstract":"Timing analysis of concurrent programs running on multi-core platforms is currently an important problem. The key to solving this problem is to accurately model the timing effects of shared resources in multi-cores, namely shared cache and bus. In this paper, we provide an integrated timing analysis framework that captures timing effects of both shared cache and shared bus. We also develop a cycle-accurate simulation infra-structure to evaluate the precision of our analysis. Experimental results from a large fragment of an in-orbit spacecraft software show that our analysis produces around 20% over-estimation over simulation results.","PeriodicalId":375451,"journal":{"name":"Software and Compilers for Embedded Systems","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126817204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A compiler-based infrastructure for fault-tolerant co-design
Felipe Restrepo-Calle, A. Martínez-Álvarez, H. Guzmán-Miranda, F. R. Palomo, M. Aguirre, S. Cuenca-Asensi
The protection of processor-based systems against the harmful effects of transient faults (hardening) is gaining importance as technology shrinks. Hybrid hardware/software hardening approaches are promising alternatives in the design of such fault-tolerant systems. This paper presents a compiler-based infrastructure for exploring the design space between hardware-only and software-only fault-tolerance techniques. The compiler design is based on a generic architecture that facilitates the implementation of software-based techniques, providing a uniform, target-independent hardening core. In this way, these methods can be implemented in an architecture-independent way, and new protection mechanisms can easily be integrated to automatically produce hardened code. The infrastructure includes a simulator that provides information about memory and execution-time overheads to aid the designer in co-design decisions. The tool chain is complemented by a hardware fault-emulation tool that measures the fault coverage of the different solutions running on the real system. A case study demonstrates the flexibility of the infrastructure in meeting the reliability requirements of a system within its memory and performance constraints.
{"title":"A compiler-based infrastructure for fault-tolerant co-design","authors":"Felipe Restrepo-Calle, A. Martínez-Álvarez, H. Guzmán-Miranda, F. R. Palomo, M. Aguirre, S. Cuenca-Asensi","doi":"10.1145/1811212.1811218","DOIUrl":"https://doi.org/10.1145/1811212.1811218","url":null,"abstract":"The protection of processor-based systems to mitigate the harmful effects of transient faults (hardening) is gaining importance as technology shrinks. Hybrid hardware/software hardening approaches are promising alternatives in the design of such fault tolerant systems. This paper presents a compiler-based infrastructure for facilitating the exploration of the design space between hardware-only and software-only fault tolerant techniques. The compiler design is based on a generic architecture that facilitates the implementation of software-based techniques, providing an uniform isolated-from-target hardening core. In this way, these methods can be implemented in an architecture independent way and can easily integrate new protection mechanisms to automatically produce hardened code. The infrastructure includes a simulator that provides information about memory and execution time overheads to aid the designer in the co-design decisions. The tool-chain is complemented by a hardware fault emulation tool that allows to measure the fault coverage of the different solutions running on the real system. A case study was implemented allowing to evaluate the flexibility of the infrastructure to fit the reliability requirements of the system within their memory and performance restrictions.","PeriodicalId":375451,"journal":{"name":"Software and Compilers for Embedded Systems","volume":"103 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115737074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Register allocation deconstructed
D. Koes, S. Goldstein
Register allocation is a fundamental part of any optimizing compiler. Effectively managing the limited register resources of the constrained architectures commonly found in embedded systems is essential in order to maximize code quality. In this paper we deconstruct the register allocation problem into distinct components: coalescing, spilling, move insertion, and assignment. Using an optimal register allocation framework, we empirically evaluate the importance of each of the components, the impact of component integration, and the effectiveness of existing heuristics. We evaluate code quality both in terms of code performance and code size and consider four distinct instruction set architectures: ARM, Thumb, x86, and x86-64. The results of our investigation reveal general principles for register allocation design.
In: Software and Compilers for Embedded Systems, April 23, 2009. doi:10.1145/1543820.1543824
Precise simulation of interrupts using a rollback mechanism
F. Brandner
Instruction set simulation based on dynamic compilation is a popular approach that focuses on fast simulation of user-visible features according to the instruction-set-architecture abstraction of a given processor. Simulating interrupts, even though they are rare events, is very expensive for these simulators, because interrupts may occur at any time and at any phase of the program's execution. Many optimizations in compiling simulators cannot be applied, or become less beneficial, in the presence of interrupts. We propose a rollback mechanism in order to enable effective optimizations to be combined with cycle-accurate handling of interrupts. Our simulator speculatively executes instructions of the emulated processor assuming that no interrupts will occur. At restore-points this assumption is verified, and the processor state is reverted to an earlier restore-point if an interrupt did actually occur. All architecture-dependent simulation functions are derived using an architecture description language that is capable of automatically generating optimized simulators using our new approach. We are able to eliminate most of the overhead usually induced by interrupts. The simulation speed is improved by up to a factor of 2.95, and compilation time is reduced by nearly 30% even for lower compilation thresholds.
In: Software and Compilers for Embedded Systems, April 23, 2009. doi:10.1145/1543820.1543833
A design flow based on a domain specific language to concurrent development of device drivers and device controller simulation models
E. B. Lisboa, Luciano Silva, Igino Chaves, Thiago Lima, E. Barros
Nowadays, embedded systems must communicate with many different peripheral devices. The communication structure is implemented by a combined hardware/software solution: the device controller is the hardware that implements the, in general, complex communication protocols, while the device driver provides transparent access to the functionalities of the device and depends on the architecture of the device controller. The design of the communication structure therefore demands great effort and considerable development time, and it is very susceptible to errors. To mitigate these issues, this paper presents an approach for the concurrent development of device controller simulation models and of the respective device drivers. A domain specific language, called DevC, is also proposed to describe device controller features at a high level of abstraction; a brief introduction to this language is presented. From the specifications described in DevC, controller models and device drivers are synthesized. Both the device controller and the driver are first validated on a hardware virtual platform to reduce simulation time, and then validated on real hardware. Several controllers, such as a serial controller and a text/graphic LCD controller, have been developed to validate the proposed approach.
{"title":"A design flow based on a domain specific language to concurrent development of device drivers and device controller simulation models","authors":"E. B. Lisboa, Luciano Silva, Igino Chaves, Thiago Lima, E. Barros","doi":"10.1145/1543820.1543830","DOIUrl":"https://doi.org/10.1145/1543820.1543830","url":null,"abstract":"Nowadays, embedded Systems must communicate with different peripheral devices. The communication structure is implemented by a combined solution of hardware and software. The device controller is the hardware that implements, in general, complex communication protocols. On the other hand, the device driver provides transparent access to the functionalities of the device and depends on the architecture of the device controller. So, the design of the communication structure demands great effort, considerable development time and is very susceptible to errors. To minimize these issues, this paper presents an approach to the concurrent development of device controller simulation models and of the respective device drivers. Also a domain specific language, called DevC, is proposed to describe device controller features in a high level of abstraction. In this paper a brief introduction to this language is presented. From the specifications described in DevC, controller models and device drivers are synthesized. Both the device controller and the driver are first validated using a hardware virtual platform to reduce simulation time, and then they are validated on real hardware. Some controllers, such as a serial, as well as a text and graphic lcd, have been developed to validate the approach proposed.","PeriodicalId":375451,"journal":{"name":"Software and Compilers for Embedded Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131800400","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Canals language and its compiler
Andreas Dahlin, Johan Ersfolk, Guyfu Yang, Haitham Habli, J. Lilius
Stream-based computing, as embodied in stream-programming environments and streaming languages, has attracted quite a lot of interest as a potential solution to programming many-cores. Modern embedded multimedia devices exhibit many characteristics of stream-based computing, with the additional constraint of energy consumption. In this paper we present a new streaming language, Canals, together with its compiler. Canals offers the following novel features: (1) the ability to describe the scheduling of the computation kernels: Canals has a sub-language for describing schedulers and run-time system support; (2) the ability to detect the type of data on the inputs of a network (scheduling often depends on the data at run time): Canals provides bit-stream parsing through automatic deserialization of data on network inputs; and (3) a choice of synchronization mechanism between computation kernels and the scheduler to avoid overheads, implemented in the run-time system through the Hardware Abstraction Layer (HAL). We describe the language and the code generators for the Cell processor and the Altera FPGA board.
{"title":"The canals language and its compiler","authors":"Andreas Dahlin, Johan Ersfolk, Guyfu Yang, Haitham Habli, J. Lilius","doi":"10.1145/1543820.1543829","DOIUrl":"https://doi.org/10.1145/1543820.1543829","url":null,"abstract":"Stream-based computing as embodied in stream-programming environments and streaming languages has attracted quite a lot of interest as a potential solution to programming many-cores. Modern embedded multimedia devices embody many characteristics of stream-based computing with the additional constraint on energy-consumption. In this paper we present a new streaming language Canals together with its compiler. Canals proposes the following novel features: 1. The ability to describe the scheduling of the computation kernels: Canals has a sub-language for describing schedulers and run-time system support. 2. The ability to detect type of data on the inputs of a network (the scheduling is often dependent on the data at run-time): Canals provides bit-stream parsing through automatic deserialization of data in network inputs. 3. Choice of synchronization mechanism between computational kernels and the scheduler to avoid overheads. This is implemented in the run-time system through the Hardware Abstraction Layer (HAL). We describe the language and the code-generators for the Cell processor and the Altera FPGA board.","PeriodicalId":375451,"journal":{"name":"Software and Compilers for Embedded Systems","volume":"40 7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133599228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Accelerating WCET-driven optimizations by the invariant path paradigm: a case study of loop unswitching
Paul Lokuciejewski, Fatih Gedikli, P. Marwedel
The worst-case execution time (WCET), an upper bound on the maximum execution time, corresponds to the longest path through the program's control flow graph. Its reduction is the objective of a WCET optimization. Unlike average-case execution time compiler optimizations, which consider a static (most frequently executed) path, the longest path is variable, since optimizing it might result in another path becoming the effective longest path. To keep path information valid, WCET optimizations typically perform a time-consuming static WCET analysis after each code modification to ensure that subsequent optimization steps operate on the critical path. However, a code modification does not always lead to a path switch, making many WCET analyses superfluous. To cope with this problem, we propose a new paradigm called the Invariant Path, which eliminates this pessimism by indicating whether a path update is mandatory. To demonstrate the paradigm's practical use, we developed a novel optimization called WCET-driven Loop Unswitching which exploits the Invariant Path information. In a case study, our optimization reduced the WCET of real-world benchmarks by up to 17.4%, while exploiting the Invariant Path paradigm reduced the optimization time by 57.5% on average.
In: Software and Compilers for Embedded Systems, April 23, 2009. doi:10.1145/1543820.1543823
Separate compilation for synchronous programs
J. Brandt, K. Schneider
Esterel and other imperative synchronous languages offer a rich set of statements that can be used to conveniently describe complex control behaviors in a concise yet precise way. In particular, the ability to arbitrarily nest all kinds of statements, including loops, local declarations, sequential and parallel control, as well as several kinds of preemption statements, leads to a powerful programming language. However, this orthogonal design imposes difficult problems for modular or separate compilation, which has to deal with special problems like the instantaneous reincarnation of locally declared variables. This paper presents a compilation procedure that allows modules of a synchronous language to be compiled separately. Our approach is based on two new achievements. First, we derive the information that is required for a linker to combine already compiled modules; this information is stored in a file written in an intermediate format, which is the target of our compilation procedure and the source of the linker. Second, we describe a compilation procedure for a typical imperative synchronous language to generate this intermediate format. We have implemented the approach in the upcoming version 2.0 of our Averest system.
In: Software and Compilers for Embedded Systems, April 23, 2009. doi:10.1145/1543820.1543822
Certifying deadlock-freedom for BIP models
J. Blech, Michaël Périn
The BIP framework provides a methodology, supported by a tool chain, for developing software for embedded systems. The design of a BIP system follows the decomposition into behavior, interaction and priority. The first step comprises the division of the desired behavior of a system into components. In a second step, interactions and their priorities are added between the components. Finally, machine code is generated from the BIP model. While adding interactions it is possible to overconstrain a system, resulting in potential deadlocks. The tool chain crucially depends on an automatic tool, D-Finder, which checks for deadlock-freedom. This paper reports on guaranteeing the correctness of the verdict of D-Finder. We address the problem of formally proving deadlock-freedom of an embedded system in a way that is comprehensible for third-party users and other tools. We propose the automatic generation of certificates for each BIP model declared safe by D-Finder. These certificates comprise a proof of deadlock-freedom of the BIP model which can be checked by an independent checker. We use the Coq theorem prover as the certificate checker, thus bringing the high level of confidence of a formal proof to the deadlock analysis results. With the help of certificates, one obtains a deadlock-freedom guarantee for BIP models without having to trust, or even look at, the deadlock-checking tool. The proof of deadlock-freedom fundamentally relies on the computation of invariant properties of the considered BIP model, which is carried out by D-Finder and serves as the basis for certificate generation. Encapsulating these invariants into certificates and checking them is the most important subtask of our methodology for guaranteeing deadlock-freedom.
In: Software and Compilers for Embedded Systems, April 23, 2009. doi:10.1145/1543820.1543832
Implementing AUTOSAR scheduling and resource management on an embedded SMT processor
Florian Kluge, Chenglong Yu, Jörg Mische, S. Uhrig, T. Ungerer
The AUTOSAR specification provides a common standard for software development in the automotive domain. Its functional definition is based on the concept of single-threaded processors. Recent trends in embedded processors provide new possibilities for more powerful processors using parallel execution techniques such as multithreading and multi-cores. We discuss the implementation of the AUTOSAR operating system interface on a modern simultaneous multithreaded (SMT) processor. Several problems in resource management arise when AUTOSAR tasks are executed concurrently on a multithreaded processor. In particular, deadlocks, which should be averted by the priority ceiling protocol, can reappear. We solve these problems by extending AUTOSAR OS with the Task Filtering Method to avoid deadlocks on multithreaded processors. Other synchronisation problems arising from the parallel execution of tasks are solved through the use of lock-free data structures. Finally, we propose some extensions to the AUTOSAR specification so that it can be used in software development for SMT processors, and we derive additional requirements on such SMT processors to enable the use of the Task Filtering Method. Our work also gives perspectives for software development on upcoming multi-core processors in the automotive domain.
{"title":"Implementing AUTOSAR scheduling and resource management on an embedded SMT processor","authors":"Florian Kluge, Chenglong Yu, Jörg Mische, S. Uhrig, T. Ungerer","doi":"10.1145/1543820.1543828","DOIUrl":"https://doi.org/10.1145/1543820.1543828","url":null,"abstract":"The AUTOSAR specification provides a common standard for software development in the automotive domain. Its functional definition is based on the concept of single-threaded processors. Recent trends in embedded processors provide new possibilities for more powerful processors using parallel execution techniques like multithreading and multi-cores. We discuss the implementation of the AUTOSAR operating system interface on a modern simultaneous multithreaded (SMT) processor. Several problems in resource management arise when AUTOSAR tasks are executed concurrently on a multithreaded processor. Especially deadlocks, which should be averted through the priority ceiling protocol, can reoccur. We solve this problems by extending AUTOSAR OS by the Task Filtering Method to avoid deadlocks in multithreaded processors. Other synchronisation problems arising through the parallel execution of tasks are solved through the use of lock-free data structures. In the end, we propose some extensions to the AUTOSAR specification so it can be used in software development for SMT processors. We develop some additional requirements on such SMT processors to enable the use of the Task Filtering Method. Our work gives also perspectives for software development on upcoming multi-core processors in the automotive domain.","PeriodicalId":375451,"journal":{"name":"Software and Compilers for Embedded Systems","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132352206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}