Title: 2009 MEMOCODE Co-Design Contest
Venue: 2009 7th IEEE/ACM International Conference on Formal Methods and Models for Co-Design
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185379
F. Brewer, J. Hoe
The 2009 MEMOCODE Co-Design Contest is the third in the series of annual design contests organized by the MEMOCODE Conference. Contestants have one month to create the best-performing design solution to a posted design challenge. The contest is open to all interested participants, and the contest rules are designed not to exclude or favor any one design methodology or platform. The goal of the contest is to invite developers of tools and platforms to showcase their technology in a level competition and to encourage hands-on design activities in the fields of interest of the MEMOCODE Conference. Please see http://www.memocode-conference.com for current information about this contest.
Title: Static data-flow analysis of synchronous programs
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185392
J. Brandt, K. Schneider
Synchronous programming languages are well-suited for the design of safety-critical real-time embedded systems. However, compilers and synthesis procedures are challenged by the synchronous programming paradigm and have to solve additional problems such as causality and schizophrenia problems. Algorithms to solve these basic compilation problems have already become mature, but code optimization still lags behind. Often, code optimization is left to back-end tools such as compilers for sequential software or hardware synthesis tools. In this paper, we develop a static analysis procedure to introduce code optimization techniques to synchronous languages. We develop specialized code optimization procedures that can be applied to all kinds of synchronous languages. Similar to the code optimization techniques used for the compilation of sequential software, our procedures are also based on a static data-flow analysis that is adapted to the synchronous programming model.
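As a rough illustration of the kind of static data-flow analysis the abstract above refers to, the following is a minimal sketch of classical liveness analysis for sequential code, computed as a fixpoint over a control-flow graph. It is not the authors' procedure: the adaptation to the synchronous model is not described in the abstract and is not attempted here. The block names, the `SUCC`/`USE`/`DEFS` tables, and the example program are all hypothetical.

```python
def liveness(succ, use, defs):
    """Backward data-flow analysis: compute live-in / live-out sets per block.

    succ: block -> list of successor blocks
    use:  block -> variables read before being written in the block
    defs: block -> variables written in the block
    """
    live_in = {b: set() for b in succ}
    live_out = {b: set() for b in succ}
    changed = True
    while changed:  # iterate until a fixpoint is reached
        changed = False
        for b in succ:
            out = set()
            for s in succ[b]:
                out |= live_in[s]          # live-out = union of successors' live-in
            inn = use[b] | (out - defs[b])  # live-in = use + (live-out - defs)
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out


# Hypothetical four-block program (assumption, for illustration only):
#   B0: x = input; y = 0
#   B1: if x > 0 goto B2 else B3
#   B2: y = y + x; x = x - 1; goto B1
#   B3: z = 5; return y
SUCC = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}
USE = {"B0": set(), "B1": {"x"}, "B2": {"x", "y"}, "B3": {"y"}}
DEFS = {"B0": {"x", "y"}, "B1": set(), "B2": {"x", "y"}, "B3": {"z"}}

if __name__ == "__main__":
    live_in, live_out = liveness(SUCC, USE, DEFS)
    for block in SUCC:
        print(block, "in:", sorted(live_in[block]), "out:", sorted(live_out[block]))
```

In this example `z` is never live after its assignment in `B3`, so an optimizer built on such an analysis could flag that assignment as dead code.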
Title: Implementing a fast cartesian-polar matrix interpolator
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185381
Abhinav Agarwal, Nirav H. Dave, Kermin Fleming, Asif Khan, Myron King, Man Cheuk Ng, M. Vijayaraghavan
The 2009 MEMOCODE Hardware/Software Co-Design Contest assignment was the implementation of a Cartesian-to-polar matrix interpolator. We discuss our hardware and software design submissions.
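To make the contest task named in the abstract above more concrete, here is a minimal software sketch of Cartesian-to-polar resampling with bilinear interpolation. The contest's actual specification (matrix sizes, number formats, origin placement, interpolation scheme) is not given in the abstract, so everything below is an assumption: the origin is taken to be the matrix centre, the polar grid is uniform in radius and angle, and the function names `bilinear` and `cartesian_to_polar` are hypothetical.

```python
import math


def bilinear(img, x, y):
    """Bilinearly interpolate img (list of rows, at least 2x2) at real (x, y)."""
    h, w = len(img), len(img[0])
    # Clamp so the 2x2 neighbourhood always stays inside the matrix.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = min(int(x), w - 2), min(int(y), h - 2)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x0 + 1] * fx
    bot = img[y0 + 1][x0] * (1 - fx) + img[y0 + 1][x0 + 1] * fx
    return top * (1 - fy) + bot * fy


def cartesian_to_polar(img, n_r, n_theta):
    """Resample img onto an n_r x n_theta polar grid centred on the matrix.

    Row i corresponds to radius r_max * i / (n_r - 1) (requires n_r >= 2),
    column j to angle 2*pi*j / n_theta.
    """
    h, w = len(img), len(img[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    r_max = min(cx, cy)
    out = []
    for i in range(n_r):
        r = r_max * i / (n_r - 1)
        row = []
        for j in range(n_theta):
            theta = 2 * math.pi * j / n_theta
            row.append(bilinear(img, cx + r * math.cos(theta), cy + r * math.sin(theta)))
        out.append(row)
    return out


if __name__ == "__main__":
    # Horizontal ramp: the value at (x, y) equals x, so results are easy to check.
    ramp = [[float(x) for x in range(5)] for _ in range(5)]
    for row in cartesian_to_polar(ramp, 3, 8):
        print([round(v, 3) for v in row])
```

Each output sample is independent of the others, which is one reason a task of this shape maps naturally onto parallel hardware such as FPGAs or GPUs.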
Title: Multicore power management: Ensuring robustness via early-stage formal verification
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185383
Anita Lungu, P. Bose, Daniel J. Sorin, S. German, G. Janssen
Dynamic power management (DPM) is important for multicore architectures. One important challenge for multicore DPM schemes is verifying that they are both safe (cannot lead to power or thermal catastrophes) and efficient (achieve as much performance as possible without exceeding power constraints). The verification difficulty varies among designs, depending, for example, on the particular power management mechanisms utilized and the algorithms used to adjust them. However, verification effort is often not considered in the early stages of DPM scheme design, leading to proposals that can be extremely difficult to verify. To address this problem, we propose using formal verification (with probabilistic model checking) of a high-level, early-stage model of the DPM scheme. Using the model checker, we estimate the required verification effort, providing insight on how certain design parameters impact this effort. Furthermore, we supplement the verifiability results with high-level estimates of power consumption and performance, which allow us to perform a trade-off analysis between power, performance, and verification. We show that this trade-off analysis uncovers design points that are better than those that consider only power and performance.
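The abstract above does not describe the model or the tool used, so the following is only a toy sketch of one query a probabilistic model checker can answer: the probability of reaching an unsafe state within a bounded number of steps in a discrete-time Markov chain. The three-state chain, its transition probabilities, and the state names (`idle`, `busy`, `over`) are invented for illustration and have no connection to the authors' DPM model.

```python
def bounded_reachability(P, target, k):
    """Probability of reaching any state in `target` within k steps.

    P: state -> {successor: probability} (each row sums to 1).
    Returns a dict mapping each start state to its probability.
    """
    # With 0 steps, only target states have already "reached" the target.
    prob = {s: (1.0 if s in target else 0.0) for s in P}
    for _ in range(k):
        prob = {
            s: 1.0 if s in target
            else sum(p * prob[t] for t, p in P[s].items())
            for s in P
        }
    return prob


# Hypothetical power-state model (assumption): "over" means the power budget
# was exceeded and is treated as an absorbing unsafe state.
CHAIN = {
    "idle": {"idle": 0.7, "busy": 0.3},
    "busy": {"idle": 0.4, "busy": 0.5, "over": 0.1},
    "over": {"over": 1.0},
}

if __name__ == "__main__":
    for steps in (1, 2, 10):
        result = bounded_reachability(CHAIN, {"over"}, steps)
        print(steps, {s: round(p, 4) for s, p in result.items()})
```

Real model checkers explore far larger state spaces symbolically or numerically; the size of that state space is one plausible proxy for the verification effort the abstract discusses.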
Title: Verification of an industrial SystemC/TLM model using LOTOS and CADP
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185377
H. Garavel, C. Helmstetter, Olivier Ponsini, Wendelin Serwe
SystemC/TLM is a widely used standard for system-level descriptions of complex architectures. It is particularly useful for fast simulation, thus allowing early development and testing of the targeted software. In general, formal verification of SystemC/TLM relies on the translation of the complete model into a language accepted by a verification tool. In this paper, we present an approach to the validation of a SystemC/TLM description by translation into LOTOS, reusing as much as possible of the original SystemC/TLM C++ code. To this end, we exploit a feature offered by the formal verification toolbox CADP, namely the import of external C code in a LOTOS model. We report on experiments applying our approach to the BDisp, a complex graphical processing unit designed by STMicroelectronics.
Title: Implementing a high-performance multithreaded microprocessor: A case study in high-level design and validation
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185385
Eric S. Chung, J. Hoe
We have developed a 16-way multithreaded microprocessor called BlueSPARC. This in-order, high-throughput processor incorporates complex features such as privileged operations, memory management, and a non-blocking cache subsystem. When supported by a hybrid simulation technique that handles rare, unimplemented behaviors in a software host, the BlueSPARC microprocessor runs unmodified UltraSPARC III-based commercial applications on Solaris 8 while hosted on a single Xilinx XCV2P70 FPGA clocked at 90 MHz. This significant effort was achieved in under one man-year using a high-level language and a high-level validation approach. In the first part of the paper, we describe our experience in applying the Bluespec SystemVerilog (BSV) language to develop a large hardware design that must meet specific area and performance requirements. In the second part of the paper, we present the FPGA-accelerated validation approach we employed to check the correct execution of real multithreaded programs running on the BlueSPARC processor. We discuss the challenges and our solutions to validation in the presence of full-system interactions and microarchitectural nondeterminism.
Title: A design case study: CPU vs. GPGPU vs. FPGA
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185380
Daniel L. Rosenband, Till Rosenband
This paper describes our winning submission for the Absolute Performance category of the MEMOCODE 2009 Design Contest. We show that our GPGPU-based design achieves performance within a factor of four of theoretical maximum performance for the implemented algorithm. This result was reached after a short design-cycle of 2 man-days, which indicates that the NVIDIA CUDA platform allows for rapid development and optimization of applications that make substantial use of all available GPGPU computing resources. We also analyze the maximum theoretical performance of alternative computing systems that could have been used to implement the algorithm.
Title: Buffer sharing in CSP-like programs
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185391
N. Vasudevan, S. Edwards
Most compilers focus on optimizing performance, often at the expense of memory, but efficient memory use can be just as important in constrained environments such as embedded systems.
Title: Can we computerize an elephant?
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185382
D. Harel
The talk shows the way techniques from computer science and software engineering can be applied beneficially to research in the life sciences. We will discuss the idea of comprehensive and realistic modeling of biological systems, where we try to understand and analyze an entire system in detail, utilizing in the modeling effort all that is known about it. I will address the motivation for such modeling and the philosophy underlying the techniques for carrying it out, as well as the crucial question of when such models are to be deemed valid, or complete. The examples I will present will be from among the biological modeling efforts my group has been involved in: T cell development in the thymus, lymph node behavior, organogenesis of the pancreas, fate determination in the reproductive system of C. elegans, and a generic cell model. The ultimate long-term “grand challenge” is to produce an interactive, dynamic, computerized model of an entire multi-cellular organism, such as the C. elegans nematode worm, which is complex, but well-defined in terms of anatomy and genetics. The challenge is to construct a full, true-to-all-known-facts, 4-dimensional, interactively animated model of the development and behavior of this worm (or of a comparable multi-cellular animal), which is easily extendable as new biological facts are discovered.
Title: Bounded Dataflow Networks and Latency-Insensitive circuits
Pub Date: 2009-07-13, DOI: 10.1109/MEMCOD.2009.5185393
M. Vijayaraghavan, Arvind
We present a theory for modular refinement of Synchronous Sequential Circuits (SSMs) using Bounded Dataflow Networks (BDNs). We provide a procedure for implementing any SSM as an LI-BDN, a special class of BDNs with some good compositional properties. We show that the Latency-Insensitive property of LI-BDNs is preserved under parallel and iterative composition of LI-BDNs. Our theory permits one to make arbitrary cuts in an SSM and turn each of the parts into LI-BDNs without affecting the overall functionality. We can further refine each constituent LI-BDN into another LI-BDN which may take a different number of cycles to compute. If the constituent LI-BDN is refined correctly, we guarantee that the overall behavior is cycle-accurate with respect to the original SSM. Thus one can replace, say, a 3-ported register file in an SSM by a one-ported register file without affecting the correctness of the SSM. We give several examples to show how our theory supports a generalization of previous techniques for Latency-Insensitive refinements of SSMs.
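The latency-insensitivity idea in the abstract above can be illustrated with a toy simulation: a node that fires only when all of its bounded input FIFOs hold data produces the same output stream no matter how slowly its inputs arrive. This is a minimal sketch of that general principle, not the paper's LI-BDN construction or its refinement procedure; the `Fifo` class, the `run_adder` function, and the producer delay parameters are hypothetical.

```python
from collections import deque


class Fifo:
    """Bounded FIFO with explicit can-enqueue / can-dequeue checks."""

    def __init__(self, capacity):
        self.q = deque()
        self.capacity = capacity

    def can_enq(self):
        return len(self.q) < self.capacity

    def can_deq(self):
        return len(self.q) > 0


def run_adder(a_vals, b_vals, a_delay, b_delay, max_cycles=1000):
    """Simulate two producers feeding an adder node through bounded FIFOs.

    Producer A offers a value every `a_delay` cycles, producer B every
    `b_delay` cycles. a_vals and b_vals must have equal length.
    Returns (output stream, number of cycles simulated).
    """
    fa, fb = Fifo(2), Fifo(2)
    a_src, b_src = deque(a_vals), deque(b_vals)
    out = []
    cycles = 0
    for cycle in range(max_cycles):
        cycles = cycle + 1
        # Firing rule: the adder consumes only when BOTH inputs are present.
        if fa.can_deq() and fb.can_deq():
            out.append(fa.q.popleft() + fb.q.popleft())
        # Producers enqueue on their own schedule, and only if there is space.
        if a_src and cycle % a_delay == 0 and fa.can_enq():
            fa.q.append(a_src.popleft())
        if b_src and cycle % b_delay == 0 and fb.can_enq():
            fb.q.append(b_src.popleft())
        if len(out) == len(a_vals):
            break
    return out, cycles


if __name__ == "__main__":
    fast = run_adder([1, 2, 3], [10, 20, 30], a_delay=1, b_delay=1)
    slow = run_adder([1, 2, 3], [10, 20, 30], a_delay=3, b_delay=5)
    print("fast:", fast)
    print("slow:", slow)
```

The two runs take different numbers of cycles but yield the same sequence of sums, which is the informal sense in which a component's latency can change without changing the functional behavior of the whole.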