Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730610
P. Knudsen, J. Madsen
This paper presents a codesign approach which incorporates communication protocol selection as a design parameter within hardware/software partitioning. The presented approach takes into account data transfer rates depending on communication protocol types and configurations, and different operating frequencies of system components, i.e. CPUs, ASICs, and busses. It also takes into account the timing and area influences of drivers and driver calls needed to perform the communication. The approach is illustrated by a number of design space exploration experiments which use models of the PCI and USB communication protocols.
{"title":"Integrating communication protocol selection with partitioning in hardware/software codesign","authors":"P. Knudsen, J. Madsen","doi":"10.1109/ISSS.1998.730610","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730610","url":null,"abstract":"This paper presents a codesign approach which incorporates communication protocol selection as a design parameter within hardware/software partitioning. The presented approach takes into account data transfer rates depending on communication protocol types and configurations, and different operating frequencies of system components, i.e. CPUs, ASICs, and busses. It also takes into account the timing and area influences of drivers and driver calls needed to perform the communication. The approach is illustrated by a number of design space exploration experiments which use models of the PCI and USB communication protocols.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131493756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730589
R. Leupers, Fabian David
A number of different algorithms for optimized offset assignment in DSP code generation have been developed recently. These algorithms aim at constructing a layout of local variables in memory, such that the addresses of variables can be computed efficiently in most cases. This is achieved by maximizing the use of auto-increment operations on address registers. However, the algorithms published in previous work only consider special cases of offset assignment problems, characterized by fixed parameters such as register file sizes and auto-increment ranges. In contrast, this paper presents a genetic optimization technique capable of simultaneously handling arbitrary register file sizes and auto-increment ranges. Moreover, this technique is the first that integrates the allocation of modify registers into offset assignment. Experimental evaluation indicates a significant improvement in the quality of constructed offset assignments, as compared to previous work.
{"title":"A uniform optimization technique for offset assignment problems","authors":"R. Leupers, Fabian David","doi":"10.1109/ISSS.1998.730589","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730589","url":null,"abstract":"A number of different algorithms for optimized offset assignment in DSP code generation have been developed recently. These algorithms aim at constructing a layout of local variables in memory, such that the addresses of variables can be computed efficiently in most cases. This is achieved by maximizing the use of auto-increment operations on address registers. However, the algorithms published in previous work only consider special cases of offset assignment problems, characterized by fixed parameters such as register file sizes and auto-increment ranges. In contrast, this paper presents a genetic optimization technique capable of simultaneously handling arbitrary register file sizes and auto-increment ranges. Moreover, this technique is the first that integrates the allocation of modify registers into offset assignment. Experimental evaluation indicates a significant improvement in the quality of constructed offset assignments, as compared to previous work.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134236223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730598
C. Jäschke, R. Laur
Commonly used scheduling algorithms in high-level synthesis are not capable of sharing resources across process boundaries. This results in the usage of at least one resource per operation type and process. A new method is proposed in order to overcome these restrictions and to share high-cost or limited resources within a process group. This allows the use of less than one resource per operation type and process, while keeping the mutual independence of the involved processes. The method which represents an extension of general scheduling algorithms is not tied to a specific algorithm. The method is explained by using the common List Scheduling and further on applied to examples.
{"title":"Resource constrained modulo scheduling with global resource sharing","authors":"C. Jäschke, R. Laur","doi":"10.1109/ISSS.1998.730598","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730598","url":null,"abstract":"Commonly used scheduling algorithms in high-level synthesis are not capable of sharing resources across process boundaries. This results in the usage of at least one resource per operation type and process. A new method is proposed in order to overcome these restrictions and to share high-cost or limited resources within a process group. This allows the use of less than one resource per operation type and process, while keeping the mutual independence of the involved processes. The method which represents an extension of general scheduling algorithms is not tied to a specific algorithm. The method is explained by using the common List Scheduling and further on applied to examples.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124563042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730590
L. Coster, M. Adé, R. Lauwereins, J. Peperstraete
Bit-true simulation verifies the finite word length choices in the VLSI implementation of a DSP application. Present-day bit-true simulation tools are time consuming. We elaborate a new approach in which the signal flow graph of the application is analyzed and then transformed utilizing the flexibility available on the simulation target. This global approach outperforms current tools by an order of magnitude in simulation time.
{"title":"Code generation for compiled bit-true simulation of DSP applications","authors":"L. Coster, M. Adé, R. Lauwereins, J. Peperstraete","doi":"10.1109/ISSS.1998.730590","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730590","url":null,"abstract":"Bit-true simulation verifies the finite word length choices in the VLSI implementation of a DSP application. Present-day bit-true simulation tools are time consuming. We elaborate a new approach in which the signal flow graph of the application is analyzed and then transformed utilizing the flexibility available on the simulation target. This global approach outperforms current tools by an order of magnitude in simulation time.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"60 14","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120839524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730607
Z. Wu, W. Wolf
This paper describes a methodology for synthesizing the data-path of a Very Long Instruction Word (VLIW) based Video Signal Processor (VSP). Offering both performance and programmability, VSPs are important for their roles in digital video applications, which are omnipresent in today's world. Among many different architectures, VLIW is becoming increasingly popular and widely used due to its efficiency in exploiting high degree of parallelism inherent in multimedia applications. While architectural syntheses of embedded systems have been studied in depth, little literature has addressed similar issues for VLIW-based VSPs. Using an MPEG-2 video encoder as a case study, in this paper we present a combined application of trace-driven simulation and performance estimation in the data-path synthesis of a VLIW VSP. Results show that our estimations are quite precise and helpful, let alone that they are orders of magnitude faster than simulation.
{"title":"Data-path synthesis of VLIW video signal processors","authors":"Z. Wu, W. Wolf","doi":"10.1109/ISSS.1998.730607","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730607","url":null,"abstract":"This paper describes a methodology for synthesizing the data-path of a Very Long Instruction Word (VLIW) based Video Signal Processor (VSP). Offering both performance and programmability, VSPs are important for their roles in digital video applications, which are omnipresent in today's world. Among many different architectures, VLIW is becoming increasingly popular and widely used due to its efficiency in exploiting high degree of parallelism inherent in multimedia applications. While architectural syntheses of embedded systems have been studied in depth, little literature has addressed similar issues for VLIW-based VSPs. Using an MPEG-2 video encoder as a case study, in this paper we present a combined application of trace-driven simulation and performance estimation in the data-path synthesis of a VLIW VSP. Results show that our estimations are quite precise and helpful, let alone that they are orders of magnitude faster than simulation.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126945872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730608
O. Bringmann, W. Rosenstiel, D. Reichardt
Complex system specifications are often hierarchically composed of several subsystems. Each subsystem contains one or more processes. In order to provide optimization across different levels of hierarchy, a synchronicity analysis of the concerned processes has to be performed during high-level synthesis. The first step is the generation of a condensed graph representation of the inter-process communication. This graph is then utilized to detect inter-process communication which can be used to represent synchronization points between two or more processes. A synchronization point represents the starting point of an interval in which the communicating processes run synchronously. This interval is limited by unbounded data-dependent loops, denoted as de-synchronization points. As a result, different processes can only share resources in such an interval.
{"title":"Synchronization detection for multi-process hierarchical synthesis","authors":"O. Bringmann, W. Rosenstiel, D. Reichardt","doi":"10.1109/ISSS.1998.730608","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730608","url":null,"abstract":"Complex system specifications are often hierarchically composed of several subsystems. Each subsystem contains one or more processes. In order to provide optimization across different levels of hierarchy, a synchronicity analysis of the concerned processes has to be performed during high-level synthesis. The first step is the generation of a condensed graph representation of the inter-process communication. This graph is then utilized to detect inter-process communication which can be used to represent synchronization points between two or more processes. A synchronization point represents the starting point of an interval in which the communicating processes run synchronously. This interval is limited by unbounded data-dependent loops, denoted as de-synchronization points. As a result, different processes can only share resources in such an interval.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133829600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730599
H. Tomiyama, A. Inoue, H. Yasuura
The inevitable fluctuation in fabrication processes results in LSI chips with various critical path delay even though all the chips are fabricated from the same design. Therefore, in LSI design, it is important to estimate what percentage of the fabricated chips will achieve the performance level and to maximize the percentage. This paper presents a model and a method to analyze statistical delay of RT-level datapath designs. The method predicts the probability that the fabricated circuits will work at a user specified clock period. Using the method, we can estimate a tight bound on the worst case critical path delay of the circuits. Based on the delay analysis method, a high-level module binding algorithm which maximizes the probability is also proposed. Experimental results demonstrate that the proposed statistical delay analysis method leads to lower-cost or higher-performance designs than conventional delay analysis methods.
{"title":"Statistical performance-driven module binding in high-level synthesis","authors":"H. Tomiyama, A. Inoue, H. Yasuura","doi":"10.1109/ISSS.1998.730599","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730599","url":null,"abstract":"The inevitable fluctuation in fabrication processes results in LSI chips with various critical path delay even though all the chips are fabricated from the same design. Therefore, in LSI design, it is important to estimate what percentage of the fabricated chips will achieve the performance level and to maximize the percentage. This paper presents a model and a method to analyze statistical delay of RT-level datapath designs. The method predicts the probability that the fabricated circuits will work at a user specified clock period. Using the method, we can estimate a tight bound on the worst case critical path delay of the circuits. Based on the delay analysis method, a high-level module binding algorithm which maximizes the probability is also proposed. Experimental results demonstrate that the proposed statistical delay analysis method leads to lower-cost or higher-performance designs than conventional delay analysis methods.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132866844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730595
F. Vahid, T. Givargis
We describe an approach for incorporating cores into a system-level specification. The goal is to allow a designer to specify both custom behavior and pre-designed cores at the earliest design stages, and to refine both into implementations in a unified manner. The approach is based on experience with an actual application of a GPS-based navigation system. We use an object oriented language for specification, representing each core as an object. We define three specification levels, and we evaluate the appropriateness of existing inter-object communication methods for cores. The approach forms the specification basis for the Dalton project.
{"title":"Incorporating cores into system-level specification","authors":"F. Vahid, T. Givargis","doi":"10.1109/ISSS.1998.730595","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730595","url":null,"abstract":"We describe an approach for incorporating cores into a system-level specification. The goal is to allow a designer to specify both custom behavior and pre-designed cores at the earliest design stages, and to refine both into implementations in a unified manner. The approach is based on experience with an actual application of a GPS-based navigation system. We use an object oriented language for specification, representing each core as an object. We define three specification levels, and we evaluate the appropriateness of existing inter-object communication methods for cores. The approach forms the specification basis for the Dalton project.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125435070","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730594
E. Filippi, L. Lavagno, L. Licciardi, Achille Montanaro, M. Paolini, R. Passerone, M. Sgroi, A. Sangiovanni-Vincentelli
Design of large systems on a chip would be infeasible without the capability to flexibly adapt the system architecture to the application and the re-use of existing Intellectual Property (IP). This in turn requires the use of an appropriate methodology for system specification, architecture selection, IP integration and implementation generation. The goals of this work are: a) verification of the effectiveness of the POLIS HW/SW co-design methodology for the design of embedded systems for telecom applications; b) definition of methodology for integrating system level IP libraries in this HW/SW co-design framework. Methodology evaluations have been carried out through the development of an industrial telecom system design, an ATM node server.
{"title":"Intellectual property re-use in embedded system co-design: an industrial case study","authors":"E. Filippi, L. Lavagno, L. Licciardi, Achille Montanaro, M. Paolini, R. Passerone, M. Sgroi, A. Sangiovanni-Vincentelli","doi":"10.1109/ISSS.1998.730594","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730594","url":null,"abstract":"Design of large systems on a chip would be infeasible without the capability to flexibly adapt the system architecture to the application and the re-use of existing Intellectual Property (IP). This in turn requires the use of an appropriate methodology for system specification, architecture selection, IP integration and implementation generation. The goals of this work are: a) verification of the effectiveness of the POLIS HW/SW co-design methodology for the design of embedded systems for telecom applications; b) definition of methodology for integrating system level IP libraries in this HW/SW co-design framework. Methodology evaluations have been carried out through the development of an industrial telecom system design, an ATM node server.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129154411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1998-12-02DOI: 10.1109/ISSS.1998.730593
C. Siska
Many ISA-level machine description languages have been introduced to support the automated development and retargeting of digital signal processor (DSP) software development tools. These languages have yet to move below the ISA-level and adequately address DSP pipeline issues. ISA-level bit-accurate models may be reasonable for small micro-controllers, but are inadequate when applied to complex high-performance DSPs. We introduce a new machine description language, RADL, which supports the automated generation of DSP programming tools. From RADL, we can generate production-quality tools including cycle- and phase-accurate simulators. RADL has explicit support for pipeline modeling, including delay slots, interrupts, hardware loops, hazards, and multiple interacting pipelines in a natural and intuitive way. RADL can represent both SIMD and MIMD instruction styles. We have coupled our language to an in-house tool-chain generator which is used to create production assemblers, simulators and compilers.
{"title":"A processor description language supporting retargetable multi-pipeline DSP program development tools","authors":"C. Siska","doi":"10.1109/ISSS.1998.730593","DOIUrl":"https://doi.org/10.1109/ISSS.1998.730593","url":null,"abstract":"Many ISA-level machine description languages have been introduced to support the automated development and retargeting of digital signal processor (DSP) software development tools. These languages have yet to move below the ISA-level and adequately address DSP pipeline issues. ISA-level bit-accurate models may be reasonable for small micro-controllers, but are inadequate when applied to complex high-performance DSPs. We introduce a new machine description language, RADL, which supports the automated generation of DSP programming tools. From RADL, we can generate production-quality tools including cycle- and phase-accurate simulators. RADL has explicit support for pipeline modeling, including delay slots, interrupts, hardware loops, hazards, and multiple interacting pipelines in a natural and intuitive way. RADL can represent both SIMD and MIMD instruction styles. We have coupled our language to an in-house tool-chain generator which is used to create production assemblers, simulators and compilers.","PeriodicalId":305333,"journal":{"name":"Proceedings. 11th International Symposium on System Synthesis (Cat. No.98EX210)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1998-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117039874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}