Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115355
J. Soininen, A. Pelkonen, J. Roivainen
A configurable memory organisation for the execution of HiperLAN/2 transceiver baseband processing and MPEG-2 decoding is presented. The memory system is configured by controlling the DSP processor's access to memory buses with an external processor and switches. The configurable memory organisation allows system capacity to be scaled to the needs of the applications and makes the use of that capacity more effective. The architecture was modelled and evaluated using a SystemC simulator and workload models. The clock frequency can be reduced by up to 25% if a configurable memory system is used instead of a bus-based shared memory. The memory latency with the configurable memory organisation was less than 50% of the latency of the shared memory solution.
Title: Configurable memory organisation for communication applications. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115359
M. Edwards, B. Fozard
This paper presents a practical approach to hardware/software partitioning, targeted at the rapid prototyping of embedded systems as a mixture of software and reconfigurable hardware. In our method, an application is first specified in the high-level programming language C; this is considered to be an executable functional specification. We subsequently allow this specification to be partitioned into hardware and software modules. The hardware modules, which are defined in Handel-C, are synthesised and mapped to a Xilinx Virtex FPGA. The FPGA is situated on a PCB, which is installed in a standard PC. The software modules are executed on the same PC. The paper describes the methodology and shows how the partitioning process can be readily achieved with minimal changes to the original C program through the use of a predefined library. A simple example is used to illustrate the design process.
Title: Rapid prototyping of mixed hardware and software systems. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115384
A. M. Sllame, V. Drábek
Scheduling is considered the most important task in the high-level synthesis process. This paper presents a novel list-based scheduling algorithm that incorporates information extracted from the data flow graph (DFG) structure to guide the scheduler towards optimal or near-optimal schedules quickly. The approach is based on a DFG analysis that is done entirely as a preparation phase. The analysis records, for every node, its successors and predecessors, its total number of successors, and the tree to which it belongs, where trees are constructed from every output operation of the DFG. Incorporating this knowledge in the priority functions of the scheduler guides it to choose the best operation to schedule next.
Title: An efficient list-based scheduling algorithm for high-level synthesis. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
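To make the idea of a DFG-driven priority function concrete, the following is a minimal sketch of resource-constrained list scheduling in which the priority of an operation is its total number of (transitive) successors, one of the DFG facts named in the abstract. It is not the authors' algorithm: the unit latency, the single type of functional unit, the alphabetical tie-break, and the names `list_schedule` and `reachable_successors` are all assumptions made for illustration.

```python
from collections import defaultdict


def reachable_successors(succ, node, memo):
    """Return the set of all transitive successors of `node` in the DFG."""
    if node not in memo:
        reach = set()
        for s in succ[node]:
            reach.add(s)
            reach |= reachable_successors(succ, s, memo)
        memo[node] = reach
    return memo[node]


def list_schedule(nodes, edges, units):
    """Assign each unit-latency operation a control step (starting at 0),
    using at most `units` identical functional units per step."""
    succ = defaultdict(list)
    pending_preds = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        pending_preds[v] += 1

    # Preparation phase: priority = total number of successors.
    memo = {}
    priority = {n: len(reachable_successors(succ, n, memo)) for n in nodes}

    ready = [n for n in nodes if pending_preds[n] == 0]
    schedule = {}
    step = 0
    while ready:
        # Highest priority first; ties broken by name (an assumption).
        ready.sort(key=lambda n: (-priority[n], n))
        chosen, ready = ready[:units], ready[units:]
        for n in chosen:
            schedule[n] = step
        # Successors become ready only from the next control step on.
        for n in chosen:
            for s in succ[n]:
                pending_preds[s] -= 1
                if pending_preds[s] == 0:
                    ready.append(s)
        step += 1
    return schedule


if __name__ == "__main__":
    ops = ["a", "b", "c", "d", "e"]
    deps = [("a", "c"), ("b", "c"), ("c", "e"), ("d", "e")]
    print(list_schedule(ops, deps, units=2))
```

With two units, the operations `a` and `b` are preferred over `d` in the first step because more of the graph depends on them, which is the kind of choice a successor-aware priority function is meant to steer.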
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115365
Josef Strnadel, Z. Kotásek
In the paper, a new heuristic approach to RTL testability analysis is presented. It is shown how the values of controllability/observability factors reflecting the structure of the circuit, together with other factors, can be utilised to find solutions which are sub-optimal but still acceptable for the designer. The goal of the methodology is to enable the identification, as a result of RTL testability analysis, of testability solutions which satisfy concrete requirements in terms of the number of registers included in the scan chain, the area overhead and the test application time. The approach is based on the combination of analytical and evolutionary approaches at the RT level.
Title: Testability improvements based on the combination of analytical and evolutionary approaches at RT level. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115369
P. Wielage, K. Goossens
Continuing VLSI technology scaling raises several deep submicron (DSM) problems such as relatively slow interconnect, power dissipation and distribution, and signal integrity. These problems are encountered particularly on long wires for global interconnect. As clock frequencies increase, scaled wires become relatively slower, and on-chip communication will be the limiting performance factor of future chips. We explain why efficient sharing of the wires for long-distance communication is the solution to this problem. We introduce networks on silicon (NoS), which route packets over shared (semi-)global wires. NoS performance is expected to be high, but comes at a cost. Balancing the performance and cost of a NoS is a major challenge, and we believe busses still have a role to play.
Title: Networks on silicon: blessing or nightmare? In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115383
J. Mendias, R. Hermida, M. Molina, O. Peñalba
This paper presents an efficient method to solve an important aspect of the high-level verification problem: the formal verification of RT-level implementations (datapath + controller) obtained from algorithmic-level specifications by high-level synthesis tools. The method consists of replicating external, and potentially incorrect, design processes within a mathematical framework, giving as a result either the proof of correctness or the set of design decisions that led to errors. As computational complexity is a major problem in formal verification, the formal framework is based on an ad hoc formal theory. The moderate complexity achieved has been confirmed by a detailed experimental study, which shows that our method can verify complex designs while adding only minimal overhead to the high-level design cycle.
Title: Efficient verification of scheduling, allocation and binding in high-level synthesis. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115348
Jarno Vanne, E. Aho, Kimmo Kuusilinna, T. Hämäläinen
Contemporary multimedia processors and applications are increasingly limited by their data access capabilities. The Configurable Parallel Memory Architecture (CPMA) alleviates these data access limitations, achieving significant performance improvements over traditional memory architectures. CPMA considerably reduces the processor-memory bottleneck by widening the memory bandwidth, decreasing the number of memory accesses, and diminishing the significance of memory latency. To further enhance the performance of CPMA, this paper introduces a novel architectural extension called CPMA access instruction correlation recognition. The presented method is intended to accelerate the execution rate of consecutive, temporally conflict-free CPMA memory accesses. As demonstrated in this paper, the superior CPMA performance can also be maintained in the case of limited access widths. In addition, the presented results confirm that CPMA can have an acceptable silicon area.
Title: Enhanced configurable parallel memory architecture. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
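The abstract does not describe how CPMA maps data to memory modules, so the following is only a generic illustration of why a parallel memory can serve several elements per access without conflicts. It compares plain low-order interleaving with a textbook linear skewing scheme, `(i + j) mod N`; the function names, the 4-module configuration and the 8-element row width are assumptions chosen for the example.

```python
def interleaved_module(i, j, width, n_modules):
    """Low-order interleaving of a row-major array: module = address mod N."""
    return (i * width + j) % n_modules


def skewed_module(i, j, n_modules):
    """Linear skewing: each row is rotated by one module relative to the last."""
    return (i + j) % n_modules


def is_conflict_free(modules):
    """An access is conflict-free when every element lives in a different module."""
    return len(set(modules)) == len(modules)


N_MODULES = 4   # assumed number of parallel memory modules
WIDTH = 8       # assumed row length of the stored 2-D array

# Two access formats: four consecutive elements of a row and of a column.
row_access = [(2, j) for j in range(N_MODULES)]
col_access = [(i, 3) for i in range(N_MODULES)]

if __name__ == "__main__":
    for name, access in (("row", row_access), ("column", col_access)):
        plain = [interleaved_module(i, j, WIDTH, N_MODULES) for i, j in access]
        skewed = [skewed_module(i, j, N_MODULES) for i, j in access]
        print(name, "interleaved:", plain, is_conflict_free(plain))
        print(name, "skewed:     ", skewed, is_conflict_free(skewed))
```

In this configuration the column access hits a single module under plain interleaving, so it would need four sequential accesses, whereas the skewed mapping spreads both rows and columns across all four modules.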
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115354
O. Mansour, Egbert Molenkamp, T. Krol
This paper addresses the hardware implementation of a dynamic scheduler for non-manifest, data-dependent periodic loops. Static scheduling techniques, which are known to give near-optimal scheduling solutions for manifest loops, fail at scheduling non-manifest loops, since they lack the run-time information needed to make a static schedule feasible. In this paper, a dynamic scheduling approach was chosen to circumvent this problem. We present a case study using VHDL in which the focus lies on implementations with minimal memory usage and low communication overhead between the various components of the architecture. This has resulted in an efficient and synthesisable system.
Title: The synthesis of a hardware scheduler for non-manifest loops. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115380
M. Azhar, K. Dimond
This article describes an alternative hardware solution, to be implemented on field programmable gate arrays (FPGAs), for collision-free robot navigation. A RAM-based artificial neural network (ANN) was chosen as the heart of the controller because it is easy to implement in conventional hardware. The structure of the ANN was well suited to carrying out the experiments for evolutionary robotics (ER). The hardware implementation provides the massive parallelism of neural networks, and the FPGA allows fast IC prototyping and low-cost modifications.
Title: Design of an FPGA based adaptive neural controller for intelligent robot navigation. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.
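A RAM-based neuron is easy to map to hardware because it is essentially a lookup table: the binary inputs form an address, and the stored bit at that address is the output. The sketch below shows that principle only. The abstract does not give the network topology or training rule, so the `RamNeuron` class, the three-sensor input and the write-on-train rule are illustrative assumptions rather than the authors' design.

```python
class RamNeuron:
    """A single RAM node: n binary inputs address a 2**n x 1-bit memory."""

    def __init__(self, n_inputs):
        self.n_inputs = n_inputs
        self.memory = [0] * (1 << n_inputs)  # all locations start at 0

    def address(self, bits):
        """Pack the input bits (most significant first) into a RAM address."""
        if len(bits) != self.n_inputs:
            raise ValueError("expected %d input bits" % self.n_inputs)
        addr = 0
        for b in bits:
            addr = (addr << 1) | (1 if b else 0)
        return addr

    def train(self, bits, target=1):
        """Store the desired output bit at the addressed location."""
        self.memory[self.address(bits)] = target

    def recall(self, bits):
        """Read back the stored bit for this input pattern."""
        return self.memory[self.address(bits)]


if __name__ == "__main__":
    # Hypothetical inputs: obstacle detected on the (left, front, right) sensor.
    turn = RamNeuron(3)
    turn.train([0, 1, 0])   # obstacle straight ahead: turn
    turn.train([1, 1, 0])   # obstacle ahead and to the left: turn
    print(turn.recall([0, 1, 0]), turn.recall([0, 0, 0]))
```

Because recall is a single memory read, many such nodes can operate side by side in FPGA block RAM or lookup tables, which is consistent with the parallelism the abstract mentions.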
Pub Date: 2002-09-04, DOI: 10.1109/DSD.2002.1115347
K. Kent, M. Serra
This paper discusses the hardware architecture used in the hardware/software co-design of a Java virtual machine. The paper briefly outlines the partitioning of instructions and support for the virtual machine. A discussion of the hardware architecture follows, focusing on the special requirements that must be considered for the target environment. This design is then compared with picoJava, a stand-alone processor for Java. The paper concludes with benchmark results for this architecture compared with software execution.
Title: Hardware architecture for Java in a hardware/software co-design of the virtual machine. In: Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools.