Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486359
R. Hartenstein, R. Kress
A datapath synthesis system (DPSS) for the reconfigurable datapath architecture (rDPA) is presented. The DPSS allows automatic mapping of high level descriptions onto the rDPA without manual interaction. The required algorithms of this synthesis system are described in detail. Optimization techniques like loop folding or loop unrolling are sketched. The rDPA is scalable to arbitrarily large arrays and reconfigurable to be adaptable to the computational problem. Fine grained parallelism is achieved by using simple reconfigurable processing elements which are called datapath units (DPUs). The rDPA can be used as a reconfigurable ALU for bus oriented systems as well as for rapid prototyping of high speed datapaths.
{"title":"A datapath synthesis system for the reconfigurable datapath architecture","authors":"R. Hartenstein, R. Kress","doi":"10.1109/ASPDAC.1995.486359","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486359","url":null,"abstract":"A datapath synthesis system (DPSS) for the reconfigurable datapath architecture (rDPA) is presented. The DPSS allows automatic mapping of high level descriptions onto the rDPA without manual interaction. The required algorithms of this synthesis system are described in detail. Optimization techniques like loop folding or loop unrolling are sketched. The rDPA is scalable to arbitrarily large arrays and reconfigurable to be adaptable to the computational problem. Fine grained parallelism is achieved by using simple reconfigurable processing elements which are called datapath units (DPUs). The rDPA can be used as a reconfigurable ALU for bus oriented systems as well as for rapid prototyping of high speed datapaths.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122812168","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486217
A. Ghosh, M. Bershteyn, R. Casley, Chen-Fu Chien, A. Jain, M. Lipsie, D. Tarrodaychik, O. Yamamo
One of the interesting problems in hardware-software co-design is that of debugging embedded software in conjunction with hardware. Currently, most software designers wait until a working hardware prototype is available before debugging software. Bugs discovered in hardware during the software debugging phase require re-design and re-fabrication, thereby not only delaying the project but also increasing cost. It also puts software debugging on hold until a new hardware prototype is available. In this paper we describe a hardware-software co-simulator that can be used in the design, debugging and verification of embedded systems. This tool contains simulators for different parts of the system and a backplane which is used to integrate the simulators. This enables us to simulate hardware, software and their interaction efficiently. We also address the problem of simulation speed. Currently, the more accurate (in terms of timing) the models used, the longer it takes to simulate a system. Our main contribution is a set of techniques to speed up simulation of processors and peripherals without significant loss in timing accuracy. Finally, we describe applications used to test the co-simulator and our experience in using it.
{"title":"A hardware-software co-simulator for embedded system design and debugging","authors":"A. Ghosh, M. Bershteyn, R. Casley, Chen-Fu Chien, A. Jain, M. Lipsie, D. Tarrodaychik, O. Yamamo","doi":"10.1109/ASPDAC.1995.486217","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486217","url":null,"abstract":"One of the interesting problems in hardware-software co-design is that of debugging embedded software in conjunction with hardware. Currently, most software designers wait until a working hardware prototype is available before debugging software. Bugs discovered in hardware during the software debugging phase require re-design and re-fabrication, thereby not only delaying the project but also increasing cost. It also puts software debugging on hold until a new hardware prototype is available. In this paper we describe a hardware-software co-simulator that can be used in the design, debugging and verification of embedded systems. This tool contains simulators for different parts of the system and a backplane which is used to integrate the simulators. This enables us to simulate hardware, software and their interaction efficiently. We also address the problem of simulation speed. Currently, the more accurate (in terms of timing) the models used, the longer it takes to simulate a system. Our main contribution is a set of techniques to speed up simulation of processors and peripherals without significant loss in timing accuracy. Finally, we describe applications used to test the co-simulator and our experience in using it.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123574715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486197
H. Kunieda, Yusong Liao, Dongju Li, Kazuhito Ito
A memory sharing processor array (MSPA) architecture is effective in both data storage and processor cell utilization efficiency. In this paper, the design methodology for MSPA is extended to synthesise a bit-serial datapath. As a synthesis example, we propose a new bit-serial multiplier with a smaller number of logic gates than conventional bit-serial multipliers.
{"title":"Automatic design for bit-serial MSPA architecture","authors":"H. Kunieda, Yusong Liao, Dongju Li, Kazuhito Ito","doi":"10.1109/ASPDAC.1995.486197","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486197","url":null,"abstract":"A memory sharing processor array (MSPA) architecture is effective in both data storage and processor cell utilization efficiency. In this paper, the design methodology for MSPA is extended to synthesise a bit-serial datapath. As a synthesis example, we propose a new bit-serial multiplier with a smaller number of logic gates than conventional bit-serial multipliers.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114488773","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486213
K. Tani
This paper proposes an approach to enhance Fiduccia-Mattheyses' min-cut algorithm. The approach includes two new ideas: LOOK-AHEAD WEIGHTING and DYNAMIC WEIGHTING. The former is based on the concept of VLSI placement method using quadratic programming. The latter is a technique to carry the better behavior of move-and-lock improvement strategy. Experiments on practical circuits with 5 K/spl sim/140 K cells show that the proposed approach achieves promising results.
{"title":"A robust min-cut improvement algorithm based on dynamic look-ahead weighting","authors":"K. Tani","doi":"10.1109/ASPDAC.1995.486213","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486213","url":null,"abstract":"This paper proposes an approach to enhance Fiduccia-Mattheyses' min-cut algorithm. The approach includes two new ideas: LOOK-AHEAD WEIGHTING and DYNAMIC WEIGHTING. The former is based on the concept of VLSI placement method using quadratic programming. The latter is a technique to carry the better behavior of move-and-lock improvement strategy. Experiments on practical circuits with 5 K/spl sim/140 K cells show that the proposed approach achieves promising results.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"165 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122191801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486215
V. Tiwari, M. Lee
A new approach for power analysis of microprocessors has recently been proposed (Tiwari et al 1994). The idea is to look at the power consumption in a microprocessor from the point of view of the actual software executing on the processor. The basic component of this approach is a measurement based, instruction-level power analysis technique. The technique allows for the development of an instruction-level power model for the given processor, which can be used to evaluate software in terms of the power consumption, and for exploring the optimization of software for lower power. This paper describes the application of this technique for a comprehensive instruction-level power analysis of a commercial 32-bit RISC-based embedded microcontroller. The salient results of the analysis and the basic instruction-level power model are described. Interesting observations and insights based on the results are also presented. Such an instruction-level power analysis can provide cues as to what optimizations in the micro-architecture design of the processor would lead to the most effective power savings in actual software applications. Wherever the results indicate such optimizations, they have been discussed. Furthermore, ideas for low power software design, as suggested by the results, are described in this paper as well.
最近提出了一种新的微处理器功耗分析方法(Tiwari et al . 1994)。这个想法是从处理器上执行的实际软件的角度来看微处理器的功耗。该方法的基本组成部分是基于测量的指导性功率分析技术。该技术允许为给定处理器开发指令级功耗模型,该模型可用于根据功耗评估软件,并用于探索低功耗软件的优化。本文介绍了该技术在商用32位risc嵌入式微控制器的综合指令级功耗分析中的应用。介绍了分析的显著结果和基本指令级功率模型。还介绍了基于结果的有趣观察和见解。这种指令级功耗分析可以提供线索,说明处理器微架构设计中的哪些优化将导致实际软件应用程序中最有效的功耗节省。只要结果表明了这种优化,就会对其进行讨论。此外,根据研究结果提出了低功耗软件设计的思路,并在本文中进行了描述。
{"title":"Power analysis of a 32-bit embedded microcontroller","authors":"V. Tiwari, M. Lee","doi":"10.1109/ASPDAC.1995.486215","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486215","url":null,"abstract":"A new approach for power analysis of microprocessors has recently been proposed (Tiwari et al 1994). The idea is to look at the power consumption in a microprocessor from the point of view of the actual software executing on the processor. The basic component of this approach is a measurement based, instruction-level power analysis technique. The technique allows for the development of an instruction-level power model for the given processor, which can be used to evaluate software in terms of the power consumption, and for exploring the optimization of software for lower power. This paper describes the application of this technique for a comprehensive instruction-level power analysis of a commercial 32-bit RISC-based embedded microcontroller. The salient results of the analysis and the basic instruction-level power model are described. Interesting observations and insights based on the results are also presented. Such an instruction-level power analysis can provide cues as to what optimizations in the micro-architecture design of the processor would lead to the most effective power savings in actual software applications. Wherever the results indicate such optimizations, they have been discussed. Furthermore, ideas for low power software design, as suggested by the results, are described in this paper as well.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123492494","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486243
D. Debnath, Tsutomu Sasao
A generalized Reed-Muller expression (GRM) is a type of AND-EXOR expressions. In a GRM, each variable may appear both complemented and uncomplemented. Networks realized using GRMs are easily tested. This paper presents GRMIN, a heuristic simplification algorithm for GRMs of multiple-output functions. GRMIN uses eight rules. As the primary objective, it reduces the number of products, and as the secondary objective, it reduces the number of literals. Experimental results show that, in most cases, GRMs require fewer products than conventional sum-of-products expressions (SOPs). GRMIN outperforms existing algorithms.
{"title":"GRMIN: a heuristic simplification algorithm for generalized Reed-Muller expressions","authors":"D. Debnath, Tsutomu Sasao","doi":"10.1109/ASPDAC.1995.486243","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486243","url":null,"abstract":"A generalized Reed-Muller expression (GRM) is a type of AND-EXOR expressions. In a GRM, each variable may appear both complemented and uncomplemented. Networks realized using GRMs are easily tested. This paper presents GRMIN, a heuristic simplification algorithm for GRMs of multiple-output functions. GRMIN uses eight rules. As the primary objective, it reduces the number of products, and as the secondary objective, it reduces the number of literals. Experimental results show that, in most cases, GRMs require fewer products than conventional sum-of-products expressions (SOPs). GRMIN outperforms existing algorithms.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115241040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486251
J. Cong, D. Xu
Most existing placement algorithms consider only connectivity information during the placement process, and ignore other information available from the higher levels of design process. In this paper, we exploit the use of signal flow and logic dependency in standard cell placement by using the maximum fanout-free cone (MFFC) decomposition technique. We developed a containment tree based algorithm for splitting large MFFCs into smaller ones to get clusters with restricted sizes. We also developed a placement algorithm, named MFFC-TW, which first clusters the circuit based on MFFC decomposition and then feeds the clustered circuit to the Timberwolf 6.0 placement package. Very promising experimental results were obtained.
{"title":"Exploiting signal flow and logic dependency in standard cell placement","authors":"J. Cong, D. Xu","doi":"10.1109/ASPDAC.1995.486251","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486251","url":null,"abstract":"Most existing placement algorithms consider only connectivity information during the placement process, and ignore other information available from the higher levels of design process. In this paper, we exploit the use of signal flow and logic dependency in standard cell placement by using the maximum fanout-free cone (MFFC) decomposition technique. We developed a containment tree based algorithm for splitting large MFFCs into smaller ones to get clusters with restricted sizes. We also developed a placement algorithm, named MFFC-TW, which first clusters the circuit based on MFFC decomposition and then feeds the clustered circuit to the Timberwolf 6.0 placement package. Very promising experimental results were obtained.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115178585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486230
A. Kondratyev, M. Kishinevsky, A. Yakovlev
We investigate the problem of hazard free gate level implementation of speed independent circuits specified by event based models, such as signal transition Graph (for processes with AND causality and input choice) or its extension, called change diagram (which allows OR causality). The main result of the paper is twofold: the proof that any speed independent behavior can be implemented at the gate level without hazards; and an efficient method for such an implementation. This method is based on transformations of the specification to the form satisfying the monotonous cover requirement. Since this method is based on standard gate cells it can be used both in the full custom and semi custom VLSI design. Experimental results demonstrate area and performance efficiency of our method.
{"title":"On hazard-free implementation of speed-independent circuits","authors":"A. Kondratyev, M. Kishinevsky, A. Yakovlev","doi":"10.1109/ASPDAC.1995.486230","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486230","url":null,"abstract":"We investigate the problem of hazard free gate level implementation of speed independent circuits specified by event based models, such as signal transition Graph (for processes with AND causality and input choice) or its extension, called change diagram (which allows OR causality). The main result of the paper is twofold: the proof that any speed independent behavior can be implemented at the gate level without hazards; and an efficient method for such an implementation. This method is based on transformations of the specification to the form satisfying the monotonous cover requirement. Since this method is based on standard gate cells it can be used both in the full custom and semi custom VLSI design. Experimental results demonstrate area and performance efficiency of our method.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"161 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122674920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486211
Chunghee Kim, Hyunchul Shin, Young-Uk Yu
A new performance-driven partitioning algorithm has been developed to implement a large circuit by using multiple FPGA chips. Partitioning for multiple FPGAs has several constraints to satisfy so that each partitioned subcircuit can be implemented in a FPGA chip. To obtain satisfactory results under the constraints, the partitioning is performed in two phases which are the initial partitioning for global optimisation and the iterative partitioning improvements for constraint satisfaction. Experimental results using the MCNC benchmark examples show that our partition method produces better results than those of other recent approaches on the average and that performance-driven partitioning is effective in reducing critical time delays.
{"title":"Performance-driven circuit partitioning for prototyping by using multiple FPGA chips","authors":"Chunghee Kim, Hyunchul Shin, Young-Uk Yu","doi":"10.1109/ASPDAC.1995.486211","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486211","url":null,"abstract":"A new performance-driven partitioning algorithm has been developed to implement a large circuit by using multiple FPGA chips. Partitioning for multiple FPGAs has several constraints to satisfy so that each partitioned subcircuit can be implemented in a FPGA chip. To obtain satisfactory results under the constraints, the partitioning is performed in two phases which are the initial partitioning for global optimisation and the iterative partitioning improvements for constraint satisfaction. Experimental results using the MCNC benchmark examples show that our partition method produces better results than those of other recent approaches on the average and that performance-driven partitioning is effective in reducing critical time delays.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"213 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122853238","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1995-08-01DOI: 10.1109/ASPDAC.1995.486253
M. S. Zamani, G. Hellestrand
In this paper, we introduce a new neural network approach to the placement of gate array designs. The network used is a Kohonen self-organising map. An abstract specification of the design is converted to a set of appropriate input vectors fed to the network at random. At the end of the process, the map shows a 2-dimensional plane of the design in which the modules with higher connectivity are placed adjacent to each other, hence minimising total connection length in the design. The approach can consider external connections and is able to place modules in a rectilinear boundary. These features makes the approach capable of being used in hierarchical floorplanning algorithms.
{"title":"A neural network approach to the placement problem","authors":"M. S. Zamani, G. Hellestrand","doi":"10.1109/ASPDAC.1995.486253","DOIUrl":"https://doi.org/10.1109/ASPDAC.1995.486253","url":null,"abstract":"In this paper, we introduce a new neural network approach to the placement of gate array designs. The network used is a Kohonen self-organising map. An abstract specification of the design is converted to a set of appropriate input vectors fed to the network at random. At the end of the process, the map shows a 2-dimensional plane of the design in which the modules with higher connectivity are placed adjacent to each other, hence minimising total connection length in the design. The approach can consider external connections and is able to place modules in a rectilinear boundary. These features makes the approach capable of being used in hierarchical floorplanning algorithms.","PeriodicalId":119232,"journal":{"name":"Proceedings of ASP-DAC'95/CHDL'95/VLSI'95 with EDA Technofair","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1995-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124641076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}