Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745175
A. Basu, R. Leupers, P. Marwedel
Code optimization for digital signal processors (DSPs) has been identified as an important new topic in system-level design of embedded systems. Both DSP processors and algorithms show special characteristics usually not found in general-purpose computing. Since real-time constraints imposed on DSP algorithms demand very high-quality machine code, high-level language compilers for DSPs should take these characteristics into account. One important characteristic of DSP algorithms is the iterative pattern of references to array elements within loops. DSPs support efficient address computations for such array accesses by means of dedicated address generation units (AGUs). In this paper, we present a heuristic code optimization technique which, given an AGU with a fixed number of address registers, minimizes the number of instructions needed for address computations in loops.
Title: Array index allocation under register constraints in DSP programs. Published in: Proceedings Twelfth International Conference on VLSI Design (Cat. No.PR00013).
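The cost that the abstract above talks about can be illustrated with a small sketch. It is not the authors' heuristic: it only evaluates a given assignment of array references to address registers, under the assumption that an AGU updates a register for free when the step is at most `max_auto_step` and needs one explicit instruction otherwise. The function name, the parameters, and the choice to ignore the wrap-around between loop iterations are all assumptions made for illustration.

```python
from typing import Dict, Sequence


def address_instruction_count(offsets: Sequence[int],
                              assignment: Sequence[int],
                              max_auto_step: int = 1) -> int:
    """Count explicit address-update instructions for one pass over a loop body.

    offsets[i]    -- array index offset of the i-th reference in the loop body
    assignment[i] -- (hypothetical) address register serving the i-th reference

    Assumed cost model: a register moves for free when two consecutive
    references it serves differ by at most max_auto_step; any larger jump
    costs one explicit instruction. Inter-iteration wrap-around is ignored.
    """
    last: Dict[int, int] = {}  # register -> offset it currently points to
    cost = 0
    for off, reg in zip(offsets, assignment):
        if reg in last and abs(off - last[reg]) > max_auto_step:
            cost += 1
        last[reg] = off
    return cost


# References a[i], a[i+3], a[i+1], a[i+4], a[i+2] inside a loop body.
refs = [0, 3, 1, 4, 2]
one_register = address_instruction_count(refs, [0, 0, 0, 0, 0])
two_registers = address_instruction_count(refs, [0, 1, 0, 1, 0])
```

With a single register every step exceeds the free range, while splitting the references across two registers lets each register walk its own subsequence in unit steps. An allocation technique of the kind described would search for such an assignment under the given register limit.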
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745138
M. O’Nils, A. Jantsch
Starting from an architecture- and implementation-independent specification of hardware/software communication protocols, we present a protocol synthesis method that generates a mixed hardware and software implementation. For the hardware part, the synthesis method generates an application-specific direct memory access (DMA) controller for each protocol specification. The software parts of the generated implementation are components for initialization, synchronization and communication with the DMA controller. The grammar-based language ProGram is used to specify the HW/SW communication protocol. Since this approach is based on a device driver synthesis system for software solutions, which adapts the generated device drivers to a selected processor and kernel, the generated hardware/software solutions can also be adapted to any processor and OS kernel. This lets the designer explore the design space for the communication protocols by trading off performance against cost.
Title: Synthesis of DMA controllers from architecture independent descriptions of HW/SW communication protocols
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745186
S. Ramesh
Traditional description techniques like Finite State Machines (FSMs) are inadequate for present-day complex hardware control circuits because they are flat and unstructured. Harel [1987] defined statecharts by introducing concurrent and hierarchical structure to FSMs. Statecharts can be implemented in hardware using the conventional scheme of a combinational-logic block with a feedback register. The main problem here is the encoding of state configurations. The encoding, besides uniquely identifying configurations, should be easily decomposable into the codes of the constituent states, so that the set of permissible transitions in these states can be performed and the resulting outputs and the next configuration can be computed. This paper proposes a new scheme for encoding statechart configurations. The distinguishing feature of this scheme is that it encodes not only basic states but also intermediate states. The encoding is based upon the hierarchical and concurrent structure of statecharts. It has been shown both theoretically and experimentally that the scheme performs better than existing encoding schemes.
Title: Efficient translation of statecharts to hardware circuits
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745140
Ludovic Jacomme, F. Pétrot, R. K. Bawa
This paper deals with the formal identification of flip-flops and latches within VHDL descriptions of hardware systems. Due to the simulation-based semantics of VHDL, existing synthesis tools rely on explicit templates to guarantee the inference of memorizing elements. The approach proposed here is based on a formal representation of VHDL in terms of interpreted Petri nets. A Petri net preserving the simulation semantics is built as a result of VHDL compilation and then reduced to a unique minimal form. A set of equations is extracted and a formal analysis is performed on all cyclic symbol assignments. The result is an RTL VHDL description, synthesizable by any existing synthesis tool. This methodology has been implemented and is illustrated on a set of simple and representative descriptions.
Title: Formal analysis of single WAIT VHDL processes for semantic based synthesis
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745125
P. Singh, S. Jayasimha
We present two memory-, process- and power-efficient algorithmic transformations for the Viterbi Algorithm (VA). The first performs in-place computations, reducing both the memory required and the bit transitions on the data address bus, while the second simplifies the traceback routine of the VA.
Title: A low-complexity, reduced-power Viterbi Algorithm
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745184
Tai-Hung Liu, Malay K. Ganai, A. Aziz, J. Burns
For many digital designs, implementation in pass-transistor logic (PTL) has been shown to be superior in terms of area, timing, and power characteristics to static CMOS. Binary Decision Diagrams (BDDs) have been used for PTL synthesis because of the close relationship between BDDs and PTL. Thus far BDD optimization for PTL synthesis has targeted minimizing the number of BDD nodes. This strategy leads to smaller PTL implementations, but it can result in circuits of poor performance. In this paper we model the delay of PTL circuits derived from BDDs, and propose procedures to reduce the worst-case delay or the area-delay product of such circuits. The experimental results show a significant improvement in the delay (30%) or area-delay product (24%) for the ISCAS benchmark circuits.
Title: Performance driven synthesis for pass-transistor logic
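The tension between node count and performance mentioned in the abstract above can be made concrete with a small sketch. It assumes the common mapping in which each BDD node becomes one 2:1 pass-transistor multiplexer, and it uses the longest root-to-terminal path as a crude stand-in for worst-case delay. This is not the delay model of the paper; the dictionary representation and the function name are hypothetical.

```python
from typing import Dict, Tuple

# node id -> (decision variable, low child, high child); "0" and "1" are terminals
BDD = Dict[str, Tuple[str, str, str]]
TERMINALS = {"0", "1"}


def mux_count_and_depth(bdd: BDD, root: str) -> Tuple[int, int]:
    """Return (reachable node count, longest path length) for a BDD.

    Assumed reading: the node count approximates PTL area (one multiplexer
    per node) and the depth approximates the number of multiplexers a signal
    passes through in series on the slowest path.
    """
    depth_of: Dict[str, int] = {}

    def depth(node: str) -> int:
        if node in TERMINALS:
            return 0
        if node not in depth_of:
            _, low, high = bdd[node]
            depth_of[node] = 1 + max(depth(low), depth(high))
        return depth_of[node]

    longest = depth(root)
    return len(depth_of), longest


# f = a AND b: two nodes in series.
and_bdd: BDD = {"na": ("a", "0", "nb"), "nb": ("b", "0", "1")}
# f = a XOR b: three nodes, still two levels deep.
xor_bdd: BDD = {
    "na": ("a", "nb0", "nb1"),
    "nb0": ("b", "0", "1"),
    "nb1": ("b", "1", "0"),
}
and_cost = mux_count_and_depth(and_bdd, "na")
xor_cost = mux_count_and_depth(xor_bdd, "na")
```

Two BDDs with different node counts can share the same depth, and the reverse also holds, which is why minimizing nodes alone does not control delay.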
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745118
M. P. Chew, S. Saxena, Thomas F. Cobourn, P. K. Mozumder, A. Strojwas
To minimize the time to market and cost of new sub-0.2 µm process technologies and products, PDF Solutions, Inc. has developed a new comprehensive approach based on the use of predictive simulation tools combined with highly efficient experimental design techniques and special test structures. This paper focuses on our approach for concurrent development of new technologies and optimization of cell libraries for these technologies. We present a software system called Circuit Surfer which performs this library optimization in a highly automated fashion and with guaranteed correctness in silicon. We demonstrate several examples of Circuit Surfer applications to cell library design to optimize such objective functions as performance, cell area or yield.
Title: A new methodology for concurrent technology development and cell library optimization
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745224
P. Kar, S. Roy
As the ability to integrate and pack more devices within a die of silicon increases with each new generation of fabrication technology, the complexity of digital systems realizable on a single chip has also grown by leaps and bounds. It is imperative, from the point of view of economics, to be able to migrate any design from one foundry-specific technology to another, as well as from a present-generation fabrication technology to the next generation. The process of porting a layout from one process to another, thereby exploiting the reusability of the layout resources accumulated from the initial process, is called technology migration. We describe the technology migration process with reference to our layout-preserving migration tool "TECHMIG", highlighting our innovative constraint generation algorithms for polygonal layout elements, which are generally encountered in real-life layouts.
Title: TECHMIG: A layout tool for technology migration
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745214
Sandip Das, S. Nandy, B. Bhattacharya
In this paper, we present a new approach to MCM routing to minimize the number of vias and wire length. A 3D routing substrate is partitioned into a number of layers. Chip blocks are placed on the top layer, and routing layers are used pair-wise for interconnections. The set of projected pins of the blocks on a routing layer plays the role of obstacles; the space (river) between two consecutive rows/columns of blocks is used for routing. The proposed algorithm consists of a preprocessing stage that determines a routing order among the nets. For each net, a rectilinear Steiner tree with a minimum number of bends is constructed, and the nets are ordered on the basis of a metric called average path length. Next, routing is done in the nonoverlap model, using a heuristic guided by the above ordering. Finally, via minimization is achieved by slightly re-routing the nets in the overlap model. Experimental evidence on standard benchmarks reveals that our solution produces significantly fewer vias, and compares favourably with respect to wire length against the best known existing results.
Title: High performance MCM routing: a new approach
Pub Date: 1999-01-10, DOI: 10.1109/ICVD.1999.745213
S. Ganesan, R. Vemuri
In this paper we address the routability and analog performance issues involved in routing for array-based FPAAs that have single-segment horizontal and vertical routing resources. We then present FAAR (field-programmable analog array router) and describe a routing algorithm developed for the target array-based FPAA architecture. A sequential routing technique is used for routing multi-terminal nets as well as multiple nets. Multi-terminal nets are broken into two-terminal pairs and routed. We use the notion of resource demand as a measure of the effect of a net-route on the routing of the other nets, while the number of programmable switches and the number of net-crossings are used as the metrics of interconnect parasitics. We present experiments to study the effect of various parameters, such as the number of nets, terminals, CABs and I/O cells, on the routing as well as on the performance degradation. FAAR routes with high efficiency while keeping performance degradation small, and has short execution times.
Title: FAAR: A router for field-programmable analog arrays
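The abstract above states that multi-terminal nets are broken into two-terminal pairs but does not say how. One plausible way to do this, shown here purely as an assumption, is to grow a minimum spanning tree over the terminals using Manhattan distance (Prim's algorithm). The function name and the coordinate representation are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Rectilinear distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def two_terminal_pairs(terminals: List[Point]) -> List[Tuple[Point, Point]]:
    """Split one multi-terminal net into n - 1 two-terminal connections.

    Assumed strategy (not taken from the paper): repeatedly attach the
    unconnected terminal that is closest to any already connected terminal.
    """
    if len(terminals) < 2:
        return []
    connected = [terminals[0]]
    remaining = list(terminals[1:])
    pairs: List[Tuple[Point, Point]] = []
    while remaining:
        src, dst = min(
            ((c, r) for c in connected for r in remaining),
            key=lambda pair: manhattan(pair[0], pair[1]),
        )
        pairs.append((src, dst))
        connected.append(dst)
        remaining.remove(dst)
    return pairs


net = [(0, 0), (5, 0), (1, 0)]
pairs = two_terminal_pairs(net)
```

Each resulting pair could then be handed to a two-terminal router one after another, which matches the sequential style described in the abstract.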