Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643530
G. Ellis, L. Pileggi, Rob A. Rutenbar
This paper describes a novel methodology to automate the design of the interconnect distribution for multistage clock circuits. We introduce two key ideas. First, a hierarchical decomposition of the layout divides the problem into a set of local Steiner-wired latch clusters (to minimize and balance local capacitance) fed globally by a balanced binary tree (to maximize performance). Second, we recast the global clock distribution problem as a simultaneous optimization of clock topology, clock segment routing, wire sizing and buffering. The hierarchical decomposition reduces the problem complexity and allows use of more aggressive optimization techniques. Integration of the geometric and electrical optimizations likewise allows more aggressive performance goals. Experiments with an industrial design comprising over 16,000 latches demonstrate the efficiency of the approach: a complete clock distribution solution met a 200-MHz cycle time specification with only 310 ps of skew, met strict current density constraints, exhibited good delay matching across uniform wire width and device variations, and was completed in under 10 CPU hours.
{"title":"A hierarchical decomposition methodology for multistage clock circuits","authors":"G. Ellis, L. Pileggi, Rob A. Rutenbar","doi":"10.1109/ICCAD.1997.643530","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643530","url":null,"abstract":"This paper describes a novel methodology to automate the design of the interconnect distribution for multistage clock circuits. We introduce two key ideas. First, a hierarchical decomposition of the layout divides the problem into a set of local Steiner-wired latch clusters (to minimize and balance local capacitance) fed globally by a balanced binary tree (to maximize performance). Second, we recast the global clock distribution problem as a simultaneous optimization of clock topology, clock segment routing, wire sizing and buffering. The hierarchical decomposition reduces the problem complexity and allows use of more aggressive optimization techniques. Integration of the geometric and electrical optimizations likewise allows more aggressive performance goals. Experiments with an industrial design comprising over 16,000 latches demonstrate the efficiency of the approach: a complete clock distribution solution met a 200-MHz cycle time specification with only 310 ps of skew, met strict current density constraints, exhibited good delay matching across uniform wire width and device variations, and was completed in under 10 CPU hours.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114780781","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643613
Rachid Helaihel, K. Olukotun
The specification language is a critical component of the hardware-software co-design process since it is used for functional validation and as a starting point for hardware-software partitioning and co-synthesis. The paper proposes the Java programming language as a specification language for hardware-software systems. Java has several characteristics that make it suitable for system specification. However static control and data flow analysis of Java programs is problematic because Java classes are dynamically linked. The paper provides a general solution to the problem of statically analyzing Java programs using a technique that pre-allocates most class instances and aggressively resolves memory aliasing using global analysis. The output of the analysis is a control data flow graph for the input specification. The results for sample designs show that the analysis can extract fine to coarse-grained concurrency for subsequent hardware-software partitioning and co-synthesis steps of the hardware-software co-design process to exploit.
{"title":"Java as a specification language for hardware-software systems","authors":"Rachid Helaihel, K. Olukotun","doi":"10.1109/ICCAD.1997.643613","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643613","url":null,"abstract":"The specification language is a critical component of the hardware-software co-design process since it is used for functional validation and as a starting point for hardware-software partitioning and co-synthesis. The paper proposes the Java programming language as a specification language for hardware-software systems. Java has several characteristics that make it suitable for system specification. However static control and data flow analysis of Java programs is problematic because Java classes are dynamically linked. The paper provides a general solution to the problem of statically analyzing Java programs using a technique that pre-allocates most class instances and aggressively resolves memory aliasing using global analysis. The output of the analysis is a control data flow graph for the input specification. The results for sample designs show that the analysis can extract fine to coarse-grained concurrency for subsequent hardware-software partitioning and co-synthesis steps of the hardware-software co-design process to exploit.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"295 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133719721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1007/978-1-4615-0292-0_35
P. Feldmann, R. Freund
{"title":"Circuit noise evaluation by Pade approximation based model-reduction techniques","authors":"P. Feldmann, R. Freund","doi":"10.1007/978-1-4615-0292-0_35","DOIUrl":"https://doi.org/10.1007/978-1-4615-0292-0_35","url":null,"abstract":"","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125787612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643618
Sying-Jyan Wang, Tsi-Ming Tsai
Since field programmable gate arrays (FPGAs) are reprogrammable, faults in them can be easily tolerated once fault sites are located. We present a method for the testing and diagnosis of faults in FPGAs. The proposed method imposes no hardware overhead, and requires minimal support from external test equipment. Test time depends only on the number of faults, and is independent of the chip size. With the help of this technique, chips with faults can still be used. As a result, the chip yield can be improved and chip cost is reduced. Experimental results are given to show the feasibility of this method.
{"title":"Test and diagnosis of faulty logic blocks in FPGAs","authors":"Sying-Jyan Wang, Tsi-Ming Tsai","doi":"10.1109/ICCAD.1997.643618","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643618","url":null,"abstract":"Since field programmable gate arrays (FPGAs) are reprogrammable, faults in them can be easily tolerated once fault sites are located. We present a method for the testing and diagnosis of faults in FPGAs. The proposed method imposes no hardware overhead, and requires minimal support from external test equipment. Test time depends only on the number of faults, and is independent of the chip size. With the help of this technique, chips with faults can still be used. As a result, the chip yield can be improved and chip cost is reduced. Experimental results are given to show the feasibility of this method.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"440 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123562654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643593
H. Konuk
We describe a highly accurate but efficient fault simulator for interconnect opens, based on characterizing the standard cell library with SPICE; using transistor charge equations for the site of the open; using logic simulation for the rest of the circuit; taking four different factors, that can affect the voltage of an open, into account; and considering the oscillation and sequential behaviour potential of opens. A novel test technique based on controlling the die surface voltage is also described. We present simulation results of ISCAS85 layouts using stuck-at and IDDQ test sets.
{"title":"Fault simulation of interconnect opens in digital CMOS circuits","authors":"H. Konuk","doi":"10.1109/ICCAD.1997.643593","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643593","url":null,"abstract":"We describe a highly accurate but efficient fault simulator for interconnect opens, based on characterizing the standard cell library with SPICE; using transistor charge equations for the site of the open; using logic simulation for the rest of the circuit; taking four different factors, that can affect the voltage of an open, into account; and considering the oscillation and sequential behaviour potential of opens. A novel test technique based on controlling the die surface voltage is also described. We present simulation results of ISCAS85 layouts using stuck-at and IDDQ test sets.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124933255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/iccad.1997.643534
E. Goldberg, T. Villa, R. Brayton, A. Sangiovanni-Vincentelli
We present a new matrix formulation of the face hypercube embedding problem that motivates the design of an efficient search strategy to find an encoding that satisfies all faces of minimum length. Increasing dimensions of the Boolean space are explored; for a given dimension constraints are satisfied one at a time. The following features help to reduce the nodes of the solution space that must be explored: candidate cubes instead of candidate codes are generated, cubes yielding symmetric solutions are not generated, a smaller sufficient set of solutions (producing basic sections) is explored, necessary conditions help discard unsuitable candidate cubes, early detection that a partial solution cannot be extended to be a global solution prunes infeasible portions of the search tree. We have implemented a prototype package MINSK based on the previous ideas and run experiments to evaluate it. The experiments show that MINSK is faster and solves more problems than any available algorithm. Moreover, MINSK is a robust algorithm, while most of the proposed alternatives are not. Besides most problems of the complete MCNC benchmark suite, other solved examples include an important set of decoder PLAs coming from the design of microprocessor instruction sets.
{"title":"A fast and robust exact algorithm for face embedding","authors":"E. Goldberg, T. Villa, R. Brayton, A. Sangiovanni-Vincentelli","doi":"10.1109/iccad.1997.643534","DOIUrl":"https://doi.org/10.1109/iccad.1997.643534","url":null,"abstract":"We present a new matrix formulation of the face hypercube embedding problem that motivates the design of an efficient search strategy to find an encoding that satisfies all faces of minimum length. Increasing dimensions of the Boolean space are explored; for a given dimension constraints are satisfied one at a time. The following features help to reduce the nodes of the solution space that must be explored: candidate cubes instead of candidate codes are generated, cubes yielding symmetric solutions are not generated, a smaller sufficient set of solutions (producing basic sections) is explored, necessary conditions help discard unsuitable candidate cubes, early detection that a partial solution cannot be extended to be a global solution prunes infeasible portions of the search tree. We have implemented a prototype package MINSK based on the previous ideas and run experiments to evaluate it. The experiments show that MINSK is faster and solves more problems than any available algorithm. Moreover, MINSK is a robust algorithm, while most of the proposed alternatives are not. Besides most problems of the complete MCNC benchmark suite, other solved examples include an important set of decoder PLAs coming from the design of microprocessor instruction sets.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130310396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643563
F. Leyn, W. Daems, G. Gielen, W. Sansen
This paper describes a new modeling methodology that allows to derive systematically behavioral signal path models of operational amplifiers. Combined with symbolic simulation, these models provide high qualitative insight into the small-signal functioning of a circuit. The behavioral signal path model provides compact interpretable expressions for the poles and zeros that constitute the signal path. These expressions show which design parameters have dominant influence on the position of a pole/zero and thus enable a designer to control a manual interactive sizing process. The methodology consists of the application of a sequence of abstractions, so that one gradually progresses from a full device to a full behavior circuit representation. During this translation, qualitative insight and design requirements are obtained. The methodology is implemented in an open tool called EF2ef. The behavioral signal path model is also used for optimization based sizing in order to achieve pole placement in an efficient way. For optimization based siting, a new strategy for hierarchical penalty function composition is proposed, which allows sequential pruning of the design space. Combined with an operating point driven DC formulation and local minimax optimization, a fast sizing method is obtained which can be used for interactive design space exploration. Experimental results of both modeling and siting are shown.
{"title":"A behavioral signal path modeling methodology for qualitative insight in and efficient sizing of CMOS opamps","authors":"F. Leyn, W. Daems, G. Gielen, W. Sansen","doi":"10.1109/ICCAD.1997.643563","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643563","url":null,"abstract":"This paper describes a new modeling methodology that allows to derive systematically behavioral signal path models of operational amplifiers. Combined with symbolic simulation, these models provide high qualitative insight into the small-signal functioning of a circuit. The behavioral signal path model provides compact interpretable expressions for the poles and zeros that constitute the signal path. These expressions show which design parameters have dominant influence on the position of a pole/zero and thus enable a designer to control a manual interactive sizing process. The methodology consists of the application of a sequence of abstractions, so that one gradually progresses from a full device to a full behavior circuit representation. During this translation, qualitative insight and design requirements are obtained. The methodology is implemented in an open tool called EF2ef. The behavioral signal path model is also used for optimization based sizing in order to achieve pole placement in an efficient way. For optimization based siting, a new strategy for hierarchical penalty function composition is proposed, which allows sequential pruning of the design space. Combined with an operating point driven DC formulation and local minimax optimization, a fast sizing method is obtained which can be used for interactive design space exploration. Experimental results of both modeling and siting are shown.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"155 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127272624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643610
J. Lou, Amir H. Salek, Massoud Pedram
The authors present an optimal algorithm for solving the simultaneous technology mapping and linear placement problem for tree-structured circuits with the objective of minimizing the post-layout area. The proposed algorithm relies on the generation of gate-area versus cut-width curves using a dynamic programming approach. A novel design flow which extends this algorithm to minimize the circuit delay and handle general DAG structures is also presented. Experimental results on MCNC benchmarks are reported.
{"title":"An exact solution to simultaneous technology mapping and linear placement problem","authors":"J. Lou, Amir H. Salek, Massoud Pedram","doi":"10.1109/ICCAD.1997.643610","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643610","url":null,"abstract":"The authors present an optimal algorithm for solving the simultaneous technology mapping and linear placement problem for tree-structured circuits with the objective of minimizing the post-layout area. The proposed algorithm relies on the generation of gate-area versus cut-width curves using a dynamic programming approach. A novel design flow which extends this algorithm to minimize the circuit delay and handle general DAG structures is also presented. Experimental results on MCNC benchmarks are reported.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128857193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643251
Yirng-An Chen, R. Bryant
Data structures such as *BMDs, HDDs, and K*BMDs provide compact representations for functions which map Boolean vectors into integer values, but not floating point values. We propose a new data structure, called Multiplicative Power Hybrid Decision Diagrams (*PHDDs), to provide a compact representation for functions that map Boolean vectors into integer or floating point values. The size of the graph to represent the IEEE floating point encoding is linear with the word size. The complexity of floating point multiplication grows linearly with the word size. The complexity of floating point addition grows exponentially with the size of the exponent part, but linearly with the size of the mantissa part. We applied *PHDDs to verify integer multipliers and floating point multipliers before the rounding stage, based on a hierarchical verification approach. For integer multipliers, our results are at least 6 times faster than *BMDs. Previous attempts at verifying floating point multipliers required manual intervention. We verified floating point multipliers before the rounding stage automatically.
{"title":"*PHDD: an efficient graph representation for floating point circuit verification","authors":"Yirng-An Chen, R. Bryant","doi":"10.1109/ICCAD.1997.643251","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643251","url":null,"abstract":"Data structures such as *BMDs, HDDs, and K*BMDs provide compact representations for functions which map Boolean vectors into integer values, but not floating point values. We propose a new data structure, called Multiplicative Power Hybrid Decision Diagrams (*PHDDs), to provide a compact representation for functions that map Boolean vectors into integer or floating point values. The size of the graph to represent the IEEE floating point encoding is linear with the word size. The complexity of floating point multiplication grows linearly with the word size. The complexity of floating point addition grows exponentially with the size of the exponent part, but linearly with the size of the mantissa part. We applied *PHDDs to verify integer multipliers and floating point multipliers before the rounding stage, based on a hierarchical verification approach. For integer multipliers, our results are at least 6 times faster than *BMDs. Previous attempts at verifying floating point multipliers required manual intervention. We verified floating point multipliers before the rounding stage automatically.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124244865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643547
Wray L. Buntine, L. Su, A. Newton, Andrew Mayer
An algorithm that remains in use at the core of many partitioning systems is the Kemighan-Lin algorithm and a variant the Fidducia-Matheysses (FM) algorithm. To understand the FM algorithm we applied principles of data engineering where visualization and statistical analysis are used to analyze the run-time behavior. We identified two improvements to the algorithm which, without clustering or an improved heuristic function, bring the performance of the algorithm near that of more sophisticated algorithms. One improvement is based on the observation, explored empirically, that the full passes in the FM algorithm appear comparable to a stochastic local restart in the search. We motivate this observation with a discussion of recent improvements in Monte Carlo Markov Chain methods in statistics. The other improvement is based on the observation that when an FM-like algorithm is run 20 times and the best run chosen, the performance trace of the algorithm on earlier runs is useful data for learning when to abort a later run. These improvements, implemented with a simple adaptive scheme, are orthogonal to techniques used in state-of-the-art implementations, and therefore should be applicable to other VLSI optimization algorithms.
{"title":"Adaptive methods for netlist partitioning","authors":"Wray L. Buntine, L. Su, A. Newton, Andrew Mayer","doi":"10.1109/ICCAD.1997.643547","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643547","url":null,"abstract":"An algorithm that remains in use at the core of many partitioning systems is the Kemighan-Lin algorithm and a variant the Fidducia-Matheysses (FM) algorithm. To understand the FM algorithm we applied principles of data engineering where visualization and statistical analysis are used to analyze the run-time behavior. We identified two improvements to the algorithm which, without clustering or an improved heuristic function, bring the performance of the algorithm near that of more sophisticated algorithms. One improvement is based on the observation, explored empirically, that the full passes in the FM algorithm appear comparable to a stochastic local restart in the search. We motivate this observation with a discussion of recent improvements in Monte Carlo Markov Chain methods in statistics. The other improvement is based on the observation that when an FM-like algorithm is run 20 times and the best run chosen, the performance trace of the algorithm on earlier runs is useful data for learning when to abort a later run. These improvements, implemented with a simple adaptive scheme, are orthogonal to techniques used in state-of-the-art implementations, and therefore should be applicable to other VLSI optimization algorithms.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115307914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}