Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643375
R. Aggarwal, R. Murgai, M. Fujita
Technology-independent timing optimization is an important problem in logic synthesis. Although many promising techniques have been proposed in the past, unfortunately they are quite slow and thus impractical for large networks. In this paper, we propose DEPART, a delay-based partitioner-cum-optimizer, which purports to solve this problem. Given a combinational logic network that is to be optimized for timing, DEPART divides it into sub-networks using timing information and a constraint on the maximum number of gates allowed in a single sub-network. These sub-networks are then dispatched, one by one, to a standard timing optimizer. The optimized sub-networks are re-glued, generating an optimized network. The challenge is how to partition the original network into sub-networks so that the final solution quality after partitioning and optimization is comparable to that from the timing optimizer. We propose a partitioning technique that is timing-driven and is simple yet effective. We compare DEPART with speed-up, a state-of-the-art timing optimization tool, and with various partitioning techniques such as min-cut based and region growing, on a suite of large industrial and ISCAS circuits. On more than half of the benchmarks, DEPART yields run-time improvements of 20 to 450 times over a normal invocation of speed-up (the overall average improvement being 8 times), without compromising the solution quality much. Min-cut and region growing partitioning schemes, not being timing-driven, perform poorly in terms of the final circuit delay.
{"title":"Speeding up technology-independent timing optimization by network partitioning","authors":"R. Aggarwal, R. Murgai, M. Fujita","doi":"10.1109/ICCAD.1997.643375","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643375","url":null,"abstract":"Technology-independent timing optimization is an important problem in logic synthesis. Although many promising techniques have been proposed in the past, unfortunately they are quite slow and thus impractical for large networks. In this paper, we propose DEPART, a delay-based partitioner-cum-optimizer, which purports to solve this problem. Given a combinational logic network that is to be optimized for timing, DEPART divides it into sub-networks using timing information and a constraint on the maximum number of gates allowed in a single sub-network. These sub-networks are then dispatched, one by one, to a standard timing optimizer. The optimized sub-networks are re-glued, generating an optimized network. The challenge is how to partition the original network into sub-networks so that the final solution quality after partitioning and optimization is comparable to that from the timing optimizer. We propose a partitioning technique that is timing-driven and is simple yet effective. We compare DEPART with speed-up, a state-of-the-art timing optimization tool, and with various partitioning techniques such as min-cut based and region growing, on a suite of large industrial and ISCAS circuits. On more than half of the benchmarks, DEPART yields run-time improvements of 20 to 450 times over a normal invocation of speed-up (the overall average improvement being 8 times), without compromising the solution quality much. Min-cut and region growing partitioning schemes, not being timing-driven, perform poorly in terms of the final circuit delay.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"28 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129389913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643566
D. Stoffel, W. Kunz
This paper proposes a technique for sequential logic equivalence checking by a structural fixed point iteration. Verification is performed by expanding the circuit into an iterative circuit array and by proving equivalence of each time frame by well-known combinational verification techniques. These exploit structural similarity between designs by local circuit transformations. Starting from the initial state, for each time frame the performed circuit transformations are stored (recorded) in an instruction queue. In subsequent time frames the instruction queue is re-used (played) and updated when necessary. At some point the instruction queue does not need to be modified any more and is valid in all subsequent time frames. Thus, a fixed point is reached and machine equivalence is proved by induction. Experimental results show the great promise of this approach to verify circuits after resynthesis and retiming.
{"title":"Record and play: a structural fixed point iteration for sequential circuit verification","authors":"D. Stoffel, W. Kunz","doi":"10.1109/ICCAD.1997.643566","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643566","url":null,"abstract":"This paper proposes a technique for sequential logic equivalence checking by a structural fixed point iteration. Verification is performed by expanding the circuit into an iterative circuit array and by proving equivalence of each time frame by well-known combinational verification techniques. These exploit structural similarity between designs by local circuit transformations. Starting from the initial state, for each time frame the performed circuit transformations are stored (recorded) in an instruction queue. In subsequent time frames the instruction queue is re-used (played) and updated when necessary. At some point the instruction queue does not need to be modified any more and is valid in all subsequent time frames. Thus, a fixed point is reached and machine equivalence is proved by induction. Experimental results show the great promise of this approach to verify circuits after resynthesis and retiming.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114274749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643606
Keerthi Heragu, J. Patel, V. Agrawal
The authors propose a novel algorithm to rapidly identify untestable delay faults using pre-computed static logic implications. The fault-independent analysis identifies large sets of untestable faults, if any, without enumerating them. The cardinalities of these sets are obtained by using a counting algorithm that has quadratic complexity in the number of lines. Since the method is based on an incomplete set of logic implications, it gives only a lower bound on the number of untestable faults. A post-processing step can list the untestable faults, if desired. Targeting untestable delay faults for test generation by an automatic test pattern generation (ATPG) tool can be avoided. The method works for the segment delay fault model and its special case, the path delay fault model, and identifies robustly untestable, non-robustly untestable, and functionally unsensitizable delay faults. Results on benchmark circuits show that many delay faults are identified as untestable in a very short time. For the benchmark circuit c6288, the algorithm identified 1.978/spl times/10/sup 20/ functionally unsensitizable path faults in 3 CPU seconds.
{"title":"Fast identification of untestable delay faults using implications","authors":"Keerthi Heragu, J. Patel, V. Agrawal","doi":"10.1109/ICCAD.1997.643606","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643606","url":null,"abstract":"The authors propose a novel algorithm to rapidly identify untestable delay faults using pre-computed static logic implications. The fault-independent analysis identifies large sets of untestable faults, if any, without enumerating them. The cardinalities of these sets are obtained by using a counting algorithm that has quadratic complexity in the number of lines. Since the method is based on an incomplete set of logic implications, it gives only a lower bound on the number of untestable faults. A post-processing step can list the untestable faults, if desired. Targeting untestable delay faults for test generation by an automatic test pattern generation (ATPG) tool can be avoided. The method works for the segment delay fault model and its special case, the path delay fault model, and identifies robustly untestable, non-robustly untestable, and functionally unsensitizable delay faults. Results on benchmark circuits show that many delay faults are identified as untestable in a very short time. For the benchmark circuit c6288, the algorithm identified 1.978/spl times/10/sup 20/ functionally unsensitizable path faults in 3 CPU seconds.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125106541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643378
E. Goldberg, L. Carloni, T. Villa, R. Brayton, A. Sangiovanni-Vincentelli
We introduce a new technique to solve exactly a discrete optimization problem, based on the paradigm of "negative" thinking. The motivation is that when searching the space of solutions, often a good solution is reached quickly and then improved only a few times before the optimum is found: hence most of the solution space is explored to certify optimality, but it does not yield any improvement of the cost function. So it is quite natural for an algorithm to be "skeptical" about the chance to improve the current best solution. For illustration we have applied our approach to the unate covering problem. We designed a procedure, raiser, implementing a negative thinking search, which is incorporated into a common branch-and-bound procedure. Experiments show that our program, AURA, outperforms both ESPRESSO and our enhancement of ESPRESSO using Coudert's limit lower bound. It is always faster and in the most difficult examples either has a running time better by up to two orders of magnitude, or the other programs fail to finish due to timeout or spaceout. The package SCHERZO is faster on some examples and loses on others, due to a less powerful pruning strategy of the search space, partially mitigated by a more effective computation of the maximal independent set.
{"title":"Negative thinking by incremental problem solving: application to unate covering","authors":"E. Goldberg, L. Carloni, T. Villa, R. Brayton, A. Sangiovanni-Vincentelli","doi":"10.1109/ICCAD.1997.643378","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643378","url":null,"abstract":"We introduce a new technique to solve exactly a discrete optimization problem, based on the paradigm of \"negative\" thinking. The motivation is that when searching the space of solutions, often a good solution is reached quickly and then improved only a few times before the optimum is found: hence most of the solution space is explored to certify optimality, but it does not yield any improvement of the cost function. So it is quite natural for an algorithm to be \"skeptical\" about the chance to improve the current best solution. For illustration we have applied our approach to the unate covering problem. We designed a procedure, raiser, implementing a negative thinking search, which is incorporated into a common branch-and-bound procedure. Experiments show that our program, AURA, outperforms both ESPRESSO and our enhancement of ESPRESSO using Coudert's limit lower bound. It is always faster and in the most difficult examples either has a running time better by up to two orders of magnitude, or the other programs fail to finish due to timeout or spaceout. The package SCHERZO is faster on some examples and loses on others, due to a less powerful pruning strategy of the search space, partially mitigated by a more effective computation of the maximal independent set.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121930413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643529
A. Takahashi, Kazunori Inoue, Y. Kajitani
It is known that the clock period can be shorter than the maximum of the signal delays between registers if the clock arrival time to each register is properly scheduled. The algorithm to design an optimal clock schedule is given. In this paper, we propose a clock-tree routing algorithm that realizes a given clock schedule using the Elmore delay model. Following the deferred-merge-embedding (DME) framework, the algorithm generates a topology of the clock tree and determines the locations and sizes of intermediate buffers simultaneously. The experimental results show that this method constructs clock trees with moderate wire length compared with that of zero-skew clock trees.
{"title":"Clock-tree routing realizing a clock-schedule for semi-synchronous circuits","authors":"A. Takahashi, Kazunori Inoue, Y. Kajitani","doi":"10.1109/ICCAD.1997.643529","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643529","url":null,"abstract":"It is known that the clock period can be shorter than the maximum of the signal delays between registers if the clock arrival time to each register is properly scheduled. The algorithm to design an optimal clock schedule is given. In this paper, we propose a clock-tree routing algorithm that realizes a given clock schedule using the Elmore delay model. Following the deferred-merge-embedding (DME) framework, the algorithm generates a topology of the clock tree and determines the locations and sizes of intermediate buffers simultaneously. The experimental results show that this method constructs clock trees with moderate wire length compared with that of zero-skew clock trees.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121567362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643261
Jian Li, Rajesh K. Gupta
Presynthesis optimizations transform a behavioral HDL description into an optimized HDL description that results in improved synthesis results. We introduce the decomposition of timed decision tables (TDT), a tabular model of system behavior. The TDT decomposition is based on the kernel extraction algorithm. By experimenting using named benchmarks, we demonstrate how TDT decomposition can be used in presynthesis optimizations.
{"title":"Decomposition of timed decision tables and its use in presynthesis optimizations","authors":"Jian Li, Rajesh K. Gupta","doi":"10.1109/ICCAD.1997.643261","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643261","url":null,"abstract":"Presynthesis optimizations transform a behavioral HDL description into an optimized HDL description that results in improved synthesis results. We introduce the decomposition of timed decision tables (TDT), a tabular model of system behavior. The TDT decomposition is based on the kernel extraction algorithm. By experimenting using named benchmarks, we demonstrate how TDT decomposition can be used in presynthesis optimizations.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131329278","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643622
K. Kundert
The principles employed in the development of modern RF simulators are introduced and the various techniques currently in use, or expected to be in use in the next few years, are surveyed. Frequency and time domain techniques are presented and contrasted, as are steady state and envelope techniques and large and small signal techniques.
{"title":"Simulation methods for RF integrated circuits","authors":"K. Kundert","doi":"10.1109/ICCAD.1997.643622","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643622","url":null,"abstract":"The principles employed in the development of modern RF simulators are introduced and the various techniques currently in use, or expected to be in use in the next few years, are surveyed. Frequency and time domain techniques are presented and contrasted, as are steady state and envelope techniques and large and small signal techniques.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"11 10","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120931043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643581
J. Kozhaya, F. Najm
A power estimation approach is presented in which blocks of consecutive vectors are selected at random from a user-supplied realistic input vector set and the circuit is simulated for each block starting from an unknown state. This leads to two (upper and lower) bounds on the desired power value which can be quite tight (under 10% difference between the two in many cases). As a result, the power dissipation is obtained by simulating only a fraction of the potentially very large vector set.
{"title":"Accurate power estimation for large sequential circuits","authors":"J. Kozhaya, F. Najm","doi":"10.1109/ICCAD.1997.643581","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643581","url":null,"abstract":"A power estimation approach is presented in which blocks of consecutive vectors are selected at random from a user-supplied realistic input vector set and the circuit is simulated for each block starting from an unknown state. This leads to two (upper and lower) bounds on the desired power value which can be quite tight (under 10% difference between the two in many cases). As a result, the power dissipation is obtained by simulating only a fraction of the potentially very large vector set.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121475423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643582
L. Benini, G. Micheli, E. Macii, M. Poncino, R. Scarsi
The power dissipated by digital systems under realistic input stimuli is not accurately described by a single average value, but by a waveform that shows how power consumption varies over time as the system responds to the inputs. We face the problem of obtaining accurate power waveforms for combinational and sequential circuits under typical usage patterns. We propose a multi level simulation engine that achieves high accuracy in estimating the average power as well as the time domain power waveform with high computational efficiency.
{"title":"Fast power estimation for deterministic input streams","authors":"L. Benini, G. Micheli, E. Macii, M. Poncino, R. Scarsi","doi":"10.1109/ICCAD.1997.643582","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643582","url":null,"abstract":"The power dissipated by digital systems under realistic input stimuli is not accurately described by a single average value, but by a waveform that shows how power consumption varies over time as the system responds to the inputs. We face the problem of obtaining accurate power waveforms for combinational and sequential circuits under typical usage patterns. We propose a multi level simulation engine that achieves high accuracy in estimating the average power as well as the time domain power waveform with high computational efficiency.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"11 20","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133204887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643572
Jason Y. Zien, P. K. Chan, M. Schlag
We develop a new multi-way, hybrid spectral/iterative hypergraph partitioning algorithm that combines the strengths of spectral partitioners and iterative improvement algorithms to create a new class of partitioners. We use spectral information (the eigenvectors of a graph) to generate initial partitions, influence the selection of iterative improvement moves, and break out of local minima. Our 3-way and 4-way partitioning results exhibit significant improvement over current published results, demonstrating the effectiveness of our new method. Our hybrid algorithm produces an improvement of 25% over GFM for 3-way partitions, 41% improvement over GFM for 4-way partitions, and 58% improvement over ML/sub F/ for 4-way partitions.
{"title":"Hybrid spectral/iterative partitioning","authors":"Jason Y. Zien, P. K. Chan, M. Schlag","doi":"10.1109/ICCAD.1997.643572","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643572","url":null,"abstract":"We develop a new multi-way, hybrid spectral/iterative hypergraph partitioning algorithm that combines the strengths of spectral partitioners and iterative improvement algorithms to create a new class of partitioners. We use spectral information (the eigenvectors of a graph) to generate initial partitions, influence the selection of iterative improvement moves, and break out of local minima. Our 3-way and 4-way partitioning results exhibit significant improvement over current published results, demonstrating the effectiveness of our new method. Our hybrid algorithm produces an improvement of 25% over GFM for 3-way partitions, 41% improvement over GFM for 4-way partitions, and 58% improvement over ML/sub F/ for 4-way partitions.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114572520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}