Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643537
O. Bringmann, W. Rosenstiel
This paper presents a new approach to hierarchical high-level synthesis with respect to internal register-transfer structures of complex components. Entire subdesigns can efficiently be used as complex components at a higher hierarchical level of the design. After synthesis, the calculated schedule of each subdesign is added to its register-transfer component model. This enables the sharing of unused subcomponents across different hierarchical levels of the design. Especially, subcomponents of autonomous components, with a separate controller can also be shared. As a result, the presented methodology offers a high degree of optimization to hierarchically specified designs.
{"title":"Resource sharing in hierarchical synthesis","authors":"O. Bringmann, W. Rosenstiel","doi":"10.1109/ICCAD.1997.643537","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643537","url":null,"abstract":"This paper presents a new approach to hierarchical high-level synthesis with respect to internal register-transfer structures of complex components. Entire subdesigns can efficiently be used as complex components at a higher hierarchical level of the design. After synthesis, the calculated schedule of each subdesign is added to its register-transfer component model. This enables the sharing of unused subcomponents across different hierarchical levels of the design. Especially, subcomponents of autonomous components, with a separate controller can also be shared. As a result, the presented methodology offers a high degree of optimization to hierarchically specified designs.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115882758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We present a completely new approach to the problem of delay minimization by simultaneous buffer insertion and wire sizing for a wire. We show that the problem can be formulated as a convex quadratic program, which is known to be solvable in polynomial time. Nevertheless, we explore some special properties of our problem and derive on optimal and very efficient algorithm to solve the resulting program. Given m buffers and a set of n discrete choices of wire width, the running time of our algorithm is O(mn/sup 2/) and is independent of the wire length in practice. For example, an instance of 100 buffers and 100 choices of wire width can be solved in 3 seconds. Besides, our formulation is so versatile that it is easy to consider other objectives like wire area or power dissipation, or to add constraints to the solution. Also, wire capacitance lookup tables, or very general wire capacitance models which can capture area capacitance, fringing capacitance, coupling capacitance, etc. can be used.
{"title":"A new approach to simultaneous buffer insertion and wire sizing","authors":"Lei He, A. Kahng, K. Tam, Jinjun Xiong","doi":"10.5555/266388.266564","DOIUrl":"https://doi.org/10.5555/266388.266564","url":null,"abstract":"We present a completely new approach to the problem of delay minimization by simultaneous buffer insertion and wire sizing for a wire. We show that the problem can be formulated as a convex quadratic program, which is known to be solvable in polynomial time. Nevertheless, we explore some special properties of our problem and derive on optimal and very efficient algorithm to solve the resulting program. Given m buffers and a set of n discrete choices of wire width, the running time of our algorithm is O(mn/sup 2/) and is independent of the wire length in practice. For example, an instance of 100 buffers and 100 choices of wire width can be solved in 3 seconds. Besides, our formulation is so versatile that it is easy to consider other objectives like wire area or power dissipation, or to add constraints to the solution. Also, wire capacitance lookup tables, or very general wire capacitance models which can capture area capacitance, fringing capacitance, coupling capacitance, etc. can be used.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"32 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123205401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643255
P. Vuillod, L. Benini, G. Micheli
We present a novel approach for post-mapping optimization. We exploit the concept of generalised matching, a technique that finds symbolically all possible matching assignments of library cells to a multi-output network specified by a Boolean relation. Several objectives are targeted: area minimization under delay constraints; power minimization under delay constraints; and unconstrained delay minimization. We describe the theory of generalized matching and the algorithmic optimization required for its efficient and robust implementation. A tool based on generalized matching has been implemented and tested on large examples of the MCNC'91 benchmark suite. We obtain sizable improvements in: speed (6% in average, up to 20.7%); area under speed constraints (13.7% an average, up to 29.5%); and power under speed constraints (22.3% in average, up to 38.1%).
{"title":"Generalized matching from theory to application","authors":"P. Vuillod, L. Benini, G. Micheli","doi":"10.1109/ICCAD.1997.643255","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643255","url":null,"abstract":"We present a novel approach for post-mapping optimization. We exploit the concept of generalised matching, a technique that finds symbolically all possible matching assignments of library cells to a multi-output network specified by a Boolean relation. Several objectives are targeted: area minimization under delay constraints; power minimization under delay constraints; and unconstrained delay minimization. We describe the theory of generalized matching and the algorithmic optimization required for its efficient and robust implementation. A tool based on generalized matching has been implemented and tested on large examples of the MCNC'91 benchmark suite. We obtain sizable improvements in: speed (6% in average, up to 20.7%); area under speed constraints (13.7% an average, up to 29.5%); and power under speed constraints (22.3% in average, up to 38.1%).","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128914362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643575
M. Kamon, N. Marques, Jacob K. White
In this paper we describe a computationally efficient approach to generating reduced-order models from PEEC-based three-dimensional electromagnetic analysis programs. It is shown that a recycled multipole-accelerated approach applied to recent model order reduction techniques requires nearly two orders of magnitude fewer floating point operations than direct techniques thus allowing the analysis of larger, more complex three-dimensional geometries.
{"title":"FastPep: a fast parasitic extraction program for complex three-dimensional geometries","authors":"M. Kamon, N. Marques, Jacob K. White","doi":"10.1109/ICCAD.1997.643575","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643575","url":null,"abstract":"In this paper we describe a computationally efficient approach to generating reduced-order models from PEEC-based three-dimensional electromagnetic analysis programs. It is shown that a recycled multipole-accelerated approach applied to recent model order reduction techniques requires nearly two orders of magnitude fewer floating point operations than direct techniques thus allowing the analysis of larger, more complex three-dimensional geometries.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130174899","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643597
H. Zhou, Martin D. F. Wong
With the remarkable growth of portable application and the increasing frequency and integration density, power is being given comparable weight to speed and area in IC designs. In technology mapping, how decomposition is done can have a significant impact on the power dissipation of the final implementation. In the literature, only heuristic algorithms are given for the low power gate decomposition problem. We prove many properties an optimal decomposition tree must have. Based on these optimality properties, we design an efficient exact algorithm to solve the low power gate decomposition problem. Moreover the exact algorithm can be easily modified to a heuristic algorithm which performs much better than the known heuristics.
{"title":"An exact gate decomposition algorithm for low-power technology mapping","authors":"H. Zhou, Martin D. F. Wong","doi":"10.1109/ICCAD.1997.643597","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643597","url":null,"abstract":"With the remarkable growth of portable application and the increasing frequency and integration density, power is being given comparable weight to speed and area in IC designs. In technology mapping, how decomposition is done can have a significant impact on the power dissipation of the final implementation. In the literature, only heuristic algorithms are given for the low power gate decomposition problem. We prove many properties an optimal decomposition tree must have. Based on these optimality properties, we design an efficient exact algorithm to solve the low power gate decomposition problem. Moreover the exact algorithm can be easily modified to a heuristic algorithm which performs much better than the known heuristics.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126556782","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643585
G. Marchioro, J. Daveau, A. Jerraya
This paper presents the underlying methodology of Cosmos, an interactive approach for hardware/software co-design capable of handling multiprocessor systems and distributed architectures. The approach covers the co-design process through a set of user guided transformations allowing semi-automatic partitioning. The transformations are based on a powerful set of primitives for functional partitioning, structural reorganization and communication transformation. It leads to a fast transformation of a system-level specification into an architecture with a short design time and fast exploration of design space. The application of this approach is illustrated using a design example starting from a system-level specification given in SDL to a distributed hardware/software architecture described in C/VHDL. We show that the use of transformational approach allows: 1) application of the expertise of the designer during partitioning; 2) the user to understand the results of the co-design process; 3) the process to take into account partial existing solutions; 4) large design space exploration; and 5) the designer to start from a very high-level specification language of the system to be designed.
{"title":"Transformational partitioning for co-design of multiprocessor systems","authors":"G. Marchioro, J. Daveau, A. Jerraya","doi":"10.1109/ICCAD.1997.643585","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643585","url":null,"abstract":"This paper presents the underlying methodology of Cosmos, an interactive approach for hardware/software co-design capable of handling multiprocessor systems and distributed architectures. The approach covers the co-design process through a set of user guided transformations allowing semi-automatic partitioning. The transformations are based on a powerful set of primitives for functional partitioning, structural reorganization and communication transformation. It leads to a fast transformation of a system-level specification into an architecture with a short design time and fast exploration of design space. The application of this approach is illustrated using a design example starting from a system-level specification given in SDL to a distributed hardware/software architecture described in C/VHDL. We show that the use of transformational approach allows: 1) application of the expertise of the designer during partitioning; 2) the user to understand the results of the co-design process; 3) the process to take into account partial existing solutions; 4) large design space exploration; and 5) the designer to start from a very high-level specification language of the system to be designed.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134250725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643570
I. Pomeranz, S. Reddy
We consider the problem of built-in test generation for synchronous sequential circuits. The proposed scheme leaves the circuit flip-flops unmodified, and thus allows at-speed test application. We introduce a uniform, parametrized structure for test pattern generation. By matching the parameters of the test pattern generator to the circuit-under-test, high fault coverage is achieved. In many cases, the fault coverage is equal to the fault coverage that can be achieved by deterministic test sequences. We also investigate a method to minimize the size of the test pattern generator, and study its effectiveness alone and in conjunction with the insertion of test-points.
{"title":"Built-in test generation for synchronous sequential circuits","authors":"I. Pomeranz, S. Reddy","doi":"10.1109/ICCAD.1997.643570","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643570","url":null,"abstract":"We consider the problem of built-in test generation for synchronous sequential circuits. The proposed scheme leaves the circuit flip-flops unmodified, and thus allows at-speed test application. We introduce a uniform, parametrized structure for test pattern generation. By matching the parameters of the test pattern generator to the circuit-under-test, high fault coverage is achieved. In many cases, the fault coverage is equal to the fault coverage that can be achieved by deterministic test sequences. We also investigate a method to minimize the size of the test pattern generator, and study its effectiveness alone and in conjunction with the insertion of test-points.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115339615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643276
Zhanping Chen, K. Roy, T. Chou
Power dissipation in CMOS circuits heavily depends on the signal properties of the primary inputs. Due to uncertainties in specification of such properties, the average power should be specified between a maximum and a minimum possible value. Due to the complex nature of the problem, it is practically impossible to use traditional power estimation techniques to determine such bounds. We present a novel approach to accurately estimate the maximum and minimum bounds for average power using a technique which calculates the sensitivities of average power dissipation to primary input signal properties. The sensitivities are calculated using a novel statistical technique and can be obtained as a by-product of average power estimation using a Monte Carlo based approach. The signal properties are specified in terms of signal probability (probability of a signal being logic ONE) and signal activity (probability of signal switching). Results show that the maximum and minimum average power dissipation can vary widely if the primary input probabilities and activities are not specified accurately.
{"title":"Power sensitivity-a new method to estimate power dissipation considering uncertain specifications of primary inputs","authors":"Zhanping Chen, K. Roy, T. Chou","doi":"10.1109/ICCAD.1997.643276","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643276","url":null,"abstract":"Power dissipation in CMOS circuits heavily depends on the signal properties of the primary inputs. Due to uncertainties in specification of such properties, the average power should be specified between a maximum and a minimum possible value. Due to the complex nature of the problem, it is practically impossible to use traditional power estimation techniques to determine such bounds. We present a novel approach to accurately estimate the maximum and minimum bounds for average power using a technique which calculates the sensitivities of average power dissipation to primary input signal properties. The sensitivities are calculated using a novel statistical technique and can be obtained as a by-product of average power estimation using a Monte Carlo based approach. The signal properties are specified in terms of signal probability (probability of a signal being logic ONE) and signal activity (probability of signal switching). Results show that the maximum and minimum average power dissipation can vary widely if the primary input probabilities and activities are not specified accurately.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115134897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643525
C. Monahan, F. Brewer
Generalizes ALAP (as late as possible) bounds for the exact scheduling problem on a pre-defined data path. Conventional bounds are inapplicable because of the possible requirement of re-computing operands for minimal schedule length. Efficient techniques are presented for constructing the new bounds which are sensitive to point-to-point delays via transitive memory units. An efficient operand mapping bound is also described. Based on these two bounds, time improvement factors of 50 have demonstrated in exact scheduling results.
{"title":"Scheduling and binding bounds for RT-level symbolic execution","authors":"C. Monahan, F. Brewer","doi":"10.1109/ICCAD.1997.643525","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643525","url":null,"abstract":"Generalizes ALAP (as late as possible) bounds for the exact scheduling problem on a pre-defined data path. Conventional bounds are inapplicable because of the possible requirement of re-computing operands for minimal schedule length. Efficient techniques are presented for constructing the new bounds which are sensitive to point-to-point delays via transitive memory units. An efficient operand mapping bound is also described. Based on these two bounds, time improvement factors of 50 have demonstrated in exact scheduling results.","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122471102","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 1997-11-13DOI: 10.1109/ICCAD.1997.643527
G. Lakshminarayana, K. Khouri, N. Jha
Presents a novel scheduling algorithm targeted towards minimizing the average execution time of control-flow intensive behavioral descriptions. Our algorithm uses a control-data flow graph (CDFG) model, which preserves the parallelism inherent in the application. It explores previously unexplored regions of the solution space through its ability to overlap the schedules of independent iterative constructs whose bodies share resources. It also incorporates well-known optimization techniques like loop unrolling in a natural fashion. This is made possible by a general loop-handling technique which we have devised. Application of the algorithm to several common benchmarks demonstrates up to 4.8-fold improvement in expected schedule length over existing scheduling algorithms, without paying a price in terms of the best- and worst-case schedule lengths required to execute the behavioral description (in fact, frequently, the best/worst-case schedule lengths are also better for our algorithm).
{"title":"Wavesched: a novel scheduling technique for control-flow intensive behavioral descriptions","authors":"G. Lakshminarayana, K. Khouri, N. Jha","doi":"10.1109/ICCAD.1997.643527","DOIUrl":"https://doi.org/10.1109/ICCAD.1997.643527","url":null,"abstract":"Presents a novel scheduling algorithm targeted towards minimizing the average execution time of control-flow intensive behavioral descriptions. Our algorithm uses a control-data flow graph (CDFG) model, which preserves the parallelism inherent in the application. It explores previously unexplored regions of the solution space through its ability to overlap the schedules of independent iterative constructs whose bodies share resources. It also incorporates well-known optimization techniques like loop unrolling in a natural fashion. This is made possible by a general loop-handling technique which we have devised. Application of the algorithm to several common benchmarks demonstrates up to 4.8-fold improvement in expected schedule length over existing scheduling algorithms, without paying a price in terms of the best- and worst-case schedule lengths required to execute the behavioral description (in fact, frequently, the best/worst-case schedule lengths are also better for our algorithm).","PeriodicalId":187521,"journal":{"name":"1997 Proceedings of IEEE International Conference on Computer Aided Design (ICCAD)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122946801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}