Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130257
J. Saul
The use of the Reed-Muller representation to represent and manipulate switching functions in logic synthesis systems is discussed. An algorithm is presented for minimizing mixed-polarity Reed-Muller representations of multiple-output, incompletely specified switching functions. Heuristics determine the best application of previously known rules for minimizing single-output equations; rules are also used to link multiple-output functions and to minimize incompletely specified functions. The algorithm has been implemented, and benchmark comparisons with the best previously known minimization method show that it is faster and produces smaller representations.
{"title":"An improved algorithm for the minimization of mixed polarity Reed-Muller representations","authors":"J. Saul","doi":"10.1109/ICCD.1990.130257","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130257","url":null,"abstract":"The use of the Reed-Muller representation to represent and manipulate switching functions in logic synthesis systems is discussed. An algorithm for the minimization of mixed-polarity Reed-Muller representations to multiple-output incompletely specified switching functions is presented, in which heuristics are used to determine the best application of previously known rules for minimizing single-output equations; rules are used to link multiple-output functions and to minimize incompletely specified functions. This algorithm has been implemented, and benchmark comparisons with the best previous minimization method known shows that the method is faster and results in smaller representations.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"163 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127422790","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
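As a rough illustration of the representation discussed in the abstract above, the following sketch computes fixed-polarity Reed-Muller (XOR-of-products) coefficients from a truth table. It is not the paper's mixed-polarity minimization algorithm: fixed polarity is a more restricted form in which each variable appears in only one polarity, and the function names and the bit-mask encoding of the polarity are assumptions made for this example.

```python
def pprm_coefficients(truth_table):
    """Positive-polarity Reed-Muller coefficients of a truth table.

    truth_table[i] is the function value for the input whose bits are the
    binary digits of i. coefficient[m] == 1 means the product of the
    variables selected by the bits of m appears in the XOR expansion.
    """
    n = len(truth_table)
    if n == 0 or n & (n - 1):
        raise ValueError("truth table length must be a power of two")
    coeffs = list(truth_table)
    step = 1
    while step < n:
        # Butterfly pass: fold in the cofactor where this variable is 0.
        for i in range(n):
            if i & step:
                coeffs[i] ^= coeffs[i ^ step]
        step <<= 1
    return coeffs


def fixed_polarity_coefficients(truth_table, polarity):
    """Coefficients when the variables flagged in `polarity` are complemented.

    Bit k of `polarity` set means variable k is used in complemented form
    (an encoding assumed for this sketch).
    """
    n = len(truth_table)
    permuted = [truth_table[i ^ polarity] for i in range(n)]
    return pprm_coefficients(permuted)


def term_count(coeffs):
    """Number of product terms in the expansion."""
    return sum(coeffs)
```

For two-input OR (truth table `[0, 1, 1, 1]`), positive polarity needs three terms (a XOR b XOR ab), whereas complementing both variables needs two (1 XOR a'b'), which is why the choice of polarity matters for the size of the representation.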
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130190
F. Brglez, C. Gloster, G. Kedem
The authors address scan-based built-in self-test (BIST) of digital circuits that are highly resistant to testing with uniform random patterns. They introduce a procedure to precompute test patterns for random-pattern-resistant faults and to generate optimized distributions of weights that guarantee pattern coverage in a given number of random trials. The software implementation offers a tradeoff between the number of distributions (hardware memory) and the total test time. The hardware implementation is based on a canonic weighting circuit that interfaces to a circulating memory and a pseudo-random source.
{"title":"Built-in self-test with weighted random pattern hardware","authors":"F. Brglez, C. Gloster, G. Kedem","doi":"10.1109/ICCD.1990.130190","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130190","url":null,"abstract":"The authors address scan-based built-in self-test (BIST) of digital circuits that are highly resistant to testing with uniform random patterns. Introducing a procedure, the precompute test patterns for random-pattern resistant faults and generate optimized distributions of weights that guarantee pattern coverage in a given number of random trials. The software implementation offers a tradeoff in the number of distributions (hardware memory) and the length of the total test time. The hardware implementation is based on a canonic weighting circuit that interfaces to a circulating memory and a pseudo-random source.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127532839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
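The arithmetic behind weighted random patterns can be sketched as follows. This is a minimal example under the assumption that each scan bit is set to 1 independently with its own weight; the function names are hypothetical, and the optimization of weight distributions described in the abstract above is not reproduced here.

```python
import math


def pattern_probability(pattern, weights):
    """Probability that one weighted random trial produces `pattern`.

    weights[i] is the probability that bit i is 1; bits are assumed
    to be generated independently.
    """
    p = 1.0
    for bit, w in zip(pattern, weights):
        p *= w if bit else 1.0 - w
    return p


def detection_probability(p, trials):
    """Probability that the pattern appears at least once in `trials` trials."""
    return 1.0 - (1.0 - p) ** trials


def trials_for_confidence(p, confidence):
    """Smallest number of trials that reaches the requested confidence."""
    if not 0.0 < p < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p and confidence must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


# A fault that needs the all-ones pattern on four inputs.
target = [1, 1, 1, 1]
uniform = pattern_probability(target, [0.5] * 4)    # 0.5 ** 4
weighted = pattern_probability(target, [0.75] * 4)  # 0.75 ** 4
```

Biasing the weights toward the required pattern raises the per-trial probability, so `trials_for_confidence(weighted, 0.99)` is smaller than `trials_for_confidence(uniform, 0.99)`.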
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130155
H. Shih, Predrag G. Kovijanic, R. Razdan
A global feedback detection algorithm for VLSI circuits is presented. It can identify all the global feedback loops within reasonable computational time. The overall algorithm is as follows. First, all the strongly connected components (SCCs) are found using a modified version of the Tarjan algorithm that can handle circuits with flip-flops and latches. Second, the loops in each SCC are cut recursively, using heuristic criteria to reduce computation time and space, until all loops inside the SCC are cut. The modified Tarjan algorithm for finding SCCs in circuits consisting of functional primitive elements such as flip-flops and latches is described. A recursive loop-cutting algorithm for strongly connected components is presented, and a top-level partitioning scheme to reduce memory requirements and computation time for finding global feedback loops is proposed.
{"title":"A global feedback detection algorithm for VLSI circuits","authors":"H. Shih, Predrag G. Kovijanic, R. Razdan","doi":"10.1109/ICCD.1990.130155","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130155","url":null,"abstract":"A global feedback detection algorithm for VLSI circuits is presented. It can identify all the global feedback loops within reasonable computational time. The overall algorithm is as follows: First, all the strongly connected components (SCC) are found using a modified version of the Tarjan algorithm which can handle circuits with flip-flops and latches. Second, each SCC recursively cuts the loops based on heuristic criteria to reduce computation time and space until all loops inside this SCC are out. The modified Tarjan algorithm for finding SCCs in circuits consisting of functional primitive elements such as flip-flops and latches is described. A recursive loop-cutting algorithm for strongly connected components is presented, and a top-level partitioning scheme to reduce memory requirements and computation time for finding global feedback loops is proposed.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"135 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127029664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
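The first step named in the abstract above, finding strongly connected components, can be sketched with the textbook Tarjan algorithm. This is the standard version on a plain directed graph, not the authors' modified version for flip-flops and latches, and the graph encoding (a dict mapping each node to its successors) and the helper `feedback_components` are assumptions of this example.

```python
def strongly_connected_components(graph):
    """Textbook Tarjan SCC on a dict {node: iterable of successors}."""
    index_of = {}      # discovery index of each visited node
    lowlink = {}       # smallest index reachable from the node's subtree
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index_of:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index_of[succ])
        if lowlink[node] == index_of[node]:
            # node is the root of a component: pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(component)

    for node in list(graph):
        if node not in index_of:
            visit(node)
    return components


def feedback_components(graph):
    """Components that contain at least one loop (size > 1 or a self-loop)."""
    result = []
    for comp in strongly_connected_components(graph):
        if len(comp) > 1 or comp[0] in graph.get(comp[0], ()):
            result.append(sorted(comp))
    return sorted(result)


netlist = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["e"], "f": []}
loops = feedback_components(netlist)
```

Only components with more than one node, or with a self-loop, can contain feedback, so the loop-cutting step would need to examine only those. The recursive formulation is kept for readability; a very deep netlist would need an iterative variant to avoid Python's recursion limit.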
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130266
H. Niijima, N. Ohba
A quick access memory (QRAM) was developed that realizes a cost-effective high-performance memory architecture. The QRAM improves the effective data access speed by making maximum use of the page mode of memory, and hence acts like a pseudo-cache memory. For high performance and usability, it has three special features: built-in address latches/comparators, a direct handshake facility, and a multiple active island structure. It can communicate directly with the microprocessor by handshaking of the memory request and the memory ready. This reduces the amount of external logic needed for the memory system. The QRAM can be made with conventional technology.
{"title":"QRAM-Quick access memory system","authors":"H. Niijima, N. Ohba","doi":"10.1109/ICCD.1990.130266","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130266","url":null,"abstract":"A quick access memory (QRAM) was developed that realizes a cost-effective high-performance memory architecture. The QRAM improves the effective data access speed by making maximum use of the page mode of memory, and hence acts like a pseudo-cache memory. For high performance and usability, it has three special features: built-in address latches/comparators, a direct handshake facility, and a multiple active island structure. It can communicate directly with the microprocessor by handshaking of the memory request and the memory ready. This reduces the amount of external logic needed for the memory system. The QRAM can be made with conventional technology.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"251 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129916358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130167
C. Benveniste, Yarsun Hsu
The performance of the wraparound network is examined through trace-driven simulation. This network has a processor attached to each node, but the links in the network are unidirectional, and the two ends of the network are joined together. Traces of three representative parallel engineering/scientific programs are used as input to the simulator, and the performance of this network is compared to that of an omega network under the same inputs. The wraparound network and the omega network are found to perform similarly, while the size and cost of the wraparound network are smaller than those of the omega network.
{"title":"A trace-driven analysis of the 'wrap-around' network","authors":"C. Benveniste, Yarsun Hsu","doi":"10.1109/ICCD.1990.130167","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130167","url":null,"abstract":"Performance of the wraparound network through trace-driven simulation is examined. This network has a processor attached to each node, but the links in the network are unidirectional, and the two ends of the network are joined together. Traces of three representative parallel engineering/scientific programs are used as input to the simulator, and the performance of this network is compared to that of an omega network under the same inputs. The wraparound network and the omega network are found to perform similarly, while the size and cost of the wraparound network are smaller than those of the omega.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128791206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130209
S. Kimura, E. Clarke
A parallel algorithm for constructing binary decision diagrams is described. The algorithm treats binary decision graphs as minimal finite automata. The automaton for a Boolean function with AND (OR) as its main operation is obtained by forming the intersection (union) of the regular sets associated with its operands. The union and intersection operations are implemented by a product construction on the minimal automata for the regular sets. After each product construction step the automaton must be reminimized. The parallel algorithm is designed so that it is possible to find the minimal representations for several Boolean operations in parallel. The level of each operation is determined. Operations at the same level can be performed in parallel without any communication between processors. If there are relatively few operations in one level, then the product generation step is divided into several suboperations and the results are merged.
{"title":"A parallel algorithm for constructing binary decision diagrams","authors":"S. Kimura, E. Clarke","doi":"10.1109/ICCD.1990.130209","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130209","url":null,"abstract":"A parallel algorithm for constructing binary decision diagrams is described. The algorithms treats binary decision graphs as minimal finite automata. The automation for a Boolean function with AND as its main operation (OR operation) is obtained by forming the intersection (union) of the regular sets associated with its operands. The union and intersection operations are implemented by a product construction on the minimal automata for the regular sets. After each product construction step the automaton must be reminimized. The parallel algorithm is designed so that it is possible to find the minimal representations for several Boolean operations in parallel. The level of each operation is determined. Operations at the same level can be performed in parallel without any communication between processors. If there are relatively few operations in one level, then the product generation step is divided into several suboperations and the results are merged.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130516005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
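The product construction described in the abstract above has a close analogue in the standard sequential "apply" operation on reduced ordered BDDs, sketched below. This is not the authors' parallel algorithm: here a unique table keeps the diagram reduced as nodes are created, which plays the role of the reminimization step, and the class and method names are assumptions of this example.

```python
import operator


class BDD:
    """Minimal reduced ordered BDD; nodes 0 and 1 are the terminals."""

    def __init__(self):
        self.nodes = {0: None, 1: None}  # node id -> (var, low, high)
        self.unique = {}                 # (var, low, high) -> node id

    def mk(self, var, low, high):
        """Return the node for (var, low, high), sharing equal subgraphs."""
        if low == high:                  # redundant test, skip the node
            return low
        key = (var, low, high)
        if key not in self.unique:
            node_id = len(self.nodes)
            self.nodes[node_id] = key
            self.unique[key] = node_id
        return self.unique[key]

    def var(self, index):
        """BDD of the single variable with the given order index."""
        return self.mk(index, 0, 1)

    def apply(self, op, u, v, memo=None):
        """Combine two BDDs with a binary Boolean operator on 0/1 values."""
        if memo is None:
            memo = {}
        if u <= 1 and v <= 1:
            return op(u, v)
        if (u, v) in memo:
            return memo[(u, v)]
        u_var = self.nodes[u][0] if u > 1 else float("inf")
        v_var = self.nodes[v][0] if v > 1 else float("inf")
        top = min(u_var, v_var)
        # Pair up the cofactors of both operands on the top variable.
        u_low, u_high = self.nodes[u][1:] if u_var == top else (u, u)
        v_low, v_high = self.nodes[v][1:] if v_var == top else (v, v)
        result = self.mk(top,
                         self.apply(op, u_low, v_low, memo),
                         self.apply(op, u_high, v_high, memo))
        memo[(u, v)] = result
        return result


bdd = BDD()
x0, x1 = bdd.var(0), bdd.var(1)
f = bdd.apply(operator.and_, x0, x1)
g = bdd.apply(operator.and_, x1, x0)
h = bdd.apply(operator.or_, x0, f)   # x0 OR (x0 AND x1)
```

Because the representation is canonical for a fixed variable order, `f` and `g` are the same node, and `h` collapses back to `x0`.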
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130235
U. Schmidt, Sonke Mehrgardt
A single-chip MIMD wavefront array processor for video applications is presented. The processor topology is an array of individually programmable mesh-connected cells; processors may be cascaded indefinitely in one or two dimensions. The 12-bit word width, superscalar RISC cell architecture, and 125-MHz clock rate are tailored toward the requirements of digital video signal processing. The processor executes statically scheduled data flow programs, propagating data through the array in a wavefront-like manner. The processor is implemented in 0.8-μm double-metal CMOS. It has 1.2 million transistors, a chip area of 150 mm², a pin count of 124, and a maximum power dissipation of 8 W.
{"title":"Wavefront array processor for video applications","authors":"U. Schmidt, Sonke Mehrgardt","doi":"10.1109/ICCD.1990.130235","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130235","url":null,"abstract":"A single-chip MIMD wavefront array processor for video applications is presented. The processor topology is an array of individually programmable mesh-connected cells; processors may be cascaded indefinitely in one or two dimensions. 12-b word width, superscalar RISC cell architecture, and the 125-MHz clock rate are tailored toward the requirement of digital video signal processing. The processor executes statically scheduled data flow programs, propagating data through the array in a wavefront-like manner. The processor is implemented in 0.8- mu double-metal CMOS. It has 1.2 million transistors, a chip area of 150 mm/sup 2/, a pin count of 124, and a maximum power dissipation of 8 W.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130623631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130246
Chung-Han Chen, N. Tzeng
The VLSI layouts of most fault-tolerant binary tree architectures are based on the classical H-tree layout, resulting in low area utilization and an unnecessarily high manufacturing cost due to the waste of a significant portion of silicon area. An area-efficient approach to the reconfigurable binary tree architecture is presented. Area utilization and interconnection complexity of the proposed design compare favorably with other known approaches. The use of the coverage factor makes it possible to analyze the system reliability by means of the Markov model. Unlike previous reliability studies in which chips are assumed to be defect-free, this analysis considers the fact that an accepted chip may have used spares to replace manufacturing defects, and the number of spares available for tolerating operational faults may thus vary from chip to chip. The developed analytical model for reliability is readily extended to other VLSI/WSI-based multiprocessor systems.
{"title":"An area-efficient reconfigurable binary tree architecture","authors":"Chung-Han Chen, N. Tzeng","doi":"10.1109/ICCD.1990.130246","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130246","url":null,"abstract":"The VLSI layouts of most fault-tolerant binary tree architectures are based on the classical H-tree layout, resulting in low area utilization and an unnecessarily high manufacturing cost due to the waste of a significant portion of silicon area. An area-efficient approach to the reconfigurable binary tree architecture is presented. Area utilization and interconnection complexity of the proposed design compare favorably with other known approaches. The use of the coverage factor makes it possible to analyze the system reliability by means of the Markov model. Unlike previous reliability studies in which chips are assumed to be defect-free, this analysis considers the fact that an accepted chip may have used spares to replace manufacturing defects, and the number of spares available for tolerating operational faults may thus vary from chip to chip. The developed analytical model for reliability is readily extended to other VSLI/WIS-based multiprocessor systems.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132049831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130176
P. Bose, S. Bandyopadhyay, D. Majumder
The problem of integrating testability issues into the synthesis process of programmable-logic-array (PLA-)based VLSI logic design is investigated. Based on the insight gained from prior work on algorithmic and heuristic test generation for PLAs, a systematic methodology for synthesizing easily testable PLAs from high-level (Boolean) specifications is developed. Experimental results are presented to illustrate how adaptive heuristics aid in reducing the complexity of the synthesis-for-testability problem.
{"title":"Synthesis of testable PLAs using adaptive heuristics for efficiency","authors":"P. Bose, S. Bandyopadhyay, D. Majumder","doi":"10.1109/ICCD.1990.130176","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130176","url":null,"abstract":"The problem of integrating testability issues into the synthesis process of programmable-logic-array (PLA-)based VLSI logic design is investigated. Based on the insight gained from prior work on algorithmic and heuristic test generation for PLAs, a systematic methodology for synthesizing easily testable PLAs from high-level (Boolean) specifications is developed. Experimental results are presented to illustrate how adaptive heuristics aid in reducing the complexity of the synthesis-for-testability problem.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133683630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1990-09-17 DOI: 10.1109/ICCD.1990.130240
Ching-Yi Wang, K. Parhi
Novel algorithms for synthesis of control circuits in pipelined signal processing architectures are presented. The algorithms generate appropriate latching and switching of intermediate signals for functionally correct operation. Sufficient theory of pipelining is developed to ensure iteration independence of the registers used in control circuits of the dedicated architectures. The interprocessor control circuits are being incorporated into CAD systems for dedicated designs. Algorithms for automatic generation of all control circuits for a specified sequencing and scheduling of operations, for single and multiple clocks, and for single and multiple implementation styles are presented.
{"title":"Automatic generation of control circuits in pipelined DSP architectures","authors":"Ching-Yi Wang, K. Parhi","doi":"10.1109/ICCD.1990.130240","DOIUrl":"https://doi.org/10.1109/ICCD.1990.130240","url":null,"abstract":"Novel algorithms for synthesis of control circuits in pipelined signal processing architectures are presented. The algorithms generate appropriate latching and switching of intermediate signals for a functionally correct operation. Sufficient theory of pipelining is developed to ensure iteration independence of the registers used in control circuits of the dedicated architectures. The interprocessor control circuits are being incorporated into CAD systems for dedicated designs. Algorithms for automatic generation of all control circuits for a specified sequencing and scheduling of operations, for single and multiple clock, and for single and multiple implementation styles are presented.<<ETX>>","PeriodicalId":441935,"journal":{"name":"Proceedings., 1990 IEEE International Conference on Computer Design: VLSI in Computers and Processors","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1990-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133785737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}