Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47162
G. Miranker, J. Rubinstein, J. Sanguinetti
The Titan was intended to be a personal visualization tool, i.e., a machine that would allow an engineer or scientist to model a physical entity and then visualize the results of the model. This was achieved by combining several technologies, namely dense CMOS gate arrays, a commercial RISC IPU (reduced-instruction-set computer instruction processing unit), and pipelinable floating-point units, with architectural features known to be effective. The opportunities and costs of these technologies and the architectural decisions that resulted in the successful development of Titan are discussed.
Title: Design of the Titan graphics supercomputer
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47167
A. Mink, G. Nacht
A hardware approach is presented for the design of performance measurement instrumentation for a shared-memory, tightly coupled MIMD multiprocessor. The resource measurement system (REMS) is a nonintrusive hardware measurement tool used to obtain both trace measurement and resource utilization information. This approach provides more detailed and extensive measurement information than alternative software or hybrid approaches, without introducing artifacts into the test results, but at a significantly higher tool cost. Certain features of today's microprocessors limit the applicability of such a hardware tool. Measurements obtained using this hardware tool on two kernel (small benchmark) routines are presented.
Title: Performance measurement of a shared-memory multiprocessor using hardware instrumentation
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47154
R. Kieckhafer
The development of the task scheduling mechanism for the multicomputer architecture for fault-tolerance (MAFT) is discussed. MAFT is a distributed computer system designed to provide high performance and extreme reliability in real-time control applications. The impact of the system's functional requirements, fault-tolerance requirements, and architecture on the development of the scheduling mechanism is examined. MAFT uses a priority-list scheduling algorithm modified to provide extreme reliability in the monitoring of tasks and the detection of scheduling errors. The paper considers such issues as modular redundancy, Byzantine agreement, and the use of multiversion software and dissimilar hardware. An example of scheduler performance with a realistic workload is presented.
Title: Fault-tolerant real-time task scheduling in the MAFT distributed system
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
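For readers unfamiliar with the priority-list scheduling that the MAFT abstract mentions, the following is a minimal sketch of the generic technique only. It does not model MAFT's fault-tolerance modifications (task monitoring, error detection, Byzantine agreement). The function name `list_schedule`, the task table, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import heapq

def list_schedule(tasks, num_procs):
    """Generic priority-list scheduling (illustrative, not MAFT's algorithm).

    tasks: name -> (priority, duration, [predecessor names]);
           a lower priority number is dispatched first (assumed convention).
    Returns name -> (start_time, finish_time).
    """
    remaining = {name: set(preds) for name, (_, _, preds) in tasks.items()}
    # Tasks with no predecessors are ready immediately.
    ready = [(tasks[n][0], n) for n, preds in remaining.items() if not preds]
    heapq.heapify(ready)
    running = []          # min-heap of (finish_time, name)
    schedule = {}
    now, free = 0, num_procs
    while ready or running:
        # Dispatch the highest-priority ready tasks onto idle processors.
        while ready and free > 0:
            _, name = heapq.heappop(ready)
            finish = now + tasks[name][1]
            schedule[name] = (now, finish)
            heapq.heappush(running, (finish, name))
            free -= 1
        # Advance time to the next task completion.
        now, done = heapq.heappop(running)
        free += 1
        # Release any successors whose predecessors have all finished.
        for name, preds in remaining.items():
            if done in preds:
                preds.remove(done)
                if not preds:
                    heapq.heappush(ready, (tasks[name][0], name))
    return schedule

# Hypothetical control-loop workload on two processors.
example_tasks = {
    "sense":   (0, 2, []),
    "filter":  (1, 3, ["sense"]),
    "log":     (2, 1, []),
    "actuate": (0, 1, ["filter"]),
}
example_schedule = list_schedule(example_tasks, num_procs=2)
```

In this example, `sense` and `log` start together at time 0, `filter` waits for `sense`, and `actuate` waits for `filter`, so precedence constraints rather than priority alone determine the finish time.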
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47173
R. Kenner, S. Dickey, P. Teller
A description is given of the ways in which the environment of a highly parallel, high-latency interconnection network is different from that encountered in a uniprocessor system. The impact of these differences on the design of the processing elements is discussed. Methods that can be used to evaluate the impact of architectural choices on the performance of any system that uses a similar network are examined. Two detailed designs of processing elements, one using a CISC (complex-instruction-set computer) processor and the other using a RISC (reduced-instruction-set computer) processor, are given as examples.
Title: The design of processing elements on a multiprocessor system with a high-bandwidth, high-latency interconnection network
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47169
M. Brorsson
A decentralized scheme for virtual memory management on MIMD (multiple-instruction-multiple-data) multiprocessors with shared memory has been developed. Control and data structures are kept local to the processing elements (PEs), which reduces the global traffic and makes a high degree of parallelism possible. Each of the PEs in the target architecture consists of a processor and part of the shared memory and is connected to the others by a common bus. The traditional approach, based on replication or sharing of data structures, is not suitable when the number of PEs is on the order of 100, because of the excessive global traffic caused by consistency or mutual exclusion protocols. A variant of Denning's working-set page replacement algorithm is used, in which each process owns a page list. Shared pages are not present in more than one list, and it is shown that this will not increase the page fault rate in most cases.
Title: A decentralized virtual memory scheme implemented on an emulated multiprocessor
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
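As background for the working-set policy named in the abstract above, here is a minimal sketch of the classic single-process working-set rule: a page stays resident only while it has been referenced within the last `tau` references. This is not the paper's decentralized variant with per-process page lists; the function name `working_set_faults`, the window parameter `tau`, and the reference string are assumptions for illustration.

```python
def working_set_faults(refs, tau):
    """Count page faults under a simple working-set policy.

    refs: sequence of page numbers in reference order.
    tau:  window size; a page is dropped once more than `tau`
          references have passed since its last use (assumed convention).
    """
    last_use = {}   # resident page -> index of its most recent reference
    faults = 0
    for t, page in enumerate(refs):
        # Evict pages that have fallen out of the working-set window.
        stale = [p for p, used in last_use.items() if t - used > tau]
        for p in stale:
            del last_use[p]
        if page not in last_use:
            faults += 1
        last_use[page] = t
    return faults

# Hypothetical reference string: page 2 is reused after a gap of four references.
example_refs = [1, 2, 1, 3, 1, 2]
faults_small_window = working_set_faults(example_refs, tau=2)
faults_large_window = working_set_faults(example_refs, tau=4)
```

With the smaller window, page 2 is evicted before it is reused and faults again; the larger window keeps it resident, which illustrates the trade-off between memory held and fault rate that any working-set variant has to manage.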
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47168
C. Chi, H. Dietz
A technique is proposed to prevent the return of infrequently used items to cache after they are bumped from it. Simulations have shown that the return of these items, called cache pollution, typically degrades cache-based system performance (average reference time) by 10% to 30%. The proposed technique involves the use of hardware called a bypass-cache, which, under program control, determines whether each reference should go through the cache or should bypass the cache and reference main memory directly. Several inexpensive heuristics that the compiler can use to decide how to make each reference are given. It is shown that much of the performance loss can be regained.
Title: Improving cache performance by selective cache bypass
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
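The effect of selective bypass described in the abstract above can be illustrated with a toy simulation. This sketch assumes a direct-mapped cache and a hand-picked set of bypassed addresses; it does not reproduce the paper's hardware design or its compiler heuristics, and the names `memory_accesses`, `num_lines`, and `bypass` are hypothetical.

```python
def memory_accesses(refs, num_lines, bypass=frozenset()):
    """Count main-memory accesses for a toy direct-mapped cache.

    refs:      sequence of integer addresses (one item per address).
    num_lines: number of cache lines; an address maps to addr % num_lines.
    bypass:    addresses that always go straight to main memory
               and never displace a cached item.
    """
    cache = [None] * num_lines
    accesses = 0
    for addr in refs:
        if addr in bypass:
            accesses += 1          # served by memory, cache left untouched
            continue
        slot = addr % num_lines
        if cache[slot] != addr:    # miss: fetch from memory and fill the line
            accesses += 1
            cache[slot] = addr
    return accesses

# A frequently used item (address 0) interleaved with single-use streaming
# items that all map to the same cache line.
stream = [4, 8, 12, 16]
example_refs = []
for s in stream:
    example_refs += [0, s]

without_bypass = memory_accesses(example_refs, num_lines=4)
with_bypass = memory_accesses(example_refs, num_lines=4, bypass=frozenset(stream))
```

Without bypass, every reference misses because the streaming items keep evicting address 0. When the single-use items bypass the cache, address 0 stays resident and only its first reference goes to memory.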
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47146
C. Shung, R. Jain, K. Rimey, Edward Wang, M. Srivastava, B. Richards, E. Lettang, S. K. Azim, L. Thon, P. Hilfinger, J. Rabaey, R. Brodersen
Lager, an integrated CAD (computer-aided design) system for algorithm-specific IC design, is described. It consists of a behavioral mapper and a silicon assembler. To generate a chip from a behavioral description, the user specifies both the behavioral description and a parameterized structural description. The behavior is mapped onto the parameterized structure to produce microcode and parameter values. The silicon assembler then translates the filled-out structural description into a physical layout. A number of algorithm-specific ICs designed with Lager have been fabricated and tested. A robot-control chip is described.
Title: An integrated CAD system for algorithm-specific IC design
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47140
P. Ang, P.A. Ruetz
An approach that has been used successfully in the design of a family of high-performance digital signal processors is described. It offers the advantage of a short design cycle without sacrificing performance. The method relies on the availability of a well-characterized standard cell library, an accurate gate-level simulator, a behavioral simulator for architectural evaluations, and module generators for generic digital signal processing operators such as multipliers and adders. The method is flexible enough to retarget the logic description into an array-based, cell-based, or even full-custom physical implementation.
Title: A methodology for quick turn-around of high performance DSP ASICS
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47164
R. B. Lee
The author discusses the Hewlett-Packard Precision architecture, which was designed as a common architecture for HP computer systems. It has a RISC (reduced-instruction-set computer)-like execution model, with features for code compaction and execution time reduction for frequent instruction sequences. In addition, it has features for making the architecture extendible, for enhancing its longevity, and for supporting different operating environments. The author describes some aspects of the Precision processor architecture, its goals, how it addresses the spectrum of general-purpose information processing needs, and some architectural design tradeoffs.
Title: HP Precision: a spectrum architecture
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track
Pub Date: 1989-01-03; DOI: 10.1109/HICSS.1989.47143
W. Birmingham, A. Kapoor, D. Siewiorek, N. Vidovic
The synthesis tools provided by the MICON system are described. They are M1, which uses a knowledge-based approach to represent and apply design knowledge, and CGEN, a knowledge acquisition tool that allows hardware designers to deposit their expertise into M1 without writing any code. The authors explore the development of an automated design environment for M1/CGEN, in which the computer design knowledge in M1 and CGEN is replaced by operational knowledge, creating a generalized system for integrating and sequencing a suite of design tools.
Title: The design of an integrated environment for the automated synthesis of small computer systems
Venue: [1989] Proceedings of the Twenty-Second Annual Hawaii International Conference on System Sciences. Volume 1: Architecture Track