Title: Accessing remote special files in a distributed computing environment
Authors: J. Lilienkamp, Bruce J. Walker, R. Silva
Published in: [1993] Proceedings The 2nd International Symposium on High Performance Distributed Computing
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263836
Abstract: This paper presents a general design for providing access to remote devices, pipes, and FIFOs within a distributed computing environment. Though the design is presented in terms of SVR4 and Sun ONC, it is sufficiently general to be built upon almost any underlying Unix system and distribution architecture. The design is applicable to all types of devices, and special attention is given to STREAMS-oriented devices.
Title: Star modeling on IBM RS6000 networks using PVM
Authors: L. Colombet, L. Desbat, François Ménard
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263850
Abstract: The authors present the parallelization of a Monte Carlo radiative transfer code on a workstation network using PVM. To measure parallel performance on heterogeneous networks, they propose a generalization of the classical speedup and efficiency definitions to heterogeneous parallel architectures. They apply these formulae to the study of their parallel code, and then show scientific results obtained by running the program on a network with Gflops peak performance.
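The classical definitions generalize naturally once each machine is weighted by its relative computing power. The sketch below shows one common form of that generalization and is an illustration, not necessarily the authors' exact formulae.

```python
def speedup(t_seq, t_par):
    """Classical speedup: best sequential time over parallel time."""
    return t_seq / t_par

def heterogeneous_efficiency(t_seq, t_par, rel_powers):
    """Efficiency on a heterogeneous network: divide the speedup by the
    total relative power of the machines used, instead of by the plain
    processor count assumed in the homogeneous definition."""
    return speedup(t_seq, t_par) / sum(rel_powers)

# Example: three workstations, one twice as fast as the other two.
# On a homogeneous network sum(rel_powers) reduces to the processor
# count, recovering the classical efficiency definition.
print(heterogeneous_efficiency(100.0, 60.0, [1.0, 0.5, 0.5]))
```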
Title: Toward a high performance distributed memory climate model
Authors: M. Wehner, J. Ambrosiano, J. Brown, W. Dannevik, P. Eltgroth, A. Mirin, J. Farrara, C. Ma, C. Mechoso, J. A. Spahr
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263852
Abstract: As part of a long-range plan to develop a comprehensive climate systems modeling capability, the authors have taken the atmospheric general circulation model originally developed by Arakawa and collaborators at UCLA and recast it in a portable, parallel form. The code uses an explicit time-advance procedure on a staggered three-dimensional Eulerian mesh. They have implemented a two-dimensional latitude/longitude domain-decomposition message-passing strategy. Both dynamic memory management and interprocess communication are handled with macro constructs that are preprocessed prior to compilation. The code can be moved across a variety of platforms, including massively parallel processors, workstation clusters, and vector processors, by changing only three parameters. Performance on the various platforms, as well as issues associated with coupling different models for major components of the climate system, are discussed.
Title: Management of broadband networks using 3D virtual world
Authors: L. Crutcher, A. Lazar, Steven K. Feiner, Michelle X. Zhou
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263829
Abstract: Just as broadband networks will enable user-to-user communications to extend from textual services to those employing multimedia, they will also enable a management environment that can take advantage of increased bandwidth and multimedia technology. The fundamental advances incorporated in such an environment can provide efficient solutions to the problem of information management. To establish this environment, the authors tackle the fundamental problems of observability and controllability of broadband networks. A virtual world provides a next-generation network management interface through which a user can observe and interact with the network directly in real time. The system that the authors are developing uses a 3D virtual world as the user interface for managing a large gigabit ATM network. It provides the capability for experimentation in all aspects of network transport, control, and management.
Title: Partial order transport service for multimedia applications: reliable service
Authors: P. Amer, T. J. Connolly, C. Chassot, M. Diaz
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263833
Abstract: This paper introduces a partial order connection (POC) protocol. Motivated in particular by multimedia applications, POC is an end-to-end connection that provides a partial order service, that is, a service that requires some, but not all, objects to be received in the order transmitted. This paper discusses R-PO, a reliable version of POC which requires that all transmitted objects are eventually delivered. A metric based on the number of linear extensions of a partial order, in the presence of no lost objects, is proposed to quantify different partial orders. A means for its calculation is presented when the partial order can be modeled as a combination of sequential and/or parallel compositions of Petri nets. This metric allows one to compare and evaluate the complexity of different partial order services.
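For orders built from sequential and parallel compositions, the linear-extension count can be computed compositionally using two standard identities: a sequential composition multiplies the counts, and a parallel composition additionally multiplies by the number of ways to interleave the two element sets. The sketch below applies these identities; it is not taken from the paper itself.

```python
from math import comb

# Represent a series-parallel partial order by (extensions, size).
SINGLE = (1, 1)  # a single object admits exactly one linear extension

def seq_comp(a, b):
    """Sequential composition: every element of a precedes every
    element of b, so the extension counts simply multiply."""
    (ea, na), (eb, nb) = a, b
    return (ea * eb, na + nb)

def par_comp(a, b):
    """Parallel composition: the two orders are independent, so the
    counts multiply and any of C(na+nb, na) interleavings is legal."""
    (ea, na), (eb, nb) = a, b
    return (comb(na + nb, na) * ea * eb, na + nb)
```

For example, a chain (total order) built by repeated seq_comp always has one extension, while two parallel 2-chains admit C(4, 2) = 6 interleavings — more extensions meaning a weaker ordering constraint.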
Title: Test pattern generation for sequential circuits on a network of workstations
Authors: P. Agrawal, V. Agrawal, Joan Villoldo
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263851
Abstract: A sequential circuit test generation program is parallelized to run on a network of Sparc 2 workstations connected through Ethernet. The program attempts to compute tests to detect all faults in a given list. The fault list is equally divided among the processors. The entire process consists of a series of parallel computing passes with synchronization occurring between passes. During a pass, each processor independently generates test sequences for the assigned faults through vector generation and fault simulation. A fixed per-fault CPU time limit is used within a pass; faults requiring more time are abandoned for later passes. Each processor simulates the entire fault list with its vectors and communicates the list of undetected faults to all other processors. Processors then combine these fault lists to create a list of faults that were not detected by any processor. This list is again equally divided, and the next pass begins with a larger per-fault time limit for test generation. The process stops after either the required fault coverage is achieved or the pass with the given maximum per-fault time limit is completed. Benchmark results are given to show the advantage of a distributed system for large circuits. Finally, the authors study a speedup model that considers duplicated computation and interprocessor communication.
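The pass structure described above can be sketched as a loop over increasing per-fault time budgets. Here try_detect is a hypothetical stand-in for one processor's vector-generation-plus-fault-simulation step, and the cross-simulation of every processor's vectors against the full fault list is collapsed into a single set union; this is a sketch of the control flow, not the authors' program.

```python
def run_passes(faults, n_procs, try_detect, budgets, coverage_goal=1.0):
    """Multi-pass structure: faults are split evenly, each processor
    works under a per-fault time budget, and faults that survive the
    pass roll over to the next one with a larger budget.
    try_detect(fault, budget) -> bool is a hypothetical helper."""
    remaining = list(faults)
    total = len(faults)
    for budget in budgets:
        # Divide the current fault list equally among processors.
        chunks = [remaining[r::n_procs] for r in range(n_procs)]
        survivors = set()
        for chunk in chunks:  # processors work independently in a pass
            survivors |= {f for f in chunk if not try_detect(f, budget)}
        remaining = sorted(survivors)
        if 1 - len(remaining) / total >= coverage_goal:
            break  # required fault coverage achieved
    return remaining
```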
Title: A fully distributed parallel ray tracing scheme on the Delta Touchstone machine
Authors: Tong-Yee Lee, C. Raghavendra, J. Nicholas
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263849
Abstract: The authors describe a fully distributed, parallel algorithm for the ray-tracing problem. Load balancing is achieved by first using a comb distribution to assign roughly the same number of pixels to each processor, and then dynamically redistributing excessive loads among processors to keep each processor busy. In this model, there is no need for a master node to be responsible for dynamic scheduling: when a node finishes its job, it simply requests an extra job from one of its neighbors. The authors implement their algorithm on the Intel Delta Touchstone machine with a 2-D mesh network topology and provide simulation results. With their scheme, they obtain good speedup and high efficiency without much communication overhead.
Title: Performance evaluation of a high-speed switching system based on the fibre channel standard
Authors: A. Varma, V. Sahai, R. Bryant
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263847
Abstract: The authors present a performance study of a switching system being designed for use in the high-performance switching system (HPSS) project at the Lawrence Livermore National Laboratory. The HPSS is a distributed switching system designed to operate with the protocols of the proposed ANSI fibre channel standard (FCS). The system is based on a folded version of the Clos three-stage network, and its largest configuration has 4096 ports, each operating at 1.0625 Gbit/s. A detailed simulation model is used to evaluate the throughput, setup time, and blocking at various stages in an HPSS configuration with 512 ports. The results indicate that the system can sustain a throughput that is within 70 to 80 percent of the maximum theoretical limit for the authors' choice of operational parameters.
Title: Trading disk capacity for performance
Authors: Robert Y. Hou, Y. Patt
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263834
Abstract: Improvements in disk access time have lagged behind improvements in microprocessor and main memory speeds. This disparity has made the storage subsystem a major bottleneck for many applications. Disk arrays that can service multiple disk requests simultaneously are being used to satisfy increasing throughput requirements. Higher throughput rates can be achieved by increasing the number of disks in an array. This increases the number of actuators that are available to service separate requests. It also spreads the data among more disk drives, reducing the seek time as the number of cylinders utilized on each disk drive decreases. The result is an increase in throughput that exceeds the increase in the number of disks. This suggests a tradeoff between the space utilization of disks in an array and the throughput of the array.
Title: Programming a distributed system using shared objects
Authors: A. Tanenbaum, H. Bal, M. Kaashoek
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263863
Abstract: Building the hardware for a high-performance distributed computer system is a lot easier than building its software. The authors describe a model for programming distributed systems based on abstract data types that can be replicated on all machines that need them. Read operations are done locally, without requiring network traffic. Writes can be done using a reliable broadcast algorithm if the hardware supports broadcasting; otherwise, a point-to-point protocol is used. The authors have built such a system based on the Amoeba microkernel, and implemented a language, Orca, on top of it. For Orca applications that have a high ratio of reads to writes, they measure good speedups on a system with 16 processors.
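The read/write split in this model can be sketched with a toy replicated object: reads touch only the local copy, while writes are applied at every replica, standing in for the reliable broadcast. This is an illustration of the model only, not the Orca runtime.

```python
class ReplicatedObject:
    """Toy model of a replicated shared object: one copy per machine."""

    def __init__(self, n_machines):
        self.replicas = [{} for _ in range(n_machines)]

    def read(self, machine, key):
        # Reads are served from the local replica: no network traffic.
        return self.replicas[machine].get(key)

    def write(self, key, value):
        # Writes update every replica, standing in for the reliable
        # broadcast (or point-to-point) update protocol.
        for replica in self.replicas:
            replica[key] = value
```

The design pays off exactly when reads dominate writes, matching the paper's observation that speedups are good for Orca applications with a high read-to-write ratio.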