Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263858
R. Butler, Alan L. Leveton, E. Lusk
Facilities such as interprocess communication and protection of shared resources were added to operating systems to support multiprogramming and have since been adapted to exploit explicit multiprocessing under two models: the shared-memory model and the distributed (message-passing) model. When multiprocessors (or networks of heterogeneous processors) are used for explicit parallelism, the difference between these models is exposed to the programmer. The p4 tool set was originally developed to buffer the programmer from synchronization issues while offering an added advantage in portability; however, both models are often still needed to develop parallel algorithms. The authors provide two implementations of Linda in an attempt to support a single high-level programming model on top of the existing paradigms, providing consistent semantics regardless of the underlying model. Linda's fundamental properties associated with generative communication eliminate the distinction between shared and distributed memory.
Title : p4-Linda: a portable implementation of Linda
Venue : Proceedings of the 2nd International Symposium on High Performance Distributed Computing, 1993
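To make the idea of generative communication concrete, here is a minimal sketch of a Linda-style tuple space. It is a toy, single-process illustration, not the p4-Linda implementation: the class name `TupleSpace`, the use of `None` as a wildcard, and the method name `in_` (Python reserves `in`) are all assumptions made for this example.

```python
import threading


class TupleSpace:
    """Toy in-process tuple space; None in a pattern matches any value."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Deposit a tuple into the space and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        """Return the first stored tuple matching the pattern, or None."""
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == v for p, v in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, *pattern):
        """Block until a matching tuple exists; return it without removing it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            return tup

    def in_(self, *pattern):
        """Block until a matching tuple exists; remove and return it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup


# Producer and consumer never name each other; they only share the space.
ts = TupleSpace()
worker = threading.Thread(target=lambda: ts.out("result", 6 * 7))
worker.start()
tag, value = ts.in_("result", None)
worker.join()
```

The consumer does not know which thread produced the tuple or where it was stored, which is the property that lets the same program text sit on top of either a shared-memory or a message-passing substrate.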
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263839
B. Joshi, S. Hosseini, K. Vairavan
In general, a load balancing algorithm improves system performance. The larger the difference between the task arrival rates at the various processors, the more imbalanced the system is and the greater the improvement a load balancing algorithm can achieve. Existing experimental studies of this improvement have selected the task arrival rates for the various processors by ad hoc procedures. Their results therefore may not give a complete picture of the improvement in system performance under the load balancing algorithms studied. The authors present a systematic scheme for selecting the task arrival rates at the various processors so that experimental results reflect a complete picture of the improvement in system performance under a load balancing algorithm. The idea is motivated by the well-known Taguchi technique used in quality control.
Title : A methodology for evaluating load balancing algorithms
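The abstract does not spell out the selection scheme, so the following is only a sketch of the general Taguchi idea it cites: use an orthogonal array to pick a small, balanced set of experiments. It assumes three processors, each with a low and a high arrival rate; the rate values and the function names are hypothetical.

```python
from itertools import combinations

# Standard L4(2^3) orthogonal array: 4 runs, 3 two-level factors.
# Here each factor is one processor and each level is an arrival rate.
L4 = [
    (0, 0, 0),
    (0, 1, 1),
    (1, 0, 1),
    (1, 1, 0),
]


def arrival_rate_plan(levels, array=L4):
    """Map each run of the array to one arrival rate per processor."""
    return [tuple(levels[i] for i in run) for run in array]


def is_orthogonal(array):
    """Check that every pair of columns holds each level pair equally often."""
    for a, b in combinations(range(len(array[0])), 2):
        pairs = [(run[a], run[b]) for run in array]
        counts = {p: pairs.count(p) for p in set(pairs)}
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True


# Hypothetical low/high arrival rates (tasks per unit time).
plan = arrival_rate_plan(levels=(0.2, 0.8))
for run in plan:
    print(run)
```

Four runs cover every low/high combination for each pair of processors, instead of the eight runs a full factorial design over three processors would need.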
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263859
Phyllis E. Crandall, M. J. Quinn
The authors present a block data decomposition algorithm for two-dimensional grid problems. Their method includes local balancing to accommodate heterogeneous processors, and they characterize the conditions that must be met for their partitioning strategy to be of value. While they concentrate on the workstation network model of parallel processing because of its high communication costs and inherent heterogeneity, their method is applicable to other parallel architectures.
Title : Block data decomposition for data-parallel programming on a heterogeneous workstation network
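As a simplified illustration of balancing a grid across heterogeneous processors, the sketch below splits the rows of a grid into contiguous blocks sized in proportion to each processor's relative speed. This is a one-dimensional simplification under assumed speed ratings, not the authors' block decomposition algorithm; `partition_rows` and the `speeds` values are hypothetical.

```python
def partition_rows(n_rows, speeds):
    """Split n_rows into contiguous [start, end) blocks proportional to speeds."""
    total = sum(speeds)
    ideal = [n_rows * s / total for s in speeds]
    counts = [int(x) for x in ideal]

    # Hand leftover rows to the processors with the largest fractional share.
    leftover = n_rows - sum(counts)
    by_remainder = sorted(
        range(len(speeds)), key=lambda i: ideal[i] - counts[i], reverse=True
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1

    blocks, start = [], 0
    for c in counts:
        blocks.append((start, start + c))
        start += c
    return blocks


# A processor rated twice as fast receives twice as many rows.
blocks = partition_rows(100, speeds=[1, 1, 2])
print(blocks)
```

Contiguous blocks keep the number of neighbouring boundaries small, which matters on a workstation network where communication is expensive.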
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263862
A. Skjellum
This paper describes efforts to extend multicomputer libraries to a hierarchical, heterogeneous network environment. Two classes of support for such libraries are discussed. The first is the set of message-passing features needed to establish groups of communicating processes and the communication contexts within which libraries can safely work. The second is a set of message-passing primitives that encapsulate heterogeneity, hiding it from the user program and the library alike, and eliminating it when it proves unnecessary (within a homogeneous invocation, for instance). The author demonstrates this research through the Multicomputer Toolbox first-generation scalable libraries and the Zipcode message-passing system, both of which are discussed. Where appropriate, he relates Zipcode syntax and semantics to the emerging MPI standard.
Title : Scalable libraries in a heterogeneous environment
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263841
I. Pramanick
This paper proposes two distributed solutions to the all-pairs shortest path problem, and reports the results of experiments conducted on a network of IBM RISC System/6000s, containing up to seven such workstations. It discusses the issues that become critical in a distributed environment as opposed to a parallel environment, and the results obtained underline the importance of reducing communication between the loosely coupled subtasks in a distributed environment. The results demonstrate that properly designed distributed algorithms, which take into account the limitations (in terms of a slower communication medium and/or the non-dedicated mode of machines) of a distributed computing environment, can yield significant performance benefits.
Title : Distributed computing solutions to the all-pairs shortest path problem
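For reference, the problem itself can be stated with the classic sequential Floyd-Warshall algorithm, sketched below. The abstract does not say which algorithms the two distributed solutions build on, so this is only a baseline; the example graph is made up.

```python
import math


def floyd_warshall(weights):
    """Return the all-pairs shortest path matrix for a dense weight matrix.

    weights[i][j] is the edge cost from i to j, math.inf if there is no edge,
    and 0 on the diagonal. Negative cycles are not handled.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    for k in range(n):
        row_k = dist[k]  # the only row every other row needs at step k
        for i in range(n):
            d_ik = dist[i][k]
            for j in range(n):
                if d_ik + row_k[j] < dist[i][j]:
                    dist[i][j] = d_ik + row_k[j]
    return dist


INF = math.inf
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = floyd_warshall(graph)
# shortest == [[0, 3, 5, 6], [5, 0, 2, 3], [3, 6, 0, 1], [2, 5, 7, 0]]
```

If the rows of `dist` were split across workstations, row `k` would have to reach every machine at step `k`. That is one common way to see why communication volume, rather than arithmetic, tends to dominate on a slow network.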
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263857
B. Mukherjee, K. Schwan
Since the mechanisms of an operating system can significantly affect the performance of parallel programs, it is important to customize operating system functionality for specific application programs. The authors first present a model for adaptive objects and the associated mechanisms. They then use this model to implement adaptive locks for multiprocessors, which adapt themselves according to user-provided adaptation policies to suit changing application locking patterns. Using a parallel branch-and-bound program, they demonstrate the performance advantage of adaptive locks over existing locks.
Title : Improving performance by use of adaptive objects: experimentation with a configurable multiprocessor thread package
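One way to picture a lock that adapts under a user-provided policy is a spin-then-block lock whose spin budget is retuned after every acquisition. The sketch below is an assumption-laden illustration, not the authors' thread package: `AdaptiveLock`, `halve_or_grow`, and the spin limits are all hypothetical.

```python
import threading


def halve_or_grow(limit, spins_used, lo=1, hi=1024):
    """Hypothetical policy: spin less after a failed spin phase, more otherwise."""
    if spins_used >= limit:
        return max(lo, limit // 2)  # spinning did not pay off; block sooner
    return min(hi, limit + 1)  # lock came free quickly; allow more spinning


class AdaptiveLock:
    """Spin-then-block lock whose spin budget is set by a pluggable policy."""

    def __init__(self, policy=halve_or_grow, spin_limit=8):
        self._lock = threading.Lock()
        self._policy = policy
        self.spin_limit = spin_limit

    def acquire(self):
        spins = 0
        while spins < self.spin_limit:
            if self._lock.acquire(blocking=False):
                break
            spins += 1
        else:
            self._lock.acquire()  # spin budget exhausted: block instead
        # We hold the lock here, so updating the budget is race-free.
        self.spin_limit = self._policy(self.spin_limit, spins)

    def release(self):
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()


state = {"counter": 0}
lock = AdaptiveLock()


def work(n):
    for _ in range(n):
        with lock:
            state["counter"] += 1


threads = [threading.Thread(target=work, args=(200,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Swapping in a different `policy` function changes how the lock reacts to contention without touching the code that uses it, which is the separation of mechanism and policy the abstract describes.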
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263843
P. Coddington
The author has implemented a set of computational physics codes on a network of IBM RS/6000 workstations used as a distributed parallel computer. He compares the performance of the codes on this network, using both standard Ethernet connections and a fast prototype switch, and also on the nCUBE/2, a MIMD parallel computer. The algorithms used range from simple, local, and regular to complex, non-local, and irregular. He describes his experiences with the hardware, software and parallel languages used, and discusses ideas for making distributed parallel computing on workstation networks more easily usable for computational physicists.
Title : An analysis of distributed computing software and hardware for applications in computational physics
Pub Date : 1993-07-20 DOI: 10.1109/HPDC.1993.263846
Jian Ma, K. Rahko
The authors propose a novel output-queuing ATM modular switch with a memoryless two-stage interconnection and a disjoint-path topology. The modular design aims to relax the limitations of VLSI implementation, to simplify interstage wiring and synchronization, and to reduce the complexity of the overall switch. A pure output queue is constructed by providing multiple paths for each output port and replicated switching module planes. A given cell loss requirement can be met by choosing a suitable path set $L_1$ and $L_2$. For instance, the cell loss probability in the switch can be kept below $10^{-6}$ for various $N$, under 90% load, if $L_1 = 9$ and $L_2 = 4$ (or $L_1 = 8$ and $L_2 = 5$) is chosen.
Title : MULTIPAR: an output queue ATM modular switch with multiple phases and replicated planes
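To give a feel for how the number of paths into an output port limits cell loss, the sketch below evaluates a simple single-stage estimate. It is not the two-stage analysis from the paper. It assumes that each of `n_inputs` ports carries a cell with probability `load` in a slot, that destinations are uniform and independent, and that an output accepts at most `paths` cells per slot and drops the rest; the function and parameter names are hypothetical.

```python
def loss_probability(n_inputs, load, paths):
    """Fraction of cells dropped at one output under the assumptions above.

    The number of cells aimed at an output in a slot is Binomial(n, load / n).
    Requires 0 < load < n_inputs so that the per-input probability is below 1.
    """
    q = load / n_inputs
    pmf = (1.0 - q) ** n_inputs  # P(0 cells arrive)
    expected_lost = 0.0
    for k in range(n_inputs):
        # Step the binomial pmf from P(k cells) to P(k + 1 cells).
        pmf *= (n_inputs - k) / (k + 1) * q / (1.0 - q)
        if k + 1 > paths:
            expected_lost += (k + 1 - paths) * pmf
    return expected_lost / load  # lost cells per offered cell


for paths in (2, 4, 6, 8):
    print(paths, loss_probability(n_inputs=64, load=0.9, paths=paths))
```

Under this simplified model the loss falls steeply as `paths` grows, which is the qualitative effect behind choosing the path set sizes to meet a loss target.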
Pub Date : 1993-01-27 DOI: 10.1109/HPDC.1993.263860
John F. Karpovich, M. Judd, W. Strayer, A. Grimshaw
The authors present an object-oriented framework for constructing parallel implementations of stencil algorithms. This framework simplifies the development process by encapsulating the common aspects of stencil algorithms in a base stencil class so that application-specific derived classes can be easily defined via inheritance and overloading. In addition, the stencil base class contains mechanisms for parallel execution. The result is a high-performance, parallel, application-specific stencil class. The authors present the design rationale for the base class and illustrate the derivation process by defining two subclasses, an image convolution class and a PDE solver. The classes have been implemented in Mentat, an object-oriented parallel programming system that is available on a variety of platforms. Performance results are given for a network of Sun SPARCstation IPCs.
Title : A parallel object-oriented framework for stencil algorithms
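The base-class-plus-derived-class structure described above can be sketched in a few lines. This is a sequential Python illustration of the design idea only; the original framework is written for Mentat and includes parallel execution, which is omitted here. The class names `Stencil`, `JacobiLaplace`, and `Convolution3x3` are hypothetical.

```python
class Stencil:
    """Base class: owns the sweep over the grid; subclasses supply the update."""

    def point(self, grid, i, j):
        raise NotImplementedError

    def apply(self, grid):
        """Return a new grid with every interior cell updated once."""
        rows, cols = len(grid), len(grid[0])
        out = [row[:] for row in grid]  # boundary cells are copied unchanged
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                out[i][j] = self.point(grid, i, j)
        return out


class JacobiLaplace(Stencil):
    """One Jacobi iteration for Laplace's equation (a simple PDE solver step)."""

    def point(self, grid, i, j):
        return 0.25 * (
            grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
        )


class Convolution3x3(Stencil):
    """3x3 image convolution; the kernel is flipped, as convolution requires."""

    def __init__(self, kernel):
        self.kernel = kernel

    def point(self, grid, i, j):
        return sum(
            self.kernel[1 - di][1 - dj] * grid[i + di][j + dj]
            for di in (-1, 0, 1)
            for dj in (-1, 0, 1)
        )


hot_top = [[4.0, 4.0, 4.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
relaxed = JacobiLaplace().apply(hot_top)

identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
same = Convolution3x3(identity).apply(image)
```

Because the sweep lives in `Stencil.apply`, a parallel version would only need to change that one method, for example by giving each worker a band of rows, while the derived classes stay as they are.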