Distributed control methods
Brian Tung, L. Kleinrock
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263840
Published in: [1993] Proceedings The 2nd International Symposium on High Performance Distributed Computing

Distributed systems are becoming increasingly popular, creating the need for more sophisticated distributed control techniques. The authors present a method for distributed control using simple finite state automata. Each distributed entity is 'controlled' by its associated automaton, in the sense that the entity examines the automaton's state to determine its behavior. The result of the collective behavior of all the entities is fed back to the automata, which change state in response to this feedback. The authors give a new method of analysis that derives the steady-state behavior of the system as a whole by decomposing it into two parts: describing and solving an imbedded auxiliary Markov chain, and analyzing the behavior of the system within each state of this auxiliary chain.
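The feedback loop this abstract describes can be sketched as follows. This is a hypothetical illustration, not the paper's actual construction: the two-action, depth-D state layout and the success criterion are assumptions chosen to make the loop concrete.

```python
# Hypothetical sketch of automaton-controlled entities: each entity consults
# its finite state automaton to choose an action, and the collective outcome
# is fed back to every automaton, which then changes state. The state layout
# and reward rule below are illustrative assumptions.

class Automaton:
    def __init__(self, depth=3):
        self.depth = depth
        # States 0..depth-1 select action 0; states depth..2*depth-1 select action 1.
        self.state = 0

    def action(self):
        return 0 if self.state < self.depth else 1

    def feedback(self, rewarded):
        # Reward pushes the state deeper into the current action's region;
        # penalty pushes it toward (and eventually across) the boundary.
        if self.action() == 0:
            step = -1 if rewarded else 1
        else:
            step = 1 if rewarded else -1
        self.state = min(2 * self.depth - 1, max(0, self.state + step))

def collective_step(automata):
    # One round of collective behavior on a shared resource: reward every
    # automaton exactly when a single entity alone chose to act
    # (a slotted-ALOHA-like success criterion, assumed for illustration).
    actions = [a.action() for a in automata]
    success = actions.count(1) == 1
    for a in automata:
        a.feedback(success)
    return success
```

Under repeated penalties an automaton walks across its state boundary and switches action, which is the sense in which the collective feedback steers each entity's behavior.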
Parallel computing for helicopter rotor design
J. Manke, K. Neves, T. Wicks
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263861

The high-speed computing (HSC) program was established in 1986 within Boeing Computer Services (BCS) to provide a prototypical environment which could be used to study the problem of integrating new computing technologies into Boeing. This paper summarizes work within the HSC program on parallel computing and concentrates on one application area, helicopter rotor design, as an example of how these technologies are being used to prototype critical engineering processes. The paper emphasizes: (1) the prototypical process; (2) the computing technologies which were employed; (3) some of the issues which were addressed while investigating these technologies; and (4) the software metrics used to assess the impact of using this technology in the design process.
Parallel and distributed systems for constructive neural network learning
J. Fletcher, Z. Obradovic
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263844

A constructive learning algorithm dynamically creates a problem-specific neural network architecture rather than learning on a pre-specified architecture. The authors propose a parallel version of their recently presented constructive neural network learning algorithm. Parallelization provides a computational speedup by a factor of O(t), where t is the number of training examples. Distributed and parallel implementations under p4 using a network of workstations and a Touchstone DELTA are examined. Experimental results indicate that algorithm parallelization may result not only in improved computational time, but also in better prediction quality.
Formal method for scheduling, routing and communication protocol
L. Mullin, Scott Thibault, Daria R. Dooling, Erik A. Sandberg
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263837

The PRAM model has been shown to be an optimal design for emulating both loosely and tightly coupled multiprocessors for unit time operations. When virtual processors are required, multiplexing work to available processors is employed. This introduces a form of latency incurred by operating system overhead. Further complications arise when bandwidth creates bottlenecking of work units. G.E. Blelloch (1989) showed how to add parallel prefix operations (scans) to an extended PRAM model which uses unit step, not time, operations. This paper shows how the Psi (ψ) calculus can be used to group work units, i.e. to pipeline the work units, so that multiplexing is not required. The authors instead pipeline work units to processors and show how the number of processors need not be equivalent to the number of data components. Partitioning array data structures and pipelining groups of partitions to processors can minimize latency and bottlenecking on distributed message-passing multiprocessing architectures.
DSMA: a fair capacity-1 protocol for gigabit ring networks
W. Dobosiewicz, P. Gburzynski
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263853

The authors present a simple MAC-level protocol for high-speed ring networks. The protocol is fair, and its maximum achievable throughput does not deteriorate with the increasing propagation length of the ring. The proposed protocol operates on the so-called 'spiral ring' topology, including the single-segment regular ring as a special case. It accommodates synchronous traffic automatically, without any bandwidth preallocation or similar special efforts, yet with an arbitrarily low jitter and without packet loss.
Performance analysis of distributed file systems with non-volatile caches
P. Biswas, K. Ramakrishnan, D. Towsley, C. M. Krishna
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263835

The authors study the use of non-volatile memory for caching in distributed file systems. This provides an advantage over traditional distributed file systems in that the load is reduced at the server without making the data vulnerable to failures. They show that small non-volatile write caches at the clients and the server are quite effective: they reduce the write response time and the load on the file server dramatically, thus improving the scalability of the system. They show that a proposed threshold-based writeback policy is more effective than a periodic writeback policy. They use a synthetic workload developed from analysis of file I/O traces from commercial production systems. The study is based on a detailed simulation of the distributed environment, with service times for the resources of the system derived from measurements performed on a typical workstation.
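The threshold-based writeback policy the abstract contrasts with periodic flushing can be sketched as below. This is a minimal illustration, not the paper's simulator; the class and its flush rule are assumptions.

```python
# Minimal sketch of a non-volatile write cache with threshold-based
# writeback: writes complete locally in NVRAM, and dirty blocks are flushed
# to the file server in one batch only when their count crosses a threshold.
# (A periodic policy would instead call flush() on a fixed timer, regardless
# of how many blocks are dirty.) All names here are illustrative assumptions.

class NVWriteCache:
    def __init__(self, threshold):
        self.threshold = threshold
        self.dirty = set()        # blocks written since the last flush
        self.server_writes = 0    # batched writebacks sent to the server

    def write(self, block):
        self.dirty.add(block)     # absorbed in NVRAM: the write is durable locally
        if len(self.dirty) >= self.threshold:
            self.flush()

    def flush(self):
        self.server_writes += 1   # one batched writeback to the file server
        self.dirty.clear()
```

Note that repeated writes to the same hot block are absorbed in the cache (the set holds one entry per block), which is one reason such caches reduce server load so sharply.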
Analytical performance evaluation of data replication based shared memory model
S. Srbljic, L. Budin
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263827

The proposed distributed shared memory model is based on a data replication scheme that provides an environment for a collection of processes that interact to solve a parallel programming problem. In the implementation of the scheme, the authors suppose that replicas of the shared data are present at each node and that an appropriate coherence protocol for maintaining consistency among the replicas is applied. The performance of the distributed computation is very sensitive to the data-access behavior of the application and to the applied coherence protocol. Communication cost is regarded as an appropriate performance measure. Therefore, the authors first introduce a model characterizing the computation behavior with five workload parameters. Second, they formally describe the coherence protocols as cooperating state machines in order to evaluate their communication costs as functions of the workload parameters.
A low-latency programming interface and a prototype switch for scalable high-performance distributed computing
Taitin Chen, Jim Feeney, G. Fox, G. Frieder, S. Ranka, Bill Wilhelm, Fang-Kuo Yu
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263845

This paper discusses the architecture and performance of a prototype switch for interconnecting IBM RISC System/6000 workstations. The paper describes the interconnection architecture and performance on a cluster of four IBM RISC System/6000 model 340 workstations. It also describes the driver-level software interface to the switch and the features incorporated to minimize communication overhead. The performance measurements cover communication latency and bandwidth. In addition, performance measurements of Express, a popular parallel-programming interface, are provided.
Distributed computing systems and checkpointing
Kenneth F. Wong, M. Franklin
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263838

This paper examines the performance of synchronous checkpointing in a distributed computing environment with and without load redistribution. Performance models are developed, and optimum checkpoint intervals are determined. The analysis extends earlier work by allowing for multiple nodes, state-dependent checkpoint intervals, and a performance metric which is coupled with failure-free performance and the speedup functions associated with implementation of parallel algorithms. Expressions for the optimum checkpoint intervals for synchronous checkpointing with and without load redistribution are derived, and the results are then used to determine when load redistribution is advantageous.
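For context on what an "optimum checkpoint interval" looks like, the classical first-order result (Young's approximation) balances checkpoint overhead against expected rework after a failure; the paper derives more detailed expressions accounting for multiple nodes and load redistribution, which are not reproduced here.

```python
import math

def young_interval(checkpoint_cost, mtbf):
    """Classical first-order optimum checkpoint interval,
    T* ~= sqrt(2 * C * MTBF), where C is the time to take one checkpoint
    and MTBF is the mean time between failures. This is Young's well-known
    approximation, given for orientation only; it is not the paper's
    multi-node, load-redistribution result.
    """
    return math.sqrt(2.0 * checkpoint_cost * mtbf)
```

For example, with a 2-second checkpoint cost and a 10,000-second MTBF, the approximation gives a 200-second interval: checkpointing more often wastes time on checkpoints, less often wastes time recomputing lost work.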
DARTS: a dynamically adaptable transport service suitable for high speed networks
A. Richards, T. Ginige, A. Seneviratne, Teresa Buczkowska, M. Fry
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263856

It has been shown that protocol processing represents a severe bottleneck for high-speed computer networks. The disadvantages of proposed solutions are their incompatibility with existing standardised protocol implementations and/or their complexity. One method of alleviating this limitation is to have an adaptable protocol stack, as proposed in this paper. Preliminary results are presented which show that significant gains in throughput can be achieved while still maintaining compatibility with existing standard protocol stacks.