Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263840
Distributed control methods
Brian Tung, L. Kleinrock
Distributed systems are becoming increasingly popular, and this creates the need for more sophisticated distributed control techniques. The authors present a method for distributed control using simple finite state automata. Each of the distributed entities is 'controlled' by its associated automaton, in the sense that the entity examines the state of the automaton to determine its behavior. The result of the collective behavior of all of the entities is fed back to the automata, which change state in response to this feedback. The authors give a new method of analysis that derives the steady-state behavior of the system as a whole by decomposing it into two parts: describing and solving an embedded auxiliary Markov chain, and analyzing the behavior of the system within each state of this auxiliary chain.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263861
Parallel computing for helicopter rotor design
J. Manke, K. Neves, T. Wicks
The high-speed computing (HSC) program was established in 1986 within Boeing Computer Services (BCS) to provide a prototypical environment which could be used to study the problem of integrating new computing technologies into Boeing. This paper summarizes work within the HSC program on parallel computing and concentrates on one application area, helicopter rotor design, as an example of how these technologies are being used to prototype critical engineering processes. The paper emphasizes: (1) the prototyping process; (2) the computing technologies which were employed; (3) some of the issues which were addressed while investigating these technologies; and (4) the software metrics used to assess the impact of using this technology in the design process.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263844
Parallel and distributed systems for constructive neural network learning
J. Fletcher, Z. Obradovic
A constructive learning algorithm dynamically creates a problem-specific neural network architecture rather than learning on a pre-specified architecture. The authors propose a parallel version of their recently presented constructive neural network learning algorithm. Parallelization provides a computational speedup by a factor of O(t), where t is the number of training examples. Distributed and parallel implementations under p4 using a network of workstations and a Touchstone DELTA are examined. Experimental results indicate that algorithm parallelization may result not only in improved computational time, but also in better prediction quality.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263837
Formal method for scheduling, routing and communication protocol
L. Mullin, Scott Thibault, Daria R. Dooling, Erik A. Sandberg
The PRAM model has been shown to be an optimal design for emulating both loosely and tightly coupled multiprocessors for unit-time operations. When virtual processors are required, work is multiplexed onto the available processors. This introduces a form of latency incurred by operating system overhead. Further complications arise when bandwidth creates bottlenecking of work units. G.E. Blelloch (1989) showed how to add parallel prefix operations (scans) to an extended PRAM model which uses unit-step, rather than unit-time, operations. This paper shows how the Psi (psi) calculus can be used to group work units, i.e. to pipeline them, so that multiplexing is not required. The authors instead pipeline work units to processors and show that the number of processors need not be equal to the number of data components. Partitioning array data structures and pipelining groups of partitions to processors can minimize latency and bottlenecking on distributed message-passing multiprocessing architectures.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263853
DSMA: a fair capacity-1 protocol for gigabit ring networks
W. Dobosiewicz, P. Gburzynski
The authors present a simple MAC-level protocol for high-speed ring networks. The protocol is fair, and its maximum achievable throughput does not deteriorate as the propagation length of the ring increases. The proposed protocol operates on the so-called 'spiral ring' topology, which includes the single-segment regular ring as a special case. It accommodates synchronous traffic automatically, without any bandwidth preallocation or similar special effort, yet with arbitrarily low jitter and without packet loss.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263835
Performance analysis of distributed file systems with non-volatile caches
P. Biswas, K. Ramakrishnan, D. Towsley, C. M. Krishna
The authors study the use of non-volatile memory for caching in distributed file systems. This provides an advantage over traditional distributed file systems in that the load on the server is reduced without making the data vulnerable to failures. They show that small non-volatile write caches at the clients and the server are quite effective: they reduce the write response time and the load on the file server dramatically, thus improving the scalability of the system. They also show that a proposed threshold-based writeback policy is more effective than a periodic writeback policy. The study is based on a detailed simulation of the distributed environment, driven by a synthetic workload developed from analysis of file I/O traces from commercial production systems. The service times for the resources of the system were derived from measurements performed on a typical workstation.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263831
Design and analysis of a hierarchical scalable photonic architecture
P. Dowd, K. Bogineni, K. A. Aly, J. A. Perreault
This paper introduces a hierarchical optical structure for processor interconnection and evaluates its performance. The architecture is based on wavelength division multiplexing (WDM), which enables multiple multi-access channels to be realized on a single optical fiber. The objective of the hierarchical architecture is to achieve scalability while avoiding the requirement of multiple wavelength-tunable devices per node, as in the WDM-based hypercube interconnection scheme. Furthermore, single-hop communication is achieved: a packet remains in optical form from source to destination and does not require cross-dimensional intermediate routing. The wavelength-multiplexed hierarchical structure features wavelength channel reuse at each level, allowing scalability to very large system sizes. It employs acousto-optic tunable filters in conjunction with passive couplers to partition the traffic between different levels of the hierarchy without electronic intervention. A significant advantage of the proposed structure is its ability to dynamically vary the bandwidth provided to different levels of the hierarchy. The architecture is compared to a wavelength-flat architecture in terms of physical and performance scalability.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263838
Distributed computing systems and checkpointing
Kenneth F. Wong, M. Franklin
This paper examines the performance of synchronous checkpointing in a distributed computing environment with and without load redistribution. Performance models are developed, and optimum checkpoint intervals are determined. The analysis extends earlier work by allowing for multiple nodes, state-dependent checkpoint intervals, and a performance metric that couples failure-free performance with the speedup functions associated with the implementation of parallel algorithms. Expressions for the optimum checkpoint intervals for synchronous checkpointing with and without load redistribution are derived, and the results are then used to determine when load redistribution is advantageous.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263842
Supporting heterogeneity and distribution in the numerical propulsion system simulation project
Patrick T. Homer, R. Schlichting
The numerical propulsion system simulation (NPSS) project has been initiated by NASA to expand the use of computer simulation in the development of new aircraft engines. A major goal is to study interactions between engine components using multiple computational codes, each modeling a separate component and potentially executing on a different machine in a network. Thus, a simulation run is a heterogeneous distributed program controlled by a simulation executive. This paper describes a prototype executive composed of the AVS visualization system and the Schooner heterogeneous remote procedure call (RPC) facility. In addition, the match between Schooner's capabilities and the needs of NPSS is evaluated based on the authors' experience with a collection of test codes. This discussion not only documents the evolution of Schooner, but also highlights the practical problems that can be encountered when dealing with heterogeneity and distribution in such applications.
Pub Date: 1993-07-20 | DOI: 10.1109/HPDC.1993.263848
An ATM WAN/LAN gateway architecture
G. Minden, Joseph B. Evans, D. Petr, V. Frost
This paper describes a gigabit LAN/WAN gateway being developed for the MAGIC gigabit testbed. The gateway interfaces a gigabit LAN developed by Digital's Systems Research Center and the MAGIC SONET/ATM wide area network. The gateway provides 622 Mb/s throughput between the LAN and WAN environments, and supports either a single STS-12c or four STS-3c tributaries. Traffic measurement capability and support for multiple bandwidth management schemes are provided by this architecture.