Multiple tree quorum algorithm for replica control in distributed database systems
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242733
S. M. Chung, Cailin Cao
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
A novel replica control algorithm, called the multiple tree quorum (MTQ) algorithm, is proposed to manage replicated data in distributed database systems. The algorithm provides high availability for read and write operations by imposing a logical structure of multiple trees on the data copies. With the MTQ algorithm, a read operation is limited to a couple of data copies, and a write operation is allowed as long as a majority of the roots of the trees and a majority of the children of each selected node are available. Compared to other algorithms, the MTQ algorithm requires a lower message cost per operation while providing higher availability.
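The write condition stated in the abstract can be made concrete with a small sketch. The Python below is one reading of that sentence, not the authors' algorithm: it assumes each tree is described by a hypothetical `children` map from a copy to its child copies, that a leaf only needs to be available itself, and that `up` is the set of copies currently reachable. The names `majority`, `collect`, and `write_quorum`, and the example data, are illustrative.

```python
from typing import Dict, List, Optional, Set


def majority(n: int) -> int:
    """Smallest count that is more than half of n."""
    return n // 2 + 1


def collect(node: str, children: Dict[str, List[str]], up: Set[str]) -> Optional[Set[str]]:
    """Copies needed at and below `node`, or None if they cannot be gathered.

    Assumption: a selected node needs a majority of its children, applied
    recursively; a leaf only needs to be available itself.
    """
    if node not in up:
        return None
    quorum = {node}
    kids = children.get(node, [])
    if not kids:
        return quorum
    needed = majority(len(kids))
    found = 0
    for kid in kids:
        sub = collect(kid, children, up)
        if sub is not None:
            quorum |= sub
            found += 1
            if found == needed:
                return quorum
    return None


def write_quorum(roots: List[str], children: Dict[str, List[str]], up: Set[str]) -> Optional[Set[str]]:
    """A set of copies satisfying the write condition, or None if none exists."""
    needed = majority(len(roots))
    quorum: Set[str] = set()
    found = 0
    for root in roots:
        sub = collect(root, children, up)
        if sub is not None:
            quorum |= sub
            found += 1
            if found == needed:
                return quorum
    return None


# Hypothetical layout: three trees, each a root with three children.
ROOTS = ["r1", "r2", "r3"]
CHILDREN = {"r1": ["a", "b", "c"], "r2": ["d", "e", "f"], "r3": ["g", "h", "i"]}
ALL_COPIES = set(ROOTS) | {c for kids in CHILDREN.values() for c in kids}
```

With the example data, a write still succeeds when `r3` and `a` are down, using six of the twelve copies; it fails when two of the three roots are down.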
A balanced layered allocation scheme for hypercube based dataflow systems
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242725
Vincent R. Freytag, Ben Lee, A. Hurson
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
The authors propose a method called the balanced layered allocation scheme (BLAS), which uses heuristic rules to balance computation and communication costs in hypercube-based dataflow systems. The central idea is to arrange the nodes of a dataflow graph into layers that have a one-to-one correspondence with the processors in a hypercube. During allocation, CP (critical path) and LDP (longest directed path) heuristics determine the sets of nodes to be assigned to processors. Each set of nodes is assigned in an iterative fashion to every possible layer, so that the effects of execution times and communication costs can be weighed against each other for every possible layer assignment. Each set is then assigned to the layer that yields the earliest completion time of the program. Simulation studies indicate that the proposed allocation scheme is effective in reducing communication overhead and thus the overall execution time of a program distributed on a hypercube dataflow computer. Overall, the BLAS showed promising improvements over the VL (vertically layered) allocation scheme.
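To illustrate the kind of quantity a critical-path heuristic works with, here is a minimal sketch of the textbook critical-path computation on a dataflow graph: the directed path with the largest total node execution time. It is not the BLAS procedure itself. The abstract does not say how CP and LDP differ or how communication costs enter, so the sketch ignores communication entirely; the graph encoding (`exec_time`, `succ`) and the example data are assumptions.

```python
from typing import Dict, List, Tuple


def critical_path(exec_time: Dict[str, float], succ: Dict[str, List[str]]) -> Tuple[float, List[str]]:
    """Longest path by total execution time in an acyclic dataflow graph.

    exec_time maps each node to its execution time; succ maps a node to the
    nodes that consume its result. Communication costs are ignored here.
    """
    memo: Dict[str, Tuple[float, List[str]]] = {}

    def longest_from(node: str) -> Tuple[float, List[str]]:
        if node in memo:
            return memo[node]
        best_len: float = 0.0
        best_path: List[str] = []
        for nxt in succ.get(node, []):
            length, path = longest_from(nxt)
            if length > best_len:
                best_len, best_path = length, path
        memo[node] = (exec_time[node] + best_len, [node] + best_path)
        return memo[node]

    # The critical path starts at whichever node gives the longest total.
    return max((longest_from(n) for n in exec_time), key=lambda item: item[0])


# Hypothetical four-node graph: a feeds b and c, which both feed d.
EXEC_TIME = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}
SUCC = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
```

For the example graph the critical path is a, b, d with a total execution time of 9.0; an allocation scheme would additionally have to account for the cost of messages between processors.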
Modeling resource contention among distributed periodic processes
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242716
L. Welch, A. Stoyen, T. Marlowe
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
A framework for analyzing the resource usage of distributed periodic processes specified in RT-Chart is provided. Functions for reasoning about the quality of an assignment of processes to computation nodes are developed, and a novel model of resource use costs and communication costs is presented. The model allows reasoning about the rate of progress each action makes when using its resources, thus modeling contention among periodic processes sharing a resource. The computed rates of progress are accurate because impossible sources of resource contention are removed. These techniques are useful for assignment algorithms (such as simulated annealing) that repeatedly reassign actions until a close-to-optimal solution is obtained.
Event scheduling in window based parallel simulation schemes
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242763
R. Ayani, H. Rajaei
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
The authors study the impact of load balancing on the performance of a window-based parallel simulation scheme. They compare four scheduling policies: static, dynamic, majority, and longest window. For simple simulation problems with few events that each need fairly short computation (i.e., small event granularity), the first three policies perform fairly close to each other. However, for large simulation problems with millions of events and large event granularity, majority scheduling is a better choice. For instance, if each event requires 18.3 ms of processing time, majority scheduling performs 38.5% better than dynamic scheduling and 43.6% better than static scheduling.
Fault-tolerant embeddings of rings, meshes, and tori in hypercubes
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242767
Alexander Wang, R. Cypher
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
The authors study the ability of the hypercube to implement algorithms with ring, mesh, and torus communication patterns when the hypercube contains faults. The primary result is a fault-free embedding of the longest possible ring into an n-cube with at most (n-h(n)) even faulty nodes and (n-h(n)) odd faulty nodes, where h(n) is a function such that h(n) = O(square root n log n). Given these bounds on the parities of the faults, the result improves upon previous results both in the number of faults tolerated and in the length of the ring that is embedded. In addition, the result leads to improved bounds for fault-free embeddings of meshes and tori into faulty hypercubes.
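For context, the fault-free case has a classical solution: the binary reflected Gray code visits all 2^n nodes of an n-cube in a ring. The sketch below shows only that baseline; it is not the fault-tolerant construction from the paper. Because a hypercube is bipartite, any ring alternates between nodes of even and odd parity, which is presumably why the fault bound is stated separately for even and odd faulty nodes. All function names are illustrative.

```python
from typing import List


def gray_code_ring(n: int) -> List[int]:
    """Node labels of the n-cube in binary reflected Gray code order."""
    return [i ^ (i >> 1) for i in range(1 << n)]


def is_ring_in_hypercube(ring: List[int]) -> bool:
    """True if consecutive labels (including last to first) differ in exactly one bit."""
    m = len(ring)
    return all(bin(ring[i] ^ ring[(i + 1) % m]).count("1") == 1 for i in range(m))


def parity(node: int) -> int:
    """0 for an even node (even number of 1 bits), 1 for an odd node."""
    return bin(node).count("1") % 2
```

Calling `gray_code_ring(4)` yields a ring through all 16 nodes of a 4-cube, and successive nodes on it alternate between even and odd parity.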
Hierarchical shuffle-exchange and de Bruijn networks
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242704
R. Cypher, J. Sanz
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
The authors present two new classes of interconnection networks for SIMD (single-instruction multiple-data) computers: the hierarchical shuffle-exchange (HSE) and hierarchical de Bruijn (HdB) networks. These networks are efficient in implementing a wide range of algorithms, including all of those in the classes ascend and descend. The networks are highly regular and scalable, and well suited to VLSI implementation. In addition, they can be adjusted to match the pin limitations imposed by the packaging technology. The authors compare the HSE and HdB networks with hypercube, 2-D mesh, 3-D mesh, shuffle-exchange, hypernet, de Bruijn, and cube-connected cycles networks. The HSE and HdB networks are shown to have advantages in terms of regularity, scalability, and performance.
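As background for the two base topologies named in the abstract, the sketch below computes neighbors in the standard (flat) shuffle-exchange and binary de Bruijn networks on 2^n nodes with n-bit labels. It does not reproduce the hierarchical HSE or HdB constructions, which the abstract does not define. Function names are illustrative, and links are treated as undirected.

```python
from typing import Set


def shuffle(x: int, n: int) -> int:
    """Cyclic left rotation of the n-bit label x."""
    mask = (1 << n) - 1
    return ((x << 1) | (x >> (n - 1))) & mask


def unshuffle(x: int, n: int) -> int:
    """Cyclic right rotation of the n-bit label x (inverse of shuffle)."""
    return (x >> 1) | ((x & 1) << (n - 1))


def exchange(x: int) -> int:
    """Flip the least significant bit."""
    return x ^ 1


def shuffle_exchange_neighbors(x: int, n: int) -> Set[int]:
    """Neighbors of x in the shuffle-exchange network, self-loops removed."""
    return {shuffle(x, n), unshuffle(x, n), exchange(x)} - {x}


def de_bruijn_neighbors(x: int, n: int) -> Set[int]:
    """Neighbors of x in the binary de Bruijn network, self-loops removed."""
    mask = (1 << n) - 1
    successors = {(x << 1) & mask, ((x << 1) | 1) & mask}
    predecessors = {x >> 1, (x >> 1) | (1 << (n - 1))}
    return (successors | predecessors) - {x}
```

Both networks have constant degree regardless of n (at most 3 and 4 respectively), in contrast to the hypercube, whose degree grows with n.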
Time sharing in hypercube multiprocessors
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242724
E. Filho, V. Barbosa
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
In hypercube multiprocessors, multiple users are normally supported by dividing the cube into subcubes of different dimensions. A user request for a subcube may be denied either because of how other subcubes were previously allocated or because the allocation algorithm fails to recognize an existing free subcube that would satisfy the request. In both cases, the main consequence is a reduction in system utilization. To support multiple users while avoiding such problems, the authors propose multiplexing all the processors in the cube among the users. In this way, one can get full system utilization and offer all the resources of the system to the users. The authors have conducted experiments with this approach on an Intel iPSC/860 system, comparing its performance with that of the conventional cube-partitioning approach. The results show that, for computationally intensive applications, the average execution time per user when multiplexing the processors is generally comparable to the same average when allocating subcubes to the users, and often significantly lower.
Hyperweave: a fault-tolerant expandable interconnection network
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242707
Gowri Ramanathan, M. Clement, Phyllis E. Crandall
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
Hyperweave, a novel hierarchical, expandable interconnection network, is presented. Its topology is an improvement over the previously proposed extended hypercube topology. Hyperweave has better node fault tolerance, a greater number of disjoint paths between any two nodes, and a larger bisection width than the extended hypercube. A message-routing algorithm is presented, and an embedding of the extended hypercube on the Hyperweave is shown.
Surface reconstruction in parallel
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242757
Sunjay E. Talele, T. Johnson, P. Livadas
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
By exploiting the structure of a directed toroidal graph, the authors have developed a parallel solution for finding the shortest path. A parallel dynamic programming solution for finding the minimum-cost path is presented. First, the authors map the toroidal graph to a planar graph, whose structure is exploited to form a parallel algorithm suitable for a message-passing parallel architecture. The problem has applications in surface reconstruction, where contours of a surface are represented as graphs. Finding the shortest path in these graphs corresponds to finding a best-fit surface over the contours. By parallelizing the solution, the authors have obtained a significant speedup for a computationally intensive problem. Since generic message passing is used for interprocessor communication, the proposed algorithm can be implemented in any distributed or parallel environment. In a heterogeneous environment, relative processor speed and memory would have to be considered for load balancing.
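The dynamic programming idea can be illustrated with a sequential sketch on a simplified graph. The code below assumes a directed planar grid in which every edge points either right or down, and computes the minimum cost of a path from the top-left to the bottom-right node. The grid shape, the edge-cost encoding (`right`, `down`), and the function name are assumptions made for illustration; the abstract does not describe the authors' planar mapping or their parallel decomposition, and neither is reproduced here.

```python
from typing import List


def min_cost_path(right: List[List[float]], down: List[List[float]]) -> float:
    """Minimum path cost from node (0, 0) to the last node of a directed grid.

    right[i][j] is the cost of the edge (i, j) -> (i, j + 1);
    down[i][j] is the cost of the edge (i, j) -> (i + 1, j).
    For a grid with R rows and C columns, right is R x (C - 1)
    and down is (R - 1) x C.
    """
    rows = len(right)
    cols = len(right[0]) + 1
    cost = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if i == 0 and j == 0:
                continue
            best = float("inf")
            if j > 0:  # arrive from the left neighbor
                best = min(best, cost[i][j - 1] + right[i][j - 1])
            if i > 0:  # arrive from the upper neighbor
                best = min(best, cost[i - 1][j] + down[i - 1][j])
            cost[i][j] = best
    return cost[-1][-1]


# Hypothetical 2 x 3 grid.
RIGHT = [[1.0, 5.0], [2.0, 1.0]]
DOWN = [[4.0, 1.0, 1.0]]
```

Each cell depends only on its left and upper neighbors, so in a recurrence of this shape the cells on one anti-diagonal can be computed independently of each other; whether the paper's algorithm partitions the work this way is not stated in the abstract.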
Tolerating faults in a mesh with a row of spare nodes
Pub Date: 1992-12-01, DOI: 10.1109/SPDP.1992.242768
Jehoshua Bruck, R. Cypher, C. T. Ho
Published in: Proceedings of the Fourth IEEE Symposium on Parallel and Distributed Processing (1992)
The authors present an efficient method for tolerating faults in a two-dimensional mesh architecture. The approach is based on adding spare components (nodes) and extra links (edges) such that the resulting architecture can be reconfigured as a mesh in the presence of faults. The cost of the fault-tolerant mesh architecture is optimized by adding about one row of redundant nodes in addition to a set of k spare nodes (while tolerating up to k node faults) and minimizing the number of links per node. The results are surprisingly efficient and seem to be practical for small values of k. The degree of the fault-tolerant architecture is k+5 for odd k and k+6 for even k. The results can be generalized to d-dimensional meshes such that the number of spare nodes is less than the length of the shortest axis plus k, and the degree of the fault-tolerant mesh is (d-1)k+d+3 when k is odd and (d-1)k+2d+2 when k is even.
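The degree formulas in the abstract are simple to tabulate. The helper below evaluates them directly and confirms that the two-dimensional values (k+5 and k+6) are the d=2 case of the general expressions. The function name is illustrative, and the input checks (k at least 1, d at least 2) are assumptions about the intended range, not conditions stated in the abstract.

```python
def ft_mesh_degree(k: int, d: int = 2) -> int:
    """Node degree of the fault-tolerant d-dimensional mesh tolerating k faults.

    Uses the formulas quoted in the abstract:
    (d-1)k + d + 3 for odd k, (d-1)k + 2d + 2 for even k.
    """
    if k < 1 or d < 2:
        # Assumed valid range; the abstract does not discuss k = 0 or d = 1.
        raise ValueError("expected k >= 1 and d >= 2")
    if k % 2 == 1:
        return (d - 1) * k + d + 3
    return (d - 1) * k + 2 * d + 2
```

For example, tolerating k=1 fault in a 2-D mesh requires degree 6 and k=2 requires degree 8, while a 3-D mesh requires degree 12 for k=2 and also 12 for k=3.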