Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262789
Chao-Chun Wang, L. Jamieson
Heuristic search is the process of searching a state space under the guidance of an evaluation function. Most research on parallelizing heuristic search algorithms has emphasized system problems such as load balancing and reduction in memory use. A theoretical analysis of a new autonomous parallel heuristic search algorithm is introduced. Rather than simply dividing the search space among the processors, the processors share information that monitors the progress of the search and use consensus to limit the amount of time spent in expanding nodes that are not on the optimal path. Each processor uses a different admissible heuristic function, and it is shown that the expected number of nodes generated by each processor in the course of the search is reduced by a factor that reflects the consensus among the processors. The asynchronous behavior of the algorithm eliminates synchronization delays.
{"title":"Autonomous parallel heuristic combinatorial search","authors":"Chao-Chun Wang, L. Jamieson","doi":"10.1109/IPPS.1993.262789","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262789","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"124561779","RegionNum":0,"Total":0}
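The abstract above relies on admissible heuristics, that is, evaluation functions that never overestimate the remaining cost to the goal. The following is a minimal sequential sketch of best-first (A*) search, not the parallel consensus algorithm of the paper; the grid problem and the names `a_star`, `grid_neighbors`, `manhattan`, and `zero` are assumptions made for illustration. It shows that two different admissible heuristics both lead to the same optimal cost, which is the property that lets each processor safely use its own heuristic.

```python
import heapq


def a_star(start, goal, neighbors, h):
    """Return the cost of a cheapest path from start to goal, or None."""
    best_g = {start: 0}
    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, state)
    while frontier:
        _, g, state = heapq.heappop(frontier)
        if state == goal:
            return g
        if g > best_g.get(state, float("inf")):
            continue  # stale queue entry, a cheaper path was found later
        for nxt, cost in neighbors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
    return None


def grid_neighbors(size):
    """Hypothetical test problem: unit-cost moves on a size x size grid."""
    def neighbors(state):
        x, y = state
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny), 1
    return neighbors


GOAL = (3, 3)


def manhattan(state):
    """Admissible on a unit-cost grid: never exceeds the true distance."""
    return abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1])


def zero(state):
    """Trivially admissible heuristic (reduces A* to uniform-cost search)."""
    return 0


cost_manhattan = a_star((0, 0), GOAL, grid_neighbors(4), manhattan)
cost_zero = a_star((0, 0), GOAL, grid_neighbors(4), zero)
```

Both calls return the same optimal cost; the heuristics differ only in how many nodes they cause to be expanded along the way.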
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262802
S. Rajasekaran, David S. L. Wei
The authors consider the problems of selection, routing and sorting on an n-star graph (with n! nodes), an interconnection network which has been proven to possess many special properties. They identify a tree-like subgraph (a '(k, 1, k) chain network') of the star graph which enables them to design efficient algorithms for these problems. They present an algorithm that performs a sequence of n prefix computations in O(n^2) time. This algorithm is used as a subroutine in other algorithms. In addition, they offer an efficient deterministic sorting algorithm that runs in (n^3 log n)/2 steps. They also show that sorting can be performed on the n-star graph in time O(n^3) and that selection of a set of uniformly distributed n keys can be performed in O(n^2) time with high probability. Finally, they present a deterministic (nonoblivious) routing algorithm that realizes any permutation in O(n^3) steps on the n-star graph.
{"title":"Selection, routing, and sorting on the star graph","authors":"S. Rajasekaran, David S. L. Wei","doi":"10.1109/IPPS.1993.262802","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262802","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"124053490","RegionNum":0,"Total":0}
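For readers unfamiliar with the network, the n-star graph can be sketched in a few lines. The abstract does not spell out the definition; the code below assumes the standard one, in which nodes are the permutations of n symbols and two nodes are adjacent when one is obtained from the other by exchanging the first symbol with the symbol in some other position. The function names are illustrative only.

```python
from itertools import permutations


def star_neighbors(perm):
    """Neighbors of a node: swap the symbol in position 0 with position i."""
    result = []
    for i in range(1, len(perm)):
        p = list(perm)
        p[0], p[i] = p[i], p[0]
        result.append(tuple(p))
    return result


def star_graph(n):
    """Adjacency lists of the n-star graph, keyed by permutation."""
    return {p: star_neighbors(p) for p in permutations(range(1, n + 1))}


graph = star_graph(4)
```

For n = 4 this yields 4! = 24 nodes, each of degree n - 1 = 3, which is why the graph is attractive as an interconnection network: the degree grows far more slowly than the node count.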
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262834
Insup Lee, S. Rajasekaran
Binary decision diagrams (BDDs) have recently been used in model checking to verify systems with a large number of states (of the order of 5 × 10^20). Representing both the state space and the state transition graph as BDDs has been demonstrated to alleviate the problem of state space explosion. But there are limitations to this heuristic approach. Even systems of reasonable complexity have many more states. Also, the BDD approach might fail even on some simple systems. The authors propose the use of parallelism to extend the applicability of BDDs in model checking. They present fast algorithms for model checking that employ BDDs. The algorithms presented are much faster than the best known previous algorithms.
{"title":"Fast parallel algorithms for model checking using BDDs","authors":"Insup Lee, S. Rajasekaran","doi":"10.1109/IPPS.1993.262834","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262834","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"260 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"122464285","RegionNum":0,"Total":0}
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262832
D. Windheiser, E. Boyd, E. Hao, S. Abraham, E. Davidson
This paper analyzes and evaluates some novel latency hiding features of the KSR1 multiprocessor: prefetch and poststore instructions and automatic updates. As a case study, the authors analyze the performance of an iterative sparse solver which generates irregular communications. They show that automatic updates significantly reduce the amount of communication. Although prefetch and poststore instructions reduce the coherence miss ratios, they do not significantly improve the sparse solver performance due to the overhead in executing these instructions.
{"title":"KSR1 multiprocessor: analysis of latency hiding techniques in a sparse solver","authors":"D. Windheiser, E. Boyd, E. Hao, S. Abraham, E. Davidson","doi":"10.1109/IPPS.1993.262832","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262832","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"130667466","RegionNum":0,"Total":0}
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262886
T. Varvarigou, V. Roychowdhury, T. Kailath, E. Lawler
The authors consider the problem of scheduling tasks on multiprocessor architectures in the presence of communication delays. Given a set of dependent tasks, the scheduling problem is to allocate the tasks to processors such that the pre-specified precedence constraints among the tasks are obeyed and certain cost measures (such as computation time) are minimized. Several cases of the scheduling problem have been proven to be NP-complete. Nevertheless, there are polynomial time algorithms for several interesting special cases of the general scheduling problem. Most of these results, however, do not take into consideration the delays due to message passing among processors. The authors study the increase in time complexity of the scheduling problem due to the introduction of communication delays. In particular, they address the open problem of scheduling out-forests (in-forests) in a multiprocessor system of m identical processors when communication delays are considered. They present the first known polynomial time algorithms for the computation of the optimal schedule when the number of available processors is given and bounded and both computation and communication delays are assumed to take one unit of time.
{"title":"Scheduling in and out forests in the presence of communication delays","authors":"T. Varvarigou, V. Roychowdhury, T. Kailath, E. Lawler","doi":"10.1109/IPPS.1993.262886","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262886","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"128982889","RegionNum":0,"Total":0}
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262882
David Nassimi
The author presents an efficient O(n) parallel algorithm for finding a minimum-cost spanning forest (MSF) of a weighted undirected planar graph with n^2 edges, on an n × n mesh-connected computer. He also obtains efficient MSF-based O(n) algorithms for several application problems in image processing. In particular, he shows that an MSF can be used to obtain more efficient and elegant O(n) algorithms for the 'k-width connectivity' problem and the 'optical clustering' problem.
{"title":"A parallel MSF algorithm for planar graphs on a mesh and applications to image processing","authors":"David Nassimi","doi":"10.1109/IPPS.1993.262882","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262882","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"130 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"122834845","RegionNum":0,"Total":0}
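As a point of reference for what a minimum-cost spanning forest is, here is a sequential sketch using Kruskal's algorithm with a union-find structure. This is a swapped-in textbook technique, not the O(n) mesh algorithm of the paper; the function name and the edge format `(weight, u, v)` are assumptions for illustration.

```python
def minimum_spanning_forest(num_vertices, edges):
    """Kruskal's algorithm. edges is an iterable of (weight, u, v) tuples.

    Returns the chosen edges; a disconnected graph yields one tree per
    connected component, i.e. a spanning forest.
    """
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    forest = []
    for weight, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two different components
            parent[root_u] = root_v
            forest.append((weight, u, v))
    return forest


# Two components: {0, 1, 2} and {3, 4}.
example_edges = [(3, 0, 2), (1, 0, 1), (4, 3, 4), (2, 1, 2)]
forest = minimum_spanning_forest(5, example_edges)
```

The edge of weight 3 is rejected because vertices 0 and 2 are already connected through cheaper edges.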
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262804
W. Guan, W. Tsai, D. Blough
The communication performance of the interconnection network is critical in a multicomputer system. Wormhole routing is known to be more efficient than traditional circuit switching and packet switching. To evaluate wormhole routing, a queueing-theoretic analysis is used. This paper presents a general analytical model for wormhole routing based on very basic assumptions. The model is used to evaluate the routing delays in hypercubes and meshes. The calculated delays are compared against those obtained from simulations, and these comparisons show that the model is reasonably accurate.
{"title":"An analytical model for wormhole routing in multicomputer interconnection networks","authors":"W. Guan, W. Tsai, D. Blough","doi":"10.1109/IPPS.1993.262804","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262804","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"126012301","RegionNum":0,"Total":0}
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262885
M. Eshaghian-Wilner, M. Shaaban
Cluster-M is a new parallel programming paradigm for designing portable software. The two main components of this paradigm are cluster-M specifications and cluster-M representations. Cluster-M specifications are high-level, machine-independent parallel code that is mapped onto cluster-M representations, which are system graphs representing the topologies of the underlying architectures. An algorithm for generating cluster-M representations is presented. Also, a set of high-level constructs essential for writing cluster-M specifications is shown. Using these components, an efficient methodology is proposed to map parallel algorithms onto architectures.
{"title":"A cluster-M based mapping methodology","authors":"M. Eshaghian-Wilner, M. Shaaban","doi":"10.1109/IPPS.1993.262885","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262885","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"20 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"113958079","RegionNum":0,"Total":0}
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262864
Yeimkuan Chang, L. Bhuyan
Parallel algorithms for hypercube allocation strategies are considered. Although the sequential algorithms of various hypercube allocation strategies are easier to implement, their worst-case time complexities increase exponentially as the dimension of the hypercube increases. The authors show that the free processors can be utilized to perform the allocation jobs in parallel to improve the efficiency of the hypercube allocation algorithms. A modified parallel algorithm for the single Gray-Code (GC) strategy is proposed and is shown to be able to recognize more subcubes than the single GC strategy by using the binary reflected Gray code and inverse binary reflected Gray code, without increasing the execution time. Two algorithms for a complete subcube recognition system are also presented and shown to be more efficient and attractive than the sequential one currently used in the hypercube multiprocessor.
{"title":"Parallel algorithms for hypercube allocation","authors":"Yeimkuan Chang, L. Bhuyan","doi":"10.1109/IPPS.1993.262864","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262864","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"196 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"122522083","RegionNum":0,"Total":0}
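The binary reflected Gray code mentioned above can be illustrated briefly. The sketch below is not the allocation algorithm of the paper; it only shows the code itself and the property that makes it useful for subcube recognition, namely that suitable runs of consecutive codewords form a subcube. The helper `is_subcube` is a hypothetical name introduced here.

```python
def gray(i):
    """i-th codeword of the binary reflected Gray code."""
    return i ^ (i >> 1)


def is_subcube(nodes, n):
    """True if the node set is exactly a subcube of an n-dimensional hypercube.

    A set is a subcube when some address bits are fixed across the set and
    every combination of the remaining (free) bits occurs.
    """
    nodes = set(nodes)
    first = next(iter(nodes))
    fixed_bits = sum(
        1
        for bit in range(n)
        if all((v >> bit) & 1 == (first >> bit) & 1 for v in nodes)
    )
    return len(nodes) == 1 << (n - fixed_bits)


codes = [gray(i) for i in range(8)]            # 3-dimensional hypercube
aligned_run = [gray(i) for i in range(4, 8)]   # run starting at a multiple of 4
shifted_run = [gray(i) for i in range(2, 6)]   # run starting at a multiple of 2
```

In this small case both the aligned run and the run shifted by half its length are 2-dimensional subcubes. With plain binary numbering, by contrast, the addresses 2 to 5 do not form a subcube.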
Pub Date : 1993-04-13 DOI: 10.1109/IPPS.1993.262903
Srinivasan Venkatraman, Alicia Kime, K. Srinivas
The authors present a simple parallel algorithm to height-balance a binary tree. The algorithm accepts any arbitrary binary tree as its input and yields an optimally shaped binary tree. For any arbitrary binary tree of n nodes the algorithm has a time complexity of O(lg n) and utilizes O(n) processors on an EREW PRAM model. The algorithm uses Euler tours and list ranking, which form the building blocks for many parallel algorithms.
{"title":"Parallel algorithms for height balancing binary trees","authors":"Srinivasan Venkatraman, Alicia Kime, K. Srinivas","doi":"10.1109/IPPS.1993.262903","DOIUrl":"https://doi.org/10.1109/IPPS.1993.262903","PeriodicalId":248927,"journal":{"name":"[1993] Proceedings Seventh International Parallel Processing Symposium","volume":"188 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-04-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"121066848","RegionNum":0,"Total":0}
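To make the goal of height balancing concrete, the following sequential sketch rebuilds an arbitrary binary tree into one of minimum height while preserving the in-order sequence of keys. It is only a stand-in for the parallel method: the in-order positions are computed here by an ordinary traversal, which is the part the abstract says is done with Euler tours and list ranking on the PRAM. The class and function names are assumptions for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def inorder(root):
    """Iterative in-order traversal, safe on deeply skewed trees."""
    out, stack, node = [], [], root
    while stack or node:
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        out.append(node.key)
        node = node.right
    return out


def build_balanced(keys, lo=0, hi=None):
    """Build a minimum-height tree over keys[lo:hi], middle key at the root."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(keys[mid],
                build_balanced(keys, lo, mid),
                build_balanced(keys, mid + 1, hi))


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


# A right-skewed chain of 7 nodes, the worst case for height.
chain = None
for key in reversed(range(7)):
    chain = Node(key, None, chain)

balanced = build_balanced(inorder(chain))
```

The chain has height 7, while the rebuilt tree has height 3, the minimum possible for 7 nodes.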