Hierarchical interconnection cache networks
Sizheng Wei, E. Schenfeld
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262870
The hierarchical interconnection cache network (HICN) is a novel network architecture for massively parallel processing systems. The HICN's topology is a hierarchy of multiple three-stage interconnection cache networks. The first and third stages of each network use small, fast crossbar switches; large, slow-switching (reconfigurable) crossbars are used in the middle stages. HICN exploits a special kind of communication locality, called switching locality, offering greater flexibility and lower latency than classical hierarchical networks. HICN uses small switches for communication routing and large switches for setting up (reconfiguring) the network to match the expected communication pattern as closely as possible. The trade-off between routing speed and switch size is a major factor in achieving high-speed communication in massively parallel interconnection networks. The authors present efficient embeddings of several classical network topologies, such as hypercubes, complete binary trees, and grids, into HICNs. They also show that HICNs are flexibly partitionable.
Class and user based parallelism in Raven
D. Acton, G. Neufeld
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262791
This paper presents the concurrency features found in Raven, an object-oriented parallel programming system. Raven supports coarse-grained parallelism via class-based and user-based parallelism. Class-based parallelism is provided by the implementor of a class, while user-based parallelism is provided by the user, or client, of objects. Raven also supports object properties that are determined at object-creation time, eliminating the need for separate class hierarchies to support concurrency. Raven is operational on a variety of machine architectures, including a shared-memory multiprocessor. Initial experience indicates that sequential code can easily be transformed into parallel code and that a substantial speedup is possible.
The clustered-star graph: a new topology for large interconnection networks
S. Latifi, N. Bagherzadeh
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262824
The authors propose a flexible network called the clustered-star (CS) network. An (n-1)-dimensional CS of order m, denoted CS_{n-1}^m, is an n-dimensional star with (n-m) of its (n-1)-stars missing. The advantage of CS_{n-1}^m is that, from the network-size viewpoint, it is scalable by a factor of 1 < m < n, as opposed to the (n-1)-star, which is scalable only by a factor of n. Furthermore, a complete star graph with some faulty components, or with some already allocated substars, may result in a clustered-star network, which makes the study of this new network important. Basic topological properties of CS_{n-1}^m are derived, and optimal routing and broadcasting algorithms for this network are presented. It is shown that CS_{n-1}^m is Hamiltonian for m = 4, and for m = 3k, k ≠ 2.
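As a rough illustration of the scalability claim (not taken from the paper, and assuming the standard decomposition of the n-star into n copies of the (n-1)-star), a hypothetical node-count sketch:

```python
from math import factorial

def star_nodes(n):
    # The star graph S_n has n! vertices (permutations of n symbols).
    return factorial(n)

def clustered_star_nodes(n, m):
    # CS_{n-1}^m keeps m of the n substars S_{n-1} of S_n,
    # giving m * (n-1)! vertices, with 1 < m < n.
    assert 1 < m < n
    return m * factorial(n - 1)

# Growing m one substar at a time fills in sizes between (n-1)! and n!,
# instead of jumping straight from (n-1)! to n! as complete stars do.
print([clustered_star_nodes(5, m) for m in range(2, 5)])  # [48, 72, 96]
```

The intermediate sizes are what the "scalable by a factor of 1 < m < n" claim refers to; the function names here are illustrative.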
VMPP: a virtual machine for parallel processing
E. Loyot, A. Grimshaw
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262790
In the field of parallel processing, there is a great diversity of languages and architectures, which become obsolete at a rapid pace. In this environment, portability is an important issue; unfortunately, most parallel languages are not portable. This portability problem can be solved using a virtual machine approach: front-end translators translate various parallel source languages into code for a virtual machine, and back-end translators translate the virtual machine code into executable code for a variety of parallel architectures. The Virtual Machine for Parallel Processing (VMPP) is designed to provide portability for a variety of high-level parallel programming languages without significantly sacrificing performance.
Design of efficient reconfigurable networks
Arun Kumar Somani
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262918
The author presents a methodology for designing an efficient reconfigurable communication network of processors in a circuit-switching environment. He assumes that operation is synchronous and that reconfigurations occur at pre-specified times. The network is based on two architectural concepts, the generalized folding cube and the enhanced hypercube. The author demonstrates the effectiveness, versatility, and flexibility of his approach.
Symbolic synthesis of parallel processing systems
James J. Liu, M. Ercegovac
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262827
The authors derive high-level parallel processing arrays for matrix computations using symbolic transformations. They propose a graphical language, MGD (Mesh Graph Descriptor), as the basis for the transformations. The input to the synthesis system is the single-assignment form of a matrix algorithm, and the output is a structure for the synthesized parallel arrays. The synthesized arrays range from fully parallel systolic arrays to limited-size parallel arrays. The approach is concise, verifiable, and easy to use; an example of LU decomposition illustrates it.
A dynamic multiple copy approach for message passing in a virtual cut-through environment
Moncef Hamdaoui, P. Ramanathan
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262786
The paper proposes a scheme in which nodes adaptively send multiple copies of time-critical messages to increase the probability of their timely delivery. A message is replicated only when the time remaining to its deadline falls below a pre-computed threshold. An off-line algorithm for computing the number of copies and the deadline thresholds is presented. Simulation results indicate that the proposed scheme substantially reduces the expected cost due to missed deadlines.
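The replication rule can be sketched as follows; the function name, the descending-threshold representation, and the sample values are illustrative assumptions, not the paper's off-line algorithm:

```python
def copies_to_send(slack, thresholds):
    # slack: time remaining to the message's deadline.
    # thresholds: descending list of pre-computed (off-line) slack values;
    # each threshold crossed injects one extra copy of the message.
    copies = 1
    for t in thresholds:
        if slack < t:
            copies += 1
    return copies

print(copies_to_send(10.0, [8.0, 3.0]))  # 1 copy: ample slack, no replication
print(copies_to_send(2.0, [8.0, 3.0]))   # 3 copies: slack below both thresholds
```

The point of the dynamic scheme is that replication cost is paid only for messages already in danger of missing their deadlines.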
Parallel simulated annealing for the n-queen problem
R. Shonkwiler, Farzad Ghannadian, C. Alford
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262797
A parallel simulated annealing method, IIP, is applied to the n-queen problem. In this method, identical multiple copies of the single-process algorithm are run independently in parallel. The technique gives superlinear speedup, in some cases on the order of 50 using only 8 processors. Convergence to the solution exceeds 99.96% for as few as 4 processors. In addition, simulated annealing was compared with a constant-temperature version of itself, since the resulting homogeneous Markov chain is amenable to Perron-Frobenius analysis. The two algorithms perform similarly.
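A minimal sketch of the independent-identical-processes idea, using the constant-temperature variant the paper compares against; the conflict count, move rule, and seeding scheme here are illustrative assumptions, not the authors' implementation:

```python
import random
from math import exp

def conflicts(board):
    # Attacking queen pairs; board[c] = row of the queen in column c,
    # so only shared rows and shared diagonals can clash.
    n = len(board)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if board[i] == board[j] or abs(board[i] - board[j]) == j - i)

def anneal(n, temp=1.0, rng=random):
    # Constant-temperature annealing: move one queen to a random row,
    # always accept improvements, accept uphill moves with prob e^(-delta/T).
    # Returns the number of proposed moves until a solution is reached.
    board = [rng.randrange(n) for _ in range(n)]
    steps = 0
    while conflicts(board) > 0:
        steps += 1
        col, row = rng.randrange(n), rng.randrange(n)
        old = board[col]
        before = conflicts(board)
        board[col] = row
        delta = conflicts(board) - before
        if delta > 0 and rng.random() >= exp(-delta / temp):
            board[col] = old  # reject the uphill move
    return steps

def iip(n, processors, seed=0):
    # IIP: run `processors` independent copies with different seeds; the
    # wall-clock cost is that of the fastest copy, which is how the
    # superlinear speedups reported in the paper can arise.
    return min(anneal(n, rng=random.Random(seed + p))
               for p in range(processors))
```

For hitting-time distributions with a heavy tail, the minimum over independent runs can shrink far faster than 1/p, which is the intuition behind the observed superlinear speedup.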
An efficient parallel algorithm for min-cost flow on directed series-parallel networks
Amit Jain, N. Chandrasekharan
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262879
The authors consider the problem of finding the minimum cost of a feasible flow in directed series-parallel networks with real-valued lower and upper bounds on the edge flows. While strongly polynomial-time algorithms are known for this problem on arbitrary networks, it is known to be 'hard' to parallelize. The authors develop, for the first time, an NC algorithm that solves the min-cost flow problem on directed series-parallel networks, settling a problem posed by H. Booth (1990). Their algorithm takes O(log^2 m) time using O(m/log m) processors on an EREW PRAM, and it is optimal with respect to Booth's algorithm, whose running time is O(m log m). The algorithm owes its efficiency to the tree-contraction technique and the use of simple data structures, as opposed to Booth's finger search trees.
Data-parallel functional programming
Steven P. Vanderwiel, J. Davis
Pub Date: 1993-04-13 | DOI: 10.1109/IPPS.1993.262811
This paper describes an implementation scheme that maps sequences (lists) in the functional language FP onto a data-parallel SIMD multiprocessor. The mapping is dynamic (i.e., self-organizing at run time via an atom vector) and transparent to the programmer. Furthermore, as the problem size and the capability of the architecture increase, the method proportionally scales the degree of parallelism. The authors chose FP as the application language because it is simple yet expressive, and because FP allows one to create functional forms that yield highly parallel computations when applied to lists representing matrix or vector data. The target architecture is a MasPar MP-1 with 16K processors.