Determining maximum k-width-connectivity on meshes
Susanne E. Hambrusch, F. Dehne
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223040
Let I be an n*n binary image stored in an n*n mesh of processors with one pixel per processor. Image I is k-width-connected if, informally, between any pair of pixels of value 1 there exists a path of width k composed of 1-pixels only. The authors consider the problem of determining the largest integer k such that I is k-width-connected, and present an optimal O(n) time algorithm for the mesh architecture.
Supporting matrix operations in vector architectures
H. Bi, W. Giloi
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223043
Many elementary numerical algorithms involve not only vector operations but also matrix operations. Today's vector processors support only vector operations and execute matrix operations in terms of vector operations, because they cannot access matrix operands in one instruction. This leads to poor sustained performance on vector machines. The paper discusses how to support both vector and matrix operations in vector architectures. First, subarray patterns for vector and matrix operations are introduced. Then a set of accessing modes is presented that lets vector architectures access both vector and matrix operands. Finally, the performance improvement for matrix multiplication and the FFT is demonstrated.
A structuring technique for compute-aggregate-broadcast algorithms on distributed memory computers
A. W. Kwan, L. Bic
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223079
A technique for structuring compute-aggregate-broadcast algorithms on distributed memory computers is presented. The compute-aggregate-broadcast paradigm provides an abstraction of the problem for the programmer, allowing for separation of computation and synchronization. Such algorithms are well suited for application on distributed memory computers. The structuring technique assists the parallel programmer with synchronization, allowing the programmer to concentrate more on developing code for computation. Two examples are presented.
An optimal parallel algorithm for arithmetic expression parsing
W. Deng, S. Iyengar
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223044
The paper discusses an optimal parallel algorithm for tree-form generation of arithmetic expressions on an SIMD-SM EREW model. The main idea is to avoid the read conflict posed by Bar-On and Vishkin's algorithm (1985) by modifying their parenthesis-pairing algorithm.
Processor assignment in heterogeneous parallel architectures
D. Menascé, S. Porto, S. Tripathi
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223049
It has already been demonstrated that cost-effective multiprocessor designs can be obtained by combining processors of different speeds in the same architecture (a heterogeneous architecture), so that the serial and critical portions of an application benefit from a fast single processor. The paper presents a systematic way to build static heuristic scheduling algorithms for such environments. Several algorithms are proposed and their performance is compared through simulation. One of the proposed algorithms is shown to achieve substantial performance gains as the degree of heterogeneity of the architecture increases.
Parallel heap operations on EREW PRAM: summary of results
Weixiong Zhang, R. Korf
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223027
The authors present parallel algorithms for heap operations on an EREW PRAM. They first present a parallel heap construction algorithm with p processors running in O(n/p + log p) time; it takes 3.625n/p + 4 log p time in the worst case and is optimal when p = Θ(n/log n). They then propose a method to delete the root of a heap in parallel. To facilitate dynamic processor allocation, a data structure is built in a preparatory step using O((n/log n)^(1-1/p)) processors in O(log p) time. A sequence of root deletion operations is then supported such that each operation takes O((log n)/p + log p + log log n) time using p processors. The authors also give an optimal parallel insert algorithm running in O((log n)/p + log p) time with p processors. When p = Θ((log n)/log log n), both algorithms run in O(log log n) time. The algorithms can also be extended to a parallel algorithm for deleting an element from a heap, given the address of the element.
A fast parallel scheduler for resource requests implemented using optical devices
T. V. Lakshman, A. Bagchi, K. Rastani
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223052
The paper describes a scheme to schedule uncoordinated requests for resources that arrive in parallel. The specific application considered is the scheduling of transmission requests in ATM switches. The scheme handles both unicast and multicast transmission requests. Two implementations of the scheme using photonic devices are described. A novel aspect of the scheme is that it uses photonic devices to implement a heuristic graph-coloring algorithm needed to generate transmission schedules.
Adaptive graph computations with a connection machine
A. Aggarwal, W. T. Ma, G. Sandri, S. Sarkar
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223078
Results from parallel computing on a CM-2 Connection Machine are reported for a variety of graph-theoretic models of fitness optimization in evolutionary biology. These computations are among the most complex ever undertaken in this field and make full use of the internal hypercube architecture of the CM-2.
Analytical modeling of a parallel branch-and-bound algorithm on MIN-based multiprocessors
Myung-Kook Yang, C. Das
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.223037
The authors propose a parallel 'decomposite best-first search' branch-and-bound algorithm for MIN-based multiprocessors. They start with a new probabilistic model to estimate the number of evaluated nodes for a serial algorithm. The proposed algorithm initially decomposes a problem into several subproblems; each processor then executes a serial best-first search to find a local feasible solution, and the local solutions are broadcast through the network to compute the final solution. The speed-up analysis considers both computation and communication overheads. The parallel decomposite best-first search algorithm is seen to perform better than other reported schemes when communication overhead is taken into consideration.
The odd-even expansion storage scheme and its implementation issues
Zhiyong Liu, Jia-Huai You, Xiaobo Li
Pub Date: 1992-03-01 | DOI: 10.1109/IPPS.1992.222969
The authors present a parallel storage scheme that distributes the elements of an N*N matrix over N memory banks, where N is any (odd or even) power of two, such that any row, column, forward or backward diagonal, and square or rectangular block can be accessed simultaneously without memory conflict. They present a simple address-generation scheme that requires only logic operations and can be completed in constant time, and they give two network implementation methods for the data alignments required by this storage scheme. Unlike previously proposed routing algorithms, the hypercube routing algorithms in this paper are free from network conflict; they do not require buffering and the time length of a 'step' is shorter, so they are more efficient in terms of both hardware cost and speed. The authors also present a simple MIN implementation scheme for realizing the data alignments, and develop schemes for processing smaller matrices efficiently on larger-scale systems.