Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472220
A. Datta, T. Gonzalez, V. Thiagarajan
This paper presents self-stabilizing algorithms for finding the diameter, centroid(s), and median(s) of a tree. The algorithms compute these tree metrics in a finite number of steps. The tree-structured distributed system is maintained by a separate self-stabilizing spanning-tree protocol running over the underlying graph. This makes the system resilient to transient failures, from which it is guaranteed to recover after a finite number of moves.
Title : Self-stabilizing algorithms for tree metrics. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
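To make the three tree metrics concrete, the following is a minimal sequential sketch, not the self-stabilizing distributed algorithms of the paper, whose details the abstract does not give. It assumes the common definitions: the diameter is the longest shortest-path length, a centroid is a vertex whose removal minimizes the size of the largest remaining component, and a median is a vertex minimizing the sum of distances to all other vertices. The function names and the adjacency-dict input format are illustrative choices.

```python
from collections import deque


def bfs_dist(adj, src, removed=None):
    """Hop distances from src, optionally pretending `removed` is deleted."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != removed and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def tree_metrics(adj):
    """Return (diameter, centroids, medians) of a tree given as {node: [neighbours]}."""
    nodes = list(adj)

    # Diameter by double BFS: the farthest node from any start is one end
    # of a longest path in a tree.
    first = bfs_dist(adj, nodes[0])
    end = max(first, key=first.get)
    diameter = max(bfs_dist(adj, end).values())

    # Medians: minimise the sum of distances to every other node.
    total = {u: sum(bfs_dist(adj, u).values()) for u in nodes}
    medians = sorted(u for u in nodes if total[u] == min(total.values()))

    # Centroids: minimise the largest component left after deleting the node.
    weight = {
        u: max((len(bfs_dist(adj, v, removed=u)) for v in adj[u]), default=0)
        for u in nodes
    }
    centroids = sorted(u for u in nodes if weight[u] == min(weight.values()))

    return diameter, centroids, medians


# A path on four nodes: 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(tree_metrics(path))
```

The brute-force loops take quadratic time, which is acceptable for illustration; a distributed or linear-time version would aggregate subtree sizes and heights instead.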
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472200
S. Das, P. Dhar
Like protocol conversion, protocol complementation is an approach to network interconnection. This paper describes internetworking between TP4 and TCP at the transport level through protocol complementation, which addresses the need for interoperability between ISO-OSI and the Internet. From the given CFSM (communicating finite state machine) specifications of protocol $P[P_s, P_r]$ of TP4 and $Q[Q_s, Q_r]$ of TCP, we have constructed a composite protocol CFSM $R_{PQ}$ for the converter, which may be inserted as a virtual layer to provide a uniform view to the users. An attempt has been made to implement the converter through the Estelle-C (Extended State Transition Language) compiler, a formal description tool for protocol specification and verification.
Title : Internetworking between TP4 and TCP through protocol complementation. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472280
P. Bonnin, C. Maurette, B. Hoelzener-Douarin, E. Pissaloux
The paper focuses on the problem of multi-spectral image segmentation, which leads, through the data fusion of several mono-spectral images, to reliable and robust vision systems for military or industrial purposes. The proposed approach does not fit the classical taxonomy of image data fusion methods: data fusion is performed during the parallel segmentation of the different images. The presented algorithm has been implemented on the Connection Machine CM5 with the data programming style.
Title : A parallel implementation on CM5 of a multi-spectral cooperative segmentation. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472290
H. Suresh
The development of efficient algorithms for parallel computer architectures is an ongoing research area, and in the recent past a great volume of theoretical work has been carried out in search of suitable algorithms for concurrent processing environments. This paper reports the results obtained in implementing an optimal parallel algorithm developed by Deng and Iyengar (1992) in the esoteric area of arithmetic expression parsing. The C code developed and tested on an IBM-compatible personal computer in this investigative study is a simple recursive descent parser and may be used for parallel parsing of arithmetic expressions. The algorithm was developed to suit the SIMD parallel architecture, to avoid any communication bottlenecks posed by a PVM system; however, the design and structure of the code readily permit porting to a parallel computer system.
Title : Implementation of an optimal parallel algorithm for arithmetic expression parsing. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
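For readers unfamiliar with the technique, here is a minimal sketch of a sequential recursive descent parser for arithmetic expressions. It is not the Deng and Iyengar parallel algorithm and not the C code from the paper; the grammar (`expr`, `term`, `factor`), the tuple-based syntax tree, and all function names are assumptions made for illustration.

```python
import re

# One token is a number or a single operator/parenthesis character.
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    """Grammar: expr := term (('+'|'-') term)*
                term := factor (('*'|'/') factor)*
                factor := NUMBER | '(' expr ')' | '-' factor"""

    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.i += 1
        return tok

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.take(), node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.take(), node, self.factor())
        return node

    def factor(self):
        tok = self.take()
        if tok == "(":
            node = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok == "-":
            return ("neg", self.factor())
        if tok is None or tok in "+*/)":
            raise SyntaxError(f"unexpected token {tok!r}")
        return float(tok)


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing token {parser.peek()!r}")
    return tree


def evaluate(node):
    if isinstance(node, float):
        return node
    if node[0] == "neg":
        return -evaluate(node[1])
    op, left, right = node
    a, b = evaluate(left), evaluate(right)
    return {"+": a + b, "-": a - b, "*": a * b}[op] if op != "/" else a / b


print(parse("1 + 2 * 3"))
print(evaluate(parse("(1 + 2) * 3")))
```

Each grammar rule maps to one method, so operator precedence falls out of the call structure: `term` binds tighter than `expr` because `expr` calls it.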
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472183
D. Wo, K. Forward
This paper presents a new data dependence checking technique called the variable tracking technique (VTT). It is a single-pass data dependence checking method which locates dependent statements in a serial computer program. VTT produces a schedule which lists the operations in the source code in groups; the operations in a given group can be executed concurrently. The user is not required to provide a profile of the program to the compiler, so VTT is suitable for applications which automate the process of exploiting parallelism. Here we describe the use of this technique in gacc, a parallelising compiler which compiles C functions to field programmable gate array (FPGA) circuits. The results presented in this paper show that VTT has been instrumental in gaining improved performance from a parallelising compiler which automates the process of executing the computationally intensive portion of the program in hardware.
Title : Variable tracking technique: a single-pass method to determine data dependence. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
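The idea of a single pass that groups statements into concurrently executable sets can be sketched as follows. This is not the authors' VTT, whose internals the abstract does not describe; it is a generic level-assignment scheme that tracks, per variable, the group of its last write and last read. The statement format `(name, writes, reads)` and the function name `schedule` are hypothetical.

```python
def schedule(statements):
    """Group statements so that each group only depends on earlier groups.

    `statements` is a list of (name, writes, reads) in program order, where
    `writes` and `reads` are collections of variable names.
    """
    last_write = {}  # variable -> group index of its most recent write
    last_read = {}   # variable -> highest group index that reads it
    groups = []

    for name, writes, reads in statements:
        level = 0
        for var in reads:
            # Read-after-write: must follow the statement producing the value.
            if var in last_write:
                level = max(level, last_write[var] + 1)
        for var in writes:
            # Write-after-write and write-after-read: must not overtake
            # earlier writers or readers of the same variable.
            if var in last_write:
                level = max(level, last_write[var] + 1)
            if var in last_read:
                level = max(level, last_read[var] + 1)

        for var in reads:
            last_read[var] = max(last_read.get(var, -1), level)
        for var in writes:
            last_write[var] = level

        while len(groups) <= level:
            groups.append([])
        groups[level].append(name)

    return groups


program = [
    ("s0", ["a"], ["x", "y"]),  # a = x + y
    ("s1", ["b"], ["x"]),       # b = x * 2
    ("s2", ["c"], ["a", "b"]),  # c = a + b
    ("s3", ["a"], ["z"]),       # a = z - 1
    ("s4", ["d"], ["c", "a"]),  # d = c + a
]
print(schedule(program))
```

Here `s0` and `s1` share a group because neither touches a variable the other writes, while `s3` must wait for `s2` to finish reading the old value of `a`. The sketch is conservative: it serializes write-after-read pairs even where hardware could overlap them.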
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472203
G. Turner, H. Schroder
We propose an algorithm to solve the Token Distribution problem, a static variant of the load balancing problem, on d-dimensional reconfigurable meshes with toroidal connections and side length n. No other algorithms have been proposed under this model of computation. We show that for token size T, the discrepancy $\Delta$ between the maximum and minimum number of tokens per PE can be reduced to 1 in at most $2n\Delta(T+4d)$ steps.
Title : Token distribution on reconfigurable d-dimensional meshes. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
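As a small worked illustration of the quantities in this abstract, the sketch below computes the discrepancy of a token assignment, the balanced target state, and the stated step bound. It does not implement the authors' algorithm. It assumes the bound is the product 2 · n · Δ · (T + 4d), which is how the garbled source formula is read here; the function names and the example loads are hypothetical.

```python
def discrepancy(loads):
    """Difference between the largest and smallest token count over all PEs.

    `loads` is a flat list of per-PE token counts.
    """
    return max(loads) - min(loads)


def balanced_loads(loads):
    """A target assignment with the same total and discrepancy at most 1."""
    base, extra = divmod(sum(loads), len(loads))
    # `extra` PEs hold one more token than the rest.
    return [base + 1] * extra + [base] * (len(loads) - extra)


def step_bound(n, delta, token_size, d):
    """Upper bound on steps, ASSUMING it reads as 2 * n * delta * (T + 4d)."""
    return 2 * n * delta * (token_size + 4 * d)


# A 2-dimensional mesh with side length 2 (four PEs), flattened row by row.
loads = [3, 7, 5, 4]
delta = discrepancy(loads)
print(delta, balanced_loads(loads), step_bound(n=2, delta=delta, token_size=1, d=2))
```

The bound grows linearly in the side length and in the initial discrepancy, so a badly skewed start on a large mesh dominates the cost.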
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472274
B. Zhou, Mohammed Atiquzzaman
Many of the existing analytical models for output-buffered switching elements (SEs) assume uniform traffic and infinite buffers at each output port of an SE. Moreover, because of simplifying assumptions, their results are not accurate. It is important to develop an accurate analytical model to tailor the design of the network parameters and to optimize network performance by proper dimensioning of the buffers. The objective of this paper is to develop an accurate model for the performance of multistage interconnection networks (MINs) using finite output-buffered SEs and operating in the presence of nonuniform traffic patterns. It is shown that the proposed analytical model is much more accurate than existing models.
Title : Accurate analysis of multistage interconnection networks using finite output-buffered switching elements. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472251
L. Lundberg, H. Lennerstad
We consider an ideal multiprocessor system with q processors and a centralized scheduler without overhead that selects processes from one common pool, permitting dynamic relocation of processes. A parallel program P consisting of n processes is executed on this system and terminates when all processes are completed. Due to synchronizations, processes may be blocked while waiting for events in other processes. The parallel program is executed using some schedule of processes to processors, resulting in a speedup $\sigma$. We then consider an ideal multiprocessor with k clusters containing u processors each. In this system processes may not be relocated between clusters. Finding a schedule which results in maximum speedup is NP-hard. Here, we present a formula for the optimal lower bound on the maximum speedup for program P, as a function of q, n, $\sigma$, k and u. We also present a formula for the optimal lower bound when the number of processes (n) is unknown. Using these results we are able to decide whether a certain schedule is close to optimal or whether it is worthwhile to look for other schedules. This is demonstrated by evaluating the speedup of a specific schedule of a particular program.
Title : An optimal lower bound on the maximum speedup in multiprocessors with clusters. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472176
D. Gee, Hong Shen
A fundamental and important research area in parallel computing is the design of high-performance interconnection networks for connecting the processors in parallel computers. This paper presents a new interconnection network, the X-cube, a variant of the Cube-Connected-Cycles (CCC) network. Compared with a CCC of the same size, it has the same degree and the same worst-case diameter, but fewer routing steps in the average case. Associated with this network are a construction algorithm, which shows how to build the network, and a routing algorithm, which describes how messages are passed through it. The proposed network is validated and its performance is evaluated experimentally through implementation of these algorithms. A number of comparisons are made between this network and three existing networks: the mesh, the Hypercube, and the CCC.
Title : X-cube: a variation of cube-connected-cycles network with lower average routing steps. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.
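The baseline that the X-cube is compared against can be built and measured in a few lines. The sketch below constructs the standard Cube-Connected-Cycles network, not the X-cube itself, whose construction the abstract does not give, and measures its degree, diameter, and average routing distance by breadth-first search. The node labelling `(x, p)` and the function names are illustrative assumptions.

```python
from collections import deque


def build_ccc(k):
    """Cube-Connected-Cycles of dimension k as {node: set of neighbours}.

    Node (x, p) is position p on the cycle that replaces hypercube corner x.
    """
    adj = {}
    for x in range(2 ** k):
        for p in range(k):
            adj[(x, p)] = {
                (x, (p + 1) % k),    # next node on the same cycle
                (x, (p - 1) % k),    # previous node on the same cycle
                (x ^ (1 << p), p),   # hypercube link across dimension p
            }
    return adj


def distances_from(adj, src):
    """Hop counts from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def routing_stats(adj):
    """Return (diameter, average shortest-path length over ordered pairs)."""
    diameter, total, pairs = 0, 0, 0
    for src in adj:
        dist = distances_from(adj, src)
        diameter = max(diameter, max(dist.values()))
        total += sum(dist.values())
        pairs += len(dist) - 1
    return diameter, total / pairs


ccc3 = build_ccc(3)
print(len(ccc3), routing_stats(ccc3))
```

Running the same `routing_stats` on any alternative topology of equal size gives a like-for-like comparison of worst-case and average routing steps, which is the kind of experiment the abstract describes.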
Pub Date : 1995-04-19
DOI : 10.1109/ICAPP.1995.472295
P. Williams, R. Togneri
Workstation clusters in which user processes run on one specified machine create the potential for load imbalance. An analysis is conducted to determine how various system resources interact with one another in terms of job throughput and user interaction time. Quantitative and qualitative analyses of theoretical load sharing methods are used in the development of a well-engineered system which is configurable, reliable and has a standard interface. Performance evaluation and system testing show very positive results.
Title : Dynamic load sharing within workstation clusters. Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing.