Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472262
C. Papadopoulos
This paper analyzes the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things it is shown that an N-node butterfly containing N^(1-ε) worst-case faults (for any constant ε > 0) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proven for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance.
Title: A formal study on the fault tolerance of parallel and distributed systems
Published in: Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472272
C. Papadopoulos, R. Hoffman
Conformance testing of communication protocols has recently become a major issue in the context of OSI-based standardization of protocols. The aim of conformance testing is to assure that a protocol fulfils an OSI specification. A performance study is presented for a distributed protocol test system that has been installed for conformance testing of the ISDN D-channel signalling protocol. Using a general approach for performance measurements and evaluation in distributed systems, a queueing model is developed and evaluated, based on runtimes as obtained from measurements of the test system. It is demonstrated that significant performance improvements can be achieved once the process scheduling strategy at the ISDN protocol testers is properly adjusted.
Title: Performance modelling for a distributed ISDN protocol test system
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472277
H. Suresh
Hierarchical data structures are commonly used to represent data in applications such as computer graphics, digital image processing, and computer vision, and techniques are evolving for representing these data efficiently. Transforming bilevel images to linear quadtrees is one way of representing such high-volume data. This paper presents a preliminary investigation, and its results, into transforming binary images to linear quadtrees using the Parallel Virtual Machine (PVM) system software. Single Instruction Multiple Data (SIMD) hypercube algorithms implemented using PVM were tested under the DOS operating system on IBM-compatible PCs. The quadtree algorithm generates locational codes in pre-order and generally runs in O(log n) time; this paper tests the feasibility of achieving this time on an SIMD machine.
Title: PVM implementation of quadtree building algorithms on SIMD hypercube system
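To make the idea of a linear quadtree with pre-order locational codes concrete, here is a minimal sequential sketch in Python. It is not the SIMD hypercube algorithm of the paper, and it does not use PVM. The function name `linear_quadtree`, the quadrant numbering (NW=0, NE=1, SW=2, SE=3), and the choice to store only black leaves are assumptions made for illustration.

```python
def linear_quadtree(image):
    """Return the locational codes of the black leaves of a square binary
    image whose side is a power of two, in pre-order.

    Each code is a string of quadrant digits read from the root downwards.
    Assumed convention: 0 = NW, 1 = NE, 2 = SW, 3 = SE.
    """
    codes = []

    def visit(row, col, span, code):
        block = [image[r][c]
                 for r in range(row, row + span)
                 for c in range(col, col + span)]
        if all(block):          # uniformly black: emit one leaf
            codes.append(code)
            return
        if not any(block):      # uniformly white: nothing is stored
            return
        half = span // 2        # mixed block: recurse into the four quadrants
        visit(row, col, half, code + "0")
        visit(row, col + half, half, code + "1")
        visit(row + half, col, half, code + "2")
        visit(row + half, col + half, half, code + "3")

    visit(0, 0, len(image), "")
    return codes


image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
print(linear_quadtree(image))  # ['0', '30']
```

Because the quadrants are visited in a fixed order, the codes come out already sorted, which is what makes the flat list usable in place of a pointer-based tree.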
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472218
Chi-Chang Chen, Jianer Chen
We show the necessary and sufficient condition for any two nodes in an n-dimensional star graph to have n-1 vertex-disjoint paths with length less than or equal to the minimum distance plus 2. We also provide an algorithm to generate these n-1 vertex-disjoint paths.
Title: Vertex-disjoint routings in star graphs
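For readers unfamiliar with the topology, the sketch below builds the standard n-dimensional star graph, whose nodes are the permutations of n symbols and whose edges swap the first symbol with the symbol in some other position. It only illustrates why n-1 is the natural number of disjoint paths (every node has exactly n-1 neighbours); it does not implement the authors' path-generation algorithm. The function names are hypothetical.

```python
from collections import deque


def star_neighbors(perm):
    """Neighbours of a node: swap the first symbol with the symbol at position i."""
    return [
        (perm[i],) + perm[1:i] + (perm[0],) + perm[i + 1:]
        for i in range(1, len(perm))
    ]


def bfs_distances(source):
    """Shortest-path distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in star_neighbors(node):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


identity = (1, 2, 3, 4)
print(star_neighbors(identity))            # three neighbours for n = 4
print(max(bfs_distances(identity).values()))
```

Since each node has degree n-1, no pair of nodes can be joined by more than n-1 vertex-disjoint paths, so the result in the abstract concerns when that maximum is achievable with paths at most two hops longer than the shortest one.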
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472165
S. He, M. Torkelson
A new expandable 2-dimensional systolic array consisting of N homogeneous processing elements in a rectangular structure is proposed for computing the DFT. A DFT of size N = M^2 can be computed in 2M steps of pipelined operations, achieving the optimal area-time complexity of AT^2 = O(N^2). The orthogonal pipelining of the processing is obtained by exploiting the symbiosis between the 1-dimensional systolic arrays of H.T. Kung (1980) and of L.W. Chang and M.Y. Chen (1988). Compared with another 2D array based on a "sandwiched" triple matrix product, the presented approach integrates the twiddle factor multiplication into the row transform. This not only reduces the computational complexity and the size of the coefficient matrix, but also eliminates the twiddle factor preloading procedure. A DFT of size 2^L·N can be readily computed with 2^L arrays of size N abutted together. VHDL modules have been written and successfully simulated for the proposed architecture.
Title: A new expandable 2D systolic array for DFT computation based on symbiosis of 1D arrays
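The decomposition of a size N = M^2 DFT into column transforms, twiddle factor multiplication, and row transforms can be sketched in plain Python as follows. This is the textbook index-mapping factorisation, written sequentially; it is not the systolic architecture itself, and unlike the paper it keeps the twiddle multiplication as a separate step for readability. The function names `dft` and `dft_square` are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(n^2) DFT, used both as a building block and as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft_square(x, m):
    """DFT of length N = m*m via m-point transforms.

    Input index  n = m*n1 + n2, output index k = k1 + m*k2.
    """
    n = m * m
    if len(x) != n:
        raise ValueError("input length must be m*m")

    # Step 1: m-point DFTs over n1 for each fixed n2 -> cols[n2][k1]
    cols = [dft([x[m * n1 + n2] for n1 in range(m)]) for n2 in range(m)]

    # Step 2: twiddle factors W_N^(n2*k1) -> rows[k1][n2]
    rows = [
        [cols[n2][k1] * cmath.exp(-2j * cmath.pi * n2 * k1 / n) for n2 in range(m)]
        for k1 in range(m)
    ]

    # Step 3: m-point DFTs over n2 for each fixed k1 give X[k1 + m*k2]
    out = [0j] * n
    for k1 in range(m):
        transformed = dft(rows[k1])
        for k2 in range(m):
            out[k1 + m * k2] = transformed[k2]
    return out


signal = [complex(v) for v in range(16)]
error = max(abs(a - b) for a, b in zip(dft(signal), dft_square(signal, 4)))
print(error < 1e-9)
```

Folding step 2 into step 3, as the abstract describes, would mean using row-dependent coefficients in the second set of transforms instead of a separate multiplication pass.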
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472269
L. Stals
Early experiments with parallel multigrid used square domains and uniform grids. More recently, several authors have considered problems with more complicated domains (see for example McCormick's AFAC method (1989) and Baden et al. (1994)). However, these methods still use structured grids. In this paper we present a program which is based upon unstructured grids. By allowing unstructured grids we can solve problems on more general regions and use adaptive refinement methods.
Title: Adaptive multigrid in parallel
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472205
Weiping Zhu
Distributed computer systems from time to time experience uneven loads on different resources. Dynamic load balancing aims to identify uneven load instances and take appropriate actions to restore the balance. This paper presents our experiences in implementing a load balancing facility on the Amoeba system, which allows us to carry out a series of experiments with various algorithms. The results from a preliminary study of different load balancing algorithms are also presented. These results indicate that load balancing has a great impact on system performance: it reduces not only the average response time of processes but also the variation in response time. A comparison of these algorithms under various conditions is included, which indicates that with tens of computers in a system, a centralized algorithm outperforms a distributed one. The results further indicate that job initiation is an important part of a load balancing facility.
Title: Dynamic load balancing on Amoeba
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472209
P. Nastou, K. Kyrimis, D. Maritsas
This paper describes the implementation of a library of low-level image processing algorithms. This library is divided into two families of algorithms, one for those that apply to the spatial domain (local histogram equalization, local average filter, median filter, Sobel edge detector, and histogram evaluation), and one for those that apply to the frequency domain (forward and inverse discrete Fourier transform, amplitude of the forward discrete Fourier transform, forward and inverse discrete cosine transform, and Butterworth filters). The efficiency of these algorithms depends on the number of processors used, the method of combining results produced by different processors (e.g., sequentially or using a binary tree), and the time required for the combination of two independently produced results compared to the time required to produce them.
Title: Computer vision algorithms on the Parsytec GCel 3/512
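The difference between combining partial results sequentially and using a binary tree can be illustrated with histogram evaluation. The sketch below simulates the two schemes on one machine and counts combination steps; it is an assumption-laden illustration, not the library described in the paper, and all function names are hypothetical.

```python
def local_histogram(pixels, levels):
    """Histogram computed by one (simulated) processor over its share of pixels."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    return hist


def merge(a, b):
    """Combine two independently produced histograms."""
    return [x + y for x, y in zip(a, b)]


def combine_sequential(parts):
    """One processor absorbs the others one by one: len(parts) - 1 dependent steps."""
    total, steps = parts[0], 0
    for part in parts[1:]:
        total = merge(total, part)
        steps += 1
    return total, steps


def combine_tree(parts):
    """Pairwise merging; merges within a round are independent and could run in parallel."""
    rounds = 0
    while len(parts) > 1:
        merged = [merge(parts[i], parts[i + 1]) for i in range(0, len(parts) - 1, 2)]
        if len(parts) % 2:
            merged.append(parts[-1])  # odd one out waits for the next round
        parts = merged
        rounds += 1
    return parts[0], rounds


parts = [local_histogram([i % 4, (i + 1) % 4], 4) for i in range(8)]
print(combine_sequential(parts))
print(combine_tree(parts))
```

With p partial results the sequential scheme needs p-1 dependent steps while the tree needs about log2(p) rounds, so the tree pays off mainly when a merge is expensive relative to producing a partial result, which is the trade-off the abstract points to.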
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472284
H. Lonn, R. Snedsbøl
Distributed computer systems for real-time control require a global timebase with high precision. A small time skew between local clocks in the system is required to obtain good control performance through well-synchronised task execution, and it also provides a base for efficient communication. In distributed safety-critical applications, clocks have traditionally been synchronised with fault-tolerant clock synchronisation algorithms. With these methods, a limited number of erroneous clock readings are allowed in each adjustment. On the other hand, readings from all clocks in the system are required before an adjustment can be made. In this paper an alternative approach, the Daisy Chain method, is proposed and compared with present solutions. Daisy Chain synchronisation does not allow erroneous clock readings, but methods of avoiding them are described. Due to its simplicity, the method can be implemented with little hardware. Low-precision frequency sources are sufficient and recovery after arbitrary failures is fast because no special start-up phase is required. The paper also discusses the effects of quantisation uncertainty and transmission delay, and outlines the implementation of a global time base in an embedded distributed real-time architecture.
Title: Synchronisation in safety-critical distributed control systems
Pub Date : 1995-04-19 DOI: 10.1109/ICAPP.1995.472243
M. Gallagher, V. Narasimhan
This paper presents the design of the software system, ADTEST, for generating test data for programs developed in Ada. The key feature of this system is that the problem of test data generation is treated entirely as a dynamic numerical optimisation problem and, as a consequence, this method does not suffer from difficulties commonly found in symbolic execution systems, such as those associated with input-variable-dependent loops, array references, and module calls. Instead, program instrumentation is used to solve a set of path constraints without explicitly knowing their form. The system supports not only the generation of integer and real data types, but also non-numeric data types such as characters and enumerated types. The system has been tested on large Ada programs (>60000 lines of code) and found to reduce the effort required to test programs as well as provide an increase in test coverage.
Title: Software test data generation using program instrumentation
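The idea of treating test data generation as numerical optimisation over an instrumented program can be sketched as follows. The abstract does not say which optimisation method ADTEST uses, so the step-doubling local search here is a stand-in chosen for illustration, and the example is in Python rather than Ada. The unit under test, its two branch conditions, and all names are hypothetical.

```python
def run_instrumented(x, y):
    """Hypothetical unit under test with probes inserted at each branch.

    The target path requires `x > 10` and then `x + y == 30`. Each probe
    reports how far the current inputs are from taking the wanted branch;
    the search only ever sees this number, never the predicates themselves.
    """
    distance = 0
    distance += max(0, 11 - x)      # probe for `if x > 10:` (integer inputs)
    distance += abs(x + y - 30)     # probe for `if x + y == 30:`
    return distance


def search(cost, start, max_iters=10_000):
    """Simple local search: change one input at a time, doubling the step
    after an improvement and halving it after a failure."""
    point = list(start)
    best = cost(*point)
    step = 1
    for _ in range(max_iters):
        if best == 0:
            break                   # inputs now drive execution down the target path
        improved = False
        for i in range(len(point)):
            for delta in (step, -step):
                trial = list(point)
                trial[i] += delta
                trial_cost = cost(*trial)
                if trial_cost < best:
                    point, best, improved = trial, trial_cost, True
                    break
            if improved:
                break
        if improved:
            step *= 2
        elif step > 1:
            step //= 2
        else:
            break                   # stuck in a local minimum
    return tuple(point), best


inputs, remaining = search(run_instrumented, (0, 0))
print(inputs, remaining)
```

Because the search works from executed probe values, loops, array references, and calls inside the unit need no special handling, which is the advantage over symbolic execution claimed in the abstract. A search of this kind can still stall in a local minimum, in which case it returns a non-zero distance.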