Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651537
Wanlei Zhou
This paper presents the design and prototype implementation of a reactive system that facilitates the fault detection and fault tolerance of a loosely integrated heterogeneous database system. The fault detection mechanism uses sensors to monitor individual databases and system objects and to detect database or system component failures. The fault tolerance mechanism uses actuators to react to these failures.
{"title":"Fault detection and fault tolerance in a loosely integrated heterogeneous database system","authors":"Wanlei Zhou","doi":"10.1109/ICAPP.1997.651537","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651537","url":null,"abstract":"This paper presents the design and prototype implementation of a reactive system that facilitates the fault detection and fault tolerance of a loosely integrated heterogeneous database system. The fault detection mechanism uses sensors to monitor individual databases and system objects and to detect database or system component failures. The fault tolerance mechanism uses actuators to react to these failures.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132893745","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651501
Haun-Chao Keh, Jen-Chih Lin
The supercube is a novel interconnection network that is derived from the hypercube. Unlike the hypercube, the supercube can be constructed for any number of nodes. That is, the supercube is incrementally expandable. In addition, the supercube retains the connectivity and diameter properties of the corresponding hypercube. In this paper, we consider the problem of embedding and reconfiguring binary tree structures in a faulty supercube. Furthermore, to find a replacement node for each faulty node, we allow 2-expansion and show that up to (n-2) faults can be tolerated with congestion 1 and dilation 4, where (n-1) is the dimension of the supercube.
{"title":"Embedding a complete binary tree into a faulty supercube","authors":"Haun-Chao Keh, Jen-Chih Lin","doi":"10.1109/ICAPP.1997.651501","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651501","url":null,"abstract":"The supercube is a novel interconnection network that is derived from the hypercube. Unlike the hypercube, the supercube can be constructed for any number of nodes. That is, the supercube is incrementally expandable. In addition, the supercube retains the connectivity and diameter properties of the corresponding hypercube. In this paper, we consider the problem of embedding and reconfiguring binary tree structures in a faulty supercube. Further more, for finding the replaceable node of the faulty node, we allow 2-expansion such that we can show that up to (n-2) faults can be tolerated with congestion 1 and dilation 4 that is (n-1) is the dimension of a supercube.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132317039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651538
R. Wong, R. Topor, Hong Shen
This paper presents an efficient parallel algorithm for computing the mutual range-join of N sets of numbers on shared-nothing hypercube computers. The algorithm iteratively joins each set to the mutual range-join of the preceding sets. Each join is performed on all processors of the hypercube in parallel. The algorithm uses a global sorting method to distribute the elements of the first set evenly across all processors in increasing order, a new data balancing technique to distribute the elements of subsequent sets to match the intermediate set at each processor and to compensate for join skew, and a new efficient local range-join procedure. We analyse the performance of this algorithm and demonstrate that it improves on the best previously published algorithm for this problem when the join selectivity factor is small. The method can also be applied to similar problems such as band-join and equi-join.
{"title":"A parallel sort-balance mutual range-join algorithm on hypercube computers","authors":"R. Wong, R. Topor, Hong Shen","doi":"10.1109/ICAPP.1997.651538","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651538","url":null,"abstract":"This paper presents an efficient parallel algorithm for computing the mutual range-join of N sets of numbers on shared-nothing hypercube computers. The algorithm iteratively joins each set to the mutual range-join of the preceding sets. Each join is performed on all processors of the hypercube in parallel. The algorithm uses a global sorting method to distribute the elements of the first set evenly across all processors in increasing order, a new data balancing technique to distribute the elements of subsequent sets to match the intermediate set at each processor and to compensate for join skew, and a new efficient local range-join procedure. We analyse the performance of this algorithm and demonstrate that it improves on the best previously published algorithm for this problem when the join selectivity factor is small. The method can also be applied to similar problems such as band-join and equi-join.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"78 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114001374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
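The local range-join step mentioned in the abstract above can be illustrated with a small sort-based sketch. The abstract does not define the range-join, so this assumes the common definition: given bounds 0 <= e1 <= e2, the range-join of R and S is the set of pairs (r, s) with e1 <= |r - s| <= e2. The function name `local_range_join` and the binary-search strategy are illustrative assumptions, not the paper's procedure, and the global sorting and data balancing across processors are not modelled.

```python
from bisect import bisect_left, bisect_right


def local_range_join(R, S, e1, e2):
    """Return all pairs (r, s) with e1 <= |r - s| <= e2 (assumed definition).

    Hypothetical single-processor sketch: S is sorted once, then each r
    locates its two matching windows in S by binary search.
    """
    if not 0 <= e1 <= e2:
        raise ValueError("need 0 <= e1 <= e2")
    s_sorted = sorted(S)
    pairs = []
    for r in R:
        # Window at or below r: r - e2 <= s <= r - e1
        lo = bisect_left(s_sorted, r - e2)
        hi = bisect_right(s_sorted, r - e1)
        pairs.extend((r, s) for s in s_sorted[lo:hi])
        # Window above r: r + e1 <= s <= r + e2.
        # When e1 == 0, start strictly after r so s == r is not emitted twice.
        lo = bisect_right(s_sorted, r) if e1 == 0 else bisect_left(s_sorted, r + e1)
        hi = bisect_right(s_sorted, r + e2)
        pairs.extend((r, s) for s in s_sorted[lo:hi])
    return pairs


result = local_range_join([1, 10], [2, 4, 9, 12, 20], e1=1, e2=3)
print(result)
```

With S sorted, each element of R costs two binary searches plus the size of its output, which is why a small join selectivity factor keeps the local step cheap.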
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651492
A. Barak, A. Braverman
Scalable computing clusters (SCC) are becoming an alternative to mainframes and MPPs for the execution of high-performance, demanding applications in multi-user, time-sharing environments. To better utilize the multiple resources of such systems, it is necessary to develop means for cluster-wide resource allocation and sharing that will make an SCC easy to program and use. This paper presents the details of a memory ushering algorithm among the nodes of an SCC. This algorithm allows a node that has exhausted its main memory to use available memory in other nodes. The paper first presents results of simulations of several algorithms for placing processes on nodes. It then describes the memory ushering algorithm of the MOSIX multicomputer operating system for an SCC and its performance.
{"title":"Memory ushering in a scalable computing cluster","authors":"A. Barak, A. Braverman","doi":"10.1109/ICAPP.1997.651492","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651492","url":null,"abstract":"Scalable computing clusters (SCC) are becoming an alternative to mainframes and MPP's for the execution of high performance, demanding applications in multi-user, time-sharing environments. In order to better utilize the multiple resources of such systems, it is necessary to develop means for cluster wide resource allocation and sharing, that will make an SCC easy to program and use. This paper presents the details of a memory ushering algorithm among the nodes of an SCC. This algorithm allows a node which has exhausted its main memory to use available memory in other nodes. The paper first presents results of simulations of several algorithms for process placement to nodes. It then describes the memory ushering algorithm of the MOSIX multicomputer operating system for an SCC and its performance.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116502368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651502
Yuh-Shyan Chen, Y. Tseng, T. Juang, Chiou-Jyu Chang
Trees are a common structure to represent the inter-task communication pattern of a parallel algorithm. In this paper, we consider the problem of embedding a complete binary tree in a star graph with the objective of minimizing congestion and dilation. We develop a congestion-free, dilation-2, load-1 embedding of a level-p binary tree into an n-dimensional star graph, where $p = \sum_{i=2}^{n} [\log i]$ and k is any positive integer. The result offers a tree of size comparable or superior to existing results, but with less congestion and dilation.
{"title":"Embedding of congestion-free complete binary trees with dilation two in star graphs","authors":"Yuh-Shyan Chen, Y. Tseng, T. Juang, Chiou-Jyu Chang","doi":"10.1109/ICAPP.1997.651502","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651502","url":null,"abstract":"Trees are a common structure to represent the inter-task communication pattern of a parallel algorithm. In this paper, we consider the problem of embedding a complete binary tree in a star graph with the objective of minimizing congestion and dilation. We develop a congestion-free, dilation-2, load-1 embedding of a level-p binary tree into an n-dimensional star graph, where p=/spl Sigma//sub i=2//sup n/ [log i] and k is any positive integer. The result offers a tree of size comparable or superior to existing results, but with less congestion and dilation.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123684852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
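To make the terms in the abstract above concrete, the following sketch builds the standard n-dimensional star graph (vertices are permutations of n symbols, and two vertices are adjacent when one is obtained from the other by swapping the first symbol with the symbol in some other position) and measures the dilation of a given embedding, that is, the longest host-graph distance between the images of two adjacent tree nodes. The helper names and the tiny three-node example are illustrative assumptions. This is not the paper's embedding construction, and the breadth-first search is only practical for small n.

```python
from collections import deque


def star_neighbors(perm):
    """Neighbours of a permutation in the star graph: swap position 0 with position i."""
    out = []
    for i in range(1, len(perm)):
        p = list(perm)
        p[0], p[i] = p[i], p[0]
        out.append(tuple(p))
    return out


def star_distance(src, dst):
    """Shortest-path length between two vertices, found by breadth-first search."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in star_neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("dst is not reachable from src")


def dilation(tree_edges, placement):
    """Maximum host distance over all tree edges under the given placement."""
    return max(star_distance(placement[u], placement[v]) for u, v in tree_edges)


# Hypothetical example: a root with two children placed in the 3-dimensional star graph.
tree_edges = [("root", "left"), ("root", "right")]
placement = {"root": (1, 2, 3), "left": (2, 1, 3), "right": (3, 2, 1)}
print(dilation(tree_edges, placement))
```

Here both children are direct neighbours of the root's image, so every tree edge maps onto a single star-graph edge.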
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651481
Abdel-Halim Smai, L. Thorelli
The multiplexing and arbitration of the physical channel among many virtual channels is an important issue in parallel systems using the wormhole switching technique. Many existing multicomputer routers do not provide support for prioritized traffic at the link level. The main goal of network designers has been to improve average delay and throughput. In this paper, we propose a new, low-cost prioritized physical channel scheduling scheme for wormhole networks. Conventional demand multiplexing allows a set of ready flits to share a physical channel in a strict round-robin manner. Thus, it does not allow flexibility for fast movement of high-priority messages such as synchronization and control information. The proposed scheme allows high-priority flits to bypass low-priority flits, while applying round-robin among flits with the same priority. This paper presents the motivation behind the proposed scheme, a description of the scheme, and its implementation. The prioritized physical channel scheduling scheme is evaluated and compared against conventional demand multiplexing for a wide range of system parameters. Results based on thorough simulation show that the performance of high-priority traffic can be significantly improved, with latency improved by up to 45%. The results demonstrate significant potential for designing high-performance wormhole systems to support prioritized traffic.
{"title":"Prioritized physical channel scheduling in wormhole networks","authors":"Abdel-Halim Smai, L. Thorelli","doi":"10.1109/ICAPP.1997.651481","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651481","url":null,"abstract":"The multiplexing and arbitration of the physical channel among many virtual channels is an important issue in parallel systems using wormhole switching technique. Many existing multicomputer routers do not provide support for prioritized traffic at the link level. The main goal of network designers has been to improve average delay and throughput. In this paper, we propose a new, low-cost prioritized physical channel scheduling scheme for wormhole networks. Conventional demand multiplexing allows a set of ready flits to share a physical channel in a strict round-robin manner. Thus, it does not allow flexibility for fast movement of high priority messages such as synchronization and control information. The proposed scheme allows high priority flits to bypass low priority flits, while applying round-robin among flits with the same priority. This paper presents the motivation behind the proposed scheme, description of the scheme, and its implementation. The prioritized physical channel scheduling scheme is evaluated and compared against the conventional demand multiplexing for a wide range of system parameters. Results based on thorough simulation show that the performance of high priority traffic can be significantly improved, latency can be improved by up to 45%. The results demonstrate significant potential for designing high performance wormhole systems to support prioritized traffic.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133573548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
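The difference between the two arbitration policies described in the abstract above can be sketched in a few lines. This is a minimal software model under stated assumptions: each virtual channel holds a queue of flits and a single integer priority (larger meaning more urgent), and the physical channel carries one flit per cycle. The class and function names are hypothetical, and the paper's hardware implementation is not reproduced.

```python
from collections import deque


class VirtualChannel:
    def __init__(self, name, priority, flits):
        self.name = name
        self.priority = priority  # assumed convention: larger value = more urgent
        self.flits = deque(flits)


def demand_multiplex(channels, start):
    """Strict round-robin: first ready channel at or after index `start`."""
    n = len(channels)
    for k in range(n):
        i = (start + k) % n
        if channels[i].flits:
            return i
    return None


def prioritized(channels, start):
    """Highest ready priority wins; round-robin applies only among equal priorities."""
    ready = [c.priority for c in channels if c.flits]
    if not ready:
        return None
    top = max(ready)
    n = len(channels)
    for k in range(n):
        i = (start + k) % n
        if channels[i].flits and channels[i].priority == top:
            return i


def drain(channels, pick):
    """Send one flit per cycle until every queue is empty; return the send order."""
    order, start = [], 0
    while True:
        i = pick(channels, start)
        if i is None:
            return order
        order.append(channels[i].flits.popleft())
        start = (i + 1) % len(channels)


def make_channels():
    # Hypothetical traffic: channel B carries high-priority control flits.
    return [
        VirtualChannel("A", 0, ["a0", "a1"]),
        VirtualChannel("B", 1, ["b0", "b1"]),
        VirtualChannel("C", 0, ["c0"]),
    ]


print(drain(make_channels(), demand_multiplex))
print(drain(make_channels(), prioritized))
```

Under the prioritized policy both flits of channel B leave before any low-priority flit, whereas strict round-robin interleaves them with the traffic of A and C.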
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651528
K. Watanabe, L. Wang, Hyeong-Woo Cha, S. Ogawa
CMOS equivalents of the synapse and the neuron are proposed for LSI implementation of an adaptive analog neural network. The synapse is a multiplying digital-to-analog converter based on an R-2R ladder, and the neuron consists of a second-generation current conveyor. Prototype chips fabricated independently using a 0.6 μm CMOS process have confirmed the wideband signal processing capability owing to a fully current-mode approach. Detailed analyses of the measured performance have also given the design criteria for a fully parallel implementation.
{"title":"A current-mode approach to CMOS neural network implementation","authors":"K. Watanabe, L. Wang, Hyeong-Woo Cha, S. Ogawa","doi":"10.1109/ICAPP.1997.651528","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651528","url":null,"abstract":"CMOS equivalents of the synapse and the neuron are proposed for LSI implementation of an adaptive analog neural network. The synapse is a multiplying digital-to-analog converter based on an R-2R ladder and the neuron consists of the second-generation current conveyor. Prototype chips fabricated independently using 0.6 /spl mu/m CMOS process have confirmed the wideband signal processing capability owing to a fully current-mode approach. Detailed analyses of measured performances have also given the design criteria for fully parallel implementation.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130110847","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
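The role of a multiplying digital-to-analog converter as a synapse can be shown with an idealized numerical model. An ideal n-bit R-2R ladder scales its analog input by code / 2^n, so a stored digital weight multiplies an incoming current, and in current mode the contributions of several synapses add at a single node. The function names, the 8-bit width, and the unsigned weights are assumptions made for illustration only; the abstract does not give the bit width or sign handling, and the current conveyor circuit and any neuron nonlinearity are not modelled.

```python
def mdac_synapse(i_in, code, bits=8):
    """Ideal R-2R multiplying DAC: output current = i_in * code / 2**bits.

    `bits` = 8 is an assumed width; `code` is the stored digital weight.
    """
    if not 0 <= code < 2 ** bits:
        raise ValueError("code out of range for the given bit width")
    return i_in * code / 2 ** bits


def neuron_sum(currents):
    """Current-mode summation: currents entering one node simply add."""
    return sum(currents)


# Two synapses feeding one neuron (input currents in arbitrary units).
i_total = neuron_sum([
    mdac_synapse(1.0, 128),  # weight 128/256 = 0.5
    mdac_synapse(2.0, 64),   # weight 64/256 = 0.25
])
print(i_total)
```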
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651488
Sang-Hwa Chung, Soo-Cheol Oh, K. Ryu, Soo-Hee Park
This paper presents a parallel information retrieval (IR) system which provides information for users with precision and speed. For precision, the IR system adopts a full-text search method. To supply a fast information service, parallel processing techniques are used, so multiple queries are processed concurrently and each of these queries is also handled in parallel using multiple processors. For the efficient utilization of multiple processors, we developed a processor allocation method which dynamically assigns various-size processor clusters for incoming queries based on the current workload of the system. The parallel IR model is implemented on a multi-transputer system composed of 16 processors. According to the experiments, a linear speed-up of up to 11.3-fold is obtained, and network and hard disk overheads are negligible in comparison with the response time.
{"title":"Parallel information retrieval on a distributed memory multiprocessor system","authors":"Sang-Hwa Chung, Soo-Cheol Oh, K. Ryu, Soo-Hee Park","doi":"10.1109/ICAPP.1997.651488","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651488","url":null,"abstract":"This paper presents a parallel information retrieval (IR) system which provides information for users with precision and speed. For precision, the IR system adopts a full-text search method. To supply a fast information service, parallel processing techniques are used, so multiple queries are processed concurrently and each of these queries is also handled in parallel using multiple processors. For the efficient utilization of multiple processors, we developed a processor allocation method which dynamically assigns various-size processor clusters for incoming queries based on the current workload of the system. The parallel IR model is implemented on a multi-transputer system composed of 16 processors. According to the experiments, a linear speed-up of up to 11.3-fold is obtained, and network and hard disk overheads are negligible in comparison with the response time.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131909078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651521
R. Baraglia, R. Ferrini, D. Laforenza, A. Laganà
As a case study of computational problems that are difficult to parallelize, we discuss the parallelization of a quantum reactive scattering application integrating a two-dimensional Schrödinger equation. From a detailed analysis of the computational procedure, the sources of inefficiency that degrade performance when running the application on a MIMD-DM machine were singled out, and the parallel model was structured accordingly. The suitability of the adopted model was checked by running some test calculations.
{"title":"On the optimization of a task-farm model for the parallel integration of a two-dimensional Schrodinger equation","authors":"R. Baraglia, R. Ferrini, D. Laforenza, A. Laganà","doi":"10.1109/ICAPP.1997.651521","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651521","url":null,"abstract":"As a case study of computational problems difficult to parallelize we discuss the parallelization of a quantum reactive scattering application integrating a two dimensional Schrodinger equation. From a detailed analysis of the computational procedure, sources for inefficiency causing a performance degradation when running the application on a MIMD-DM machine were singled out and the parallel model was structured accordingly. The suitability of the model adopted was checked by running some test calculations.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115816205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 1997-12-10 DOI: 10.1109/ICAPP.1997.651527
V. Kecman
The paper presents a neural network (NN) based adaptive backthrough control (ABC) scheme for both linear and nonlinear dynamic plants. Unlike other feedforward NN-based control schemes, the ABC comprises one neural network that simultaneously acts as both the plant model (emulator) and the controller (inverse of the emulator). For linear plants without noise, the resulting feedforward controller is a perfect adaptive pole-zero canceller, provided that the orders of the plant and the plant model are equal. For a nonlinear dynamic system with monotonic nonlinearity, the proposed ABC scheme acts as a nonlinear predictive controller. The ABC scheme is based on the discrete nonlinear (NARMAX) dynamic model. For such models and for monotonic nonlinearity, the desired control signal is computed by a nonlinear optimization procedure with a guaranteed convex search function and, consequently, a unique solution.
{"title":"Learning in an adaptive backthrough control structure","authors":"V. Kecman","doi":"10.1109/ICAPP.1997.651527","DOIUrl":"https://doi.org/10.1109/ICAPP.1997.651527","url":null,"abstract":"The paper presents a neural network (NN) based adaptive backthrough control (ABC) scheme for both linear and nonlinear dynamic plants. Unlike other feedforward NN based control schemes the ABC comprises of one neural network which simultaneously acts as both plant model (emulator) and the controller (inverse of the emulator). For linear plants, without noise, the resulting feedforward controller, providing that the order of the plant and plant model are equal, is a perfect adaptive poles-zeros canceller. In the case of a nonlinear dynamic system, and for the monotonic nonlinearity, the proposed ABC control represents the nonlinear predictive controller. The ABC scheme is based on the discrete nonlinear (NARMAX) dynamic model. For such models and for monotonic nonlinearity, the calculation of the desired control signal is the result of the nonlinear optimization procedure with a guaranteed convex search function and consequently with a unique solution.","PeriodicalId":325978,"journal":{"name":"Proceedings of 3rd International Conference on Algorithms and Architectures for Parallel Processing","volume":"41 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120822967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}