The TIRAN approach to reusing software implemented fault tolerance
O. Botti, V. D. Florio, Geert Deconinck, R. Lauwereins, F. Cassinari, S. Donatelli, A. Bobbio, A. Klein, Holger Küfner, Erwin M. Thurner, E. Verhulst
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823427
Available solutions for fault tolerance in embedded automation are often based on strong customisation, affect the whole life-cycle, and require highly specialised design teams, making dependable embedded systems costly and difficult to develop and maintain. The TIRAN project develops a framework that provides fault tolerance capabilities to automation systems, with the goal of allowing portable, reusable and cost-effective solutions. Application developers can select, configure and integrate into their own environment a variety of software-based functions for error detection, confinement and recovery provided by the framework.

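The abstract does not describe TIRAN's actual mechanisms, but a task-level watchdog is a classic example of the kind of software-based error-detection function such a framework provides. The sketch below (hypothetical names, not TIRAN's API) detects a task that stops sending heartbeats and invokes a recovery hook:

```python
import threading
import time

class Watchdog:
    """Detects a task that stops sending heartbeats within a deadline."""

    def __init__(self, deadline_s, on_error):
        self.deadline_s = deadline_s
        self.on_error = on_error          # confinement/recovery hook
        self._last_beat = time.monotonic()
        self._lock = threading.Lock()
        self._stop = threading.Event()

    def heartbeat(self):
        """Called periodically by the monitored task."""
        with self._lock:
            self._last_beat = time.monotonic()

    def _run(self):
        # Poll at a fraction of the deadline until stopped or an error fires.
        while not self._stop.wait(self.deadline_s / 4):
            with self._lock:
                silent = time.monotonic() - self._last_beat
            if silent > self.deadline_s:
                self.on_error()           # e.g. restart or isolate the task
                return

    def start(self):
        threading.Thread(target=self._run, daemon=True).start()

    def stop(self):
        self._stop.set()
```

A monitored task simply calls `heartbeat()` in its main loop; if it crashes or hangs, `on_error` fires once after the deadline elapses.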
Evaluation of a virtual shared memory machine by the compilation of data parallel loops
F. Baiardi, D. Guerri, P. Mori, L. Ricci
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823425
We introduce DVSA (distributed virtual shared areas), a virtual machine supporting the sharing of information on distributed memory architectures. The shared memory is structured as a set of areas, where the size of each area may be chosen in an architecture-dependent range. DVSA supports the sharing of areas rather than of variables because exchanging chunks of data may yield better performance on distributed memory architectures that offer little or no hardware support for information sharing. DVSA does not implement replication or prefetching strategies, under the assumption that these strategies should be implemented by application-specific virtual machines, whose definition may often be driven by the compilation of the adopted programming languages. To validate this assumption, we first consider the implementation of data parallel loops and show that a set of static analyses based on the closed-forms approach makes it possible to define compiler-driven caching and prefetching strategies. These strategies fully exploit the operations offered by the DVSA machine and noticeably reduce the time to access shared information. The optimization strategies that can be exploited by the compiler include the merging of accesses to avoid multiple accesses to the same area, the prefetching of areas, and the reduction of the overhead due to barrier synchronization. Preliminary performance figures are discussed.

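The DVSA interface itself is not given in the abstract; the toy model below illustrates the two ideas it describes — data moves in whole areas, and a compiler-driven client merges and prefetches accesses so only the first touch of an area is remote (all names are illustrative, not DVSA's API):

```python
class AreaStore:
    """Toy model of an area-based shared space: data moves in whole areas,
    never as individual variables."""

    def __init__(self):
        self._areas = {}

    def put(self, name, data):
        self._areas[name] = bytes(data)      # one transfer per whole area

    def get(self, name):
        return self._areas[name]


class CachingClient:
    """Compiler-driven caching: repeated reads of the same area are merged
    into one remote access; prefetch pulls areas in before a loop needs them."""

    def __init__(self, store):
        self.store = store
        self._cache = {}
        self.remote_reads = 0

    def read(self, name):
        if name not in self._cache:          # only the first access is remote
            self._cache[name] = self.store.get(name)
            self.remote_reads += 1
        return self._cache[name]

    def prefetch(self, names):
        for n in names:                      # fetch areas ahead of the loop
            self.read(n)
```

In the paper's setting the cache and prefetch lists would be derived statically, from closed-form analysis of the data parallel loop bounds, rather than discovered at run time.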
Combining low-latency communication protocols with multithreading for high performance DSM systems on clusters
L. Lefèvre, Olivier Reymann
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823428
With the development of clusters based on high performance networks, it is now possible to design efficient distributed shared memory (DSM) systems. In this paper we present the approach we chose to implement a high performance DSM system on top of a cluster, combining low-latency communication protocols (MPI-BIP on Myrinet networks) with a multithreading approach (PM2). We present our approach, called the Distributed Objects Shared MemOry System (DOSMOS), its design, and experiments performed with various communication libraries (PVM, MPI) and networks (Ethernet, Myrinet).

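The payoff of pairing a fast network with multithreading is that communication can be overlapped with computation. A minimal sketch of that pattern, with a dedicated communication thread feeding a compute loop through a bounded queue (generic illustration, not DOSMOS code):

```python
import threading
import queue

def overlapped(fetch_items, compute):
    """Hide communication latency: a communication thread keeps pre-fetching
    the next item while the main thread computes on the previous one."""
    ready = queue.Queue(maxsize=2)       # small buffer bounds memory use

    def comm():
        for item in fetch_items:         # stands in for network receives
            ready.put(item)
        ready.put(None)                  # end-of-stream marker

    threading.Thread(target=comm, daemon=True).start()

    results = []
    while True:
        item = ready.get()
        if item is None:
            break
        results.append(compute(item))    # overlaps with the next fetch
    return results
```

With a low-latency transport underneath, the communication thread spends its time in the network layer while the compute thread stays busy, which is the combination the paper evaluates.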
Parallel programming with data driven model
V. Tran, L. Hluchý, Giang T. Nguyen
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823413
In this paper, we present a new, powerful method for parallel program representation called the Data Driven Graph (DDG). DDG retains all the advantages of the classical directed acyclic graph (DAG) and adds much more: a simple definition, flexibility, and the ability to represent loops and dynamically created tasks. With DDG, scheduling becomes an efficient tool for increasing the performance of parallel systems. DDG is not only a parallel program model; it also initiates a new parallel programming style that allows programmers to write a parallel program with minimal difficulty. We also present our parallel program development tool with support for DDG and scheduling.

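The abstract does not define DDG formally, but the core of any data-driven model is that a task fires when its input data exist, and a running task may produce data or tasks dynamically. A minimal sketch of that execution rule (illustrative, not the paper's definition):

```python
class Task:
    """A unit of work that consumes named data items."""

    def __init__(self, fn, inputs):
        self.fn = fn              # called with the values of its inputs
        self.inputs = inputs      # names of the data items it consumes


class DataDrivenScheduler:
    """Fires a task as soon as every data item it consumes is available.
    A running task may put() new data or submit() new tasks, which covers
    loops and dynamically created tasks."""

    def __init__(self):
        self.data = {}
        self.pending = []

    def put(self, name, value):
        self.data[name] = value
        self._fire()

    def submit(self, task):
        self.pending.append(task)
        self._fire()

    def _fire(self):
        progress = True
        while progress:
            progress = False
            for t in list(self.pending):
                # Re-check membership: a task body may have fired others.
                if t in self.pending and all(i in self.data for i in t.inputs):
                    self.pending.remove(t)
                    t.fn(*(self.data[i] for i in t.inputs))
                    progress = True
```

A DAG cannot express a task created by another task at run time; here that is just a `submit()` call from inside a task body.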
PVM application-level tuning over ATM
C. Napoli, M. Giordano, M. Furnari, F. Vitobello
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823435
The use of software packages that allow for distributed application computing is becoming more and more attractive due to the advent of high-speed computer networks as well as high-performance general-purpose workstations. Nevertheless, it is well known that the success of cluster-based parallel computing depends on the performance of communication among processors, which in turn depends on the physical network architecture, protocols, and interface software. In this paper we present the results of our investigation into an efficient use of the PVM software package in real distributed application settings over ATM networks. In particular, we show that tuning PVM network parameters at the application level improves communication performance on ATM networks without resorting to a native ATM PVM implementation that directly adopts the ATM Adaptation Layer instead of TCP/IP.

Self-similarity in SPLASH-2 workloads on shared memory multiprocessors systems
J. Sahuquillo, Teresa Nachiondo Frinós, Juan-Carlos Cano, J. A. Gil, A. Pont
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823423
The workloads used for evaluating and obtaining performance results in shared memory multiprocessors are widely heterogeneous. Traces were used for several decades and, as computer systems grew in power, semantic benchmarks like SPLASH-2 became the most common workloads. Unfortunately, few benchmarks are available. Recently, self-similarity studies have been performed in several computing domains. In this paper, we study the self-similar properties of several SPLASH-2 benchmarks. Each benchmark has been studied independently, and all exhibit clearly self-similar behaviour. The results enable the construction of a self-similar memory reference generator that produces a wide variety of parallel workload traces both flexibly and quickly.

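The abstract does not say which estimator the authors use; a standard way to quantify self-similarity in a trace is the aggregated-variance estimate of the Hurst exponent H, sketched here on synthetic data rather than actual SPLASH-2 references:

```python
import math
import random
import statistics

def hurst_aggvar(series, block_sizes=(1, 2, 4, 8, 16)):
    """Aggregated-variance Hurst estimate: for a self-similar series, the
    variance of the m-aggregated series scales as m^(2H-2), so H comes from
    the slope of log Var against log m."""
    xs, ys = [], []
    for m in block_sizes:
        n = len(series) // m
        agg = [sum(series[i * m:(i + 1) * m]) / m for i in range(n)]
        xs.append(math.log(m))
        ys.append(math.log(statistics.pvariance(agg)))
    # Least-squares slope of log Var vs log m.
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return 1 + slope / 2

random.seed(1)
trace = [random.random() for _ in range(4096)]   # uncorrelated reference
h = hurst_aggvar(trace)                          # expect H near 0.5
```

For an uncorrelated series the block variance falls like 1/m (slope -1, H = 0.5); for a self-similar trace the variance decays more slowly and H approaches 1, which is the behaviour the paper reports for the benchmarks.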
A framework for process migration in software DSM environments
I. Zoraja, A. Bode, V. Sunderam
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823407
This paper shows that process migration can be implemented successfully for software distributed shared memory (DSM) environments. We have developed a migration framework that can transparently migrate DSM processes while preserving the consistency of running applications. The migration framework is integrated into the CORAL (Cooperative Online monitoRing Actions Layer) system, an online monitoring system that connects parallel tools to a running application. Special emphasis has been put on techniques and mechanisms for migrating shared resources and communication channels as well as internal monitoring data structures. Currently, the migration framework migrates parallel processes based on the TreadMarks library; the Condor library is used for the state transfer of a single process. In a computing environment of eight nodes running TreadMarks applications, the migration framework adds a 10% overhead to Condor, growing almost linearly with added nodes. Although our first implementation supports TreadMarks applications, both the monitoring system and the migration framework are designed to be reusable and easily adaptable to other software DSM systems.

CELLAR: a high level cellular programming language with regions
G. Folino, G. Spezzano
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823419
This paper describes CELLAR, a language for cellular programming that extends the cellular automata model with the concept of regions. Regions are spatiotemporal objects that define zones of the automaton (sets of cells) containing interesting and meaningful data patterns or trends that can be defined as events. Each cell of the automaton can monitor regions for a given period and observe their evolution through global functions (max, min, sum, etc.). Furthermore, each cell can have an associated attribute called its perception rating, which indicates how far that cell can 'see'. On the basis of this value and the cell's position in the cellular space, we can define the regions that are visible to the cell. Using these constructs, a cell can define significant events to extract data of interest in one or more regions and perform actions when an event is detected. In the paper, we show that regions simplify programming and allow the building of more complex models. After describing the main constructs of CELLAR, the paper illustrates the region-based programming model through the design of a parallel model of animal migration. Performance results of the model implemented on a Meiko CS-2 are also given.

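CELLAR's syntax is not shown in the abstract, but its two additions to the plain cellular automata model (global functions over a region, and visibility limited by a cell's perception rating) can be sketched directly. The names below are illustrative, not CELLAR constructs:

```python
class Region:
    """A named set of cell coordinates whose state can be observed globally."""

    def __init__(self, cells):
        self.cells = set(cells)


class CellularAutomaton:
    def __init__(self, width, height, init=0):
        self.grid = {(x, y): init
                     for x in range(width) for y in range(height)}

    def observe(self, region, fn):
        """Apply a global function (max, min, sum, ...) over a region."""
        return fn(self.grid[c] for c in region.cells)

    def visible(self, region, cell, perception):
        """A region is visible to a cell if any of its cells lies within the
        cell's perception rating (Chebyshev distance, as an assumption)."""
        cx, cy = cell
        return any(max(abs(cx - x), abs(cy - y)) <= perception
                   for (x, y) in region.cells)
```

An event in CELLAR's sense would then be a predicate over `observe` results, e.g. "fire when the maximum over region R exceeds a threshold", checked only by cells for which R is visible.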
Asynchronous progressive irregular prefix operation in HPF2
Frédéric Brégier, M. Counilh, J. Roman
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823421
In this paper we study one kind of irregular computation on distributed arrays, the irregular prefix operation, which is currently not well supported by the standard data-parallel language HPF2. We show a parallel implementation that efficiently exploits the independent computations arising in this irregular operation. Our approach is based on a directive that characterizes an irregular prefix operation, and on inspector/executor support, implemented in the CoLuMBO library, which optimizes execution through an asynchronous communication scheme and thus communication/computation overlap. We validate our contribution with results obtained on an IBM SP2, both for basic experiments and for a sparse Cholesky factorization algorithm applied to real-size problems.

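The paper's precise definition of the operation is not reproduced in the abstract; one simple reading of an irregular prefix operation is a prefix computation restarted at segment boundaries of varying length, where distinct segments are the independent work a parallel implementation can exploit. A sequential reference sketch (assumed semantics, not the paper's directive):

```python
def segmented_prefix(values, segments, op=lambda a, b: a + b):
    """Inclusive prefix operation restarted at each irregular segment.
    `segments` lists the (variable) length of each segment; segments are
    mutually independent, so they could be processed in parallel."""
    out = []
    i = 0
    for length in segments:
        acc = values[i]
        out.append(acc)
        for j in range(i + 1, i + length):
            acc = op(acc, values[j])     # dependence only within a segment
            out.append(acc)
        i += length
    return out
```

Because no value flows across a segment boundary, an inspector can assign whole segments to processors and the executor only communicates where a segment is split across array distributions.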
Implementation of an annotation service on the WWW-Virtual Notes
Stefan Koch, George H. Schneider
Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing, 2000-01-19. DOI: 10.1109/EMPDP.2000.823399
We present the implementation of Virtual Notes, an annotation service for the World Wide Web that uses only standard Internet technology. Virtual Notes is therefore immediately usable by anyone on the WWW without preparation, which makes the solution especially suited to environments lacking a coherent workgroup, e.g. teaching or research. Also described are the user interface, several use cases, and the administrative side of the annotation service, where the overhead of employing Virtual Notes has been minimized.
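The essential data model of any such service is notes stored separately from the annotated pages and keyed by page and anchor, so no change to the target site is needed. A minimal server-side sketch (hypothetical structure, not the Virtual Notes implementation):

```python
class AnnotationService:
    """Minimal model of a web annotation store: notes are attached to
    (URL, anchor) pairs and kept apart from the annotated page itself."""

    def __init__(self):
        self._notes = {}

    def add(self, url, anchor, author, text):
        """Attach a note to a position (anchor) within a page."""
        self._notes.setdefault((url, anchor), []).append((author, text))

    def notes_for(self, url):
        """All notes for a page, grouped by anchor; the service merges
        these into the page when it is served to the reader."""
        return {anchor: notes
                for (u, anchor), notes in self._notes.items() if u == url}
```

Keeping the store external is what makes the service usable on any page "without preparation": the annotated site never has to cooperate.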