Proceedings 8th Euromicro Workshop on Parallel and Distributed Processing: latest articles

An efficient algorithm for the physical mapping of clustered task graphs onto multiprocessor architectures
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823437
N. Koziris, M. Romesis, P. Tsanakas, G. Papakonstantinou
The most important issue in sequential program parallelisation is the efficient assignment of computations to different processing elements. In the past, many approaches were devoted to efficient program parallelisation, considering various models for the parallel programs and the target architectures. The most widely used parallelism description model is the task graph model with precedence constraints. Nevertheless, as far as the physical mapping of tasks onto parallel architectures is concerned, little research has given practical results. It is well known that the physical mapping problem is NP-hard in the strong sense, thus allowing only for heuristic approaches. Most researchers and tool programmers use exhaustive algorithms or the classical method of simulated annealing. This paper presents an alternative approach to the mapping problem. Given the graph of clustered tasks and the graph of the target distributed architecture, our heuristic finds a mapping by first placing the highly communicative tasks on adjacent nodes of the processor network. Once these "backbone" tasks are mapped there is no backtracking, thus achieving low complexity. The remaining tasks are then placed, beginning with those close to the "backbone" tasks. The paper concludes with performance and comparison results which reveal the method's efficiency.
Citations: 85
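The backbone-first idea in the abstract above can be illustrated with a small greedy sketch. This is not the authors' algorithm: the seeding rule, the cost function (communication volume times hop distance), the tie-breaking, and the names `greedy_map`, `comm` and `proc_adj` are all assumptions made for illustration. It only shows the general shape of a mapping that places the heaviest communicators on adjacent processors first and never backtracks.

```python
from collections import deque


def bfs_distances(adj, src):
    """Hop counts from src to every reachable processor."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def greedy_map(comm, proc_adj):
    """Map clustered tasks to processors without backtracking.

    comm:     {(task_a, task_b): volume}, undirected communication volume
    proc_adj: {proc: set of neighbours}, a connected processor network
    Assumes there are at least as many processors as tasks.
    """
    dist = {p: bfs_distances(proc_adj, p) for p in proc_adj}

    def volume(a, b):
        return comm.get((a, b), 0) + comm.get((b, a), 0)

    tasks = sorted({t for edge in comm for t in edge})
    free = set(proc_adj)
    mapping = {}

    # Seed the "backbone": the heaviest edge goes onto two nearby processors.
    a, b = max(sorted(comm), key=lambda e: comm[e])
    pa = max(sorted(free), key=lambda p: len(proc_adj[p]))
    mapping[a] = pa
    free.discard(pa)
    pb = min(sorted(free), key=lambda p: dist[pa][p])
    mapping[b] = pb
    free.discard(pb)

    # Place the remaining tasks, most strongly attached to the mapped set first.
    while len(mapping) < len(tasks):
        todo = [t for t in tasks if t not in mapping]
        t = max(todo, key=lambda x: sum(volume(x, m) for m in mapping))
        best = min(
            sorted(free),
            key=lambda p: sum(volume(t, m) * dist[p][mapping[m]] for m in mapping),
        )
        mapping[t] = best
        free.discard(best)
    return mapping


if __name__ == "__main__":
    ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
    comm = {("A", "B"): 10, ("B", "C"): 8, ("C", "D"): 1, ("A", "D"): 1}
    print(greedy_map(comm, ring))
```

Each task is placed exactly once, so the sketch runs in polynomial time, at the price of possibly missing the optimal mapping.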
Tailoring a self-distributing architecture to a cluster computer environment
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823406
R. Moore, B. Klauer, K. Waldschmidt
This paper analyzes the consequences of existing network structure for the design of a protocol for a radical COMA (Cache Only Memory Architecture). Parallel computing today faces two significant challenges: the difficulty of programming and the need to leverage existing "off-the-shelf" hardware. The difficulty of programming parallel computers can be split into two problems: distributing the data, and distributing the computation. Parallelizing compilers address both problems, but have limited application outside the domain of loop intensive "scientific" code. Conventional COMAs provide an adaptive, self-distributing solution to data distribution, but do not address computation distribution. Our proposal leverages parallelizing compilers, and then extends COMA to provide adaptive self-distribution of both data and computation. The radical COMA protocols can be implemented in hardware, software, or a combination of both. When, however, the implementation is constrained to operate in a cluster computing environment (that is, to use only existing, already installed hardware), the protocols have to be reengineered to accommodate the deficiencies of the hardware. This paper identifies the critical quantities of various existing network structures, and discusses their repercussions for protocol design. A new protocol is presented in detail.
Citations: 10
Gypsy: a component-based mobile agent system
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823403
M. Jazayeri, Wolfgang Lugmayr
Gypsy is a component-based, dynamically extensible environment for mobile agent systems. The runtime environment consists of lightweight servers that provide a distributed execution environment for agents, and a remote administration tool that supports the set-up and shutdown of servers and agents. A server hosts a number of places to which agents may move to execute their functions. Each place is specialized to support a particular service. A supervisor agent may contain several worker agents that may execute concurrently. A supervisor agent travels from place to place according to its itinerary and launches its workers at appropriate places. A mobile agent is the basic abstraction from which all other components, including servers and supervisor agents, are constructed. The primary goal of the Gypsy project is to build a multi-language, extensible environment for experimenting with mobile agents as a programming paradigm. The environment is implemented in Java and currently supports agents written in the Java or Python programming languages. This paper presents an overview of the Gypsy project, the current system architecture and the design of the important components of the Gypsy system.
Citations: 28
Using agent wills to provide fault-tolerance in distributed shared memory systems
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823426
A. Rowstron
In this paper we describe how we use mobile objects to provide distributed programs coordinating through a persistent distributed shared memory (DSM) with tolerance to sudden agent failure, and use the increasingly popular Linda-like tuple space languages as an example for implementation of the concept. In programs coordinating and communicating through a DSM, a data structure is shared between multiple agents, and the agents update the shared structure directly. However, if an agent should suddenly fail, it is often hard for the agents to make the data structures consistent with the new application state. For example, consider a data structure that contains a list of active agents. In such a case, transactions can be used when adding and removing agent names from the list, ensuring that the data structure is consistent and does not become corrupted should an agent fail. However, if failure of the agent occurs after the name has been added, how does the application ensure the list is correct? We argue that using mobile objects we can provide wills for the agents to effectively enable them to ensure the shared data structure is application consistent, even once they have failed. We show how we have integrated the use of agent wills into a Linda system and show that we have not increased the complexity of program writing. The integration is simple and general, does not alter the underlying semantics of the operations performed in the will, and the use of mobility is transparent to the programmer.
Citations: 4
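The active-agent list example from the abstract above can be made concrete with a toy, single-process tuple space. Everything here is hypothetical: the class `TupleSpace`, the methods `register_will` and `agent_failed`, and the use of `None` as a wildcard are illustrative assumptions, not the API of the Linda system described in the paper, and the sketch ignores mobility and distribution entirely. It only shows the idea that an agent leaves behind operations to be run on its behalf if it fails.

```python
class TupleSpace:
    """Toy in-memory tuple space with Linda-like operations and agent wills."""

    def __init__(self):
        self.tuples = []
        self.wills = {}  # agent name -> list of callables taking the space

    def out(self, tup):
        """Insert a tuple into the space."""
        self.tuples.append(tup)

    @staticmethod
    def _match(pattern, tup):
        # None in the pattern acts as a wildcard field.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def inp(self, pattern):
        """Non-blocking destructive read; returns None if nothing matches."""
        for tup in self.tuples:
            if self._match(pattern, tup):
                self.tuples.remove(tup)
                return tup
        return None

    def rdp(self, pattern):
        """Non-blocking, non-destructive read."""
        for tup in self.tuples:
            if self._match(pattern, tup):
                return tup
        return None

    def register_will(self, agent, action):
        """Record an action to run on the agent's behalf if it fails."""
        self.wills.setdefault(agent, []).append(action)

    def agent_failed(self, agent):
        """Execute and discard the will of a failed agent."""
        for action in self.wills.pop(agent, []):
            action(self)


if __name__ == "__main__":
    space = TupleSpace()
    for name in ("a1", "a2"):
        space.out(("active", name))
        # The will removes the agent's own entry from the shared list.
        space.register_will(name, lambda s, n=name: s.inp(("active", n)))
    space.agent_failed("a1")
    print(space.tuples)
```

After `a1` fails, only the entry for `a2` remains, so the shared list again reflects the application state.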
Modelling message-passing programs for static mapping
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823416
C. Roig, A. Ripoll, M. A. Senar, F. Guirado, E. Luque
An efficient mapping of a parallel program onto the processors is vital for achieving high performance on a parallel computer. When the structure of the parallel program in terms of its task execution times, task dependencies, and amount of communication data is known a priori, mapping can be accomplished statically at compile time. Mapping algorithms start from a parallel application model and automatically map tasks to processors in order to minimise the execution time of the program. In this paper we discuss the current models used in mapping parallel programs, the Task Precedence Graph (TPG) and the Task Interaction Graph (TIG), and we define a new model called the Temporal Task Interaction Graph (TTIG). The contribution of the TTIG is that it enhances these two previous models with the ability to explicitly capture the potential degree of parallel execution between adjacent tasks, allowing the development of efficient mapping algorithms. Experimentation has been performed in order to show the effectiveness of the TTIG model for a set of graphs. The results are compared with the optimal assignment and with those obtained using the TIG model, and they confirm that better assignments can be obtained using the TTIG model.
Citations: 18
ViMPIOS, a "truly" portable MPI-IO implementation
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823386
Kurt Stockinger, E. Schikuta
We present ViMPIOS, a novel MPI-IO implementation based on ViPIOS, the Vienna Parallel Input Output System. ViMPIOS inherits the defining characteristics of ViPIOS, which makes it a client-server based system focusing on cluster architectures. ViMPIOS stands out from all other MPI-IO implementations by its "truly" portable design, which allows applications not only to be transferred easily between parallel architectures but also to keep their original performance characteristics on the new platform as far as possible. This is achieved by the "smart" AI-blackboard module of ViPIOS, which is responsible for an appropriate data layout. Specifically, in this paper we concentrate on the algorithm which maps MPI-IO data structures onto the respective ViPIOS structures and thus allows the ViPIOS properties to be exploited.
Citations: 8
Consistency requirements of distributed shared memory for Lamport's bakery algorithm for mutual exclusion
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823415
J. Brzeziński, D. Wawrzyniak
As is well known, Lamport's Bakery algorithm for mutual exclusion of n processes is correct if a physically shared memory is used as the communication facility between processes. An application of weaker consistency models (e.g. causal, processor, PRAM), available in replicated distributed shared memory (DSM) systems and appealing due to possible performance improvement, may imply incorrectness of the algorithm. This raises the consistency requirement problem: the problem of finding a weaker consistency model of DSM that is sufficient for the algorithm's correctness. In this paper, consistency requirements of distributed shared memory for Lamport's Bakery algorithm for mutual exclusion of n processes are considered. It is proven that the algorithm is correct with a consistency model resulting from a combination of sequential consistency and one of the weakest consistency models, PRAM, without explicit synchronisation. The combination is achieved by specifying the consistency model with write operations on shared locations.
Citations: 2
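For readers unfamiliar with the algorithm discussed in the abstract above, here is a minimal sketch of Lamport's Bakery lock using Python threads. It assumes standard CPython, where the global interpreter lock makes these list reads and writes behave as if the memory were sequentially consistent; it therefore illustrates only the classic shared-memory case, not the weaker DSM consistency model analysed in the paper. The names `BakeryLock` and `run_demo` are illustrative.

```python
import threading
import time


class BakeryLock:
    """Lamport's Bakery algorithm for n threads, identified by index 0..n-1."""

    def __init__(self, n):
        self.n = n
        self.choosing = [False] * n  # True while thread i picks a ticket
        self.number = [0] * n        # ticket of thread i, 0 means "not waiting"

    def acquire(self, i):
        # Doorway: take a ticket larger than every ticket currently visible.
        self.choosing[i] = True
        self.number[i] = 1 + max(self.number)
        self.choosing[i] = False
        for j in range(self.n):
            if j == i:
                continue
            # Wait until thread j has finished picking its ticket.
            while self.choosing[j]:
                time.sleep(0)
            # Wait while j holds a smaller ticket (ties broken by index).
            while self.number[j] != 0 and (self.number[j], j) < (self.number[i], i):
                time.sleep(0)

    def release(self, i):
        self.number[i] = 0


def run_demo(n_threads=3, iterations=50):
    """Increment a shared counter non-atomically under the bakery lock."""
    lock = BakeryLock(n_threads)
    counter = [0]

    def worker(i):
        for _ in range(iterations):
            lock.acquire(i)
            value = counter[0]      # deliberately split read and write
            time.sleep(0)           # invite a thread switch inside the section
            counter[0] = value + 1
            lock.release(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]


if __name__ == "__main__":
    print(run_demo())
```

Because the critical section is entered by one thread at a time, no increment is lost and the final counter equals `n_threads * iterations`.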
Heterogeneous client-server architecture for a virtual meeting environment
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823396
M. Masoodian, S. Luz
Magic Lounge is a shared virtual meeting environment which has been designed to support meetings between physically remote people who would like to interact with each other using any one of a number of heterogeneous communication devices. This paper describes the heterogeneous client-server architecture of the Magic Lounge which supports communication between PCs, PDAs, palmtops, and mobile telephones. This architecture combines a number of different technologies, including CORBA and MBone, to provide the necessary means of audio and textual communication between the users of different devices. This paper also discusses the various requirements of this type of meeting environment, as well as describing some of the Magic Lounge software tools and components which have been developed to provide intelligent communication services to its users.
Citations: 5
A robust multigrid solver on parallel computers
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823393
R. Montero, M. Prieto, I. Llorente, F. Tirado
In this paper two well-known robust multigrid solvers for anisotropic operators on structured grids are compared: alternating-plane smoothers with full coarsening and plane smoothers combined with semicoarsening. The study takes into account not only numerical properties but also architectural ones, focusing on cache memory exploitation and parallel characteristics. Experimental results for the sequential algorithms have been obtained on two different systems based on the MIPS R10000 processor but with different L2 cache sizes (an SGI O2 workstation and an SGI Origin 2000 system). Two different parallel implementations for the latter robust approach have been considered. The first one has optimal parallel characteristics, but due to deterioration of the convergence properties its realistic efficiency is not satisfactory. In the second one, some processors remain idle during a short period of time on every multigrid cycle; however, the algorithm is more efficient since it preserves the numerical properties of the sequential version. Parallel experiments have also been carried out on a Cray T3E system.
Citations: 1
Predictability of bulk synchronous programs using MPI
Pub Date : 2000-01-19 DOI: 10.1109/EMPDP.2000.823402
A. Zavanella, Alessandro Milazzo
The BSP cost model provides a general framework for designing efficient and portable data-parallel algorithms. The execution costs of BSP programs are predicted by combining a limited number of program- and machine-dependent parameters. BSP programs can be written using several programming tools. In this work we explore the predictability of bulk synchronous programs implemented with the Message Passing Interface (MPI). Two classic computational geometry problems, the convex hull (CH) and the lower envelope (LE), are considered as case studies. Efficient BSP algorithms have been implemented using MPI and executed on three different parallel architectures: a Fujitsu AP1000 (distributed memory), a CRAY T3E (distributed shared memory) and a cluster of PCs (Backus). The paper compares the degree of predictability on these architectures, analysing the main sources of error.
Citations: 11
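As a rough illustration of the kind of prediction the abstract above refers to, the following sketch uses the textbook BSP cost formula, in which one superstep costs w + h·g + l (w: maximum local work on any processor, h: maximum number of words any processor sends or receives, g: per-word communication cost, l: barrier synchronisation cost). This is not the authors' code or their measured data; the class names, function names, and all numeric values are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Machine:
    """Machine-dependent BSP parameters (hypothetical values in the demo)."""
    g: float  # cost of communicating one word, in units of local operations
    l: float  # cost of one barrier synchronisation, in the same units


@dataclass(frozen=True)
class Superstep:
    """Program-dependent parameters of one superstep."""
    w: float  # maximum local computation performed by any processor
    h: int    # maximum number of words sent or received by any processor


def superstep_cost(step: Superstep, machine: Machine) -> float:
    """Textbook BSP cost of a single superstep: w + h*g + l."""
    return step.w + step.h * machine.g + machine.l


def predicted_cost(steps: Iterable[Superstep], machine: Machine) -> float:
    """Predicted cost of a program: the sum of its superstep costs."""
    return sum(superstep_cost(s, machine) for s in steps)


def relative_error(predicted: float, measured: float) -> float:
    """Relative gap between a prediction and a measured running time."""
    return abs(predicted - measured) / measured


if __name__ == "__main__":
    machine = Machine(g=2.0, l=10.0)              # assumed parameters
    program = [Superstep(w=100.0, h=5), Superstep(w=50.0, h=0)]
    total = predicted_cost(program, machine)
    print(f"predicted cost: {total}")
    print(f"relative error vs. a made-up measurement of 200: "
          f"{relative_error(total, 200.0):.2f}")
```

Comparing such a prediction with a measured time on each architecture is one simple way to express a "degree of predictability"; how the paper itself quantifies it is not stated in the abstract.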