
High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on: Latest Publications

Policies for swapping MPI processes
O. Sievert, H. Casanova
Despite the enormous amount of research and development work in the area of parallel computing, it is a common observation that simultaneous performance and ease-of-use are elusive. We believe that ease-of-use is critical for many end users, and thus seek performance-enhancing techniques that can be easily retrofitted to existing parallel applications. In a previous paper we presented MPI (Message Passing Interface) process swapping, a simple add-on to the MPI programming environment that can improve performance in shared computing environments. MPI process swapping requires as few as three lines of source code change to an existing application. In this paper we explore a question that we had left open in our previous work: based on which policies should processes be swapped for best performance? Our results show that, with adequate swapping policies, MPI process swapping can provide substantial performance benefits with very limited implementation effort.
DOI: https://doi.org/10.1109/HPDC.2003.1210020
Published: 2003-06-22
Citations: 6
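The question posed in the abstract of "Policies for swapping MPI processes" (when should a process be moved to another processor?) can be pictured with a small sketch. The abstract does not describe the policies themselves, so the function `choose_swap`, the `min_gain` threshold, and the idea of comparing the slowest active host with the fastest spare host are assumptions made purely for illustration.

```python
from typing import Dict, List, Optional, Tuple


def choose_swap(
    active: List[str],
    spare: List[str],
    rate: Dict[str, float],
    min_gain: float = 0.2,
) -> Optional[Tuple[str, str]]:
    """Return (slow_active, fast_spare) if a swap is predicted to pay off, else None.

    `rate` holds a hypothetical measured or predicted compute rate per host.
    """
    if not active or not spare:
        return None
    slowest = min(active, key=lambda h: rate[h])
    fastest = max(spare, key=lambda h: rate[h])
    # Swap only when the relative gain exceeds the threshold, so that small
    # fluctuations do not trigger swaps whose overhead outweighs the benefit.
    if rate[fastest] >= rate[slowest] * (1.0 + min_gain):
        return slowest, fastest
    return None


rates = {"a": 1.0, "b": 0.5, "c": 0.9}
decision = choose_swap(active=["a", "b"], spare=["c"], rate=rates)
```

A larger `min_gain` makes this hypothetical policy more conservative; a value of zero swaps whenever any spare host is at least as fast as the slowest active one.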
SODA: a service-on-demand architecture for application service hosting utility platforms
Xuxian Jiang, Dongyan Xu
The grid is realizing the vision of providing computation as a utility: computational jobs can be scheduled on-demand at grid hosts based on available computational capacity. In this project, we study another emerging usage of grid utility: the hosting of application services. Unlike a computational job, an application service such as an e-Laboratory or an on-line business has a longer lifetime, and performs multiple jobs requested by its clients. A service hosting utility platform (HUP) is formed by a set of hosts in the grid, and multiple application services will be hosted on the HUP. SODA is a service-on-demand architecture that enables on-demand creation of application services on a HUP. With SODA, an application service will be created in the form of a set of virtual service nodes; each node is a virtual machine which is physically a 'slice' of a real host in the HUP. SODA involves both OS and middleware techniques, and has the following salient capabilities: (1) on-demand service priming: the image of an application service as well as the OS on which it runs will be created on-demand and bootstrapped automatically; (2) better service isolation: services sharing the same HUP host are isolated with respect to administration, faults, intrusion, and resources; (3) integrated service load management: for each service, a service switch will be created to direct client requests to appropriate virtual service nodes. Moreover, the application service provider can replace the default request switching policy with a service-specific policy.
DOI: https://doi.org/10.1109/HPDC.2003.1210027
Published: 2003-06-22
Citations: 94
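The service switch with a replaceable request switching policy, mentioned in the SODA abstract, can be pictured as follows. The class `ServiceSwitch`, the round-robin default, and the `least_loaded` policy are illustrative assumptions; the abstract does not describe SODA's actual interfaces or what its default policy is.

```python
from itertools import count
from typing import Callable, Dict, List, Optional

# A policy receives the virtual service nodes and their current loads
# and returns the node that should handle the next request.
Policy = Callable[[List[str], Dict[str, int]], str]


def round_robin_policy() -> Policy:
    """Build a default policy that cycles through the nodes in order."""
    counter = count()

    def choose(nodes: List[str], loads: Dict[str, int]) -> str:
        return nodes[next(counter) % len(nodes)]

    return choose


def least_loaded(nodes: List[str], loads: Dict[str, int]) -> str:
    """A service-specific policy: pick the node with the fewest routed requests."""
    return min(nodes, key=lambda n: loads[n])


class ServiceSwitch:
    """Hypothetical per-service switch that directs requests to virtual nodes."""

    def __init__(self, nodes: List[str], policy: Optional[Policy] = None) -> None:
        self.nodes = list(nodes)
        self.loads = {n: 0 for n in self.nodes}
        self.policy = policy or round_robin_policy()

    def route(self, request: str) -> str:
        node = self.policy(self.nodes, self.loads)
        self.loads[node] += 1
        return node


# Default behaviour: requests alternate between the two nodes.
switch = ServiceSwitch(["vnode-a", "vnode-b"])
default_routes = [switch.route(f"req-{i}") for i in range(4)]

# Provider-supplied policy: the busier node is avoided.
custom = ServiceSwitch(["vnode-a", "vnode-b"], policy=least_loaded)
custom.loads["vnode-a"] = 5
custom_route = custom.route("req-x")
```

Passing the policy as a plain callable keeps the switch itself unchanged when a provider substitutes its own routing logic, which is the property the abstract highlights.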
Flexible information discovery in decentralized distributed systems
C. Schmidt, M. Parashar
The ability to efficiently discover information using partial knowledge (for example keywords, attributes or ranges) is important in large, decentralized, resource sharing distributed environments such as computational grids and peer-to-peer (P2P) storage and retrieval systems. This paper presents a P2P information discovery system that supports flexible queries using partial keywords and wildcards, and range queries. It guarantees that all existing data elements that match a query are found with bounded costs in terms of number of messages and number of peers involved. The key innovation is a dimension reducing indexing scheme that effectively maps the multidimensional information space to physical peers. The design, implementation and experimental evaluation of the system are presented.
DOI: https://doi.org/10.1109/HPDC.2003.1210032
Published: 2003-06-22
Citations: 201
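To make the idea of a dimension-reducing index from "Flexible information discovery in decentralized distributed systems" more concrete, the sketch below maps a two-dimensional key to a one-dimensional index and then to a peer. The abstract does not say which mapping is used, so the Z-order (Morton) curve, the names `morton2d` and `peer_for`, and the equal-width index ranges are assumptions swapped in for illustration only.

```python
def morton2d(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y into one integer (Z-order)."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i)       # x bits go to even positions
        index |= ((y >> i) & 1) << (2 * i + 1)   # y bits go to odd positions
    return index


def peer_for(index: int, num_peers: int, bits: int = 8) -> int:
    """Assign a 1-D index to a peer by splitting the index space into equal ranges."""
    space = 1 << (2 * bits)  # total number of distinct 1-D indices
    return index * num_peers // space


# Each coordinate could stand for one keyword or attribute value
# encoded as a small integer.
example_index = morton2d(3, 5)
example_peer = peer_for(example_index, num_peers=4)
```

Because each peer owns a contiguous range of the one-dimensional index, a query over a region of the original space turns into a set of index ranges, and only the peers owning those ranges need to be contacted.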
Policy driven heterogeneous resource co-allocation with Gangmatching
Rajesh Raman, M. Livny, M. Solomon
Dynamic, heterogeneous and distributively owned resource environments present unique challenges to the problems of resource representation, allocation and management. Conventional resource management methods that rely on static models of resource allocation policy and behavior fail to address these challenges. We previously argued that Matchmaking provides an elegant and robust solution to resource management in such dynamic and federated environments. However, Matchmaking is limited by its purely bilateral formalism of matching a single customer with a single resource, precluding more advanced resource management services such as co-allocation. In this paper, we present Gangmatching, a multilateral extension to the Matchmaking model, and discuss the Gangmatching model and its associated implementation and performance issues in the context of a real-world license management co-allocation problem.
DOI: https://doi.org/10.1109/HPDC.2003.1210018
Published: 2003-06-22
Citations: 124
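The difference between a bilateral match and the multilateral match described in the Gangmatching abstract can be illustrated with a small sketch. The dictionaries, the `find_gang` helper, and the constraint functions below are hypothetical; they are neither the ClassAd language nor the authors' algorithm, and only show a request whose resources must be satisfied together, as in the license co-allocation scenario.

```python
from itertools import product
from typing import Callable, Dict, List, Optional

Ad = Dict[str, object]


def find_gang(
    ports: Dict[str, Callable[[Ad], bool]],
    joint: Callable[[Dict[str, Ad]], bool],
    offers: List[Ad],
) -> Optional[Dict[str, Ad]]:
    """Return one offer per port such that all per-port and joint constraints hold."""
    names = list(ports)
    # Candidates per port: offers that satisfy that port's own constraint.
    candidates = [[o for o in offers if ports[n](o)] for n in names]
    for combo in product(*candidates):
        # An offer may fill at most one port.
        if len({id(o) for o in combo}) < len(combo):
            continue
        gang = dict(zip(names, combo))
        if joint(gang):
            return gang
    return None


offers = [
    {"type": "machine", "name": "m1", "memory_mb": 256},
    {"type": "machine", "name": "m2", "memory_mb": 1024},
    {"type": "license", "app": "sim", "valid_hosts": ["m2"]},
]

gang = find_gang(
    ports={
        "cpu": lambda o: o["type"] == "machine" and o["memory_mb"] >= 512,
        "license": lambda o: o["type"] == "license" and o["app"] == "sim",
    },
    # Joint constraint: the license must be valid on the chosen machine.
    joint=lambda g: g["cpu"]["name"] in g["license"]["valid_hosts"],
    offers=offers,
)
```

The exhaustive search over combinations is chosen for brevity; it grows quickly with the number of ports and offers, which hints at why the abstract lists implementation and performance issues as a topic of its own.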
Performance analysis of scheduling and replication algorithms on Grid Datafarm architecture for high-energy physics applications
A. Takefusa, O. Tatebe, S. Matsuoka, Y. Morita
Data Grid is a Grid for ubiquitous access and analysis of large-scale data. Because Data Grid is in the early stages of development, the performance of its petabyte-scale models in a realistic data processing setting has not been well investigated. By enhancing our Bricks Grid simulator to accommodate Data Grid scenarios, we investigate and compare the performance of different Data Grid models. These are categorized mainly as either central or tier models; they employ various scheduling and replication strategies under realistic assumptions of job processing for CERN LHC experiments on the Grid Datafarm system. Our results show that the central model is efficient but that the tier model, with its greater resources and its speculative class of background replication policies, is quite effective and achieves higher performance, although each tier is smaller than the central model.
DOI: https://doi.org/10.1109/HPDC.2003.1210014
Published: 2003-06-22
Citations: 60
Security for Grid services
Von Welch, F. Siebenlist, Ian T Foster, J. Bresnahan, K. Czajkowski, Jarek Gawor, C. Kesselman, Sam Meder, L. Pearlman, S. Tuecke
Grid computing is concerned with the sharing and coordinated use of diverse resources in distributed "virtual organizations." The dynamic and multi-institutional nature of these environments introduces challenging security issues that demand new technical approaches. In particular, one must deal with diverse local mechanisms, support dynamic creation of services, and enable dynamic creation of trust domains. We describe how these issues are addressed in two generations of the Globus Toolkit®. First, we review the Globus Toolkit version 2 (GT2) approach; then we describe new approaches developed to support the Globus Toolkit version 3 (GT3) implementation of the Open Grid Services Architecture, an initiative that is recasting Grid concepts within a service-oriented framework based on Web services. GT3's security implementation uses Web services security mechanisms for credential exchange and other purposes, and introduces a tight least-privilege model that avoids the need for any privileged network service.
DOI: https://doi.org/10.1109/HPDC.2003.1210015
Published: 2003-06-22
Citations: 584
Dynamic virtual clusters in a grid site manager
J. Chase, David E. Irwin, Laura E. Grit, Justin D. Moore, Sara Sprenkle
This paper presents new mechanisms for dynamic resource management in a cluster manager called Cluster-on-Demand (COD). COD allocates servers from a common pool to multiple virtual clusters (vclusters), with independently configured software environments, name spaces, user access controls, and network storage volumes. We present experiments using the popular Sun GridEngine batch scheduler to demonstrate that dynamic virtual clusters are an enabling abstraction for advanced resource management in computing utilities and grids. In particular, they support dynamic, policy-based cluster sharing between local users and hosted Grid services, resource reservation and adaptive provisioning, scavenging of the idle resources, and dynamic instantiation of Grid services. These goals are achieved in a direct and general way through a new set of fundamental cluster management functions, with minimal impact on the Grid middleware itself.
DOI: https://doi.org/10.1109/HPDC.2003.1210019
Published: 2003-06-22
Citations: 342
Grid workflow: a flexible failure handling framework for the grid
Soonwook Hwang, C. Kesselman
The generic, heterogeneous, and dynamic nature of the grid requires a new form of failure recovery mechanism to address its unique requirements such as support for diverse failure handling strategies, separation of failure handling strategies from application codes, and user-defined exception handling. Here we propose a grid workflow system (grid-WFS), a flexible failure handling framework for the grid, which addresses these grid-unique failure recovery requirements. Central to the framework is flexibility by the use of workflow structure as a high-level recovery policy specification. We show how this use of high-level workflow structure allows users to achieve failure recovery in a variety of ways depending on the requirements and constraints of their applications. We also demonstrate that this use of workflow structure enables users to not only rapidly prototype and investigate failure handling strategies, but also easily change them by simply modifying the encompassing workflow structure, while the application code remains intact. Finally, we present an experimental evaluation of our framework using a simulation, demonstrating the value of supporting multiple failure recovery techniques in grid systems to achieve high performance in the presence of failures.
DOI: https://doi.org/10.1109/HPDC.2003.1210023
Published: 2003-06-22
Citations: 185
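The separation of failure handling from application code, which the grid workflow abstract emphasizes, can be sketched as below. The `Task` structure, its `retries` and `alternative` fields, and `run_task` are hypothetical names chosen for this sketch and are not the Grid-WFS specification language; they only show a recovery policy declared in the workflow description while the task functions themselves stay unchanged.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Task:
    name: str
    action: Callable[[], str]
    retries: int = 0                       # recovery option 1: retry the same task
    alternative: Optional["Task"] = None   # recovery option 2: fall back to another task


def run_task(task: Task) -> str:
    """Run a task, applying whatever recovery policy is declared on it."""
    for _ in range(task.retries + 1):
        try:
            return task.action()
        except Exception:
            continue  # try again until the attempts are exhausted
    if task.alternative is not None:
        return run_task(task.alternative)
    raise RuntimeError(f"task {task.name} failed with no recovery option left")


calls = {"fast": 0}


def flaky_fast_solver() -> str:
    """Application code: knows nothing about retries or fallbacks."""
    calls["fast"] += 1
    raise IOError("host unreachable")


def slow_solver() -> str:
    return "result from slow solver"


# The recovery strategy lives entirely in the workflow description.
workflow = Task(
    name="fast",
    action=flaky_fast_solver,
    retries=2,
    alternative=Task(name="slow", action=slow_solver),
)
outcome = run_task(workflow)
```

Changing the strategy, for example dropping the retries or adding a second fallback, only requires editing the `workflow` value; `flaky_fast_solver` and `slow_solver` are left untouched.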
Impact of high performance sockets on data intensive applications
P. Balaji, Jiesheng Wu, T. Kurç, Ümit V. Çatalyürek, D. Panda, J. Saltz
The challenging issues in supporting data intensive applications on clusters include efficient movement of large volumes of data between processor memories and efficient coordination of data movement and processing by runtime support to achieve high performance. Such applications have several requirements such as performance guarantees, scalability with these guarantees and adaptability to heterogeneous environments. With the advent of user-level protocols like the Virtual Interface Architecture (VIA) and the modern InfiniBand Architecture, the latency and bandwidth experienced by applications have approached those of the physical network on clusters. In order to enable applications written on top of TCP/IP to take advantage of the high performance of these user-level protocols, researchers have come up with a number of techniques including User Level Sockets Layers over high performance protocols. In this paper, we study the performance and limitations of such a substrate, referred to here as SocketVIA, using a component framework designed to provide runtime support for data intensive applications. The experimental results show that by reorganizing certain components of an application (in our case, the partitioning of a dataset into smaller data chunks), we can make significant improvements in application performance. This leads to a higher scalability of applications with performance guarantees. It also allows fine-grained load balancing, hence making applications more adaptable to heterogeneity in resource availability. The experimental results also show that the different performance characteristics of SocketVIA allow a more efficient partitioning of data at the source nodes, thus improving the performance of the application by up to an order of magnitude in some cases.
在集群上支持数据密集型应用程序的挑战性问题包括在处理器内存之间有效地移动大量数据,以及通过运行时支持有效地协调数据移动和处理以实现高性能。这样的应用程序有几个需求,比如性能保证、这些保证的可伸缩性以及对异构环境的适应性。随着用户级协议(如虚拟接口体系结构(VIA)和现代InfiniBand体系结构)的出现,应用程序所经历的延迟和带宽已经接近集群上物理网络的延迟和带宽。为了使在TCP/IP之上编写的应用程序能够利用这些用户级协议的高性能,研究人员提出了许多技术,包括在高性能协议之上的用户级套接字层。在本文中,我们研究了这种基板的性能和局限性,这里称为SocketVIA,使用一个组件框架,旨在为数据密集型应用程序提供运行时支持。实验结果表明,通过重新组织应用程序的某些组件(在我们的例子中,将数据集划分为更小的数据块),我们可以显著提高应用程序的性能。这将导致具有性能保证的应用程序具有更高的可伸缩性。它还支持细粒度负载平衡,从而使应用程序更适应资源可用性的异构性。实验结果还表明,SocketVIA的不同性能特征允许在源节点上更有效地划分数据,从而在某些情况下将应用程序的性能提高到一个数量级。
{"title":"Impact of high performance sockets on data intensive applications","authors":"P. Balaji, Jiesheng Wu, T. Kurç, Ümit V. Çatalyürek, D. Panda, J. Saltz","doi":"10.1109/HPDC.2003.1210013","DOIUrl":"https://doi.org/10.1109/HPDC.2003.1210013","url":null,"abstract":"The challenging issues in supporting data intensive applications on clusters include efficient movement of large volumes of data between processor memories and efficient coordination of data movement and processing by a runtime support to achieve high performance. Such applications have several requirements such as guarantees in performance, scalability with these guarantees and adaptability to heterogeneous environments. With the advent of user-level protocols like the Virtual Interface Architecture (VIA) and the modern InfiniBand Architecture, the latency and bandwidth experienced by applications has approached to that of the physical network on clusters. In order to enable applications written on top of TCP/IP to take advantage of the high performance of these user-level protocols, researchers have come up with a number of techniques including User Level Sockets Layers over high performance protocols. In this paper, we study the performance and limitations of such substrate, referred to here as SocketVIA, using a component framework designed to provide runtime support for data intensive applications. The experimental results show that by reorganizing certain components of an application (in our case, the partitioning of a dataset into smaller data chunks), we can make significant improvements in application performance. This leads to a higher scalability of applications with performance guarantees. It also allows fine grained load balancing, hence making applications more adaptable to heterogeneity in resource availability. 
The experimental results also show that the different performance characteristics of SocketVIA allow a more efficient partitioning of data at the source nodes, thus improving the performance of the application up to an order of magnitude in some cases.","PeriodicalId":430378,"journal":{"name":"High Performance Distributed Computing, 2003. Proceedings. 12th IEEE International Symposium on","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128232757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 25
Using views for customizing reusable components in component-based frameworks
A. Ivan, V. Karamcheti
Increasingly, scalable distributed applications are being constructed by integrating reusable components spanning multiple administrative domains. Dynamic composition and deployment of such applications enables flexible QoS-aware adaptation to changing client and network characteristics. However, dynamic deployment across multiple administrative domains needs to perform cross-domain authentication and authorization, and satisfy various network and application-level constraints that may only be expressed in terms meaningful within a particular domain. Our solution to these problems, developed as part of the partitionable services framework, integrates a decentralized trust management and access control system (dRBAC) with a programming and run-time abstraction (object views). dRBAC encodes statements within and across domains using cryptographically signed credentials, providing a unifying and powerful mechanism for cross-domain authorization and expression of network and application constraints. Views define multiple implementations of a reusable component, thus enriching the set of components available for dynamic deployment and enabling fine-grained, customizable access control. We describe the runtime support for views, which consists of a view generator (VIG) and a host-level communication resource (Switchboard) for creating secure channels between pairs of components. We present a simple mail application to illustrate how dRBAC, views, and Switchboard can be used to customize reusable components and securely deploy them in heterogeneous environments.
DOI: https://doi.org/10.1109/HPDC.2003.1210029 (published 2003-06-22)
Citations: 7