With the increasing functionality and complexity of distributed systems, resource failures are inevitable. While numerous models and algorithms for dealing with failures exist, the lack of public trace data sets and tools has prevented meaningful comparisons. To facilitate the design, validation, and comparison of fault-tolerant models and algorithms, we have created the Failure Trace Archive (FTA) as an online public repository of availability traces taken from diverse parallel and distributed systems. Our main contributions in this study are the following. First, we describe the design of the archive, in particular the rationale of the standard FTA format, and the design of a toolbox that facilitates automated analysis of trace data sets. Second, applying the toolbox, we present a uniform comparative analysis with statistics and models of failures in nine distributed systems. Third, we show how different interpretations of these data sets can result in different conclusions. This emphasizes the critical need for the public availability of trace data and methods for their analysis.
{"title":"The Failure Trace Archive: Enabling Comparative Analysis of Failures in Diverse Distributed Systems","authors":"Derrick Kondo, Bahman Javadi, A. Iosup, D. Epema","doi":"10.1109/CCGRID.2010.71","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.71","url":null,"abstract":"With the increasing functionality and complexity of distributed systems, resource failures are inevitable. While numerous models and algorithms for dealing with failures exist, the lack of public trace data sets and tools has prevented meaningful comparisons. To facilitate the design, validation, and comparison of fault-tolerant models and algorithms, we have created the Failure Trace Archive (FTA) as an online public repository of availability traces taken from diverse parallel and distributed systems. Our main contributions in this study are the following. First, we describe the design of the archive, in particular the rationale of the standard FTA format, and the design of a toolbox that facilitates automated analysis of trace data sets. Second, applying the toolbox, we present a uniform comparative analysis with statistics and models of failures in nine distributed systems. Third, we show how different interpretations of these data sets can result in different conclusions. 
This emphasizes the critical need for the public availability of trace data and methods for their analysis.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117175771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
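The archive's value comes from making basic availability statistics reproducible across traces. As an illustrative sketch only (this abstract does not specify the actual FTA format or toolbox), the following computes mean time between failures (MTBF) and mean time to repair (MTTR) from hypothetical per-node availability intervals:

```python
from statistics import mean

def failure_stats(intervals):
    """Compute (MTBF, MTTR) from per-node availability intervals.

    `intervals` maps node_id -> sorted list of (start, end) tuples,
    each a period during which the node was available. This record
    shape is an assumption for illustration, not the FTA format.
    """
    uptimes, downtimes = [], []
    for spans in intervals.values():
        for (s0, e0), (s1, _) in zip(spans, spans[1:]):
            uptimes.append(e0 - s0)    # length of an availability span
            downtimes.append(s1 - e0)  # gap until the node returned
        if spans:                      # last span has no following gap
            uptimes.append(spans[-1][1] - spans[-1][0])
    return (mean(uptimes), mean(downtimes) if downtimes else 0.0)
```

For one node available during [0, 10) and [15, 30), this yields uptimes of 10 and 15 (MTBF 12.5) and a single repair gap of 5.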
Our on-going project, Unibus, aims to facilitate the provisioning and aggregation of multifaceted resources from both resource providers' and end-users' perspectives. To achieve that, Unibus proposes (1) the Capability Model and mediators (resource drivers) to virtualize access to diverse resources, and (2) soft and successive conditioning to enable automatic and user-transparent resource provisioning. In this paper we examine the Unibus concepts and prototype in a real scenario: the aggregation of two commercial clouds and the execution of benchmarks on the aggregated resources. We also present and discuss the benchmark results.
{"title":"Unibus-managed Execution of Scientific Applications on Aggregated Clouds","authors":"Jaroslaw Slawinski, M. Slawinska, V. Sunderam","doi":"10.1109/CCGRID.2010.53","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.53","url":null,"abstract":"Our on-going project, Unibus, aims to facilitate provisioning and aggregation of multifaceted resources from resource providers and end-users’ perspectives. To achieve that, Unibus proposes (1) the Capability Model and mediators (resource drivers) to virtualize access to diverse resources, and (2) soft and successive conditioning to enable automatic and user-transparent resource provisioning. In this paper we examine the Unibus concepts and prototype in a real situation of aggregation of two commercial clouds and execution of benchmarks on aggregated resources. We also present and discuss benchmarks’ results.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121160905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We are developing an efficient resource management system with aggressive virtual machine (VM) relocation among physical nodes in a data center. Existing live migration technology, however, requires a long time to change the execution host of a VM, making it difficult to dynamically optimize VM packing on physical nodes in response to ever-changing resource usage. In this paper, we propose an advanced live migration mechanism enabling instantaneous relocation of VMs. To minimize the time needed for switching the execution host, memory pages are transferred after a VM resumes at the destination host. A special character device driver allows transparent memory page retrieval from the source host for the VM running at the destination. In comparison with related work, the proposed mechanism supports guest operating systems without any modifications to them (i.e., no special device drivers or programs are needed in VMs). It is implemented as a lightweight extension to KVM (Kernel-based Virtual Machine), and no modifications to critical parts of the VMM code are required. Experiments were conducted using the SPECweb2005 benchmark. A running VM with heavily loaded web servers was successfully relocated to a destination host within one second. Temporary performance degradation after relocation was resolved by means of a precaching mechanism for memory pages. In addition, for memory-intensive workloads, our migration mechanism moved the entire state of a VM faster than existing migration technology.
{"title":"Enabling Instantaneous Relocation of Virtual Machines with a Lightweight VMM Extension","authors":"Takahiro Hirofuchi, H. Nakada, S. Itoh, S. Sekiguchi","doi":"10.1109/CCGRID.2010.42","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.42","url":null,"abstract":"We are developing an efficient resource management system with aggressive virtual machine (VM) relocation among physical nodes in a data center. Existing live migration technology, however, requires a long time to change the execution host of a VM, it is difficult to optimize VM packing on physical nodes dynamically, corresponding to ever-changing resource usage. In this paper, we propose an advanced live migration mechanism enabling instantaneous relocation of VMs. To minimize the time needed for switching the execution host, memory pages are transferred after a VM resumes at a destination host. A special character device driver allows transparent memory page retrievals from a source host for the running VM at the destination. In comparison with related work, the proposed mechanism supports guest operating systems without any modifications to them (i.e, no special device drivers and programs are needed in VMs). It is implemented as a lightweight extension to KVM (Kernel-based Virtual Machine Monitor). It is not required to modify critical parts of the VMM code. Experiments were conducted using the SPECweb2005 benchmark. A running VM with heavily-loaded web servers was successfully relocated to a destination within one second. Temporal performance degradation after relocation was resolved by means of a precaching mechanism for memory pages. 
In addition, for memory intensive workloads, our migration mechanism moved all the states of a VM faster than existing migration technology.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122403188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
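The core idea, resuming the VM first and transferring memory afterwards, can be sketched as a toy model. The page store, on-demand fetch, and precaching below are illustrative stand-ins for the paper's character-device mechanism, not its implementation:

```python
class PostCopyMemory:
    """Toy model of post-copy relocation: the VM resumes at the
    destination with an empty local page store, fetches pages from
    the source on first access, and a background precache loop pulls
    the remainder to avoid repeated on-demand stalls."""

    def __init__(self, source_pages):
        self.source = dict(source_pages)  # pages still on source host
        self.local = {}                   # pages present at destination

    def read(self, page_no):
        if page_no not in self.local:     # "page fault": fetch on demand
            self.local[page_no] = self.source.pop(page_no)
        return self.local[page_no]

    def precache(self, batch=2):
        """Proactively transfer up to `batch` remaining pages."""
        for page_no in list(self.source)[:batch]:
            self.local[page_no] = self.source.pop(page_no)
```

The host switch is instantaneous because only control state moves before resume; memory follows lazily, which is why precaching matters for the post-relocation dip the authors observed.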
We propose several algorithms for topology aggregation (TA) to effectively summarize large-scale networks. These TA techniques are shown to perform significantly better for path requests in e-Science, which may consist of simultaneous reservation of multiple paths and/or simultaneous reservation for multiple requests. Our extensive simulations demonstrate the benefits of our algorithms in terms of both accuracy and performance.
{"title":"Topology Aggregation for E-science Networks","authors":"Eun-Sung Jung, S. Ranka, S. Sahni","doi":"10.1109/CCGRID.2010.113","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.113","url":null,"abstract":"We propose several algorithms for topology aggregation (TA) to effectively summarize large-scale networks. These TA techniques are shown to significantly better for path requests in e-Science that may consist of simultaneous reservation of multiple paths and/or simultaneous reservation for multiple requests. Our extensive simulation demonstrates the benefits of our algorithms both in terms of accuracy and performance.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122547108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recently, improving the energy efficiency of high-performance PC clusters has become important. To reduce the energy consumption of the microprocessor, many high-performance microprocessors provide a Dynamic Voltage and Frequency Scaling (DVFS) mechanism. This paper proposes a new DVFS method called the Code-Instrumented Runtime (CI-Runtime) DVFS method, in which a combination of voltage and frequency, called a P-State, is managed by instrumented code at runtime. The proposed CI-Runtime DVFS method achieves better energy savings than the interrupt-based runtime DVFS method, since it selects the appropriate P-State in each defined region based on the characteristics of program execution. Moreover, the proposed CI-Runtime DVFS method is more practical than the static DVFS method, since it does not require exhaustive profiles for each P-State. The method consists of two parts. In the first part, the instrumented code is inserted by defining regions that have almost the same characteristics. The instrumented code must be inserted at appropriate points, because application performance degrades greatly if the instrumented code is called too many times in a short period. A method for automatically defining regions is proposed in this paper. The second part is the energy adaptation algorithm used at runtime. Two types of DVFS control algorithms, energy adaptation with estimated energy consumption and energy adaptation with only performance information, are compared. The proposed CI-Runtime DVFS method was implemented on a power-scalable PC cluster. The results show that CI-Runtime with energy adaptation using estimated energy consumption achieves an energy saving of 14.2%, which is close to the optimal value, without obtaining exhaustive profiles for every available P-State setting.
{"title":"Runtime Energy Adaptation with Low-Impact Instrumented Code in a Power-Scalable Cluster System","authors":"Hideaki Kimura, Takayuki Imada, M. Sato","doi":"10.1109/CCGRID.2010.70","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.70","url":null,"abstract":"Recently, improving the energy efficiency of high performance PC clusters has become important. In order to reduce the energy consumption of the microprocessor, many high performance microprocessors have a Dynamic Voltage and Frequency Scaling (DVFS) mechanism. This paper proposes a new DVFS method called the Code-Instrumented Runtime (CIRuntime) DVFS method, in which a combination of voltage and frequency, which is called a P-State, is managed in the instrumented code at runtime. The proposed CI-Runtime DVFS method achieves better energy saving than the Interrupt based Runtime DVFS method, since it selects the appropriate P-State in each defined region based on the characteristics of program execution. Moreover, the proposed CI-Runtime DVFS method is more useful than the Static DVFS method, since it does not acquire exhaustive profiles for each P-State. The method consists of two parts. In the first part of the proposed CI-Runtime DVFS method, the instrumented codes are inserted by defining regions that have almost the same characteristics. The instrumented code must be inserted at the appropriate point, because the performance of the application decreases greatly if the instrumented code is called too many times in a short period. A method for automatically defining regions is proposed in this paper. The second part of the proposed method is the energy adaptation algorithm which is used at runtime. Two types of DVFS control algorithms energy adaptation with estimated energy consumption and energy adaptation with only performance information, are compared. The proposed CIRuntime DVFS method was implemented on a power-scalable PC cluster. 
The results show that the proposed CI-Runtime with energy adaptation using estimated energy consumption could achieve an energy saving of 14.2% which is close to the optimal value, without obtaining exhaustive profiles for every available P-State setting.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117108626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
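Per-region P-State selection can be illustrated with a common analytic energy model in which only the CPU-bound fraction of a region's runtime stretches as frequency drops. This estimator is a hypothetical stand-in, not necessarily the one the paper uses:

```python
def pick_pstate(region_profile, pstates):
    """Choose the P-State minimizing estimated energy for one
    instrumented region.

    `region_profile` is (base_runtime_at_fmax, cpu_bound_fraction);
    each P-State is (freq_ghz, power_watts). Both shapes are
    illustrative assumptions."""
    base_t, cpu_fraction = region_profile
    f_max = max(f for f, _ in pstates)
    best = None
    for f, p in pstates:
        # CPU-bound work stretches as f drops; memory-bound work does not
        t = base_t * (cpu_fraction * f_max / f + (1 - cpu_fraction))
        energy = p * t
        if best is None or energy < best[0]:
            best = (energy, f)
    return best[1]
```

Under this model a memory-bound region (cpu_fraction 0.2) is driven to the low-power P-State, while a fully CPU-bound region stays at top frequency because the runtime stretch outweighs the power saving.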
Key/value stores built on structured overlay networks often lack support for atomic transactions and strong data consistency among replicas. This is unfortunate, because consistency guarantees and transactions would allow a wide range of additional application domains to benefit from the inherent scalability and fault tolerance of DHTs. The Scalaris key/value store supports strong data consistency and atomic transactions. It uses an enhanced Paxos Commit protocol with only four communication steps rather than six. This improvement was made possible by exploiting information about the replica distribution in the DHT. Scalaris enables the implementation of more reliable and scalable infrastructures for collaborative Web services that require strong consistency and atomic changes across multiple items.
{"title":"Enhanced Paxos Commit for Transactions on DHTs","authors":"F. Schintke, A. Reinefeld, Seif Haridi, T. Schütt","doi":"10.1109/CCGRID.2010.41","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.41","url":null,"abstract":"Key/value stores which are built on structured overlay networks often lack support for atomic transactions and strong data consistency among replicas. This is unfortunate, because consistency guarantees and transactions would allow a wide range of additional application domains to benefit from the inherent scalability and fault-tolerance of DHTs. The Scalaris key/value store supports strong data consistency and atomic transactions. It uses an enhanced Paxos Commit protocol with only four communication steps rather than six. This improvement was possible by exploiting information from the replica distribution in the DHT. Scalaris enables implementation of more reliable and scalable infrastructure for collaborative Web services that require strong consistency and atomic changes across multiple items.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"308 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115551565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Although shared-memory programming models offer good programmability compared to message-passing programming models, their implementation by page-based software distributed shared memory systems usually suffers from high memory consistency costs. The major part of these costs is inter-node data transfer for keeping the virtual shared memory consistent. A good prefetch strategy can reduce this cost. We develop two prefetch techniques, TReP and HReP, which are based on the execution history of each parallel region. These techniques are evaluated using offline simulations with the NAS Parallel Benchmarks and the LINPACK benchmark. On average, TReP achieves an efficiency (the ratio of prefetched pages that were subsequently accessed) of 96% and a coverage (the ratio of access faults avoided by prefetches) of 65%. HReP achieves an efficiency of 91% but a coverage of 79%. Treating the cost of an incorrectly prefetched page as equivalent to that of a miss, these techniques have effective page miss rates of 63% and 71%, respectively. Additionally, these two techniques are compared with two well-known software distributed shared memory (sDSM) prefetch techniques, Adaptive++ and TODFCM. TReP effectively reduces the page miss rate by 53% and 34% more, and HReP by 62% and 43% more, compared to Adaptive++ and TODFCM, respectively. Like Adaptive++, these techniques also permit bulk prefetching for pages predicted using temporal locality, amortizing network communication costs and permitting bandwidth improvements from multi-rail network interfaces.
{"title":"Region-Based Prefetch Techniques for Software Distributed Shared Memory Systems","authors":"Jie Cai, P. Strazdins, Alistair P. Rendell","doi":"10.1109/CCGRID.2010.16","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.16","url":null,"abstract":"Although shared memory programming models show good programmability compared to message passing programming models, their implementation by page-based software distributed shared memory systems usually suffers from high memory consistency costs. The major part of these costs is inter-node data transfer for keeping virtual shared memory consistent. A good prefetch strategy can reduce this cost. We develop two prefetch techniques, TReP and HReP, which are based on the execution history of each parallel region. These techniques are evaluated using offline simulations with the NAS Parallel Benchmarks and the LINPACK benchmark. On average, TReP achieves an efficiency (ratio of pages prefetched that were subsequently accessed) of 96% and a coverage (ratio of access faults avoided by prefetches) of 65%. HReP achieves an efficiency of 91% but has a coverage of 79%. Treating the cost of an incorrectly prefetched page to be equivalent to that of a miss, these techniques have an effective page miss rate of 63% and 71% respectively. Additionally, these two techniques are compared with two well-known software distributed shared memory (sDSM) prefetch techniques, Adaptive++ and TODFCM. TReP effectively reduces page miss rate by 53% and 34% more, and HReP effectively reduces page miss rate by 62% and 43% more, compared to Adaptive++ and TODFCM respectively. 
As for Adaptive++, these techniques also permit bulk prefetching for pages predicted using temporal locality, amortizing network communication costs and permitting bandwidth improvement from multi-rail network interfaces.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115945758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
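The efficiency and coverage metrics quoted above have straightforward set-based definitions, sketched here for concreteness:

```python
def prefetch_metrics(prefetched, accessed):
    """Efficiency = fraction of prefetched pages later accessed;
    coverage = fraction of would-be page faults avoided by a
    prefetch. These are the two metrics quoted in the abstract."""
    prefetched, accessed = set(prefetched), set(accessed)
    useful = prefetched & accessed
    efficiency = len(useful) / len(prefetched) if prefetched else 0.0
    coverage = len(useful) / len(accessed) if accessed else 0.0
    return efficiency, coverage
```

Prefetching pages {1, 2, 3, 4} against actual accesses {1, 2, 3, 5, 6} gives an efficiency of 0.75 (one wasted transfer) and a coverage of 0.6 (two faults not avoided), showing how the two metrics pull in different directions.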
In the last decade or so, clusters have seen a tremendous rise in popularity due to their excellent price-to-performance ratio. A variety of interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged to meet the requirements of various applications. To support these requirements, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected. Among them, the reliable connection (RC) semantics is widely used due to its high performance and support for features like Remote Direct Memory Access (RDMA), hardware atomics, and network fault tolerance. However, the pairwise, connection-oriented nature of the RC transport limits its scalability and usability at increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provide a shared-memory abstraction and one-sided communication semantics and have led to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach in the Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI On-Demand Connection Management (ARMCI-ODCM), using various micro-benchmarks, benchmarks (LU factorization, Random Access, and Lennard-Jones simulation), and an application (Subsurface Transport Over Multiple Phases, STOMP). With a performance evaluation on up to 4096 processors, we achieve a multi-fold reduction in connection memory with negligible performance degradation. With STOMP on 4096 processors, ARMCI-ODCM reduces overall connection memory by a factor of 66 with no performance degradation. To the best of our knowledge, this is the first design, implementation, and evaluation of on-demand connection management with InfiniBand using PGAS models.
{"title":"Efficient On-Demand Connection Management Mechanisms with PGAS Models over InfiniBand","authors":"Abhinav Vishnu, M. Krishnan","doi":"10.1109/CCGRID.2010.58","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.58","url":null,"abstract":"In the last decade or so, clusters have observed a tremendous rise in popularity due to the excellent price to performance ratio. A variety of Interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged in order to meet the requirements of various applications and their programming models. To support requirements of multiple programming models, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected characteristics. Among them, the reliable connection (RC) semantics is being widely used due to its high performance and support for novel features like Remote Direct Memory Acesss (RDMA), hardware atomics and Network Fault Tolerance. However, the pair wise connection oriented nature of the RC transport semantics limits its scalability and usage at the increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provided shared memory abstraction and one-sided communication semantics, leading to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach with Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI-On Demand Connection Management (ARMCI-ODCM) using various micro benchmarks and benchmarks (LU Factorization, Random-Access and Lennard Jones simulation) and application (Subsurface transport over multiple phases (STOMP)). 
With the performance evaluation for up to 4096 processors, we are able to have a multi-fold reduction in connection memory with a negligible degradation in performance. Using STOMP at 4096 processors, reduces the overall connection memory by 66 times with no performance degradation. To the best of our knowledge, this is the first design, implementation and evaluation of on-demand connection management with InfiniBand using PGAS models.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126636427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
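The essence of on-demand connection management is to defer connection setup until two processes actually communicate, so connection memory scales with the communication graph rather than with the full N x N mesh. A minimal sketch, with a hypothetical connect function standing in for RC queue-pair setup:

```python
class OnDemandConnections:
    """Lazily open a connection to a peer on first communication
    instead of eagerly building the all-pairs mesh. `connect_fn`
    is a hypothetical stand-in for expensive transport setup
    (e.g. creating an RC queue pair)."""

    def __init__(self, connect_fn):
        self.connect_fn = connect_fn
        self.conns = {}               # peer -> established connection

    def send(self, peer, payload):
        conn = self.conns.get(peer)
        if conn is None:              # first message to this peer
            conn = self.conns[peer] = self.connect_fn(peer)
        return conn                   # a real runtime would transmit payload
```

After any number of sends, `len(self.conns)` equals the number of distinct peers actually contacted, which is the memory saving the paper quantifies (66x for STOMP at 4096 processors).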
Grid users may experience inconsistent performance due to specific characteristics of grids, such as fluctuating workloads, high failure rates, and high resource heterogeneity. Although extensive research has been done on grids, providing consistent performance remains largely an unsolved problem. In this study we use overdimensioning, a simple but cost-ineffective solution, to address the performance inconsistency problem in grids. To this end, we propose several overdimensioning strategies, and we evaluate these strategies through simulations with workloads consisting of Bags-of-Tasks. We find that although overdimensioning is a simple solution, it is a viable one for providing consistent performance in grids.
{"title":"Overdimensioning for Consistent Performance in Grids","authors":"N. Yigitbasi, D. Epema","doi":"10.1109/CCGRID.2010.44","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.44","url":null,"abstract":"Grid users may experience inconsistent performance due to specific characteristics of grids, such as fluctuating workloads, high failure rates, and high resource heterogeneity. Although extensive research has been done in grids, providing consistent performance remains largely an unsolved problem. In this study we use overdimensioning, a simple but cost-ineffective solution, to solve the performance inconsistency problem in grids. To this end, we propose several overdimensioning strategies, and we evaluate these strategies through simulations with workloads consisting of Bag-of-Tasks. We find that although overdimensioning is a simple solution, it is a viable solution to provide consistent performance in grids.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"104 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128154372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
There is a critical need to develop new programming paradigms for grid middleware tools and applications to harness the opportunities presented by emerging multi-core processors. Implementations of grid middleware and applications that do not adapt their programming paradigm when executing on these processors can severely impact overall performance. We focus on the utilization of the L2 cache, a critical shared resource on chip multiprocessors. The access pattern of the shared L2 cache, which depends on how the application schedules and assigns processing work to each thread, can either enhance or undermine the ability to hide memory latency on a multi-core processor. None of the current grid simulators and emulators provides the feedback and fine-grained performance data that are essential for a detailed analysis. Using the feedback from an emulation framework, we present a performance analysis and provide recommendations on how processing threads can be scheduled on multi-core nodes to enhance the performance of a class of grid applications that requires processing of large-scale XML data. In particular, we discuss the gains associated with our adaptations of the Cache-Affinity and Balanced-Set scheduling algorithms to improve L2 cache performance, and hence the overall application execution time.
{"title":"Cache Performance Optimization for Processing XML-Based Application Data on Multi-core Processors","authors":"Rajdeep Bhowmik, M. Govindaraju","doi":"10.1109/CCGRID.2010.122","DOIUrl":"https://doi.org/10.1109/CCGRID.2010.122","url":null,"abstract":"There is a critical need to develop new programming paradigms for grid middleware tools and applications to harness the opportunities presented by emerging multi-core processors. Implementations of grid middleware and applications that do not adapt to the programming paradigm when executing on emerging processors can severely impact the overall performance. We focus on the utilization of the L2 cache, which is a critical shared resource on Chip Multiprocessors. The access pattern of the shared L2 cache, which is dependent on how the application schedules and assigns processing work to each thread, can either enhance or undermine the ability to hide memory latency on a multi-core processor. None of the current grid simulators and emulators provides feedback and fine-grained performance data that is essential for a detailed analysis. Using the feedback from an emulation framework, we present performance analysis and provide recommendations on how processing threads can be scheduled on multi-core nodes to enhance the performance of a class of grid applications that requires processing of large-scale XML data. 
In particular, we discuss the gains associated with the use of the adaptations we have made to the Cache-Affinity and Balanced-Set scheduling algorithms to improve L2 cache performance, and hence the overall application execution time.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123986294","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
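Cache-affinity scheduling, in its simplest form, reassigns a thread to the core where it last ran so its working set may still be warm in the L2 cache. The sketch below shows only that base idea; the paper's adaptations of Cache-Affinity and Balanced-Set scheduling are not detailed in the abstract:

```python
def schedule(tasks, cores, last_core):
    """Assign each task to the core it last ran on when that core is
    free (warm cache), falling back to any idle core otherwise.
    `last_core` maps task -> core of its previous run (may be absent
    for new tasks). Ties are broken by lowest core id for determinism."""
    free = set(cores)
    placement, deferred = {}, []
    for t in tasks:
        c = last_core.get(t)
        if c in free:
            placement[t] = c          # reuse the warm L2 cache
            free.discard(c)
        else:
            deferred.append(t)        # affinity core busy or unknown
    for t in deferred:                # cold placement on idle cores
        if free:
            c = min(free)
            placement[t] = c
            free.discard(c)
    return placement
```

With tasks a, b both affine to core 1, only a keeps its warm core; b and the new task c are placed cold, which is the situation where a balanced-set style policy would instead group tasks by working-set size.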