Scalable, high performance InfiniBand-attached SAN Volume Controller
D. S. Guthridge
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663807
We have developed a highly reliable InfiniBand host-attached block storage management and virtualization system that supports several off-the-shelf Fibre Channel RAID controllers on the back end. The system is based on the existing IBM TotalStorage SAN Volume Controller (SVC) product, and therefore offers strong performance, a wide array of storage virtualization features, and support for many existing storage controllers. We provide an overview of the driver design as well as performance results. Large read performance from SVC cache exceeds 3 GB/s in a minimal two-node cluster configuration.
{"title":"Scalable, high performance InfiniBand-attached SAN Volume Controller","authors":"D. S. Guthridge","doi":"10.1109/CLUSTR.2008.4663807","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663807","url":null,"abstract":"We have developed a highly reliable InfiniBand host attached block storage management and virtualization system that supports several off-the-shelf Fibre Channel RAID controllers on the back end. The system is based on the existing IBM TotalStorage SAN Volume Controller (SVC) product, and therefore offers performance, a wide array of storage virtualization features, and support for many existing storage controllers. We provide an overview of the driver design as well as performance results. Large read performance from SVC cache exceeds 3 GB/s in a minimal two-node cluster configuration.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128864973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Improving message passing over Ethernet with I/OAT copy offload in Open-MX
Brice Goglin
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663775
Open-MX is a new message passing layer implemented on top of the generic Ethernet stack of the Linux kernel. Open-MX works on all Ethernet hardware, but it suffers from expensive memory copy requirements on the receiver side due to the hardware's inability to deposit messages directly in the target application buffers.
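To make the copy overhead concrete, here is a minimal sketch of a receive path without copy offload. This is our simplified model, not Open-MX's actual code; all names are illustrative. The CPU copy in the loop is exactly the work an I/OAT DMA engine would take off the processor.

```python
# Minimal sketch (our simplified model): frames land in kernel-owned ring
# slots, and without receive-side copy offload the CPU must move each
# fragment into the application's buffer.

FRAGMENT_SIZE = 4096  # bytes per Ethernet-level fragment (illustrative)

def receive_message(ring_slots, app_buffer):
    """Reassemble a message by copying each received fragment.

    ring_slots: list of bytes objects, as deposited by the NIC driver.
    app_buffer: bytearray large enough for the full message.
    """
    offset = 0
    for frag in ring_slots:
        app_buffer[offset:offset + len(frag)] = frag  # the expensive CPU copy
        offset += len(frag)
    return offset  # bytes delivered

# Usage: reassembling a three-fragment message.
frags = [bytes(FRAGMENT_SIZE)] * 3
buf = bytearray(3 * FRAGMENT_SIZE)
assert receive_message(frags, buf) == 3 * FRAGMENT_SIZE
```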
{"title":"Improving message passing over Ethernet with I/OAT copy offload in Open-MX","authors":"Brice Goglin","doi":"10.1109/CLUSTR.2008.4663775","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663775","url":null,"abstract":"Open-MX is a new message passing layer implemented on top of the generic Ethernet stack of the Linux kernel. Open-MX works on all Ethernet hardware, but it suffers from expensive memory copy requirements on the receiver side due to the hardwarepsilas inability to deposit messages directly in the target application buffers.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126508667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Divisible load scheduling with improved asymptotic optimality
R. Suda
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663779
The divisible load model admits scheduling algorithms that achieve nearly optimal makespan with practical computational complexity. Beaumont et al. have shown that their algorithm produces a schedule whose makespan is within a factor of 1+O(1/√T) of the optimal solution as the total amount of tasks T scales up with all other conditions fixed. We have proposed an extension of their algorithm to multiple masters with heterogeneous processor performance, though limited to uniform network performance. This paper analyzes the asymptotic performance of our algorithm and shows that it achieves a ratio of 1+O(1/√T), 1+O(log T/T), or 1+O(1/T), depending on the problem. In the latter two cases, our algorithm asymptotically outperforms the algorithm of Beaumont et al.
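Restating the bounds from the abstract in standard notation (the makespan symbols C_max and C_max^opt are our notation, not necessarily the paper's):

```latex
% T = total amount of tasks; C_max = makespan of the produced schedule.
\[
  \frac{C_{\max}}{C_{\max}^{\mathrm{opt}}}
    \;\le\; 1 + O\!\left(\frac{1}{\sqrt{T}}\right)
  \qquad \text{(Beaumont et al.)}
\]
\[
  \frac{C_{\max}}{C_{\max}^{\mathrm{opt}}}
    \;\le\; 1 + O\!\left(\frac{1}{\sqrt{T}}\right),\quad
    1 + O\!\left(\frac{\log T}{T}\right),\quad \text{or}\quad
    1 + O\!\left(\frac{1}{T}\right)
  \qquad \text{(this paper, depending on the problem)}
\]
```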
{"title":"Divisible load scheduling with improved asymptotic optimality","authors":"R. Suda","doi":"10.1109/CLUSTR.2008.4663779","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663779","url":null,"abstract":"Divisible load model allows scheduling algorithms that give nearly optimal makespan with practical computational complexity. Beaumont et al. have shown that their algorithm produces a schedule whose makespan is within 1+O(1/radicT) times larger than the optimal solution when the total amount of tasks T scales up and the other conditions are fixed. We have proposed an extension of their algorithm for multiple masters with heterogeneous performance of processors but limited to uniform network performance. This paper analyzes the asymptotic performance of our algorithm, and shows that the asymptotic performance of our algorithm is either 1+O(1/radicT), 1+O(log T/T) or 1+O(1/T ), depending on the problem. For the latter two cases, our algorithm asymptotically outperforms the algorithm by Beaumont et al.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"238 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115662559","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploiting data compression in collective I/O techniques
Rosa Filgueira, D. E. Singh, J. C. Pichel, J. Carretero
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663811
This paper presents Two-Phase Compressed I/O (TPC I/O), an optimization of the Two-Phase collective I/O technique from ROMIO, the most popular MPI-IO implementation. To reduce network traffic, TPC I/O employs the LZO algorithm to compress and decompress the data exchanged in inter-node communication operations. The compression algorithm is fully integrated into the MPI collective technique, allowing compression to be enabled or disabled dynamically. Compared with Two-Phase I/O, Two-Phase Compressed I/O achieves significant improvements in overall execution time for many of the scenarios considered.
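As a rough illustration of the idea, the sketch below compresses a chunk before the exchange phase and decides dynamically whether compression pays off, falling back to raw bytes when it does not. zlib stands in for LZO here (the paper uses LZO; Python's standard library has no LZO binding), and all function names are ours.

```python
# Hedged sketch of the TPC I/O idea: compress data destined for other nodes,
# tag each payload, and decompress on arrival. zlib is a stand-in for LZO.
import os
import zlib

def pack_for_exchange(chunk: bytes, threshold: float = 0.9) -> bytes:
    """Compress a chunk for the inter-node exchange; send it raw when
    compression does not shrink it enough (dynamic use of compression)."""
    compressed = zlib.compress(chunk, 1)  # fastest level, in LZO's spirit
    if len(compressed) < threshold * len(chunk):
        return b"C" + compressed  # tagged: compressed payload
    return b"R" + chunk           # tagged: raw payload

def unpack_from_exchange(payload: bytes) -> bytes:
    tag, body = payload[:1], payload[1:]
    return zlib.decompress(body) if tag == b"C" else body

# Usage: compressible data travels compressed, random data travels raw.
zeros = b"\x00" * 4096
noise = os.urandom(4096)
assert unpack_from_exchange(pack_for_exchange(zeros)) == zeros
assert unpack_from_exchange(pack_for_exchange(noise)) == noise
```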
{"title":"Exploiting data compression in collective I/O techniques","authors":"Rosa Filgueira, D. E. Singh, J. C. Pichel, J. Carretero","doi":"10.1109/CLUSTR.2008.4663811","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663811","url":null,"abstract":"This paper presents Two-Phase Compressed I/O (TPC I/O,) an optimization of the Two-Phase collective I/O technique from ROMIO, the most popular MPI-IO implementation. In order to reduce network traffic, TPC I/O employs LZO algorithm to compress and decompress exchanged data in the inter-node communication operations. The compression algorithm has been fully implemented in the MPI collective technique, allowing to dynamically use (or not) compression. Compared with Two-Phase I/O, Two-Phase Compressed I/O obtains important improvements in the overall execution time for many of the considered scenarios.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"132 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114104795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multistage switches are not crossbars: Effects of static routing in high-performance networks
T. Hoefler, Timo Schneider, A. Lumsdaine
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663762
Multistage interconnection networks based on central switches are ubiquitous in high-performance computing. Applications and communication libraries typically make use of such networks without consideration of the actual internal characteristics of the switch. However, application performance of these networks, particularly with respect to bisection bandwidth, does depend on communication paths through the switch. In this paper we discuss the limitations of the hardware definition of bisection bandwidth (capacity-based) and introduce a new metric: effective bisection bandwidth. We assess the effective bisection bandwidth of several large-scale production clusters by simulating artificial communication patterns on them. Networks with full bisection bandwidth typically provided effective bisection bandwidth in the range of 55-60%. Simulations with application-based patterns showed that the difference between effective and rated bisection bandwidth could impact overall application performance by up to 12%.
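The metric can be approximated with a small simulation. The sketch below is our simplification, not the authors' simulator: it models a two-stage, full-bisection fat tree with deterministic destination-based routing, draws random matchings of the hosts, and rates each flow by its most congested link. All topology parameters and names are assumptions for illustration.

```python
# Hedged sketch: estimate effective bisection bandwidth under static routing.
import random
from collections import Counter

LEAVES, PORTS, SPINES = 8, 4, 4      # full rated bisection: uplinks == hosts per leaf
N = LEAVES * PORTS

def links(src, dst):
    """Links used by a flow under static, destination-based spine selection."""
    s, d = src // PORTS, dst // PORTS
    if s == d:
        return []                     # stays inside one leaf switch
    spine = dst % SPINES              # deterministic routing decision
    return [("up", s, spine), ("down", spine, d)]

def trial():
    hosts = list(range(N))
    random.shuffle(hosts)
    flows = [links(a, b) for a, b in zip(hosts[:N // 2], hosts[N // 2:])]
    load = Counter(l for ls in flows for l in ls)
    # Each flow runs at the rate allowed by its most congested link (1.0 = full).
    rates = [1.0 / max(load[l] for l in ls) if ls else 1.0 for ls in flows]
    return sum(rates) / len(rates)

# Averaging over many random patterns typically lands well below 1.0,
# qualitatively matching the 55-60% figure reported in the paper.
print(sum(trial() for _ in range(1000)) / 1000)
```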
{"title":"Multistage switches are not crossbars: Effects of static routing in high-performance networks","authors":"T. Hoefler, Timo Schneider, A. Lumsdaine","doi":"10.1109/CLUSTR.2008.4663762","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663762","url":null,"abstract":"Multistage interconnection networks based on central switches are ubiquitous in high-performance computing. Applications and communication libraries typically make use of such networks without consideration of the actual internal characteristics of the switch. However, application performance of these networks, particularly with respect to bisection bandwidth, does depend on communication paths through the switch. In this paper we discuss the limitations of the hardware definition of bisection bandwidth (capacity-based) and introduce a new metric: effective bisection bandwidth. We assess the effective bisection bandwidth of several large-scale production clusters by simulating artificial communication patterns on them. Networks with full bisection bandwidth typically provided effective bisection bandwidth in the range of 55-60%. Simulations with application-based patterns showed that the difference between effective and rated bisection bandwidth could impact overall application performance by up to 12%.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126057389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DifferStore: A differentiated storage service in object-based storage system
Q. Wei, Zhixiang Li
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663770
This paper presents DifferStore, a differentiated storage service for object-based storage systems. To provide differentiated storage service to different applications on a single object-based storage platform, DifferStore uses a two-layer architecture that efficiently decouples upper-layer, application-specific storage policies from lower-layer, application-independent storage functions. For the lower, application-independent layer, the paper proposes a weight-based object I/O scheduler with differentiated scheduling policies for different request classes, together with a versatile storage manager. The storage manager implements differentiated storage policies for disk layout and free-space allocation, as well as an efficient object namespace that allows an object's on-disk data to be accessed directly by object ID. DifferStore also lets the upper, application-specific layer assign complex striping, placement, and load-balancing policies, as well as application-specific file metadata structures. Experimental evaluation of our user-space prototype demonstrates that DifferStore performs well under mixed workloads and satisfies the requirements of different applications.
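As a sketch of what a weight-based scheduler over request classes might look like (our reconstruction; the abstract gives no code), the class below serves backlogged classes in proportion to their weights by tracking a per-class virtual service time.

```python
# Hedged sketch of weight-based dispatching across request classes.
from collections import deque

class WeightedScheduler:
    def __init__(self, weights):
        self.queues = {c: deque() for c in weights}  # per-class FIFO queues
        self.weights = weights                        # higher weight => more service
        self.vtime = {c: 0.0 for c in weights}        # virtual service time

    def submit(self, cls, request):
        self.queues[cls].append(request)

    def dispatch(self):
        """Pop the next request from the backlogged class whose virtual
        time is lowest; charge it 1/weight so heavy classes run more often."""
        backlogged = [c for c, q in self.queues.items() if q]
        if not backlogged:
            return None
        c = min(backlogged, key=lambda c: self.vtime[c])
        self.vtime[c] += 1.0 / self.weights[c]
        return c, self.queues[c].popleft()

# Usage: class "premium" (weight 3) gets roughly 3x the dispatches.
sched = WeightedScheduler({"premium": 3, "standard": 1})
for i in range(8):
    sched.submit("premium", f"p{i}")
    sched.submit("standard", f"s{i}")
print([sched.dispatch()[0] for _ in range(8)])
```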
{"title":"DifferStore: A differentiated storage service in object-based storage system","authors":"Q. Wei, Zhixiang Li","doi":"10.1109/CLUSTR.2008.4663770","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663770","url":null,"abstract":"This paper presents a differentiated storage service in object-based storage system, called DifferStore. To enable differentiated storage service for different applications in a single object-based storage platform, DifferStore utilizes a two-layer architecture to efficiently decouple upper-layer application specific storage policies and lower-layer application independent storage functions. For the lower application independent layer, this paper proposes a weight-based object I/O scheduler with differentiated scheduling policy for different request classes, and a versatile storage manager. The versatile storage manager implements differentiated storage policies in terms of disk layout and free space allocation, as well as an efficient object namespace management enabling directly access object on-disk data just with object ID. The DifferStore also provides ability for upper application specific layer to assign complex striping, placement, load-balancing policies and specific metadata structure of file. Experimental evaluation on our user space prototype demonstrates that the DifferStore can perform well under mixed workloads and satisfy requirements of different applications.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121153329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An OSD-based approach to managing directory operations in parallel file systems
N. Ali, A. Devulapalli, D. Dalessandro, P. Wyckoff, P. Sadayappan
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663769
Distributed file systems that use multiple servers to store data in parallel are becoming commonplace. Much work has already gone into such systems to maximize data throughput. However, metadata management has historically been treated as an afterthought. In previous work we focused on improving metadata management techniques by placing file metadata along with data on object-based storage devices (OSDs). However, we did not investigate directory operations. This work looks at the possibility of designing directory structures directly on OSDs, without the need for intervening servers. In particular, the need for atomicity is a fundamental requirement that we explore in depth. Through performance results of benchmarks and applications we show the feasibility of using OSDs directly for metadata, including directory operations.
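To illustrate why atomicity is the crux of server-less directory operations, the sketch below models a directory as an object whose entry creation is atomic, so two racing clients cannot both create the same name. The lock stands in for an OSD-side atomic primitive; the whole class is our illustration, not the paper's protocol.

```python
# Hedged model: atomic directory-entry creation on an object-based device.
import threading

class DirectoryObject:
    def __init__(self):
        self._entries = {}             # name -> object ID
        self._lock = threading.Lock()  # stands in for an atomic OSD command

    def create_entry(self, name, oid):
        """Atomically insert a directory entry; fail if the name exists."""
        with self._lock:
            if name in self._entries:
                return False           # lost the race: name already taken
            self._entries[name] = oid
            return True

# Usage: of two racing creates for "file.txt", exactly one succeeds.
d = DirectoryObject()
results = []
threads = [threading.Thread(target=lambda i=i: results.append(d.create_entry("file.txt", i)))
           for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert sorted(results) == [False, True]
```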
{"title":"An OSD-based approach to managing directory operations in parallel file systems","authors":"N. Ali, A. Devulapalli, D. Dalessandro, P. Wyckoff, P. Sadayappan","doi":"10.1109/CLUSTR.2008.4663769","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663769","url":null,"abstract":"Distributed file systems that use multiple servers to store data in parallel are becoming commonplace. Much work has already gone into such systems to maximize data throughput. However, metadata management has historically been treated as an afterthought. In previous work we focused on improving metadata management techniques by placing file metadata along with data on object-based storage devices (OSDs). However, we did not investigate directory operations. This work looks at the possibility of designing directory structures directly on OSDs, without the need for intervening servers. In particular, the need for atomicity is a fundamental requirement that we explore in depth. Through performance results of benchmarks and applications we show the feasibility of using OSDs directly for metadata, including directory operations.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":" 19","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132124010","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Continuous adaptation for high performance throughput computing across distributed clusters
E. Walker
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663797
A job proxy is an abstraction for provisioning CPU resources. This paper proposes an adaptive algorithm for allocating job proxies to distributed host clusters with the objective of improving large-scale job ensemble throughput. Specifically, the paper proposes a decision metric for selecting appropriate pending job proxies for migration between host clusters, and a self-synchronizing Paxos-style distributed consensus algorithm for performing the migration of these selected job proxies. The algorithm is further described in the context of a concrete application, the MyCluster system, which implements a framework for submitting, managing and adapting job proxies across distributed high performance computing (HPC) host clusters. To date, the system has been used to provision many hundreds of thousands of CPUs for computational experiments requiring high throughput on HPC infrastructures like the NSF TeraGrid. Experimental evaluation of the proposed algorithm shows significant improvement in user job throughput: an average of 8% in simulation, and 15% in a real-world experiment.
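The abstract does not spell out the decision metric, so the sketch below is only a plausible form of one, entirely our assumption: migrate a pending proxy when the expected wait at its current cluster clearly exceeds the expected wait at the best alternative plus the migration cost, with a hysteresis margin to avoid thrashing.

```python
# Hedged sketch of a migration decision metric (our formulation, not the
# paper's): compare expected queue waits, charging the mover its overhead.

def should_migrate(est_wait_here, est_wait_there, migration_cost, margin=1.2):
    """Return True if moving the pending job proxy looks profitable.

    est_wait_here:  estimated remaining queue wait at the current cluster (s)
    est_wait_there: estimated queue wait at the best alternative cluster (s)
    migration_cost: resubmission/startup overhead of migrating (s)
    margin:         hysteresis factor so marginal gains do not trigger moves
    """
    return est_wait_here > margin * (est_wait_there + migration_cost)

# Usage: a proxy facing a 2-hour wait migrates to a cluster with a 10-minute
# queue even after a 5-minute resubmission cost.
print(should_migrate(est_wait_here=7200, est_wait_there=600,
                     migration_cost=300))  # True
```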
{"title":"Continuous adaptation for high performance throughput computing across distributed clusters","authors":"E. Walker","doi":"10.1109/CLUSTR.2008.4663797","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663797","url":null,"abstract":"A job proxy is an abstraction for provisioning CPU resources. This paper proposes an adaptive algorithm for allocating job proxies to distributed host clusters with the objective of improving large-scale job ensemble throughput. Specifically, the paper proposes a decision metric for selecting appropriate pending job proxies for migration between host clusters, and a self-synchronizing Paxos-style distributed consensus algorithm for performing the migration of these selected job proxies. The algorithm is further described in the context of a concrete application, the MyCluster system, which implements a framework for submitting, managing and adapting job proxies across distributed high performance computing (HPC) host clusters. To date, the system has been used to provision many hundreds of thousands of CPUs for computational experiments requiring high throughput on HPC infrastructures like the NSF TeraGrid. Experimental evaluation of the proposed algorithm shows significant improvement in user job throughput: an average of 8% in simulation, and 15% in a real-world experiment.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"110 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129075506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Context-aware address translation for high performance SMP cluster system
Moon-Sang Lee, Joonwon Lee, S. Maeng
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663784
User-level communication allows an application process to access the network interface directly. Bypassing the kernel requires that a user process access the network interface using its own virtual addresses, which must be translated to physical addresses. A small caching structure, similar to the hardware TLB on the host processor, has been used in network interface memory to cache virtual-to-physical address mappings. In this study, we propose a new TLB architecture for the network interface. The proposed architecture splits the original caching structure into as many partitions as there are processors in the SMP system and assigns a separate partition to each application process. In addition, the architecture is aware of user contexts and switches the contents of the caching structure on context switches. According to our experiments, our scheme significantly reduces application execution time compared to the previous approach.
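A software model makes the partitioning idea concrete. The sketch below is our model of the mechanism, not the authors' hardware design: one partition per processor, re-tagged and invalidated whenever a different process is scheduled there, so one process's translations never evict another's.

```python
# Hedged model of a per-processor partitioned NIC translation cache.

class PartitionedNicTlb:
    def __init__(self, num_processors, entries_per_partition):
        self.cap = entries_per_partition
        self.partitions = [{} for _ in range(num_processors)]  # vaddr -> paddr
        self.owner = [None] * num_processors                    # current process

    def context_switch(self, cpu, pid):
        """Invalidate a partition when a different process is scheduled."""
        if self.owner[cpu] != pid:
            self.partitions[cpu].clear()
            self.owner[cpu] = pid

    def translate(self, cpu, vaddr, page_table):
        part = self.partitions[cpu]
        if vaddr in part:
            return part[vaddr]          # hit: no host round-trip needed
        paddr = page_table[vaddr]       # miss: consult the host page table
        if len(part) >= self.cap:
            part.pop(next(iter(part)))  # simple oldest-first eviction
        part[vaddr] = paddr
        return paddr

# Usage: process 7 runs on CPU 0 and fills its own partition only.
tlb = PartitionedNicTlb(num_processors=2, entries_per_partition=64)
tlb.context_switch(0, pid=7)
print(hex(tlb.translate(0, 0x1000, {0x1000: 0x9F000})))
```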
{"title":"Context-aware address translation for high performance SMP cluster system","authors":"Moon-Sang Lee, Joonwon Lee, S. Maeng","doi":"10.1109/CLUSTR.2008.4663784","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663784","url":null,"abstract":"User-level communication allows an application process to access the network interface directly. Bypassing the kernel requires that a user process accesses the network interface using its own virtual address which should be translated to a physical address. A small caching structure which is similar to the hardware TLB on the host processor has been used to cache the mappings between virtual and physical addresses on the network interface memory. In this study, we propose a new TLB architecture for the network interface. The proposed architecture splits an original caching structure into as many partitions as the number of processors on the SMP system and assigns a separate partition to each application process. In addition, the architecture becomes aware of user contexts and switches the content of caching structure in accordance with context switching. According to our experiments, our scheme achieves significant reduction in application execution time compared to the previous approach.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126535268","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design and implementation of an effective HyperTransport core in FPGA
Fei Chen, Hailiang Cheng, Xiaojun Yang, R. Liu
Pub Date: 2008-10-31 | DOI: 10.1109/CLUSTR.2008.4663805
This paper presents the design and implementation of a HyperTransport (HT) core in a LatticeSCM FPGA that runs at an 800 MHz DDR link frequency. An effective approach is also proposed to solve the ordering problem caused by different virtual channels, which exists not only in HT but also in PCI Express. HT is a high-performance, low-latency I/O standard that can connect directly to some general-purpose processors, such as AMD's Opteron processor family. The HT interface on Opteron processors runs at a maximum frequency of 1 GHz, yet most HT cores in FPGAs run at a maximum of 500 MHz, which limits communication performance. In this paper, a 16-bit 800 MHz HT core is proposed to narrow the gap between ASIC and FPGA implementations.
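The virtual-channel ordering problem can be illustrated in software. The sketch below is our simplified model of PCI/HT-style producer-consumer rules, not the paper's FPGA logic: non-posted and response packets may not pass an older posted packet, while posted packets may pass everything (which is what prevents deadlock).

```python
# Hedged model of virtual-channel ordering at a merge point.
from collections import deque

class OrderingPoint:
    CHANNELS = ("posted", "nonposted", "response")

    def __init__(self):
        self.q = {c: deque() for c in self.CHANNELS}
        self.seq = 0

    def push(self, channel, pkt):
        self.seq += 1
        self.q[channel].append((self.seq, pkt))  # remember arrival order

    def pop(self):
        """Issue one packet. Simplified rule: a non-posted or response
        packet may not pass an older posted packet; posted passes anything."""
        oldest_posted = self.q["posted"][0][0] if self.q["posted"] else float("inf")
        for c in ("nonposted", "response"):
            if self.q[c] and self.q[c][0][0] < oldest_posted:
                return self.q[c].popleft()[1]
        if self.q["posted"]:
            return self.q["posted"].popleft()[1]
        return None

# Usage: a read (non-posted) that arrived after a posted write waits for it.
op = OrderingPoint()
op.push("posted", "W1")
op.push("nonposted", "R1")
print(op.pop(), op.pop())  # W1 R1
```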
{"title":"Design and implementation of an effective HyperTransport core in FPGA","authors":"Fei Chen, Hailiang Cheng, Xiaojun Yang, R. Liu","doi":"10.1109/CLUSTR.2008.4663805","DOIUrl":"https://doi.org/10.1109/CLUSTR.2008.4663805","url":null,"abstract":"This paper presents a design and implementation of a HyperTransport (HT) core in lattice SCM FPGA which can run at 800 MHz DDR link frequency. An effective approach is also proposed to solve the ordering problem caused by different virtual channels which exists not only in HT but also PCI-e. HT is a high performance, low latency I/O standard which can be used directly to connect with some general-purpose processors, such as AMDpsilas Opteron processor family. HT interface on Opteron processor run at a maximum of 1 GHz frequency. However, most HT core in FPGA runs at a maximum of 500 MHz frequency which limits the performance of communication. In this paper, a 16 bit 800 MHz HT core is proposed to reduce the gap of ASIC and FPGA.","PeriodicalId":198768,"journal":{"name":"2008 IEEE International Conference on Cluster Computing","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128300470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}