
2012 IEEE International Conference on Cluster Computing: Latest Publications

On Optimal and Balanced Sparse Matrix Partitioning Problems
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.77
Anaël Grandjean, J. Langguth, B. Uçar
We investigate one-dimensional partitioning of sparse matrices under a given ordering of the rows/columns. The partitioning constraint is to have load balance across processors when different parts are assigned to different processors. The load is defined as the number of rows, or columns, or the nonzeros assigned to a processor. The partitioning objective is to optimize different functions, including the well-known total communication volume arising in a distributed-memory implementation of parallel sparse matrix-vector multiplication operations. The difference between our problem in this work and the general sparse matrix partitioning problem is that the parts should correspond to disjoint intervals of the given order. Whereas the partitioning problem without the interval constraint corresponds to the NP-complete hypergraph partitioning problem, the restricted problem corresponds to a polynomial-time solvable variant of the hypergraph partitioning problem. We adapt an existing dynamic programming algorithm designed for graphs to solve two related partitioning problems in graphs. We then propose graph models for a given hypergraph and a partitioning objective function so that the standard cut size definition in the graph model exactly corresponds to the hypergraph partitioning objective function. In extensive experiments, we show that our proposed algorithm is helpful in practice. It even demonstrates performance superior to the standard hypergraph partitioners when the number of parts is high.
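
The tractability of the restricted problem comes from the interval constraint: parts must be contiguous in the given order, which makes dynamic programming applicable. As a rough illustration only (and not the authors' algorithm, which handles communication-volume objectives through hypergraph-to-graph models), the Python sketch below shows the textbook dynamic program for splitting a fixed row order into contiguous parts while minimizing the maximum part load; the per-row loads are hypothetical nonzero counts.

```python
# Illustrative sketch: contiguous (interval) 1D partitioning by dynamic programming.
# This minimizes the maximum part load; it is NOT the paper's algorithm, which
# optimizes communication-volume objectives via graph models under the same
# interval constraint.

def interval_partition(row_loads, num_parts):
    """Split rows (in their given order) into `num_parts` contiguous intervals,
    minimizing the maximum total load of any interval.
    Returns the best bottleneck load and the half-open row ranges of the parts."""
    n = len(row_loads)
    prefix = [0] * (n + 1)
    for i, w in enumerate(row_loads):
        prefix[i + 1] = prefix[i] + w

    INF = float("inf")
    # dp[k][j] = best possible bottleneck when the first j rows form k parts
    dp = [[INF] * (n + 1) for _ in range(num_parts + 1)]
    choice = [[0] * (n + 1) for _ in range(num_parts + 1)]
    dp[0][0] = 0
    for k in range(1, num_parts + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):          # last part covers rows i..j-1
                cost = max(dp[k - 1][i], prefix[j] - prefix[i])
                if cost < dp[k][j]:
                    dp[k][j] = cost
                    choice[k][j] = i

    # Recover the cut points from the choice table.
    cuts, j = [], n
    for k in range(num_parts, 0, -1):
        i = choice[k][j]
        cuts.append((i, j))
        j = i
    return dp[num_parts][n], list(reversed(cuts))


if __name__ == "__main__":
    # Hypothetical per-row nonzero counts for an 8-row matrix, split into 3 parts.
    loads = [3, 1, 4, 1, 5, 9, 2, 6]
    print(interval_partition(loads, 3))
```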
Citations: 7
Hierarchical Clustering Strategies for Fault Tolerance in Large Scale HPC Systems
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.71
L. Bautista-Gomez, Thomas Ropars, N. Maruyama, F. Cappello, S. Matsuoka
Future high performance computing systems will need to use novel techniques to allow scientific applications to progress despite frequent failures. Checkpoint-Restart is currently the most popular way to mitigate the impact of failures during long-running executions. Different techniques try to reduce the cost of Checkpoint-Restart: some of them, such as local checkpointing and erasure codes, aim to reduce the time to checkpoint, while others, such as uncoordinated checkpointing and message logging, aim to decrease the cost of recovery. In this paper, we study how to combine all these techniques together in order to optimize both checkpointing and recovery. We present several clustering and topology challenges that lead us to an optimization problem in a four-dimensional space: reliability level, recovery cost, encoding time and message logging overhead. We propose a novel clustering method inspired by brain topology studies in neuroscience and evaluate it with a Tsunami simulation application on TSUBAME2. Our evaluation with 1024 processes shows that our novel clustering method can guarantee good performance along all four dimensions of our optimization problem.
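
As a purely hypothetical illustration of what a comparison over the four named dimensions could look like (the paper's actual contribution is a brain-topology-inspired clustering method, not a weighted score), one could rank candidate cluster configurations with a normalized weighted cost; all names and numbers below are made up.

```python
# Hypothetical weighted comparison of candidate clusterings over the four
# dimensions named in the abstract; not the paper's method.

def score(candidate, weights):
    """candidate: dict with reliability, recovery_cost, encoding_time and
    logging_overhead, all normalized to [0, 1]; higher is worse except for
    reliability, where higher is better (hence the 1 - reliability term)."""
    return (weights["reliability"] * (1.0 - candidate["reliability"])
            + weights["recovery_cost"] * candidate["recovery_cost"]
            + weights["encoding_time"] * candidate["encoding_time"]
            + weights["logging_overhead"] * candidate["logging_overhead"])

def best_clustering(candidates, weights):
    return min(candidates, key=lambda c: score(c, weights))

if __name__ == "__main__":
    candidates = [
        {"name": "small-clusters", "reliability": 0.7, "recovery_cost": 0.2,
         "encoding_time": 0.6, "logging_overhead": 0.3},
        {"name": "large-clusters", "reliability": 0.9, "recovery_cost": 0.5,
         "encoding_time": 0.3, "logging_overhead": 0.6},
    ]
    weights = {"reliability": 0.4, "recovery_cost": 0.3,
               "encoding_time": 0.2, "logging_overhead": 0.1}
    print(best_clustering(candidates, weights)["name"])
```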
Citations: 10
Memory Affinity: Balancing Performance, Power, Thermal and Fairness for Multi-core Systems
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.33
Gangyong Jia, Xi Li, Chao Wang, Xuehai Zhou, Zongwei Zhu
Main memory is expected to grow significantly in both speed and capacity because it is a major shared resource among cores in a multi-core system, and this growth will lead to increasing power consumption. Therefore, it is critical to address the power issue without seriously degrading the performance of the memory subsystem. In this paper, we first propose memory affinity, which keeps memory ranks in their active or low-power states as long as possible to avoid frequent switching between the two, and then present memory affinity aware scheduling (MAS) to balance performance, power, thermal behavior and fairness for multi-core systems. Experimental results demonstrate that our memory affinity aware scheduling algorithms adapt well to system load, maximizing power savings and avoiding memory hotspots while sustaining system bandwidth demand and preserving fairness among threads.
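
A minimal sketch of the affinity idea, assuming threads can be tagged with the memory rank they predominantly access (this is not the MAS scheduler itself): co-scheduling threads that share a rank lets the remaining ranks stay in their low-power state for whole quanta instead of toggling.

```python
# Rough sketch of the memory-affinity idea (not the MAS algorithm): co-schedule
# threads that touch the same memory rank so the other ranks can remain in a
# low-power state across consecutive quanta.

from collections import defaultdict

def plan_quanta(thread_ranks, cores_per_quantum):
    """thread_ranks: hypothetical map thread_id -> dominant memory rank.
    Returns a list of quanta; each quantum runs threads sharing one rank, so
    all other ranks can be kept in low-power mode during that quantum."""
    by_rank = defaultdict(list)
    for tid, rank in thread_ranks.items():
        by_rank[rank].append(tid)

    schedule = []
    # Serve the most-loaded rank first so it stays active as long as possible.
    for rank, tids in sorted(by_rank.items(), key=lambda kv: -len(kv[1])):
        for i in range(0, len(tids), cores_per_quantum):
            schedule.append({"active_rank": rank,
                             "threads": tids[i:i + cores_per_quantum]})
    return schedule

if __name__ == "__main__":
    threads = {"t0": 0, "t1": 0, "t2": 1, "t3": 0, "t4": 1, "t5": 2}
    for quantum in plan_quanta(threads, cores_per_quantum=2):
        print(quantum)
```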
Citations: 12
Dynamic Network Forecasting Using SimGrid Simulations
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.40
Matthieu Imbert, E. Caron
To be able to efficiently schedule network transfers on computing platforms such as clusters, grids or clouds, accurate and timely predictions of network transfer completion times are needed. We designed a new metrology and performance prediction framework called Pilgrim, which offers a service predicting the completion times of current and concurrent TCP transfers. We describe Pilgrim and show experimental results comparing its predictions to the real transfer completion times.
Citations: 0
Replication Based QoS Framework for Flash Arrays
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.53
Nihat Altiparmak, A. Tosun
The increasing popularity of the storage cloud is leading organizations to move their applications and enterprise data into the cloud. It is desirable to move time-critical applications demanding high performance I/O operations. Flash based storage arrays have emerged to address these high performance I/O requirements; however, providing predictable Quality of Service (QoS) for applications with real-time data requirements is a challenging open problem. This paper introduces a QoS framework for flash based storage arrays. Our framework provides deterministic and statistical response time guarantees through a combination of techniques including replication, data mining, and online retrieval. We evaluated the framework using synthetic and real-world traces. The QoS performance of the system is compared to existing high-throughput RAID designs. Numerical results show that under the synthetic traces, the QoS performance of the proposed system outperforms that of existing high-performance RAID designs. Real-world traces indicate that the proposed QoS mechanism can be tuned to support the guarantees required by various real-world applications.
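
As a generic sketch of why replication enables response-time guarantees (not the paper's framework, which adds data mining and statistical guarantees), routing each request to the replica on the least-loaded flash device bounds queueing delay when enough replicas exist; the block names, device ids and service time below are hypothetical.

```python
# Minimal sketch of replication-based retrieval, illustrative only: each block
# has replicas on several flash devices, and a request is routed to the replica
# whose device queue drains earliest.

def schedule_requests(requests, replicas, service_time):
    """requests: list of block ids in arrival order.
    replicas: hypothetical map block_id -> set of device ids holding a copy.
    service_time: per-request service time of one device (seconds).
    Returns (block, chosen device, completion time) for each request."""
    busy_until = {}          # device id -> time its current queue drains
    completions = []
    for block in requests:
        # Choose the replica on the least-loaded device.
        dev = min(replicas[block], key=lambda d: busy_until.get(d, 0.0))
        finish = busy_until.get(dev, 0.0) + service_time
        busy_until[dev] = finish
        completions.append((block, dev, finish))
    return completions

if __name__ == "__main__":
    reps = {"a": {0, 1}, "b": {0, 2}, "c": {1, 2}, "d": {0, 1}}
    for row in schedule_requests(["a", "b", "c", "d"], reps, service_time=0.2):
        print(row)
```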
Citations: 2
Synergy: A Middleware for Energy Conservation in Mobile Devices
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.64
Harshit Kharbanda, Manoj Krishnan, R. Campbell
The combined effect of Moore's law and the failure of Dennard scaling has led to multi-core mobile devices with immense computation capabilities. The biggest limitation of the computation capability for any mobile device is its battery. Mobile cloud computing is used to offload compute intensive tasks that affect a mobile device's battery. Mobile ad-hoc computing can be used as an alternative to mobile cloud computing in cases where cloud access is not available or is inhibitive to application performance, although battery drain remains a critical argument against mobile ad-hoc computing. In this paper, we present Synergy, a middleware that increases the battery life for a system of mobile devices connected in a peer-to-peer ad-hoc network. Synergy conserves energy by scaling core frequencies and by intelligently distributing the computation among peer devices. The middleware is not restricted to mobile phones and in no way restricts the mobility of the devices. Synergy considers the mobile devices connected in a peer-to-peer fashion as a single multicore device with Wifi as the interconnect. With Synergy running on Google Nexus phones we were able to conserve up to 30.6% of the system battery while incurring a latency penalty of less than 5%.
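
Purely as an illustration of the work-distribution half of the idea (this is not Synergy's actual policy, which also reasons about frequency scaling and energy), a divisible workload could be split across peers in proportion to each device's effective speed; the peer names and speeds below are made up.

```python
# Illustrative proportional split of a divisible workload across ad-hoc peers,
# each running at the frequency it has chosen; not Synergy's policy.

def split_work(total_items, peer_speed):
    """peer_speed: hypothetical map peer -> items/second at its scaled frequency.
    Returns how many items each peer should process."""
    total_speed = sum(peer_speed.values())
    shares = {p: int(total_items * s / total_speed) for p, s in peer_speed.items()}
    # Hand any rounding leftovers to the fastest peer.
    leftover = total_items - sum(shares.values())
    fastest = max(peer_speed, key=peer_speed.get)
    shares[fastest] += leftover
    return shares

if __name__ == "__main__":
    print(split_work(1000, {"phone_a": 3.0, "phone_b": 2.0, "phone_c": 1.0}))
```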
Citations: 8
Towards a Cost-Aware Data Migration Approach for Key-Value Stores
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.14
Xiulei Qin, Wen-bo Zhang, Wei Wang, Jun Wei, Xin Zhao, Tao Huang
Live data migration is an important technique for key-value stores. However, due to their stateful nature, new virtualization technology, stringent low-latency requirements and unexpected workload changes, key-value stores deployed in cloud environments face new challenges for data migration: the effects of VM interference, and the need to trade off between the two ingredients of migration cost, namely migration time and performance impact. To address these challenges, we focus on the data migration problem in a load-rebalancing scenario and build a new framework that aims to rebalance load while minimizing migration cost. We build two interference-aware prediction models that predict the migration time and performance impact of each action using statistical machine learning, and then create a cost model to strike the right balance between the two ingredients of cost. A cost-aware migration algorithm is designed to use the cost model and the balance rate to guide the choice of possible migration actions. We demonstrate the effectiveness of the data migration approach, as well as the cost model and the two prediction models, using YCSB.
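
A hedged sketch of how such a cost model might combine the two ingredients (the paper's predictors are learned, interference-aware models; the callables below are toy stand-ins): weight the predicted migration time against the predicted performance impact and pick the cheapest candidate action.

```python
# Toy sketch of a cost-aware choice among candidate migration actions; the
# predictor functions stand in for the paper's learned models.

def pick_action(candidates, predict_time, predict_impact, alpha=0.5):
    """candidates: list of hypothetical actions, e.g. (partition, src, dst),
    all assumed to rebalance the load. alpha weighs migration time (seconds)
    against performance impact (percent slowdown)."""
    def cost(action):
        return alpha * predict_time(action) + (1 - alpha) * predict_impact(action)
    return min(candidates, key=cost)

if __name__ == "__main__":
    actions = [("p1", "nodeA", "nodeB"), ("p2", "nodeA", "nodeC")]
    # Made-up predictions: migration time in seconds, impact in percent.
    mig_time = {"p1": 12.0, "p2": 20.0}
    impact = {"p1": 30.0, "p2": 5.0}
    print(pick_action(actions,
                      predict_time=lambda a: mig_time[a[0]],
                      predict_impact=lambda a: impact[a[0]]))
```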
Citations: 10
Built-in Device Simulator for OS Performance Evaluation
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.30
Junjie Mao, Yu Chen, Yaozu Dong
I/O devices are evolving rapidly, while OS optimization always lags behind because of its dependence on physical devices. This inevitably prevents the latest devices from delivering their rated performance, which remains a big problem for performance-critical applications. Although I/O device simulators can help carry out performance evaluation before physical devices are ready, existing simulator implementations are still unsatisfactory, either incurring too much overhead or requiring too much extra work. In this paper, we propose kernel built-in device simulation to provide accurate real-time evaluations with acceptable extra effort. With the simulation work well isolated, the overhead is reasonable compared to the native environment. A bonding Ethernet interface is implemented in this way, and experiments on it confirm the close-to-native performance of the idea.
Citations: 0
BWCC: A FS-Cache Based Cooperative Caching System for Network Storage System
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.41
Liu Shi, Zhenjun Liu, Lu Xu
A cooperative caching system that uses disks as its cache media is proposed for network storage systems. This system is called the Blue Whale Cooperative Caching System (BWCC). By sharing cached data among the clients of a cluster file system, the load on the centralized storage server is lowered; therefore, BWCC significantly enhances the scalability of the network storage system. The advantages of BWCC are as follows: 1) direct data positioning without the participation of the centralized storage server guarantees low latency for cooperative data access; 2) support for several granularities of data sharing makes it applicable to multiple data access patterns; 3) a global cache management strategy for Video-on-Demand service is designed according to the characteristics of the video data access pattern. BWCC has been implemented as a module in Linux Kernel-2.6.32. Preliminary experimental results verify the effectiveness of BWCC.
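
As a generic sketch of the cooperative-caching read path (not BWCC's kernel implementation, which builds on FS-Cache), a client would consult its own cache, then its peers' caches, and fall back to the centralized server only on a global miss; the dictionaries below stand in for real caches.

```python
# Illustrative lookup chain for cooperative caching: local cache, then peers'
# caches, then the centralized storage server. A generic sketch, not BWCC.

def read_block(block_id, local_cache, peer_caches, read_from_server):
    """local_cache / peer_caches: hypothetical dicts of cached blocks.
    read_from_server: callable hitting the centralized storage server."""
    if block_id in local_cache:
        return local_cache[block_id], "local"
    for peer, cache in peer_caches.items():
        if block_id in cache:
            data = cache[block_id]
            local_cache[block_id] = data        # keep a copy for later reads
            return data, f"peer:{peer}"
    data = read_from_server(block_id)
    local_cache[block_id] = data
    return data, "server"

if __name__ == "__main__":
    local = {}
    peers = {"client2": {"blk7": b"cached-bytes"}}
    print(read_block("blk7", local, peers, read_from_server=lambda b: b"from-server"))
    print(read_block("blk9", local, peers, read_from_server=lambda b: b"from-server"))
```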
Citations: 14
Overlay-Centric Load Balancing: Applications to UTS and B&B
Pub Date : 2012-09-24 DOI: 10.1109/CLUSTER.2012.17
Trong-Tuan Vu, B. Derbel, Ali Asim, A. Bendjoudi, N. Melab
To deal with dynamic load balancing in large scale distributed systems, we propose to organize computing resources following a logical peer-to-peer overlay and to distribute the load according to the so-defined overlay. We use a tree as the logical structure connecting distributed nodes and we balance the load according to the sizes of the induced subtrees. We conduct extensive experiments involving up to 1000 computing cores and provide a thorough analysis of different properties of our generic approach for two different applications, namely the standard Unbalanced Tree Search and the more challenging parallel Branch-and-Bound algorithm. Substantial improvements are reported in comparison with classical random work stealing and two finely tuned application-specific strategies taken from the literature.
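
A minimal sketch of the subtree-size idea in isolation (the paper's approach additionally involves overlay construction and application-specific work stealing): compute subtree sizes bottom-up, then hand each subtree a share of the total work proportional to its size; the overlay tree below is hypothetical.

```python
# Sketch of subtree-size-proportional load distribution on a tree overlay;
# illustrative only, not the paper's full protocol.

def subtree_sizes(children, root):
    """children: map node -> list of child nodes in the overlay tree."""
    sizes = {}
    def walk(node):
        sizes[node] = 1 + sum(walk(c) for c in children.get(node, []))
        return sizes[node]
    walk(root)
    return sizes

def distribute(units, children, root):
    """Assign work so that every subtree's total share is proportional to
    the number of overlay nodes it contains."""
    sizes = subtree_sizes(children, root)
    shares = {}
    def assign(node, budget):
        own = budget // sizes[node]              # this node's own slice
        shares[node] = own
        remaining = budget - own
        kids = children.get(node, [])
        if not kids:
            return
        kid_total = sum(sizes[c] for c in kids)
        handed = 0
        for c in kids[:-1]:
            part = remaining * sizes[c] // kid_total
            assign(c, part)
            handed += part
        assign(kids[-1], remaining - handed)     # last child absorbs rounding
    assign(root, units)
    return shares

if __name__ == "__main__":
    overlay = {"root": ["a", "b"], "a": ["a1", "a2"], "b": []}
    print(distribute(60, overlay, "root"))
```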
Citations: 5