Topology and Affinity Aware Hierarchical and Distributed Load-Balancing in Charm++

E. Jeannot, Guillaume Mercier, François Tessier
{"title":"Topology and Affinity Aware Hierarchical and Distributed Load-Balancing in Charm++","authors":"E. Jeannot, Guillaume Mercier, François Tessier","doi":"10.1109/COM-HPC.2016.12","DOIUrl":null,"url":null,"abstract":"The evolution of massively parallel supercomputers make palpable two issues in particular: the load imbalance and the poor management of data locality in applications. Thus, with the increase of the number of cores and the drastic decrease of amount of memory per core, the large performance needs imply to particularly take care of the load-balancing and as much as possible of the locality of data. One mean to take into account this locality issue relies on the placement of the processing entities and load balancing techniques are relevant in order to improve application performance. With large-scale platforms in mind, we developed a hierarchical and distributed algorithm which aim is to perform a topology-aware load balancing tailored for Charm++ applications. This algorithm is based on both LibTopoMap for the network awareness aspects and on TREEMATCH to determine a relevant placement of the processing entities. We show that the proposed algorithm improves the overall execution time in both the cases of real applications and a synthetic benchmark as well. For this last experiment, we show a scalability up to one millions processing entities.","PeriodicalId":332852,"journal":{"name":"2016 First International Workshop on Communication Optimizations in HPC (COMHPC)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 First International Workshop on Communication Optimizations in HPC (COMHPC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/COM-HPC.2016.12","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9

Abstract

The evolution of massively parallel supercomputers make palpable two issues in particular: the load imbalance and the poor management of data locality in applications. Thus, with the increase of the number of cores and the drastic decrease of amount of memory per core, the large performance needs imply to particularly take care of the load-balancing and as much as possible of the locality of data. One mean to take into account this locality issue relies on the placement of the processing entities and load balancing techniques are relevant in order to improve application performance. With large-scale platforms in mind, we developed a hierarchical and distributed algorithm which aim is to perform a topology-aware load balancing tailored for Charm++ applications. This algorithm is based on both LibTopoMap for the network awareness aspects and on TREEMATCH to determine a relevant placement of the processing entities. We show that the proposed algorithm improves the overall execution time in both the cases of real applications and a synthetic benchmark as well. For this last experiment, we show a scalability up to one millions processing entities.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于拓扑和亲和性的分层分布式负载均衡
大规模并行超级计算机的发展特别突出了两个问题:负载不平衡和应用程序中数据局部性管理不善。因此,随着核心数量的增加和每个核心内存量的急剧减少,大的性能需要特别注意负载平衡和尽可能多的数据位置。考虑局部性问题的一种方法是依赖于处理实体的位置和与之相关的负载平衡技术,以提高应用程序性能。考虑到大规模平台,我们开发了一种分层和分布式算法,旨在为Charm++应用程序执行拓扑感知负载平衡。该算法基于LibTopoMap(用于网络感知方面)和TREEMATCH(用于确定处理实体的相关位置)。我们证明了所提出的算法在实际应用程序和综合基准测试中都提高了总体执行时间。对于最后一个实验,我们展示了多达一百万个处理实体的可伸缩性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
DISP: Optimizations towards Scalable MPI Startup Topology and Affinity Aware Hierarchical and Distributed Load-Balancing in Charm++ Scalable Hierarchical Aggregation Protocol (SHArP): A Hardware Architecture for Efficient Data Reduction Efficient Reliability Support for Hardware Multicast-Based Broadcast in GPU-enabled Streaming Applications Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1