A Scalable Unified Model for Dynamic Data Structures in Message Passing (Clusters) and Shared Memory (multicore CPUs) Computing environments

G. Laccetti, M. Lapegna, R. Montella
2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), published 2018-05-01
DOI: 10.1109/CCGRID.2018.00007 (https://doi.org/10.1109/CCGRID.2018.00007)
Citations: 1

Abstract

Concurrent data structures are used at many levels of the software stack, from high-level parallel scientific applications to low-level operating systems. The key issue with these objects is their concurrent use by several computing units (threads or processes): their highly dynamic nature requires protocols that ensure data consistency, at a significant cost, so they are much harder to design than their sequential counterparts. In this regard, several studies emphasize a tension between the sequential correctness of concurrent data structures and the scalability of the algorithms, and in many cases the data structure design clearly needs to be rethought, using approaches based on randomization and/or redistribution techniques, in order to fully exploit the computational power of recent computing environments. The problem has grown in importance with the new generation of High Performance Computing systems, which aim at extreme performance. Such systems are based on heterogeneous architectures that integrate several independent nodes in the form of clusters or MPP systems, where each node is composed of powerful computing elements (CPU cores, GPUs or other acceleration devices) that share the node's resources. These systems therefore make heavy use of communication libraries to exchange data among the nodes, as well as other tools to manage the shared resources inside a single node. For this reason, developing algorithms and scientific software for dynamic data structures on these heterogeneous systems requires a suitable combination of several methodologies and tools, aware of the underlying platform, to deal with the different kinds of parallelism corresponding to each specific device.
The present work introduces a scalable model to manage a special class of dynamic data structures known as heap-based priority queues (or simply heaps) on these heterogeneous architectures. A heap is generally used when an application needs a set of data that does not require a complete ordering, but only access to some items tagged with high priority. To balance correct access to high-priority items by the several computing units against low communication and synchronization overhead, a suitable reorganization of the heap is needed. More precisely, we introduce a unified scalable model that can be used, with no modifications, to redeploy the items of a heap both in message passing environments (such as clusters and/or MPP multicomputers with several nodes) and in shared memory environments (such as CPUs and multiprocessors with several cores), with an overhead independent of the number of computing units. Computational results from applying the proposed strategy to some numerical case studies are presented for different types of computing environments.
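
The tension between sequential correctness and scalability mentioned above can be illustrated with a minimal Python sketch. It is not the model proposed in the paper: the round-robin split, the helper names `split_into_local_heaps` and `global_best`, and the sample `tasks` are assumptions made only for illustration. Each computing unit owns an independent local heap, so a unit can extract an item without synchronizing with the others, but the item it gets is not necessarily the globally highest-priority one.

```python
import heapq

# Smaller number = higher priority (heapq implements a min-heap).
# The sample data is hypothetical.
tasks = [(5, "e"), (1, "a"), (4, "d"), (2, "b"), (3, "c"), (6, "f")]

def split_into_local_heaps(items, num_units):
    """Deal items round-robin into one independent min-heap per computing unit."""
    local = [[] for _ in range(num_units)]
    for i, item in enumerate(items):
        local[i % num_units].append(item)
    for heap in local:
        heapq.heapify(heap)  # establishes heap order only, not a full sort
    return local

def global_best(local_heaps):
    """The overall highest-priority item is always the root of some local heap."""
    return min(heap[0] for heap in local_heaps if heap)

local_heaps = split_into_local_heaps(tasks, num_units=2)
best_before = global_best(local_heaps)

# Each unit pops from its own heap, with no communication between units.
first_per_unit = [heapq.heappop(heap) for heap in local_heaps]
```

Here unit 0 extracts `(3, "c")` even though `(2, "b")` still sits in unit 1's heap. Redistributing items among the units is one way to limit such priority inversions; how to do so with low overhead is the question the paper addresses.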