Portability and scalability evaluation of large-scale statistical modeling and prediction software through HPC-ready containers

IF 6.2 2区 计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS Future Generation Computer Systems-The International Journal of Escience Pub Date : 2024-07-09 DOI:10.1016/j.future.2024.06.057
{"title":"Portability and scalability evaluation of large-scale statistical modeling and prediction software through HPC-ready containers","authors":"","doi":"10.1016/j.future.2024.06.057","DOIUrl":null,"url":null,"abstract":"<div><p>HPC-based applications often have complex workflows with many software dependencies that hinder their portability on contemporary HPC architectures. In addition, these applications often require extraordinary efforts to deploy and execute at performance potential on new HPC systems, while the users expert in these applications generally have less expertise in HPC and related technologies. This paper provides a dynamic solution that facilitates containerization for transferring HPC software onto diverse parallel systems. The study relies on the HPC Workflow as a Service (HPCWaaS) paradigm proposed by the EuroHPC eFlows4HPC project. It offers to deploy workflows through containers tailored for any of a number of specific HPC systems. Traditional container image creation tools rely on OS system packages compiled for generic architecture families (x86_64, amd64, ppc64, …) and specific MPI or GPU runtime library versions. The containerization solution proposed in this paper leverages HPC Builders such as Spack or Easybuild and multi-platform builders such as buildx to create a service for automating the creation of container images for the software specific to each hardware architecture, aiming to sustain the overall performance of the software. We assess the efficiency of our proposed solution for porting the geostatistics ExaGeoStat software on various parallel systems while preserving the computational performance. The results show that the performance of the generated images is comparable with the native execution of the software on the same architectures. On the distributed-memory system, the containerized version can scale up to 256 nodes without impacting performance.</p></div>","PeriodicalId":55132,"journal":{"name":"Future Generation Computer Systems-The International Journal of Escience","volume":null,"pages":null},"PeriodicalIF":6.2000,"publicationDate":"2024-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Future Generation Computer Systems-The International Journal of Escience","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0167739X24003571","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0

Abstract

HPC-based applications often have complex workflows with many software dependencies that hinder their portability on contemporary HPC architectures. In addition, these applications often require extraordinary efforts to deploy and execute at performance potential on new HPC systems, while the users expert in these applications generally have less expertise in HPC and related technologies. This paper provides a dynamic solution that facilitates containerization for transferring HPC software onto diverse parallel systems. The study relies on the HPC Workflow as a Service (HPCWaaS) paradigm proposed by the EuroHPC eFlows4HPC project. It offers to deploy workflows through containers tailored for any of a number of specific HPC systems. Traditional container image creation tools rely on OS system packages compiled for generic architecture families (x86_64, amd64, ppc64, …) and specific MPI or GPU runtime library versions. The containerization solution proposed in this paper leverages HPC Builders such as Spack or Easybuild and multi-platform builders such as buildx to create a service for automating the creation of container images for the software specific to each hardware architecture, aiming to sustain the overall performance of the software. We assess the efficiency of our proposed solution for porting the geostatistics ExaGeoStat software on various parallel systems while preserving the computational performance. The results show that the performance of the generated images is comparable with the native execution of the software on the same architectures. On the distributed-memory system, the containerized version can scale up to 256 nodes without impacting performance.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
通过高性能计算就绪容器对大型统计建模和预测软件的可移植性和可扩展性进行评估
基于高性能计算的应用程序通常具有复杂的工作流程和许多软件依赖性,这阻碍了它们在当代高性能计算架构上的可移植性。此外,这些应用往往需要付出极大的努力才能在新的高性能计算系统上部署和执行并发挥其性能潜力,而这些应用的专业用户通常对高性能计算和相关技术的了解较少。本文提供了一种动态解决方案,可促进容器化,将高性能计算软件转移到不同的并行系统上。这项研究依赖于 EuroHPC eFlows4HPC 项目提出的高性能计算工作流即服务(HPCWaaS)范例。它提供了通过容器来部署工作流的方法,容器是为任何特定的高性能计算系统量身定制的。传统的容器镜像创建工具依赖于为通用架构系列(x86_64、amd64、ppc64......)和特定 MPI 或 GPU 运行库版本编译的操作系统系统包。本文提出的容器化解决方案利用 Spack 或 Easybuild 等 HPC 构建工具和 buildx 等多平台构建工具,创建了一项服务,可自动为特定于各种硬件架构的软件创建容器镜像,旨在保持软件的整体性能。我们评估了我们提出的解决方案在各种并行系统上移植地理统计 ExaGeoStat 软件并保持计算性能的效率。结果表明,在相同的架构上,生成图像的性能与软件的本地执行性能相当。在分布式内存系统上,容器化版本可以扩展到 256 个节点而不会影响性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
19.90
自引率
2.70%
发文量
376
审稿时长
10.6 months
期刊介绍: Computing infrastructures and systems are constantly evolving, resulting in increasingly complex and collaborative scientific applications. To cope with these advancements, there is a growing need for collaborative tools that can effectively map, control, and execute these applications. Furthermore, with the explosion of Big Data, there is a requirement for innovative methods and infrastructures to collect, analyze, and derive meaningful insights from the vast amount of data generated. This necessitates the integration of computational and storage capabilities, databases, sensors, and human collaboration. Future Generation Computer Systems aims to pioneer advancements in distributed systems, collaborative environments, high-performance computing, and Big Data analytics. It strives to stay at the forefront of developments in grids, clouds, and the Internet of Things (IoT) to effectively address the challenges posed by these wide-area, fully distributed sensing and computing systems.
期刊最新文献
Analyzing inference workloads for spatiotemporal modeling An efficient federated learning solution for the artificial intelligence of things Generative adversarial networks to detect intrusion and anomaly in IP flow-based networks Blockchain-based conditional privacy-preserving authentication scheme using PUF for vehicular ad hoc networks UAV-IRS-assisted energy harvesting for edge computing based on deep reinforcement learning
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1