In-Situ Techniques on GPU-Accelerated Data-Intensive Applications

Yi Ju, Mingshuai Li, Adalberto Perez, Laura Bellentani, Niclas Jansson, Stefano Markidis, Philipp Schlatter, Erwin Laure
arXiv - CS - Performance · arXiv:2407.20731 · Published 2024-07-30

Abstract

The computational power of High-Performance Computing (HPC) systems is constantly increasing; however, their input/output (IO) performance grows comparatively slowly, and their storage capacity is also limited. This imbalance presents significant challenges for applications such as Molecular Dynamics (MD) and Computational Fluid Dynamics (CFD), which generate massive amounts of data for further visualization or analysis. At the same time, checkpointing is crucial for long runs on HPC clusters, due to limited walltimes and/or failures of system components, and typically requires storing large amounts of data. Restricted IO performance and storage capacity can therefore become bottlenecks for full application workflows (as compared to computational kernels without IO). In-situ techniques, where data is further processed while still in memory rather than being written out over the IO subsystem, can help to tackle these problems. In contrast to traditional post-processing methods, in-situ techniques reduce or avoid the need to write or read data via the IO subsystem, offering a promising approach for applications aiming to leverage the full power of large-scale HPC systems. In-situ techniques can also be applied to hybrid computational nodes consisting of graphics processing units (GPUs) and central processing units (CPUs). On one node, the GPUs typically have significant performance advantages over the CPUs, so current approaches for GPU-accelerated applications often focus on maximizing GPU usage, leaving CPUs underutilized. In-situ tasks that use the CPUs to analyze or preprocess data concurrently with the running simulation offer a way to improve this underutilization.
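The concurrent CPU-side analysis described above can be sketched in miniature: a simulation loop (standing in for the GPU solver) hands each step's field to an analysis thread through an in-memory queue, so the data is reduced in place instead of being written out through the IO subsystem. This is a toy illustration of the idea only, not the paper's implementation; the field contents, the `analyze` reduction, and the queue size are all placeholder assumptions.

```python
import queue
import statistics
import threading


def run_in_situ(num_steps: int, analyze) -> list:
    """Toy in-situ pipeline: the main loop 'simulates', a CPU thread analyzes."""
    results = []
    # Bounded queue applies back-pressure if analysis falls behind the solver.
    work: queue.Queue = queue.Queue(maxsize=4)

    def analyst():
        while True:
            field = work.get()
            if field is None:          # sentinel: simulation finished
                break
            results.append(analyze(field))  # reduce while still in memory

    t = threading.Thread(target=analyst)
    t.start()
    for step in range(num_steps):
        # Stand-in for one (GPU) solver step producing a field snapshot.
        field = [float(step + i) for i in range(8)]
        work.put(field)                # hand off without touching the filesystem
    work.put(None)
    t.join()
    return results


# Only one reduced value per step survives, instead of the full raw field.
summaries = run_in_situ(3, statistics.mean)
print(summaries)  # → [3.5, 4.5, 5.5]
```

In a real GPU code the hand-off point would be the device-to-host copy of a snapshot, and the analysis thread would run on otherwise idle CPU cores while the GPU advances the next timestep.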