In-Situ Techniques on GPU-Accelerated Data-Intensive Applications

Yi Ju, Mingshuai Li, Adalberto Perez, Laura Bellentani, Niclas Jansson, Stefano Markidis, Philipp Schlatter, Erwin Laure

arXiv:2407.20731 (arXiv - CS - Performance), published 2024-07-30
Abstract
The computational power of High-Performance Computing (HPC) systems is
constantly increasing; however, their input/output (IO) performance grows
comparatively slowly, and their storage capacity is limited. This imbalance
presents significant challenges for applications such as Molecular Dynamics
(MD) and Computational Fluid Dynamics (CFD), which generate massive amounts of
data for later visualization or analysis. At the same time, checkpointing is
crucial for long runs on HPC clusters, due to limited walltimes and/or failures
of system components, and typically requires storing large amounts of data.
Restricted IO performance and storage capacity can therefore become
bottlenecks for full application workflows (as compared to computational
kernels without IO). In-situ techniques, where data is processed while still
in memory rather than being written out over the IO subsystem, can help tackle
these problems. In contrast to traditional post-processing methods, in-situ
techniques reduce or avoid the need to write or read data via the IO
subsystem, offering a promising approach for applications that aim to leverage
the full power of large-scale HPC systems. In-situ techniques can also be
applied on the hybrid compute nodes of HPC systems, which combine graphics
processing units (GPUs) and central processing units (CPUs). Within a node,
the GPUs have a significant performance advantage over the CPUs, so current
approaches for GPU-accelerated applications often focus on maximizing GPU
usage, leaving the CPUs underutilized. In-situ tasks that use the CPUs to
analyze or preprocess data concurrently with the running simulation offer a
way to mitigate this underutilization.
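The concurrent in-situ pattern described above can be illustrated with a minimal sketch: a simulation loop (which in the paper's setting would run on the GPU) hands each timestep's data to a CPU-side analysis worker through an in-memory queue, so the data is consumed while still in memory instead of being written out over the IO subsystem. All names here (`run_simulation`, `analyze`) are illustrative assumptions, not an API from the paper.

```python
import queue
import threading

def run_simulation(steps):
    """Produce per-timestep data and analyze it in-situ on a CPU thread."""
    results = []

    def analyze(q):
        # CPU-side in-situ task: consume timestep data while it is in memory.
        while True:
            step, data = q.get()
            if step is None:            # sentinel: simulation finished
                break
            results.append((step, sum(data) / len(data)))  # e.g. a mean field value

    q = queue.Queue(maxsize=4)          # small buffer bounds memory use
    worker = threading.Thread(target=analyze, args=(q,))
    worker.start()

    for step in range(steps):
        # Stand-in for data produced by the GPU-accelerated solver.
        data = [float(step + i) for i in range(8)]
        q.put((step, data))             # in-memory hand-off, no IO subsystem
    q.put((None, None))                 # signal completion
    worker.join()
    return results

print(run_simulation(3))
```

The bounded queue applies back-pressure: if the analysis thread falls behind, the producer blocks rather than letting buffered timesteps grow without limit, which mirrors the memory constraints an in-situ pipeline must respect on a real compute node.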