Jeremy J. Williams, Daniel Medeiros, Stefan Costea, David Tskhakaya, Franz Poeschel, René Widera, Axel Huebl, Scott Klasky, Norbert Podhorszki, Leon Kos, Ales Podolnik, Jakub Hromadka, Tapish Narwal, Klaus Steiniger, Michael Bussmann, Erwin Laure, Stefano Markidis
arXiv:2408.02869, arXiv - PHYS - Plasma Physics, published 2024-08-06.
Enabling High-Throughput Parallel I/O in Particle-in-Cell Monte Carlo Simulations with openPMD and Darshan I/O Monitoring
Large-scale HPC simulations of plasma dynamics in fusion devices require
efficient parallel I/O to avoid slowing down the simulation and to enable the
post-processing of critical information. Such complex simulations lacking
parallel I/O capabilities may encounter performance bottlenecks, hindering
their effectiveness in data-intensive computing tasks. In this work, we focus
on introducing and enhancing the efficiency of parallel I/O operations in
Particle-in-Cell Monte Carlo simulations. We first evaluate the scalability of
BIT1, a massively parallel electrostatic PIC MC code, determining its initial
write throughput capabilities and performance bottlenecks using an HPC I/O
performance monitoring tool, Darshan. We then design and develop an adaptor to
the openPMD I/O interface that streams PIC particle and field information
through the highly efficient ADIOS2 library, using its BP4 backend, which is
aggressively optimized for I/O efficiency. Next, we explore
advanced optimization techniques such as data compression, aggregation, and
Lustre file striping, achieving write throughput improvements while enhancing
data storage efficiency. Finally, we analyze the enhanced high-throughput
parallel I/O and storage capabilities achieved through the integration of
openPMD with rapid metadata extraction in BP4 format. Our study demonstrates
that the integration of openPMD and advanced I/O optimizations significantly
enhances BIT1's I/O performance and storage capabilities, successfully
introducing high-throughput parallel I/O and surpassing the capabilities of
traditional file I/O.
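As a rough illustration of the kind of backend tuning the abstract refers to, an ADIOS2 runtime configuration can select the BP4 engine and control write aggregation. The sketch below is not the configuration used in the paper; the group name and parameter values are placeholders, and the `SubStreams` setting is one illustrative aggregation knob BP4 exposes.

```xml
<?xml version="1.0"?>
<adios-config>
  <!-- The I/O group name is arbitrary; the application opens it by this name. -->
  <io name="openPMD">
    <engine type="BP4">
      <!-- Aggregate output from many MPI ranks into fewer files
           (illustrative value, tuned per system in practice). -->
      <parameter key="SubStreams" value="8"/>
    </engine>
  </io>
</adios-config>
```

Data compression in ADIOS2 is typically applied per variable through operators set in application code rather than in this file, and Lustre striping is configured separately on the output directory (e.g., with `lfs setstripe`); the abstract's reported throughput gains combine all three of these mechanisms.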