In situ and in-transit analysis of cosmological simulations

Brian Friesen, Ann Almgren, Zarija Lukić, Gunther Weber, Dmitriy Morozov, Vincent Beckner, Marcus Day
Journal: Computational Astrophysics and Cosmology
Published: 2016-08-24
DOI: 10.1186/s40668-016-0017-2
URL: https://link.springer.com/article/10.1186/s40668-016-0017-2
Citations: 25

Abstract

Modern cosmological simulations have reached the trillion-element scale, rendering data storage and subsequent analysis formidable tasks. To address this circumstance, we present a new MPI-parallel approach for analysis of simulation data while the simulation runs, as an alternative to the traditional workflow consisting of periodically saving large data sets to disk for subsequent ‘offline’ analysis. We demonstrate this approach in the compressible gasdynamics/N-body code Nyx, a hybrid \(\mbox{MPI}+\mbox{OpenMP}\) code based on the BoxLib framework, used for large-scale cosmological simulations. We have enabled on-the-fly workflows in two different ways: one is a straightforward approach consisting of all MPI processes periodically halting the main simulation and analyzing each component of data that they own (‘in situ’). The other consists of partitioning processes into disjoint MPI groups, with one performing the simulation and periodically sending data to the other ‘sidecar’ group, which post-processes it while the simulation continues (‘in-transit’). The two groups execute their tasks asynchronously, stopping only to synchronize when a new set of simulation data needs to be analyzed. For both the in situ and in-transit approaches, we experiment with two different analysis suites with distinct performance behavior: one which finds dark matter halos in the simulation using merge trees to calculate the mass contained within iso-density contours, and another which calculates probability distribution functions and power spectra of various fields in the simulation. Both are common analysis tasks for cosmology, and both result in summary statistics significantly smaller than the original data set. We study the behavior of each type of analysis in each workflow in order to determine the optimal configuration for the different data analysis algorithms.
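
The in-transit layout described above can be illustrated with a minimal sketch. It is plain Python rather than MPI code, and it is not taken from Nyx or BoxLib: the function names (`partition_ranks`, `color_of`, `analysis_steps`), the choice to place the sidecar ranks at the end of the rank range, and the fixed analysis interval are all assumptions made for illustration. In a real MPI program, the group label computed here would play the role of the `color` argument to `MPI_Comm_split`.

```python
# Hypothetical sketch of the in-transit process layout; names and the
# contiguous rank assignment are illustrative assumptions, not Nyx/BoxLib API.

def partition_ranks(world_size, n_sidecar):
    """Split rank ids 0..world_size-1 into two disjoint groups.

    Returns (simulation_ranks, sidecar_ranks). The last n_sidecar ranks
    are assigned to the sidecar (analysis) group.
    """
    if not 0 < n_sidecar < world_size:
        raise ValueError("each group needs at least one rank")
    n_sim = world_size - n_sidecar
    return list(range(n_sim)), list(range(n_sim, world_size))


def color_of(rank, world_size, n_sidecar):
    """Group label for one rank: 0 = simulation, 1 = sidecar.

    In an MPI code this value would be passed as the 'color' argument
    to MPI_Comm_split so that each group gets its own communicator.
    """
    return 0 if rank < world_size - n_sidecar else 1


def analysis_steps(n_steps, interval):
    """Time steps (1-based) at which the groups synchronize to hand over data.

    Between these steps the two groups run asynchronously.
    """
    return [step for step in range(1, n_steps + 1) if step % interval == 0]


if __name__ == "__main__":
    sim, sidecar = partition_ranks(world_size=8, n_sidecar=2)
    print("simulation ranks:", sim)
    print("sidecar ranks:   ", sidecar)
    print("sync at steps:   ", analysis_steps(n_steps=10, interval=4))
```

Setting `n_sidecar` to zero is rejected here because the in situ case needs no partition at all: every process pauses the simulation and analyzes its own data.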

About the journal: Computational Astrophysics and Cosmology (CompAC) is now closed and no longer accepting submissions. Springer will maintain an archive of all articles published in CompAC, accessible through SpringerLink's search functionality.