Using open-science workflow tools to produce SCEC CyberShake physics-based probabilistic seismic hazard models

S. Callaghan, P. Maechling, F. Silva, M. Su, K. Milner, Robert W. Graves, Kim B. Olsen, Yifeng Cui, K. Vahi, Albert Kottke, Christine A. Goulet, E. Deelman, Thomas H. Jordan, Y. Ben-Zion
{"title":"Using open-science workflow tools to produce SCEC CyberShake physics-based probabilistic seismic hazard models","authors":"S. Callaghan, P. Maechling, F. Silva, M. Su, K. Milner, Robert W. Graves, Kim B. Olsen, Yifeng Cui, K. Vahi, Albert Kottke, Christine A. Goulet, E. Deelman, Thomas H. Jordan, Y. Ben‐Zion","doi":"10.3389/fhpcp.2024.1360720","DOIUrl":null,"url":null,"abstract":"The Statewide (formerly Southern) California Earthquake Center (SCEC) conducts multidisciplinary earthquake system science research that aims to develop predictive models of earthquake processes, and to produce accurate seismic hazard information that can improve societal preparedness and resiliency to earthquake hazards. As part of this program, SCEC has developed the CyberShake platform, which calculates physics-based probabilistic seismic hazard analysis (PSHA) models for regions with high-quality seismic velocity and fault models. The CyberShake platform implements a sophisticated computational workflow that includes over 15 individual codes written by 6 developers. These codes are heterogeneous, ranging from short-running high-throughput serial CPU codes to large, long-running, parallel GPU codes. Additionally, CyberShake simulation campaigns are computationally extensive, typically producing tens of terabytes of meaningful scientific data and metadata over several months of around-the-clock execution on leadership-class supercomputers. To meet the needs of the CyberShake platform, we have developed an extreme-scale workflow stack, including the Pegasus Workflow Management System, HTCondor, Globus, and custom tools. We present this workflow software stack and identify how the CyberShake platform and supporting tools enable us to meet a variety of challenges that come with large-scale simulations, such as automated remote job submission, data management, and verification and validation. This platform enabled us to perform our most recent simulation campaign, CyberShake Study 22.12, from December 2022 to April 2023. During this time, our workflow tools executed approximately 32,000 jobs, and used up to 73% of the Summit system at Oak Ridge Leadership Computing Facility. Our workflow tools managed about 2.5 PB of total temporary and output data, and automatically staged 19 million output files totaling 74 TB back to archival storage on the University of Southern California's Center for Advanced Research Computing systems, including file-based relational data and large binary files to efficiently store millions of simulated seismograms. CyberShake extreme-scale workflows have generated simulation-based probabilistic seismic hazard models that are being used by seismological, engineering, and governmental communities.","PeriodicalId":399190,"journal":{"name":"Frontiers in High Performance Computing","volume":"42 14","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Frontiers in High Performance Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3389/fhpcp.2024.1360720","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The Statewide (formerly Southern) California Earthquake Center (SCEC) conducts multidisciplinary earthquake system science research that aims to develop predictive models of earthquake processes, and to produce accurate seismic hazard information that can improve societal preparedness and resiliency to earthquake hazards. As part of this program, SCEC has developed the CyberShake platform, which calculates physics-based probabilistic seismic hazard analysis (PSHA) models for regions with high-quality seismic velocity and fault models. The CyberShake platform implements a sophisticated computational workflow that includes over 15 individual codes written by 6 developers. These codes are heterogeneous, ranging from short-running high-throughput serial CPU codes to large, long-running, parallel GPU codes. Additionally, CyberShake simulation campaigns are computationally extensive, typically producing tens of terabytes of meaningful scientific data and metadata over several months of around-the-clock execution on leadership-class supercomputers. To meet the needs of the CyberShake platform, we have developed an extreme-scale workflow stack, including the Pegasus Workflow Management System, HTCondor, Globus, and custom tools. We present this workflow software stack and identify how the CyberShake platform and supporting tools enable us to meet a variety of challenges that come with large-scale simulations, such as automated remote job submission, data management, and verification and validation. This platform enabled us to perform our most recent simulation campaign, CyberShake Study 22.12, from December 2022 to April 2023. During this time, our workflow tools executed approximately 32,000 jobs, and used up to 73% of the Summit system at Oak Ridge Leadership Computing Facility. Our workflow tools managed about 2.5 PB of total temporary and output data, and automatically staged 19 million output files totaling 74 TB back to archival storage on the University of Southern California's Center for Advanced Research Computing systems, including file-based relational data and large binary files to efficiently store millions of simulated seismograms. CyberShake extreme-scale workflows have generated simulation-based probabilistic seismic hazard models that are being used by seismological, engineering, and governmental communities.
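
The abstract names the Pegasus Workflow Management System, HTCondor, and Globus as the core of the workflow stack. As a rough illustration of how such a workflow is expressed, the minimal sketch below uses the Pegasus 5.x Python API to chain a long-running parallel GPU job to a short serial post-processing job, mirroring the heterogeneous job mix the abstract describes. This is not the authors' actual CyberShake code; all job, executable, and file names are hypothetical.

```python
#!/usr/bin/env python3
# Hypothetical two-stage workflow in the style described in the abstract:
# a large parallel GPU job whose outputs feed short-running, high-throughput
# serial post-processing jobs. Executable and file names are illustrative.
from Pegasus.api import Workflow, Job, File

wf = Workflow("cybershake-like-site-workflow")

# Stage 1: a parallel GPU wave-propagation job producing strain Green
# tensor (SGT) volumes for one site (hypothetical transformation name).
sgt_x = File("site_sgt_x.bin")
sgt_y = File("site_sgt_y.bin")
sgt_job = (
    Job("sgt_gpu_solver")  # transformation registered in the catalog
    .add_args("--site", "USC", "--components", "xy")
    .add_outputs(sgt_x, sgt_y)
)

# Stage 2: one of many short serial CPU jobs that convolve the SGTs with
# rupture descriptions to synthesize seismograms.
seis = File("seismogram_src0_rup0.grm")
synth_job = (
    Job("seismogram_synthesis")
    .add_args("--source", "0", "--rupture", "0")
    .add_inputs(sgt_x, sgt_y)
    .add_outputs(seis)
)

# Pegasus infers the sgt_job -> synth_job dependency from the shared files,
# plans the abstract DAG onto a concrete execution site, and hands the
# resulting jobs to HTCondor for execution.
wf.add_jobs(sgt_job, synth_job)
wf.write("workflow.yml")
```

In a deployment like the one the abstract describes, Pegasus planning also generates the data-staging and cleanup jobs around this DAG, which is the automation layer the authors credit for remote job submission, data management, and the staging of millions of output files back to archival storage.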