Durango: Scalable Synthetic Workload Generation for Extreme-Scale Application Performance Modeling and Simulation

C. Carothers, J. Meredith, Mark P. Blanco, J. Vetter, M. Mubarak, Justin M. LaPre, S. Moore
Published in: Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, 2017-05-16
DOI: 10.1145/3064911.3064923 (https://doi.org/10.1145/3064911.3064923)
Citations: 18

Abstract

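Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next-generation supercomputing systems, because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces, which can be difficult to work with because of their size and because they cannot be extended or scaled up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications and implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework. Our models are compact, parameterized, and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug in which the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling. Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks. Durango also avoids the overheads and complexities associated with extreme-scale trace files.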
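To make the idea of a compact, parameterized workload concrete, the following is a minimal Python sketch of how an analytical compute-time estimate could drive the start of each communication phase. It is not Durango, Aspen, or CODES code. The cost model (`compute_time_ns`), the ring-neighbor exchange, and every name and default parameter are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Op:
    """One synthetic workload operation for a single rank (hypothetical format)."""
    kind: str                 # "compute" or "send"
    duration_ns: float = 0.0  # used by "compute" operations only
    dest: int = -1            # used by "send" operations only
    size_bytes: int = 0       # used by "send" operations only


def compute_time_ns(elements_per_rank: int,
                    flops_per_element: float,
                    flops_per_ns: float) -> float:
    """Assumed analytical cost model: total flops divided by machine rate."""
    return elements_per_rank * flops_per_element / flops_per_ns


def generate_rank_ops(rank: int,
                      num_ranks: int,
                      iterations: int,
                      elements_per_rank: int,
                      flops_per_element: float = 100.0,
                      flops_per_ns: float = 10.0,
                      message_bytes: int = 4096) -> List[Op]:
    """Generate the operation list for one rank from a handful of parameters.

    Each iteration is a compute phase followed by a communication phase.
    The compute duration comes from the analytical model; the network
    simulator, not this generator, would decide how long the sends take.
    """
    t_compute = compute_time_ns(elements_per_rank, flops_per_element, flops_per_ns)
    # Assumed communication pattern: exchange with left and right ring neighbors.
    neighbors = sorted({(rank - 1) % num_ranks, (rank + 1) % num_ranks} - {rank})
    ops: List[Op] = []
    for _ in range(iterations):
        ops.append(Op(kind="compute", duration_ns=t_compute))
        # The end of the compute phase triggers the communication phase.
        for dest in neighbors:
            ops.append(Op(kind="send", dest=dest, size_bytes=message_bytes))
    return ops


if __name__ == "__main__":
    for op in generate_rank_ops(rank=0, num_ranks=4, iterations=2,
                                elements_per_rank=1000):
        print(op)
```

Because the workload is a function of its parameters rather than a recorded trace, changing `num_ranks` or `elements_per_rank` yields a larger or smaller workload without rerunning any application. This is the property the abstract contrasts with static trace files.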