Evaluating MPI Message Size Summary Statistics

Kurt B. Ferreira, Scott Levy
{"title":"Evaluating MPI Message Size Summary Statistics","authors":"Kurt B. Ferreira, Scott Levy","doi":"10.1145/3416315.3416322","DOIUrl":null,"url":null,"abstract":"The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today’s high-performance computing (HPC) systems. This dominance stems from MPI’s powerful semantics for inter-process communication that has enabled scientists to write applications for simulating important physical phenomena. MPI does not, however, specify how messages and synchronization should be carried out. Those details are typically dependent on low-level architecture details and the message characteristics of the application. Therefore, analyzing an applications MPI usage is critical to tuning MPI’s performance on a particular platform. The results of this analysis is typically a discussion of average message sizes for a workload or set of workloads. While a discussion of the message average might be the most intuitive summary statistic, it might not be the most useful in terms of representing the entire message size dataset for an application. Using a previously developed MPI trace collector, we analyze the MPI message traces for a number of key MPI workloads. Through this analysis, we demonstrate that the average, while easy and efficient to calculate, may not be a good representation of all subsets of application messages sizes, with median and mode of message sizes being a superior choice in most cases. We show that the problem with using the average relate to the multi-modal nature of the distribution of point-to-point messages. Finally, we show that while scaling a workload has little discernible impact on which measures of central tendency are representative of the underlying data, different input descriptions can significantly impact which metric is most effective. The results and analysis in this paper have the potential for providing valuable guidance on how we as a community should discuss and analyze MPI message data for scientific applications.","PeriodicalId":176723,"journal":{"name":"Proceedings of the 27th European MPI Users' Group Meeting","volume":"34 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 27th European MPI Users' Group Meeting","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3416315.3416322","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The Message Passing Interface (MPI) remains the dominant programming model for scientific applications running on today's high-performance computing (HPC) systems. This dominance stems from MPI's powerful semantics for inter-process communication, which have enabled scientists to write applications that simulate important physical phenomena. MPI does not, however, specify how messaging and synchronization should be carried out; those details typically depend on low-level architectural details and the message characteristics of the application. Analyzing an application's MPI usage is therefore critical to tuning MPI's performance on a particular platform. The result of this analysis is typically a discussion of average message sizes for a workload or set of workloads. While the average may be the most intuitive summary statistic, it may not be the most useful for representing an application's entire message-size dataset. Using a previously developed MPI trace collector, we analyze the MPI message traces of a number of key MPI workloads. Through this analysis, we demonstrate that the average, while easy and efficient to calculate, may not be a good representation of all subsets of application message sizes, with the median and mode of message sizes being superior choices in most cases. We show that the problem with using the average relates to the multi-modal nature of the distribution of point-to-point message sizes. Finally, we show that while scaling a workload has little discernible impact on which measures of central tendency are representative of the underlying data, different input descriptions can significantly affect which metric is most effective. The results and analysis in this paper have the potential to provide valuable guidance on how we, as a community, should discuss and analyze MPI message data for scientific applications.
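To illustrate the abstract's central point, consider a hypothetical workload whose point-to-point traffic mixes many small control messages with occasional large bulk transfers. The sketch below is not from the paper, and the message counts and sizes are invented; it simply shows how the mean of such a bimodal distribution can describe no message that actually occurs, while the median and mode capture the common case.

# Minimal sketch with hypothetical data: why the mean can misrepresent a
# bimodal message-size distribution while the median and mode do not.
from statistics import mean, median, multimode

# Hypothetical trace: 900 eight-byte control messages and
# 100 one-MiB bulk transfers.
sizes = [8] * 900 + [1_048_576] * 100

print(f"mean:   {mean(sizes):,.1f} B")   # ~104,864.8 B -- matches no actual message
print(f"median: {median(sizes):,} B")    # 8 B -- the common case
print(f"mode:   {multimode(sizes)} B")   # [8] -- the most frequent size

Here the mean lands between the two modes of the distribution, a size the application never sends, which is the failure mode the paper attributes to averaging multi-modal point-to-point message data.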