Santos Dumont超级计算机的集体I/O性能

A. Carneiro, J. L. Bez, F. Boito, Bruno Alves Fagundes, Carla Osthoff, P. Navaux
{"title":"Santos Dumont超级计算机的集体I/O性能","authors":"A. Carneiro, J. L. Bez, F. Boito, Bruno Alves Fagundes, Carla Osthoff, P. Navaux","doi":"10.1109/PDP2018.2018.00015","DOIUrl":null,"url":null,"abstract":"The historical gap between processing and data access speeds causes many applications to spend a large portion of their execution on I/O operations. From the point of view of a large-scale, expensive, supercomputer, it is important to ensure applications achieve the best I/O performance to promote an efficient usage of the machine. In this paper, we evaluate the I/O infrastructure of the Santos Dumont supercomputer, the largest one from Latin America. More specifically, we investigate the performance of collective I/O operations. By conducting an analysis of a scientific application that uses the machine, we identify large performance differences between the available MPI implementations. We then further study the observed phenomenon using the BT-IO and IOR benchmarks, in addition to a custom microbenchmark. We conclude that the customized MPI implementation by Bull (used by more than 20% of the jobs) presents the worst performance for small collective write operations. Our results are being used to help the Santos Dumont users to achieve the best performance for their applications. Additionally, by investigating the observed phenomenon, we provide information to help improve future MPI-IO collective write implementations.","PeriodicalId":333367,"journal":{"name":"2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2018-03-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Collective I/O Performance on the Santos Dumont Supercomputer\",\"authors\":\"A. Carneiro, J. L. Bez, F. Boito, Bruno Alves Fagundes, Carla Osthoff, P. Navaux\",\"doi\":\"10.1109/PDP2018.2018.00015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The historical gap between processing and data access speeds causes many applications to spend a large portion of their execution on I/O operations. From the point of view of a large-scale, expensive, supercomputer, it is important to ensure applications achieve the best I/O performance to promote an efficient usage of the machine. In this paper, we evaluate the I/O infrastructure of the Santos Dumont supercomputer, the largest one from Latin America. More specifically, we investigate the performance of collective I/O operations. By conducting an analysis of a scientific application that uses the machine, we identify large performance differences between the available MPI implementations. We then further study the observed phenomenon using the BT-IO and IOR benchmarks, in addition to a custom microbenchmark. We conclude that the customized MPI implementation by Bull (used by more than 20% of the jobs) presents the worst performance for small collective write operations. Our results are being used to help the Santos Dumont users to achieve the best performance for their applications. Additionally, by investigating the observed phenomenon, we provide information to help improve future MPI-IO collective write implementations.\",\"PeriodicalId\":333367,\"journal\":{\"name\":\"2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-03-21\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PDP2018.2018.00015\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 26th Euromicro International Conference on Parallel, Distributed and Network-based Processing (PDP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDP2018.2018.00015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

摘要

处理速度和数据访问速度之间的历史差距导致许多应用程序将大部分执行时间花在I/O操作上。从大型、昂贵的超级计算机的角度来看,确保应用程序实现最佳I/O性能以促进机器的有效使用是很重要的。在本文中,我们评估了Santos Dumont超级计算机的I/O基础设施,这是拉丁美洲最大的超级计算机。更具体地说,我们研究了集合I/O操作的性能。通过对使用该机器的科学应用程序进行分析,我们确定了可用的MPI实现之间的巨大性能差异。然后,我们使用BT-IO和IOR基准测试以及自定义微基准测试进一步研究观察到的现象。我们得出结论,Bull的定制MPI实现(超过20%的作业使用)在小型集体写操作中表现出最差的性能。我们的结果被用来帮助Santos Dumont用户实现其应用程序的最佳性能。此外,通过调查观察到的现象,我们提供了有助于改进未来MPI-IO集体写实现的信息。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Collective I/O Performance on the Santos Dumont Supercomputer
The historical gap between processing and data access speeds causes many applications to spend a large portion of their execution on I/O operations. From the point of view of a large-scale, expensive, supercomputer, it is important to ensure applications achieve the best I/O performance to promote an efficient usage of the machine. In this paper, we evaluate the I/O infrastructure of the Santos Dumont supercomputer, the largest one from Latin America. More specifically, we investigate the performance of collective I/O operations. By conducting an analysis of a scientific application that uses the machine, we identify large performance differences between the available MPI implementations. We then further study the observed phenomenon using the BT-IO and IOR benchmarks, in addition to a custom microbenchmark. We conclude that the customized MPI implementation by Bull (used by more than 20% of the jobs) presents the worst performance for small collective write operations. Our results are being used to help the Santos Dumont users to achieve the best performance for their applications. Additionally, by investigating the observed phenomenon, we provide information to help improve future MPI-IO collective write implementations.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
TMbarrier: Speculative Barriers Using Hardware Transactional Memory Evaluating the Effect of Multi-Tenancy Patterns in Containerized Cloud-Hosted Content Management System A Generic Learning Multi-agent-System Approach for Spatio-Temporal-, Thermal- and Energy-Aware Scheduling Developing and Using a Geometric Multigrid, Unstructured Grid Mini-Application to Assess Many-Core Architectures Extending PluTo for Multiple Devices by Integrating OpenACC
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1