I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning

Proceedings of the 48th International Conference on Parallel Processing. Pub Date: 2019-08-05. DOI: 10.1145/3337821.3337902

Fahim Chowdhury, Yue Zhu, T. Heer, Saul Paredes, A. Moody, R. Goldstone, K. Mohror, Weikuan Yu

Citations: 33

Abstract

Parallel File Systems (PFSs) are frequently deployed on leadership High Performance Computing (HPC) systems to ensure efficient I/O, persistent storage, and scalable performance. Emerging Deep Learning (DL) applications impose new I/O and storage requirements on HPC systems through their batched input of small random files. This requires PFSs to offer commensurate features that meet the needs of DL applications. BeeGFS is a recently emerging PFS that has drawn the attention of both research and industry because of its performance, scalability, and ease of use. With an emphasis on systematic performance analysis, this paper presents the architectural and system features of BeeGFS and reports an experimental evaluation using cutting-edge I/O, metadata, and DL application benchmarks. In particular, we use the AlexNet and ResNet-50 models to classify the ImageNet dataset with the Livermore Big Artificial Neural Network Toolkit (LBANN), and an ImageNet data reader pipeline atop TensorFlow and Horovod. Through extensive performance characterization of BeeGFS, our study documents how to leverage BeeGFS for emerging DL applications.
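The access pattern the abstract refers to, batched input of small random files, can be illustrated with a minimal Python sketch. This is not the authors' benchmark and uses no BeeGFS, LBANN, or TensorFlow API. It assumes one reading of the phrase: many small files, each read in full once, in shuffled order, grouped into fixed-size batches. The function names, file names, sizes, and batch size are all hypothetical.

```python
import os
import random
import tempfile
import time


def make_small_files(directory, count, size_bytes):
    """Create `count` files of `size_bytes` random bytes; return their paths."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"sample_{i:05d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size_bytes))
        paths.append(path)
    return paths


def read_in_random_batches(paths, batch_size, seed=0):
    """Read every file once in shuffled order, one batch at a time.

    Returns (total_bytes_read, number_of_batches, elapsed_seconds).
    """
    order = list(paths)
    random.Random(seed).shuffle(order)  # emulate per-epoch sample shuffling
    total = 0
    batches = 0
    start = time.perf_counter()
    for i in range(0, len(order), batch_size):
        batch = order[i:i + batch_size]
        for path in batch:
            # Each sample is a separate open/read/close, so metadata
            # operations are issued once per small file.
            with open(path, "rb") as f:
                total += len(f.read())
        batches += 1
    elapsed = time.perf_counter() - start
    return total, batches, elapsed


if __name__ == "__main__":
    # Hypothetical sizes: 200 files of 4 KiB, batches of 32.
    with tempfile.TemporaryDirectory() as d:
        files = make_small_files(d, count=200, size_bytes=4096)
        total, batches, elapsed = read_in_random_batches(files, batch_size=32)
        print(f"{total} bytes in {batches} batches, {elapsed:.4f} s")
```

Pointing the temporary directory at a mount of the file system under test would exercise that file system instead of local storage. Timings from a sketch like this are only indicative, since page caching and client-side buffering are not controlled.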
