Explaining Wide Area Data Transfer Performance

Zhengchun Liu, Prasanna Balaprakash, R. Kettimuthu, Ian T Foster
Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, published 2017-06-26. DOI: https://doi.org/10.1145/3078597.3078605
Citation count: 39

Abstract

Disk-to-disk wide-area file transfers involve many subsystems and tunable application parameters that pose significant challenges for bottleneck detection, system optimization, and performance prediction. Performance models can be used to address these challenges but have not proved generally usable because of a need for extensive online experiments to characterize subsystems. We show here how to overcome the need for such experiments by applying machine learning methods to historical data to estimate parameters for predictive models. Starting with log data for millions of Globus transfers involving billions of files and hundreds of petabytes, we engineer features for endpoint CPU load, network interface card load, and transfer characteristics; and we use these features in both linear and nonlinear models of transfer performance. We show that the resulting models have high explanatory power. For a representative set of 30,653 transfers over 30 heavily used source-destination pairs ("edges"), totaling 2,053 TB in 46.6 million files, we obtain median absolute percentage prediction errors (MdAPE) of 7.0% and 4.6% when using distinct linear and nonlinear models per edge, respectively; when using a single nonlinear model for all edges, we obtain an MdAPE of 7.8%. Our work broadens understanding of factors that influence file transfer rate by clarifying relationships between achieved transfer rates, transfer characteristics, and competing load. Our predictions can be used for distributed workflow scheduling and optimization, and our features can also be used for optimization and explanation.
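
The abstract reports accuracy as median absolute percentage error (MdAPE) without spelling out the formula. The sketch below assumes the common definition, the median over transfers of 100 * |observed - predicted| / |observed|, and may differ in detail from the paper's own computation. The function name `mdape` and the sample rates are hypothetical and are not data from the paper.

```python
from statistics import median


def mdape(observed, predicted):
    """Median absolute percentage error, in percent (assumed standard definition)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    # Skip zero observations, for which a percentage error is undefined.
    errors = [
        100.0 * abs(o - p) / abs(o)
        for o, p in zip(observed, predicted)
        if o != 0
    ]
    if not errors:
        raise ValueError("need at least one nonzero observed value")
    return median(errors)


# Hypothetical transfer rates in MB/s, for illustration only.
observed_rates = [100.0, 200.0, 400.0, 800.0]
predicted_rates = [110.0, 190.0, 400.0, 600.0]
print(f"MdAPE: {mdape(observed_rates, predicted_rates):.1f}%")
```

Because the median is used rather than the mean, a single badly predicted transfer (the last one in this example) has limited influence on the reported error.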