Dynamic Sizing of Continuously Divisible Jobs for Heterogeneous Resources

Nicholas L. Hazekamp, Benjamín Tovar, D. Thain
Published in: 2019 15th International Conference on eScience (eScience)
Publication date: 2019-09-01
DOI: 10.1109/eScience.2019.00026 (https://doi.org/10.1109/eScience.2019.00026)
Citation count: 1

Abstract

Many scientific applications operate on large datasets that can be partitioned and processed concurrently. Existing approaches to concurrent execution generally rely on statically partitioned data. Static partitioning can lock performance into a sub-optimal configuration, leading to longer execution times and an inability to respond to dynamic resources. We present the Continuously Divisible Job abstraction, which allows the component tasks of statically defined applications to be dynamically sized in response to system behavior. The abstraction defines a simple interface that dictates how work can be recursively divided, executed, and merged. Implementing this abstraction allows scientific applications to leverage dynamic job coordinators for execution. We also propose the Virtual File abstraction, which allows read-only subsets of large files to be treated as separate files. To explore the Continuously Divisible Job abstraction, we implemented two applications using its interface: a bioinformatics application and a high-energy physics event analysis. These were tested using an abstract job interface and several job coordinators. Comparing them against a previous static partitioning implementation, we show comparable or better performance without having to make static decisions or implement complex dynamic application handling.
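The divide, execute, and merge interface described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class `SumJob`, the function `run_dynamic`, the method names, and the doubling/halving resize rule are all hypothetical, chosen only to show how a coordinator might resize tasks while a job is running. The coordinator here is sequential for simplicity, and the (start, stop) window over shared read-only data is only loosely in the spirit of the Virtual File idea.

```python
import time
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class SumJob:
    """Toy divisible job (hypothetical): sum the slice [start, stop) of a shared list."""
    data: Sequence[int]
    start: int
    stop: int

    def size(self) -> int:
        # Number of work units still contained in this job.
        return self.stop - self.start

    def split(self, n: int) -> Tuple["SumJob", "SumJob"]:
        # Carve off the first n units (at least one); return (piece, remainder).
        cut = min(self.start + max(n, 1), self.stop)
        return (SumJob(self.data, self.start, cut),
                SumJob(self.data, cut, self.stop))

    def execute(self) -> int:
        # Run this piece of work and return its partial result.
        return sum(self.data[self.start:self.stop])

    @staticmethod
    def merge(results: List[int]) -> int:
        # Combine partial results into the final answer.
        return sum(results)


def run_dynamic(job: SumJob, first_chunk: int = 4,
                target_seconds: float = 0.01) -> int:
    """Sequential stand-in for a job coordinator that resizes tasks as it goes."""
    chunk = first_chunk
    results: List[int] = []
    while job.size() > 0:
        piece, job = job.split(chunk)
        began = time.perf_counter()
        results.append(piece.execute())
        elapsed = time.perf_counter() - began
        # Assumed policy: grow the next task if this one finished quickly,
        # shrink it otherwise. A real coordinator could use other signals.
        chunk = chunk * 2 if elapsed < target_seconds else max(1, chunk // 2)
    return SumJob.merge(results)


if __name__ == "__main__":
    numbers = list(range(1000))
    print(run_dynamic(SumJob(numbers, 0, len(numbers))))
```

Because the application only states how its work is split and merged, the task size stays a run-time decision of the coordinator rather than a value fixed in advance, which is the contrast with static partitioning that the abstract draws.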