Iterative Scheduling for Distributed Stream Processing Systems

Leila Eskandari, J. Mair, Zhiyi Huang, D. Eyers
{"title":"Iterative Scheduling for Distributed Stream Processing Systems","authors":"Leila Eskandari, J. Mair, Zhiyi Huang, D. Eyers","doi":"10.1145/3210284.3219768","DOIUrl":null,"url":null,"abstract":"Nowadays data stream processing systems need to efficiently handle large volumes of data in near real-time. To achieve this, the schedulers within such systems minimise the data movement between highly communicating tasks, improving system throughput. However, finding an optimal schedule for these systems is NP-hard. In this research, we propose a heuristic scheduling algorithm which reliably and efficiently finds the highly communicating tasks by exploiting graph partitioning algorithms and a mathematical optimisation software package. We evaluate our scheduler with two popular existing schedulers R-Storm and Aniello et al.'s 'Online scheduler' using two real-world applications and show that our proposed scheduler outperforms R-Storm, increasing throughput by between 3% and 30% and Online scheduler by 20--86% as a result of finding a more efficient schedule.","PeriodicalId":412438,"journal":{"name":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3210284.3219768","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

Nowadays data stream processing systems need to efficiently handle large volumes of data in near real-time. To achieve this, the schedulers within such systems minimise the data movement between highly communicating tasks, improving system throughput. However, finding an optimal schedule for these systems is NP-hard. In this research, we propose a heuristic scheduling algorithm which reliably and efficiently finds the highly communicating tasks by exploiting graph partitioning algorithms and a mathematical optimisation software package. We evaluate our scheduler with two popular existing schedulers R-Storm and Aniello et al.'s 'Online scheduler' using two real-world applications and show that our proposed scheduler outperforms R-Storm, increasing throughput by between 3% and 30% and Online scheduler by 20--86% as a result of finding a more efficient schedule.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
分布式流处理系统的迭代调度
目前,数据流处理系统需要在接近实时的情况下高效地处理大量数据。为了实现这一点,这些系统中的调度器将高度通信任务之间的数据移动最小化,从而提高系统吞吐量。然而,为这些系统找到最优调度是np困难的。在本研究中,我们提出了一种启发式调度算法,该算法利用图划分算法和数学优化软件包可靠有效地找到高通信任务。我们使用两个流行的现有调度器R-Storm和Aniello等人的“在线调度器”来评估我们的调度器,并使用两个真实世界的应用程序,结果表明我们提出的调度器优于R-Storm,由于找到了更有效的调度,我们的调度器将吞吐量提高了3%到30%,在线调度器提高了20%到86%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Vessel Trajectory Prediction using Sequence-to-Sequence Models over Spatial Grid MtDetector Predicting Destinations by Nearest Neighbor Search on Training Vessel Routes Venilia, On-line Learning and Prediction of Vessel Destination Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1