A Task Scheduling Algorithm Based on Replication for Maximizing Reliability on Heterogeneous Computing Systems

Shuli Wang, Kenli Li, Jing Mei, Kuan-Ching Li, Yan Wang
Published in: 2014 IEEE International Parallel & Distributed Processing Symposium Workshops
Publication date: 2014-05-19
DOI: 10.1109/IPDPSW.2014.175 (https://doi.org/10.1109/IPDPSW.2014.175)
Citations: 2

Abstract

Over the past several years, heterogeneous computing (HC) systems have become more competitive than homogeneous systems as commercial computing platforms. As HC systems grow in scale, network failures become inevitable. To achieve high performance, communication reliability should be considered when designing reliability-aware task scheduling algorithms. In this paper, we propose a new algorithm called RMSR (Replication-based scheduling for Maximizing System Reliability), which incorporates task communication into system reliability. To maximize communication reliability, we propose an improved algorithm that searches all optimal-reliability communication paths for the current tasks. During the task replication phase, users set the task reliability threshold, and each task receives a dynamic number of replicas. Our comparative studies on randomly generated graphs show that RMSR outperforms existing scheduling algorithms in terms of system reliability. The paper also analyzes several factors that affect performance.
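The abstract names two ideas without giving details: searching for the most reliable communication path, and replicating a task until a user-defined reliability threshold is met. The sketch below illustrates both in general terms; it is not the authors' RMSR algorithm. It assumes that each link has an independent reliability in (0, 1], that path reliability is the product of its link reliabilities, and that a replicated task succeeds if at least one replica succeeds. The function names `most_reliable_path` and `replicate_until_threshold` and the sample data are hypothetical.

```python
import heapq
import math


def most_reliable_path(links, source, target):
    """Return (reliability, path) for the most reliable route from source to target.

    links maps node -> {neighbor: link reliability in (0, 1]} (assumed format).
    Maximizing a product of reliabilities equals minimizing the sum of -log(r),
    so ordinary Dijkstra search applies.
    """
    best = {source: 0.0}
    parent = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, math.inf):
            continue  # stale queue entry
        if node == target:
            break
        for nxt, r in links.get(node, {}).items():
            new_cost = cost - math.log(r)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return 0.0, []  # target unreachable
    path = [target]
    while path[-1] != source:
        path.append(parent[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


def replicate_until_threshold(candidate_reliabilities, threshold):
    """Greedily add replicas (most reliable first) until the threshold is met.

    Assumes independent replicas: R = 1 - prod(1 - r_i).
    Returns the chosen replica reliabilities and the reliability achieved,
    which may fall short of the threshold if candidates run out.
    """
    chosen = []
    failure = 1.0
    for r in sorted(candidate_reliabilities, reverse=True):
        if 1.0 - failure >= threshold:
            break
        chosen.append(r)
        failure *= 1.0 - r
    return chosen, 1.0 - failure


# Hypothetical three-processor network: the two-hop route beats the direct link.
links = {"p1": {"p2": 0.9, "p3": 0.99}, "p3": {"p2": 0.99}}
reliability, path = most_reliable_path(links, "p1", "p2")
replicas, achieved = replicate_until_threshold([0.9, 0.8, 0.95], 0.99)
```

Under these assumptions the number of replicas varies per task: a task whose best placement already meets the threshold gets a single copy, while a task with less reliable options gets more.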