An Improved Replica Placement Policy for Hadoop Distributed File System Running on Cloud Platforms
Wei Dai, Ibrahim Adel Ibrahim, M. Bassiouni
2017 IEEE 4th International Conference on Cyber Security and Cloud Computing (CSCloud), 2017-06-01
DOI: 10.1109/CSCloud.2017.65 (https://doi.org/10.1109/CSCloud.2017.65)
Citations: 19
Abstract
Load balancing is crucial for data-intensive computing on cloud platforms, because a load-balanced cluster can significantly reduce the completion time of data-intensive jobs. In this paper, we present an improved replica placement policy for the Hadoop Distributed File System (HDFS) that is specifically designed for heterogeneous clusters. The default HDFS replica placement policy cannot generate a balanced replica assignment and therefore has to rely on a load-balancing utility to even out the load among cluster nodes. In contrast, our proposed policy can generate a perfectly even replica assignment and achieve load balance among cluster nodes in both heterogeneous and homogeneous environments, without running the load-balancing utility.
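
To make the notion of an "even replica assignment" on a heterogeneous cluster concrete, the following is a minimal illustrative sketch. It is not the algorithm from the paper, which the abstract does not describe. It compares a simplified uniform-random baseline (an assumption made for illustration, not the rack-aware HDFS default) with a greedy rule that places each block's replicas on the distinct nodes with the lowest replicas-to-capacity ratio. The node names, capacities, and function names are all hypothetical.

```python
import random


def place_random(num_blocks, capacities, replication=3, seed=0):
    """Simplified baseline: replicas of each block go to distinct nodes
    chosen uniformly at random, ignoring node capacity."""
    rng = random.Random(seed)
    nodes = list(capacities)
    counts = {n: 0 for n in nodes}
    for _ in range(num_blocks):
        for n in rng.sample(nodes, replication):
            counts[n] += 1
    return counts


def place_least_utilized(num_blocks, capacities, replication=3):
    """Greedy sketch: replicas of each block go to the distinct nodes
    that currently have the lowest replicas/capacity ratio."""
    counts = {n: 0 for n in capacities}
    for _ in range(num_blocks):
        # Rank nodes by current utilization; break ties by name for determinism.
        ranked = sorted(counts, key=lambda n: (counts[n] / capacities[n], n))
        for n in ranked[:replication]:
            counts[n] += 1
    return counts


def utilization_spread(counts, capacities):
    """Gap between the most and least utilized node (0 means perfectly even)."""
    ratios = [counts[n] / capacities[n] for n in counts]
    return max(ratios) - min(ratios)


if __name__ == "__main__":
    # Hypothetical heterogeneous cluster: two small nodes, three large ones.
    capacities = {"a": 100, "b": 100, "c": 200, "d": 200, "e": 200}
    for policy in (place_random, place_least_utilized):
        counts = policy(800, capacities)
        print(policy.__name__, counts, utilization_spread(counts, capacities))
```

In this toy setting the random baseline loads small and large nodes with roughly the same number of replicas, so the small nodes end up relatively fuller, whereas the greedy rule keeps the utilization ratios close together. An even assignment is only attainable when no single node holds more than 1/replication of the total capacity, since each block can place at most one replica per node.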