Self adaptive Hadoop scheduler for heterogeneous resources
A. Elkholy, E. Sallam
2014 9th International Conference on Computer Engineering & Systems (ICCES), 2014-12-01
DOI: 10.1109/ICCES.2014.7030999 (https://doi.org/10.1109/ICCES.2014.7030999)
Hadoop is a widely used framework for processing large data sets, and its scheduler is a critical component with a large effect on overall performance. Designing a dynamic scheduler that adapts both to the differing computing capabilities of the nodes and to the changing performance of each individual node is a challenging problem. Most current Hadoop schedulers assume that the resources on which Hadoop runs are homogeneous and assign each node in the cluster a fixed capacity for the whole run, neglecting both the differences in computing capability between nodes and the variation in each node's performance over the run time. This causes under- or over-utilization of resources, poor performance, and longer run times. We therefore propose a dynamic Hadoop scheduler that adapts to the performance and computing capabilities of each node separately. The proposed scheduler controls the capacity of each node, which is represented by the number of tasks the node can process concurrently. It extends or shrinks the capacity of each node depending on the node's available resources and its performance over the run time. Our scheduler is implemented on Hadoop and compared with the Hadoop Fair Scheduler. The experimental results show that our scheduler achieves a lower average completion time and higher resource utilization.
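
The abstract does not give the adjustment rule itself, so the following is only a minimal sketch of the general idea of extending or shrinking a node's concurrent-task capacity from its observed load. Everything in it is an assumption made for illustration: the `NodeState` fields, the use of CPU utilization as the sole signal, the `low` and `high` thresholds, the one-slot step, and the capacity bounds. It is not the authors' algorithm and does not use any Hadoop API.

```python
from dataclasses import dataclass


@dataclass
class NodeState:
    """Hypothetical per-node view; field names are illustrative, not from the paper."""
    capacity: int            # number of tasks the node may run concurrently
    cpu_utilization: float   # fraction in [0, 1], assumed to be reported by the node
    min_capacity: int = 1    # assumed lower bound on capacity
    max_capacity: int = 16   # assumed upper bound on capacity


def adjust_capacity(node: NodeState, low: float = 0.5, high: float = 0.9) -> int:
    """Return the node's next capacity.

    Assumed rule: add one slot when the node is under-utilized, remove one
    slot when it is overloaded, and otherwise keep the capacity unchanged.
    The result is clamped to [min_capacity, max_capacity].
    """
    if node.cpu_utilization < low:
        new_capacity = node.capacity + 1   # spare resources: extend
    elif node.cpu_utilization > high:
        new_capacity = node.capacity - 1   # overloaded: shrink
    else:
        new_capacity = node.capacity       # within the target band: keep
    return max(node.min_capacity, min(node.max_capacity, new_capacity))


if __name__ == "__main__":
    lightly_loaded = NodeState(capacity=4, cpu_utilization=0.30)
    overloaded = NodeState(capacity=4, cpu_utilization=0.95)
    print(adjust_capacity(lightly_loaded), adjust_capacity(overloaded))  # prints "5 3"
```

Under these assumptions, two nodes that start with the same capacity diverge over successive adjustments according to their own load, which is the per-node behavior the abstract contrasts with a fixed capacity. A real scheduler would likely combine several signals, such as memory, I/O, and task progress, rather than CPU utilization alone.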