{"title":"先调度,后管理:网络感知负载均衡","authors":"Amir Nahir, A. Orda, D. Raz","doi":"10.1109/INFCOM.2013.6566825","DOIUrl":null,"url":null,"abstract":"Load balancing in large distributed server systems is a complex optimization problem of critical importance in cloud systems and data centers. Existing schedulers often incur a high overhead in communication when collecting the data required to make the scheduling decision, hence delaying the job request on its way to the executing server. We propose a novel scheme that incurs no communication overhead between the users and the servers upon job arrival, thus removing any scheduling overhead from the job's critical path. Our approach is based on creating several replicas of each job and sending each replica to a different server. Upon the arrival of a replica to the head of the queue at its server, the latter signals the servers holding replicas of that job, so as to remove them from their queues. We show, through analysis and simulations, that this scheme improves the expected queuing overhead over traditional schemes by a factor of 9 (or more) under various load conditions. In addition, we show that our scheme remains efficient even when the inter-server signal propagation delay is significant (relative to the job's execution time). We provide heuristic solutions to the performance degradation that occurs in such cases and show, by simulations, that they efficiently mitigate the detrimental effect of propagation delays. Finally, we demonstrate the efficiency of our proposed scheme in a real-world environment by implementing a load balancing system based on it, deploying the system on the Amazon Elastic Compute Cloud (EC2), and measuring its performance.","PeriodicalId":206346,"journal":{"name":"2013 Proceedings IEEE INFOCOM","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Schedule first, manage later: Network-aware load balancing\",\"authors\":\"Amir Nahir, A. Orda, D. Raz\",\"doi\":\"10.1109/INFCOM.2013.6566825\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Load balancing in large distributed server systems is a complex optimization problem of critical importance in cloud systems and data centers. Existing schedulers often incur a high overhead in communication when collecting the data required to make the scheduling decision, hence delaying the job request on its way to the executing server. We propose a novel scheme that incurs no communication overhead between the users and the servers upon job arrival, thus removing any scheduling overhead from the job's critical path. Our approach is based on creating several replicas of each job and sending each replica to a different server. Upon the arrival of a replica to the head of the queue at its server, the latter signals the servers holding replicas of that job, so as to remove them from their queues. We show, through analysis and simulations, that this scheme improves the expected queuing overhead over traditional schemes by a factor of 9 (or more) under various load conditions. In addition, we show that our scheme remains efficient even when the inter-server signal propagation delay is significant (relative to the job's execution time). We provide heuristic solutions to the performance degradation that occurs in such cases and show, by simulations, that they efficiently mitigate the detrimental effect of propagation delays. Finally, we demonstrate the efficiency of our proposed scheme in a real-world environment by implementing a load balancing system based on it, deploying the system on the Amazon Elastic Compute Cloud (EC2), and measuring its performance.\",\"PeriodicalId\":206346,\"journal\":{\"name\":\"2013 Proceedings IEEE INFOCOM\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 Proceedings IEEE INFOCOM\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/INFCOM.2013.6566825\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 Proceedings IEEE INFOCOM","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/INFCOM.2013.6566825","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Load balancing in large distributed server systems is a complex optimization problem of critical importance in cloud systems and data centers. Existing schedulers often incur a high overhead in communication when collecting the data required to make the scheduling decision, hence delaying the job request on its way to the executing server. We propose a novel scheme that incurs no communication overhead between the users and the servers upon job arrival, thus removing any scheduling overhead from the job's critical path. Our approach is based on creating several replicas of each job and sending each replica to a different server. Upon the arrival of a replica to the head of the queue at its server, the latter signals the servers holding replicas of that job, so as to remove them from their queues. We show, through analysis and simulations, that this scheme improves the expected queuing overhead over traditional schemes by a factor of 9 (or more) under various load conditions. In addition, we show that our scheme remains efficient even when the inter-server signal propagation delay is significant (relative to the job's execution time). We provide heuristic solutions to the performance degradation that occurs in such cases and show, by simulations, that they efficiently mitigate the detrimental effect of propagation delays. Finally, we demonstrate the efficiency of our proposed scheme in a real-world environment by implementing a load balancing system based on it, deploying the system on the Amazon Elastic Compute Cloud (EC2), and measuring its performance.