Evaluation of Energy Consumption of Replicated Tasks in a Volunteer Computing Environment
Authors: A. McGough, M. Forshaw
Published in: Companion of the 2018 ACM/SPEC International Conference on Performance Engineering
Publication date: 2018-04-02
DOI: 10.1145/3185768.3186313 (https://doi.org/10.1145/3185768.3186313)
Citations: 2
Abstract
High Throughput Computing allows workloads of many thousands of tasks to be performed efficiently over many distributed resources and frees the user from the laborious process of managing task deployment, execution and result collection. However, in many cases the High Throughput Computing system is composed of volunteer computational resources, where tasks may be evicted by the owner of the resource. This has two main disadvantages. First, tasks may take longer to run, as they may require multiple deployments before finally obtaining enough time on a resource to complete. Second, the wasted computation time leads to wasted energy. We may be able to reduce the effect of the first disadvantage by submitting multiple replicas of the task and taking the result from the first one to complete. This, though, could lead to a significant increase in energy consumption. We therefore aim to submit only the minimum number of replicas required to complete the task in the allocated time, whilst simultaneously minimising energy consumption. In this work we evaluate the effect of fixed replica counts and Reinforcement Learning on the proportion of tasks that fail to finish in a given time-frame and on the energy consumed by the system.
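
The trade-off described above, between missed deadlines and energy under a fixed replica count, can be illustrated with a small Monte Carlo sketch. This is a toy model and not the authors' simulation. It assumes that each replica runs on its own resource, that the time until eviction is exponentially distributed, that an evicted replica restarts from scratch immediately, that all replicas are cancelled once the first one finishes or the deadline passes, and that every busy resource draws a constant power. The function names and every parameter value (`work`, `deadline`, `evict_rate`, `power_watts`) are hypothetical.

```python
import random


def replica_finish_time(work, deadline, evict_rate, rng):
    """Time at which one replica completes, or infinity if it never does.

    Assumption: the time until the owner evicts the task is exponential with
    rate `evict_rate`, and an evicted replica is redeployed immediately and
    restarts from the beginning.
    """
    t = 0.0
    while t < deadline:
        uptime = rng.expovariate(evict_rate)
        if uptime >= work:
            return t + work  # enough uninterrupted time to complete
        t += uptime  # evicted: the computation done so far is wasted
    return float("inf")


def simulate(replicas, tasks=2000, work=1.0, deadline=3.0,
             evict_rate=0.5, power_watts=100.0, seed=0):
    """Return (fraction of tasks missing the deadline, energy in watt-hours).

    Times are in hours. All defaults are illustrative, not from the paper.
    """
    rng = random.Random(seed)
    missed = 0
    energy = 0.0
    for _ in range(tasks):
        # The task result is taken from the first replica to complete.
        first = min(replica_finish_time(work, deadline, evict_rate, rng)
                    for _ in range(replicas))
        if first > deadline:
            missed += 1
        # Every replica keeps a resource busy until the first one finishes
        # or the deadline passes, whichever comes first.
        busy = min(first, deadline)
        energy += replicas * busy * power_watts
    return missed / tasks, energy


if __name__ == "__main__":
    for n in (1, 2, 3):
        miss, wh = simulate(n)
        print(f"replicas={n} miss_fraction={miss:.3f} energy_wh={wh:.0f}")
```

Under these assumptions, adding replicas lowers the fraction of tasks that miss the deadline while raising total energy, which is the tension a policy that picks the replica count per task would try to balance.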