Pub Date : 2024-02-01 DOI: 10.1109/TSUSC.2024.3360975
Fu Jiang;Yaoxin Xia;Lisen Yan;Weirong Liu;Xiaoyong Zhang;Heng Li;Jun Peng
Battery degradation is a major obstacle to extending the operating lifespan of portable heterogeneous computing devices. Excessive energy consumption and pronounced current fluctuations can lead to a sharp decline in battery endurance. To address this issue, a battery-aware workflow scheduling algorithm is proposed to maximize battery lifetime while fully exploiting the computing potential of the device. First, a dynamic optimal budget strategy is developed to select the most cost-effective processors that meet the deadline of each task, with the budget optimization accelerated by a deep neural network. Second, an integer-programming greedy strategy is used to determine the start time of each task, minimizing fluctuations in the battery supply current to mitigate battery degradation. Finally, a long-term operation experiment and Monte Carlo experiments are performed on the battery simulator SLIDE. Experimental results under real operating conditions spanning more than 1800 hours validate that the proposed scheduling algorithm can effectively extend battery life by 7.31%-8.23%. Results on various parallel workflows show that the proposed algorithm achieves performance comparable to the integer programming method while offering a clear speed improvement.
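As a rough illustration of the two-stage idea described in this abstract (not the authors' implementation), the following Python sketch first assigns each task to the lowest-current processor that still meets its deadline, then greedily shifts start times within each task's slack to flatten the battery supply-current profile. The task set, processor speeds, and current draws are hypothetical values chosen only for the example.

```python
# Hedged sketch of a two-stage battery-aware scheduler (hypothetical data).
# Stage 1: pick the most cost-effective processor that meets each task's deadline.
# Stage 2: greedily shift start times inside the available slack to flatten the
#          battery supply-current profile (a proxy for mitigating degradation).

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    work: float          # abstract work units
    deadline: float      # latest allowed finish time (s)
    slack: float         # how far the start time may be shifted (s)

# Hypothetical processors: (speed in work units/s, supply current draw in A)
PROCESSORS = {"big": (4.0, 2.0), "mid": (2.0, 1.0), "little": (1.0, 0.4)}

def assign_processor(task: Task) -> str:
    """Stage 1: lowest-current processor whose runtime meets the deadline."""
    feasible = [(amp, name) for name, (speed, amp) in PROCESSORS.items()
                if task.work / speed <= task.deadline]
    if not feasible:
        raise ValueError(f"no processor meets the deadline of {task.name}")
    return min(feasible)[1]

def schedule(tasks, horizon=10, step=1.0):
    """Stage 2: greedy start-time choice that limits the peak current per slot."""
    profile = [0.0] * int(horizon / step)     # aggregate current per time slot
    plan = {}
    for t in tasks:
        proc = assign_processor(t)
        speed, amp = PROCESSORS[proc]
        slots = max(1, int(round(t.work / speed / step)))
        best_start, best_peak = 0, float("inf")
        for start in range(0, int(t.slack / step) + 1):
            if start + slots > len(profile):
                break
            peak = max(profile[s] + amp for s in range(start, start + slots))
            if peak < best_peak:              # keep the start that adds least to the peak
                best_start, best_peak = start, peak
        for s in range(best_start, best_start + slots):
            profile[s] += amp
        plan[t.name] = (proc, best_start * step)
    return plan, profile

if __name__ == "__main__":
    tasks = [Task("t1", 4.0, 2.0, 4.0), Task("t2", 2.0, 3.0, 6.0), Task("t3", 1.0, 2.0, 8.0)]
    plan, profile = schedule(tasks)
    print(plan)
    print("current profile:", profile)
```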
{"title":"Battery-Aware Workflow Scheduling for Portable Heterogeneous Computing","authors":"Fu Jiang;Yaoxin Xia;Lisen Yan;Weirong Liu;Xiaoyong Zhang;Heng Li;Jun Peng","doi":"10.1109/TSUSC.2024.3360975","DOIUrl":"https://doi.org/10.1109/TSUSC.2024.3360975","url":null,"abstract":"Battery degradation is a main hinder to extend the persistent lifespan of the portable heterogeneous computing device. Excessive energy consumption and prominent current fluctuations can lead to a sharp decline of battery endurance. To address this issue, a battery-aware workflow scheduling algorithm is proposed to maximize the battery lifetime and release the computing potential of the device fully. First, a dynamic optimal budget strategy is developed to select the highest cost-effectiveness processors to meet the deadline of each task, accelerating the budget optimization by incorporating deep neural network. Second, an integer-programming greedy strategy is utilized to determine the start time of each task, minimizing the fluctuation of the battery supply current to mitigate the battery degradation. Finally, a long-term operation experiment and Monte Carlo experiments are performed on the battery simulator, SLIDE. The experimental results under real operating conditions for more than 1800 hours validate that the proposed scheduling algorithm can effectively extend the battery life by 7.31%-8.23%. The results on various parallel workflows illustrate that the proposed algorithm has comparable performance with speed improvement over the integer programming method.","PeriodicalId":13268,"journal":{"name":"IEEE Transactions on Sustainable Computing","volume":"9 4","pages":"677-694"},"PeriodicalIF":3.0,"publicationDate":"2024-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141965765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Computing servers have played a key role in developing and processing emerging compute-intensive applications in recent years. Consolidating multiple virtual machines (VMs) inside one server to run various applications introduces severe competition for the limited resources among VMs. Many techniques, such as VM scheduling and resource provisioning, have been proposed to maximize the cost-efficiency of computing servers while alleviating performance interference between VMs. However, these management techniques require accurate performance prediction for the application running inside the VM, which is difficult to obtain in the public cloud due to the black-box nature of VMs. From this perspective, this paper proposes a novel machine learning-based performance prediction approach for applications running in the cloud. To achieve high-accuracy predictions for black-box VMs, the proposed method first identifies the application running inside the virtual machine. It then selects highly correlated runtime metrics as the input of the machine learning approach to accurately predict the performance level of the cloud application. Experimental results with state-of-the-art cloud benchmarks demonstrate that our proposed method outperforms existing prediction methods by more than 2× in terms of worst-case prediction error. In addition, we tackle the challenge of performance prediction for applications with variable workloads by introducing a performance degradation index, which the comparison methods fail to consider. The versatility of the proposed approach has been verified with different modern servers and VM configurations.
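The pipeline described above (identify the application, keep only runtime metrics that correlate strongly with the performance target, then train a predictor) could look roughly like the sketch below. The metric names, the 0.3 correlation threshold, the synthetic data, and the random-forest model are illustrative assumptions, not the paper's actual feature set or model.

```python
# Rough sketch of a correlation-filtered performance predictor for black-box VMs.
# Metric names, the correlation threshold, and the random-forest model are
# illustrative assumptions; the paper's actual design may differ.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic runtime metrics observable from outside the VM (one row per sample).
METRICS = ["cpu_util", "llc_misses", "mem_bw", "disk_iops", "net_pkts", "ctx_switches"]
X = rng.random((500, len(METRICS)))
# Synthetic "performance level" that depends mostly on a few metrics plus noise.
y = 3.0 * X[:, 0] + 2.0 * X[:, 2] + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(500)

# Step 1: keep only metrics highly correlated with the performance target.
corr = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
selected = [j for j in range(X.shape[1]) if corr[j] >= 0.3]
print("selected metrics:", [METRICS[j] for j in selected])

# Step 2: train a regressor on the selected metrics to predict the performance level.
X_tr, X_te, y_tr, y_te = train_test_split(X[:, selected], y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Report the worst-case (maximum) prediction error, the metric highlighted above.
err = np.abs(model.predict(X_te) - y_te)
print(f"worst prediction error: {err.max():.3f}, mean error: {err.mean():.3f}")
```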
{"title":"CloudProphet: A Machine Learning-Based Performance Prediction for Public Clouds","authors":"Darong Huang;Luis Costero;Ali Pahlevan;Marina Zapater;David Atienza","doi":"10.1109/TSUSC.2024.3359325","DOIUrl":"https://doi.org/10.1109/TSUSC.2024.3359325","url":null,"abstract":"Computing servers have played a key role in developing and processing emerging compute-intensive applications in recent years. Consolidating multiple virtual machines (VMs) inside one server to run various applications introduces severe competence for limited resources among VMs. Many techniques such as VM scheduling and resource provisioning are proposed to maximize the cost-efficiency of the computing servers while alleviating the performance inference between VMs. However, these management techniques require accurate performance prediction of the application running inside the VM, which is challenging to get in the public cloud due to the black-box nature of the VMs. From this perspective, this paper proposes a novel machine learning-based performance prediction approach for applications running in the cloud. To achieve high-accuracy predictions for black-box VMs, the proposed method first identifies the running application inside the virtual machine. It then selects highly correlated runtime metrics as the input of the machine learning approach to accurately predict the performance level of the cloud application. Experimental results with state-of-the-art cloud benchmarks demonstrate that our proposed method outperforms existing prediction methods by more than 2× in terms of the worst prediction error. In addition, we successfully tackle the challenge of performance prediction for applications with variable workloads by introducing the performance degradation index, which other comparison methods fail to consider. The workflow versatility of the proposed approach has been verified with different modern servers and VM configurations.","PeriodicalId":13268,"journal":{"name":"IEEE Transactions on Sustainable Computing","volume":"9 4","pages":"661-676"},"PeriodicalIF":3.0,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141965766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-01-26 DOI: 10.1109/TSUSC.2024.3358915
Aman Mishra;Yash Garg;Om Jee Pandey;Mahendra K. Shukla;Athanasios V. Vasilakos;Rajesh M. Hegde
At present, centralized learning models used for IoT applications that generate large amounts of data face several challenges, including bandwidth scarcity, high energy consumption, heavy use of computing resources, poor connectivity, high computational complexity, reduced privacy, and large data-transfer latency. To address these challenges, Blockchain-Enabled Federated Learning Networks (BFLNs) have recently emerged, which exchange only trained model parameters rather than raw data. BFLNs provide enhanced security along with improved energy efficiency and Quality-of-Service (QoS). However, BFLNs suffer from an exponentially growing action space when deciding the various parameter levels for training and block generation. Motivated by these challenges, this work proposes an actor-critic Reinforcement Learning (RL) method to model the Machine Learning Model Owner (MLMO) in selecting the optimal set of parameter levels, thereby addressing the exponential growth of the action space in BFLNs. Furthermore, owing to implicit entropy-driven exploration, the actor-critic RL method balances the exploration-exploitation trade-off and performs better than most off-policy methods on large discrete action spaces. Considering the mobility of the devices, the MLMO decides the data and energy levels that mobile devices use for training and determines the block generation rate. This minimizes system latency and reduces overall cost while achieving the target accuracy. Specifically, Proximal Policy Optimization (PPO) is used as an on-policy actor-critic method with two variants, one based on Monte Carlo (MC) returns and another based on the Generalized Advantage Estimate (GAE). The analysis shows that PPO achieves better exploration and sample efficiency, shorter training time, and consistently higher cumulative rewards than the off-policy Deep Q-Network (DQN).
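Since the abstract centers on PPO's clipped-surrogate update over a large discrete action space (data level, energy level, block-generation rate), a minimal PyTorch sketch of that update is shown below. The state dimension, the 5x5x4 action factorization, network sizes, and all hyper-parameters are assumptions for illustration, not the paper's exact setup.

```python
# Minimal PPO clipped-surrogate update for a discrete action space, in PyTorch.
# The action factorization (data levels x energy levels x block-rate levels),
# network sizes, and hyper-parameters are illustrative assumptions.

import torch
import torch.nn as nn

STATE_DIM = 8
N_ACTIONS = 5 * 5 * 4          # data levels x energy levels x block-rate levels
CLIP_EPS, LR = 0.2, 3e-4

class ActorCritic(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.Tanh())
        self.pi = nn.Linear(64, N_ACTIONS)   # policy logits (actor)
        self.v = nn.Linear(64, 1)            # state value (critic)

    def forward(self, s):
        h = self.body(s)
        return self.pi(h), self.v(h).squeeze(-1)

def ppo_update(net, opt, states, actions, old_logp, advantages, returns):
    """One clipped-surrogate PPO step on a batch collected by the MLMO agent."""
    logits, values = net(states)
    dist = torch.distributions.Categorical(logits=logits)
    ratio = torch.exp(dist.log_prob(actions) - old_logp)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - CLIP_EPS, 1 + CLIP_EPS) * advantages
    policy_loss = -torch.min(unclipped, clipped).mean()
    value_loss = (returns - values).pow(2).mean()
    entropy = dist.entropy().mean()          # implicit exploration bonus
    loss = policy_loss + 0.5 * value_loss - 0.01 * entropy
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

if __name__ == "__main__":
    net = ActorCritic()
    opt = torch.optim.Adam(net.parameters(), lr=LR)
    # Fake rollout batch (advantages would come from MC returns or GAE).
    s = torch.randn(32, STATE_DIM)
    a = torch.randint(0, N_ACTIONS, (32,))
    with torch.no_grad():
        logits, v = net(s)
        old_logp = torch.distributions.Categorical(logits=logits).log_prob(a)
        ret = torch.randn(32)
        adv = ret - v
    print("loss:", ppo_update(net, opt, s, a, old_logp, adv, ret))
```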
{"title":"A Novel Resource Management Framework for Blockchain-Based Federated Learning in IoT Networks","authors":"Aman Mishra;Yash Garg;Om Jee Pandey;Mahendra K. Shukla;Athanasios V. Vasilakos;Rajesh M. Hegde","doi":"10.1109/TSUSC.2024.3358915","DOIUrl":"https://doi.org/10.1109/TSUSC.2024.3358915","url":null,"abstract":"At present, the centralized learning models, used for IoT applications generating large amount of data, face several challenges such as bandwidth scarcity, more energy consumption, increased uses of computing resources, poor connectivity, high computational complexity, reduced privacy, and large latency towards data transfer. In order to address the aforementioned challenges, Blockchain-Enabled Federated Learning Networks (BFLNs) emerged recently, which deal with trained model parameters only, rather than raw data. BFLNs provide enhanced security along with improved energy-efficiency and Quality-of-Service (QoS). However, BFLNs suffer with the challenges of exponential increased action space in deciding various parameter levels towards training and block generation. Motivated by aforementioned challenges of BFLNs, in this work, we are proposing an actor-critic Reinforcement Learning (RL) method to model the Machine Learning Model Owner (MLMO) in selecting the optimal set of parameter levels, addressing the challenges of exponential grow of action space in BFLNs. Further, due to the implicit entropy exploration, actor-critic RL method balances the exploration-exploitation trade-off and shows better performance than most off-policy methods, on large discrete action spaces. Therefore, in this work, considering the mobile scenario of the devices, MLMO decides the data and energy levels that the mobile devices use for the training and determine the block generation rate. This leads to minimized system latency and reduced overall cost, while achieving the target accuracy. Specifically, we have used Proximal Policy Optimization (PPO) as an on-policy actor-critic method with it's two variants, one based on Monte Carlo (MC) returns and another based on Generalized Advantage Estimate (GAE). We analyzed that PPO has better exploration and sample efficiency, lesser training time, and consistently higher cumulative rewards, when compared to off-policy Deep Q-Network (DQN).","PeriodicalId":13268,"journal":{"name":"IEEE Transactions on Sustainable Computing","volume":"9 4","pages":"648-660"},"PeriodicalIF":3.0,"publicationDate":"2024-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141965830","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-01-09 DOI: 10.1109/TSUSC.2024.3351684
Jie Li;Yuhui Deng;Zhifeng Fan;Zijie Zhong;Geyong Min
The explosion of large-scale data has increased the scale and capacity of storage clusters in data centers, leading to substantial power consumption. Cloud providers can effectively improve the energy efficiency of data centers by employing energy-aware data placement techniques, where the total power primarily comprises the storage cluster's power and the cooling power. Traditional data placement approaches do not reduce the overall power consumption of the data center because of the heat recirculation effect between storage nodes. To fill this gap, we build an elaborate thermal-aware data center model. We then propose two energy-efficient thermal-aware data placement strategies, ETDP-I and ETDP-II, to reduce the overall power consumption of the data center. The principle of the proposed algorithms is to use a greedy algorithm to compute the disk sequence that minimizes the total power of the data center and then place the data onto that disk sequence. We implement both strategies in a cloud computing simulation platform based on CloudSim. Experimental results show that ETDP-I and ETDP-II outperform MinTin-G and MinTout-G in terms of the CRAC supply temperature, storage node power, cooling cost, and total power consumption of the data center. In particular, the ETDP-I and ETDP-II algorithms can save about 9.46%
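To make the greedy principle concrete, the sketch below places data blocks one by one onto the storage node that minimizes the modeled total power (node power plus cooling power), where recirculated heat raises node inlet temperatures through a matrix D and cooling is charged via a quadratic coefficient-of-performance model. The matrix values, power figures, redline temperature, and CoP coefficients are illustrative assumptions rather than the paper's calibrated data center model.

```python
# Hedged sketch of greedy thermal-aware data placement (illustrative numbers only).
# Each placed block adds power on its node; recirculated heat raises node inlet
# temperatures via matrix D, which forces a colder CRAC supply and higher cooling cost.

import numpy as np

N_NODES = 4
T_RED = 30.0                      # redline inlet temperature (deg C), assumed
IDLE_W, BLOCK_W = 100.0, 20.0     # idle node power and per-block power, assumed
# Assumed heat-recirculation coefficients: inlet temperature rise per watt dissipated.
D = np.full((N_NODES, N_NODES), 0.002) + np.diag([0.01, 0.012, 0.008, 0.015])

def cop(t_sup):
    """Assumed CRAC coefficient-of-performance model (improves with warmer supply air)."""
    return 0.0068 * t_sup ** 2 + 0.0008 * t_sup + 0.458

def total_power(node_power):
    """IT power plus cooling power for a given per-node power vector."""
    t_sup = T_RED - (D @ node_power).max()   # hottest inlet pinned at the redline
    return node_power.sum() * (1.0 + 1.0 / cop(t_sup)), t_sup

def greedy_place(n_blocks):
    node_power = np.full(N_NODES, IDLE_W)
    placement = []
    for _ in range(n_blocks):
        candidates = []
        for n in range(N_NODES):
            trial = node_power.copy()
            trial[n] += BLOCK_W
            candidates.append((total_power(trial)[0], n))
        _, best = min(candidates)            # node whose load raises total power least
        node_power[best] += BLOCK_W
        placement.append(best)
    return placement, node_power

if __name__ == "__main__":
    placement, node_power = greedy_place(12)
    power, t_sup = total_power(node_power)
    print("placement:", placement)
    print(f"total power: {power:.1f} W at supply temperature {t_sup:.1f} C")
```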