{"title":"Deep Reinforcement Learning Method for Task Offloading in Mobile Edge Computing Networks Based on Parallel Exploration with Asynchronous Training","authors":"Junyan Chen, Lei Jin, Rui Yao, Hongmei Zhang","doi":"10.1007/s11036-024-02397-7","DOIUrl":null,"url":null,"abstract":"<p>In mobile edge computing (MEC), randomly offloading tasks to edge servers (ES) can cause wireless devices (WD) to compete for limited bandwidth resources, leading to overall performance degradation. Reinforcement learning can provide suitable strategies for task offloading and resource allocation through exploration and trial-and-error, helping to avoid blind offloading. However, traditional reinforcement learning algorithms suffer from slow convergence and a tendency to get stuck in suboptimal local minima, significantly impacting the energy consumption and data timeliness of edge computing task unloading. To address these issues, we propose Parallel Exploration with Asynchronous Training-based Deep Reinforcement Learning (PEATDRL) algorithm for MEC network offloading decisions. Its objective is to maximize system performance while limiting energy consumption in an MEC environment characterized by time-varying wireless channels and random user task arrivals. Firstly, our model employs two independent DNNs for parallel exploration, each generating different offloading strategies. This parallel exploration enhances environmental adaptability, avoids the limitations of a single DNN, and addresses the issue of agents getting stuck in suboptimal local minima due to the explosion of decision combinations, thereby improving decision performance. Secondly, we set different learning rates for the two DNNs during the training phase and trained them at various intervals. This asynchronous training strategy increases the randomness of decision exploration, prevents the two DNNs from converging to the same suboptimal local solution, and improves convergence efficiency by enhancing sample utilization. Finally, we examine the impact of different parallel levels and training step differences on system performance metrics and explain the parameter choices. Experimental results show that the proposed method provides a viable solution to the performance issues caused by slow convergence and local minima, with PEATDRL improving task queue convergence speed by more than 20% compared to baseline algorithms.</p>","PeriodicalId":501103,"journal":{"name":"Mobile Networks and Applications","volume":"1837 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mobile Networks and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s11036-024-02397-7","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
In mobile edge computing (MEC), randomly offloading tasks to edge servers (ES) can cause wireless devices (WD) to compete for limited bandwidth resources, leading to overall performance degradation. Reinforcement learning can provide suitable strategies for task offloading and resource allocation through exploration and trial-and-error, helping to avoid blind offloading. However, traditional reinforcement learning algorithms suffer from slow convergence and a tendency to get stuck in suboptimal local minima, which significantly impacts the energy consumption and data timeliness of edge computing task offloading. To address these issues, we propose the Parallel Exploration with Asynchronous Training-based Deep Reinforcement Learning (PEATDRL) algorithm for MEC network offloading decisions. Its objective is to maximize system performance while limiting energy consumption in an MEC environment characterized by time-varying wireless channels and random user task arrivals. First, our model employs two independent DNNs for parallel exploration, each generating different offloading strategies. This parallel exploration enhances environmental adaptability, avoids the limitations of a single DNN, and addresses the issue of agents getting stuck in suboptimal local minima due to the explosion of decision combinations, thereby improving decision performance. Second, we set different learning rates for the two DNNs during the training phase and update them at different intervals. This asynchronous training strategy increases the randomness of decision exploration, prevents the two DNNs from converging to the same suboptimal local solution, and improves convergence efficiency by enhancing sample utilization. Finally, we examine the impact of different degrees of parallelism and training-interval offsets on system performance metrics and explain the parameter choices. Experimental results show that the proposed method provides a viable solution to the performance issues caused by slow convergence and local minima, with PEATDRL improving task queue convergence speed by more than 20% compared to baseline algorithms.
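To make the two key ingredients concrete, the sketch below illustrates, in PyTorch, how two independent DNNs can explore offloading decisions in parallel while being trained asynchronously with different learning rates and update intervals. This is not the authors' implementation: the network sizes, learning rates, update intervals, number of wireless devices, and the toy utility function are all illustrative assumptions standing in for the paper's actual system model.

```python
# Minimal sketch (illustrative assumptions, not the authors' implementation):
# two policy DNNs propose binary offloading decisions in parallel; the better
# decision is stored, and each DNN is trained asynchronously on its own
# schedule and learning rate.
import random
from collections import deque

import torch
import torch.nn as nn

N_WDS = 10  # assumed number of wireless devices (one offload bit each)

class OffloadNet(nn.Module):
    """Maps a channel-state vector to per-device offloading probabilities."""
    def __init__(self, n_wds=N_WDS, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_wds, hidden), nn.ReLU(),
            nn.Linear(hidden, n_wds), nn.Sigmoid(),
        )

    def forward(self, state):
        return self.net(state)

def evaluate(action, state):
    # Toy utility: favor offloading on strong channels, penalize energy use.
    # Stands in for the paper's performance/energy objective.
    return float((action * state).sum() - 0.1 * action.sum())

# Two DNNs with different learning rates (asynchronous-training ingredient 1).
nets = [OffloadNet(), OffloadNet()]
opts = [torch.optim.Adam(nets[0].parameters(), lr=1e-3),
        torch.optim.Adam(nets[1].parameters(), lr=5e-4)]
train_every = [10, 15]        # different update intervals (ingredient 2)
memory = deque(maxlen=1024)   # shared replay memory of (state, best action)

for t in range(1000):
    state = torch.rand(N_WDS)  # stand-in for time-varying channel gains

    # Parallel exploration: each DNN proposes a candidate offloading decision,
    # and the decision with the higher utility is kept as the training target.
    candidates = [(net(state) > 0.5).float() for net in nets]
    best = max(candidates, key=lambda a: evaluate(a, state))
    memory.append((state, best))

    # Asynchronous training: each DNN is updated on its own schedule.
    for i, (net, opt) in enumerate(zip(nets, opts)):
        if t % train_every[i] == 0 and len(memory) >= 32:
            batch = random.sample(list(memory), 32)
            s = torch.stack([b[0] for b in batch])
            a = torch.stack([b[1] for b in batch])
            loss = nn.functional.binary_cross_entropy(net(s), a)
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Because the two networks start from different random initializations, learn at different rates, and are updated at staggered steps, they tend to propose distinct candidate decisions, which is the intuition behind avoiding a shared suboptimal local solution.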