Research on Delay DRL in Energy-Constrained CR-NOMA Networks based on Multi-Threads Markov Reward Process
Authors: Qiuping Jiang, Chenyu Zhang, Wei Zheng, X. Wen
Published in: 2022 IEEE 8th International Conference on Computer and Communications (ICCC), 2022-12-09
DOI: https://doi.org/10.1109/ICCC56324.2022.10065916
Applying deep reinforcement learning (DRL) to wireless networks is an active research topic in non-orthogonal multiple access (NOMA). Most existing work focuses on algorithm design and overlooks a practical problem that arises when these algorithms are deployed in real networks: computing delay, which can degrade performance. In this paper, we study deep deterministic policy gradient (DDPG) applied to energy-constrained CR-NOMA networks with delays and propose a multi-thread scheme that helps the main agent select actions in time. We first describe the workflow of the proposed scheme, and then restore the Markov property, which is impaired by the introduction of subthreads, by augmenting the state space of the Markov reward process (MRP). Test results show that the proposed scheme can significantly improve the performance of DDPG in CR-NOMA networks with delays.
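The abstract does not spell out how the state space is augmented. One common way to recover the Markov property under a fixed action delay is to append the actions that have been chosen but not yet applied to the current observation. The sketch below illustrates only that generic idea; the class name `DelayedActionState`, the fixed delay `delay_steps`, and the scalar actions are assumptions made for illustration and are not taken from the paper.

```python
from collections import deque
from typing import Deque, Tuple


class DelayedActionState:
    """Hypothetical helper: tracks actions that are chosen but not yet applied.

    Assumes a fixed delay of `delay_steps` decision slots and scalar actions.
    """

    def __init__(self, delay_steps: int, default_action: float = 0.0) -> None:
        if delay_steps < 1:
            raise ValueError("delay_steps must be at least 1")
        # Before any action has been computed, the queue holds default actions.
        self.pending: Deque[float] = deque([default_action] * delay_steps)

    def step(self, new_action: float) -> float:
        """Queue a freshly computed action and return the one taking effect now."""
        applied = self.pending.popleft()
        self.pending.append(new_action)
        return applied

    def augment(self, observation: Tuple[float, ...]) -> Tuple[float, ...]:
        """Return the observation extended with all pending actions.

        The next state then depends only on this augmented tuple and the newly
        chosen action, which is what the Markov property requires.
        """
        return tuple(observation) + tuple(self.pending)


if __name__ == "__main__":
    buffer = DelayedActionState(delay_steps=2)
    print(buffer.augment((1.0, 2.0)))  # observation plus two default actions
    applied = buffer.step(0.5)         # the action chosen now applies two slots later
    print(applied, buffer.augment((1.0, 2.0)))
```

An agent such as DDPG would then be trained on the augmented tuple instead of the raw observation, at the cost of a state dimension that grows with the delay.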