Authors: Xin-lian Zhou, X. Wen, Luhan Wang, Zhaoming Lu, Wanqing Guan
Published in: 2019 IEEE/CIC International Conference on Communications Workshops in China (ICCC Workshops), 2019-08-01
DOI: https://doi.org/10.1109/ICCChinaW.2019.8849933
SMDP-Based Resource Allocation for Slice Requests with Long-term Reward Maximization
Network slicing has emerged as a key technology for supporting the coexistence of multiple services in 5G/B5G networks. However, because resources are scarce and slice requests are diverse, allocating resources efficiently so as to maximize the long-term reward of the infrastructure network is challenging. In this paper, we model the resource allocation problem as a semi-Markov decision process (SMDP), defined by its state space, action space, reward, and transition probability distribution. The reward function jointly considers the total income, the cost of available resources, and the utilization of the total resources of the infrastructure network. Rather than focusing on the reward of a single-step decision, we apply the Bellman equation to obtain the accumulated long-term reward. We then use the value iteration algorithm to determine, for each state, the resource allocation scheme that maximizes the long-term reward. Extensive simulation results show that the proposed SMDP-based scheme outperforms existing heuristic methods.
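To make the value iteration step concrete, the following is a minimal sketch of an SMDP admission problem solved with the Bellman equation. It is not the model from the paper. Every constant (capacity, demands, incomes, rates, discount rate) and every function name is a hypothetical choice for illustration. The sketch assumes Poisson arrivals and exponential holding times, so the expected discount over a sojourn with total event rate sigma is sigma / (alpha + sigma). Its reward is a simplified stand-in (lump-sum income minus a discounted holding cost) and omits the utilization term mentioned in the abstract.

```python
from itertools import product

# All constants below are illustrative assumptions, not values from the paper.
CAPACITY = 4            # total resource units of the infrastructure network
DEMAND = (1, 2)         # units required by each slice type
INCOME = (1.0, 3.0)     # lump-sum income when a request of each type is accepted
ARRIVAL = (1.0, 0.5)    # Poisson arrival rate of each slice type
SERVICE = (0.5, 0.25)   # departure rate of a single running slice of each type
COST_RATE = 0.1         # cost per occupied unit per unit time
ALPHA = 0.1             # continuous-time discount rate


def used(counts):
    """Resource units occupied by the running slices."""
    return sum(n * d for n, d in zip(counts, DEMAND))


def build_states():
    """States are (counts, event); event is an arriving type or None (departure)."""
    ranges = [range(CAPACITY // d + 1) for d in DEMAND]
    feasible = [c for c in product(*ranges) if used(c) <= CAPACITY]
    events = list(range(len(DEMAND))) + [None]
    return [(c, e) for c in feasible for e in events]


def actions(state):
    """0 = reject/continue, 1 = accept (only if the request fits)."""
    counts, event = state
    if event is not None and used(counts) + DEMAND[event] <= CAPACITY:
        return (0, 1)
    return (0,)


def step(state, action):
    """Return (expected reward, discount factor, [(prob, next_state), ...])."""
    counts, event = state
    after = list(counts)
    reward = 0.0
    if action == 1:
        after[event] += 1
        reward += INCOME[event]
    after = tuple(after)
    # Total rate of the next event: any arrival or any departure.
    sigma = sum(ARRIVAL) + sum(n * mu for n, mu in zip(after, SERVICE))
    # Expected discounted holding cost until the next event.
    reward -= COST_RATE * used(after) / (ALPHA + sigma)
    nxt = [(lam / sigma, (after, k)) for k, lam in enumerate(ARRIVAL)]
    for k, (n, mu) in enumerate(zip(after, SERVICE)):
        if n > 0:
            left = tuple(c - (i == k) for i, c in enumerate(after))
            nxt.append((n * mu / sigma, (left, None)))
    return reward, sigma / (ALPHA + sigma), nxt


def q_value(state, action, value):
    """Right-hand side of the Bellman equation for one state-action pair."""
    reward, discount, nxt = step(state, action)
    return reward + discount * sum(p * value[s2] for p, s2 in nxt)


def value_iteration(tol=1e-9, max_iter=100_000):
    """Iterate the Bellman optimality operator until the update is below tol."""
    states = build_states()
    value = {s: 0.0 for s in states}
    for _ in range(max_iter):
        new = {s: max(q_value(s, a, value) for a in actions(s)) for s in states}
        delta = max(abs(new[s] - value[s]) for s in states)
        value = new
        if delta < tol:
            break
    policy = {s: max(actions(s), key=lambda a: q_value(s, a, value)) for s in states}
    return value, policy


if __name__ == "__main__":
    value, policy = value_iteration()
    for state in sorted(policy, key=str):
        print(state, "accept" if policy[state] == 1 else "reject/continue")
```

The resulting policy maps each state (running slices plus the current event) to accept or reject, which is the state-dependent allocation rule the abstract describes. A heuristic such as "always accept when the request fits" could be evaluated on the same toy model for comparison by fixing the action in `q_value` instead of maximizing.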