Edge Caching for IoT Transient Data Using Deep Reinforcement Learning
Shuran Sheng, Peng Chen, Zhimin Chen, Lenan Wu, Hao Jiang
IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society, pp. 4477-4482, 2020-10-18
https://doi.org/10.1109/IECON43393.2020.9255111
Connected devices generate large amounts of data for IoT applications. Assisted by edge computing, caching IoT data at edge nodes is considered a promising technique because it reduces network traffic and the service delay of the cloud platform. However, IoT data has a transient lifetime, and the cache capacity of edge nodes is limited. As a consequence, a caching policy should consider both data transiency and the storage capacity of edge nodes. Inspired by the success of deep reinforcement learning (DRL) in dealing with Markov Decision Process (MDP) problems in unknown environments, this paper proposes a DRL-based algorithm for the edge caching problem. The proposed Advantage Actor Critic (A2C)-based algorithm aims to maximize long-term energy saving without knowledge of the IoT data popularity profiles. Simulation results demonstrate that the proposed DRL-based algorithm achieves higher energy saving and a higher cache hit ratio than the baseline algorithms.
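To make the two constraints named in the abstract concrete (transient data lifetime and limited edge storage), the following is a minimal sketch of a capacity-bounded cache whose entries expire. It is not the paper's A2C algorithm or system model: the class names `CachedItem` and `TransientCache`, the integer time steps, and the rule that an item is fresh while its age is strictly less than its lifetime are all assumptions made for illustration. In a DRL setting, a learned policy would decide when to call something like `admit`.

```python
from dataclasses import dataclass, field


@dataclass
class CachedItem:
    generated_at: int  # time step at which the reading was produced (assumed unit)
    lifetime: int      # number of time steps the reading stays valid (assumed unit)


@dataclass
class TransientCache:
    """Hypothetical edge cache with limited capacity and expiring entries."""
    capacity: int
    items: dict = field(default_factory=dict)  # item id -> CachedItem

    def is_fresh(self, item_id, now):
        # An item is fresh while its age is strictly below its lifetime.
        item = self.items.get(item_id)
        return item is not None and now - item.generated_at < item.lifetime

    def evict_expired(self, now):
        # Drop every entry whose lifetime has run out.
        expired = [k for k in self.items if not self.is_fresh(k, now)]
        for k in expired:
            del self.items[k]

    def request(self, item_id, now):
        # A hit requires the item to be cached and still fresh.
        self.evict_expired(now)
        return item_id in self.items

    def admit(self, item_id, now, lifetime):
        # Cache the item if there is room; refuse when the cache is full.
        self.evict_expired(now)
        if item_id not in self.items and len(self.items) >= self.capacity:
            return False
        self.items[item_id] = CachedItem(now, lifetime)
        return True


cache = TransientCache(capacity=1)
admitted = cache.admit("temperature", now=0, lifetime=3)
hit_while_fresh = cache.request("temperature", now=2)
hit_after_expiry = cache.request("temperature", now=3)
```

With a capacity of one, a second item is refused until the first has expired, which is the trade-off between transiency and storage that a caching policy has to weigh.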