Authors: Mohammed Assaouy, O. Zytoune, D. Aboutajdine
Venue: 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP)
Publication date: 2017-05-01
DOI: 10.1109/ATSIP.2017.8075557 (https://doi.org/10.1109/ATSIP.2017.8075557)
Policy iteration vs Q-Sarsa approach optimization for embedded system communications with energy harvesting
In this paper, we consider wireless point-to-point communication in the context of battery-powered embedded systems with energy harvesting equipment. The successive actions taken by the transmitter constitute the policy that it follows. In the first stage, we assume limited knowledge of the system behavior, characterized by its transition probability matrix, and use the policy iteration algorithm to find the optimal policy. In the second stage, we assume that this basic stochastic knowledge is not available at the transmitter, and use the Q-Sarsa algorithm to find optimal policies. The two approaches are simulated and then compared.
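
To make the two stages concrete, here is a minimal Python sketch of both approaches on a toy model. Everything in it is an assumption for illustration and not taken from the paper: the state is a battery level from 0 to B, an action is the number of energy units spent in a slot, one unit is harvested per slot with probability P_HARVEST, and the reward is a concave rate-like function of the energy spent. The abstract does not specify the exact "Q-Sarsa" variant, so the model-free part uses standard tabular SARSA with epsilon-greedy exploration as a stand-in.

```python
import numpy as np

B = 4                  # battery capacity in energy units (hypothetical)
ACTIONS = (0, 1, 2)    # energy units spent on transmission in one slot
P_HARVEST = 0.5        # probability of harvesting one unit per slot (assumed)
GAMMA = 0.9            # discount factor (assumed)

def reward(a):
    # Concave, rate-like reward for spending `a` units (illustrative choice).
    return np.log2(1.0 + a)

def feasible(s):
    # Actions that do not spend more energy than the battery holds.
    return [a for a in ACTIONS if a <= s]

def build_model():
    # P[a, s, t]: probability of moving from s to t under a; R[s, a]: reward.
    P = np.zeros((len(ACTIONS), B + 1, B + 1))
    R = np.full((B + 1, len(ACTIONS)), -np.inf)   # -inf marks infeasible pairs
    for s in range(B + 1):
        for a in feasible(s):
            R[s, a] = reward(a)
            P[a, s, min(s - a + 1, B)] += P_HARVEST   # one unit harvested
            P[a, s, s - a] += 1.0 - P_HARVEST         # nothing harvested
    return P, R

def policy_iteration(P, R, max_iter=100):
    # Stage 1: the transition matrix is known to the transmitter.
    n = B + 1
    policy = np.zeros(n, dtype=int)               # start with "always idle"
    for _ in range(max_iter):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, np.arange(n)]
        R_pi = R[np.arange(n), policy]
        V = np.linalg.solve(np.eye(n) - GAMMA * P_pi, R_pi)
        # Policy improvement: act greedily with respect to V.
        Q = R + GAMMA * np.einsum("ast,t->sa", P, V)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

def sarsa(episodes=2000, steps=50, alpha=0.1, eps=0.1, seed=0):
    # Stage 2: no model; learn Q from sampled transitions only.
    rng = np.random.default_rng(seed)
    Q = np.zeros((B + 1, len(ACTIONS)))

    def choose(s):
        # Epsilon-greedy choice restricted to feasible actions.
        acts = feasible(s)
        if rng.random() < eps:
            return int(rng.choice(acts))
        return max(acts, key=lambda a: Q[s, a])

    def step(s, a):
        # Simulated environment; the learner never reads P directly.
        harvested = int(rng.random() < P_HARVEST)
        return min(s - a + harvested, B), reward(a)

    for _ in range(episodes):
        s = int(rng.integers(0, B + 1))
        a = choose(s)
        for _ in range(steps):
            s2, r = step(s, a)
            a2 = choose(s2)
            # On-policy update: bootstrap on the action actually taken next.
            Q[s, a] += alpha * (r + GAMMA * Q[s2, a2] - Q[s, a])
            s, a = s2, a2
    greedy = np.array([max(feasible(s), key=lambda a: Q[s, a])
                       for s in range(B + 1)])
    return greedy, Q

P, R = build_model()
pi_policy, V = policy_iteration(P, R)
sarsa_policy, Q = sarsa()
print("policy iteration:", pi_policy)
print("SARSA (greedy):  ", sarsa_policy)
```

Each printed array gives, per battery level, the number of energy units to spend. In this sketch policy iteration evaluates each policy exactly by solving a linear system, which is only possible because the transition matrix is known, while SARSA has to estimate action values from sampled transitions and may need many episodes before its greedy policy settles.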