Authors: Hanieh Malekijou, Vesal Hakami
Published in: 2021 7th International Conference on Signal Processing and Intelligent Systems (ICSPIS)
Publication date: 2021-12-29
DOI: https://doi.org/10.1109/ICSPIS54653.2021.9729333
Model-Free Learning Algorithms for Dynamic Transmission Control in IoT Equipment
We consider an energy-harvesting IoT device that transmits delay- and jitter-sensitive data over a wireless fading channel. Because the harvested energy is limited, our goal is to compute optimal transmission control policies that decide, at each discrete timeslot, how many packets to transmit from the head of the buffer, such that a long-run criterion involving the average delay/jitter is either minimized or never exceeds a pre-specified threshold. We use a suite of Q-learning-based techniques from reinforcement learning theory to optimize the transmission policy in a model-free fashion. Compared to prior work, our novelty lies in a model-free learning algorithm that enables jitter-aware transmissions by penalizing control decisions with the variance of the delay cost function. Extensive numerical results are presented for performance evaluation.
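The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea: tabular Q-learning that chooses how many packets to send per timeslot, with the per-slot delay cost augmented by a variance-style penalty. Everything concrete here is an assumption made for illustration, not the authors' model: the toy state (buffer, battery, channel), the Bernoulli arrival and harvesting processes, the use of backlog as a delay proxy, the discounted objective, and the names `feasible_actions`, `step`, `train`, and `beta`.

```python
import random
from collections import defaultdict

# Toy parameters (all assumed): buffer cap, battery cap, max packets per slot.
B, E, A = 5, 5, 2
# Assumed energy per packet: a bad channel (0) costs more than a good one (1).
ENERGY_PER_PACKET = {0: 2, 1: 1}

def feasible_actions(state):
    """Packet counts allowed by the current backlog and stored energy."""
    buf, bat, ch = state
    per_pkt = ENERGY_PER_PACKET[ch]
    return [a for a in range(A + 1) if a <= buf and a * per_pkt <= bat]

def step(state, action, rng):
    """Simulate one timeslot; the learner only sees samples, not this model."""
    buf, bat, ch = state
    buf_after = buf - action
    bat_after = bat - action * ENERGY_PER_PACKET[ch]
    delay_cost = buf_after               # remaining backlog as a delay proxy
    arrivals = rng.choice([0, 1])        # assumed Bernoulli packet arrivals
    harvest = rng.choice([0, 1])         # assumed Bernoulli energy harvesting
    next_state = (min(B, buf_after + arrivals),
                  min(E, bat_after + harvest),
                  rng.choice([0, 1]))    # assumed i.i.d. two-state channel
    return next_state, delay_cost

def train(steps=20000, alpha=0.1, gamma=0.95, eps=0.1, beta=0.5, seed=0):
    """Epsilon-greedy Q-learning on a variance-penalized delay cost."""
    rng = random.Random(seed)
    q = defaultdict(float)               # Q-table over (state, action) pairs
    mean_delay = 0.0                     # running estimate of average delay cost
    state = (0, E, 1)
    for t in range(1, steps + 1):
        actions = feasible_actions(state)
        if rng.random() < eps:
            action = rng.choice(actions)
        else:
            action = min(actions, key=lambda a: q[(state, a)])
        next_state, delay = step(state, action, rng)
        mean_delay += (delay - mean_delay) / t
        # Penalize squared deviation from the running mean (jitter proxy).
        cost = delay + beta * (delay - mean_delay) ** 2
        best_next = min(q[(next_state, a)] for a in feasible_actions(next_state))
        q[(state, action)] += alpha * (cost + gamma * best_next - q[(state, action)])
        state = next_state
    return q, mean_delay
```

Setting `beta=0` reduces the sketch to plain delay-minimizing Q-learning, so comparing the learned policies for `beta=0` and `beta>0` is one simple way to see how a variance penalty changes transmission decisions in this toy setting.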