{"title":"学会调度:工业物联网中以数据新鲜度为导向的智能调度","authors":"Jianhua Tang;Fangfang Chen;Jiaping Li;Zilong Liu","doi":"10.1109/TCCN.2024.3445342","DOIUrl":null,"url":null,"abstract":"In the context of the Industrial Internet of Things (IIoT), developing an accurate and timely scheduling policy is essential. Recently, the Age of Incorrect Information (AoII) is proposed for measuring the timeliness and accuracy of certain status information for monitoring/controlling purposes. In this work, we investigate a multi-sensor state updating system in which AoII is used for quantifying information freshness. We aim to find an optimal scheduling policy to minimize the system-wide cost under bandwidth constraint. We first model the source status updates monitored by sensors as Markov chains and the scheduling problem as a constrained Markov decision process (CMDP). It is challenging to solve the formulated CMDP problem by conventional methods, due to the heterogeneity of source status updates in IIoT and the bandwidth constraint. As such, a framework with the aid of deep reinforcement learning, i.e., Order-Preserving Quantization-Based Constrained Reinforcement Learning Algorithm with Historical Adjustment (OPQ-RL_HA) is developed. Furthermore, by integrating it with the Asynchronous Advantage Actor-Critic (A3C) and the Deep Deterministic Policy Gradient (DDPG), two different algorithms are proposed, i.e., OPQ-A3C_HA and OPQ-DDPG_HA. 
With extensive numerical validation, it is demonstrated that the proposed algorithm has a lower average system-wide cost compared to the benchmark algorithms.","PeriodicalId":13069,"journal":{"name":"IEEE Transactions on Cognitive Communications and Networking","volume":"11 1","pages":"505-518"},"PeriodicalIF":7.0000,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learn to Schedule: Data Freshness-Oriented Intelligent Scheduling in Industrial IoT\",\"authors\":\"Jianhua Tang;Fangfang Chen;Jiaping Li;Zilong Liu\",\"doi\":\"10.1109/TCCN.2024.3445342\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In the context of the Industrial Internet of Things (IIoT), developing an accurate and timely scheduling policy is essential. Recently, the Age of Incorrect Information (AoII) is proposed for measuring the timeliness and accuracy of certain status information for monitoring/controlling purposes. In this work, we investigate a multi-sensor state updating system in which AoII is used for quantifying information freshness. We aim to find an optimal scheduling policy to minimize the system-wide cost under bandwidth constraint. We first model the source status updates monitored by sensors as Markov chains and the scheduling problem as a constrained Markov decision process (CMDP). It is challenging to solve the formulated CMDP problem by conventional methods, due to the heterogeneity of source status updates in IIoT and the bandwidth constraint. As such, a framework with the aid of deep reinforcement learning, i.e., Order-Preserving Quantization-Based Constrained Reinforcement Learning Algorithm with Historical Adjustment (OPQ-RL_HA) is developed. Furthermore, by integrating it with the Asynchronous Advantage Actor-Critic (A3C) and the Deep Deterministic Policy Gradient (DDPG), two different algorithms are proposed, i.e., OPQ-A3C_HA and OPQ-DDPG_HA. 
With extensive numerical validation, it is demonstrated that the proposed algorithm has a lower average system-wide cost compared to the benchmark algorithms.\",\"PeriodicalId\":13069,\"journal\":{\"name\":\"IEEE Transactions on Cognitive Communications and Networking\",\"volume\":\"11 1\",\"pages\":\"505-518\"},\"PeriodicalIF\":7.0000,\"publicationDate\":\"2024-08-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cognitive Communications and Networking\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10638762/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"TELECOMMUNICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive Communications and Networking","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10638762/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"TELECOMMUNICATIONS","Score":null,"Total":0}
Learn to Schedule: Data Freshness-Oriented Intelligent Scheduling in Industrial IoT
Abstract: In the context of the Industrial Internet of Things (IIoT), developing an accurate and timely scheduling policy is essential. Recently, the Age of Incorrect Information (AoII) has been proposed to measure both the timeliness and the accuracy of status information used for monitoring and control. In this work, we investigate a multi-sensor state updating system in which AoII quantifies information freshness. We aim to find an optimal scheduling policy that minimizes the system-wide cost under a bandwidth constraint. We first model the source status updates monitored by the sensors as Markov chains and formulate the scheduling problem as a constrained Markov decision process (CMDP). The formulated CMDP is challenging to solve with conventional methods, owing to the heterogeneity of source status updates in IIoT and the bandwidth constraint. We therefore develop a deep reinforcement learning-aided framework, the Order-Preserving Quantization-Based Constrained Reinforcement Learning Algorithm with Historical Adjustment (OPQ-RL_HA). Furthermore, by integrating this framework with the Asynchronous Advantage Actor-Critic (A3C) and the Deep Deterministic Policy Gradient (DDPG) methods, we propose two algorithms, OPQ-A3C_HA and OPQ-DDPG_HA. Extensive numerical validation demonstrates that the proposed algorithms achieve a lower average system-wide cost than the benchmark algorithms.
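To make the AoII metric in the abstract concrete, the following is a minimal illustrative sketch (not the paper's algorithm): a single two-state symmetric Markov source whose monitor estimate is refreshed only when the sensor happens to be scheduled, here modeled as an independent Bernoulli event. The flip probability `p_flip` and scheduling probability `p_sample` are hypothetical parameters chosen for illustration; AoII grows by one per slot while the monitor's estimate is wrong and resets to zero when it matches the source again.

```python
import random

def simulate_aoii(p_flip=0.1, p_sample=0.3, steps=10_000, seed=0):
    """Time-average AoII for one two-state symmetric Markov source.

    Hypothetical illustration of the AoII metric only; the paper's
    setting involves multiple heterogeneous sources, a shared
    bandwidth constraint, and a learned scheduling policy.
    """
    rng = random.Random(seed)
    source, estimate = 0, 0   # true source state and monitor's estimate
    aoii, total = 0, 0
    for _ in range(steps):
        if rng.random() < p_flip:     # source state transition
            source ^= 1
        if rng.random() < p_sample:   # sensor scheduled: estimate synced
            estimate = source
        # AoII: penalize every slot in which the estimate is incorrect
        aoii = 0 if estimate == source else aoii + 1
        total += aoii
    return total / steps
```

Sweeping `p_sample` shows the trade-off the paper's scheduler navigates: more frequent updates reduce average AoII but consume more of the shared bandwidth.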
Journal Introduction:
The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The transactions welcome submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics covered include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Additionally, research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks are of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.