{"title":"利用强化学习实现布尔控制网络可达性的最小成本状态翻转控制","authors":"Jingjie Ni;Yang Tang;Fangfei Li","doi":"10.1109/TCYB.2024.3454253","DOIUrl":null,"url":null,"abstract":"This article proposes model-free reinforcement learning methods for minimum-cost state-flipped control in Boolean control networks (BCNs). We tackle two questions: 1) finding the flipping kernel, namely, the flip set with the smallest cardinality ensuring reachability and 2) deriving optimal policies to minimize the number of flipping actions for reachability based on the obtained flipping kernel. For Question 1), Q-learning’s capability in determining reachability is demonstrated. To expedite convergence, we incorporate two improvements: 1) demonstrating that previously reachable states remain reachable after adding elements to the flip set, followed by employing transfer learning and 2) initiating each episode with special initial states whose reachability to the target state set are currently unknown. For Question 2), it is challenging to encapsulate the objective of simultaneously reducing control costs and satisfying terminal constraints exclusively through the reward function employed in the Q-learning framework. To bridge the gap, we propose a BCN-characteristics-based reward scheme and prove its optimality. Questions 1) and 2) with large-scale BCNs are addressed by employing small memory Q-learning, which reduces memory usage by only recording visited action-values. An upper bound on memory usage is provided to assess the algorithm’s feasibility. To expedite convergence for Question 2) in large-scale BCNs, we introduce adaptive variable rewards based on the known maximum steps needed to reach the target state set without cycles. Finally, the effectiveness of the proposed methods is validated on both small- and large-scale BCNs.","PeriodicalId":13112,"journal":{"name":"IEEE Transactions on Cybernetics","volume":"54 11","pages":"7103-7115"},"PeriodicalIF":9.4000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Minimum-Cost State-Flipped Control for Reachability of Boolean Control Networks Using Reinforcement Learning\",\"authors\":\"Jingjie Ni;Yang Tang;Fangfei Li\",\"doi\":\"10.1109/TCYB.2024.3454253\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This article proposes model-free reinforcement learning methods for minimum-cost state-flipped control in Boolean control networks (BCNs). We tackle two questions: 1) finding the flipping kernel, namely, the flip set with the smallest cardinality ensuring reachability and 2) deriving optimal policies to minimize the number of flipping actions for reachability based on the obtained flipping kernel. For Question 1), Q-learning’s capability in determining reachability is demonstrated. To expedite convergence, we incorporate two improvements: 1) demonstrating that previously reachable states remain reachable after adding elements to the flip set, followed by employing transfer learning and 2) initiating each episode with special initial states whose reachability to the target state set are currently unknown. For Question 2), it is challenging to encapsulate the objective of simultaneously reducing control costs and satisfying terminal constraints exclusively through the reward function employed in the Q-learning framework. To bridge the gap, we propose a BCN-characteristics-based reward scheme and prove its optimality. 
Questions 1) and 2) with large-scale BCNs are addressed by employing small memory Q-learning, which reduces memory usage by only recording visited action-values. An upper bound on memory usage is provided to assess the algorithm’s feasibility. To expedite convergence for Question 2) in large-scale BCNs, we introduce adaptive variable rewards based on the known maximum steps needed to reach the target state set without cycles. Finally, the effectiveness of the proposed methods is validated on both small- and large-scale BCNs.\",\"PeriodicalId\":13112,\"journal\":{\"name\":\"IEEE Transactions on Cybernetics\",\"volume\":\"54 11\",\"pages\":\"7103-7115\"},\"PeriodicalIF\":9.4000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Cybernetics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10681440/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cybernetics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10681440/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Minimum-Cost State-Flipped Control for Reachability of Boolean Control Networks Using Reinforcement Learning
This article proposes model-free reinforcement learning methods for minimum-cost state-flipped control in Boolean control networks (BCNs). We tackle two questions: 1) finding the flipping kernel, namely, the flip set with the smallest cardinality ensuring reachability and 2) deriving optimal policies that minimize the number of flipping actions required for reachability, based on the obtained flipping kernel. For Question 1), we demonstrate Q-learning's capability to determine reachability. To expedite convergence, we incorporate two improvements: 1) proving that previously reachable states remain reachable after elements are added to the flip set, and then employing transfer learning and 2) initiating each episode from special initial states whose reachability to the target state set is currently unknown. For Question 2), it is challenging to encapsulate the objective of simultaneously reducing control costs and satisfying terminal constraints exclusively through the reward function employed in the Q-learning framework. To bridge the gap, we propose a BCN-characteristics-based reward scheme and prove its optimality. For large-scale BCNs, Questions 1) and 2) are addressed by employing small-memory Q-learning, which reduces memory usage by recording only visited action-values. An upper bound on memory usage is provided to assess the algorithm's feasibility. To expedite convergence for Question 2) in large-scale BCNs, we introduce adaptive variable rewards based on the known maximum number of steps needed to reach the target state set without cycles. Finally, the effectiveness of the proposed methods is validated on both small- and large-scale BCNs.
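The small-memory Q-learning idea described in the abstract (storing only the action-values that have actually been visited, rather than a full Q-table over all 2^n states) can be illustrated with a minimal sketch. The 3-node network, its update functions, the flip set, the target set, and the reward values below are hypothetical assumptions for illustration only, not the paper's actual setup or implementation.

```python
import random

# Minimal sketch of sparse ("small memory") Q-learning for a toy BCN.
# Only visited (state, action) pairs are stored in a plain dict, which is the
# memory-saving idea; all problem data below are illustrative assumptions.

N_NODES = 3
TARGET = {(1, 1, 1)}          # assumed target state set
ACTIONS = [(), (0,)]          # assumed flip actions: do nothing, or flip node 0
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2

def bcn_step(state, flips):
    """Hypothetical BCN dynamics: apply flips, then a fixed Boolean update."""
    s = list(state)
    for i in flips:
        s[i] ^= 1
    x1, x2, x3 = s
    return (x2 & x3, x1 | x3, x1)   # assumed node update functions

def reward(next_state, flips):
    # Penalize each flip; give a large terminal reward on reaching the target set.
    return (100.0 if next_state in TARGET else 0.0) - 1.0 * len(flips)

Q = {}  # sparse Q-table: only visited action-values are ever stored

for episode in range(2000):
    state = tuple(random.randint(0, 1) for _ in range(N_NODES))
    for _ in range(20):  # episode horizon
        if random.random() < EPS:
            action = random.choice(ACTIONS)                      # explore
        else:
            action = max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))  # exploit
        nxt = bcn_step(state, action)
        r = reward(nxt, action)
        best_next = max(Q.get((nxt, a), 0.0) for a in ACTIONS)
        old = Q.get((state, action), 0.0)
        Q[(state, action)] = old + ALPHA * (r + GAMMA * best_next - old)
        state = nxt
        if state in TARGET:
            break

print("stored action-values:", len(Q))  # bounded by the visited pairs, not 2^n * |ACTIONS|
```

In this sketch the dict grows only with the (state, action) pairs actually encountered, which is why its size, rather than the full state-action space, bounds memory usage; the paper's upper bound on memory plays an analogous role for its algorithm.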
Journal Introduction:
The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the Transactions welcomes papers on communication and control across machines, or between machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.