The intricate and unpredictable nature of underwater environments and disturbances necessitates the use of model predictive control (MPC) for the effective operation and inspection of remotely operated vehicles (ROVs). This paper presents an innovative suspension system for a hull‐cleaning robot that controls impedance while reducing the vibration of the ROV brushes in the presence of environmental disturbances and uncertainties. A model predictive controller that uses Laguerre functions yields a significant reduction in tracking time, and the efficiency of the proposed controller is demonstrated through successful impedance tracking in the Z‐direction and vibration reduction in the Z and Y directions of the robot in an uncertain environment with disturbances. A prototype robot is built, the controller performance is validated under real conditions, and the modal analysis theory output is validated against experimental data. The results highlight the effectiveness of the designed suspension system and the developed MPC for real‐world applications in which environmental conditions are unpredictable or subject to change and the robot must clean the surface thoroughly without scratching the hull.
{"title":"Innovative hull cleaning robot design and control by Laguerre base model predictive control for impedance and vibration management","authors":"Vahid Madanipour, Farid Najafi","doi":"10.1049/cth2.12716","DOIUrl":"https://doi.org/10.1049/cth2.12716","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"34 6","pages":""},"publicationDate":"2024-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141658789"}
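The hull‐cleaning abstract above relies on Laguerre functions to parameterize the MPC input trajectory. As a rough illustration only, the sketch below generates the standard discrete‐time Laguerre basis from its state‐space recursion and checks that the functions are orthonormal. The pole `a`, the number of terms, and the horizon are assumed values chosen for the example; the paper's own tuning and controller are not reproduced here.

```python
import math


def laguerre_basis(a, n_terms, horizon):
    """Return [L(0), ..., L(horizon-1)]; each L(k) holds n_terms Laguerre values."""
    beta = 1.0 - a * a
    # Initial vector L(0) = sqrt(beta) * [1, -a, a^2, -a^3, ...]
    state = [math.sqrt(beta) * (-a) ** i for i in range(n_terms)]
    # Lower-triangular Toeplitz matrix A_l: a on the diagonal,
    # (-a)^(i-j-1) * beta below the diagonal.
    A = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        for j in range(i + 1):
            A[i][j] = a if i == j else (-a) ** (i - j - 1) * beta
    out = []
    for _ in range(horizon):
        out.append(state)
        # Recursion L(k+1) = A_l L(k)
        state = [sum(A[i][j] * state[j] for j in range(n_terms))
                 for i in range(n_terms)]
    return out


def gram(seq):
    """Gram matrix sum_k L(k) L(k)^T; close to identity for an orthonormal basis."""
    n = len(seq[0])
    return [[sum(v[i] * v[j] for v in seq) for j in range(n)] for i in range(n)]


# Assumed example values: pole a = 0.5, three basis functions, 200 samples.
basis = laguerre_basis(a=0.5, n_terms=3, horizon=200)
G = gram(basis)
```

In a Laguerre‐based MPC, the future input increments are written as a weighted sum of these basis functions, so the optimizer searches over a few coefficients rather than one value per step of the control horizon.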
As wind power generation increases, the safety and economy of power system operation are strongly affected by the intermittency and fluctuation of wind power. A Hybrid Energy Storage System (HESS) consisting of a Battery Energy Storage System (BESS) and a Flywheel Energy Storage System (FESS) exploits the complementary characteristics of different energy storage devices and can alleviate the uncertainty of wind power. This article proposes a coordinated control strategy for a Hybrid Energy Storage Array (HESA), using a group consensus algorithm based on Model Predictive Control (MPC), to smooth wind power fluctuations. To allocate power commands to the FESS and BESS, Empirical Mode Decomposition (EMD) is used to extract components of the wind power fluctuation with different frequency‐domain characteristics, which serve as the commands. Moreover, a group consensus algorithm based on MPC is proposed to perform the adaptive power allocation among energy storage units. Finally, actual wind farm data are used in simulation to verify the effect of the proposed control strategy. The results show that the developed MPC‐based group consensus algorithm can cope with power commands of different frequencies, avoid overcharging and over‐discharging of the energy storage media, and smooth wind power effectively.
{"title":"A hybrid energy storage array group control strategy for wind power smoothing","authors":"Tong Tong, Le Wei, Yuanye Chen, Fang Fang","doi":"10.1049/cth2.12698","DOIUrl":"https://doi.org/10.1049/cth2.12698","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":" 46","pages":""},"publicationDate":"2024-07-06","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141673003"}
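The wind‐smoothing abstract above splits the wind power fluctuation by frequency content and sends the parts to different storage devices. The sketch below illustrates that idea only. It deliberately swaps EMD for two simple moving averages, and the window lengths and the sample `wind` series are made‐up values, not data from the paper. The consensus and MPC layers are not modelled.

```python
def moving_average(signal, window):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    out = []
    for k in range(len(signal)):
        lo, hi = max(0, k - half), min(len(signal), k + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def split_wind_power(wind, slow_window=9, fast_window=3):
    """Split wind power into grid, battery (BESS) and flywheel (FESS) parts."""
    grid = moving_average(wind, slow_window)                # smoothed power sent to the grid
    residual = [w - g for w, g in zip(wind, grid)]          # fluctuation the storage must absorb
    battery = moving_average(residual, fast_window)         # slower part of the fluctuation
    flywheel = [r - b for r, b in zip(residual, battery)]   # faster part of the fluctuation
    return grid, battery, flywheel


# Hypothetical wind power samples (arbitrary units).
wind = [10.0, 12.5, 9.0, 14.0, 11.0, 8.5, 13.0, 12.0, 9.5, 10.5, 15.0, 11.5]
grid, battery, flywheel = split_wind_power(wind)
```

By construction the three parts add up to the original wind power at every sample, which is the power balance any such allocation has to respect.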
This paper analyses the dynamic response of a corrupted spacecraft rendezvous system from the attacker's perspective. The optimal data injection attack problem is formulated by constructing a quadratic tradeoff cost function. First, the optimal attack strategy that lets the attacker remain undetected, together with a sufficient condition for its existence, is derived in a manner similar to optimal control. Unlike most existing works, which assume that the system matrices are known, this paper explores the optimal attack strategy without knowledge of the system matrices. A model‐free Q‐learning approach is designed to solve the attacker's optimization problem. A critic network and an action network adaptively tune the attacker's value and action forward in time. For a more practical situation, a model‐free attack strategy design is implemented based only on measured input/output data. Finally, simulation results on the spacecraft system show the effectiveness of the proposed method for model‐free attack strategy design.
{"title":"Optimal data injection attack design for spacecraft systems via a model free Q‐learning approach","authors":"Huanhuan Yuan, Mengbi Wang, Chao Xi","doi":"10.1049/cth2.12685","DOIUrl":"https://doi.org/10.1049/cth2.12685","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"51 48","pages":""},"publicationDate":"2024-06-14","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141339645"}
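The Q‐learning abstract above solves a quadratic‐cost problem without knowing the system matrices. The sketch below shows the general mechanism on a scalar linear system: a quadratic Q‐function is fitted from measured tuples `(x, u, cost, x_next)` and the feedback gain is improved from it. It is a generic Q‐function policy iteration, not the paper's attack design or its critic and action networks. The plant parameters `A_TRUE`, `B_TRUE` and the weights `Q_W`, `R_W` are assumed values used only to generate data and a Riccati reference.

```python
def solve3(M, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    A = [row[:] + [rhs] for row, rhs in zip(M, y)]
    n = 3
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (A[r][n] - sum(A[r][k] * x[k] for k in range(r + 1, n))) / A[r][r]
    return x


def features(x, u):
    # Q(x, u) = h11*x^2 + 2*h12*x*u + h22*u^2
    return [x * x, 2.0 * x * u, u * u]


def q_policy_iteration(samples, gain=0.0, iterations=10):
    """samples: (x, u, cost, x_next) tuples; the learner never sees the plant model."""
    for _ in range(iterations):
        rows, rhs = [], []
        for x, u, cost, x_next in samples:
            phi = features(x, u)
            phi_next = features(x_next, -gain * x_next)  # next action follows u = -gain*x
            rows.append([p - pn for p, pn in zip(phi, phi_next)])
            rhs.append(cost)
        h11, h12, h22 = solve3(rows, rhs)  # policy evaluation (Bellman equation)
        gain = h12 / h22                   # policy improvement: minimize Q over u
    return gain


# Assumed plant and weights, used only to produce data and a reference gain.
A_TRUE, B_TRUE, Q_W, R_W = 0.9, 0.5, 1.0, 1.0
samples = [(x, u, Q_W * x * x + R_W * u * u, A_TRUE * x + B_TRUE * u)
           for x, u in [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]]
learned_gain = q_policy_iteration(samples)

# Model-based reference: iterate the scalar Riccati equation to a fixed point.
p = Q_W
for _ in range(1000):
    p = Q_W + A_TRUE ** 2 * p - (A_TRUE * B_TRUE * p) ** 2 / (R_W + B_TRUE ** 2 * p)
riccati_gain = A_TRUE * B_TRUE * p / (R_W + B_TRUE ** 2 * p)
```

The initial gain of zero is admissible here because the assumed open‐loop plant is stable; an unstable plant would need a stabilizing initial gain for the policy evaluation step to be well posed.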
Guoyong Qian, Dongbo Xie, Dawei Bi, Qi Wang, Liqing Chen, Hai Wang
Accurately and quickly detecting obstacles ahead is a prerequisite for intelligent driving. A detection scheme that combines light detection and ranging (LiDAR) with a camera copes with complex road conditions far better than a single sensor. However, the combined scheme significantly increases the amount of computation, so ensuring the real‐time performance of the sensing algorithms has become a new challenge. To this end, the paper introduces an improved dynamic obstacle detection algorithm based on YOLOv7 (You Only Look Once version 7) to overcome the slow and unstable detection of traditional methods. Concretely, MobileNetV3 replaces the backbone network of the original YOLOv7 architecture, reducing the computational overhead. The network adds a dedicated detection layer for small‐scale targets and incorporates a convolutional block attention module to improve detection of small obstacles. Furthermore, the framework adopts the Efficient Intersection over Union (EIoU) loss function, which is designed to mitigate mutual occlusion among detected objects. On a dataset of 27,362 labelled KITTI samples, the improved YOLOv7 algorithm achieves a mean average precision of 92.6% at 82 frames per second; compared with the traditional YOLOv7 algorithm, the model size is reduced by 85.9% with an accuracy loss of only 1.5%. In addition, the paper builds a virtual scene to test the improved algorithm and fuses LiDAR and camera data. Experiments on a test vehicle equipped with a camera and a LiDAR sensor demonstrate the method's effectiveness and strong performance. The improved obstacle detection algorithm can significantly reduce the computational cost of the environment perception task, meets the requirements of real‐world applications, and is crucial for achieving safer and smarter driving.
{"title":"Lightweight environment sensing algorithm for intelligent driving based on improved YOLOv7","authors":"Guoyong Qian, Dongbo Xie, Dawei Bi, Qi Wang, Liqing Chen, Hai Wang","doi":"10.1049/cth2.12704","DOIUrl":"https://doi.org/10.1049/cth2.12704","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":" 1","pages":""},"publicationDate":"2024-06-09","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141367446"}
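The YOLOv7 abstract above adopts the Efficient Intersection over Union (EIoU) loss. As a hedged illustration, the sketch below computes the commonly published form of that loss for a single pair of axis‐aligned boxes: the IoU term plus penalties on centre distance, width difference, and height difference, each normalized by the smallest enclosing box. The box format `(x1, y1, x2, y2)` and the sample boxes are assumptions; the paper's training code may differ in detail.

```python
def eiou_loss(box, gt):
    """EIoU loss for two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    bx1, by1, bx2, by2 = box
    gx1, gy1, gx2, gy2 = gt

    # Intersection over union
    inter_w = max(0.0, min(bx2, gx2) - max(bx1, gx1))
    inter_h = max(0.0, min(by2, gy2) - max(by1, gy1))
    inter = inter_w * inter_h
    bw, bh = bx2 - bx1, by2 - by1
    gw, gh = gx2 - gx1, gy2 - gy1
    iou = inter / (bw * bh + gw * gh - inter)

    # Width and height of the smallest box enclosing both boxes
    cw = max(bx2, gx2) - min(bx1, gx1)
    ch = max(by2, gy2) - min(by1, gy1)

    # Squared distance between the two box centres
    centre_dist2 = (((bx1 + bx2) - (gx1 + gx2)) / 2.0) ** 2 \
        + (((by1 + by2) - (gy1 + gy2)) / 2.0) ** 2

    return (1.0 - iou
            + centre_dist2 / (cw ** 2 + ch ** 2)   # centre-distance penalty
            + (bw - gw) ** 2 / cw ** 2             # width penalty
            + (bh - gh) ** 2 / ch ** 2)            # height penalty


same = eiou_loss((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))
shifted = eiou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
disjoint = eiou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 3.0, 3.0))
```

Unlike a plain IoU loss, the extra terms stay informative when the boxes do not overlap at all, because the centre distance still changes as the predicted box moves toward the target.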
Guojun Nan, Zixiang Shen, Haibo Du, Lanlin Yu, Wenwu Zhu
The planning of power transmission line projects spans vast and complex geographical terrain. To address the complexity of transmission line planning and achieve lower line costs, this study proposes a novel intelligent line planning method that, for the first time, combines the Dueling Double Deep Q Network (D3QN) with the prioritized experience replay (PER) mechanism. First, the reward function is tied to metrics that drive the construction cost of a power transmission line, such as line length, number of corner points, and geographical environment data. Second, the D3QN algorithm is formulated by integrating Double DQN and Dueling DQN. During training, the network's input information is divided into two components, in line with the characteristics of power transmission line planning projects. Finally, the PER mechanism is used to improve the convergence efficiency of the algorithm, addressing the cost differences caused by the varying number of corner points in the planned path. To test the feasibility of the algorithm, experiments were conducted using real maps. Compared with the traditional ant colony optimization (ACO) algorithm, the D3QN‐PER deep reinforcement learning algorithm reduces the line length by more than 4% and the number of corner points by more than 60%.
{"title":"Smart line planning method for power transmission based on D3QN‐PER algorithm","authors":"Guojun Nan, Zixiang Shen, Haibo Du, Lanlin Yu, Wenwu Zhu","doi":"10.1049/cth2.12689","DOIUrl":"https://doi.org/10.1049/cth2.12689","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":" 4","pages":""},"publicationDate":"2024-06-09","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"141367311"}
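The line‐planning abstract above combines three standard ingredients: a dueling value head, a Double DQN target, and prioritized experience replay. The sketch below shows the arithmetic of each piece in isolation, with plain lists standing in for network outputs. The exponent `alpha`, the offset `eps`, and all sample numbers are assumed illustration values, not settings from the paper.

```python
def dueling_q(value, advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, .)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN: the online net picks the action, the target net evaluates it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best]


def per_probabilities(td_errors, alpha=0.6, eps=1e-3):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha, p_i = |delta_i| + eps."""
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


q_values = dueling_q(1.0, [0.0, 2.0, 4.0])
target = double_dqn_target(1.0, 0.9, [0.2, 0.8], [5.0, 3.0], done=False)
probs = per_probabilities([0.1, 1.0, 2.0])
```

In the example, the online network prefers action 1, so the target uses the target network's value 3.0 for that action rather than its own maximum of 5.0, which is how Double DQN curbs overestimation. Transitions with larger temporal‐difference errors receive higher sampling probability under PER.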
Bo Lu, L. Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao
To tackle the challenges presented by the two‐player zero‐sum game (TZSG) in three‐dimensional space, this study introduces an enhanced deep Q‐learning (DQN) algorithm that uses a long short‐term memory (LSTM) network. The primary objective of this algorithm is to strengthen the temporal correlation of the TZSG solution in three‐dimensional space. Additionally, it incorporates the hindsight experience replay (HER) mechanism to improve the learning efficiency of the network and mitigate the “sparse reward” issue that arises during prolonged training of the agent when solving the TZSG in three‐dimensional space. Furthermore, this method enhances the convergence and stability of the overall solution. An intelligent training environment centred on an airborne agent and its mutual pursuit interaction scenario was designed to verify the proposed approach's effectiveness. The training and comparison results show that the LSTM‐DQN‐HER algorithm outperforms similar algorithms in solving the TZSG in three‐dimensional space. In conclusion, this paper presents an improved DQN algorithm based on LSTM that incorporates the HER mechanism to address the challenges posed by the TZSG in three‐dimensional space. The proposed algorithm enhances the solution's temporal correlation, learning efficiency, convergence, and stability. The simulation results confirm its superior performance in solving the TZSG in three‐dimensional space.
{"title":"Enhanced LSTM‐DQN algorithm for a two‐player zero‐sum game in three‐dimensional space","authors":"Bo Lu, L. Ru, Maolong Lv, Shiguang Hu, Hongguo Zhang, Zilong Zhao","doi":"10.1049/cth2.12677","DOIUrl":"https://doi.org/10.1049/cth2.12677","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"91 16","pages":""},"publicationDate":"2024-05-14","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140978382"}
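The LSTM‐DQN abstract above uses hindsight experience replay (HER) to cope with sparse rewards. A minimal sketch of the relabeling step follows, assuming the common "final" strategy, in which every transition of a failed episode is replayed as if the last achieved position had been the goal. The transition layout, the tolerance `tol`, and the toy three‐dimensional episode are hypothetical and are not taken from the paper.

```python
def sparse_reward(achieved, goal, tol=0.5):
    """Return 0 when the achieved position is within tol of the goal, else -1."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(episode, reward_fn):
    """Relabel (state, action, achieved_goal, goal) tuples with the final achieved goal."""
    final_goal = episode[-1][2]  # 'final' strategy: pretend the end point was the goal
    relabeled = []
    for state, action, achieved, _original_goal in episode:
        reward = reward_fn(achieved, final_goal)
        relabeled.append((state, action, achieved, final_goal, reward))
    return relabeled


# Toy episode in 3-D space that never reaches its original goal (5, 5, 5).
goal = (5.0, 5.0, 5.0)
episode = [
    ((0.0, 0.0, 0.0), "climb", (0.0, 0.0, 1.0), goal),
    ((0.0, 0.0, 1.0), "turn", (1.0, 0.0, 1.0), goal),
    ((1.0, 0.0, 1.0), "climb", (1.0, 0.0, 2.0), goal),
]
original_rewards = [sparse_reward(t[2], t[3]) for t in episode]
relabeled = her_relabel(episode, sparse_reward)
relabeled_rewards = [t[4] for t in relabeled]
```

Every original reward in this episode is -1, so the agent learns nothing about success from it. After relabeling, the last transition carries a reward of 0, which gives the replay buffer at least one successful example.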
This paper considers the problem of robust optimal tracking control of multiple autonomous underwater vehicles (AUVs) subject to uncertain external disturbances. First, a Takagi‐Sugeno (T‐S) fuzzy technique is used to convert the high‐order nonlinear multi‐AUV system into a series of linearized subsystems. Second, a novel fully distributed sliding mode control (FDSMC) strategy is proposed to attenuate the disturbances. Meanwhile, leader‐following consensus and near‐optimization of the energy‐cost function for the multi‐AUV system are achieved simultaneously through the designed optimal nominal control protocol. Moreover, the proposed control strategy imposes milder constraints on the communication topologies. Finally, the effectiveness of the proposed FDSMC strategy is verified by numerical simulation studies.
{"title":"Robust optimal tracking control of multiple autonomous underwater vehicles subject to uncertain disturbances","authors":"Guan Huang, Zhuo Zhang, Weisheng Yan, Rongxin Cui, Shouxu Zhang, Xinxin Guo","doi":"10.1049/cth2.12671","DOIUrl":"https://doi.org/10.1049/cth2.12671","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":" 80","pages":""},"publicationDate":"2024-05-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140990657"}
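The multi‐AUV abstract above uses sliding mode control to attenuate bounded disturbances. The sketch below shows the basic mechanism on a single scalar error state, not the fully distributed, fuzzy‐model‐based scheme of the paper: a switching input whose gain exceeds the disturbance bound drives the error to a small band around zero in finite time. The dynamics `x' = u + d`, the gain, the disturbance, and the step size are all assumed for the example.

```python
import math


def simulate_smc(x0=1.0, gain=2.0, dt=1e-3, steps=3000):
    """Euler simulation of x' = u + d with u = -gain * sign(x) and |d| <= 0.8 < gain."""
    x = x0
    history = []
    for k in range(steps):
        t = k * dt
        d = 0.8 * math.sin(5.0 * t)  # bounded disturbance, unknown to the controller
        s = x                        # sliding variable: here simply the tracking error
        sign_s = 1.0 if s > 0 else -1.0 if s < 0 else 0.0
        u = -gain * sign_s           # switching control law
        x += dt * (u + d)
        history.append(x)
    return history


history = simulate_smc()
final_error = abs(history[-1])
late_peak = max(abs(v) for v in history[2000:])
```

Because the gain exceeds the disturbance bound by 1.2, the error shrinks at a rate of at least 1.2 per second and reaches the neighbourhood of zero within roughly 0.83 s from the assumed initial value of 1. The remaining chattering band scales with the step size, which is why practical designs often smooth the sign function.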
Coverage control is generally studied in a convex region, where both the agent kinematics and the coverage environment are subject to strong restrictions. It is difficult to apply these results directly to practical scenarios such as road or indoor environments. This study investigates the multi‐agent coverage control problem in a line intersection region, where the agents can only move along the given lines. To describe the agents' motion in this region, the moving directions and velocities of the agents are analyzed first. Then, the coverage control model for the multi‐agent system in the line intersection region is presented, in which the cost function is based on each agent's minimum moving distance and the agent motions serve as constraints. To solve this constrained coverage problem, a deep Q‐learning network (DQN) is employed to find the optimal position of each agent in the line intersection region. Finally, numerical simulations are presented to validate the feasibility and effectiveness of the proposed approaches.
{"title":"DQN based coverage control for multi‐agent system in line intersection region","authors":"Zuo Lei, Tengfei Zhang, Zhang Jinqi, Yan Maode","doi":"10.1049/cth2.12670","DOIUrl":"https://doi.org/10.1049/cth2.12670","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"5 11","pages":""},"publicationDate":"2024-04-22","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140675105"}
Dingye Zhang, Hang Yu, Keren Dai, Wenjun Yi, He Zhang, Zhiming Lei
In this paper, a novel three‐dimensional fixed‐time integrated guidance and control (IGC) scheme with multi‐stage interconnected observers is proposed for cooperative attacks by multiple missiles against a maneuvering target under impact angle and input saturation constraints. External disturbances, modeling errors, and aerodynamic parameter variations are treated as system uncertainties, and a three‐channel fully coupled IGC model for multiple missiles is established. The IGC system is designed based on fixed‐time stability theory, sliding mode control, and the backstepping technique. Three inter‐cascaded fixed‐time disturbance observers based on an improved super‐twisting algorithm are designed to estimate and compensate for system uncertainties. Second‐order command filters are used to constrain the virtual control signals, and additional filtering error subsystems are introduced to compensate for the tracking errors of the filters. System stability and uniform ultimate fixed‐time boundedness of all states are proven using Lyapunov stability theory. Finally, the limits of the acceleration components of the maneuvering target perpendicular to the line‐of‐sight direction are derived. The effectiveness of the designed IGC scheme and the ability of the multi‐stage interconnected observers to sense disturbances with each other are verified through simulations.
{"title":"Multiple‐missile fixed‐time integrated guidance and control design with multi‐stage interconnected observers under impact angle and input saturation constraints","authors":"Dingye Zhang, Hang Yu, Keren Dai, Wenjun Yi, He Zhang, Zhiming Lei","doi":"10.1049/cth2.12658","DOIUrl":"https://doi.org/10.1049/cth2.12658","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"7 5","pages":""},"publicationDate":"2024-04-12","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140710599"}
This paper studies the cooperative tracking control problem of interacting multi‐agent systems (MASs) under undirected communication. Based on differential graphical game theory, the MAS tracking control problem is formulated as an infinite‐horizon cooperative differential graphical game, in which a multi‐objective optimization problem is designed and then cast into a Pareto‐equivalent single‐objective optimization problem using a scalarization method. Necessary and sufficient conditions for the existence of the Pareto‐optimal strategy for the game‐theoretic tracking control are established, and it is proven that the solution to the integral Bellman optimality equation yields a Pareto‐optimal strategy. Then, an off‐policy integral reinforcement learning (IRL) scheme is developed to find the optimal control strategy in a purely data‐driven manner, which requires less computational effort than the traditional learning scheme. Simulations are conducted to validate the effectiveness of the proposed game and IRL‐based tracking control method.
{"title":"Differential graphical game‐based multi‐agent tracking control using integral reinforcement learning","authors":"Yaning Guo, Qi Sun, Yintao Wang, Quan Pan","doi":"10.1049/cth2.12667","DOIUrl":"https://doi.org/10.1049/cth2.12667","PeriodicalId":502998,"journal":{"name":"IET Control Theory & Applications","volume":"75 2","pages":""},"publicationDate":"2024-04-12","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"140710997"}