
IEEE Transactions on Systems Man Cybernetics-Systems: Latest Articles

Approximation-Based Admittance Control of Robot-Environment Interaction With Guaranteed Performance
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-08-02 DOI: 10.1109/TSMC.2024.3430265
Guangzhu Peng;Tao Li;Chenguang Yang;C. L. Philip Chen
Humans are able to interact compliantly with the environment by adapting their motion trajectory and contact force. Robots with human-like versatility can perform contact tasks more efficiently and with high motion precision. Motivated by these capabilities, we develop an approximation-based admittance control strategy that adapts and tracks the trajectory with guaranteed performance for robots interacting with unknown environments. In this strategy, the robot can adapt and compensate its feedforward force and stiffness to interact with the unknown environment. In particular, a reference trajectory is generated through admittance control to achieve a desired interaction level. To improve the interaction performance, a tracking error bound for both the transient and steady states is prespecified, and a controller is designed to ensure the tracking control performance. In the presence of unknown robot dynamics, neural networks are integrated into the tracking controller to compensate for uncertainties. The stability and convergence conditions of the closed-loop system are analyzed using Lyapunov theory. The effectiveness of the proposed control method is demonstrated on the Baxter robot.
Citations: 0
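The admittance-control abstract above says that a reference trajectory is generated from the interaction force. A minimal one-dimensional sketch of that idea follows, assuming the textbook mass-damper-spring admittance model m·e'' + d·e' + k·e = f_ext with e = x_ref − x_desired. The function name, parameter values, and integration scheme are illustrative assumptions, not taken from the paper, and the adaptive stiffness, neural-network compensation, and error bounds it describes are not modeled.

```python
def admittance_reference(x_desired, f_ext, m=1.0, d=20.0, k=100.0, dt=0.001):
    """Integrate m*e'' + d*e' + k*e = f_ext (semi-implicit Euler) and
    return the reference trajectory x_ref = x_desired + e.

    All parameter values are hypothetical, chosen only for illustration.
    """
    e, e_dot = 0.0, 0.0
    x_ref = []
    for x_d, f in zip(x_desired, f_ext):
        e_ddot = (f - d * e_dot - k * e) / m  # admittance dynamics
        e_dot += e_ddot * dt                  # update velocity first
        e += e_dot * dt                       # then position
        x_ref.append(x_d + e)
    return x_ref


# A constant 10 N contact force applied for 2 s while the desired position is 0.
n_steps = 2000
x_ref = admittance_reference([0.0] * n_steps, [10.0] * n_steps)
```

Under a constant force the deviation settles near f_ext / k, so a softer virtual spring (smaller k) yields a more compliant reference.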
Efficient Adaptive Large Neighborhood Search for Sensor–Weapon–Target Assignment
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-08-02 DOI: 10.1109/TSMC.2024.3431033
Yang Wang;Junpeng Wang;Jin-Kao Hao;Jianguang Feng
We study a sensor–weapon–target assignment (S-WTA) problem that considers the desired probability of target destruction and aims to minimize the total cost of combat resources. Lower and upper bounds for the S-WTA problem are obtained by constructing linear approximation models. We also propose an adaptive large neighborhood search (ALNS) algorithm characterized by a model-driven repair phase to solve this problem. The destruction phase adaptively selects a destruction operator to remove partial resource assignments and produces an incomplete reference solution. For the destroyed solution, the repair phase generates a reduced subproblem that optimizes only the destroyed parts while keeping the other parts fixed. Each subproblem is formulated as a mixed integer programming model and solved by a general-purpose solver to repair the destroyed solution. Computational experiments show that the approximation formulations can obtain tight lower and upper bounds for most problem instances. Moreover, our proposed ALNS algorithm is competitive with the solver for small instances and effectively solves large instances. In addition, we experimentally demonstrate that our ALNS outperforms state-of-the-art algorithms in the literature, and that the proposed model-driven solution repair phase outperforms the traditional heuristic repair operators.
Citations: 0
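The destroy/repair loop described in the ALNS abstract above can be sketched on a toy instance. This is only an illustration under loud assumptions: sensors are omitted, the repair step is a greedy heuristic standing in for the paper's mixed integer programming repair, only improving candidates are accepted, and every name, operator, weight update, and data value is hypothetical.

```python
import random


def alns(cost, prob, required, iters=200, seed=0):
    """Toy ALNS. cost[w] is the cost of weapon w and prob[w][t] its kill
    probability on target t. Each weapon serves at most one target, and
    every target must reach the `required` destruction probability."""
    rng = random.Random(seed)
    n_w, n_t = len(cost), len(prob[0])

    def kill_prob(assign, t):
        miss = 1.0
        for w, tgt in assign.items():
            if tgt == t:
                miss *= 1.0 - prob[w][t]
        return 1.0 - miss

    def repair(partial):
        # Greedy stand-in (assumption) for the MIP-based repair in the abstract.
        assign = dict(partial)
        targets = list(range(n_t))
        rng.shuffle(targets)
        for t in targets:
            while kill_prob(assign, t) < required:
                free = [w for w in range(n_w)
                        if w not in assign and prob[w][t] > 0]
                if not free:
                    return None  # this partial solution cannot be completed
                w = min(free, key=lambda u: cost[u] / prob[u][t])
                assign[w] = t
        return assign

    def destroy_random(assign):
        drop = set(rng.sample(sorted(assign), max(1, len(assign) // 3)))
        return {w: t for w, t in assign.items() if w not in drop}

    def destroy_costly(assign):
        ranked = sorted(assign, key=lambda w: cost[w], reverse=True)
        drop = set(ranked[: max(1, len(assign) // 3)])
        return {w: t for w, t in assign.items() if w not in drop}

    def total(assign):
        return sum(cost[w] for w in assign)

    operators = [destroy_random, destroy_costly]
    weights = [1.0] * len(operators)
    best = repair({})
    if best is None:
        raise ValueError("no feasible assignment found")
    for _ in range(iters):
        # Adaptive step: operators are drawn in proportion to their weights.
        i = rng.choices(range(len(operators)), weights=weights)[0]
        candidate = repair(operators[i](best))
        if candidate is not None and total(candidate) < total(best):
            best = candidate
            weights[i] += 1.0  # reward the operator that led to improvement
        else:
            weights[i] = max(0.1, 0.95 * weights[i])
    return best, total(best)


# Hypothetical instance: 6 weapons, 2 targets, required kill probability 0.75.
cost = [4, 3, 5, 2, 6, 3]
prob = [[0.6, 0.3], [0.5, 0.5], [0.8, 0.2], [0.3, 0.4], [0.7, 0.9], [0.4, 0.6]]
best, best_cost = alns(cost, prob, required=0.75)
```

Practical ALNS implementations usually accept some worsening candidates as well (for example with a simulated-annealing criterion); the improving-only rule here keeps the sketch short.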
Decentralized and Privacy-Preserving Learning of Approximate Stackelberg Solutions in Energy Trading Games With Demand Response Aggregators
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-08-02 DOI: 10.1109/TSMC.2024.3432000
Styliani I. Kampezidou;Justin Romberg;Kyriakos G. Vamvoudakis;Dimitri N. Mavris
On the pathway to 2030 electricity generation decarbonization and 2050 net-zero economies, scalable integration of distributed load can support environmental goals and also help alleviate smart grid operational issues through its electricity market participation. In this work, a novel Stackelberg game theoretic framework is proposed for trading energy bidirectionally between the demand-response (DR) aggregator and the prosumers (distributed load). This formulation allows for flexible energy arbitrage and additional monetary rewards while ensuring that the prosumers’ desired daily energy demand is met. Then, a scalable (linear in the number of prosumers and the number of learning samples), decentralized, privacy-preserving algorithm is proposed to find approximate equilibria with online sampling and learning of the prosumers’ cumulative best response, which finds applications beyond this energy game. Moreover, cost bounds are provided on the quality of the approximate equilibrium solution. Finally, real data from the California day-ahead market and the UC Davis campus building energy demands are utilized to demonstrate the efficacy of the proposed framework and algorithm.
Citations: 0
Scientometric Analysis of Quantum Algorithms for VANET Optimization
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-08-01 DOI: 10.1109/TSMC.2024.3428707
Pooja;Sandeep Kumar Sood
The rapid proliferation of quantum information technologies, spanning theoretical investigations to practical experiments, has generated a number of research papers and documents on quantum algorithms. Consequently, the current research serves as a gateway for interested readers to comprehend the status quo of quantum algorithms, with a specific focus on vehicular network optimization. It aims to explore the research patterns and latest trends by analyzing a dataset sourced from the Scopus and Web of Science databases. The scientometric implications offer valuable insights into publication patterns, keyword co-occurrence, author co-citation, country collaboration, and burst references. These analyses delineate the temporal progression, prominent research topics, emerging research areas, leading collaborative nations, prolific authors, and research trends within this knowledge domain. The results reveal that smart power grids, the traveling salesman problem, electric vehicle charging, battery life estimation, and air traffic control are emerging research areas. Similarly, quantum approximate optimization algorithms, adiabatic quantum computing, quantum-inspired evolutionary algorithms, and quantum annealing emerge as prominent quantum algorithms employed for vehicular network optimization problems. In addition, a systematic literature analysis is objectively conducted to discern key insights, research challenges, and future research directions in the current knowledge domain.
Citations: 0
Supervisor Synthesis Using Labeled Petri Nets for Forbidden State Specifications
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-08-01 DOI: 10.1109/TSMC.2024.3422925
Yihui Hu;Ziyue Ma;Ruotian Liu;Maria Pia Fanti;Zhiwu Li
This research focuses on the forbidden state problem in the framework of labeled Petri nets (LPNs), i.e., designing a supervisor for a plant modeled by an LPN such that the closed-loop system cannot reach a set of predefined forbidden markings and does not contain any deadlock. Unlike the traditional control scheme, the supervisor derived in this work can observe not only the observable transitions but also quiescence information. First, a new structure named an extended basis reachability graph (EBRG) is introduced to describe the reachability space of an LPN without computing all reachable markings. Based on an EBRG, a basis observer is then devised to represent the behavior of an LPN. Some states in the basis observer are defined as bad states and control-induced deadlocks, which relate to the undesirable behavior of the plant. Finally, an algorithm is introduced to compute a supervisor based on the basis observer. The consideration of system quiescence provides extra information on the marking estimation of the closed-loop system such that certain disabled transitions are re-enabled. Consequently, the supervisor developed in this article is generally more permissive than those that do not observe quiescence.
Citations: 0
Distributed Inertial Proximal Neurodynamic Approach for Sparse Recovery on Directed Networks
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-07-31 DOI: 10.1109/TSMC.2024.3408473
You Zhao;Xing He;Mingliang Zhou;Junzhi Yu;Tingwen Huang
This article investigates a fully distributed inertial neurodynamic approach for sparse recovery. The approach is based on proximal operators and inertial terms. It aims to solve the $L_{1}$-norm minimization problem with consensus and linear observation constraints over directed communication networks. The proposed neurodynamic approach only requires the communication network to be directed and weight-balanced, and it involves neither a central processing node nor global parameters, which means that no single node can access the entire network and observe it at any time, so it is fully distributed. To deal effectively with the nonsmooth objective function, the $L_{1}$-norm, the proximal operator method is used. To handle the linear observation and consensus constraints efficiently, a primal-dual method is applied to the inertial dynamic system. With the aid of maximal monotone operator theory and Baillon-Haddad lemmas, it is shown that the trajectories of our approach converge to a consensus at the optimal solution, provided that the distributed parameters satisfy technical conditions. In addition, we aim to demonstrate the weak convergence of the trajectories in our proposed neurodynamic approach toward the zeros of the optimal operator in Hilbert space, using Opial’s lemma. Finally, comparative experiments on sparse signal and image recovery confirm the efficiency and effectiveness of our proposed neurodynamic approach.
Citations: 0
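The sparse-recovery abstract above handles the nonsmooth L1-norm through its proximal operator. For the scaled L1-norm that operator has the well-known closed form of elementwise soft-thresholding, sketched below. Only this standard building block is shown; the function name is an assumption, and the paper's distributed inertial primal-dual dynamics are not reproduced.

```python
def soft_threshold(v, lam):
    """Proximal operator of lam * ||x||_1, applied elementwise:
    shrink each entry toward zero by lam and clip to zero inside [-lam, lam]."""
    out = []
    for x in v:
        if x > lam:
            out.append(x - lam)
        elif x < -lam:
            out.append(x + lam)
        else:
            out.append(0.0)
    return out


# Entries with magnitude at most lam become exactly zero, which is what
# produces sparsity in proximal methods for L1-regularized problems.
shrunk = soft_threshold([3.0, -0.5, 1.2, -2.0], lam=1.0)
```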
Distributed Pursuit-Evasion Game of Limited Perception USV Swarm Based on Multiagent Proximal Policy Optimization
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-07-30 DOI: 10.1109/TSMC.2024.3429467
Fanbiao Li;Mengmeng Yin;Tengda Wang;Tingwen Huang;Chunhua Yang;Weihua Gui
This article proposes a distributed capture strategy optimization method for the pursuit-evasion game involving multiple unmanned surface vehicles. Considering the limited perception range of each pursuer, a multiagent proximal policy optimization method combined with a novel velocity control mechanism is utilized to guide the pursuers in approaching the evader and forming a dynamic encirclement. Moreover, to facilitate deep reinforcement learning (DRL) training, a bidirectional gated recurrent unit feature network is constructed to extract fixed-length vector representations from the variable-length observation sequences. For policy training, virtual barriers and curriculum learning techniques are employed during the training process to further improve the generalization capability and convergence speed of the policy. Finally, our method is compared with other DRL methods through comparative simulation experiments and virtual reality scene testing based on the Gazebo three-dimensional physics engine, verifying its significant advantages in policy convergence speed, capture efficiency, and generalization capability.
Citations: 0
Prescribed-Time Fault-Tolerant Control of the FO Decoupled Dual-Mass MEMS Gyro With Deferred Constraints-Design and Implementation
IF 8.6 Zone 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-07-30 DOI: 10.1109/TSMC.2024.3427312
Shaohua Luo;Yongduan Song;Guangwei Deng;Junxing Zhang;Hassen M. Ouakad
This article mainly investigates the model design, field programmable gate array (FPGA) implementation, and prescribed-time fault-tolerant control of a fractional-order (FO) decoupled dual-mass micro-electro-mechanical system (MEMS) gyro with deferred constraints. First, the structure of such a MEMS gyro is designed to eliminate the linear acceleration in the sensing direction, and its mathematical model is built based on Lagrange’s equation. The dynamical analysis shows that such a gyro can generate unpredictable, random, and disordered motions under various FOs, stiffness cross-coupling coefficients, and proof masses. The designed FPGA circuit further demonstrates the undesirable chaotic oscillations of such a MEMS gyro and good utilization of hardware resources, avoiding time-consuming board redesign. Second, to better solve the problems of constraints, actuator faults, uncertainties, drive couplings, and chaotic oscillations, a dependent deferred-error function superimposed on a prescribed-time function is used to guarantee no violation of constraints after a finite time. Furthermore, a $\beta$-cut type-2 fuzzy logic system (T2FLS) is employed to handle the uncertainty, and an FO hyperbolic tangent tracking differentiator (HTTD) is utilized to deal with the direct FO derivative and repeated derivative in the framework of backstepping control. Then, a prescribed-time fault-tolerant control scheme for the FO decoupled dual-mass MEMS gyro is proposed under actuator faults. Finally, extensive simulation results verify the feasibility and effectiveness of our scheme.
Citations: 0
Novel Parallel Formulation for Iterative Reinforcement Learning Control
IF 8.6 Zone 1 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-07-29 DOI: 10.1109/TSMC.2024.3428482
Ding Wang;Jiangyu Wang;Lingzhi Hu;Liguo Zhang
Parallelization is widely employed to improve the exploration ability of controllers. However, it is rare to provide a lightweight scheme for reducing homogeneous policies with theoretical guarantees. This article is concerned with a novel parallel scheme for solving optimal control problems. In brief, we design a novel global indicator that inherits the theoretical guarantees of a class of iterative reinforcement learning algorithms. By generating a tentative function, the global indicator can guide and communicate with parallel controllers to accelerate the learning process. Using two typical exploration policies, the novel parallel scheme can rapidly compress the neighborhood of the optimal cost function. Besides, two parallel algorithms based on value iteration and Q-learning are established to improve the data efficiency through different extensions. Finally, two benchmark problems are presented to demonstrate the learning effectiveness of the novel parallel scheme.
Citations: 0
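The abstract above names value iteration as one of the two base algorithms but does not spell out the parallel scheme or the global indicator. As background only, here is a minimal sketch of plain, single-controller value iteration for a cost-minimizing control problem. The four-state chain, the unit step cost, and the function name `value_iteration` are assumptions made for illustration and are not taken from the article.

```python
def value_iteration(n_states=4, goal=3, gamma=1.0, tol=1e-9, max_iters=1000):
    """Standard value iteration on a hypothetical deterministic chain.

    States are 0..n_states-1, actions move one step left (-1) or right (+1),
    every move costs 1, and the goal state is absorbing with zero cost.
    """

    def step(state, action):
        # The goal is absorbing: staying there costs nothing.
        if state == goal:
            return state, 0.0
        nxt = min(max(state + action, 0), n_states - 1)
        return nxt, 1.0

    values = [0.0] * n_states
    for _ in range(max_iters):
        # Bellman backup: minimize immediate cost plus discounted cost-to-go.
        new_values = []
        for s in range(n_states):
            candidates = []
            for a in (-1, 1):
                nxt, cost = step(s, a)
                candidates.append(cost + gamma * values[nxt])
            new_values.append(min(candidates))
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


if __name__ == "__main__":
    print(value_iteration())
```

On this toy chain the converged cost function equals the number of steps to the goal. A parallel variant in the spirit of the abstract would run several such learners and coordinate them, but how that coordination works is not described here.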
Resolving the Resource Decision-Making Dilemma of Leaderless Group-Based Multiagent Systems and Repeated Games
IF 8.6 Zone 1 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-07-26 DOI: 10.1109/TSMC.2024.3427688
Junxiao Xue;Mingchuang Zhang;Bowei Dong;Lei Shi;Andrés Adolfo Navarro Newball
Leaderless rational individuals often lead the group into a resource decision dilemma in resource competition. Reducing the cost of resource competition while avoiding group decision dilemmas is a challenging task. Inspired by multiagent systems (MASs) and repeated games, we propose a decision-making reward discrimination (DRD) framework to address the resource competition dilemma of leaderless group formation. We aim to model the leaderless group’s resource gaming process using MAS and achieve optimal rewards for the group while minimizing conflict in resource competition. The proposed framework consists of three modules: 1) the decision-making module; 2) the reward module; and 3) the discriminative module. The decision-making module defines the agents and models the decision-making process, while the reward module calculates the group reward in each round using the reward matrix. The discriminative module compares the group reward with the target reward while providing the agent with environmental information. We verify the feasibility of the model through numerous experiments. The results show that agents adopt a revenge strategy to avoid resource competition dilemmas and achieve group reward optimality.
Citations: 0
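The three-module structure described in the abstract above (decision-making, reward, discriminative) can be illustrated with a toy two-agent repeated game. This is a rough sketch under stated assumptions, not the authors' DRD framework: the payoff values in `REWARD`, the target reward of 6, and the reading of "revenge strategy" as tit-for-tat are all hypothetical choices made for the example.

```python
SHARE, GRAB = 0, 1

# Hypothetical reward matrix: (agent 0 reward, agent 1 reward) per joint action.
REWARD = {
    (SHARE, SHARE): (3, 3),
    (SHARE, GRAB): (0, 5),
    (GRAB, SHARE): (5, 0),
    (GRAB, GRAB): (1, 1),
}


def revenge_policy(opponent_last):
    """Decision-making module (assumed tit-for-tat): copy the opponent's last move."""
    return SHARE if opponent_last in (None, SHARE) else GRAB


def group_reward(actions):
    """Reward module: look up the joint action and sum the individual rewards."""
    return sum(REWARD[actions])


def meets_target(reward, target):
    """Discriminative module: compare the group reward with the target reward."""
    return reward >= target


def play(rounds=4, target=6, first_moves=(SHARE, SHARE)):
    """Run the repeated game and record (actions, group reward, target met) per round."""
    history = []
    last = first_moves
    for t in range(rounds):
        if t == 0:
            actions = first_moves
        else:
            # Each agent reacts to what the other agent did in the previous round.
            actions = (revenge_policy(last[1]), revenge_policy(last[0]))
        reward = group_reward(actions)
        history.append((actions, reward, meets_target(reward, target)))
        last = actions
    return history


if __name__ == "__main__":
    for record in play():
        print(record)
    for record in play(first_moves=(GRAB, SHARE)):
        print(record)
```

With both agents sharing from the start, every round reaches the target group reward. If one agent grabs in the first round, retaliation keeps the group reward below the target in every round, which is the kind of signal a discriminative step could pass back to the agents.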