
IEEE/CAA Journal of Automatica Sinica: Latest Articles

Multi-USV Formation Collision Avoidance via Deep Reinforcement Learning and COLREGs
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-10-08 | DOI: 10.1109/JAS.2023.123846
Cheng-Cheng Wang;Yu-Long Wang;Li Jia
Dear Editor, This letter focuses on collision avoidance for a multi-unmanned surface vehicle (multi-USV) system. A novel multi-USV collision avoidance (MUCA) algorithm is proposed. Firstly, to obtain a more reasonable collision avoidance policy, reward functions are constructed according to the International Regulations for Preventing Collisions at Sea (COLREGs) and USV dynamics. Secondly, to reduce data noise and the impact of outliers, an improved normalization method is proposed. States and rewards of USVs are normalized to avoid vanishing and exploding gradients. Thirdly, a novel $\epsilon$-greedy method is proposed to help the optimal policy converge faster, making it easier for USVs to explore the optimal policy during learning. Finally, the proposed MUCA algorithm is tested in a multi-encounter situation including head-on, crossing, and overtaking. The experimental results demonstrate that the proposed MUCA algorithm can provide a collision-free marching policy for USVs in formation.
Vol. 11, No. 11, pp. 2349-2351 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10707649
Citations: 0
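The abstract above does not describe how its novel $\epsilon$-greedy method works, so the following is only a minimal sketch of the textbook $\epsilon$-greedy rule with a linearly annealed exploration rate, not the letter's variant. The function names, the linear schedule, and all default values are assumptions made for illustration.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps.

    The schedule and its defaults are illustrative assumptions.
    """
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest estimated action value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Early in training `epsilon_at` returns values near 1, so actions are mostly random; as the step count grows the agent increasingly follows its current value estimates.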
Privacy Preserving Distributed Bandit Residual Feedback Online Optimization Over Time-Varying Unbalanced Graphs
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-10-08 | DOI: 10.1109/JAS.2024.124656
Zhongyuan Zhao;Zhiqiang Yang;Luyao Jiang;Ju Yang;Quanbo Ge
This paper considers the distributed online optimization (DOO) problem over time-varying unbalanced networks, where gradient information is explicitly unknown. To address this issue, a privacy-preserving distributed online one-point residual feedback (OPRF) optimization algorithm is proposed. This algorithm updates decision variables by leveraging one-point residual feedback to estimate the true gradient information. It can achieve the same performance as the two-point feedback scheme while requiring only a single function value query per iteration. Additionally, it effectively eliminates the effect of time-varying unbalanced graphs by dynamically constructing row stochastic matrices. Furthermore, compared to other distributed optimization algorithms that only consider explicitly unknown cost functions, this paper also addresses the leakage of nodes' private information. Theoretical analysis demonstrates that the method attains sublinear regret while protecting the private information of agents. Finally, numerical experiments on a distributed collaborative localization problem and on federated learning confirm the effectiveness of the algorithm.
Vol. 11, No. 11, pp. 2284-2297
Citations: 0
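To make the one-point residual feedback idea in the abstract above concrete, here is a minimal single-agent sketch. It uses the residual-feedback gradient estimate $(d/\delta)\,[f_t(x_t + \delta u_t) - f_{t-1}(x_{t-1} + \delta u_{t-1})]\,u_t$, which needs one function query per round. The distributed updates, row stochastic weights, and privacy mechanism of the paper are deliberately left out, and the function names, step size, and smoothing radius are illustrative assumptions.

```python
import math
import random


def unit_vector(dim, rng):
    """Sample a direction uniformly from the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def residual_feedback_descent(losses, x0, delta=0.05, step=0.01, seed=0):
    """Online descent using a single function query per round.

    losses: list of callables f_t(x); x0: initial point (list of floats).
    delta, step and seed are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    dim = len(x0)
    x = list(x0)
    prev_value = None
    for f in losses:
        u = unit_vector(dim, rng)
        # The only query this round: the loss at a randomly perturbed point.
        value = f([xi + delta * ui for xi, ui in zip(x, u)])
        if prev_value is not None:
            # Residual feedback: reuse last round's query instead of a second one.
            scale = dim / delta * (value - prev_value)
            x = [xi - step * scale * ui for xi, ui in zip(x, u)]
        prev_value = value
    return x
```

Reusing the previous round's query is what distinguishes this from the classical one-point estimator, whose variance grows as `delta` shrinks because it scales the raw function value rather than a difference.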
A Linear Programming-Based Reinforcement Learning Mechanism for Incomplete-Information Games
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-10-08 | DOI: 10.1109/JAS.2024.124464
Baosen Yang;Changbing Tang;Yang Liu;Guanghui Wen;Guanrong Chen
Dear Editor, Recently, with the development of artificial intelligence, game intelligence decision-making has attracted more and more attention. In particular, incomplete-information games (IIG) have gradually become a new research focus, where players make decisions without sufficient information, such as the opponent's strategies or preferences. In this case, a selfish player can only make reactive decisions based on the changes in environment and state. Thus, blind decisions by players may drift them away from the path of reward maximization, and may even hinder the health of the IIG environment. Therefore, it is necessary to design an effective mechanism to optimize decision-making for IIG players.
Vol. 11, No. 11, pp. 2340-2342 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10707689
Citations: 0
On Zero Dynamics and Controllable Cyber-Attacks in Cyber-Physical Systems and Dynamic Coding Schemes as Their Countermeasures
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-10-08 | DOI: 10.1109/JAS.2024.124692
Mahdi Taheri;Khashayar Khorasani;Nader Meskin
In this paper, we study stealthy cyber-attacks on actuators of cyber-physical systems (CPS), namely zero dynamics and controllable attacks. In particular, under certain assumptions, we investigate and propose conditions under which one can execute zero dynamics and controllable attacks in the CPS. These conditions are derived from the Markov parameters of the CPS and elements of the system observability matrix. Consequently, in addition to specifying the number of actuators that must be attacked, these conditions give the minimum system knowledge needed to perform zero dynamics and controllable cyber-attacks. As a countermeasure against these stealthy cyber-attacks, we develop a dynamic coding scheme that raises the minimum number of CPS actuators an adversary must compromise to carry out zero dynamics and controllable cyber-attacks to its maximum possible value. It is shown that if at least one secure input channel exists, the proposed dynamic coding scheme can prevent adversaries from executing zero dynamics and controllable attacks even if they have complete knowledge of the coding system. Finally, two illustrative numerical case studies are provided to demonstrate the effectiveness and capabilities of our derived conditions and proposed methodologies.
Vol. 11, No. 11, pp. 2191-2203
Citations: 0
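The abstract above states that its attack conditions are built on the Markov parameters of the CPS. As a reminder of what those quantities are, the sketch below computes $CB, CAB, \dots, CA^{k-1}B$ for a discrete-time model $x^+ = Ax + Bu$, $y = Cx$. It does not reproduce the paper's conditions; the function names and the double-integrator-style example are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def markov_parameters(A, B, C, count):
    """Return [C B, C A B, ..., C A^(count-1) B] for x+ = A x + B u, y = C x."""
    params = []
    AkB = B  # holds A^k B, starting with k = 0
    for _ in range(count):
        params.append(matmul(C, AkB))
        AkB = matmul(A, AkB)
    return params


# Hypothetical example: an actuator input that reaches the measured
# output only after two steps.
A = [[0, 1], [0, 0]]
B = [[0], [1]]
C = [[1, 0]]
H = markov_parameters(A, B, C, 3)
```

In this example `H[0]` (that is, $CB$) is zero while `H[1]` (that is, $CAB$) is not, which means the input first influences the output two steps after it is applied.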
Pure State Feedback Switching Control Based on the Online Estimated State for Stochastic Open Quantum Systems
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-09-04 | DOI: 10.1109/JAS.2023.124071
Shuang Cong;Zhixiang Dong
For $n$-qubit stochastic open quantum systems, a pure-state switching control based on online estimated state feedback (OQST-SFC for short) is proposed, using the Lyapunov stability theorem and LaSalle's invariance principle, to drive the system to a pure target state, which may be an eigenstate or a superposition state. The proposed switching control consists of a constant control and a control law designed by the Lyapunov method, in which the Lyapunov function is the state distance of the system. The constant control drives the system state from an initial state toward the convergence domain that contains only the target state, and the Lyapunov-based control makes the state enter the convergence domain and then continue to converge to the target state. At the same time, continuous weak measurement of the quantum system and a quantum state tomography method based on the online alternating direction multiplier (QST-OADM) are used to obtain system information and estimate the quantum state, which serves as the input to the quantum system controller. In this way, the pure-state feedback switching control based on online estimated state feedback is realized in an $n$-qubit stochastic open quantum system. The complete derivation of the $n$-qubit QST-OADM algorithm is given. Through rigorous theoretical proof and analysis, convergence conditions are given that ensure any initial state of the quantum system converges to the target pure state. The proposed control method is applied to a 2-qubit stochastic open quantum system in numerical simulation experiments. Four possible relative positions between the initial estimated state and that of the controlled system are studied and discussed, and the state transition performance in each case is analyzed.
Vol. 11, No. 10, pp. 2166-2178
Citations: 0
Distributed Predefined-Time Control for Cooperative Tracking of Multiple Quadrotor UAVs
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-09-04 | DOI: 10.1109/JAS.2023.123861
Kewei Xia;Xinyi Li;Kaidan Li;Yao Zou
Dear Editor, This letter addresses predefined-time control for cooperative tracking of multiple quadrotor unmanned aerial vehicles (UAVs) under a directed communication network. A predefined-time distributed estimator is first introduced to accurately obtain the reference velocity and acceleration for each UAV. Then, a cascade predefined-time control strategy is proposed to guarantee that all the UAVs track the reference trajectory while maintaining a preassigned configuration, where an attitude constraint algorithm is developed to keep each UAV from flipping over. Stability analysis demonstrates that the tracking errors of the closed-loop systems converge to zero within a predefined time. Finally, experimental results validate the proposed control strategy.
Vol. 11, No. 10, pp. 2179-2181 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10664603
Citations: 0
Evolution and Role of Optimizers in Training Deep Learning Models
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-09-04 | DOI: 10.1109/JAS.2024.124806
XiaoHao Wen;MengChu Zhou
To perform well, deep learning (DL) models have to be trained well. Which optimizer should be adopted? We answer this question by discussing how optimizers have evolved from traditional methods like gradient descent to more advanced techniques to address challenges posed by high-dimensional and non-convex problem space. Ongoing challenges include their hyperparameter sensitivity, balancing between convergence and generalization performance, and improving interpretability of optimization processes. Researchers continue to seek robust, efficient, and universally applicable optimizers to advance the field of DL across various domains.
Vol. 11, No. 10, pp. 2039-2042 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10664602
Citations: 0
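The abstract above contrasts plain gradient descent with more advanced optimizers without naming them. As one possible illustration, the sketch below places a gradient descent step next to the standard Adam update. Choosing Adam as the "more advanced" example, the function names, and the hyperparameter defaults are all assumptions, not content from the article.

```python
import math


def sgd_step(w, grad, lr=0.1):
    """Plain gradient descent: move against the gradient, scaled by lr."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def adam_step(w, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step count used for bias correction."""
    m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, grad)]       # first moment
    v = [b2 * vi + (1 - b2) * gi * gi for vi, gi in zip(v, grad)]  # second moment
    m_hat = [mi / (1 - b1 ** t) for mi in m]
    v_hat = [vi / (1 - b2 ** t) for vi in v]
    w = [wi - lr * mh / (math.sqrt(vh) + eps)
         for wi, mh, vh in zip(w, m_hat, v_hat)]
    return w, m, v


# Gradient of the badly scaled quadratic 0.5 * (w0**2 + 10 * w1**2) at w = [1, 1].
grad = [1.0, 10.0]
w_sgd = sgd_step([1.0, 1.0], grad)
w_adam, m, v = adam_step([1.0, 1.0], grad, [0.0, 0.0], [0.0, 0.0], t=1)
```

With these inputs the gradient descent step moves the second coordinate ten times as far as the first, while the first Adam step moves both by roughly `lr`. This per-coordinate rescaling is one reason adaptive methods are less sensitive to poorly scaled problems.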
Hierarchical Controller Synthesis Under Linear Temporal Logic Specifications Using Dynamic Quantization
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-09-04 | DOI: 10.1109/JAS.2024.124473
Wei Ren;Zhuo-Rui Pan;Weiguo Xia;Xi-Ming Sun
Linear temporal logic (LTL) is an intuitive and expressive language to specify complex control tasks, and how to design an efficient control strategy for LTL specification is still a challenge. In this paper, we implement the dynamic quantization technique to propose a novel hierarchical control strategy for nonlinear control systems under LTL specifications. Based on the regions of interest involved in the LTL formula, an accepting path is derived first to provide a high-level solution for the controller synthesis problem. Second, we develop a dynamic quantization based approach to verify the realization of the accepting path. The realization verification results in the necessity of the controller design and a sequence of quantization regions for the controller design. Third, the techniques of dynamic quantization and abstraction-based control are combined together to establish the local-to-global control strategy. Both abstraction construction and controller design are local and dynamic, thereby resulting in the potential reduction of the computational complexity. Since each quantization region can be considered locally and individually, the proposed hierarchical mechanism is more efficient and can solve much larger problems than many existing methods. Finally, the proposed control strategy is illustrated via two examples from the path planning and tracking problems of mobile robots.
Vol. 11, No. 10, pp. 2082-2098
Citations: 0
Dynamic Vision-Based Machinery Fault Diagnosis with Cross-Modality Feature Alignment
IF 15.3 | Zone 1 (Computer Science) | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2024-09-04 | DOI: 10.1109/JAS.2024.124470
Xiang Li;Shupeng Yu;Yaguo Lei;Naipeng Li;Bin Yang
Intelligent machinery fault diagnosis methods have been widely and successfully developed in the past decades, and the vibration acceleration data collected by contact accelerometers have been widely investigated. In many industrial scenarios, contactless sensors are preferred. The event camera is an emerging bio-inspired technology for vision sensing, which asynchronously records per-pixel brightness change polarity with high temporal resolution and low latency. It offers a promising tool for contactless machine vibration sensing and fault diagnosis. However, dynamic vision-based methods are sensitive to variations in practical factors such as camera position and machine operating condition. Furthermore, because the sensing technology is new, labeled dynamic vision data are limited and generally cannot cover a wide range of machine fault modes. To address these challenges, a novel dynamic vision-based machinery fault diagnosis method is proposed in this paper. It exploits the abundant vibration acceleration data to enhance the performance of the dynamic vision-based model. A cross-modality feature alignment method is thus proposed with deep adversarial neural networks to achieve fault diagnosis knowledge transfer. An event erasing method is further proposed to improve model robustness against variations. The proposed method can effectively identify unseen fault modes with dynamic vision data. Experiments on two rotating machine monitoring datasets are carried out for validation, and the results suggest the proposed method is promising for generalized contactless machinery fault diagnosis.
智能机械故障诊断方法在过去几十年中得到了广泛应用和成功开发,接触式加速度计采集的振动加速度数据也得到了广泛研究。在许多工业场景中,非接触式传感器更受青睐。事件相机是一种新兴的受生物启发的视觉传感技术,它能异步记录每个像素的亮度极性变化,具有高时间分辨率和低延迟的特点。它为非接触式机器振动传感和故障诊断提供了一种前景广阔的工具。然而,基于动态视觉的方法会受到相机位置、机器运行状态等实际因素的影响。此外,作为一种新的传感技术,标注的动态视觉数据有限,通常无法涵盖广泛的机器故障模式。针对这些挑战,本文提出了一种新型的基于动态视觉的机器故障诊断方法。其动机是探索丰富的振动加速度数据,以提高基于动态视觉的模型性能。因此,本文提出了一种跨模态特征对齐方法,通过深度对抗神经网络实现故障诊断知识的转移。此外,还提出了一种事件擦除方法,以提高模型对各种变化的鲁棒性。所提出的方法能有效识别动态视觉数据中未见的故障模式。在两个旋转机械监测数据集上进行了实验验证,结果表明所提出的方法有望用于通用非接触式机械故障诊断。
Citation: Xiang Li; Shupeng Yu; Yaguo Lei; Naipeng Li; Bin Yang. "Dynamic Vision-Based Machinery Fault Diagnosis with Cross-Modality Feature Alignment." Ieee-Caa Journal of Automatica Sinica, vol. 11, no. 10, pp. 2068-2081, 2024-09-04. https://doi.org/10.1109/JAS.2024.124470
Citations: 0
Bridge Bidding via Deep Reinforcement Learning and Belief Monte Carlo Search
IF 15.3 Zone 1 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-09-04 DOI: 10.1109/JAS.2024.124488
Zizhang Qiu;Shouguang Wang;Dan You;MengChu Zhou
Contract Bridge, a four-player imperfect-information game, comprises two phases: bidding and playing. While computer programs excel at playing, bidding remains challenging because players must exchange information with their partners while interfering with their opponents' communication. In this work, we introduce a Bridge bidding agent that combines supervised learning, deep reinforcement learning via self-play, and a test-time search approach. Our experiments demonstrate that our agent outperforms WBridge5, a highly regarded computer Bridge program that has won multiple world championships, by 0.98 IMPs (international match points) per deal over 10 000 deals, with a much more cost-effective approach. This margin significantly surpasses the previous state of the art (0.85 IMPs per deal). Note that 0.1 IMPs per deal is a significant improvement in Bridge bidding.
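To make the "IMPs per deal" metric concrete, the sketch below converts per-deal point differences to IMPs using the standard international match point scale and averages them. It assumes each `score_diff` is the agent's score on a deal minus the score achieved on the same deal by the comparison side; the abstract does not describe the exact evaluation protocol, and the function names are illustrative.

```python
from bisect import bisect_right

# Lower bounds (in points) of the 1..24 IMP bands of the standard IMP scale.
# A difference of 0-10 points is worth 0 IMPs; 4000 or more is worth 24.
IMP_THRESHOLDS = [20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
                  750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000,
                  3500, 4000]


def to_imps(score_diff: int) -> int:
    """Convert a signed point difference on one deal to signed IMPs."""
    magnitude = bisect_right(IMP_THRESHOLDS, abs(score_diff))
    return magnitude if score_diff >= 0 else -magnitude


def imps_per_deal(score_diffs):
    """Average signed IMPs over a sequence of per-deal point differences."""
    if not score_diffs:
        raise ValueError("need at least one deal")
    return sum(to_imps(d) for d in score_diffs) / len(score_diffs)
```

For example, differences of 450, -100, 0, and 620 points map to 10, -3, 0, and 12 IMPs, an average of 4.75 IMPs per deal. Because the scale compresses large swings, averages over thousands of deals are small numbers, which is why a margin such as 0.98 IMPs per deal is meaningful.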
Citation: Zizhang Qiu; Shouguang Wang; Dan You; MengChu Zhou. "Bridge Bidding via Deep Reinforcement Learning and Belief Monte Carlo Search." Ieee-Caa Journal of Automatica Sinica, vol. 11, no. 10, pp. 2111-2122, 2024-09-04. https://doi.org/10.1109/JAS.2024.124488
Citations: 0