Modular control architecture for safe marine navigation: Reinforcement learning with predictive safety filters

Impact factor: 4.6 · Zone 2, Computer Science · JCR Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Journal: Artificial Intelligence · Pub Date: 2024-08-13 · DOI: 10.1016/j.artint.2024.104201

Aksel Vaaler, Svein Jostein Husa, Daniel Menges, Thomas Nakken Larsen, Adil Rasheed

Artificial Intelligence, volume 336, Article 104201. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0004370224001371

Citations: 0

Abstract

Many autonomous systems are safety-critical, making it essential to have a closed-loop control system that robustly satisfies constraints arising from underlying physical limitations and safety aspects. However, this is often challenging to achieve for real-world systems. For example, autonomous ships at sea have nonlinear and uncertain dynamics and are subject to numerous time-varying environmental disturbances such as waves, currents, and wind. There is increasing interest in using machine learning-based approaches to adapt these systems to more complex scenarios, but there are few standard frameworks that guarantee the safety and stability of such systems. Recently, predictive safety filters (PSF) have emerged as a promising method to ensure constraint satisfaction in learning-based control, bypassing the need for explicit constraint handling in the learning algorithms themselves. The safety filter approach leads to a modular separation of the problem, allowing the use of arbitrary control policies in a task-agnostic way. The filter takes in a potentially unsafe control action from the main controller and solves an optimization problem to compute a minimal perturbation of the proposed action that adheres to both physical and safety constraints. In this work, we combine reinforcement learning (RL) with predictive safety filtering in the context of marine navigation and control. The RL agent is trained on path-following and safety adherence across a wide range of randomly generated environments, while the predictive safety filter continuously monitors the agent's proposed control actions and modifies them if necessary. The combined PSF/RL scheme is implemented on a simulated model of Cybership II, a miniature replica of a typical supply ship. Safety performance and learning rate are evaluated and compared with those of a standard, non-PSF, RL agent. The results show that the predictive safety filter keeps the vessel safe without impeding the learning rate or performance of the RL agent.
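The filtering step described in the abstract (take a proposed action, return the closest action that keeps the system within its constraints) can be illustrated with a toy sketch. This is not the paper's formulation and does not model Cybership II: it assumes a 1-D double integrator, a single position limit, a hand-written braking backup policy, and a grid search standing in for the optimization problem. All names and constants (`DT`, `U_MAX`, `X_WALL`, `HORIZON`, `backup_keeps_safe`, `safety_filter`) are hypothetical.

```python
import numpy as np

# All constants are illustrative assumptions, not values from the paper.
DT = 0.1       # integration step [s]
U_MAX = 1.0    # actuator limit on acceleration
X_WALL = 10.0  # safety constraint: position must stay <= X_WALL
HORIZON = 50   # prediction horizon in steps


def step(state, u):
    """Explicit-Euler update of a toy double integrator (position, velocity)."""
    x, v = state
    return np.array([x + DT * v, v + DT * u])


def backup_keeps_safe(state, u_first):
    """Apply u_first once, then brake; True if the constraint holds throughout
    and the system ends at rest (a simple invariant terminal condition)."""
    s = step(state, float(np.clip(u_first, -U_MAX, U_MAX)))
    for _ in range(HORIZON):
        if s[0] > X_WALL:
            return False
        u_brake = float(np.clip(-s[1] / DT, -U_MAX, U_MAX))
        s = step(s, u_brake)
    return bool(s[0] <= X_WALL and abs(s[1]) < 1e-6)


def safety_filter(state, u_proposed, n_grid=201):
    """Return the admissible action closest to u_proposed that passes the
    safety check. A grid search replaces the optimization problem here."""
    u_proposed = float(np.clip(u_proposed, -U_MAX, U_MAX))
    grid = np.linspace(-U_MAX, U_MAX, n_grid)
    candidates = np.concatenate(([u_proposed], grid))
    order = np.argsort(np.abs(candidates - u_proposed), kind="stable")
    for u in candidates[order]:
        if backup_keeps_safe(state, u):
            return float(u)
    # No candidate is certified safe: fall back to braking (assumed behavior).
    return float(np.clip(-state[1] / DT, -U_MAX, U_MAX))


if __name__ == "__main__":
    far = np.array([0.0, 1.0])   # far from the limit: action passes through
    near = np.array([9.4, 1.0])  # close to the limit: action is modified
    print(safety_filter(far, 0.5))
    print(safety_filter(near, 1.0))
```

In this sketch the learning agent would call `safety_filter` on every action before applying it, which mirrors the modular separation the abstract describes: the policy needs no knowledge of the constraint, and the filter needs no knowledge of the task.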



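Source journal: Artificial Intelligence (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 11.20

Self-citation rate: 1.40%

Articles published: 118

Review time: 8 months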

About the journal: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.
