Constrained Visual Representation Learning With Bisimulation Metrics for Safe Reinforcement Learning

Rongrong Wang;Yuhu Cheng;Xuesong Wang
DOI: 10.1109/TIP.2024.3523798
Journal: IEEE Transactions on Image Processing: A Publication of the IEEE Signal Processing Society, vol. 34, pp. 379-393
Publication date: 2025-01-06
Publication type: Journal Article
Open access: no
Source: https://ieeexplore.ieee.org/document/10829536/
Citations: 0

Abstract

Safe reinforcement learning aims to achieve optimal performance while minimizing potential risks. In real-world applications, especially those that rely on visual inputs, a key challenge is extracting the features essential for safe decision-making while maintaining sample efficiency. To address this issue, we propose constrained visual representation learning with bisimulation metrics for safe reinforcement learning (CVRL-BM). CVRL-BM constructs a sequential conditional variational inference model that compresses high-dimensional visual observations into low-dimensional state representations. In addition, safety bisimulation metrics are introduced to quantify the behavioral similarity between states; the objective is to make the distance between any two latent state representations as close as possible to the safety bisimulation metric between their corresponding states. By integrating these two components, CVRL-BM learns compact, information-rich visual state representations while satisfying predefined safety constraints. Experiments on Safety Gym show that CVRL-BM outperforms existing vision-based safe reinforcement learning methods in both safety and efficacy. In particular, CVRL-BM surpasses the state-of-the-art Safe SLAC method, achieving a 19.748% higher reward return, a 41.772% lower cost return, and a 5.027% decrease in cost regret. These results highlight the effectiveness of the proposed CVRL-BM.
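The abstract states the representation objective only in words: the distance between two latent states should match the safety bisimulation metric between the underlying states. The sketch below illustrates one way such an objective could look. It is not the paper's formulation, which the abstract does not give. It assumes a common pattern in bisimulation-based representation learning (reward difference plus a discounted 2-Wasserstein distance between diagonal-Gaussian latent transition predictions) and adds a cost-difference term as a guess at what makes the metric safety-aware. All function names, `gamma`, and `cost_weight` are hypothetical.

```python
import math


def l1_distance(a, b):
    """L1 distance between two latent vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def gaussian_w2(mu_a, sigma_a, mu_b, sigma_b):
    """2-Wasserstein distance between two diagonal Gaussians.

    For diagonal covariances this has the closed form
    sqrt(||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2).
    """
    sq = sum((x - y) ** 2 for x, y in zip(mu_a, mu_b))
    sq += sum((x - y) ** 2 for x, y in zip(sigma_a, sigma_b))
    return math.sqrt(sq)


def safety_bisim_target(r_i, r_j, c_i, c_j, next_i, next_j,
                        gamma=0.99, cost_weight=1.0):
    """Assumed target distance between two states.

    next_i and next_j are (mean, std) pairs of the predicted next-latent
    distributions. The cost term is an assumption of this sketch.
    """
    (mu_i, sigma_i), (mu_j, sigma_j) = next_i, next_j
    return (abs(r_i - r_j)
            + cost_weight * abs(c_i - c_j)
            + gamma * gaussian_w2(mu_i, sigma_i, mu_j, sigma_j))


def bisim_loss(z_i, z_j, target):
    """Squared gap between the latent distance and the target metric."""
    return (l1_distance(z_i, z_j) - target) ** 2


# Two states with different reward and cost but identical predicted dynamics.
z_i, z_j = [0.0, 1.0], [1.0, 1.0]
same_next = ([0.0, 0.0], [1.0, 1.0])
target = safety_bisim_target(1.0, 0.5, 0.0, 1.0, same_next, same_next)
print(bisim_loss(z_i, z_j, target))  # 0.25
```

In this toy case the latent distance is 1.0 while the assumed target is 1.5, so minimizing the loss would push the two latent states further apart. In a real training loop the target would be treated as a constant (no gradient) and the loss averaged over pairs sampled from a batch.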