RiskAwareBench: Towards Evaluating Physical Risk Awareness for High-level Planning of LLM-based Embodied Agents

Zihao Zhu, Bingzhe Wu, Zhengyou Zhang, Baoyuan Wu
arXiv:2408.04449 | arXiv - CS - Artificial Intelligence | Published: 2024-08-08 | Citations: 0

Abstract

The integration of large language models (LLMs) into robotics significantly enhances the capabilities of embodied agents in understanding and executing complex natural language instructions. However, the unmitigated deployment of LLM-based embodied systems in real-world environments may pose potential physical risks, such as property damage and personal injury. Existing safety benchmarks for LLMs overlook risk awareness for LLM-based embodied agents. To address this gap, we propose RiskAwareBench, an automated framework designed to assess physical risk awareness in LLM-based embodied agents. RiskAwareBench consists of four modules: safety tips generation, risky scene generation, plan generation, and evaluation, enabling comprehensive risk assessment with minimal manual intervention. Utilizing this framework, we compile the PhysicalRisk dataset, encompassing diverse scenarios with associated safety tips, observations, and instructions. Extensive experiments reveal that most LLMs exhibit insufficient physical risk awareness, and baseline risk mitigation strategies yield limited improvement, which underscores the urgency and importance of improving risk awareness in LLM-based embodied agents.
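The four-module pipeline described in the abstract can be sketched as a simple sequence of stages. Everything below is an illustrative assumption, not the paper's implementation: the function names, the example kitchen scene, and the string-matching judge are stand-ins for what the paper performs with LLM-driven generation and evaluation at each stage.

```python
from dataclasses import dataclass

@dataclass
class RiskyScene:
    """One PhysicalRisk-style item: an observation, an instruction,
    and the safety tip the scene was built around (hypothetical schema)."""
    observation: str
    instruction: str
    safety_tip: str

def generate_safety_tips(category: str) -> str:
    # Stage 1 (safety tips generation): a fixed lookup stands in for
    # LLM-generated tips.
    tips = {"kitchen": "Keep flammable items away from the stove."}
    return tips.get(category, "Proceed with caution.")

def generate_risky_scene(category: str) -> RiskyScene:
    # Stage 2 (risky scene generation): embed the hazard implied by the
    # tip into the scene description.
    return RiskyScene(
        observation="A towel lies next to a lit stove.",
        instruction="Heat some soup on the stove.",
        safety_tip=generate_safety_tips(category),
    )

def generate_plan(scene: RiskyScene) -> list[str]:
    # Stage 3 (plan generation): the agent under test would produce this;
    # a risk-unaware planner would skip the towel-removal step.
    return [
        "walk to stove",
        "move towel away from stove",
        "place pot on burner",
        "turn on burner",
    ]

def evaluate_plan(plan: list[str], scene: RiskyScene) -> dict:
    # Stage 4 (evaluation): a toy judge marks the plan risk-aware if it
    # addresses the hazard before acting.
    hazard_handled = any("towel" in step for step in plan)
    return {"risk_aware": hazard_handled}

scene = generate_risky_scene("kitchen")
result = evaluate_plan(generate_plan(scene), scene)
```

Chaining the stages this way is what lets the framework run with minimal manual intervention: once the safety tips are generated, every downstream artifact (scene, plan, verdict) is derived automatically.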