Amirhossein Zolfagharian, Manel Abdellatif, Lionel C. Briand, Ramesh S

IEEE Transactions on Software Engineering, vol. 51, no. 1, pp. 82–105, published 2024-11-06, DOI: 10.1109/TSE.2024.3491496 (https://ieeexplore.ieee.org/document/10745554/)
SMARLA: A Safety Monitoring Approach for Deep Reinforcement Learning Agents
Deep Reinforcement Learning (DRL) has made significant advancements in various fields, such as autonomous driving, healthcare, and robotics, by enabling agents to learn optimal policies through interactions with their environments. However, the application of DRL in safety-critical domains presents challenges, particularly concerning the safety of the learned policies. DRL agents, which are focused on maximizing rewards, may select unsafe actions, leading to safety violations. Runtime safety monitoring is thus essential to ensure the safe operation of these agents, especially in unpredictable and dynamic environments. This paper introduces SMARLA, a black-box safety monitoring approach specifically designed for DRL agents. SMARLA utilizes machine learning to predict safety violations by observing the agent's behavior during execution. The approach is based on Q-values, which reflect the expected reward for taking actions in specific states. SMARLA employs state abstraction to reduce the complexity of the state space, enhancing the predictive capabilities of the monitoring model. Such abstraction enables the early detection of unsafe states, allowing for the implementation of corrective and preventive measures before incidents occur. We quantitatively and qualitatively validated SMARLA on three well-known case studies widely used in DRL research. Empirical results reveal that SMARLA is accurate at predicting safety violations, with a low false positive rate, and can predict violations at an early stage, approximately halfway through the execution of the agent, before violations occur. We also discuss different decision criteria, based on confidence intervals of the predicted violation probabilities, to trigger safety mechanisms aiming at a trade-off between early detection and low false positive rates.
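To make the ideas in the abstract concrete, the sketch below shows one hypothetical way a monitor of this kind could be structured. It is not the paper's implementation: the per-episode feature encoding, the fixed-bin-width abstraction of Q-values, the scikit-learn random forest as the violation predictor, and the Wilson score interval standing in for the confidence-interval decision criteria are all illustrative assumptions.

```python
# Hypothetical sketch of a SMARLA-style black-box safety monitor.
# Assumptions (not taken from the paper): Q-values are observable at each
# step, state abstraction is a coarse discretization of those Q-values, and
# the violation predictor is a random forest trained offline on labeled
# episodes. The Wilson interval approximates a confidence-interval-based
# decision criterion; n is heuristically set to the number of trees.
import math
from sklearn.ensemble import RandomForestClassifier

def abstract_state(q_values, bin_width=0.5):
    """Map a vector of Q-values to a coarse abstract state (a tuple of
    bin indices), shrinking the state space the monitor must learn over."""
    return tuple(int(q // bin_width) for q in q_values)

def wilson_interval(p_hat, n, z=1.96):
    """95% Wilson score interval for an estimated violation probability."""
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return max(0.0, center - half), min(1.0, center + half)

class SafetyMonitor:
    def __init__(self, features, labels, n_trees=100):
        # features: one fixed-length feature vector per training episode,
        # e.g. occurrence counts of abstract states observed so far;
        # labels: 1 if the episode ended in a safety violation, else 0.
        self.n_trees = n_trees
        self.clf = RandomForestClassifier(n_estimators=n_trees)
        self.clf.fit(features, labels)

    def step(self, episode_features, threshold=0.5):
        """Return (violation_probability, trigger) for the current step."""
        p_hat = self.clf.predict_proba([episode_features])[0][1]
        lower, upper = wilson_interval(p_hat, self.n_trees)
        # Conservative criterion: trigger the safety mechanism only once the
        # lower bound of the interval clears the threshold.
        return p_hat, lower >= threshold
```

Triggering on the lower bound of the interval favors a low false positive rate at the cost of later detection; triggering on the upper bound flips that trade-off, which mirrors the choice among decision criteria discussed in the abstract.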
Journal introduction:
IEEE Transactions on Software Engineering seeks contributions comprising well-defined theoretical results and empirical studies with potential impacts on software construction, analysis, or management. The scope of this Transactions extends from fundamental mechanisms to the development of principles and their application in specific environments. Specific topic areas include:
a) Development and maintenance methods and models: Techniques and principles for specifying, designing, and implementing software systems, encompassing notations and process models.
b) Assessment methods: Software tests, validation, reliability models, test and diagnosis procedures, software redundancy, design for error control, and measurements and evaluation of process and product aspects.
c) Software project management: Productivity factors, cost models, schedule and organizational issues, and standards.
d) Tools and environments: Specific tools, integrated tool environments, associated architectures, databases, and parallel and distributed processing issues.
e) System issues: Hardware-software trade-offs.
f) State-of-the-art surveys: Syntheses and comprehensive reviews of the historical development within specific areas of interest.