Defining a Metric-Driven Approach for Learning Hazardous Situations

Mario Fiorino, Muddasar Naeem, Mario Ciampi, Antonio Coronato
Technologies, published 2024-07-04. DOI: 10.3390/technologies12070103 (https://doi.org/10.3390/technologies12070103)

Abstract

Artificial intelligence has brought many innovations to our lives. At the same time, it is worth designing robust, safe machine learning (ML) algorithms to obtain more benefit from the technology. Reinforcement learning (RL), an important ML method, is widely applied in safety-centric scenarios. In such settings, learning safety constraints is necessary to avoid undesired outcomes. Within the traditional RL paradigm, agents typically focus on identifying states associated with high rewards in order to maximize their long-term returns. This prioritization can lead to the neglect of potentially hazardous situations. In particular, the exploration phase can pose significant risks, as it requires actions that may have unpredictable consequences. For instance, in autonomous driving applications, an RL agent might discover routes that yield high efficiency but fail to account for sudden hazardous conditions such as sharp turns or pedestrian crossings, potentially leading to catastrophic failures. Ensuring the safety of agents operating in unpredictable environments with potentially catastrophic failure states remains a critical challenge. This paper introduces a novel metric-driven approach aimed at containing risk in RL applications. Central to this approach are two indicators: the Hazard Indicator and the Risk Indicator. These metrics are designed to evaluate the safety of an environment by quantifying the likelihood of transitioning from safe states to failure states and assessing the associated risks. The indicators are particularly interesting because of their straightforward implementation, their highly generalizable probabilistic foundation, and their domain independence. To demonstrate their efficacy, we conducted experiments across various use cases, showing the feasibility of the proposed metrics. By enabling RL agents to effectively manage hazardous states, this approach paves the way for more reliable and readily implementable RL in practical applications.
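
The abstract does not give the formal definitions of the two indicators, so the following is only a minimal sketch of the general idea it describes: estimating, from logged transitions, how likely a state-action pair is to lead into a failure state, and weighting that likelihood by how bad the failure is. The function names `hazard_estimate` and `risk_estimate`, the `(state, action, next_state)` tuple layout, and the `severity` mapping are all assumptions made for illustration and should not be read as the paper's Hazard Indicator or Risk Indicator.

```python
from collections import defaultdict


def hazard_estimate(transitions, failure_states):
    """Empirical fraction of (state, action) visits that end in a failure state.

    Illustrative stand-in only; not the paper's Hazard Indicator.
    """
    visits = defaultdict(int)
    failures = defaultdict(int)
    for state, action, next_state in transitions:
        visits[(state, action)] += 1
        if next_state in failure_states:
            failures[(state, action)] += 1
    return {pair: failures[pair] / count for pair, count in visits.items()}


def risk_estimate(transitions, severity):
    """Average severity of the next state per (state, action) pair.

    `severity` maps failure states to an assumed cost; any state that is
    missing from the mapping contributes 0. Illustrative stand-in only;
    not the paper's Risk Indicator.
    """
    visits = defaultdict(int)
    total = defaultdict(float)
    for state, action, next_state in transitions:
        visits[(state, action)] += 1
        total[(state, action)] += severity.get(next_state, 0.0)
    return {pair: total[pair] / count for pair, count in visits.items()}


if __name__ == "__main__":
    # Hypothetical transition log: (state, action, next_state).
    log = [
        ("s0", "a", "s1"),
        ("s0", "a", "crash"),
        ("s0", "b", "s1"),
        ("s1", "a", "crash"),
    ]
    print(hazard_estimate(log, {"crash"}))
    print(risk_estimate(log, {"crash": 10.0}))
```

Under these assumptions, an agent could consult such estimates during exploration and avoid or penalize state-action pairs whose estimated hazard or risk exceeds a chosen threshold. The counting approach only applies to discrete, repeatedly visited states; continuous state spaces would need a learned estimator instead.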