Latest publications in Artificial Intelligence

Addressing maximization bias in reinforcement learning with two-sample testing
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-16 DOI: 10.1016/j.artint.2024.104204

Value-based reinforcement-learning algorithms have shown strong results in games, robotics, and other real-world applications. Overestimation bias is a known threat to those algorithms and can sometimes lead to dramatic performance decreases or even complete algorithmic failure. We frame the bias problem statistically and consider it an instance of estimating the maximum expected value (MEV) of a set of random variables. We propose the T-Estimator (TE) based on two-sample testing for the mean, which flexibly interpolates between over- and underestimation by adjusting the significance level of the underlying hypothesis tests. We also introduce a generalization, termed K-Estimator (KE), that obeys the same bias and variance bounds as the TE and relies on a nearly arbitrary kernel function. We introduce modifications of Q-Learning and the Bootstrapped Deep Q-Network (BDQN) using the TE and the KE, and prove convergence in the tabular setting. Furthermore, we propose an adaptive variant of the TE-based BDQN that dynamically adjusts the significance level to minimize the absolute estimation bias. All proposed estimators and algorithms are thoroughly tested and validated on diverse tasks and environments, illustrating the bias control and performance potential of the TE and KE.
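The maximization bias described in this abstract is easy to reproduce numerically: the maximum of several sample means systematically overestimates the true maximum expected value. The sketch below is a minimal illustration of that bias and of a generic two-sample-test-flavoured correction using Welch's t-test from scipy; it is not the paper's T-Estimator, and the data split, the significance level alpha, and the fallback rule are illustrative assumptions.

```python
# Minimal illustration of maximization bias and a hedged two-sample-test
# selection rule. NOT the paper's T-Estimator; alpha and the data split
# are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def max_of_means(samples):
    """Naive MEV estimate: max of per-variable sample means (overestimates)."""
    return max(s.mean() for s in samples)

def two_sample_test_estimate(samples, alpha=0.05):
    """Pick the empirically best variable, but report its mean from a held-out
    half unless a Welch two-sample t-test says it is significantly better than
    the runner-up on the selection half (a generic bias-reduction heuristic)."""
    select = [s[: len(s) // 2] for s in samples]    # used to choose the variable
    evaluate = [s[len(s) // 2 :] for s in samples]  # used to estimate its value
    order = np.argsort([s.mean() for s in select])[::-1]
    best, runner_up = order[0], order[1]
    # Welch's t-test: is the best variable's mean significantly above the runner-up's?
    _, p = stats.ttest_ind(select[best], select[runner_up], equal_var=False,
                           alternative="greater")
    if p < alpha:
        # Confident in the selection: use all data of the chosen variable.
        return samples[best].mean()
    # Not confident: fall back to the held-out estimate, which is unbiased
    # conditional on the selection and therefore less prone to overestimation.
    return evaluate[best].mean()

# Ten random variables, all with true mean 0, so the true MEV is 0.
true_mev = 0.0
naive, tested = [], []
for _ in range(2000):
    samples = [rng.normal(0.0, 1.0, size=40) for _ in range(10)]
    naive.append(max_of_means(samples))
    tested.append(two_sample_test_estimate(samples))

print(f"true MEV          : {true_mev:.3f}")
print(f"max-of-means bias : {np.mean(naive) - true_mev:+.3f}")   # clearly positive
print(f"test-based bias   : {np.mean(tested) - true_mev:+.3f}")  # much closer to 0
```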

Citations: 0
Modular control architecture for safe marine navigation: Reinforcement learning with predictive safety filters
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-13 DOI: 10.1016/j.artint.2024.104201

Many autonomous systems are safety-critical, making it essential to have a closed-loop control system that satisfies constraints arising from underlying physical limitations and safety aspects in a robust manner. However, this is often challenging to achieve for real-world systems. For example, autonomous ships at sea have nonlinear and uncertain dynamics and are subject to numerous time-varying environmental disturbances such as waves, currents, and wind. There is increasing interest in using machine learning-based approaches to adapt these systems to more complex scenarios, but there are few standard frameworks that guarantee the safety and stability of such systems. Recently, predictive safety filters (PSF) have emerged as a promising method to ensure constraint satisfaction in learning-based control, bypassing the need for explicit constraint handling in the learning algorithms themselves. The safety filter approach leads to a modular separation of the problem, allowing the use of arbitrary control policies in a task-agnostic way. The filter takes in a potentially unsafe control action from the main controller and solves an optimization problem to compute a minimal perturbation of the proposed action that adheres to both physical and safety constraints. In this work, we combine reinforcement learning (RL) with predictive safety filtering in the context of marine navigation and control. The RL agent is trained on path-following and safety adherence across a wide range of randomly generated environments, while the predictive safety filter continuously monitors the agent's proposed control actions and modifies them if necessary. The combined PSF/RL scheme is implemented on a simulated model of Cybership II, a miniature replica of a typical supply ship. Safety performance and learning rate are evaluated and compared with those of a standard, non-PSF RL agent. It is demonstrated that the predictive safety filter is able to keep the vessel safe, while not prohibiting the learning rate and performance of the RL agent.
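The core operation of a predictive safety filter, as described in the abstract, is to project a proposed action onto the set of actions whose predicted successor state remains safe. The sketch below shows that projection for an assumed linear one-step model with actuator bounds and a position limit as the safe set, solved with scipy's SLSQP; it is only a schematic stand-in for the paper's ship model and PSF formulation.

```python
# Minimal sketch of a predictive safety filter: project a proposed control
# action onto the set of actions whose one-step prediction stays in a safe
# region. The linear model, safe set, and bounds are illustrative assumptions,
# not the paper's ship model or PSF formulation.
import numpy as np
from scipy.optimize import minimize

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])          # assumed one-step dynamics x' = A x + B u
B = np.array([[0.1],
              [0.1]])
U_MIN, U_MAX = -1.0, 1.0            # actuator limits
POS_LIMIT = 1.0                     # safety constraint: |position'| <= POS_LIMIT

def safety_filter(x, u_proposed):
    """Return the action closest to u_proposed (squared distance) that keeps
    the predicted next state inside the safe set and respects actuator limits."""
    def objective(u):
        return float((u[0] - u_proposed) ** 2)

    def safe_margin(u):
        x_next = A @ x + B @ u                    # one-step prediction
        return float(POS_LIMIT - abs(x_next[0]))  # >= 0 means the prediction is safe

    res = minimize(objective, x0=np.array([np.clip(u_proposed, U_MIN, U_MAX)]),
                   method="SLSQP",
                   bounds=[(U_MIN, U_MAX)],
                   constraints=[{"type": "ineq", "fun": safe_margin}])
    return float(res.x[0])

# The "RL agent" proposes an aggressive action while close to the position limit.
state = np.array([0.9, 0.5])        # position near the boundary, moving toward it
u_rl = 1.0                          # proposed (potentially unsafe) action
u_safe = safety_filter(state, u_rl)
print(f"proposed action: {u_rl:+.3f}  filtered action: {u_safe:+.3f}")
```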

Citations: 0
QCDCL with cube learning or pure literal elimination – What is best?
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-08 DOI: 10.1016/j.artint.2024.104194

Quantified conflict-driven clause learning (QCDCL) is one of the main approaches for solving quantified Boolean formulas (QBF). We formalise and investigate several versions of QCDCL that include cube learning and/or pure-literal elimination, and formally compare the resulting solving variants via proof complexity techniques. Our results show that almost all of the QCDCL variants are exponentially incomparable with respect to proof size (and hence solver running time), pointing towards different orthogonal ways how to practically implement QCDCL.
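For readers unfamiliar with the pure-literal rule mentioned in the abstract, the sketch below applies the textbook version of it to a small prenex-CNF QBF: an existential pure literal can be set to true (satisfying its clauses), while a universal pure literal can be set to false (and removed from clauses). The encoding and the example formula are illustrative assumptions; the sketch does not reproduce the paper's formal QCDCL proof systems.

```python
# Textbook pure-literal elimination for a prenex-CNF QBF, shown only to
# illustrate the rule the abstract refers to; it is not the paper's formal
# QCDCL proof systems. Variables map to 'e' (existential) or 'a' (universal);
# clauses are lists of integer literals (negative = negated).
def pure_literal_eliminate(quantifier, clauses):
    """Repeatedly apply the QBF pure-literal rule:
       - an existential pure literal is set to true (its clauses are satisfied),
       - a universal pure literal is set to false (it is deleted from clauses)."""
    clauses = [list(c) for c in clauses]
    changed = True
    while changed:
        changed = False
        literals = {lit for clause in clauses for lit in clause}
        for lit in sorted(literals, key=abs):
            if -lit in literals:
                continue                      # not pure
            if quantifier[abs(lit)] == 'e':
                # Existential pure literal: satisfy and drop its clauses.
                clauses = [c for c in clauses if lit not in c]
            else:
                # Universal pure literal: it cannot help satisfy any clause,
                # so remove the literal itself.
                clauses = [[l for l in c if l != lit] for c in clauses]
            changed = True
            break
    return clauses

# Example: forall x1 exists x2 exists x3 . (x1 v x2) & (-x1 v x3) & (x2 v x3)
quantifier = {1: 'a', 2: 'e', 3: 'e'}
clauses = [[1, 2], [-1, 3], [2, 3]]
print(pure_literal_eliminate(quantifier, clauses))
# x2 and x3 are pure existential literals, so every clause gets satisfied: []
```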

Citations: 0
Identifying roles of formulas in inconsistency under Priest's minimally inconsistent logic of paradox
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-05 DOI: 10.1016/j.artint.2024.104199

It has been increasingly recognized that identifying roles of formulas of a knowledge base in the inconsistency of that base can help us better look inside the inconsistency. However, there are few approaches to identifying such roles of formulas from a perspective of models in some paraconsistent logic, one of the typical tools used to characterize inconsistency in semantics. In this paper, we characterize the role of each formula in the inconsistency arising in a knowledge base from informational as well as causal aspects in the framework of Priest's minimally inconsistent logic of paradox. At first, we identify the causal responsibility of a formula for the inconsistency based on the counterfactual dependence of the inconsistency on the formula under some contingency in semantics. Then we incorporate the change on semantic information in the framework of causal responsibility to develop the informational responsibility of a formula for the inconsistency to capture the contribution made by the formula to the inconsistent information. This incorporation makes the informational responsibility interpretable from the point of view of causality, and capable of capturing the role of a formula in inconsistent information concisely. In addition, we propose notions of naive and quasi-naive responsibilities as two auxiliaries to describe special relations between inconsistency and formulas in semantic sense. Some intuitive and interesting properties of the two kinds of responsibilities are also discussed.
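The counterfactual reading of responsibility used in the abstract can be illustrated in ordinary classical propositional logic (Priest's LP semantics is deliberately not reproduced here): a formula is counterfactually responsible for inconsistency if, under some contingency that sets aside other formulas, the remaining base is inconsistent with it but consistent without it. The brute-force checker below, including its truth-table satisfiability test and the toy knowledge base, is an illustrative assumption rather than the paper's construction.

```python
# Classical-logic illustration of counterfactual responsibility for
# inconsistency (NOT Priest's LP semantics): a formula is responsible if some
# contingency, i.e. setting aside a subset of the other formulas, makes the
# inconsistency counterfactually depend on it. Formulas are encoded as Python
# predicates over truth assignments; this encoding is an illustrative assumption.
from itertools import combinations, product

ATOMS = ["p", "q"]

def consistent(formulas):
    """Brute-force satisfiability over the atoms above."""
    return any(all(f(dict(zip(ATOMS, values))) for _, f in formulas)
               for values in product([False, True], repeat=len(ATOMS)))

def responsible(kb, name):
    """Is `name` counterfactually responsible for the inconsistency of kb?"""
    target = [(n, f) for n, f in kb if n == name]
    others = [(n, f) for n, f in kb if n != name]
    for k in range(len(others) + 1):
        for contingency in combinations(others, k):
            rest = [x for x in others if x not in contingency]
            if not consistent(rest + target) and consistent(rest):
                return True   # removing `name` restores consistency here
    return False

kb = [
    ("f1", lambda v: v["p"]),          # p
    ("f2", lambda v: not v["p"]),      # not p
    ("f3", lambda v: v["q"]),          # q  (an innocent bystander)
]

for name, _ in kb:
    print(name, "responsible:", responsible(kb, name))
# f1 and f2 come out responsible, f3 does not.
```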

Citations: 0
Representing states in iterated belief revision
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-05 DOI: 10.1016/j.artint.2024.104200

Iterated belief revision requires information about the current beliefs. This information is represented by mathematical structures called doxastic states. Most literature concentrates on how to revise a doxastic state and neglects that it may grow exponentially. This problem is studied for the most common ways of storing a doxastic state. All four of them are able to store every doxastic state, but some do it in less space than others. In particular, the explicit representation (an enumeration of the current beliefs) is the most wasteful of space. The level representation (a sequence of propositional formulae) and the natural representation (a history of natural revisions) are more succinct than the explicit representation. The lexicographic representation (a history of lexicographic revisions) is more succinct still.
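The succinctness claim about the lexicographic representation can be made concrete: a history of lexicographic revisions induces a total preorder on worlds in which more recently revised-by formulas dominate, so storing a few formulas stands in for enumerating an ordering over exponentially many worlds. The sketch below spells out that standard reading over a flat initial state; the atom names and the encoding of formulas as Python predicates are illustrative assumptions.

```python
# Standard reading of a lexicographic-revision history: the current doxastic
# state is the total preorder on worlds obtained by sorting them on whether
# they satisfy the most recent revision formula, then the previous one, and so
# on (starting from a flat initial ordering). Formula encoding and atom names
# are illustrative assumptions.
from itertools import product

ATOMS = ["p", "q"]

# History of lexicographic revisions, oldest first: revise by p, then by q.
history = [
    ("p", lambda w: w["p"]),
    ("q", lambda w: w["q"]),
]

def plausibility_key(world):
    """More recent formulas dominate; satisfied (True) sorts before unsatisfied."""
    return tuple(not f(world) for _, f in reversed(history))

worlds = [dict(zip(ATOMS, vals)) for vals in product([False, True], repeat=len(ATOMS))]
for w in sorted(worlds, key=plausibility_key):
    print({a: int(v) for a, v in w.items()}, "key:", plausibility_key(w))
# The most plausible worlds satisfy q (the latest revision); ties are broken by p.
# The compact history (two formulas) represents this entire ordering over all worlds.
```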

Citations: 0
On measuring inconsistency in graph databases with regular path constraints
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-02 DOI: 10.1016/j.artint.2024.104197

Real-world data are often inconsistent. Although a substantial amount of research has been done on measuring inconsistency, this research has concentrated on knowledge bases formalized in propositional logic. Recently, inconsistency measures have been introduced for relational databases. However, nowadays, real-world information is ever more frequently represented by graph-based structures, which offer a more intuitive conceptualization than relational ones. In this paper, we explore inconsistency measures for graph databases with regular path constraints, a class of integrity constraints based on a well-known navigational language for graph data. In this context, we define several inconsistency measures dealing with specific elements contributing to inconsistency in graph databases. We also define some rationality postulates that are desirable properties for an inconsistency measure for graph databases. We analyze the compliance of each measure with each postulate and find various degrees of satisfaction; in fact, one of the measures satisfies all the postulates. Finally, we investigate the data and combined complexity of the calculation of all the measures as well as the complexity of deciding whether a measure is lower than, equal to, or greater than a given threshold. It turns out that for a majority of the measures these problems are tractable, while the others exhibit different levels of intractability.
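As a concrete, deliberately simplified anchor for the idea of measuring inconsistency against regular path constraints, the sketch below evaluates a single denial constraint ("no node reaches itself via a path matching supervises+") by running the corresponding automaton over an edge-labelled graph and counting the violating nodes. Both the constraint and the count-of-violations measure are illustrative assumptions, not one of the paper's measures.

```python
# Toy illustration of one possible inconsistency measure for a graph database
# under a regular path (denial) constraint: "no node may reach itself via a
# path matching supervises+". The measure counted here, the number of nodes
# violating the constraint, is a simple drastic count chosen for illustration;
# the graph and constraint are illustrative assumptions.
from collections import deque

# Edge-labelled graph: (source, label, target).
edges = [
    ("ann", "supervises", "bob"),
    ("bob", "supervises", "carl"),
    ("carl", "supervises", "ann"),     # closes a supervision cycle
    ("dana", "supervises", "erik"),
    ("dana", "knows", "ann"),
]

# NFA for the path expression supervises+ : state 0 is initial, state 1 accepts.
TRANSITIONS = {(0, "supervises"): 1, (1, "supervises"): 1}
ACCEPTING = {1}

def matches_from(start):
    """All nodes reachable from `start` along a path matching supervises+."""
    reached, seen = set(), {(start, 0)}
    queue = deque([(start, 0)])
    while queue:
        node, state = queue.popleft()
        for src, label, dst in edges:
            if src != node or (state, label) not in TRANSITIONS:
                continue
            nxt = (dst, TRANSITIONS[(state, label)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
            if nxt[1] in ACCEPTING:
                reached.add(dst)
    return reached

nodes = {n for s, _, t in edges for n in (s, t)}
violations = [n for n in sorted(nodes) if n in matches_from(n)]
print("violating nodes:", violations)        # ann, bob and carl sit on the cycle
print("toy inconsistency measure:", len(violations))
```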

Citations: 0
Sample-based bounds for coherent risk measures: Applications to policy synthesis and verification
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-02 DOI: 10.1016/j.artint.2024.104195

Autonomous systems are increasingly used in highly variable and uncertain environments, giving rise to the pressing need to consider risk in both the synthesis and verification of policies for these systems. This paper first develops a sample-based method to upper bound the risk measure evaluation of a random variable whose distribution is unknown. These bounds permit us to generate high-confidence verification statements for a large class of robotic systems in a sample-efficient manner. Second, we develop a sample-based method to determine solutions to non-convex optimization problems that outperform a large fraction of the decision space of possible solutions. Both sample-based approaches then permit us to rapidly synthesize risk-aware policies that are guaranteed to achieve a minimum level of system performance. To showcase our approach in simulation, we verify a cooperative multi-agent system and develop a risk-aware controller that outperforms the system's baseline controller. Our approach can be extended to account for any g-entropic risk measure.
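To ground the notion of sample-based evaluation of a coherent risk measure, the sketch below computes an empirical conditional value-at-risk (CVaR) from samples of an unknown loss distribution and adds a naive bootstrap percentile as a heuristic high-confidence estimate. It is only an illustration of working with samples; it is not the paper's upper bound and carries no formal guarantee.

```python
# Sample-based evaluation of a coherent risk measure (conditional value-at-risk,
# CVaR, of a loss distribution), plus a naive bootstrap percentile as a
# heuristic high-confidence estimate. This only illustrates working with
# samples of an unknown distribution; it is NOT the paper's upper bound and
# carries no formal guarantee.
import numpy as np

def empirical_cvar(losses, alpha=0.9):
    """Average of the worst (1 - alpha) fraction of the observed losses."""
    losses = np.sort(np.asarray(losses))
    tail = losses[int(np.ceil(alpha * len(losses))):]
    return float(tail.mean())

rng = np.random.default_rng(0)
losses = rng.lognormal(mean=0.0, sigma=0.75, size=500)   # unknown, heavy-ish tail

point = empirical_cvar(losses)

# Heuristic bootstrap: re-estimate CVaR on resampled data, take a high percentile.
boot = [empirical_cvar(rng.choice(losses, size=len(losses), replace=True))
        for _ in range(1000)]
upper = float(np.percentile(boot, 95))

print(f"empirical CVaR_0.9       : {point:.3f}")
print(f"bootstrap 95th percentile: {upper:.3f}")
```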

Citations: 0
Manipulation and peer mechanisms: A survey
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-02 DOI: 10.1016/j.artint.2024.104196

In peer mechanisms, the competitors for a prize also determine who wins. Each competitor may be asked to rank, grade, or nominate peers for the prize. Since the prize can be valuable, such as financial aid, course grades, or an award at a conference, competitors may be tempted to manipulate the mechanism. We survey approaches to prevent or discourage the manipulation of peer mechanisms. We conclude our survey by identifying several important research challenges.

Citations: 0
NovPhy: A Physical Reasoning Benchmark for Open-world AI Systems
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-08-02 DOI: 10.1016/j.artint.2024.104198

Due to the emergence of AI systems that interact with the physical environment, there is an increased interest in incorporating physical reasoning capabilities into those AI systems. But is it enough to only have physical reasoning capabilities to operate in a real physical environment? In the real world, we constantly face novel situations we have not encountered before. As humans, we are competent at successfully adapting to those situations. Similarly, an agent needs to have the ability to function under the impact of novelties in order to properly operate in an open-world physical environment. To facilitate the development of such AI systems, we propose a new benchmark, NovPhy, that requires an agent to reason about physical scenarios in the presence of novelties and take actions accordingly. The benchmark consists of tasks that require agents to detect and adapt to novelties in physical scenarios. To create tasks in the benchmark, we develop eight novelties representing a diverse novelty space and apply them to five commonly encountered scenarios in a physical environment, related to applying forces and motions such as rolling, falling, and sliding of objects. According to our benchmark design, we evaluate two capabilities of an agent: the performance on a novelty when it is applied to different physical scenarios and the performance on a physical scenario when different novelties are applied to it. We conduct a thorough evaluation with human players, learning agents, and heuristic agents. Our evaluation shows that humans' performance is far beyond the agents' performance. Some agents, even with good normal task performance, perform significantly worse when there is a novelty, and the agents that can adapt to novelties typically adapt slower than humans. We promote the development of intelligent agents capable of performing at the human level or above when operating in open-world physical environments. Benchmark website: https://github.com/phy-q/novphy

Citations: 0
Truthful aggregation of budget proposals with proportionality guarantees
IF 5.1 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-30 DOI: 10.1016/j.artint.2024.104178

We study a participatory budgeting problem, where a set of strategic agents wish to split a divisible budget among different projects, by aggregating their proposals into a single division. Unfortunately, the straightforward rule that divides the budget proportionally is susceptible to manipulation. Recently, a class of truthful mechanisms has been proposed, namely the moving phantom mechanisms. One such mechanism satisfies the proportionality property, in the sense that in the extreme case where all agents prefer a single project to receive the whole amount, the budget is assigned proportionally.

While proportionality is a naturally desired property, it is defined over a limited type of preference profiles. To address this, we expand the notion of proportionality, by proposing a quantitative framework that evaluates a budget aggregation mechanism according to its worst-case distance from the proportional allocation. Crucially, this is defined for every preference profile. We study this measure on the class of moving phantom mechanisms, and we provide approximation guarantees. For two projects, we show that the Uniform Phantom mechanism is optimal among all truthful mechanisms. For three projects, we propose a new, proportional mechanism that is virtually optimal among all moving phantom mechanisms. Finally, we provide impossibility results regarding the approximability of moving phantom mechanisms.
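The manipulability of the proportional (averaging) rule is easy to see in the two-project case, where each agent reports the share it would assign to project 1; a generalized median with uniformly spaced phantoms is the standard strategyproof alternative for such single-peaked reports (Moulin). The sketch below demonstrates both; whether the phantom placement k/n coincides exactly with the paper's Uniform Phantom mechanism is an assumption, and the reported ideal points are illustrative.

```python
# Two-project illustration: each agent reports the fraction of the budget it
# would give to project 1. The mean (proportional) rule is manipulable, while
# a generalized median with uniformly spaced phantoms is strategyproof for
# single-peaked preferences (Moulin). The phantom placement k/n below is an
# assumption and may differ from the paper's Uniform Phantom mechanism.
import numpy as np

def mean_rule(reports):
    return float(np.mean(reports))

def uniform_phantom_median(reports):
    n = len(reports)
    phantoms = [k / n for k in range(1, n)]     # n - 1 uniformly spaced phantoms
    return float(np.median(list(reports) + phantoms))

ideals = [0.1, 0.5, 0.9]            # agents' true ideal shares for project 1

truthful = ideals
print("mean, truthful      :", mean_rule(truthful))               # 0.5
# Agent 0 (ideal 0.1) exaggerates to drag the average toward its ideal.
print("mean, agent 0 lies  :", mean_rule([0.0] + truthful[1:]))   # below 0.5

print("phantom median, truthful    :", uniform_phantom_median(truthful))
print("phantom median, agent 0 lies:", uniform_phantom_median([0.0] + truthful[1:]))
# The phantom-median outcome does not move in agent 0's favour, illustrating
# why median-style mechanisms are the basis for truthful budget aggregation.
```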

Citations: 0