
Journal of Artificial Intelligence Research: Latest Articles

The LM-Cut Heuristic Family for Optimal Numeric Planning with Simple Conditions
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-12-28 | DOI: 10.1613/jair.1.14034
Ryo Kuroiwa, Alexander Shleyfman, Chiara Piacentini, Margarita P. Castro, J. Christopher Beck
The LM-cut heuristic, both alone and as part of the operator counting framework, represents one of the most successful heuristics for classical planning. In this paper, we generalize LM-cut and its use in operator counting to optimal numeric planning with simple conditions and simple numeric effects, i.e., linear expressions over numeric state variables and actions that increase or decrease such variables by constant quantities. We introduce a variant of h^max_hbd (a previously proposed numeric h^max heuristic) based on the delete-relaxed version of such planning tasks and show that, although inadmissible by itself, our variant yields a numeric version of the classical LM-cut heuristic which is admissible. We classify the three existing families of heuristics for this class of numeric planning tasks and introduce the LM-cut family, proving dominance or incomparability between all pairs of existing max and LM-cut heuristics for numeric planning with simple conditions. Our extensive empirical evaluation shows that the new LM-cut heuristic, both on its own and as part of the operator counting framework, is the state of the art for this class of numeric planning problems.
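As a rough illustration of the task class named in this abstract, the Python sketch below models a simple condition as a linear expression over numeric state variables compared against a constant, and a simple effect as adding a constant to a variable. The class names `LinearCondition` and `Action`, the choice of `>=` as the comparison, and the fuel example are assumptions made for illustration; none of them come from the paper, and the sketch does not implement LM-cut.

```python
from dataclasses import dataclass


@dataclass
class LinearCondition:
    """Holds when sum(weights[v] * state[v]) >= bound (hypothetical encoding)."""
    weights: dict
    bound: float

    def holds(self, state):
        return sum(w * state[v] for v, w in self.weights.items()) >= self.bound


@dataclass
class Action:
    """An action whose effects add constants to numeric variables."""
    name: str
    preconditions: list
    effects: dict  # variable -> constant added (negative values decrease)
    cost: float = 1.0

    def applicable(self, state):
        return all(c.holds(state) for c in self.preconditions)

    def apply(self, state):
        # Return a new state; the input state is left unchanged.
        new_state = dict(state)
        for var, delta in self.effects.items():
            new_state[var] += delta
        return new_state


# Hypothetical example: moving needs at least 2 units of fuel,
# burns 2 units, and advances the position x by 1.
move = Action("move", [LinearCondition({"fuel": 1}, 2)], {"fuel": -2, "x": 1})
start = {"fuel": 5, "x": 0}
after_two_moves = move.apply(move.apply(start))
```

After two moves the fuel level drops below the precondition bound, so `move.applicable(after_two_moves)` is false.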
Citations: 2
Data-Driven Revision of Conditional Norms in Multi-Agent Systems
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-12-28 | DOI: 10.1613/jair.1.13683
Davide Dell’Anna, N. Alechina, F. Dalpiaz, M. Dastani, B. Logan
In multi-agent systems, norm enforcement is a mechanism for steering the behavior of individual agents in order to achieve desired system-level objectives. Due to the dynamics of multi-agent systems, however, it is hard to design norms that guarantee the achievement of the objectives in every operating context. Also, these objectives may change over time, thereby making previously defined norms ineffective. In this paper, we investigate the use of system execution data to automatically synthesise and revise conditional prohibitions with deadlines, a type of norm aimed at prohibiting agents from exhibiting certain patterns of behaviors. We propose DDNR (Data-Driven Norm Revision), a data-driven approach to norm revision that synthesises revised norms with respect to a data set of traces describing the behavior of the agents in the system. We evaluate DDNR using a state-of-the-art, off-the-shelf urban traffic simulator. The results show that DDNR synthesises revised norms that are significantly more accurate than the original norms in distinguishing adequate and inadequate behaviors for the achievement of the system-level objectives.
Citations: 1
Proofs and Certificates for Max-SAT
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-12-08 | DOI: 10.1613/jair.1.13811
M. Py, Mohamed Sami Cherif, Djamal Habet
Current Max-SAT solvers are able to efficiently compute the optimal value of an input instance but they do not provide any certificate of its validity. In this paper, we present a tool, called MS-Builder, which generates certificates for the Max-SAT problem in the particular form of a sequence of equivalence-preserving transformations. To generate a certificate, MS-Builder iteratively calls a SAT oracle to get a SAT resolution refutation which is handled and adapted into a sound refutation for Max-SAT. In particular, we prove that the size of the computed Max-SAT refutation is linear with respect to the size of the initial refutation if it is semi-read-once, tree-like regular, tree-like or semi-tree-like. Additionally, we propose an extendable tool, called MS-Checker, able to verify the validity of any Max-SAT certificate using Max-SAT inference rules. Both tools are evaluated on the unweighted and weighted benchmark instances of the 2020 Max-SAT Evaluation.
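To make the phrase "equivalence-preserving transformation" concrete, the Python sketch below brute-force checks whether a transformation leaves the number of falsified clauses unchanged under every assignment. It uses the standard Max-SAT resolution rule as the example. This is only an illustration of the property, not MS-Builder or MS-Checker; the clause encoding (tuples of signed integers) and the function names are assumptions.

```python
from itertools import product


def falsified(clauses, assignment):
    """Count clauses with no true literal under the assignment."""
    return sum(
        not any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )


def cost_preserving(before, after, variables):
    """True if both clause multisets have equal cost under every assignment."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if falsified(before, assignment) != falsified(after, assignment):
            return False
    return True


variables = [1, 2, 3]

# Premises: (x1 or x2) and (not x1 or x3).
before = [(1, 2), (-1, 3)]

# Max-SAT resolution replaces the premises with the resolvent
# plus two compensation clauses.
maxsat_after = [(2, 3), (1, 2, -3), (-1, -2, 3)]

# Classical SAT resolution would keep only the resolvent, which
# changes the cost under some assignments.
sat_after = [(2, 3)]

maxsat_ok = cost_preserving(before, maxsat_after, variables)
sat_ok = cost_preserving(before, sat_after, variables)
```

Here `maxsat_ok` is true and `sat_ok` is false, which is why a SAT refutation has to be adapted before it is sound for Max-SAT.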
Citations: 2
Chance-constrained Static Schedules for Temporally Probabilistic Plans
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-12-07 | DOI: 10.1613/jair.1.13636
Cheng Fang, Andrew J. Wang, B. Williams
Time management under uncertainty is essential to large-scale projects. From space exploration to industrial production, there is a need to schedule and perform activities given complex specifications on timing. In order to generate schedules that are robust to uncertainty in the duration of activities, prior work has focused on a problem framing that uses an interval-bounded uncertainty representation. However, such approaches are unable to take advantage of known probability distributions over duration. In this paper we concentrate on a probabilistic formulation of temporal problems with uncertain duration, called the probabilistic simple temporal problem. As distributions often have an unbounded range of outcomes, we consider chance-constrained solutions, with guarantees on the probability of meeting temporal constraints. By considering distributions over uncertain duration, we are able to use risk as a resource, reason over the relative likelihood of outcomes, and derive higher utility solutions. We first demonstrate our approach by encoding the problem as a convex program. We then develop a more efficient hybrid algorithm whose parent solver generates risk allocations and whose child solver generates schedules for a particular risk allocation. The child is made efficient by leveraging existing interval-bounded scheduling algorithms, while the parent is made efficient by extracting conflicts over risk allocations. We perform numerical experiments to show the advantages of reasoning over probabilistic uncertainty, by comparing the utility of schedules generated with risk allocation against those generated from reasoning over bounded uncertainty. We also empirically show that solution time is greatly reduced by incorporating conflict-directed risk allocation.
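The idea of a risk allocation can be sketched in a few lines of Python: each uncertain duration receives a share of the total allowed risk, and that share is turned into an interval bound that an interval-bounded scheduler could then consume. The sketch assumes normally distributed durations, a uniform split of the risk, and symmetric intervals; the activity names and numbers are hypothetical, and the paper's parent solver chooses allocations in a more informed way.

```python
from statistics import NormalDist


def bounds_for_risk(dist, risk):
    """Symmetric interval that the duration falls outside with probability `risk`."""
    lo = dist.inv_cdf(risk / 2)
    hi = dist.inv_cdf(1 - risk / 2)
    return lo, hi


# Hypothetical activities with normally distributed durations (minutes).
durations = {"drive": NormalDist(30, 5), "load": NormalDist(10, 2)}

# Total probability of violating any bound that we are willing to accept.
total_risk = 0.05

# Uniform allocation (an assumption): each duration gets an equal share.
allocation = {name: total_risk / len(durations) for name in durations}

# By the union bound, the chance that any duration leaves its interval
# is at most the sum of the allocated risks.
intervals = {
    name: bounds_for_risk(dist, allocation[name])
    for name, dist in durations.items()
}
```

Shifting risk toward the durations that constrain the schedule most would widen the remaining intervals less, which is the sense in which risk acts as a resource.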
Citations: 1
Towards Evidence Retrieval Cost Reduction in Abstract Argumentation Frameworks with Fallible Evidence
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-12-07 | DOI: 10.1613/jair.1.13639
Andrea Cohen, Sebastian Gottifredi, A. García, Guillermo R. Simari
Arguments in argumentation systems cannot always be considered as standalone entities, requiring the consideration of the pieces of evidence they rely on. This evidence might have to be retrieved from external sources such as databases or the web, and each attempt to retrieve a piece of evidence comes with an associated cost. Moreover, a piece of evidence may be available in a given scenario but not in others, and this is not known beforehand. As a result, the collection of active arguments (whose entire set of evidence is available) that can be used by the argumentation machinery of the system may vary from one scenario to another. In this work, we consider an Abstract Argumentation Framework with Fallible Evidence that accounts for these issues, and propose a heuristic measure used as part of the acceptability calculus (specifically, for building pruned dialectical trees) with the aim of minimizing the evidence retrieval cost of the arguments involved in the reasoning process. We provide an algorithmic solution that is empirically tested against two baselines and formally show the correctness of our approach.
Citations: 0
Hierarchical Decompositions and Termination Analysis for Generalized Planning
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-12-06 | DOI: 10.1613/jair.1.14185
Siddharth Srivastava
This paper presents new methods for analyzing and evaluating generalized plans that can solve broad classes of related planning problems. Although synthesis and learning of generalized plans has been a longstanding goal in AI, it remains challenging due to fundamental gaps in methods for analyzing the scope and utility of a given generalized plan. This paper addresses these gaps by developing a new conceptual framework along with proof techniques and algorithmic processes for assessing termination and goal-reachability related properties of generalized plans. We build upon classic results from graph theory to decompose generalized plans into smaller components that are then used to derive hierarchical termination arguments. These methods can be used to determine the utility of a given generalized plan, as well as to guide the synthesis and learning processes for generalized plans. We present theoretical as well as empirical results illustrating the scope of this new approach. Our analysis shows that this approach significantly extends the class of generalized plans that can be assessed automatically, thereby reducing barriers in the synthesis and learning of reliable generalized plans.
Citations: 0
Strategy Graphs for Influence Diagrams
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-11-30 | DOI: 10.1613/jair.1.13865
E. Hansen, Jinchuan Shi, James Kastrantas
An influence diagram is a graphical model of a Bayesian decision problem that is solved by finding a strategy that maximizes expected utility. When an influence diagram is solved by variable elimination or a related dynamic programming algorithm, it is traditional to represent a strategy as a sequence of policies, one for each decision variable, where a policy maps the relevant history for a decision to an action. We propose an alternative representation of a strategy as a graph, called a strategy graph, and show how to modify a variable elimination algorithm so that it constructs a strategy graph. We consider both a classic variable elimination algorithm for influence diagrams and a recent extension of this algorithm that has more relaxed constraints on elimination order that allow improved performance. We consider the advantages of representing a strategy as a graph and, in particular, how to simplify a strategy graph so that it is easier to interpret and analyze.
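The traditional representation mentioned in this abstract, a policy that maps the relevant history for a decision to an action, can be sketched for a one-decision problem in Python. The example below is a hypothetical umbrella problem with made-up probabilities and utilities; it computes the policy by direct enumeration and does not construct a strategy graph or perform variable elimination.

```python
def best_policy(prior, likelihood, utility, actions):
    """Map each observation to the action with the highest expected utility."""
    observations = {obs for table in likelihood.values() for obs in table}
    policy = {}
    for obs in sorted(observations):
        def expected_utility(action):
            # Unnormalised expected utility:
            # sum over states of P(state) * P(obs | state) * U(state, action).
            return sum(
                prior[state] * likelihood[state][obs] * utility[(state, action)]
                for state in prior
            )
        policy[obs] = max(actions, key=expected_utility)
    return policy


# Hypothetical chance variable (weather) and noisy observation (forecast).
prior = {"rain": 0.3, "sun": 0.7}
likelihood = {
    "rain": {"wet": 0.8, "dry": 0.2},
    "sun": {"wet": 0.2, "dry": 0.8},
}

# Hypothetical utilities for each (weather, action) pair.
utility = {
    ("rain", "umbrella"): 70,
    ("rain", "none"): 0,
    ("sun", "umbrella"): 20,
    ("sun", "none"): 100,
}

policy = best_policy(prior, likelihood, utility, ["umbrella", "none"])
```

With these numbers the policy takes the umbrella on a "wet" forecast and leaves it on a "dry" one. With several decisions, one such table per decision variable forms the sequence of policies that the abstract contrasts with a strategy graph.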
Citations: 0
Asymmetric Action Abstractions for Planning in Real-Time Strategy Games
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-11-30 | DOI: 10.1613/jair.1.13769
Rubens O. Moraes, M. Nascimento, Levi H. S. Lelis
Action abstractions restrict the number of legal actions available for real-time planning in zero-sum extensive-form games, thus allowing algorithms to focus their search on a set of promising actions. Even though unabstracted game trees can lead to optimal policies, due to real-time constraints and the tree size, they are not a practical choice. In this context, we introduce an action abstraction scheme which we call asymmetric action abstraction. Asymmetric abstractions allow search algorithms to “pay more attention” to some aspects of the game by unevenly dividing the algorithm’s search effort amongst different aspects of the game. We also introduce four algorithms that search in asymmetrically abstracted game trees to evaluate the effectiveness of our abstraction schemes. Two of our algorithms are adaptations of algorithms developed for searching in action-abstracted spaces, Portfolio Greedy Search and Stratified Strategy Selection, and the other two are adaptations of an algorithm developed for searching in unabstracted spaces, NaïveMCTS. An extensive set of experiments in a real-time strategy game shows that search algorithms using asymmetric abstractions are able to outperform all other search algorithms tested.
Citations: 0
Learning to Design Fair and Private Voting Rules
IF 5 | Zone 3 (Computer Science) | Q2 Computer Science | Pub Date: 2022-11-30 | DOI: 10.1613/jair.1.13734
Farhad Mohsin, Ao Liu, Pin-Yu Chen, Francesca Rossi, Lirong Xia
Voting is used widely to identify a collective decision for a group of agents, based on their preferences. In this paper, we focus on evaluating and designing voting rules that support both the privacy of the voting agents and a notion of fairness over such agents. To do this, we introduce a novel notion of group fairness and adopt the existing notion of local differential privacy. We then evaluate the level of group fairness in several existing voting rules, as well as the trade-offs between fairness and privacy, showing that it is not possible to always obtain maximal economic efficiency with high fairness or high privacy levels. Then, we present both a machine learning and a constrained optimization approach to design new voting rules that are fair while maintaining a high level of economic efficiency. Finally, we empirically examine the effect of adding noise to create local differentially private voting rules and discuss the three-way trade-off between economic efficiency, fairness, and privacy. This paper appears in the special track on AI & Society.
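One well-known way to add noise under local differential privacy is k-ary randomized response, sketched below in Python for single-choice votes aggregated by plurality. This is an assumed mechanism chosen for illustration; the abstract does not say which noise mechanism or ballot format the paper uses, and the function names and the sample electorate are hypothetical.

```python
import math
import random
from collections import Counter


def truth_probability(epsilon, k):
    """Probability of reporting the true vote among k candidates."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def randomized_response(vote, candidates, epsilon, rng):
    """Report the true vote with some probability, else a random other candidate."""
    if rng.random() < truth_probability(epsilon, len(candidates)):
        return vote
    return rng.choice([c for c in candidates if c != vote])


def plurality(votes):
    """Return the candidate with the most reported votes."""
    return Counter(votes).most_common(1)[0][0]


candidates = ["a", "b", "c"]
true_votes = ["a"] * 60 + ["b"] * 30 + ["c"] * 10

# A seeded generator keeps the sketch reproducible.
rng = random.Random(0)
noisy_votes = [randomized_response(v, candidates, 1.0, rng) for v in true_votes]
noisy_winner = plurality(noisy_votes)
```

Each voter perturbs their own ballot before it is collected, so the aggregator never sees the true votes. A smaller `epsilon` gives stronger privacy but makes it more likely that `noisy_winner` differs from the true plurality winner, which is one face of the efficiency and privacy trade-off discussed in the abstract.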
Citations: 4
Reinforcement Learning from Optimization Proxy for Ride-Hailing Vehicle Relocation
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2022-11-28 DOI: 10.1613/jair.1.13794
Enpeng Yuan, Wenbo Chen, P. V. Hentenryck
Idle vehicle relocation is crucial for addressing the demand-supply imbalance that frequently arises in ride-hailing systems. The current mainstream methodologies, optimization and reinforcement learning, suffer from obvious computational drawbacks. Optimization models need to be solved in real time and often trade off model fidelity (hence quality of solutions) for computational efficiency. Reinforcement learning is expensive to train and often struggles to achieve coordination among a large fleet. This paper designs a hybrid approach that leverages the strengths of the two while overcoming their drawbacks. Specifically, it trains an optimization proxy, i.e., a machine-learning model that approximates an optimization model, and then refines the proxy with reinforcement learning. This Reinforcement Learning from Optimization Proxy (RLOP) approach is computationally efficient to train and deploy, and achieves better results than RL or optimization alone. Numerical experiments on the New York City dataset show that the RLOP approach significantly reduces both the relocation costs and the computation time compared to the optimization model, while pure reinforcement learning fails to converge due to computational complexity.
Citations: 3