
Journal of Artificial Intelligence Research: Latest Articles

Contract Scheduling with Predictions
Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-06-12 | DOI: 10.1613/jair.1.14117
Spyros Angelopoulos, Shahin Kamali
Contract scheduling is a general technique that allows the design of systems with interruptible capabilities, given an algorithm that is not necessarily interruptible. Previous work on this topic has assumed that the interruption is a worst-case deadline that is unknown to the scheduler. In this work, we study new settings in which the scheduler has access to some imperfect prediction regarding the interruption. In the first setting, which is inspired by recent advances in learning-enhanced algorithms, the prediction describes the time at which the interruption occurs. The second setting introduces a new model in which predictions are elicited as responses to a number of binary queries. For both settings, we investigate trade-offs between the robustness (i.e., the worst-case performance of the schedule if the prediction is generated adversarially) and the consistency (i.e., the performance assuming that the prediction is error-free). We also establish results on the performance of the schedules as a function of the prediction error.
Citations: 1
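As a rough illustration of the contract-scheduling entry above, the sketch below evaluates a schedule that has no prediction at all. It assumes the classic doubling schedule (contract lengths 1, 2, 4, ...) and the acceleration ratio, a standard worst-case measure in this literature; neither is spelled out in the abstract, and the function names are hypothetical.

```python
from itertools import accumulate


def doubling_schedule(n):
    """Contract lengths 1, 2, 4, ..., 2**(n-1), run back to back."""
    return [2.0 ** i for i in range(n)]


def acceleration_ratio(lengths):
    """Worst case of (interruption time) / (longest contract completed by then).

    The worst interruptions arrive just before a contract finishes, so it is
    enough to check the finish time of each contract against the longest
    contract completed before it.
    """
    finish = list(accumulate(lengths))
    worst = 0.0
    longest_done = lengths[0]
    for i in range(1, len(lengths)):
        worst = max(worst, finish[i] / longest_done)
        longest_done = max(longest_done, lengths[i])
    return worst


ratio = acceleration_ratio(doubling_schedule(20))
```

With 20 contracts the ratio is just under 4. The settings in the abstract ask how much a prediction can improve on this kind of worst-case figure when it is accurate, and how much is lost when it is not.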
Your Prompt is My Command: On Assessing the Human-Centred Generality of Multimodal Models
IF 5 | Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-06-12 | DOI: 10.1613/jair.1.14157
Wout Schellaert, Fernando Martínez-Plumed, Karina Vold, John Burden, Pablo Antonio Moreno Casares, B. S. Loe, Roi Reichart, Seán Ó hÉigeartaigh, A. Korhonen, J. Hernández-Orallo
Even with obvious deficiencies, large prompt-commanded multimodal models are proving to be flexible cognitive tools representing an unprecedented generality. But the directness, diversity, and degree of user interaction create a distinctive “human-centred generality” (HCG), rather than a fully autonomous one. HCG implies that, for a specific user, a system is only as general as it is effective for the user’s relevant tasks and their prevalent ways of prompting. A human-centred evaluation of general-purpose AI systems therefore needs to reflect the personal nature of interaction, tasks and cognition. We argue that the best way to understand these systems is as highly coupled cognitive extenders, and to analyse the bidirectional cognitive adaptations between them and humans. In this paper, we give a formulation of HCG, as well as a high-level overview of the elements and trade-offs involved in the prompting process. We end the paper by outlining some essential research questions and suggestions for improving evaluation practices, which we envision as characteristic of the evaluation of general artificial intelligence in the future. This paper appears in the AI & Society track.
Citations: 0
Efficient Multi-Goal Reinforcement Learning via Value Consistency Prioritization
IF 5 | Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-06-05 | DOI: 10.1613/jair.1.14398
Jiawei Xu, Shuxing Li, Rui Yang, Chun Yuan, Lei Han
Goal-conditioned reinforcement learning (RL) with sparse rewards remains a challenging problem in deep RL. Hindsight Experience Replay (HER) has been demonstrated to be an effective solution, where HER replaces desired goals in failed experiences with practically achieved states. Existing approaches mainly focus on either exploration or exploitation to improve the performance of HER. From a joint perspective, exploiting specific past experiences can also implicitly drive exploration. Therefore, we concentrate on prioritizing both original and relabeled samples for efficient goal-conditioned RL. To achieve this, we propose a novel value consistency prioritization (VCP) method, where the priority of samples is determined by the consistency of ensemble Q-values. This distinguishes the VCP method from most existing prioritization approaches, which prioritize samples based on the uncertainty of ensemble Q-values. Through extensive experiments, we demonstrate that VCP achieves significantly higher sample efficiency than existing algorithms on a range of challenging goal-conditioned manipulation tasks. We also visualize how VCP prioritizes good experiences to enhance policy learning.
Citations: 0
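The relabeling step that the entry above attributes to HER can be sketched in a few lines. This is a toy version under stated assumptions: states and goals are plain floats, the reward is the common 0/-1 sparse convention, and the "final state" relabeling strategy is used. The `agreement_priority` function is purely hypothetical; the abstract does not give the VCP formula, so this only illustrates the general idea of ranking samples by how closely ensemble Q-values agree.

```python
from statistics import pstdev


def sparse_reward(achieved, goal, tol=1e-6):
    """0 when the goal is reached, -1 otherwise (assumed sparse-reward convention)."""
    return 0.0 if abs(achieved - goal) <= tol else -1.0


def relabel_with_final_state(episode):
    """HER-style relabeling: treat the last achieved state as the desired goal."""
    new_goal = episode[-1]["achieved"]
    return [
        {**step, "goal": new_goal, "reward": sparse_reward(step["achieved"], new_goal)}
        for step in episode
    ]


def agreement_priority(q_values):
    """Hypothetical priority: larger when ensemble Q-values have a small spread."""
    return 1.0 / (1.0 + pstdev(q_values))


# A failed episode: the agent wanted to reach 10.0 but only got to 3.0.
failed = [
    {"achieved": 1.0, "goal": 10.0, "reward": -1.0},
    {"achieved": 2.0, "goal": 10.0, "reward": -1.0},
    {"achieved": 3.0, "goal": 10.0, "reward": -1.0},
]
relabeled = relabel_with_final_state(failed)
```

After relabeling, the last transition carries a reward of 0, so the failed episode still yields a useful learning signal.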
Semiring Reasoning Frameworks in AI and Their Computational Complexity
IF 5 | Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-05-31 | DOI: 10.1613/jair.1.13970
Thomas Eiter, Rafael Kiesel
Many important problems in AI, among them #SAT, parameter learning and probabilistic inference, go beyond the classical satisfiability problem. Here, instead of finding a solution, we are interested in a quantity associated with the set of solutions, such as the number of solutions, the optimal solution or the probability that a query holds in a solution. To model such quantitative problems in a uniform manner, a number of frameworks, e.g. Algebraic Model Counting and Semiring-based Constraint Satisfaction Problems, employ what we call the semiring paradigm. In the latter, the abstract algebraic structure of the semiring serves as a means of parameterizing the problem definition, thus allowing for different modes of quantitative computations by choosing different semirings. While efficiently solvable cases have been widely studied, a systematic study of the computational complexity of such problems depending on the semiring parameter is missing. In this work, we characterize the latter by NP(R), a novel generalization of NP over semiring R, and obtain NP(R)-completeness results for a selection of semiring frameworks. To obtain more tangible insights into the hardness of NP(R), we link it to well-known complexity classes from the literature. Interestingly, we manage to connect the computational hardness to properties of the semiring. Using this insight, we see that, on the one hand, NP(R) is always at least as hard as NP or Mod_pP depending on the semiring R and in general unlikely to be in FPSPACE(poly). On the other hand, for broad subclasses of semirings relevant in practice we can employ reductions to NP, Mod_pP and #P. These results show that in many cases solutions are only mildly harder to compute than functions in NP, Mod_pP and #P, give us new insights into how problems that involve counting on semirings can be approached, and provide a means of assessing whether an algorithm is appropriate for a given class of problems.
Citations: 0
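The semiring paradigm described in the entry above can be made concrete with a brute-force sketch: one evaluation routine, with the semiring passed in as a parameter. Everything here is an illustrative assumption rather than the paper's formalism: the `Semiring` tuple, the `amc` function name, the signed-integer CNF encoding, and the example formula and weights are all hypothetical.

```python
from itertools import product
from typing import Any, Callable, NamedTuple


class Semiring(NamedTuple):
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]
    zero: Any
    one: Any


BOOLEAN = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
COUNTING = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
MAX_TIMES = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def satisfies(cnf, assignment):
    """A clause is a list of signed variable indices, e.g. -1 means 'not x1'."""
    return all(any(assignment[abs(l)] == (l > 0) for l in clause) for clause in cnf)


def amc(cnf, n_vars, semiring, weight):
    """Semiring sum, over all models, of the semiring product of literal weights."""
    total = semiring.zero
    for bits in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(bits, start=1))
        if satisfies(cnf, assignment):
            term = semiring.one
            for var, val in assignment.items():
                term = semiring.mul(term, weight(var, val))
            total = semiring.add(total, term)
    return total


# (x1 or x2) and (not x1 or x3)
cnf = [[1, 2], [-1, 3]]
p_true = {1: 0.9, 2: 0.5, 3: 0.2}  # made-up marginal probabilities


def prob(var, val):
    return p_true[var] if val else 1.0 - p_true[var]


is_sat = amc(cnf, 3, BOOLEAN, lambda var, val: True)    # satisfiability
n_models = amc(cnf, 3, COUNTING, lambda var, val: 1)    # model counting
p_formula = amc(cnf, 3, COUNTING, prob)                 # probability of the formula
p_best = amc(cnf, 3, MAX_TIMES, prob)                   # weight of the most probable model
```

Swapping the semiring changes the question being answered without touching the evaluation loop. The enumeration is exponential in the number of variables, so this is only meant to show the parameterization, not an efficient solver.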
On Centralized Critics in Multi-Agent Reinforcement Learning
IF 5 | Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-05-31 | DOI: 10.1613/jair.1.14386
Xueguang Lyu, Andrea Baisero, Yuchen Xiao, Brett Daley, Chris Amato
Centralized Training for Decentralized Execution, where agents are trained offline in a centralized fashion and execute online in a decentralized manner, has become a popular approach in Multi-Agent Reinforcement Learning (MARL). In particular, it has become popular to develop actor-critic methods that train decentralized actors with a centralized critic, where the centralized critic is allowed to access global information of the entire system, including the true system state. Such centralized critics are possible given offline information and are not used for online execution. While these methods perform well in a number of domains and have become a de facto standard in MARL, using a centralized critic in this context has yet to be sufficiently analyzed theoretically or empirically. In this paper, we therefore formally analyze centralized and decentralized critic approaches, and analyze the effect of using state-based critics in partially observable environments. We derive theories contrary to the common intuition: critic centralization is not strictly beneficial, and using state values can be harmful. We further prove that, in particular, state-based critics can introduce unexpected bias and variance compared to history-based critics. Finally, we demonstrate how the theory applies in practice by comparing different forms of critics on a wide range of common multi-agent benchmarks. The experiments show practical issues such as the difficulty of representation learning with partial observability, which highlights why the theoretical problems are often overlooked in the literature.
Citations: 1
Computational Modelling of Quantifier Use: Corpus, Models, and Evaluation
IF 5 | Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-05-30 | DOI: 10.1613/jair.1.13899
Guanyi Chen, Kees van Deemter
A prominent strand of work in formal semantics investigates the ways in which human languages quantify the elements of a set, as when we say All A are B, Few A are B, and so on. Building on a growing body of empirical studies that shed light on the meaning and the use of quantifiers, we extend this line of work by computationally modelling how human speakers textually describe complex scenes in which quantitative relations play an important role. To this end, we conduct a series of elicitation experiments in which human speakers were asked to perform a linguistic task that invites the use of quantified expressions. The experiments result in a corpus, called QTUNA, made up of short texts that contain a large variety of quantified expressions. We analyse QTUNA, summarise our findings, and explain how we design computational models of human quantifier use accordingly. Finally, we evaluate these models in accordance with QTUNA.
Citations: 0
Object-agnostic Affordance Categorization via Unsupervised Learning of Graph Embeddings
Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-05-06 | DOI: 10.1613/jair.1.13253
Alexia Toumpa, Anthony G. Cohn
Acquiring knowledge about object interactions and affordances can facilitate scene understanding and human-robot collaboration tasks. As humans tend to use objects in many different ways depending on the scene and the objects’ availability, learning object affordances in everyday-life scenarios is a challenging task, particularly in the presence of an open set of interactions and objects. We address the problem of affordance categorization for class-agnostic objects with an open set of interactions; we achieve this by learning similarities between object interactions in an unsupervised way and thus inducing clusters of object affordances. A novel depth-informed qualitative spatial representation is proposed for the construction of Activity Graphs (AGs), which abstract from the continuous representation of spatio-temporal interactions in RGB-D videos. These AGs are clustered to obtain groups of objects with similar affordances. Our experiments in a real-world scenario demonstrate that our method learns to create object affordance clusters with a high V-measure even in cluttered scenes. The proposed approach handles object occlusions by effectively capturing possible interactions, without imposing any object or scene constraints.
Citations: 0
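The entry above reports cluster quality as a V-measure. For readers unfamiliar with the metric, here is a small from-scratch sketch using its usual definition: the harmonic mean of homogeneity and completeness, both built from conditional entropies. The function names and the toy labels are assumptions for illustration; the abstract does not say how the authors computed the score.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(x, given):
    """H(x | given) for two aligned label sequences."""
    n = len(x)
    joint = Counter(zip(given, x))
    marginal = Counter(given)
    return -sum((c / n) * log(c / marginal[g]) for (g, _), c in joint.items())


def v_measure(classes, clusters):
    """Harmonic mean of homogeneity and completeness."""
    h_classes, h_clusters = entropy(classes), entropy(clusters)
    # Homogeneity: each cluster should contain members of a single class.
    hom = 1.0 if h_classes == 0 else 1.0 - conditional_entropy(classes, clusters) / h_classes
    # Completeness: each class should land in a single cluster.
    com = 1.0 if h_clusters == 0 else 1.0 - conditional_entropy(clusters, classes) / h_clusters
    return 0.0 if hom + com == 0 else 2.0 * hom * com / (hom + com)


# Toy example: one class is split across two clusters.
score = v_measure([0, 0, 1, 1], [0, 0, 1, 2])
```

The score does not depend on how clusters are named, which suits an unsupervised setting where induced clusters have no predefined labels.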
FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?
Tier 3, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2023-05-06 | DOI: 10.1613/jair.1.13942
Shikhar Tuli, Bhishma Dedhia, Shreshth Tuli, Niraj K. Jha
The existence of a plethora of language models makes the problem of selecting the best one for a custom task challenging. Most state-of-the-art methods leverage transformer-based models (e.g., BERT) or their variants. However, training such models and exploring their hyperparameter space is computationally expensive. Prior work proposes several neural architecture search (NAS) methods that employ performance predictors (e.g., surrogate models) to address this issue; however, such works limit analysis to homogeneous models that use fixed dimensionality throughout the network. This leads to sub-optimal architectures. To address this limitation, we propose a suite of heterogeneous and flexible models, namely FlexiBERT, that have varied encoder layers with a diverse set of possible operations and different hidden dimensions. For better-posed surrogate modeling in this expanded design space, we propose a new graph-similarity-based embedding scheme. We also propose a novel NAS policy, called BOSHNAS, that leverages this new scheme, Bayesian modeling, and second-order optimization to quickly train and use a neural surrogate model to converge to the optimal architecture. A comprehensive set of experiments shows that the proposed policy, when applied to the FlexiBERT design space, pushes the performance frontier upwards compared to traditional models. FlexiBERT-Mini, one of our proposed models, has 3% fewer parameters than BERT-Mini and achieves an 8.9% higher GLUE score. A FlexiBERT model with performance equivalent to that of the best homogeneous model is 2.6× smaller. FlexiBERT-Large, another proposed model, attains state-of-the-art results, outperforming the baseline models by at least 5.7% on the GLUE benchmark.
Citations: 3
FactGen: Faithful Text Generation by Factuality-aware Pre-training and Contrastive Ranking Fine-tuning
IF 5 Zone 3, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-04-27 DOI: 10.1613/jair.1.14267
Zhibin Lan, Wei Li, Jinsong Su, Xinyan Xiao, Jiachen Liu, Wenhao Wu, Yajuan Lyu
Conditional text generation is supposed to generate a fluent and coherent target text that is faithful to the source text. Although pre-trained models have achieved promising results, they still suffer from the crucial factuality problem. To deal with this issue, we propose a factuality-aware pretraining-finetuning framework named FactGen, which fully considers factuality during two training stages. Specifically, at the pre-training stage, we utilize a natural language inference model to construct target texts that are entailed by the source texts, resulting in a more factually consistent pre-training objective. Then, during the fine-tuning stage, we further introduce a contrastive ranking loss to encourage the model to generate factually consistent text with higher probability. Extensive experiments on three conditional text generation tasks demonstrate the effectiveness and generality of our training framework.
Citations: 2
An Overview of Environmental Features that Impact Deep Reinforcement Learning in Sparse-Reward Domains
IF 5 Zone 3, Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-04-26 DOI: 10.1613/jair.1.14390
Jim Martin Catacora Ocana, R. Capobianco, D. Nardi
Deep reinforcement learning has achieved impressive results in recent years; yet, it is still severely troubled by environments showcasing sparse rewards. On top of that, not all sparse-reward environments are created equal, i.e., they can differ in the presence or absence of various features, with many of them having a great impact on learning. In light of this, the present work puts together a literature compilation of such environmental features, covering particularly those that have been taken advantage of and those that continue to pose a challenge. We expect this effort to provide guidance to researchers for assessing the generality of their new proposals and to call their attention to issues that remain unresolved when dealing with sparse rewards.
Citations: 1