
Journal of Artificial Intelligence Research: Latest Articles

Symbolic Task Inference in Deep Reinforcement Learning
IF 4.5 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-23 DOI: 10.1613/jair.1.14063
Hosein Hasanbeig, N. Jeppu, Alessandro Abate, Tom Melham, Daniel Kroening
This paper proposes DeepSynth, a method for effective training of deep reinforcement learning agents when the reward is sparse or non-Markovian, but at the same time progress towards the reward requires achieving an unknown sequence of high-level objectives. Our method employs a novel algorithm for synthesis of compact finite state automata to uncover this sequential structure automatically. We synthesise a human-interpretable automaton from trace data collected by exploring the environment. The state space of the environment is then enriched with the synthesised automaton, so that the generation of a control policy by deep reinforcement learning is guided by the discovered structure encoded in the automaton. The proposed approach is able to cope with both high-dimensional, low-level features and unknown sparse or non-Markovian rewards. We have evaluated DeepSynth’s performance in a set of experiments that includes the Atari game Montezuma’s Revenge, known to be challenging. Compared to approaches that rely solely on deep reinforcement learning, we obtain a reduction of two orders of magnitude in the iterations required for policy synthesis, and a significant improvement in scalability.
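The idea of enriching the environment state with an automaton state can be illustrated with a generic product construction. The sketch below is not DeepSynth's synthesis algorithm; the `Automaton` class, the label strings, and `run_trace` are hypothetical names chosen for illustration, and the automaton is assumed to be given rather than learned.

```python
from typing import Dict, Hashable, List, Tuple


class Automaton:
    """Deterministic automaton over high-level labels (hypothetical helper)."""

    def __init__(self, transitions: Dict[Tuple[int, str], int], start: int = 0):
        self.transitions = transitions
        self.start = start

    def step(self, q: int, label: str) -> int:
        # Assumption: labels with no outgoing edge leave the automaton state unchanged.
        return self.transitions.get((q, label), q)


def run_trace(
    automaton: Automaton, labelled_trace: List[Tuple[Hashable, str]]
) -> List[Tuple[Hashable, int]]:
    """Pair every environment state with the automaton state reached so far.

    The resulting (env_state, automaton_state) pairs are what a policy
    would condition on in the enriched state space.
    """
    q = automaton.start
    enriched = []
    for env_state, label in labelled_trace:
        q = automaton.step(q, label)
        enriched.append((env_state, q))
    return enriched
```

Because the automaton state records which high-level objectives have already been achieved, a reward that is non-Markovian in the raw environment state can become Markovian in the paired state.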
Citations: 0
Axiomatization of Non-Recursive Aggregates in First-Order Answer Set Programming
IF 4.5 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-07 DOI: 10.1613/jair.1.15786
Jorge Fandinno, Zachary Hansen, Yuliya Lierler
This paper contributes to the development of theoretical foundations of answer set programming. Groundbreaking work on the SM operator by Ferraris, Lee, and Lifschitz proposed a definition/semantics for logic (answer set) programs based on a syntactic transformation similar to parallel circumscription. That definition radically differed from its predecessors by using classical (second-order) logic and avoiding reference to either grounding or fixpoints. Yet, the work lacked the formalization of crucial and commonly used answer set programming language constructs called aggregates. In this paper, we present a characterization of logic programs with aggregates based on a many-sorted generalization of the SM operator. This characterization introduces new function symbols for aggregate operations and aggregate elements, whose meaning can be fixed by adding appropriate axioms to the result of the SM transformation. We prove that our characterization coincides with the ASP-Core-2 semantics for logic programs and, if we allow non-positive recursion through aggregates, it coincides with the semantics of the answer set solver CLINGO.
Citations: 0
Unifying SAT-Based Approaches to Maximum Satisfiability Solving
IF 4.5 Zone 3 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-07-07 DOI: 10.1613/jair.1.15986
Hannes Ihalainen, Jeremias Berg, Matti Järvisalo
Maximum satisfiability (MaxSAT), employing propositional logic as the declarative language of choice, has turned into a viable approach to solving NP-hard optimization problems arising from artificial intelligence and other real-world settings. A key contributing factor to the success of MaxSAT is the rise of increasingly effective exact solvers that are based on iterative calls to a Boolean satisfiability (SAT) solver. The three types of SAT-based MaxSAT solving approaches, each with its distinguishing features, implemented in current state-of-the-art MaxSAT solvers are the core-guided, the implicit hitting set (IHS), and the objective-bounding approaches. The objective-bounding approach is based on directly searching over the objective function range by iteratively querying a SAT solver if the MaxSAT instance at hand has a solution under different bounds on the objective. In contrast, both core-guided and IHS are so-called unsatisfiability-based approaches that employ a SAT solver as an unsatisfiable core extractor to determine sources of inconsistencies, but critically differ in how the found unsatisfiable cores are made use of towards finding a provably optimal solution. Furthermore, a variety of different algorithmic variants of the core-guided approach in particular have been proposed and implemented in solvers. It is well-acknowledged that each of the three approaches has its advantages and disadvantages, which is also witnessed by instance and problem-domain specific runtime performance differences (and at times similarities) of MaxSAT solvers implementing variants of the approaches. However, the questions of to what extent the approaches are fundamentally different and how the benefits of the individual methods could be combined in a single algorithmic approach are currently not fully understood. In this work, we approach these questions by developing UniMaxSAT, a general unifying algorithmic framework. 
Based on the recent notion of abstract cores, UniMaxSAT captures in general core-guided, IHS and objective-bounding computations. The framework offers a unified way of establishing quite generally the correctness of the current approaches. We illustrate this by formally showing that UniMaxSAT can simulate the computations of various algorithmic instantiations of the three types of MaxSAT solving approaches. Furthermore, UniMaxSAT can be instantiated in novel ways giving rise to new algorithmic variants of the approaches. We illustrate this aspect by developing a prototype implementation of an algorithmic variant for MaxSAT based on the framework.
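The objective-bounding approach described above can be sketched as a loop that asks an oracle whether a solution exists under successively weaker cost bounds. This is a toy illustration, not UniMaxSAT: the brute-force `sat_with_bound` function stands in for a real SAT solver call (which would typically encode the bound as a cardinality constraint), soft clauses are assumed to have unit weight, and all function names are hypothetical.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means (not x3)


def satisfied(clause: Clause, assignment: Dict[int, bool]) -> bool:
    """A clause holds if at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def sat_with_bound(
    n_vars: int, hard: List[Clause], soft: List[Clause], bound: int
) -> Optional[Dict[int, bool]]:
    """Toy oracle: enumerate assignments, return one that satisfies every
    hard clause and falsifies at most `bound` soft clauses, else None."""
    for bits in product([False, True], repeat=n_vars):
        assignment = dict(enumerate(bits, start=1))
        if not all(satisfied(c, assignment) for c in hard):
            continue
        cost = sum(not satisfied(c, assignment) for c in soft)
        if cost <= bound:
            return assignment
    return None


def objective_bounding(
    n_vars: int, hard: List[Clause], soft: List[Clause]
) -> Optional[Tuple[int, Dict[int, bool]]]:
    """Linear search over the objective range: the first bound for which
    the oracle answers 'satisfiable' is the optimal cost."""
    for bound in range(len(soft) + 1):
        model = sat_with_bound(n_vars, hard, soft, bound)
        if model is not None:
            return bound, model
    return None  # the hard clauses alone are unsatisfiable
```

For instance, with the hard clause `[1, 2]` and soft clauses `[-1]` and `[-2]`, no assignment satisfies everything, so the search fails at bound 0 and succeeds at bound 1. Core-guided and IHS solvers reach the same optimum by extracting unsatisfiable cores instead of tightening a bound directly.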
Citations: 0
The TOAD System for Totally Ordered HTN Planning
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-06-13 DOI: 10.1613/jair.1.14945
Daniel Höller
We present an approach for translating Totally Ordered Hierarchical Task Network (HTN) planning problems to classical planning problems. While this enables the use of sophisticated classical planning systems to find solutions, we need to overcome the differences in expressiveness of these two planning formalisms. Prior work on this topic did this by translating bounded HTN problems. In contrast, we approximate them, i.e., we change the problem such that every action sequence that is a solution to the HTN problem is also a solution for the classical problem, but the latter might have more solutions. To obtain a sound overall approach, we verify solutions returned by the classical planning system to ensure that they are also solutions to the HTN problem. For translation and approximation, we use techniques introduced to approximate Context-Free Languages by using Finite Automata. We named our system Toad (Totally Ordered HTN Approximation using DFA). For a subset of HTN problems the translation is even possible without approximation. Whether or not it is necessary is decided based on the property of self-embedding, which comes also from the field of formal languages. We investigate the theoretical connection of self-embedding and tail-recursiveness, a property from the HTN literature used to identify a subclass of HTN planning problems that can be translated to classical planning, and show that it is more general. To guide the classical planner, we introduce a novel heuristic tailored towards our models. We evaluate Toad on the benchmark set of the 2020 International Planning Competition. Our evaluation shows that (1) most problems can be translated without approximation and that (2) Toad is competitive with the state of the art in HTN planning.
Citations: 0
Mitigating Value Hallucination in Dyna-Style Planning via Multistep Predecessor Models
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-06-09 DOI: 10.1613/jair.1.15155
Farzane Aminmansour, Taher Jafferjee, Ehsan Imani, Erin J. Talvitie, Michael Bowling, Martha White
Dyna-style reinforcement learning (RL) agents improve sample efficiency over model-free RL agents by updating the value function with simulated experience generated by an environment model. However, it is often difficult to learn accurate models of environment dynamics, and even small errors may result in failure of Dyna agents. In this paper, we highlight that one potential cause of that failure is bootstrapping off of the values of simulated states, and introduce a new Dyna algorithm to avoid this failure. We discuss a design space of Dyna algorithms, based on using successor or predecessor models (simulating forwards or backwards) and using one-step or multi-step updates. Three of the variants have been explored, but surprisingly the fourth variant has not: using predecessor models with multi-step updates. We present the Hallucinated Value Hypothesis (HVH): updating the values of real states towards values of simulated states can result in misleading action values which adversely affect the control policy. We discuss and evaluate all four variants of Dyna, amongst which three update real states toward simulated states (so potentially toward hallucinated values) and our proposed approach, which does not. The experimental results provide evidence for the HVH, and suggest that using predecessor models with multi-step updates is a fruitful direction toward developing Dyna algorithms that are more robust to model error.
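One possible reading of "predecessor models with multi-step updates" is sketched below in tabular form. It is an illustration under stated assumptions rather than the authors' algorithm: `predecessor_model` is a hypothetical callable returning a single `(previous_state, action, reward)` triple (or `None`), and the update rule, step size `alpha`, and discount `gamma` are chosen for simplicity. The point it shows is that every target bootstraps only from the value of the real state, never from the value of a simulated one.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Hashable, Optional, Tuple

Transition = Tuple[Hashable, Hashable, float]  # (previous state, action, reward)


def multistep_predecessor_update(
    q: DefaultDict[Tuple[Hashable, Hashable], float],
    v_real: float,
    state: Hashable,
    predecessor_model: Callable[[Hashable], Optional[Transition]],
    k: int,
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> DefaultDict[Tuple[Hashable, Hashable], float]:
    """Roll the model backwards up to k steps from a real state and move each
    simulated predecessor's action value toward a multi-step return."""
    ret = v_real  # the only bootstrap value: the real state's estimate
    s = state
    for _ in range(k):
        prev = predecessor_model(s)
        if prev is None:  # the model proposes no predecessor
            break
        s_prev, a_prev, reward = prev
        ret = reward + gamma * ret  # accumulate the return back to the real state
        q[(s_prev, a_prev)] += alpha * (ret - q[(s_prev, a_prev)])
        s = s_prev
    return q
```

If the model hallucinates a predecessor that does not exist, the update is wasted on a state the agent never visits, but the values of real states are not pulled toward an erroneous estimate.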
Citations: 0
Estimating Agent Skill in Continuous Action Domains
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-05-10 DOI: 10.1613/jair.1.15326
Christopher Archibald, Delma Nieves-Rivera
Actions in most real-world continuous domains cannot be executed exactly. An agent’s performance in these domains is influenced by two critical factors: the ability to select effective actions (decision-making skill), and how precisely it can execute those selected actions (execution skill). This article addresses the problem of estimating the execution and decision-making skill of an agent, given observations. Several execution skill estimation methods are presented, each of which utilizes different information from the observations and makes assumptions about the agent’s decision-making ability. A final novel method forgoes these assumptions about decision-making and instead estimates the execution and decision-making skills simultaneously under a single Bayesian framework. Experimental results in several domains evaluate the estimation accuracy of the estimators, especially focusing on how robust they are as agents and their decision-making methods are varied. These results demonstrate that reasoning about both types of skill together significantly improves the robustness and accuracy of execution skill estimation. A case study is presented using the proposed methods to estimate the skill of Major League Baseball pitchers, demonstrating how these methods can be applied to real-world data sources.
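A simple Bayesian execution-skill estimator can be sketched as follows. This is not one of the article's estimators: it assumes a one-dimensional action space, zero-mean Gaussian execution noise, a uniform prior over a small grid of candidate noise levels, and, importantly, that the intended target is known, which is exactly the kind of decision-making assumption the abstract says the simpler methods rely on. The function name and parameters are hypothetical.

```python
import math
from typing import List, Sequence


def sigma_posterior(
    observed: Sequence[float], target: float, sigmas: Sequence[float]
) -> List[float]:
    """Posterior probability of each candidate execution-noise level sigma,
    given executed outcomes and the (assumed known) intended target."""
    log_post = []
    for sigma in sigmas:
        # Gaussian log-likelihood up to a constant shared by all candidates.
        ll = sum(
            -0.5 * ((x - target) / sigma) ** 2 - math.log(sigma) for x in observed
        )
        log_post.append(ll)  # uniform prior adds the same constant everywhere
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]
```

A smaller sigma corresponds to higher execution skill. Estimating decision-making skill jointly would require replacing the fixed `target` with a model of how the agent chooses its targets.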
Citations: 0
USN: A Robust Imitation Learning Method against Diverse Action Noise
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-04-21 DOI: 10.1613/jair.1.15819
Xingrui Yu, Bo Han, I. Tsang
Learning from imperfect demonstrations is a crucial challenge in imitation learning (IL). Unlike existing works that still rely on the enormous effort of expert demonstrators, we consider a more cost-effective option for obtaining a large number of demonstrations. That is, hire annotators to label actions for existing image records in realistic scenarios. However, action noise can occur when annotators are not domain experts or encounter confusing states. In this work, we introduce two particular forms of action noise, i.e., state-independent and state-dependent action noise. Previous IL methods fail to achieve expert-level performance when the demonstrations contain action noise, especially the state-dependent action noise. To mitigate the harmful effects of action noise, we propose a robust learning paradigm called USN (Uncertainty-aware Sample-selection with Negative learning). The model first estimates the predictive uncertainty for all demonstration data and then selects samples with high loss based on the uncertainty measures. Finally, it updates the model parameters with additional negative learning on the selected samples. Empirical results in Box2D tasks and Atari games show that USN consistently improves the final rewards of behavioral cloning, online imitation learning, and offline imitation learning methods under various action noises. The ratio of significant improvements is up to 94.44%. Moreover, our method scales to conditional imitation learning with real-world noisy commands in urban driving.
Citations: 0
A Map of Diverse Synthetic Stable Matching Instances
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-04-04 DOI: 10.1613/jair.1.15213
Niclas Boehmer, Klaus Heeger, Stanislaw Szufa
Focusing on Stable Roommates (SR), we contribute to the toolbox for conducting experiments on stable matching problems. We introduce the polynomial-time computable mutual attraction distance to measure the similarity of SR instances, analyze its properties, and use it to create a map of SR instances. This map visualizes 460 synthetic SR instances (each sampled from one of ten different statistical cultures) as follows: each instance is a point in the plane, and two points are close on the map if the corresponding SR instances are similar to each other with respect to our mutual attraction distance. Subsequently, we conduct several illustrative experiments and depict their results on the map, illustrating the map’s usefulness as a non-aggregate visualization tool, the diversity of our generated dataset, and the need to use instances sampled from different statistical cultures. Lastly, we extend our approach to the bipartite Stable Marriage problem.
Citations: 0
DIGCN: A Dynamic Interaction Graph Convolutional Network Based on Learnable Proposals for Object Detection
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-04-04 DOI: 10.1613/jair.1.15698
Pingping Cao, Yanping Zhu, Yuhao Jin, Benkun Ruan, Qiang Niu
We propose the Dynamic Interaction Graph Convolutional Network (DIGCN), an image object detection method based on learnable proposals and graph convolutional networks (GCNs). Existing object detection methods usually work on dense candidates, which produces redundant and near-duplicate results. Non-maximum suppression post-processing is then required to eliminate these negative effects, which increases the computational complexity. Although the existing sparse detector avoids cumbersome post-processing operations, it ignores the potential relationship between objects and proposals, which hinders improvements in detection accuracy. Therefore, we propose a dynamic interaction GCN module in DIGCN, which performs dynamic interaction and relational modeling on the proposal boxes and proposal features to improve object detection accuracy. In addition, we introduce a learnable proposal method with a sparse set of learned object proposals to eliminate a huge number of hand-designed object candidates, avoiding complicated tasks such as object candidate design and many-to-one label assignment, and reducing the complexity of the object detection model to a certain extent. DIGCN demonstrates accuracy and run-time performance on par with well-established and highly optimized detector baselines on the challenging COCO dataset; e.g., with ResNet-101FPN as the backbone, our method attains an accuracy of 46.5 AP while processing 13 frames per second. Our work provides a new method for object detection research.
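For readers unfamiliar with the post-processing step this abstract refers to, the following is a minimal sketch of standard greedy non-maximum suppression over axis-aligned boxes. It is not taken from the DIGCN paper (which aims to avoid this step); the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are assumptions made purely for illustration.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: the first two boxes are near-duplicates.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Each candidate is compared against every kept box, so the cost grows with the number of dense candidates, which is the overhead that sparse, proposal-based detectors try to sidestep.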
Citations: 0
Learning to Resolve Social Dilemmas: A Survey
IF 5 Zone 3 (Computer Science) Q2 Computer Science Pub Date: 2024-03-13 DOI: 10.1613/jair.1.15167
Shaheen Fatima, Nicholas R. Jennings, Michael Wooldridge
Social dilemmas are situations of inter-dependent decision making in which individual rationality can lead to outcomes with poor social qualities. The ubiquity of social dilemmas in social, biological, and computational systems has generated substantial research across these diverse disciplines into the study of mechanisms for avoiding deficient outcomes by promoting and maintaining mutual cooperation. Much of this research is focused on studying how individuals faced with a dilemma can learn to cooperate by adapting their behaviours according to their past experience. In particular, three types of learning approaches have been studied: evolutionary game-theoretic learning, reinforcement learning, and best-response learning. This article is a comprehensive integrated survey of these learning approaches in the context of dilemma games. We formally introduce dilemma games and their inherent challenges. We then outline the three learning approaches and, for each approach, provide a survey of the solutions proposed for dilemma resolution. Finally, we provide a comparative summary and discuss directions in which further research is needed.
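As a concrete illustration of the opening claim, the sketch below encodes a two-player Prisoner's Dilemma and checks that the only pure-strategy equilibrium is mutual defection, even though mutual cooperation yields higher total payoff. The payoff numbers and all names (`PAYOFF`, `best_responses`, `nash_equilibria`, `social_welfare`) are hypothetical choices for this example and do not come from the survey.

```python
from itertools import product

C, D = "C", "D"  # cooperate, defect
ACTIONS = (C, D)

# Hypothetical payoffs as (row player, column player).
PAYOFF = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}


def best_responses(opponent_action, player):
    """Actions that maximise `player`'s payoff against a fixed opponent action."""
    def payoff(action):
        profile = (action, opponent_action) if player == 0 else (opponent_action, action)
        return PAYOFF[profile][player]

    best = max(payoff(a) for a in ACTIONS)
    return {a for a in ACTIONS if payoff(a) == best}


def nash_equilibria():
    """Pure-strategy profiles in which each action is a best response to the other."""
    return [
        (a, b)
        for a, b in product(ACTIONS, ACTIONS)
        if a in best_responses(b, 0) and b in best_responses(a, 1)
    ]


def social_welfare(profile):
    """Sum of both players' payoffs for a joint action profile."""
    return sum(PAYOFF[profile])


equilibria = nash_equilibria()
welfare_optimum = max(PAYOFF, key=social_welfare)
```

With these payoffs, defecting is each player's best response to either action of the other, so individually rational play settles on the profile with the lowest total payoff.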
Citations: 0