arXiv - CS - Artificial Intelligence: Latest Papers

Traffic expertise meets residual RL: Knowledge-informed model-based residual reinforcement learning for CAV trajectory control
Pub Date : 2024-08-30 DOI: arxiv-2408.17380
Zihao Sheng, Zilin Huang, Sikai Chen
Model-based reinforcement learning (RL) is anticipated to exhibit higher sample efficiency compared to model-free RL by utilizing a virtual environment model. However, it is challenging to obtain sufficiently accurate representations of the environmental dynamics due to uncertainties in complex systems and environments. An inaccurate environment model may degrade the sample efficiency and performance of model-based RL. Furthermore, while model-based RL can improve sample efficiency, it often still requires substantial training time to learn from scratch, potentially limiting its advantages over model-free approaches. To address these challenges, this paper introduces a knowledge-informed model-based residual reinforcement learning framework aimed at enhancing learning efficiency by infusing established expert knowledge into the learning process and avoiding the issue of beginning from zero. Our approach integrates traffic expert knowledge into a virtual environment model, employing the Intelligent Driver Model (IDM) for basic dynamics and neural networks for residual dynamics, thus ensuring adaptability to complex scenarios. We propose a novel strategy that combines traditional control methods with residual RL, facilitating efficient learning and policy optimization without the need to learn from scratch. The proposed approach is applied to CAV trajectory control tasks for the dissipation of stop-and-go waves in mixed traffic flow. Experimental results demonstrate that our proposed approach enables the CAV agent to achieve superior performance in trajectory control compared to the baseline agents in terms of sample efficiency, traffic flow smoothness and traffic mobility. The source code and supplementary materials are available at https://github.com/zihaosheng/traffic-expertise-RL/.
Citations: 0
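The abstract of the entry above describes a virtual environment model that uses the Intelligent Driver Model (IDM) for basic dynamics and a learned term for residual dynamics. A minimal sketch of that decomposition follows. The IDM formula is the standard car-following one; the parameter values, the function names, and the `residual` callable (standing in for a trained neural network) are illustrative assumptions and are not taken from the paper or its repository.

```python
import math
from typing import Callable

# Signature of a residual term: (speed, gap, approach_rate) -> acceleration correction.
Residual = Callable[[float, float, float], float]


def idm_acceleration(v: float, gap: float, dv: float,
                     v0: float = 30.0, T: float = 1.5, a_max: float = 1.0,
                     b: float = 1.5, s0: float = 2.0, delta: float = 4.0) -> float:
    """Standard IDM acceleration (m/s^2).

    v   : follower speed (m/s)
    gap : bumper-to-bumper distance to the leader (m), must be > 0
    dv  : approach rate, v - v_leader (m/s)
    The default parameter values are illustrative assumptions.
    """
    # Desired dynamic gap s* = s0 + v*T + v*dv / (2*sqrt(a_max*b))
    s_star = s0 + v * T + v * dv / (2.0 * math.sqrt(a_max * b))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def model_acceleration(v: float, gap: float, dv: float, residual: Residual) -> float:
    """Knowledge-based prediction plus a learned correction (hypothetical helper)."""
    return idm_acceleration(v, gap, dv) + residual(v, gap, dv)


def zero_residual(v: float, gap: float, dv: float) -> float:
    """Placeholder for an untrained residual network: no correction."""
    return 0.0
```

With `zero_residual` the model reduces to plain IDM, which is one way to read "avoiding the issue of beginning from zero": the learned part only has to capture what IDM gets wrong.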
Unleashing Artificial Cognition: Integrating Multiple AI Systems
Pub Date : 2024-08-09 DOI: arxiv-2408.04910
Muntasir Adnan, Buddhi Gamage, Zhiwei Xu, Damith Herath, Carlos Noschang Kuhn
In this study, we present an innovative fusion of language models and query analysis techniques to unlock cognition in artificial intelligence. Our system seamlessly integrates a Chess engine with a language model, enabling it to predict moves and provide strategic explanations. Leveraging a vector database through retrievable answer generation, our OpenSI AI system elucidates its decision-making process, bridging the gap between raw computation and human-like understanding. Our choice of Chess as the demonstration environment underscores the versatility of our approach. Beyond Chess, our system holds promise for diverse applications, from medical diagnostics to financial forecasting.
Citations: 0
Axiomatic Characterisations of Sample-based Explainers
Pub Date : 2024-08-09 DOI: arxiv-2408.04903
Leila Amgouda, Martin C. Cooper, Salim Debbaoui
Explaining decisions of black-box classifiers is both important and computationally challenging. In this paper, we scrutinize explainers that generate feature-based explanations from samples or datasets. We start by presenting a set of desirable properties that explainers would ideally satisfy, delve into their relationships, and highlight incompatibilities of some of them. We identify the entire family of explainers that satisfy two key properties which are compatible with all the others. Its instances provide sufficient reasons, called weak abductive explanations. We then unravel its various subfamilies that satisfy subsets of compatible properties. Indeed, we fully characterize all the explainers that satisfy any subset of compatible properties. In particular, we introduce the first (broad family of) explainers that guarantee the existence of explanations and their global consistency. We discuss some of its instances including the irrefutable explainer and the surrogate explainer whose explanations can be found in polynomial time.
Citations: 0
MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents
Pub Date : 2024-08-08 DOI: arxiv-2408.04203
Yanqi Dai, Huanran Hu, Lei Wang, Shengjie Jin, Xu Chen, Zhiwu Lu
Recently, Role-Playing Agents (RPAs) have garnered increasing attention for their potential to deliver emotional value and facilitate sociological research. However, existing studies are primarily confined to the textual modality, unable to simulate humans' multimodal perceptual capabilities. To bridge this gap, we introduce the concept of Multimodal Role-Playing Agents (MRPAs), and propose a comprehensive framework, MMRole, for their development and evaluation, which comprises a personalized multimodal dataset and a robust evaluation method. Specifically, we construct a large-scale, high-quality dataset, MMRole-Data, consisting of 85 characters, 11K images, and 14K single or multi-turn dialogues. Additionally, we present a robust evaluation method, MMRole-Eval, encompassing eight metrics across three dimensions, where a reward model is trained to score MRPAs with the constructed ground-truth data for comparison. Moreover, we develop the first specialized MRPA, MMRole-Agent. Extensive evaluation results demonstrate the improved performance of MMRole-Agent and highlight the primary challenges in developing MRPAs, emphasizing the need for enhanced multimodal understanding and role-playing consistency. The data, code, and models will be available at https://github.com/YanqiDai/MMRole.
Citations: 0
Reasoning about Study Regulations in Answer Set Programming
Pub Date : 2024-08-08 DOI: arxiv-2408.04528
Susana Hahn, Cedric Martens, Amade Nemes, Henry Otunuya, Javier Romero, Torsten Schaub, Sebastian Schellhorn
We are interested in automating reasoning with and about study regulations, catering to various stakeholders, ranging from administrators, over faculty, to students at different stages. Our work builds on an extensive analysis of various study programs at the University of Potsdam. The conceptualization of the underlying principles provides us with a formal account of study regulations. In particular, the formalization reveals the properties of admissible study plans. With these at end, we propose an encoding of study regulations in Answer Set Programming that produces corresponding study plans. Finally, we show how this approach can be extended to a generic user interface for exploring study plans.
Citations: 0
KnowPC: Knowledge-Driven Programmatic Reinforcement Learning for Zero-shot Coordination
Pub Date : 2024-08-08 DOI: arxiv-2408.04336
Yin Gu, Qi Liu, Zhi Li, Kai Zhang
Zero-shot coordination (ZSC) remains a major challenge in the cooperative AI field, which aims to learn an agent to cooperate with an unseen partner in training environments or even novel environments. In recent years, a popular ZSC solution paradigm has been deep reinforcement learning (DRL) combined with advanced self-play or population-based methods to enhance the neural policy's ability to handle unseen partners. Despite some success, these approaches usually rely on black-box neural networks as the policy function. However, neural networks typically lack interpretability and logic, making the learned policies difficult for partners (e.g., humans) to understand and limiting their generalization ability. These shortcomings hinder the application of reinforcement learning methods in diverse cooperative scenarios. We suggest to represent the agent's policy with an interpretable program. Unlike neural networks, programs contain stable logic, but they are non-differentiable and difficult to optimize. To automatically learn such programs, we introduce Knowledge-driven Programmatic reinforcement learning for zero-shot Coordination (KnowPC). We first define a foundational Domain-Specific Language (DSL), including program structures, conditional primitives, and action primitives. A significant challenge is the vast program search space, making it difficult to find high-performing programs efficiently. To address this, KnowPC integrates an extractor and an reasoner. The extractor discovers environmental transition knowledge from multi-agent interaction trajectories, while the reasoner deduces the preconditions of each action primitive based on the transition knowledge.
Citations: 0
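The KnowPC abstract above describes representing an agent's policy as an interpretable program built from conditional primitives and action primitives. The sketch below shows one way such a programmatic policy could look: an ordered list of condition/action rules evaluated top to bottom. Every name here (`Rule`, `run_program`, the state keys, and the action strings) is a hypothetical illustration; the actual DSL, extractor, and reasoner are not specified in the abstract and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A toy symbolic state: feature name -> truth value (hypothetical representation).
State = Dict[str, bool]


@dataclass(frozen=True)
class Rule:
    """One 'if condition then action' statement of the program."""
    condition: Callable[[State], bool]  # conditional primitive (or a combination)
    action: str                         # action primitive, named by a string here


def run_program(program: List[Rule], state: State, default: str = "wait") -> str:
    """Return the action of the first rule whose condition holds, else the default."""
    for rule in program:
        if rule.condition(state):
            return rule.action
    return default


# Hypothetical conditional primitives.
def holding_item(s: State) -> bool:
    return s.get("holding_item", False)


def item_available(s: State) -> bool:
    return s.get("item_available", False)


# A tiny hand-written program; in the paper's setting such programs are learned.
program = [
    Rule(lambda s: not holding_item(s) and item_available(s), "pick_up_item"),
    Rule(holding_item, "deliver_item"),
]
```

Because the policy is a short list of readable rules, a partner can inspect exactly why an action was chosen, which is the interpretability argument the abstract makes against black-box neural policies.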
Perceive, Reflect, and Plan: Designing LLM Agent for Goal-Directed City Navigation without Instructions
Pub Date : 2024-08-08 DOI: arxiv-2408.04168
Qingbin Zeng, Qinglong Yang, Shunan Dong, Heming Du, Liang Zheng, Fengli Xu, Yong Li
This paper considers a scenario in city navigation: an AI agent is provided with language descriptions of the goal location with respect to some well-known landmarks; By only observing the scene around, including recognizing landmarks and road network connections, the agent has to make decisions to navigate to the goal location without instructions. This problem is very challenging, because it requires agent to establish self-position and acquire spatial representation of complex urban environment, where landmarks are often invisible. In the absence of navigation instructions, such abilities are vital for the agent to make high-quality decisions in long-range city navigation. With the emergent reasoning ability of large language models (LLMs), a tempting baseline is to prompt LLMs to "react" on each observation and make decisions accordingly. However, this baseline has very poor performance that the agent often repeatedly visits same locations and make short-sighted, inconsistent decisions. To address these issues, this paper introduces a novel agentic workflow featured by its abilities to perceive, reflect and plan. Specifically, we find LLaVA-7B can be fine-tuned to perceive the direction and distance of landmarks with sufficient accuracy for city navigation. Moreover, reflection is achieved through a memory mechanism, where past experiences are stored and can be retrieved with current perception for effective decision argumentation. Planning uses reflection results to produce long-term plans, which can avoid short-sighted decisions in long-range navigation. We show the designed workflow significantly improves navigation ability of the LLM agent compared with the state-of-the-art baselines.
Citations: 0
RiskAwareBench: Towards Evaluating Physical Risk Awareness for High-level Planning of LLM-based Embodied Agents
Pub Date : 2024-08-08 DOI: arxiv-2408.04449
Zihao Zhu, Bingzhe Wu, Zhengyou Zhang, Baoyuan Wu
The integration of large language models (LLMs) into robotics significantly enhances the capabilities of embodied agents in understanding and executing complex natural language instructions. However, the unmitigated deployment of LLM-based embodied systems in real-world environments may pose potential physical risks, such as property damage and personal injury. Existing security benchmarks for LLMs overlook risk awareness for LLM-based embodied agents. To address this gap, we propose RiskAwareBench, an automated framework designed to assess physical risks awareness in LLM-based embodied agents. RiskAwareBench consists of four modules: safety tips generation, risky scene generation, plan generation, and evaluation, enabling comprehensive risk assessment with minimal manual intervention. Utilizing this framework, we compile the PhysicalRisk dataset, encompassing diverse scenarios with associated safety tips, observations, and instructions. Extensive experiments reveal that most LLMs exhibit insufficient physical risk awareness, and baseline risk mitigation strategies yield limited enhancement, which emphasizes the urgency and cruciality of improving risk awareness in LLM-based embodied agents in the future.
Citations: 0
Digital Avatars: Framework Development and Their Evaluation
Pub Date : 2024-08-07 DOI: arxiv-2408.04068
Timothy Rupprecht, Sung-En Chang, Yushu Wu, Lei Lu, Enfu Nan, Chih-hsiang Li, Caiyue Lai, Zhimin Li, Zhijun Hu, Yumei He, David Kaeli, Yanzhi Wang
We present a novel prompting strategy for artificial intelligence driven digital avatars. To better quantify how our prompting strategy affects anthropomorphic features like humor, authenticity, and favorability we present Crowd Vote - an adaptation of Crowd Score that allows for judges to elect a large language model (LLM) candidate over competitors answering the same or similar prompts. To visualize the responses of our LLM, and the effectiveness of our prompting strategy we propose an end-to-end framework for creating high-fidelity artificial intelligence (AI) driven digital avatars. This pipeline effectively captures an individual's essence for interaction and our streaming algorithm delivers a high-quality digital avatar with real-time audio-video streaming from server to mobile device. Both our visualization tool, and our Crowd Vote metrics demonstrate our AI driven digital avatars have state-of-the-art humor, authenticity, and favorability outperforming all competitors and baselines. In the case of our Donald Trump and Joe Biden avatars, their authenticity and favorability are rated higher than even their real-world equivalents.
Citations: 0
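The Digital Avatars abstract describes Crowd Vote only as judges electing one LLM candidate over its competitors; it does not state how the ballots are tallied. The sketch below assumes a simple plurality count and is not the authors' implementation. The function name `crowd_vote` and the candidate labels are hypothetical.

```python
from collections import Counter


def crowd_vote(ballots: list[str]) -> dict[str, float]:
    """Return each candidate's share of the ballots cast by the judges.

    Assumption: one ballot per judge, plurality counting. The paper's
    actual tallying rule is not given in the abstract.
    """
    counts = Counter(ballots)
    total = sum(counts.values())
    if total == 0:
        return {}  # no judges voted, so there are no shares to report
    return {candidate: n / total for candidate, n in counts.items()}


# Hypothetical ballots: each entry is the candidate one judge elected.
ballots = ["avatar", "baseline_a", "avatar", "avatar", "baseline_b"]
shares = crowd_vote(ballots)
winner = max(shares, key=shares.get)
print(winner, shares)
```

Reporting vote shares rather than only the winner keeps the margin visible, which matters when candidates answer similar but not identical prompts.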
Frank's triangular norms in Piaget's logical proportions
Pub Date : 2024-08-07 DOI: arxiv-2408.03795
Henri Prade, Gilles Richard
Starting from the Boolean notion of logical proportion in Piaget's sense, which turns out to be equivalent to analogical proportion, this note proposes a definition of analogical proportion between numerical values based on triangular norms (and dual co-norms). Frank's family of triangular norms is particularly interesting from this perspective. The article concludes with a comparative discussion with another very recent proposal for defining analogical proportions between numerical values based on the family of generalized means.
Citations: 0
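For readers unfamiliar with the Frank family named in the Prade and Richard abstract, the sketch below evaluates the standard textbook form of the Frank t-norm and its dual co-norm. It does not reproduce the note's definition of analogical proportion, which the abstract does not spell out. The function names are illustrative, and the parameter is written `lam`.

```python
import math


def frank_tnorm(x: float, y: float, lam: float) -> float:
    """Frank t-norm T_lam(x, y) for x, y in [0, 1] and lam in [0, inf]."""
    if lam == 0.0:
        return min(x, y)               # limit lam -> 0: minimum
    if lam == 1.0:
        return x * y                   # limit lam -> 1: product
    if math.isinf(lam):
        return max(0.0, x + y - 1.0)   # limit lam -> inf: Lukasiewicz
    # General case: log base lam of 1 + (lam^x - 1)(lam^y - 1) / (lam - 1).
    return math.log(1.0 + (lam**x - 1.0) * (lam**y - 1.0) / (lam - 1.0), lam)


def frank_conorm(x: float, y: float, lam: float) -> float:
    """Dual co-norm via De Morgan duality with the negation 1 - x."""
    return 1.0 - frank_tnorm(1.0 - x, 1.0 - y, lam)


x, y, lam = 0.3, 0.8, 2.0
t = frank_tnorm(x, y, lam)
s = frank_conorm(x, y, lam)
print(t, s, t + s)  # t + s should equal x + y up to rounding
```

The final line checks the identity T(x, y) + S(x, y) = x + y, a known characteristic property of the Frank family and a plausible reason it is singled out when Boolean definitions are lifted to numerical values.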