Proceedings of the AAAI Symposium Series: Latest Articles

Federated Learning of Things - Expanding the Heterogeneity in Federated Learning
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31221
Scott Kuzdeba
The Internet of Things (IoT) has revolutionized how our devices are networked, connecting multiple aspects of our lives from smart homes and wearables to smart cities and warehouses. IoT’s strength comes from the ever-expanding, diverse, heterogeneous sensors, applications, and concepts that are all centered around the core concept of collecting and sharing data from sensors. Simultaneously, deep learning has changed how our systems operate, allowing them to learn from data and change the way we interface with the world. Federated learning brings these two paradigm shifts together, leveraging the data (securely) from the IoT to train deep learning architectures for performant edge applications. However, today’s federated learning has not yet benefited from the scale of diversity that the IoT and deep learning sensors and applications provide. This talk explores how we can better tap into the heterogeneity that surrounds the potential of federated learning and use it to build better models. This includes the heterogeneity from device hardware to training paradigms (supervised, unsupervised, reinforcement, self-supervised).
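For readers unfamiliar with how federated learning combines models trained on separate devices, a minimal sketch of one aggregation step follows. It assumes the common sample-weighted averaging scheme (often called FedAvg), which the abstract does not name; the function `federated_average`, the three devices, and their sample counts are all hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighting each by its sample count.

    Assumption: every client reports a flat list of floats of equal length.
    Raw data never leaves the device; only these parameters are shared.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Three hypothetical IoT devices holding different amounts of local data.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [10, 30, 60]
global_weights = federated_average(clients, sizes)
print(global_weights)
```

With these made-up numbers the device holding 60 samples dominates the result, giving `[4.0, 5.0]`. Heterogeneous hardware or training paradigms, the subject of the talk, would need more than this plain average.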
Citations: 0
Comparing Human Behavior to an Optimal Policy for Innovation
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31291
Bonan Zhao, Natalia Vélez, Thomas L. Griffiths
Human learning does not stop at solving a single problem. Instead, we seek new challenges, define new goals, and come up with new ideas. Unlike the classic explore-exploit trade-off between known and unknown options, making new tools or generating new ideas is not about collecting data from existing unknown options, but rather about creating new options out of what is currently available. We introduce a discovery game designed to study how rational agents make decisions about pursuing innovations, where discovering new ideas is a process of combining existing ideas in an open-ended compositional space. We derive optimal policies for this decision problem, formalized as a Markov decision process, and compare people's behavior to the model predictions in an online behavioral experiment. We found evidence that people both innovate rationally, guided by potential returns in this discovery game, and under- and over-explore systematically in different settings.
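To make "optimal policy of a Markov decision process" concrete, the sketch below runs standard value iteration on a toy two-state problem. This is not the authors' discovery game: the states, rewards, transition probabilities, and discount factor are hypothetical, chosen only to show how an innovate-or-exploit decision can be solved exactly.

```python
def q_value(state, action, values, transition, reward, gamma):
    """Expected return of taking `action` in `state`, then following `values`."""
    future = sum(prob * values[nxt] for prob, nxt in transition[state][action])
    return reward[state][action] + gamma * future


def value_iteration(states, actions, transition, reward, gamma=0.9, tol=1e-8):
    """Return the optimal state values and a greedy policy for a finite MDP."""
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(q_value(s, a, values, transition, reward, gamma) for a in actions)
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(s, a, values, transition, reward, gamma))
        for s in states
    }
    return values, policy


# Hypothetical toy problem: innovating pays nothing now but may unlock
# a better tool; exploiting collects the current tool's reward.
states = ["basic", "advanced"]
actions = ["exploit", "innovate"]
transition = {
    "basic": {
        "exploit": [(1.0, "basic")],
        "innovate": [(0.5, "advanced"), (0.5, "basic")],
    },
    "advanced": {
        "exploit": [(1.0, "advanced")],
        "innovate": [(1.0, "advanced")],
    },
}
reward = {
    "basic": {"exploit": 1.0, "innovate": 0.0},
    "advanced": {"exploit": 3.0, "innovate": 0.0},
}
values, policy = value_iteration(states, actions, transition, reward)
print(policy)
```

Under these assumed numbers the optimal policy is to innovate while holding the basic tool and to exploit once the advanced tool is available. Human choices could then be compared against such a policy state by state.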
Citations: 0
K-PERM: Personalized Response Generation Using Dynamic Knowledge Retrieval and Persona-Adaptive Queries
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31203
Kanak Raj, Kaushik Roy, Vamshi Bonagiri, Priyanshul Govil, K. Thirunarayan, Raxit Goswami, Manas Gaur
Personalizing conversational agents can enhance the quality of conversations and increase user engagement. However, such agents often lack the external knowledge needed to attend appropriately to a user’s persona. This is crucial for practical applications like mental health support, nutrition planning, culturally sensitive conversations, or reducing toxic behavior in conversational agents. To enhance the relevance and comprehensiveness of personalized responses, we propose a two-step approach that involves (1) selectively integrating user personas and (2) contextualizing the response by supplementing information from a background knowledge source. We develop K-PERM (Knowledge-guided PErsonalization with Reward Modulation), a dynamic conversational agent that combines these elements. K-PERM achieves state-of-the-art performance on the popular FoCus dataset, which contains real-world personalized conversations concerning global landmarks. We show that using responses from K-PERM can improve performance in state-of-the-art LLMs (GPT 3.5) by 10.5%, highlighting the impact of K-PERM for personalizing chatbots.
Citations: 0
Advancing Neuro-Inspired Lifelong Learning for Edge with Co-Design
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31226
Nicholas Soures, Vedant Karia, D. Kudithipudi
Lifelong learning, which refers to an agent's ability to continuously learn and enhance its performance over its lifespan, is a significant challenge in artificial intelligence (AI), one that biological systems tackle efficiently. This challenge is further exacerbated when AI is deployed in untethered environments with strict energy and latency constraints. We take inspiration from neural plasticity and investigate how to leverage it to build energy-efficient lifelong learning machines. Specifically, we study how a combination of neural plasticity mechanisms, namely neuromodulation, synaptic consolidation, and metaplasticity, enhances the continual learning capabilities of AI models. We further co-design architectures that leverage compute-in-memory topologies and sparse spike-based communication with quantization for the edge. Aspects of this co-design can be transferred to federated lifelong learning scenarios.
Citations: 0
Constructing Deep Concepts through Shallow Search
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31292
Bonan Zhao, Christopher G Lucas, Neil R. Bramley
We propose bootstrap learning as a computational account of why human learning is modular and incremental, and identify key components of bootstrap learning that allow artificial systems to learn more like people. Originating in developmental psychology, bootstrap learning refers to people's ability to extend and repurpose existing knowledge to create new and more powerful ideas. We view bootstrap learning as a solution to how cognitively bounded reasoners grasp complex environmental dynamics that are far beyond their initial capacity, by searching ‘locally’ and recursively to extend their existing knowledge. Drawing on techniques from Bayesian library learning and resource-rational analysis, we propose a computational modeling framework that achieves human-like bootstrap learning performance in inductive conceptual inference. In addition, we present modeling and behavioral evidence that highlights the double-edged sword of bootstrap learning: people processing the same information in different batch orders can arrive at drastically different causal conclusions and generalizations, as a result of the different sub-concepts they construct in earlier stages of learning.
Citations: 0
A Generative AI-Based Virtual Physician Assistant
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31182
Geoffrey W. Rutledge, Alexander Sivura
We describe "Dr. A.I.", a virtual physician assistant that uses generative AI to conduct a pre-visit patient interview and to create a draft clinical note for the physician. We document the effectiveness of Dr. A.I. by measuring the concordance of the actual diagnosis made by the doctor with the generated differential diagnosis (DDx) list. This application demonstrates the practical healthcare capabilities of a large language model to improve the efficiency of doctor visits while also addressing safety concerns about the use of generative AI in the workflow of patient care.
Citations: 0
What Is a Correct Output by Generative AI From the Viewpoint of Well-Being? – Perspective From Sleep Stage Estimation –
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31250
K. Takadama
This paper explores an answer to the question “what is a correct output by generative AI from the viewpoint of well-being?” and discusses the effectiveness of taking biological rhythm into account for this issue. Concretely, this paper focuses on estimating the REM sleep stage, one of the sleep stages, and compares estimates based on random forest, a machine learning method, with estimates based on the ultradian rhythm, a biological rhythm. The human subject experiment revealed the following implications: (1) the REM sleep stage is wrongly estimated in many areas by random forest; and (2) integrating the REM sleep stage estimation based on the biological rhythm with that based on random forest improves the F-score of the estimated REM sleep stage.
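Since the result is reported as an F-score for the REM class, a minimal sketch of that metric may help. The epoch labels below are invented for illustration and are not data from the study; `f_score` is a hypothetical helper that treats REM as the positive class.

```python
def f_score(true_labels, predicted_labels, positive="REM", beta=1.0):
    """F-beta score for one class; beta=1 gives the usual F1 score."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no correct REM epochs, so precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical per-epoch labels: ground truth versus one estimator's output.
truth = ["REM", "REM", "NREM", "NREM", "REM", "NREM"]
estimate = ["REM", "NREM", "REM", "NREM", "REM", "NREM"]
score = f_score(truth, estimate)
print(round(score, 3))
```

Here two REM epochs are found, one is missed, and one is a false alarm, so precision and recall are both 2/3 and the F1 score is 2/3. An integrated estimator would be judged better if it raised this number on the same epochs.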
Citations: 0
Evaluating Large Language Models with RAG Capability: A Perspective from Robot Behavior Planning and Execution
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31254
Jin Yamanaka, Takashi Kido
Since the strong performance of Large Language Models (LLMs) became evident, their capabilities have been rapidly expanded with techniques such as Retrieval Augmented Generation (RAG). Given their broad applicability and fast development, it is crucial to consider their impact on social systems. However, assessing these advanced LLMs is challenging because of their extensive capabilities and the complex nature of social systems. In this study, we focus on the similarity between LLMs in social systems and humanoid robots in open environments. We enumerate the essential components required for controlling humanoids in problem solving, which helps us explore the core capabilities of LLMs and assess the effects of any deficiencies within these components. This approach is justified because the effectiveness of humanoid systems has been thoroughly proven and acknowledged. To identify the components humanoids need in problem-solving tasks, we create an extensive component framework for planning and controlling humanoid robots in an open environment. We then assess the impacts and risks of LLMs for each component, referencing the latest benchmarks to evaluate their current strengths and weaknesses. Following the assessment guided by our framework, we identify certain capabilities that LLMs lack, along with concerns for social systems.
Citations: 0
Enhancing Knowledge Graph Consistency through Open Large Language Models: A Case Study
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31201
Ankur Padia, Francis Ferraro, Tim Finin
High-quality knowledge graphs (KGs) play a crucial role in many applications. However, KGs created by automated information extraction systems can suffer from erroneous extractions or be inconsistent with their provenance/source text. It is important to identify and correct such problems. In this paper, we study leveraging the emergent reasoning capabilities of large language models (LLMs) to detect inconsistencies between extracted facts and their provenance. With a focus on “open” LLMs that can be run and trained locally, we find that few-shot approaches can yield an absolute performance gain of 2.5-3.4% over the state-of-the-art method with only 9% of the training data. We examine the effect of LLM architecture and show that Decoder-Only models underperform Encoder-Decoder approaches. We also explore how model size impacts performance and, counterintuitively, find that larger models do not result in consistent performance gains. Our detailed analyses suggest that while LLMs can improve KG consistency, different LLMs learn different aspects of KG consistency and are sensitive to the number of entities involved.
Citations: 0
Teaching Functions with Gaussian Process Regression
Pub Date : 2024-05-20 DOI: 10.1609/aaaiss.v3i1.31277
Maya Malaviya, Mark K. Ho
Humans are remarkably adaptive instructors who adjust advice based on their estimations about a learner’s prior knowledge and current goals. Many topics that people teach, like goal-directed behaviors, causal systems, categorization, and time-series patterns, have an underlying commonality: they map inputs to outputs through an unknown function. This project builds upon a Gaussian process (GP) regression model that describes learner behavior as they search the hypothesis space of possible underlying functions to find the one that best fits their current data. We extend this work by implementing a teacher model that reasons about a learner’s GP regression in order to provide specific information that will help them form an accurate estimation of the function.
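As a rough illustration of the learner side, the sketch below computes a Gaussian process posterior at one query point from a handful of observations. It assumes a squared-exponential kernel with unit length scale and a small noise term; the abstract specifies none of these, and the teacher model is not sketched here. All function names and data points are hypothetical.

```python
import math


def rbf_kernel(x1, x2, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_posterior(train_x, train_y, query_x, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single query point."""
    k_train = [
        [rbf_kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)
    ]
    k_query = [rbf_kernel(query_x, a) for a in train_x]
    alpha = solve(k_train, train_y)  # (K + noise * I)^-1 y
    mean = sum(k * a for k, a in zip(k_query, alpha))
    v = solve(k_train, k_query)  # (K + noise * I)^-1 k_query
    variance = rbf_kernel(query_x, query_x) - sum(k * w for k, w in zip(k_query, v))
    return mean, max(variance, 0.0)


# Hypothetical observations the learner has seen so far.
train_x = [0.0, 1.0, 2.0]
train_y = [0.0, 1.0, 0.0]
seen_mean, seen_var = gp_posterior(train_x, train_y, 1.0)
far_mean, far_var = gp_posterior(train_x, train_y, 10.0)
print(seen_mean, seen_var, far_mean, far_var)
```

The learner is nearly certain at an input it has observed and falls back to the prior far from the data. A teacher reasoning about this posterior could, in principle, choose examples where the learner's estimate is poor or uncertain.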
Citations: 0