
Journal of Web Semantics: Latest Articles

Enhancing foundation models for scientific discovery via multimodal knowledge graph representations
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100845
Vanessa Lopez, Lam Hoang, Marcos Martinez-Galindo, Raúl Fernández-Díaz, Marco Luca Sbodio, Rodrigo Ordonez-Hurtado, Mykhaylo Zayats, Natasha Mulligan, Joao Bettencourt-Silva
Foundation Models (FMs) hold transformative potential to accelerate scientific discovery, yet reaching their full capacity in complex, highly multimodal domains such as genomics, drug discovery, and materials science requires a deeper consideration of the contextual nature of scientific knowledge. We revisit the synergy between FMs and Multimodal Knowledge Graph (MKG) representation and learning, exploring their potential to enhance predictive and generative tasks in biomedical contexts like drug discovery. We seek to exploit MKGs to improve generative AI models’ ability to capture intricate domain-specific relations and facilitate multimodal fusion. This integration promises to accelerate discovery workflows by providing more meaningful multimodal knowledge-enhanced representations and contextual evidence. Despite this potential, challenges and opportunities remain, including fusing multiple sequential, structural and knowledge modalities and models, leveraging the strengths of each; developing scalable architectures for multi-task multi-dataset learning; creating end-to-end workflows to enhance the trustworthiness of biomedical FMs using knowledge from heterogeneous datasets and scientific literature; the domain data bottleneck and the lack of a unified representation between natural language and chemical representations; and benchmarking, specifically transfer learning to tasks with limited data (e.g., unseen molecules and proteins, rare diseases). Finally, fostering openness and collaboration is key to accelerating scientific breakthroughs.
Journal of Web Semantics, Volume 84, Article 100845.
Citations: 0
Towards leveraging explicit negative statements in knowledge graph embeddings
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100851
Rita T. Sousa, Catia Pesquita, Heiko Paulheim
Knowledge Graphs are used in various domains to represent knowledge about entities and their relations. In the vast majority of cases, they capture what is known to be true about those entities, i.e., positive statements, while the Open World Assumption implicitly states that everything not expressed in the graph may or may not be true. This makes capturing information explicitly known not to be true, i.e., negative statements, difficult and less common. Moreover, while those negative statements could help in learning more useful representations in knowledge graph embeddings, that direction has been explored only rarely. However, in many domains, negative information is particularly interesting, for example, in recommender systems, where negative associations of users and items can help in learning better user representations, or in the biomedical domain, where the knowledge that a patient does not exhibit a specific symptom can be crucial for accurate disease diagnosis.
In this paper, we argue that negative statements should be given more attention in knowledge graph embeddings. Moreover, we investigate how they can be used in knowledge graph embedding methods, highlighting their potential in some interesting use cases. We discuss some existing works and preliminary results that incorporate explicitly declared negative statements in walk-based knowledge graph embedding methods. Finally, we outline promising avenues for future research in this area.
Journal of Web Semantics, Volume 84, Article 100851.
Citations: 0
On the role of knowledge graphs in AI-based scientific discovery
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100854
Mathieu d’Aquin
Research and scientific activity are widely seen as an area where the current trends in AI, namely the development of deep learning models (including large language models), are having an increasing impact. Indeed, the ability of such models to extrapolate from data, seemingly finding unknown patterns relating implicit features of the objects under study to their properties, can, at the very least, help accelerate and scale up those studies, as demonstrated in fields such as molecular biology and chemistry. Knowledge graphs, on the other hand, have more traditionally been used to organize information around scientific activity, keeping track of existing knowledge, of conducted experiments, of interactions within the research community, etc. However, for machine learning models to be truly used as a tool for scientific advancement, we have to find ways for the knowledge implicitly gained by these models from their training to be integrated with the explicitly represented knowledge captured through knowledge graphs. Based on our experience in ongoing projects in the domain of materials science, in this position paper, we discuss the role that knowledge graphs can play in new methodologies for scientific discovery. These methodologies are based on the creation of large and opaque neural models. We therefore focus on the research challenges we need to address to support aligning such neural models to knowledge graphs, so that the knowledge graphs become a knowledge-level interface to those neural models.
Journal of Web Semantics, Volume 84, Article 100854.
Citations: 0
On the legal implications of Large Language Model answers: A prompt engineering approach and a view beyond by exploiting Knowledge Graphs
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100843
George Hannah, Rita T. Sousa, Ioannis Dasoulas, Claudia d’Amato
With the recent surge in popularity of Large Language Models (LLMs), there is a rising risk of users blindly trusting the information in the response. Nevertheless, there are cases where the LLM recommends actions that have potential legal implications, and this may put the user in danger. We provide an empirical analysis of multiple existing LLMs showing the urgency of the problem. Hence, we propose a first short-term solution, consisting of an approach for isolating these legal issues through prompt engineering. We show that this solution is able to stem some risks related to legal implications; nonetheless, we also highlight some limitations. Hence, we argue for the need for additional knowledge-intensive resources, specifically Knowledge Graphs, to fully address these limitations. To this end, we outline our proposal for designing and developing a solution powered by a legal Knowledge Graph (KG) that, besides capturing possible legal implications arising from the LLM answers and alerting the user to them, is also able to provide actual evidence for them by supplying citations of the relevant laws. We conclude with a brief discussion of the issues that may need to be solved to build a comprehensive legal Knowledge Graph.
Journal of Web Semantics, Volume 84, Article 100843.
Citations: 0
Integrating Knowledge Graphs with Symbolic AI: The Path to Interpretable Hybrid AI Systems in Medicine
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100856
Maria-Esther Vidal, Yashrajsinh Chudasama, Hao Huang, Disha Purohit, Maria Torrente
Knowledge Graphs (KGs) are graph-based structures that integrate heterogeneous data, capture domain knowledge, and enable explainable AI through symbolic reasoning. This position paper examines the challenges and research opportunities in integrating KGs with neuro-symbolic AI, highlighting their potential to enhance explainability, scalability, and context-aware reasoning in hybrid AI systems. Using a lung cancer use case, we illustrate how hybrid approaches address tasks such as link prediction (uncovering hidden relationships in medical data) and counterfactual reasoning (analyzing alternative scenarios to understand causal factors). The discussion is framed around TrustKG, which demonstrates how constraint validation, causal reasoning, and user-centric communication can support transparent and reliable decision-making. Additionally, we identify current limitations of KGs, including gaps in knowledge coverage, evolving data integration challenges, and the need for improved usability and impact assessment. These insights are not limited to healthcare but extend to other domains like energy, manufacturing, and mobility, showcasing the broad applicability of KGs. Finally, we propose research directions to unlock their full potential in building robust, transparent, and widely adopted real-world applications.
Journal of Web Semantics, Volume 84, Article 100856.
Citations: 0
LLM experimentation through knowledge graphs: Towards improved management, repeatability, and verification
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2024-12-31; DOI: 10.1016/j.websem.2024.100853
John S. Erickson, Henrique Santos, Vládia Pinheiro, Jamie P. McCusker, Deborah L. McGuinness
Generative large language models (LLMs) have transformed AI by enabling rapid, human-like text generation, but they face challenges, including the generation of inaccurate information. Strategies such as prompt engineering, Retrieval-Augmented Generation (RAG), and incorporating domain-specific Knowledge Graphs (KGs) aim to address these issues. However, challenges remain in achieving the desired levels of management, repeatability, and verification of experiments, especially for developers using closed-access LLMs via web APIs, which complicates integration with external tools. To tackle this, we are exploring a software architecture to enhance LLM workflows by prioritizing flexibility and traceability while promoting more accurate and explainable outputs. We describe our approach and provide a nutrition case study demonstrating its ability to integrate LLMs with RAG and KGs for more robust AI solutions.
Journal of Web Semantics, Volume 85, Article 100853.
Citations: 0
Education in the era of Neurosymbolic AI
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2024-12-30; DOI: 10.1016/j.websem.2024.100857
Chris Davis Jaldi, Eleni Ilkou, Noah Schroeder, Cogan Shimizu
Education is poised for a transformative shift with the advent of neurosymbolic artificial intelligence (NAI), which will redefine how we support deeply adaptive and personalized learning experiences. The integration of Knowledge Graphs (KGs) with Large Language Models (LLMs), a significant and popular form of NAI, presents a promising avenue for advancing personalized instruction via neurosymbolic educational agents. By leveraging structured knowledge, these agents can provide individualized learning experiences that align with specific learner preferences and desired learning paths, while also mitigating biases inherent in traditional AI systems. NAI-powered education systems will be capable of interpreting complex human concepts and contexts while employing advanced problem-solving strategies, all grounded in established pedagogical frameworks. In this paper, we propose a system that leverages the unique affordances of KGs, LLMs, and pedagogical agents – embodied characters designed to enhance learning – as critical components of a hybrid NAI architecture. We discuss the rationale for our system design and the preliminary findings of our work. We conclude that education in the era of NAI will make learning more accessible, equitable, and aligned with real-world skills. This is an era that will explore a new depth of understanding in educational tools.
Journal of Web Semantics, Volume 85, Article 100857.
Citations: 0
Pattern-based engineering of Neurosymbolic AI Systems
IF 2.1; Zone 3, Computer Science; Q3, Computer Science, Artificial Intelligence; Pub Date: 2024-12-27; DOI: 10.1016/j.websem.2024.100855
Fajar J. Ekaputra
The symbiotic combination of sub-symbolic and symbolic AI techniques is a significant trend in AI, leading to the fast-paced development of various techniques that integrate these paradigms to build intelligent systems. However, the wealth of heterogeneous architectural options for combining the paradigms into Neurosymbolic AI (NeSy-AI) systems poses significant challenges. In particular, there is currently no standardized way to design, engineer, and document such systems that encompass visual and formal notations. Existing works aim to address this challenge by systematically modelling NeSy-AI systems as design patterns that include process, data, and human interactions. However, these works focus on capturing specific views of the system rather than aiming to support the broad process of AI system engineering. This paper outlines a vision of pattern-based AI Systems engineering, aiming to support the engineering process of NeSy-AI systems with tasks such as system documentation and artefact generation through interlinked visual and formal notations with Knowledge Graphs at its core.
{"title":"Pattern-based engineering of Neurosymbolic AI Systems","authors":"Fajar J. Ekaputra","doi":"10.1016/j.websem.2024.100855","DOIUrl":"10.1016/j.websem.2024.100855","url":null,"abstract":"<div><div>The symbiotic combination of sub-symbolic and symbolic AI techniques is a significant trend in AI, leading to the fast-paced development of various techniques that integrate these paradigms to build intelligent systems. However, the wealth of heterogeneous architectural options for combining the paradigms into Neurosymbolic AI (NeSy-AI) systems poses significant challenges. In particular, there is currently no standardized way to design, engineer, and document such systems that encompass visual and formal notations. Existing works aim to address this challenge by systematically modelling NeSy-AI systems as design patterns that include process, data, and human interactions. However, these works focus on capturing specific views of the system rather than aiming to support the broad process of AI system engineering. This paper outlines a vision of pattern-based AI Systems engineering, aiming to support the engineering process of NeSy-AI systems with tasks such as system documentation and artefact generation through interlinked visual and formal notations with Knowledge Graphs at its core.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"85 ","pages":"Article 100855"},"PeriodicalIF":2.1,"publicationDate":"2024-12-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165569","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective
IF 2.1 Zone 3 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-12-24 DOI: 10.1016/j.websem.2024.100844
Ernests Lavrinovics, Russa Biswas, Johannes Bjerva, Katja Hose
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) based applications including automated text generation, question answering, chatbots, and others. However, they face a significant challenge: hallucinations, where models produce plausible-sounding but factually incorrect responses. This undermines trust and limits the applicability of LLMs in different domains. Knowledge Graphs (KGs), on the other hand, provide a structured collection of interconnected facts represented as entities (nodes) and their relationships (edges). In recent research, KGs have been leveraged to provide context that can fill gaps in an LLM’s understanding of certain topics, offering a promising approach to mitigate hallucinations in LLMs, enhancing their reliability and accuracy while benefiting from their wide applicability. Nonetheless, it is still a very active area of research with various unresolved open problems. In this paper, we discuss these open challenges covering state-of-the-art datasets and benchmarks as well as methods for knowledge integration and evaluating hallucinations. In our discussion, we consider the current use of KGs in LLM systems and identify future directions within each of these challenges.
Citations: 0
Serendipitous knowledge discovery on the Web of Wisdom based on searching and explaining interesting relations in knowledge graphs
IF 2.1 Zone 3 (Computer Science) Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-12-24 DOI: 10.1016/j.websem.2024.100852
Eero Hyvönen
This paper maintains that the Semantic Web is changing into a kind of Web of Wisdom (WoW) where AI-based problem solving, based on symbolic search and sub-symbolic methods, and Information Retrieval (IR) merge: IR is seen as a process for solving information-related problems of the end user with explanations, a form of knowledge discovery. As a case example, relational search is considered, i.e., solving problems of the type “How are X_1 … X_n related to Y_1 … Y_m?”. For example: how is Pablo Picasso related to Barcelona? The idea is to find explainable “interesting” or even serendipitous associations in Knowledge Graphs (KG) and textual web contents. It is argued that domain knowledge-based symbolic methods based on KGs are needed to complement domain-agnostic graph-based methods and Generative AI (GenAI) boosted by Large Language Models (LLM). By using domain-specific knowledge, it is possible to find and explain meaningful reliable textual answers, answer quantitative questions, and use data analyses and visualizations for explaining and studying the relations.
Citations: 0