Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100842
Cogan Shimizu, Shirly Stephen, Adrita Barua, Ling Cai, Antrea Christou, Kitty Currier, Abhilekha Dalal, Colby K. Fisher, Pascal Hitzler, Krzysztof Janowicz, Wenwen Li, Zilong Liu, Mohammad Saeid Mahdavinejad, Gengchen Mai, Dean Rehberger, Mark Schildhauer, Meilin Shi, Sanaz Saki Norouzi, Yuanyuan Tian, Sizhe Wang, Rui Zhu
KnowWhereGraph is one of the largest fully publicly available geospatial knowledge graphs. It includes data from 30 layers on natural hazards (e.g., hurricanes, wildfires), climate variables (e.g., air temperature, precipitation), soil properties, crop and land-cover types, demographics, human health, and various place and region identifiers, among other themes. These layers have been leveraged through the graph by a variety of applications to address challenges in food security and agricultural supply chains; sustainability related to soil conservation practices and farm labor; and delivery of emergency humanitarian aid following a disaster. In this paper, we introduce the ontology that acts as the schema for KnowWhereGraph. This broad overview provides insight into the requirements and design specifications for the graph and its schema, including the development methodology (modular ontology modeling) and the resources utilized to implement, materialize, and deploy KnowWhereGraph with its end-user interfaces and public SPARQL query endpoint.
Title: The KnowWhereGraph ontology. Journal of Web Semantics, vol. 84, Article 100842.
Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100841
Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Domenico Fabio Savo
In this paper we study Controlled Query Evaluation (CQE), a declarative approach to privacy-preserving query answering over databases, knowledge bases, and ontologies. CQE is based on the notion of censor, which defines the answers to each query posed to the data/knowledge base. We investigate both semantic and computational properties of CQE in the context of OWL ontologies, and specifically in the description logic DL-Lite_R, which underpins the OWL 2 QL profile. In our analysis, we focus on semantics of CQE based on censors (called optimal GA censors) that enjoy the so-called indistinguishability property, analyzing the trade-off between maximizing the amount of data disclosed by query answers and minimizing the computational cost of privacy-preserving query answering. We first study the data complexity of skeptical entailment of unions of conjunctive queries under all the optimal GA censors, showing that query answering in this setting is intractable. To overcome this computational issue, we then define a different semantics for CQE centered around the notion of intersection of all the optimal GA censors. We show that query answering over OWL 2 QL ontologies under the new intersection-based semantics for CQE is tractable and first-order rewritable, i.e., it can be implemented through SQL query rewriting techniques and standard relational database systems; on the other hand, this approach is limited in the amount of data it discloses. To improve this aspect, we add preferences between ontology predicates to the CQE framework, and identify a semantics under which query answering over OWL 2 QL ontologies maintains the same computational properties as the intersection-based approach without preferences.
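The intersection-based idea can be illustrated with a deliberately simplified toy, which is not the construction used in the paper. It assumes an empty ontology, ground facts written as plain strings, and a policy given as sets of facts that must never be disclosed together. Under those assumptions a censor is just a maximal policy-compliant subset of the facts, and the intersection keeps only what every such censor agrees to disclose. All names and data below (`FACTS`, `POLICY`, `optimal_censors`, `intersection_censor`) are hypothetical.

```python
from itertools import combinations

# Hypothetical ground facts (no ontology axioms are considered in this toy).
FACTS = frozenset({"patient(ann)", "treatedFor(ann, hiv)", "livesIn(ann, rome)"})

# Hypothetical policy: each entry is a set of facts that must not be
# disclosed together.
POLICY = [frozenset({"patient(ann)", "treatedFor(ann, hiv)"})]


def is_safe(subset, policy):
    """A subset is safe if it does not contain any forbidden combination."""
    return not any(denial <= subset for denial in policy)


def optimal_censors(facts, policy):
    """Brute force: all safe subsets that are maximal under set inclusion."""
    safe = [
        frozenset(c)
        for r in range(len(facts) + 1)
        for c in combinations(sorted(facts), r)
        if is_safe(frozenset(c), policy)
    ]
    return [s for s in safe if not any(s < t for t in safe)]


def intersection_censor(facts, policy):
    """Facts disclosed by every optimal censor."""
    return frozenset.intersection(*optimal_censors(facts, policy))


if __name__ == "__main__":
    for censor in optimal_censors(FACTS, POLICY):
        print(sorted(censor))
    print(sorted(intersection_censor(FACTS, POLICY)))
```

In this toy there are two optimal censors, each hiding one of the two sensitive facts, so their intersection discloses only the fact about where Ann lives. That loss of disclosed data is the kind of limitation the abstract attributes to the intersection-based semantics. The brute-force enumeration is exponential and serves only as an illustration.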
Title: Indistinguishability in controlled query evaluation over prioritized description logic ontologies. Journal of Web Semantics, vol. 84, Article 100841.
Foundation Models (FMs) hold transformative potential to accelerate scientific discovery, yet reaching their full capacity in complex, highly multimodal domains such as genomics, drug discovery, and materials science requires a deeper consideration of the contextual nature of scientific knowledge. We revisit the synergy between FMs and Multimodal Knowledge Graph (MKG) representation and learning, exploring their potential to enhance predictive and generative tasks in biomedical contexts like drug discovery. We seek to exploit MKGs to improve generative AI models’ ability to capture intricate domain-specific relations and facilitate multimodal fusion. This integration promises to accelerate discovery workflows by providing more meaningful multimodal knowledge-enhanced representations and contextual evidence. Despite this potential, challenges and opportunities remain, including fusing multiple sequential, structural and knowledge modalities and models while leveraging the strengths of each; developing scalable architectures for multi-task multi-dataset learning; creating end-to-end workflows to enhance the trustworthiness of biomedical FMs using knowledge from heterogeneous datasets and scientific literature; the domain data bottleneck and the lack of a unified representation between natural language and chemical representations; and benchmarking, specifically transfer learning to tasks with limited data (e.g., unseen molecules and proteins, rare diseases). Finally, fostering openness and collaboration is key to accelerating scientific breakthroughs.
Title: Enhancing foundation models for scientific discovery via multimodal knowledge graph representations. Authors: Vanessa Lopez, Lam Hoang, Marcos Martinez-Galindo, Raúl Fernández-Díaz, Marco Luca Sbodio, Rodrigo Ordonez-Hurtado, Mykhaylo Zayats, Natasha Mulligan, Joao Bettencourt-Silva. Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100845. Journal of Web Semantics, vol. 84, Article 100845.
Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100851
Rita T. Sousa, Catia Pesquita, Heiko Paulheim
Knowledge Graphs are used in various domains to represent knowledge about entities and their relations. In the vast majority of cases, they capture what is known to be true about those entities, i.e., positive statements, while the Open World Assumption implicitly states that everything not expressed in the graph may or may not be true. As a result, information explicitly known not to be true, i.e., negative statements, is difficult to capture and is captured less often. Moreover, while those negative statements have the potential to yield more useful representations in knowledge graph embeddings, that direction has been explored only rarely. However, in many domains, negative information is particularly interesting, for example, in recommender systems, where negative associations of users and items can help in learning better user representations, or in the biomedical domain, where the knowledge that a patient does not exhibit a specific symptom can be crucial for accurate disease diagnosis.
In this paper, we argue that negative statements should be given more attention in knowledge graph embeddings. Moreover, we investigate how they can be used in knowledge graph embedding methods, highlighting their potential in some interesting use cases. We discuss some existing works and preliminary results that incorporate explicitly declared negative statements in walk-based knowledge graph embedding methods. Finally, we outline promising avenues for future research in this area.
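As a rough illustration of how explicitly declared negative statements might enter a walk-based pipeline, the sketch below generates random walks over a toy graph and gives negative statements their own predicate token. This is an assumption made for illustration, not the method evaluated in the paper; the triples, the `NOT_` prefix, and the function names are all hypothetical. The resulting walks could then be fed to any sequence-based embedding model.

```python
import random
from collections import defaultdict

# Hypothetical toy triples: (subject, predicate, object, is_negative).
TRIPLES = [
    ("patient1", "hasSymptom", "fever", False),
    ("patient1", "hasSymptom", "rash", True),  # explicitly known NOT to hold
    ("fever", "symptomOf", "flu", False),
    ("rash", "symptomOf", "measles", False),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate_token, object) pairs."""
    adj = defaultdict(list)
    for s, p, o, negative in triples:
        # Negative statements get a distinct token so a downstream model
        # can tell them apart from positive ones.
        token = f"NOT_{p}" if negative else p
        adj[s].append((token, o))
    return adj


def random_walks(adj, start, depth, n_walks, seed=0):
    """Generate n_walks random walks of at most `depth` hops from `start`."""
    rng = random.Random(seed)
    walks = []
    for _ in range(n_walks):
        walk, node = [start], start
        for _ in range(depth):
            if not adj.get(node):  # dead end: no outgoing statements
                break
            pred, node = rng.choice(adj[node])
            walk += [pred, node]
        walks.append(walk)
    return walks


if __name__ == "__main__":
    for walk in random_walks(build_adjacency(TRIPLES), "patient1", 2, 5):
        print(" -> ".join(walk))
```

Keeping negative statements as separate tokens is only one possible design; whether walks should continue through the object of a negative statement, as they do here, is an open modeling choice.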
Title: Towards leveraging explicit negative statements in knowledge graph embeddings. Journal of Web Semantics, vol. 84, Article 100851.
Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100854
Mathieu d’Aquin
Scientific research is widely seen as an area where the current trends in AI, namely the development of deep learning models (including large language models), are having an increasing impact. Indeed, the ability of such models to extrapolate from data, seemingly finding unknown patterns relating implicit features of the objects under study to their properties, can, at the very least, help accelerate and scale up those studies, as demonstrated in fields such as molecular biology and chemistry. Knowledge graphs, on the other hand, have more traditionally been used to organize information around scientific activity, keeping track of existing knowledge, conducted experiments, interactions within the research community, etc. However, for machine learning models to be truly used as a tool for scientific advancement, we have to find ways for the knowledge implicitly gained by these models from their training to be integrated with the explicitly represented knowledge captured through knowledge graphs. Based on our experience in ongoing projects in the domain of materials science, in this position paper we discuss the role that knowledge graphs can play in new methodologies for scientific discovery. These methodologies are based on the creation of large and opaque neural models. We therefore focus on the research challenges we need to address to support aligning such neural models with knowledge graphs, so that the graphs become a knowledge-level interface to those neural models.
Title: On the role of knowledge graphs in AI-based scientific discovery. Journal of Web Semantics, vol. 84, Article 100854.
Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100843
George Hannah, Rita T. Sousa, Ioannis Dasoulas, Claudia d’Amato
With the recent surge in popularity of Large Language Models (LLMs), there is a rising risk of users blindly trusting the information in their responses. Yet there are cases where the LLM recommends actions that have potential legal implications, and this may put the user in danger. We provide an empirical analysis of multiple existing LLMs showing the urgency of the problem. We therefore propose a first short-term solution: an approach for isolating these legal issues through prompt engineering. We show that this solution is able to stem some risks related to legal implications; nonetheless, we also highlight some limitations. Hence, we argue for the need for additional knowledge-intensive resources, specifically Knowledge Graphs, to fully address these limitations. To this end, we outline a proposal for designing and developing a solution powered by a legal Knowledge Graph (KG) that, besides detecting possible legal implications in LLM answers and alerting the user to them, is also able to provide actual evidence for them by supplying citations of the relevant laws. We conclude with a brief discussion of the issues that may need to be solved to build a comprehensive legal Knowledge Graph.
Title: On the legal implications of Large Language Model answers: A prompt engineering approach and a view beyond by exploiting Knowledge Graphs. Journal of Web Semantics, vol. 84, Article 100843.
Knowledge Graphs (KGs) are graph-based structures that integrate heterogeneous data, capture domain knowledge, and enable explainable AI through symbolic reasoning. This position paper examines the challenges and research opportunities in integrating KGs with neuro-symbolic AI, highlighting their potential to enhance explainability, scalability, and context-aware reasoning in hybrid AI systems. Using a lung cancer use case, we illustrate how hybrid approaches address tasks such as link prediction—uncovering hidden relationships in medical data—and counterfactual reasoning—analyzing alternative scenarios to understand causal factors. The discussion is framed around TrustKG, which demonstrates how constraint validation, causal reasoning, and user-centric communication can support transparent and reliable decision-making. Additionally, we identify current limitations of KGs, including gaps in knowledge coverage, evolving data integration challenges, and the need for improved usability and impact assessment. These insights are not limited to healthcare but extend to other domains like energy, manufacturing, and mobility, showcasing the broad applicability of KGs. Finally, we propose research directions to unlock their full potential in building robust, transparent, and widely adopted real-world applications.
Title: Integrating Knowledge Graphs with Symbolic AI: The Path to Interpretable Hybrid AI Systems in Medicine. Authors: Maria-Esther Vidal, Yashrajsinh Chudasama, Hao Huang, Disha Purohit, Maria Torrente. Pub Date: 2025-01-01; DOI: 10.1016/j.websem.2024.100856. Journal of Web Semantics, vol. 84, Article 100856.
Pub Date: 2024-12-31; DOI: 10.1016/j.websem.2024.100853
John S. Erickson, Henrique Santos, Vládia Pinheiro, Jamie P. McCusker, Deborah L. McGuinness
Generative large language models (LLMs) have transformed AI by enabling rapid, human-like text generation, but they face challenges, including managing inaccurate information generation. Strategies such as prompt engineering, Retrieval-Augmented Generation (RAG), and incorporating domain-specific Knowledge Graphs (KGs) aim to address their issues. However, challenges remain in achieving the desired levels of management, repeatability, and verification of experiments, especially for developers using closed-access LLMs via web APIs, complicating integration with external tools. To tackle this, we are exploring a software architecture to enhance LLM workflows by prioritizing flexibility and traceability while promoting more accurate and explainable outputs. We describe our approach and provide a nutrition case study demonstrating its ability to integrate LLMs with RAG and KGs for more robust AI solutions.
Title: LLM experimentation through knowledge graphs: Towards improved management, repeatability, and verification. Journal of Web Semantics, vol. 85, Article 100853.
Pub Date: 2024-12-30; DOI: 10.1016/j.websem.2024.100857
Chris Davis Jaldi, Eleni Ilkou, Noah Schroeder, Cogan Shimizu
Education is poised for a transformative shift with the advent of neurosymbolic artificial intelligence (NAI), which will redefine how we support deeply adaptive and personalized learning experiences. The integration of Knowledge Graphs (KGs) with Large Language Models (LLMs), a significant and popular form of NAI, presents a promising avenue for advancing personalized instruction via neurosymbolic educational agents. By leveraging structured knowledge, these agents can provide individualized learning experiences that align with specific learner preferences and desired learning paths, while also mitigating biases inherent in traditional AI systems. NAI-powered education systems will be capable of interpreting complex human concepts and contexts while employing advanced problem-solving strategies, all grounded in established pedagogical frameworks. In this paper, we propose a system that leverages the unique affordances of KGs, LLMs, and pedagogical agents – embodied characters designed to enhance learning – as critical components of a hybrid NAI architecture. We discuss the rationale for our system design and the preliminary findings of our work. We conclude that education in the era of NAI will make learning more accessible, equitable, and aligned with real-world skills. This is an era that will explore a new depth of understanding in educational tools.
Title: Education in the era of Neurosymbolic AI. Journal of Web Semantics, vol. 85, Article 100857.
Pub Date: 2024-12-27 | DOI: 10.1016/j.websem.2024.100855
Fajar J. Ekaputra
The symbiotic combination of sub-symbolic and symbolic AI techniques is a significant trend in AI, leading to the fast-paced development of various techniques that integrate these paradigms to build intelligent systems. However, the wealth of heterogeneous architectural options for combining the paradigms into Neurosymbolic AI (NeSy-AI) systems poses significant challenges. In particular, there is currently no standardized way, encompassing both visual and formal notations, to design, engineer, and document such systems. Existing works aim to address this challenge by systematically modelling NeSy-AI systems as design patterns that include process, data, and human interactions. However, these works focus on capturing specific views of the system rather than aiming to support the broad process of AI system engineering. This paper outlines a vision of pattern-based AI Systems engineering, aiming to support the engineering process of NeSy-AI systems with tasks such as system documentation and artefact generation through interlinked visual and formal notations with Knowledge Graphs at its core.
Title: Pattern-based engineering of Neurosymbolic AI Systems. Journal of Web Semantics, vol. 85, Article 100855.