Pub Date : 2023-01-01DOI: 10.1016/j.websem.2022.100743
Diya Li , Mohammed J. Zaki , Ching-hua Chen
While the availability of large-scale online recipe collections presents opportunities for health consumers to access a wide variety of recipes, it can be challenging for them to discover relevant recipes. Whereas most recommender systems are designed to offer selections consistent with users’ past behavior, it remains an open problem to offer selections that can help users’ transition from one type of behavior to another, intentionally. In this paper, we introduce health-guided recipe recommendation as a way to incrementally shift users towards healthier recipe options while respecting the preferences reflected in their past choices. Introducing a knowledge graph (KG) into recommender systems as side information has attracted great interest, but its use in recipe recommendation has not been studied. To fill this gap, we consider the task of recipe recommendation over knowledge graphs. In particular, we jointly learn recipe representations via graph neural networks over two graphs extracted from a large-scale Food KG, which capture different semantic relationships, namely, user preferences and recipe healthiness, respectively. To integrate the nutritional aspects into recipe representations and the recommendation task, instead of simple fusion, we utilize a knowledge transfer scheme to enable the transfer of useful semantic information across the preferences and healthiness aspects. Experimental results on two large real-world recipe datasets showcase our model’s ability to recommend tasty as well as healthy recipes to users.
{"title":"Health-guided recipe recommendation over knowledge graphs","authors":"Diya Li , Mohammed J. Zaki , Ching-hua Chen","doi":"10.1016/j.websem.2022.100743","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100743","url":null,"abstract":"<div><p>While the availability of large-scale online recipe collections presents opportunities for health consumers to access a wide variety of recipes, it can be challenging for them to discover relevant recipes. Whereas most recommender systems<span> are designed to offer selections consistent with users’ past behavior, it remains an open problem to offer selections that can help users’ transition from one type of behavior to another, intentionally. In this paper, we introduce health-guided recipe recommendation as a way to incrementally shift users towards healthier recipe options while respecting the preferences reflected in their past choices. Introducing a knowledge graph (KG) into recommender systems as side information has attracted great interest, but its use in recipe recommendation has not been studied. To fill this gap, we consider the task of recipe recommendation over knowledge graphs. In particular, we jointly learn recipe representations via graph neural networks<span> over two graphs extracted from a large-scale Food KG, which capture different semantic relationships<span>, namely, user preferences and recipe healthiness, respectively. To integrate the nutritional aspects into recipe representations and the recommendation task, instead of simple fusion, we utilize a knowledge transfer scheme to enable the transfer of useful semantic information across the preferences and healthiness aspects. Experimental results on two large real-world recipe datasets showcase our model’s ability to recommend tasty as well as healthy recipes to users.</span></span></span></p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50201119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-01-01DOI: 10.1016/j.websem.2022.100757
Yuxia Geng , Jiaoyan Chen , Xiang Zhuang , Zhuo Chen , Jeff Z. Pan , Juan Li , Zonggang Yuan , Huajun Chen
External knowledge (a.k.a. side information) plays a critical role in zero-shot learning (ZSL) which aims to predict with unseen classes that have never appeared in training data. Several kinds of external knowledge, such as text and attribute, have been widely investigated, but they alone are limited with incomplete semantics. Some very recent studies thus propose to use Knowledge Graph (KG) due to its high expressivity and compatibility for representing kinds of knowledge. However, the ZSL community is still in short of standard benchmarks for studying and comparing different external knowledge settings and different KG-based ZSL methods. In this paper, we proposed six resources covering three tasks, i.e., zero-shot image classification (ZS-IMGC), zero-shot relation extraction (ZS-RE), and zero-shot KG completion (ZS-KGC). Each resource has a normal ZSL benchmark and a KG containing semantics ranging from text to attribute, from relational knowledge to logical expressions. We have clearly presented these resources including their construction, statistics, data formats and usage cases w.r.t. different ZSL methods. More importantly, we have conducted a comprehensive benchmarking study, with a few classic and state-of-the-art methods for each task, including a method with KG augmented explanation. We discussed and compared different ZSL paradigms w.r.t. different external knowledge settings, and found that our resources have great potential for developing more advanced ZSL methods and more solutions for applying KGs for augmenting machine learning. All the resources are available at https://github.com/China-UK-ZSL/Resources_for_KZSL.
{"title":"Benchmarking knowledge-driven zero-shot learning","authors":"Yuxia Geng , Jiaoyan Chen , Xiang Zhuang , Zhuo Chen , Jeff Z. Pan , Juan Li , Zonggang Yuan , Huajun Chen","doi":"10.1016/j.websem.2022.100757","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100757","url":null,"abstract":"<div><p><span><span><span>External knowledge (a.k.a. side information) plays a critical role in zero-shot learning (ZSL) which aims to predict with unseen classes that have never appeared in training data. Several kinds of external knowledge, such as text and attribute, have been widely investigated, but they alone are limited with incomplete semantics. Some very recent studies thus propose to use Knowledge Graph (KG) due to its high expressivity and compatibility for representing kinds of knowledge. However, the ZSL community is still in short of standard benchmarks for studying and comparing different external knowledge settings and different KG-based ZSL methods. In this paper, we proposed six resources covering three tasks, i.e., zero-shot image classification<span> (ZS-IMGC), zero-shot relation extraction (ZS-RE), and zero-shot KG completion (ZS-KGC). Each resource has a normal ZSL benchmark and a KG containing semantics ranging from text to attribute, from relational knowledge to logical expressions. We have clearly presented these resources including their construction, statistics, data formats and usage cases w.r.t. different ZSL methods. More importantly, we have conducted a comprehensive </span></span>benchmarking study, with a few classic and state-of-the-art methods for each task, including a method with KG augmented explanation. We discussed and compared different ZSL paradigms w.r.t. different external knowledge settings, and found that our resources have great potential for developing more advanced ZSL methods and more solutions for applying KGs for augmenting </span>machine learning. All the resources are available at </span><span>https://github.com/China-UK-ZSL/Resources_for_KZSL</span><svg><path></path></svg>.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50201115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-01-01DOI: 10.1016/j.websem.2022.100756
Nikola Milošević , Wolfgang Thielemann
Biomedical research is growing at such an exponential pace that scientists, researchers, and practitioners are no more able to cope with the amount of published literature in the domain. The knowledge presented in the literature needs to be systematized in such a way that claims and hypotheses can be easily found, accessed, and validated. Knowledge graphs can provide such a framework for semantic knowledge representation from literature. However, in order to build a knowledge graph, it is necessary to extract knowledge as relationships between biomedical entities and normalize both entities and relationship types. In this paper, we present and compare a few rule-based and machine learning-based (Naive Bayes, Random Forests as examples of traditional machine learning methods and DistilBERT, PubMedBERT, T5, and SciFive-based models as examples of modern deep learning transformers) methods for scalable relationship extraction from biomedical literature, and for the integration into the knowledge graphs. We examine how resilient are these various methods to unbalanced and fairly small datasets. Our experiments show that transformer-based models handle well both small (due to pre-training on a large dataset) and unbalanced datasets. The best performing model was the PubMedBERT-based model fine-tuned on balanced data, with a reported F1-score of 0.92. The distilBERT-based model followed with an F1-score of 0.89, performing faster and with lower resource requirements. BERT-based models performed better than T5-based generative models.
{"title":"Comparison of biomedical relationship extraction methods and models for knowledge graph creation","authors":"Nikola Milošević , Wolfgang Thielemann","doi":"10.1016/j.websem.2022.100756","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100756","url":null,"abstract":"<div><p><span><span><span>Biomedical research is growing at such an exponential pace that scientists, researchers, and practitioners are no more able to cope with the amount of published literature in the domain. The knowledge presented in the literature needs to be systematized in such a way that claims and hypotheses can be easily found, accessed, and validated. Knowledge graphs can provide such a framework for </span>semantic knowledge representation from literature. However, in order to build a knowledge graph, it is necessary to extract knowledge as relationships between biomedical entities and normalize both entities and relationship types. In this paper, we present and compare a few rule-based and machine learning-based (Naive Bayes, </span>Random Forests<span> as examples of traditional machine learning<span> methods and DistilBERT, PubMedBERT, T5, and SciFive-based models as examples of modern deep learning transformers) methods for scalable relationship extraction from biomedical literature, and for the integration into the knowledge graphs. We examine how resilient are these various methods to unbalanced and fairly small datasets. Our experiments show that transformer-based models handle well both small (due to pre-training on a large dataset) and unbalanced datasets. The best performing model was the PubMedBERT-based model fine-tuned on balanced data, with a reported F1-score of 0.92. The distilBERT-based model followed with an F1-score of 0.89, performing faster and with lower resource requirements. BERT-based models performed better than T5-based </span></span></span>generative models.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50201117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-01-01DOI: 10.1016/j.websem.2022.100759
Sara Abdollahi , Simon Gottschalk , Elena Demidova
While societal events often impact people worldwide, a significant fraction of events has a local focus that primarily affects specific language communities. Examples include national elections, the development of the Coronavirus pandemic in different countries, and local film festivals such as the César Awards in France and the Moscow International Film Festival in Russia. However, existing entity recommendation approaches do not sufficiently address the language context of recommendation. This article introduces the novel task of language-specific event recommendation, which aims to recommend events relevant to the user query in the language-specific context. This task can support essential information retrieval activities, including web navigation and exploratory search, considering the language context of user information needs. We propose LaSER, a novel approach toward language-specific event recommendation. LaSER blends the language-specific latent representations (embeddings) of entities and events and spatio-temporal event features in a learning to rank model. This model is trained on publicly available Wikipedia Clickstream data. The results of our user study demonstrate that LaSER outperforms state-of-the-art recommendation baselines by up to 33 percentage points in MAP@5 concerning the language-specific relevance of recommended events.
{"title":"LaSER: Language-specific event recommendation","authors":"Sara Abdollahi , Simon Gottschalk , Elena Demidova","doi":"10.1016/j.websem.2022.100759","DOIUrl":"10.1016/j.websem.2022.100759","url":null,"abstract":"<div><p>While societal events often impact people worldwide, a significant fraction of events has a local focus that primarily affects specific language communities. Examples include national elections, the development of the Coronavirus pandemic in different countries, and local film festivals such as the <em>César Awards</em> in France and the <em>Moscow International Film Festival</em> in Russia. However, existing entity recommendation approaches do not sufficiently address the language context of recommendation. This article introduces the novel task of language-specific event recommendation, which aims to recommend events relevant to the user query in the language-specific context. This task can support essential information retrieval activities, including web navigation and exploratory search, considering the language context of user information needs. We propose <em>LaSER</em>, a novel approach toward language-specific event recommendation. <em>LaSER</em> blends the language-specific latent representations (embeddings) of entities and events and spatio-temporal event features in a learning to rank model. This model is trained on publicly available Wikipedia Clickstream data. The results of our user study demonstrate that <em>LaSER</em> outperforms state-of-the-art recommendation baselines by up to 33 percentage points in MAP@5 concerning the language-specific relevance of recommended events.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9482171/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33485349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RDF knowledge graphs (KG) are powerful data structures to represent factual statements created from heterogeneous data sources. KG creation is laborious and demands data management techniques to be executed efficiently. This paper tackles the problem of the automatic generation of KG creation processes declaratively specified; it proposes techniques for planning and transforming heterogeneous data into RDF triples following mapping assertions specified in the RDF Mapping Language (RML). Given a set of mapping assertions, the planner provides an optimized execution plan by partitioning and scheduling the execution of the assertions. First, the planner assesses an optimized number of partitions considering the number of data sources, type of mapping assertions, and the associations between different assertions. After providing a list of partitions and assertions that belong to each partition, the planner determines their execution order. A greedy algorithm is implemented to generate the partitions’ bushy tree execution plan. Bushy tree plans are translated into operating system commands that guide the execution of the partitions of the mapping assertions in the order indicated by the bushy tree. The proposed optimization approach is evaluated over state-of-the-art RML-compliant engines, and existing benchmarks of data sources and RML triples maps. Our experimental results suggest that the performance of the studied engines can be considerably improved, particularly in a complex setting with numerous triples maps and large data sources. As a result, engines that time out in complex cases are enabled to produce at least a portion of the KG applying the planner.
{"title":"Scaling up knowledge graph creation to large and heterogeneous data sources","authors":"Enrique Iglesias , Samaneh Jozashoori , Maria-Esther Vidal","doi":"10.1016/j.websem.2022.100755","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100755","url":null,"abstract":"<div><p><span>RDF knowledge graphs (KG) are powerful data structures<span> to represent factual statements created from heterogeneous data sources. </span></span>KG creation<span> is laborious and demands data management techniques to be executed efficiently. This paper tackles the problem of the automatic generation of KG creation processes declaratively specified; it proposes techniques for planning and transforming heterogeneous data into RDF triples following mapping assertions specified in the RDF Mapping Language (RML). Given a set of mapping assertions, the planner provides an optimized execution plan by partitioning and scheduling the execution of the assertions. First, the planner assesses an optimized number of partitions considering the number of data sources, type of mapping assertions, and the associations between different assertions. After providing a list of partitions and assertions that belong to each partition, the planner determines their execution order. A greedy algorithm<span> is implemented to generate the partitions’ bushy tree execution plan. Bushy tree plans are translated into operating system commands that guide the execution of the partitions of the mapping assertions in the order indicated by the bushy tree. The proposed optimization approach is evaluated over state-of-the-art RML-compliant engines, and existing benchmarks of data sources and RML triples maps. Our experimental results suggest that the performance of the studied engines can be considerably improved, particularly in a complex setting with numerous triples maps and large data sources. As a result, engines that time out in complex cases are enabled to produce at least a portion of the KG applying the planner.</span></span></p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50201114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-01-01DOI: 10.1016/j.websem.2022.100754
Spyridon Kantarelis, Edmund Dervakos, Natalia Kotsani, Giorgos Stamou
Symbolic representations of music are emerging as an important data domain both for the music industry and for computer science research, aiding in the organization of large collections of music and facilitating the development of creative and interactive AI. An aspect of symbolic representations of music, which differentiates them from audio representations, is their suitability to be linked with notions from music theory that have been developed over the centuries. One core such notion is that of functional harmony, which involves analyzing progressions of chords. This paper proposes a description of the theory of functional harmony within the OWL 2 RL profile and experimentally demonstrates its practical use.
{"title":"Functional harmony ontology: Musical harmony analysis with Description Logics","authors":"Spyridon Kantarelis, Edmund Dervakos, Natalia Kotsani, Giorgos Stamou","doi":"10.1016/j.websem.2022.100754","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100754","url":null,"abstract":"<div><p>Symbolic representations of music are emerging as an important data domain both for the music industry and for computer science research, aiding in the organization of large collections of music and facilitating the development of creative and interactive AI<span>. An aspect of symbolic representations of music, which differentiates them from audio representations, is their suitability to be linked with notions from music theory that have been developed over the centuries. One core such notion is that of functional harmony, which involves analyzing progressions of chords. This paper proposes a description of the theory of functional harmony within the OWL 2 RL profile and experimentally demonstrates its practical use.</span></p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50201118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-01-01DOI: 10.1016/j.websem.2022.100753
Dylan Van Assche , Thomas Delva , Gerald Haesendonck , Pieter Heyvaert , Ben De Meester , Anastasia Dimou
More and more data in various formats are integrated into knowledge graphs. However, there is no overview of existing approaches for generating knowledge graphs from heterogeneous (semi-)structured data, making it difficult to select the right one for a certain use case. To support better decision making, we study the existing approaches for generating knowledge graphs from heterogeneous (semi-)structured data relying on mapping languages. In this paper, we investigated existing mapping languages for schema and data transformations, and corresponding materialization and virtualization systems that generate knowledge graphs. We gather and unify 52 articles regarding knowledge graph generation from heterogeneous (semi-)structured data. We assess 15 characteristics on mapping languages for schema transformations, 5 characteristics for data transformations, and 14 characteristics for systems. Our survey paper provides an overview of the mapping languages and systems proposed the past two decades. Our work paves the way towards a better adoption of knowledge graph generation, as the right mapping language and system can be selected for each use case.
{"title":"Declarative RDF graph generation from heterogeneous (semi-)structured data: A systematic literature review","authors":"Dylan Van Assche , Thomas Delva , Gerald Haesendonck , Pieter Heyvaert , Ben De Meester , Anastasia Dimou","doi":"10.1016/j.websem.2022.100753","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100753","url":null,"abstract":"<div><p>More and more data in various formats are integrated into knowledge graphs. However, there is no overview of existing approaches for generating knowledge graphs from heterogeneous (semi-)structured data, making it difficult to select the right one for a certain use case. To support better decision making, we study the existing approaches for generating knowledge graphs from heterogeneous (semi-)structured data relying on mapping languages. In this paper, we investigated existing mapping languages for schema and data transformations, and corresponding materialization and virtualization systems that generate knowledge graphs. We gather and unify 52 articles regarding knowledge graph generation from heterogeneous (semi-)structured data. We assess 15 characteristics on mapping languages for schema transformations, 5 characteristics for data transformations, and 14 characteristics for systems. Our survey paper provides an overview of the mapping languages and systems proposed the past two decades. Our work paves the way towards a better adoption of knowledge graph generation, as the right mapping language and system can be selected for each use case.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"50201120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present Knowledge4COVID-19, a framework that aims to showcase the power of integrating disparate sources of knowledge to discover adverse drug effects caused by drug–drug interactions among COVID-19 treatments and pre-existing condition drugs. Initially, we focus on constructing the Knowledge4COVID-19 knowledge graph (KG) from the declarative definition of mapping rules using the RDF Mapping Language. Since valuable information about drug treatments, drug–drug interactions, and side effects is present in textual descriptions in scientific databases (e.g., DrugBank) or in scientific literature (e.g., the CORD-19, the Covid-19 Open Research Dataset), the Knowledge4COVID-19 framework implements Natural Language Processing. The Knowledge4COVID-19 framework extracts relevant entities and predicates that enable the fine-grained description of COVID-19 treatments and the potential adverse events that may occur when these treatments are combined with treatments of common comorbidities, e.g., hypertension, diabetes, or asthma. Moreover, on top of the KG, several techniques for the discovery and prediction of interactions and potential adverse effects of drugs have been developed with the aim of suggesting more accurate treatments for treating the virus. We provide services to traverse the KG and visualize the effects that a group of drugs may have on a treatment outcome. Knowledge4COVID-19 was part of the Pan-European hackathon#EUvsVirus in April 2020 and is publicly available as a resource through a GitHub repository and a DOI.
{"title":"Knowledge4COVID-19: A semantic-based approach for constructing a COVID-19 related knowledge graph from various sources and analyzing treatments’ toxicities","authors":"Ahmad Sakor , Samaneh Jozashoori , Emetis Niazmand , Ariam Rivas , Konstantinos Bougiatiotis , Fotis Aisopos , Enrique Iglesias , Philipp D. Rohde , Trupti Padiya , Anastasia Krithara , Georgios Paliouras , Maria-Esther Vidal","doi":"10.1016/j.websem.2022.100760","DOIUrl":"10.1016/j.websem.2022.100760","url":null,"abstract":"<div><p>In this paper, we present Knowledge4COVID-19, a framework that aims to showcase the power of integrating disparate sources of knowledge to discover adverse drug effects caused by drug–drug interactions among COVID-19 treatments and pre-existing condition drugs. Initially, we focus on constructing the Knowledge4COVID-19 knowledge graph (KG) from the declarative definition of mapping rules using the RDF Mapping Language. Since valuable information about drug treatments, drug–drug interactions, and side effects is present in textual descriptions in scientific databases (e.g., DrugBank) or in scientific literature (e.g., the CORD-19, the Covid-19 Open Research Dataset), the Knowledge4COVID-19 framework implements Natural Language Processing. The Knowledge4COVID-19 framework extracts relevant entities and predicates that enable the fine-grained description of COVID-19 treatments and the potential adverse events that may occur when these treatments are combined with treatments of common comorbidities, e.g., hypertension, diabetes, or asthma. Moreover, on top of the KG, several techniques for the discovery and prediction of interactions and potential adverse effects of drugs have been developed with the aim of suggesting more accurate treatments for treating the virus. We provide services to traverse the KG and visualize the effects that a group of drugs may have on a treatment outcome. Knowledge4COVID-19 was part of the Pan-European <em>hackathon#EUvsVirus</em> in April 2020 and is publicly available as a resource through a <span>GitHub repository</span><svg><path></path></svg> and a <span>DOI</span><svg><path></path></svg>.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9558693/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10395805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-01DOI: 10.1016/j.websem.2022.100715
Romana Pernisch , Daniele Dell’Aglio , Mirko Serbak , Rafael S. Gonçalves , Abraham Bernstein
Due to the Semantic Web’s decentralised nature, ontology engineers rarely know all applications that leverage their ontology. Consequently, they are unaware of the full extent of possible consequences that changes might cause to the ontology. Our goal is to lessen the gap between ontology engineers and users by investigating ontology engineers’ understanding of ontology changes’ impact at editing time. Hence, this paper introduces the Protégé plugin ChImp which we use to reach our goal. We elicited requirements for ChImp through a questionnaire with ontology engineers. We then developed ChImp according to these requirements and it displays all changes of a given session and provides selected information on said changes and their effects. For each change, it computes a number of metrics on both the ontology and its materialisation. It displays those metrics on both the originally loaded ontology at the beginning of the editing session and the current state to help ontology engineers understand the impact of their changes.
We investigated the informativeness of materialisation impact measures, the meaning of severe impact, and also the usefulness of ChImp in an online user study with 36 ontology engineers. We asked the participants to solve two ontology engineering tasks – with and without ChImp (assigned in random order) – and answer in-depth questions about the applied changes as well as the materialisation impact measures. We found that ChImp increased the participants’ understanding of change effects and that they felt better informed. Answers also suggest that the proposed measures were useful and informative. We also learned that the participants consider different outcomes of changes severe, but most would define severity based on the amount of changes to the materialisation compared to its size. The participants also acknowledged the importance of quantifying the impact of changes and that the study will affect their approach of editing ontologies.
{"title":"Visualising the effects of ontology changes and studying their understanding with ChImp","authors":"Romana Pernisch , Daniele Dell’Aglio , Mirko Serbak , Rafael S. Gonçalves , Abraham Bernstein","doi":"10.1016/j.websem.2022.100715","DOIUrl":"10.1016/j.websem.2022.100715","url":null,"abstract":"<div><p>Due to the Semantic Web’s decentralised nature, ontology engineers rarely know all applications that leverage their ontology. Consequently, they are unaware of the full extent of possible consequences that changes might cause to the ontology. Our goal is to lessen the gap between ontology engineers and users by investigating ontology engineers’ understanding of ontology changes’ impact at editing time. Hence, this paper introduces the Protégé plugin ChImp which we use to reach our goal. We elicited requirements for ChImp through a questionnaire with ontology engineers. We then developed ChImp according to these requirements and it displays all changes of a given session and provides selected information on said changes and their effects. For each change, it computes a number of metrics on both the ontology and its materialisation. It displays those metrics on both the originally loaded ontology at the beginning of the editing session and the current state to help ontology engineers understand the impact of their changes.</p><p>We investigated the informativeness of materialisation impact measures, the meaning of severe impact, and also the usefulness of ChImp in an online user study with 36 ontology engineers. We asked the participants to solve two ontology engineering tasks – with and without ChImp (assigned in random order) – and answer in-depth questions about the applied changes as well as the materialisation impact measures. We found that ChImp increased the participants’ understanding of change effects and that they felt better informed. Answers also suggest that the proposed measures were useful and informative. We also learned that the participants consider different outcomes of changes severe, but most would define severity based on the amount of changes to the materialisation compared to its size. The participants also acknowledged the importance of quantifying the impact of changes and that the study will affect their approach of editing ontologies.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826822000117/pdfft?md5=d670a71e0b97bf575771316f7eb96324&pid=1-s2.0-S1570826822000117-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84917634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2022-10-01DOI: 10.1016/j.websem.2022.100721
Paolo Pareti , George Konstantinidis , Fabio Mogavero
The Shapes Constraint Language (SHACL) is the recent W3C recommendation language for validating RDF data, by verifying certain shapes on graphs. Previous work has largely focused on the validation problem, while the standard decision problems of satisfiability and containment, crucial for design and optimisation purposes, have only been investigated for simplified versions of SHACL. Moreover, the SHACL specification does not define the semantics of recursively-defined constraints, which led to several alternative recursive semantics being proposed in the literature. The interaction between these different semantics and important decision problems has not been investigated yet. In this article we provide a comprehensive study of the different features of SHACL, by providing a translation to a new first-order language, called SCL, that precisely captures the semantics of SHACL. We also present MSCL, a second-order extension of SCL, which allows us to define, in a single formal logic framework, the main recursive semantics of SHACL. Within this language we also provide an effective treatment of filter constraints which are often neglected in the related literature. Using this logic we provide a detailed map of (un)decidability and complexity results for the satisfiability and containment decision problems for different SHACL fragments. Notably, we prove that both problems are undecidable for the full language, but we present decidable combinations of interesting features, even in the face of recursion.
{"title":"Satisfiability and containment of recursive SHACL","authors":"Paolo Pareti , George Konstantinidis , Fabio Mogavero","doi":"10.1016/j.websem.2022.100721","DOIUrl":"10.1016/j.websem.2022.100721","url":null,"abstract":"<div><p>The <em>Shapes Constraint Language</em> (SHACL) is the recent W3C recommendation language for validating RDF data, by verifying certain <em>shapes</em> on graphs. Previous work has largely focused on the validation problem, while the standard decision problems of <em>satisfiability</em> and <em>containment</em>, crucial for design and optimisation purposes, have only been investigated for simplified versions of SHACL. Moreover, the SHACL specification does not define the semantics of recursively-defined constraints, which led to several alternative <em>recursive</em> semantics being proposed in the literature. The interaction between these different semantics and important decision problems has not been investigated yet. In this article we provide a comprehensive study of the different features of SHACL, by providing a translation to a new first-order language, called SCL, that precisely captures the semantics of SHACL. We also present MSCL, a second-order extension of SCL, which allows us to define, in a single formal logic framework, the main recursive semantics of SHACL. Within this language we also provide an effective treatment of filter constraints which are often neglected in the related literature. Using this logic we provide a detailed map of (un)decidability and complexity results for the satisfiability and containment decision problems for different SHACL fragments. Notably, we prove that both problems are undecidable for the full language, but we present decidable combinations of interesting features, even in the face of recursion.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":null,"pages":null},"PeriodicalIF":2.5,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826822000130/pdfft?md5=3a752036fb876665b02db4a6a6bd0704&pid=1-s2.0-S1570826822000130-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90552965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}