Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100775
Pierre Maillot, Olivier Corby, Catherine Faron, Fabien Gandon, Franck Michel
In recent years, a large number of RDF datasets have been built and published on the Web in fields as diverse as linguistics or life sciences, as well as general datasets such as DBpedia or Wikidata. The joint exploitation of these datasets requires specific knowledge about their content, access points, and commonalities. However, not all datasets contain a self-description, and not all access points can handle the complex queries used to generate such a description.
In this article, we provide a standards-based approach to generating the description of a dataset. The generated descriptions, as well as the process of their computation, are expressed using standard vocabularies and languages. We implemented our approach in a framework called IndeGx, in which each indexing feature and its computation are collaboratively and declaratively defined in a GitHub repository. We evaluated IndeGx over 8 months on a set of 339 RDF datasets with endpoints listed in public catalogs. The results show that we can collect as many important characteristics of the datasets as their availability and capacities allow. The resulting index captures the commonalities, variety and disparity in the offered content and services, and it provides important support for any application designed to query RDF datasets.
Title: IndeGx: A model and a framework for indexing RDF knowledge graphs with SPARQL-based test suits. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2023.100775
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100776
Przemysław A. Wałęga, Mark Kaminski, Dingmin Wang, Bernardo Cuenca Grau
We study stream reasoning in DatalogMTL, an extension of Datalog with metric temporal operators. We propose a sound and complete stream reasoning algorithm that is applicable to forward-propagating DatalogMTL programs, in which propagation of derived information towards past time points is precluded. Memory consumption in our generic algorithm depends on both the properties of the rule set and the input data stream; in particular, it depends on the distances between timestamps occurring in the data. This may be undesirable in certain practical scenarios since these distances can be very small, in which case the algorithm may require large amounts of memory. To address this issue, we propose a second algorithm, in which the size of the required memory becomes independent of the timestamps in the data, at the expense of disallowing punctual intervals in the rule set. We have implemented our approach as an extension of the DatalogMTL reasoner MeTeoR and tested it experimentally. The obtained results support the feasibility of our approach in practice.
Title: Stream reasoning with DatalogMTL. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2023.100776
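To make the memory remark in the DatalogMTL abstract above concrete, the following is a minimal sketch, not the authors' algorithm and not the MeTeoR API. It evaluates a single past-looking metric operator ("the fact held at some point between `lo` and `hi` time units ago") over an ordered stream. The class name `PastWindow` and its methods are hypothetical, and the sketch assumes that observations and queries arrive in non-decreasing time order.

```python
from collections import deque


class PastWindow:
    """Hypothetical helper: did a fact hold at some time in [t - hi, t - lo]?"""

    def __init__(self, lo, hi):
        if not 0 <= lo <= hi:
            raise ValueError("expected 0 <= lo <= hi")
        self.lo = lo
        self.hi = hi
        self.timestamps = deque()  # observation times, in ascending order

    def observe(self, t):
        # Assumption: timestamps arrive in non-decreasing order.
        self.timestamps.append(t)

    def holds_at(self, t):
        # Observations older than t - hi can never matter again because
        # queries only move forward in time, so they are forgotten.
        while self.timestamps and self.timestamps[0] < t - self.hi:
            self.timestamps.popleft()
        # After pruning, the oldest remaining observation is the best candidate.
        return bool(self.timestamps) and self.timestamps[0] <= t - self.lo


window = PastWindow(lo=1, hi=3)
window.observe(0)
print(window.holds_at(1), window.holds_at(4))
```

The number of retained timestamps grows with how densely observations fall inside the window, which is one way to picture why memory use can depend on the distances between timestamps in the data.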
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100774
Jean-Paul Calbimonte, Orfeas Aidonopoulos, Fabien Dubosson, Benjamin Pocklington, Ilia Kebets, Pierre-Mikael Legris, Michael Schumacher
Personalized healthcare is nowadays driven by the increasing volumes of patient data, observed and produced continuously by medical devices, mobile sensors, patient-reported outcomes, and other data sources. Because of their dynamic nature, these data are made available as streams, which poses an important challenge for processing, querying and interpreting the incoming information. In addition, the sensitive nature of healthcare data imposes significant privacy restrictions, which has led to the emergence of decentralized personal data management systems. Data semantics play a key role in enabling both decentralization and integration of personal health data, as they introduce the capability to represent knowledge and information using ontologies and semantic vocabularies. In this paper we describe the SemPryv system, which provides the means to manage personal health data streams enriched with semantic information. SemPryv is designed as a decentralized system, so that users can host their personal data at different sites while keeping control of access rights. The semantization of data in SemPryv is implemented through different strategies, ranging from rule-based annotation to machine learning-based suggestions, fed from third-party specialized healthcare metadata providers. The system has been made available as open source and is integrated into the Pryv.io platform, which is used and commercialized in the healthcare and personal data management industry.
Title: Decentralized semantic provision of personal health streams. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2023.100774
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100773
Antonio De Nicola, Anna Formica, Michele Missikoff, Elaheh Pourabbas, Francesco Taglino
We present the parametric method SemSim^p, aimed at measuring the semantic similarity of digital resources. SemSim^p is based on the notion of information content, and it leverages a reference ontology and taxonomic reasoning, encompassing different approaches for weighting the concepts of the ontology. In particular, weights can be computed by considering either the available digital resources or the structure of the reference ontology of a given domain. SemSim^p is assessed against six representative semantic similarity methods for comparing sets of concepts proposed in the literature, through an experiment that includes both a statistical analysis and an expert judgment evaluation. To achieve a reliable assessment, we used a large real-world dataset based on the Digital Library of the Association for Computing Machinery (ACM), and a reference ontology derived from the ACM Computing Classification System (ACM-CCS). For each method, we considered two indicators. The first concerns the degree of confidence in identifying the similarity among the papers belonging to some special issues selected from the ACM Transactions on Information Systems journal; the second concerns the Pearson correlation with human judgment. The results reveal that one of the configurations of SemSim^p outperforms the other assessed methods. An additional experiment performed in the domain of physics shows that, in general, SemSim^p provides better results than the other similarity methods.
Title: A parametric similarity method: Comparative experiments based on semantically annotated large datasets. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2023.100773
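The notion of information content that the SemSim^p abstract above builds on can be illustrated with a small sketch. This is not SemSim^p itself: it uses the classic Lin measure between two single concepts, with information content IC(c) = log(N / n_c) estimated from annotation frequencies. The taxonomy, the counts, and all function names below are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy taxonomy: concept -> parent (the root has parent None).
PARENT = {
    "Computing": None,
    "InformationSystems": "Computing",
    "Databases": "InformationSystems",
    "WebSearch": "InformationSystems",
    "Theory": "Computing",
}

# Hypothetical number of resources annotated directly with each concept.
COUNTS = {"Computing": 0, "InformationSystems": 2, "Databases": 5,
          "WebSearch": 3, "Theory": 10}


def ancestors(concept):
    """Return the concept followed by its ancestors, ending at the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def information_content():
    """IC(c) = log(N / n_c), where n_c counts c and all of its descendants."""
    cumulative = {c: 0 for c in PARENT}
    for concept, n in COUNTS.items():
        for a in ancestors(concept):
            cumulative[a] += n
    total = sum(COUNTS.values())
    return {c: math.log(total / cumulative[c]) for c in PARENT}


IC = information_content()


def lin_similarity(a, b):
    """Lin similarity: 2 * IC(lowest common subsumer) / (IC(a) + IC(b))."""
    if a == b:
        return 1.0
    b_chain = set(ancestors(b))
    subsumer = next(c for c in ancestors(a) if c in b_chain)
    return 2 * IC[subsumer] / (IC[a] + IC[b])


print(lin_similarity("Databases", "WebSearch"))
print(lin_similarity("Databases", "Theory"))
```

In this toy setting, concepts that share a specific ancestor score higher than concepts that meet only at the root, whose information content is zero. Weighting by ontology structure instead of annotation frequency, as the abstract mentions, would replace `COUNTS` with a structural estimate.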
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2022.100769
Shrestha Ghosh, Simon Razniewski, Gerhard Weikum
In this work we address the challenging case of answering count queries in web search, such as "number of songs by John Lennon". Prior methods merely answer these with a single, sometimes puzzling number, or return a ranked list of text snippets with different numbers. This paper proposes a methodology for answering count queries with inference, contextualization and explanatory evidence. Unlike previous systems, our method infers final answers from multiple observations, supports semantic qualifiers for the counts, and provides evidence by enumerating representative instances. Experiments with a wide variety of queries, including an existing benchmark, show the benefits of our method and the influence of specific parameter settings. Our code, data and an interactive system demonstration are publicly available at https://github.com/ghoshs/CoQEx and https://nlcounqer.mpi-inf.mpg.de/.
Title: Answering Count Questions with Structured Answers from Text. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2022.100769
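The idea of inferring one final count from several observations, as described in the count-query abstract above, can be sketched with a confidence-weighted median. This is only one simple aggregation choice and is not claimed to be the method implemented in CoQEx. The function name and the sample observations are hypothetical.

```python
def weighted_median(observations):
    """Pick the count at which cumulative confidence reaches half of the total.

    observations: list of (count, confidence) pairs extracted from text snippets.
    """
    ordered = sorted(observations)
    half = sum(weight for _, weight in ordered) / 2
    running = 0.0
    for value, weight in ordered:
        running += weight
        if running >= half:
            return value
    raise ValueError("no observations given")


# Hypothetical counts and extraction confidences from five snippets.
snippets = [(180, 0.9), (150, 0.4), (5, 0.3), (180, 0.7), (700, 0.2)]
print(weighted_median(snippets))  # prints 180
```

A weighted median is robust to outliers such as the isolated 5 and 700 above, which is why it is a common choice when snippets disagree.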
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2022.100771
Bernhard Krabina
While research on semantic wikis is declining, Semantic MediaWiki (SMW) can still play an important role in the emerging field of knowledge graph curation.
The Vienna History Wiki, a large knowledge base curated by the city government in collaboration with other institutions and the general public, provides an ideal use case for demonstrating the strengths and weaknesses of SMW as well as for discussing the challenges of co-curation in a cultural heritage setting. This paper describes processes such as collaborative editing, interlinking unique identifiers on the web, sharing data with Wikidata, and making use of Schema.org and other ontologies. It presents insights from a user survey, access statistics, and a knowledge graph analysis.
This work contributes to the scarce research on wiki usage outside of the Wikipedia ecosystem as well as to the field of community-based knowledge graph curation. The availability of a now significantly improved RDF representation indicates future directions for research and practice.
Title: Building a Knowledge Graph for the History of Vienna with Semantic MediaWiki. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2022.100771
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2022.100761
Jixiong Liu, Yoan Chabot, Raphaël Troncy, Viet-Phi Huynh, Thomas Labbé, Pierre Monnin
Tabular data often refers to data that is organized in a table with rows and columns. We observe that this data format is widely used on the Web and within enterprise data repositories. Tables potentially contain rich semantic information that still needs to be interpreted. The process of extracting meaningful information out of tabular data with respect to a semantic artefact, such as an ontology or a knowledge graph, is often referred to as Semantic Table Interpretation (STI) or Semantic Table Annotation. In this survey paper, we aim to provide a comprehensive and up-to-date review of the tasks and methods that have been proposed so far to perform STI. First, we propose a new categorization that reflects the heterogeneity of table types that one can encounter, revealing different challenges that need to be addressed. Next, we define five major sub-tasks that STI deals with, even though the literature has mostly focused on three of them so far. We review and group the many approaches that have been proposed into three macro families, and we discuss their performance and limitations with respect to the various datasets and benchmarks proposed by the community. Finally, we detail the scientific barriers that remain before any type of table found on the wild Web can be interpreted truly automatically.
Title: From tabular data to knowledge graphs: A survey of semantic table interpretation tasks and methods. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2022.100761
Pub Date: 2023-04-01 | DOI: 10.1016/j.websem.2023.100772
Nikolaos I. Spanoudakis, Georgios Gligoris, Adamos Koumi, Antonis C. Kakas
Gorgias Cloud offers an integrated application development environment that facilitates the development of argumentation-based systems over the internet. Argumentation is offered as a service, in a way that allows application systems to remotely access the argumentation service and utilize the results of the argumentative computation. Moreover, the service results include the explanation of the decision in both human-readable and machine-readable formats. The former allows the application to be validated by experts, while the latter is useful for development. It appears that this is the first case where argumentation is offered to developers in such an open and distributed way.
Title: Explainable argumentation as a service. Journal of Web Semantics, 2023-04-01. https://doi.org/10.1016/j.websem.2023.100772
Pub Date: 2023-01-01 | DOI: 10.1016/j.websem.2022.100758
Luca Turchet, Francesco Antoniazzi
The Internet of Musical Things (IoMusT) refers to the extension of the Internet of Things paradigm to the musical domain. Interoperability represents a central issue within this domain, where heterogeneous Musical Things serving radically different purposes are envisioned to communicate with each other. Automatic discovery of resources is also a desirable feature in IoMusT ecosystems. However, the existing musical protocols are not adequate to support discoverability and interoperability across the wide heterogeneity of Musical Things: they are typically not flexible, lack high resolution, and are not equipped with inference mechanisms that could exploit, on board, the information on the whole application environment. Besides, they hardly ever support easy integration with the Web. In addition, IoMusT applications are often characterized by strict requirements in terms of latency of the exchanged messages. Semantic Web of Things technologies have the potential to overcome the limitations of existing musical protocols by enabling discoverability and interoperability across heterogeneous Musical Things. In this paper we propose the Musical Semantic Event Processing Architecture (MUSEPA), a semantically-based architecture designed to meet the IoMusT requirements of low-latency communication, discoverability, interoperability, and automatic inference. The architecture is based on the CoAP protocol, a semantic publish/subscribe broker, and the adoption of shared ontologies for describing Musical Things and their interactions. The code implementing MUSEPA can be accessed at: https://github.com/CIMIL/MUSEPA/.
Title: Semantic Web of Musical Things: Achieving interoperability in the Internet of Musical Things. Journal of Web Semantics, 2023-01-01. https://doi.org/10.1016/j.websem.2022.100758
The Semantic Web is distributed yet interoperable: Distributed since resources are created and published by a variety of producers, tailored to their specific needs and knowledge; Interoperable as entities are linked across resources, allowing to use resources from different providers in concord. Complementary to the explicit usage of Semantic Web resources, embedding methods made them applicable to machine learning tasks. Subsequently, embedding models for numerous tasks and structures have been developed, and embedding spaces for various resources have been published. The ecosystem of embedding spaces is distributed but not interoperable: Entity embeddings are not readily comparable across different spaces. To parallel the Web of Data with a Web of Embeddings, we must thus integrate available embedding spaces into a uniform space.
Current integration approaches are limited to two spaces and presume that both were embedded with the same method; neither assumption is likely to hold in the context of a Web of Embeddings. In this paper, we present FedCoder, an approach that integrates multiple embedding spaces via a latent space. We assert that linked entities have a similar representation in the latent space so that entities become comparable across embedding spaces. FedCoder employs an autoencoder to learn this latent space from linked as well as non-linked entities.
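To make the idea of a shared latent space concrete, the following is a minimal sketch of the kind of objective such an approach could optimize. It is not the FedCoder implementation: the abstract does not specify the architecture or loss, so the linear encoders and decoders, the squared-error terms, and all names (`SpaceCoder`, `reconstruction_and_alignment`, the toy entities) are assumptions made for illustration. Every entity contributes a reconstruction term, while linked entities additionally contribute an alignment term that pulls their latent codes together.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class SpaceCoder:
    """Hypothetical linear encoder/decoder pair for ONE embedding space.

    Each space may have its own dimensionality; all spaces share latent_dim.
    """

    def __init__(self, dim, latent_dim, rng):
        self.enc = random_matrix(latent_dim, dim, rng)  # space -> latent
        self.dec = random_matrix(dim, latent_dim, rng)  # latent -> space

    def encode(self, vector):
        return matvec(self.enc, vector)

    def decode(self, latent):
        return matvec(self.dec, latent)


def reconstruction_and_alignment(coders, spaces, links):
    """Return (mean reconstruction error, mean alignment error).

    spaces: {space_name: {entity_name: vector}}
    links:  [((space_a, entity_a), (space_b, entity_b)), ...]
    """
    # Reconstruction term: uses every entity, linked or not.
    recon, count = 0.0, 0
    for name, entities in spaces.items():
        coder = coders[name]
        for vector in entities.values():
            recon += sq_dist(coder.decode(coder.encode(vector)), vector)
            count += 1
    recon = recon / count if count else 0.0

    # Alignment term: linked entities should have similar latent codes.
    align = 0.0
    for (space_a, ent_a), (space_b, ent_b) in links:
        z_a = coders[space_a].encode(spaces[space_a][ent_a])
        z_b = coders[space_b].encode(spaces[space_b][ent_b])
        align += sq_dist(z_a, z_b)
    align = align / len(links) if links else 0.0
    return recon, align


# Toy data (made up): two spaces with different dimensionalities.
rng = random.Random(0)
spaces = {
    "A": {"berlin": [0.9, 0.1, 0.3], "paris": [0.2, 0.8, 0.5]},
    "B": {"Berlin": [0.4, 0.7], "Rome": [0.6, 0.1]},
}
coders = {"A": SpaceCoder(3, 2, rng), "B": SpaceCoder(2, 2, rng)}
links = [(("A", "berlin"), ("B", "Berlin"))]

recon, align = reconstruction_and_alignment(coders, spaces, links)
print("reconstruction:", recon, "alignment:", align)
```

In a trained model the two terms would be combined (for instance as a weighted sum, another assumption here) and minimized by gradient descent over all encoder and decoder parameters. Because each space only needs its own encoder and decoder, adding a further embedding space does not require pairwise mappings to every existing one.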
Our experiments show that FedCoder substantially outperforms state-of-the-art approaches when faced with different embedding models, that it scales better than previous methods in the number of embedding spaces, and that it improves as more graphs are integrated. At the same time, it performs comparably with current approaches that assume joint learning of the embeddings and are usually limited to two sources. Our results demonstrate that FedCoder is well suited to integrating the distributed, diverse, and large ecosystem of embedding spaces into an interoperable Web of Embeddings.
{"title":"Towards the Web of Embeddings: Integrating multiple knowledge graph embedding spaces with FedCoder","authors":"Matthias Baumgartner, Daniele Dell’Aglio, Heiko Paulheim, Abraham Bernstein","doi":"10.1016/j.websem.2022.100741","DOIUrl":"https://doi.org/10.1016/j.websem.2022.100741","journal":{"name":"Journal of Web Semantics"},"PeriodicalIF":2.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","isOpenAccess":false}