Journal of Web Semantics: Latest Articles

Three-dimensional Geospatial Interlinking with JedAI-spatial
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-03-24, DOI: 10.1016/j.websem.2024.100817
Marios Papamichalopoulos, George Papadakis, George Mandilaras, Maria Siampou, Nikos Mamoulis, Manolis Koubarakis

Geospatial data constitutes a considerable part of Semantic Web data, but so far, its sources are inadequately interlinked in the Linked Open Data cloud. Geospatial Interlinking aims to cover this gap by associating geometries with topological relations like those of the Dimensionally Extended 9-Intersection Model. Due to its quadratic time complexity, various algorithms aim to carry out Geospatial Interlinking efficiently. We present JedAI-spatial, a novel, open-source system that organizes these algorithms according to three dimensions: (i) Space Tiling, which determines the approach that reduces the search space, (ii) Budget-awareness, which distinguishes interlinking algorithms into batch and progressive ones, and (iii) Execution mode, which discerns between serial algorithms, running on a single CPU-core, and parallel ones, running on top of Apache Spark. We analytically describe JedAI-spatial’s architecture and capabilities and perform thorough experiments to provide interesting insights about the relative performance of its algorithms.
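
The Space Tiling dimension can be illustrated with a minimal sketch. It assumes an equi-sized grid over geometry bounding boxes, which is only one possible tiling; it is not JedAI-spatial's code, and the names `tiles_of`, `boxes_intersect`, `candidate_pairs` and `tile_size` are hypothetical. The point is that only geometries sharing a tile become candidate pairs, so the expensive topological verification (e.g. computing DE-9IM relations) is not run on every source-target combination.

```python
from collections import defaultdict
from itertools import product

# A bounding box is (min_x, min_y, max_x, max_y). All names are illustrative.

def tiles_of(box, tile_size):
    """Return the (column, row) grid tiles that a bounding box overlaps."""
    min_x, min_y, max_x, max_y = box
    cols = range(int(min_x // tile_size), int(max_x // tile_size) + 1)
    rows = range(int(min_y // tile_size), int(max_y // tile_size) + 1)
    return product(cols, rows)

def boxes_intersect(a, b):
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def candidate_pairs(source, target, tile_size):
    """Index source boxes by tile, then probe the index with each target box."""
    index = defaultdict(list)
    for i, box in enumerate(source):
        for tile in tiles_of(box, tile_size):
            index[tile].append(i)

    pairs = set()  # a set removes pairs that co-occur in several tiles
    for j, box in enumerate(target):
        for tile in tiles_of(box, tile_size):
            for i in index.get(tile, ()):
                if boxes_intersect(source[i], box):
                    pairs.add((i, j))
    return pairs

# Made-up bounding boxes for illustration only.
source = [(0, 0, 2, 2), (10, 10, 12, 12)]
target = [(1, 1, 3, 3), (20, 20, 21, 21)]
print(sorted(candidate_pairs(source, target, tile_size=5)))  # prints [(0, 0)]
```

In this toy input only one of the four possible pairs survives the filter; a real system would then verify that pair against the actual geometries.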

Citations: 0
Extraction of object-action and object-state associations from Knowledge Graphs
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-03-19, DOI: 10.1016/j.websem.2024.100816
Alexandros Vassiliades, Theodore Patkos, Vasilis Efthymiou, Antonis Bikakis, Nick Bassiliades, Dimitris Plexousakis

Infusing autonomous artificial systems with knowledge about the physical world they inhabit is a critical and long-held aim for the Artificial Intelligence community. Training systems with relevant data is a typical approach; however, finding the data required is not always possible, especially when much of this knowledge is commonsense. In this paper, we present a comparison of topology-based and semantics-based methods for extracting information about object-action and object-state association relations from knowledge graphs, such as ConceptNet, WordNet, ATOMIC, YAGO, WebChild and DBpedia. Moreover, we propose a novel method for extracting information about object-action and object-state associations from knowledge graphs. Our method is composed of a set of techniques for locating, enriching, evaluating, cleaning and exposing knowledge from such resources, relying on semantic similarity methods. Some important aspects of our method are the flexibility in deciding how to deal with the noise that exists in the data, and the capability to determine the importance of a path through training, rather than through manual annotation.

Citations: 0
A simple and efficient approach to unsupervised instance matching and its application to linked data of power plants
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2024-02-20, DOI: 10.1016/j.websem.2024.100815
Andreas Eibeck, Shaocong Zhang, Mei Qi Lim, Markus Kraft

Knowledge graphs store and link semantically annotated data about real-world entities from a variety of domains and on a large scale. The World Avatar is based on a dynamic decentralised knowledge graph and on semantic technologies to realise complex cross-domain scenarios. Accurate computational results for such scenarios require the availability of complete, high-quality data. This work focuses on instance matching — one of the subtasks of automatically populating the knowledge graph with data from a wide spectrum of external sources. Instance matching compares two data sets and seeks to identify instances (data, records) referring to the same real-world entity. We introduce AutoCal, a new instance matcher which does not require labelled data and runs out of the box for a wide range of domains without tuning method-specific parameters. AutoCal achieves results competitive to recently proposed unsupervised matchers from the field of Machine Learning. We also select an unsupervised state-of-the-art matcher from the field of Deep Learning for a thorough comparison. Our results show that neither AutoCal nor the state-of-the-art matcher is superior regarding matching quality while AutoCal has only moderate hardware requirements and runs 2.7 to 60 times faster. In summary, AutoCal is specifically well-suited to be used in an automated environment. We present its prototypical integration into the World Avatar and apply AutoCal to the domain of power plants which is relevant for practical environmental scenarios of the World Avatar.
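
The instance matching task described above can be sketched in a few lines. This is not AutoCal's algorithm, which the abstract does not detail; it is a generic unsupervised baseline that assumes token-level Jaccard similarity on names and keeps only mutual best matches above a threshold. The function names, the threshold of 0.5 and the records are all hypothetical.

```python
def tokens(text):
    """Lower-case a string and split it into a set of whitespace tokens."""
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(query, candidates):
    """Return (score, key) of the candidate most similar to the query."""
    scored = [(jaccard(tokens(query), tokens(text)), key)
              for key, text in candidates.items()]
    return max(scored) if scored else (0.0, None)

def match_instances(left, right, threshold=0.5):
    """Pair records whose best match is mutual and scores at least `threshold`."""
    matches = []
    for left_id, left_text in left.items():
        score, right_id = best_match(left_text, right)
        if right_id is None or score < threshold:
            continue
        _, back_id = best_match(right[right_id], left)
        if back_id == left_id:  # keep only mutual best matches
            matches.append((left_id, right_id))
    return matches

# Fictional records for illustration only.
left = {"a1": "North Ridge Power Plant", "a2": "Lakeside Unit B"}
right = {"b1": "north ridge power plant east", "b2": "Harbour B"}
print(match_instances(left, right))  # prints [('a1', 'b1')]
```

No labelled pairs are used anywhere, which is the sense in which such a matcher is unsupervised; the fixed threshold is exactly the kind of method-specific parameter the abstract says AutoCal avoids tuning.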

Citations: 0
FIDES: An ontology-based approach for making machine learning systems accountable
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2023-11-04, DOI: 10.1016/j.websem.2023.100808
Izaskun Fernandez, Cristina Aceta, Eduardo Gilabert, Iker Esnaola-Gonzalez

Although technologies based on Artificial Intelligence (AI) are rather mature nowadays, their adoption, deployment and application are not as wide as could be expected. This can be attributed to many barriers, among which users' lack of trust stands out. Accountability is a relevant factor for progress in trustworthiness, as it makes it possible to determine the causes behind a given decision or suggestion made by an AI system. This article focuses on the accountability of a specific branch of AI, statistical machine learning (ML), based on a semantic approach. FIDES, an ontology-based approach towards achieving the accountability of ML systems, is presented, where all the relevant information related to an ML-based model is semantically annotated, from the dataset and model parametrisation to deployment aspects, to be exploited later to answer questions related to reproducibility, replicability and, ultimately, accountability. The feasibility of the proposed approach has been demonstrated in two scenarios, real-world energy efficiency and manufacturing, and it is expected to pave the way towards raising awareness about the potential of Semantic Technologies in different factors that may be key in the trustworthiness of AI-based systems.

Citations: 2
Semantic Web and blockchain technologies: Convergence, challenges and research trends
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2023-11-03, DOI: 10.1016/j.websem.2023.100809
Klevis Shkembi, Petar Kochovski, Thanasis G. Papaioannou, Caroline Barelle, Vlado Stankovski

In recent years, on the one hand, we have witnessed the rise of blockchain technology, which has led to better transparency, traceability, and therefore, trustworthy exchange of digital assets among different actors. On the other hand, achieving trustworthy content exchange has been one of the primary objectives of the Semantic Web, part of the World Wide Web Consortium. Semantic Web and blockchain technologies are the fundamental building blocks of Web3 (the third version of the Internet), which aims to link data through a decentralized approach. Blockchain provides a decentralized and secure framework for users to safeguard their data and take control over their data and Web3 experiences. However, developing trustworthy decentralized applications (Dapps) is a challenge because many blockchain-based functionalities must be developed from scratch, and combined with data semantics to open new innovative opportunities. In this survey paper, we explore the cross-cutting domain of the Semantic Web and blockchain and identify the critical building blocks required to achieve trust in the Next-Generation Internet. The application domains that could benefit from these technologies are also investigated. We developed a deep analysis of the published literature between 2015 and 2023. We performed our analysis in different digital libraries (e.g., Elsevier, IEEE, ACM), and as a result of our research, we retrieved 137 papers, of which 97 were retrieved as relevant to include in the paper. Furthermore, we studied several aspects (e.g., network type, transactions per second) of existing blockchain platforms. Semantic Web and blockchain technologies can be used to realize a verification and certification process for data quality. Examples of mechanisms to achieve this are the Decentralized Identities of the Semantic Web or the various blockchain consensus protocols that help achieve decentralization and realize democratic principles. Therefore, Semantic Web and blockchain technologies should be combined to achieve trust in the highly decentralized, semantically complex, and dynamic environments needed to build smart applications of the future.

Citations: 0
Ontology alignment with semantic and structural embeddings
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2023-10-01, DOI: 10.1016/j.websem.2023.100798
Zhigang Hao, Wolfgang Mayer, Jingbo Xia, Guoliang Li, Li Qin, Zaiwen Feng

Ontology alignment is essential for data integration and interoperability across multiple applications across diverse disciplines. In recent decades, significant advancements have been made in the development of advanced methods and systems for ontology alignment. Empirical results have suggested that ontological semantics can be effectively employed to enhance the alignment process. Besides, structural information is crucial for ontology alignment as it reflects the relations among adjacent concepts in the ontology. Previous works are mainly based on external lexicon and predefined rules based on ontological structure. Recently, deep learning has imposed positive impacts on ontology alignment and obtained substantial improvement.

This paper proposes a new method based on ontology embedding incorporating the semantic and structural features. It utilizes the distance between the embedding of two ontological concepts to be aligned as the criterion for alignment. The proposed method is used to align two widely used food ontologies and three Chinese food classification ontologies. The experimental results show that our method enhances the performance compared to several state-of-the-art alignment systems, demonstrating the importance of learning semantic representation and structural representation. Furthermore, the proposed method is evaluated on several different tracks of the Ontology Alignment Evaluation Initiative (OAEI), and experimental results show that our method outperforms other baselines in effectiveness. The data and code can be obtained from: https://github.com/haozhigang1111/Ontology-Alignment.git.
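
The alignment criterion described above, distance between concept embeddings, can be sketched as follows. The abstract does not say which distance or which matching strategy the authors use, so this sketch assumes cosine distance, a nearest-neighbour search and a hypothetical cut-off `max_distance`; the three-dimensional vectors are made up, and the code is unrelated to the repository linked above.

```python
import math

def cosine_distance(u, v):
    """1 minus cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm

def align(source_emb, target_emb, max_distance=0.2):
    """Map each source concept to its nearest target concept, if close enough."""
    alignment = {}
    for s_concept, s_vec in source_emb.items():
        t_concept, dist = min(
            ((t, cosine_distance(s_vec, t_vec)) for t, t_vec in target_emb.items()),
            key=lambda pair: pair[1],
        )
        if dist <= max_distance:  # reject nearest neighbours that are still far away
            alignment[s_concept] = t_concept
    return alignment

# Made-up embeddings for illustration only.
source_emb = {"Apple": [0.9, 0.1, 0.0], "Bread": [0.0, 0.2, 0.9]}
target_emb = {"apple_fruit": [1.0, 0.0, 0.0], "rice": [0.1, 0.9, 0.1]}
print(align(source_emb, target_emb))  # prints {'Apple': 'apple_fruit'}
```

Here "Bread" stays unaligned because its nearest target concept lies beyond the cut-off, which shows why a threshold is needed in addition to nearest-neighbour search.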

Citations: 0
An analysis of discussions in collaborative knowledge engineering through the lens of Wikidata
IF 2.5, Zone 3 Computer Science, Q1 Computer Science, Pub Date: 2023-10-01, DOI: 10.1016/j.websem.2023.100799
Elisavet Koutsiana, Gabriel Maia Rocha Amaral, Neal Reeves, Albert Meroño-Peñuela, Elena Simperl

We study discussions in Wikidata, the world’s largest open-source collaborative knowledge graph (KG). This is important because it helps KG community managers understand how discussions are used and inform the design of collaborative practices and support tools. We follow a mixed-methods approach with descriptive statistics, thematic analysis, and statistical tests to investigate how much discussions in Wikidata are used, what they are used for, and how they support knowledge engineering (KE) activities. The study covers three core sources of discussion, the talk pages that accompany Wikidata items and properties, and a general-purpose communication page. Our findings show low use of discussion capabilities and a power-law distribution similar to other KE projects such as Schema.org. When discussions are used, they are mostly about KE activities, including activities that span across the entire KE lifecycle from conceptualisation and implementation to maintenance and taxonomy building. We hope that the findings will help Wikidata devise improved practices and capabilities to encourage the use of discussions as a tool to collaborate, improve editor engagement, and engineer better KGs.

Citations: 0
Online maintenance of evolving knowledge graphs with RDFS-based saturation and why-provenance support
IF 2.5 Zone 3 (Computer Science) Q1 Computer Science Pub Date: 2023-10-01 DOI: 10.1016/j.websem.2023.100796
Khalid Belhajjame, Mohamed-Yassine Mejri

Enterprise RDF knowledge graphs are often built using extraction data pipelines that are fed by several heterogeneous sources (relational databases, CSV files or even unstructured textual data). As a direct consequence, the construction of these KGs undergoes a number of changes in the early stages of their life cycle, which are initiated by a human developer and therefore need to be done interactively and efficiently. Driven by such needs, in this paper, we present a solution for the incremental maintenance of KGs given user-prescribed changes. A key feature of the proposed solution is the support of provenance collection that can be used to assist the developer in the analysis and debugging of the KG. Specifically, we strive to compute and maintain the provenance of asserted and inferred facts in the knowledge graph incrementally (and thus efficiently). The evaluation exercises we have conducted show the effectiveness of our solution and highlight the parameters that impact performance.
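
As an illustration of the idea summarized in this abstract, the following is a minimal sketch of RDFS subclass saturation that records why-provenance and uses it to handle a deletion without recomputing the graph from scratch. It is not the authors' algorithm: the function names `saturate` and `delete`, the `ex:` identifiers, and the restriction to the two `rdfs:subClassOf` rules are assumptions made for this example, and justifications are not minimized.

```python
from collections import defaultdict

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def saturate(asserted):
    """Map each asserted or inferred triple to its set of justifications.

    A justification is a frozenset of asserted triples that together
    entail the triple (its why-provenance). Hypothetical helper.
    """
    prov = defaultdict(set)
    for t in asserted:
        prov[t].add(frozenset([t]))
    changed = True
    while changed:  # naive fixpoint over the two subclass rules
        changed = False
        triples = list(prov)
        for (s, p, o) in triples:
            if p not in (TYPE, SUBCLASS):
                continue
            for (s2, p2, o2) in triples:
                if p2 != SUBCLASS or s2 != o:
                    continue
                derived = (s, p, o2)  # type propagation or transitivity
                for j1 in list(prov[(s, p, o)]):
                    for j2 in list(prov[(s2, p2, o2)]):
                        j = j1 | j2
                        if j not in prov[derived]:
                            prov[derived].add(j)
                            changed = True
    return dict(prov)

def delete(prov, asserted_triple):
    """Drop every justification that uses the deleted triple.

    Triples left with no justification are removed from the graph.
    """
    result = {}
    for t, justs in prov.items():
        kept = {j for j in justs if asserted_triple not in j}
        if kept:
            result[t] = kept
    return result

asserted = {
    ("ex:rex", TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
}
prov = saturate(asserted)
after = delete(prov, ("ex:Dog", SUBCLASS, "ex:Mammal"))
```

In this sketch a developer can inspect `prov[("ex:rex", TYPE, "ex:Animal")]` to see which asserted triples support an inferred fact, which is the kind of debugging aid the abstract describes.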

Citations: 1
SemanticHadith: An ontology-driven knowledge graph for the hadith corpus
IF 2.5 Zone 3 (Computer Science) Q1 Computer Science Pub Date: 2023-10-01 DOI: 10.1016/j.websem.2023.100797
Amna Binte Kamran, Bushra Abro, Amna Basharat

Hadith is an essential and much-celebrated resource for the Islamic domain. It is one of the two primary sources of Islamic legislation. The hadith corpus is quite large, consisting of the collected sayings, actions and silent approvals of the Prophet Muhammad. To date, minimal effort has been made towards unified semantic modelling and knowledge representation of the hadith structure for enhanced interlinking and knowledge discovery. This paper presents the design, development and publishing of the hadith corpus as a knowledge graph. First, we design the SemanticHadith ontology to describe and relate core structural concepts from the hadith. We then publish the six prominent hadith collections as an RDF-based hadith knowledge graph, an effort towards making the available hadith both human- and machine-readable. This is the first step in the annotation and linking process of the hadith corpus, aimed at enabling semantic search capabilities to support scholars, students, and researchers in the creation, evolution, and consultation of a digital representation of Islamic knowledge. The SemanticHadith knowledge graph is freely accessible at http://www.semantichadith.com.

Citations: 0
Towards human-compatible XAI: Explaining data differentials with concept induction over background knowledge
IF 2.5 Zone 3 (Computer Science) Q1 Computer Science Pub Date: 2023-09-26 DOI: 10.1016/j.websem.2023.100807
Cara Leigh Widmer, Md Kamruzzaman Sarker, Srikanth Nadella, Joshua Fiechter, Ion Juvina, Brandon Minnery, Pascal Hitzler, Joshua Schwartz, Michael Raymer

Concept induction, which is based on formal logical reasoning over description logics, has been used in ontology engineering in order to create ontology (TBox) axioms from the base data (ABox) graph. In this paper, we show that it can also be used to explain data differentials, for example in the context of Explainable AI (XAI), and we show that it can in fact be done in a way that is meaningful to a human observer. Our approach utilizes a large class hierarchy, curated from the Wikipedia category hierarchy, as background knowledge. To make the explanations easily understandable for non-specialists, the complex description logic explanations generated by our concept induction system (ECII) were presented as a word list consisting of the concept names occurring in the highest rated system responses.
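
To make the notion of explaining a data differential concrete, here is a deliberately simplified sketch. It is not ECII: real concept induction searches over complex description logic class expressions, whereas this toy only ranks atomic named classes from a class hierarchy by how well they separate positive from negative examples. The function names `ancestors` and `explain`, the accuracy score, and the example data are all assumptions made for illustration.

```python
def ancestors(cls, parents):
    """Return cls and all its superclasses, given {child: [parent, ...]}."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, []))
    return seen

def explain(positives, negatives, types, parents):
    """Rank named classes by how well they cover positives and exclude negatives."""
    def classes_of(individual):
        out = set()
        for c in types.get(individual, []):
            out |= ancestors(c, parents)
        return out

    pos = {i: classes_of(i) for i in positives}
    neg = {i: classes_of(i) for i in negatives}
    candidates = set().union(*pos.values()) if pos else set()
    scored = []
    for c in candidates:
        covered = sum(c in cs for cs in pos.values())
        excluded = sum(c not in cs for cs in neg.values())
        accuracy = (covered + excluded) / (len(pos) + len(neg))
        scored.append((accuracy, c))
    # Highest accuracy first; ties broken alphabetically for determinism.
    scored.sort(key=lambda x: (-x[0], x[1]))
    return scored

# Hypothetical background knowledge and labelled individuals.
parents = {"Dog": ["Mammal"], "Cat": ["Mammal"], "Mammal": ["Animal"], "Car": ["Vehicle"]}
types = {"img1": ["Dog"], "img2": ["Cat"], "img3": ["Car"]}
ranking = explain(["img1", "img2"], ["img3"], types, parents)
word_list = [name for _, name in ranking[:2]]
```

The final `word_list` mirrors the presentation choice mentioned in the abstract, where the concept names from the highest-rated responses are shown to non-specialists as a plain word list.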

Citations: 0