Knowledge Graphs (KGs) comprise interlinked information in the form of entities and the relations between them in a particular domain, and they provide the backbone for many applications. However, KGs are often incomplete, as links between entities are missing. Link prediction is the task of predicting these missing links in a KG based on the existing ones. Recent years have witnessed many studies on link prediction using KG embeddings, which is one of the mainstream tasks in KG completion. Most existing methods learn latent representations of the entities and relations, whereas only a few also consider contextual information or the textual descriptions of the entities. This paper introduces an attentive encoder-decoder-based link prediction approach that considers both the structural information of the KG and the textual entity descriptions. A random-walk-based path selection method is used to encapsulate the contextual information of an entity in the KG. The model uses a bidirectional Gated Recurrent Unit (GRU) based encoder-decoder to learn representations of the paths, while SBERT is used to generate representations of the entity descriptions. The proposed approach outperforms most state-of-the-art models and achieves comparable results with the rest when evaluated on the FB15K, FB15K-237, WN18, WN18RR, and YAGO3-10 datasets.
{"title":"MADLINK: Attentive multihop and entity descriptions for link prediction in knowledge graphs","authors":"Russa Biswas, Harald Sack, Mehwish Alam","doi":"10.3233/sw-222960","DOIUrl":"https://doi.org/10.3233/sw-222960","url":null,"abstract":"Knowledge Graphs (KGs) comprise of interlinked information in the form of entities and relations between them in a particular domain and provide the backbone for many applications. However, the KGs are often incomplete as the links between the entities are missing. Link Prediction is the task of predicting these missing links in a KG based on the existing links. Recent years have witnessed many studies on link prediction using KG embeddings which is one of the mainstream tasks in KG completion. To do so, most of the existing methods learn the latent representation of the entities and relations whereas only a few of them consider contextual information as well as the textual descriptions of the entities. This paper introduces an attentive encoder-decoder based link prediction approach considering both structural information of the KG and the textual entity descriptions. Random walk based path selection method is used to encapsulate the contextual information of an entity in a KG. The model explores a bidirectional Gated Recurrent Unit (GRU) based encoder-decoder to learn the representation of the paths whereas SBERT is used to generate the representation of the entity descriptions. The proposed approach outperforms most of the state-of-the-art models and achieves comparable results with the rest when evaluated with FB15K, FB15K-237, WN18, WN18RR, and YAGO3-10 datasets.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"117 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2022-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75416724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Chaves-Fraga, Pieter Colpaert, Mersedeh Sadeghi, M. Comerio
Whether you are planning your next trip abroad or want a package delivered to your doorstep, chances are high that you will need a chain of services provided by multiple companies. Transport is inherently a geographically and administratively decentralized domain composed of a diverse set of actors, from public transport authorities to vehicle-sharing companies, infrastructure managers in different sectors (road, rail, etc.), transport operators, retailers, and distributors. As a result, it suffers from vast data heterogeneity, which, in turn, brings severe challenges to data interoperability. Similar challenges have been faced in other domains such as the Internet of Things [18], agriculture [11], building data management [17], biology [7] or open data [2], which have found solutions using Semantic Web technologies. However, despite several research contributions [6,14,19,23,25], publicly funded projects, and academic-industry events, we have not yet seen a wide adoption of semantic technologies in the transport domain. We may only guess the inhibitors for adopting Linked Data in this domain: i) the SPARQL query language is not built for optimal path planning, and ii) RDF is perceived as highly conceptual by industry experts. We argue that SPARQL does not fit well with the concerns that typically matter to route planners (e.g., calculating the optimal Pareto path [4]). While calculating a path with SPARQL is feasible through property paths, controlling the path-planning algorithm, which can hardly be done in SPARQL, is the core concern of route planners. On the other hand, the transport domain is dominated by different standards (e.g., NeTEx or DATEX II) and vocabularies, which are based on legacy data exchange technologies (e.g., XML or RDB). However, to construct a distributed and scalable architecture that addresses the current needs of this domain, the Web and its associated technologies (i.e., the Semantic Web) are the key resource.
{"title":"Editorial of transport data on the web","authors":"David Chaves-Fraga, Pieter Colpaert, Mersedeh Sadeghi, M. Comerio","doi":"10.3233/sw-223278","DOIUrl":"https://doi.org/10.3233/sw-223278","url":null,"abstract":"Whether you are planning your next trip abroad or want a package delivered to your doorstep, chances are high that you will need a chain of services provided by multiple companies. Transport is inherently a geographically and administratively decentralized domain composed of a diverse set of actors, – from public transport authorities to vehicle sharing companies, infrastructure managers in different sectors (road, rail, etc.), transport operators, retailers, and distributors. As a result, it suffers vast data heterogeneity, which, in turn, brings severe challenges to data interoperability. However, such challenges have also been posed in other domains such as the Internet of Things [18], agriculture [11], building data management [17], biology [7] or open data [2], which have found their solutions using semantic web technologies. However, despite several research contributions [6,14,19,23,25], public-funded projects1,2 or academic-industry events,3,4 we have not yet seen a wide adoption of semantic technologies in the transport domain. We may only guess the inhibitors for adopting Linked Data in this domain: i) the SPARQL query language is not built for optimal path planning, and ii) RDF is perceived as highly conceptual by industry experts. We argue that SPARQL does not fit well with the concerns that typically matter to route planners (e.g., calculating the optimal Pareto path [4]). While calculating a path with SPARQL is feasible through property paths, controlling the path planning algorithm, which can hardly be done in SPARQL, is the core concern of route planners. On the other hand, the transport domain is dominated by different standards (e.g., NeTEx,5 or DATEX II6) and vocabularies, which are based on legacy data exchange technologies (e.g., XML or RDB). However, to construct a distributed and scalable architecture that addresses the current needs of this domain, the Web and its associated technologies (i.e., the Semantic Web) are the key resource.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"7 1","pages":"613-616"},"PeriodicalIF":3.0,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84559368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Katherine Thornton, Kenneth Seals-Nutt, Marianne Van Remoortel, Julie M. Birkholz, P. D. Potter
Stories are important tools for recounting and sharing the past. To tell a story, one has to put together diverse information about people, places, time periods, and things. We detail here how a machine, through the power of the Semantic Web, can compile scattered and diverse materials and information to construct stories. Through the example of the WeChangEd research project on women editors of periodicals in Europe from 1710–1920, we detail how to move from an archive, to a structured data model and relational database, to Wikidata, and to the use of the Stories Services API to generate multimedia stories about people, organizations and periodicals. As more humanists, social scientists and other researchers choose to contribute their data to Wikidata, we will all benefit: as researchers add data, the breadth and complexity of the questions we can ask about the data we have contributed will increase. Building applications that syndicate data from Wikidata allows us to leverage a general-purpose knowledge graph with a growing number of references back to scholarly literature. Using frameworks developed by the Wikidata community allows us to rapidly provision interactive sites that will help us engage new audiences. The process we detail here may be of interest to other researchers and cultural heritage institutions seeking web-based presentation options for telling stories from their data.
{"title":"Linking women editors of periodicals to the Wikidata knowledge graph","authors":"Katherine Thornton, Kenneth Seals-Nutt, Marianne Van Remoortel, Julie M. Birkholz, P. D. Potter","doi":"10.3233/sw-222845","DOIUrl":"https://doi.org/10.3233/sw-222845","url":null,"abstract":"Stories are important tools for recounting and sharing the past. To tell a story one has to put together diverse information about people, places, time periods, and things. We detail here how a machine, through the power of Semantic Web, can compile scattered and diverse materials and information to construct stories. Through the example of the WeChangEd research project on women editors of periodicals in Europe from 1710–1920 we detail how to move from archive, to a structured data model and relational database, to Wikidata, to the use of the Stories Services API to generate multimedia stories related to people, organizations and periodicals. As more humanists, social scientists and other researchers choose to contribute their data to Wikidata we will all benefit. As researchers add data, the breadth and complexity of the questions we can ask about the data we have contributed will increase. Building applications that syndicate data from Wikidata allows us to leverage a general purpose knowledge graph with a growing number of references back to scholarly literature. Using frameworks developed by the Wikidata community allows us to rapidly provision interactive sites that will help us engage new audiences. This process that we detail here may be of interest to other researchers and cultural heritage institutions seeking web-based presentation options for telling stories from their data.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"20 1","pages":"443-455"},"PeriodicalIF":3.0,"publicationDate":"2022-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90501657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pouya Ghiasnezhad Omran, K. Taylor, Sergio J. Rodríguez Méndez, A. Haller
Knowledge Graphs (KGs) have proliferated on the Web since the introduction of knowledge panels to Google search in 2012. KGs are large data-first graph databases with weak inference rules and weakly constraining data schemas. SHACL, the Shapes Constraint Language, is a W3C recommendation for expressing constraints on graph data as shapes. SHACL shapes serve to validate a KG, to underpin manual KG editing tasks, and to offer insight into KG structure. In practice, however, large KGs often have no available shape constraints and so cannot obtain these benefits for ongoing maintenance and extension. We introduce Inverse Open Path (IOP) rules, a predicate logic formalism that expresses specific shapes in the form of paths over connected entities present in a KG. IOP rules express simple shape patterns that can be augmented with minimum cardinality constraints and also serve as building blocks for more complex shapes, such as trees and other rule patterns. We define formal quality measures for IOP rules and propose a novel method to learn high-quality rules from KGs. We show how to build high-quality tree shapes from the IOP rules. Our learning method, SHACLearner, is adapted from a state-of-the-art embedding-based open path rule learner (Oprl). We evaluate SHACLearner on several real-world massive KGs, including YAGO2s (4M facts), DBpedia 3.8 (11M facts), and Wikidata (8M facts). The experiments show that SHACLearner can effectively learn informative and intuitive shapes from massive KGs. The shapes are diverse in structural features such as depth and width, as well as in quality measures that indicate confidence and generality.
{"title":"Learning SHACL shapes from knowledge graphs","authors":"Pouya Ghiasnezhad Omran, K. Taylor, Sergio J. Rodríguez Méndez, A. Haller","doi":"10.3233/sw-223063","DOIUrl":"https://doi.org/10.3233/sw-223063","url":null,"abstract":"Knowledge Graphs (KGs) have proliferated on the Web since the introduction of knowledge panels to Google search in 2012. KGs are large data-first graph databases with weak inference rules and weakly-constraining data schemes. SHACL, the Shapes Constraint Language, is a W3C recommendation for expressing constraints on graph data as shapes. SHACL shapes serve to validate a KG, to underpin manual KG editing tasks, and to offer insight into KG structure. Often in practice, large KGs have no available shape constraints and so cannot obtain these benefits for ongoing maintenance and extension. We introduce Inverse Open Path (IOP) rules, a predicate logic formalism which presents specific shapes in the form of paths over connected entities that are present in a KG. IOP rules express simple shape patterns that can be augmented with minimum cardinality constraints and also used as a building block for more complex shapes, such as trees and other rule patterns. We define formal quality measures for IOP rules and propose a novel method to learn high-quality rules from KGs. We show how to build high-quality tree shapes from the IOP rules. Our learning method, SHACLearner, is adapted from a state-of-the-art embedding-based open path rule learner (Oprl). We evaluate SHACLearner on some real-world massive KGs, including YAGO2s (4M facts), DBpedia 3.8 (11M facts), and Wikidata (8M facts). The experiments show that our SHACLearner can effectively learn informative and intuitive shapes from massive KGs. The shapes are diverse in structural features such as depth and width, and also in quality measures that indicate confidence and generality.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"5 1","pages":"101-121"},"PeriodicalIF":3.0,"publicationDate":"2022-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83798341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. E. Labra Gayo, Anastasia Dimou, Katherine Thornton, A. Rula
{"title":"Editorial of knowledge graphs validation and quality","authors":"J. E. Labra Gayo, Anastasia Dimou, Katherine Thornton, A. Rula","doi":"10.3233/sw-223261","DOIUrl":"https://doi.org/10.3233/sw-223261","url":null,"abstract":"","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"41 1","pages":"3-4"},"PeriodicalIF":3.0,"publicationDate":"2022-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84093431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Since the inception of the Open Linguistics Working Group in 2010, there have been numerous efforts to transform language resources into Linked Data. The research field of Linguistic Linked Data (LLD) has gained in importance, visibility and impact, with the Linguistic Linked Open Data (LLOD) cloud now gathering over 200 resources. With this growth, new challenges have emerged concerning particular domain and task applications, quality dimensions, and linguistic features to take into account. This special issue aims to review and summarize the progress and status of LLD research in recent years, as well as to offer an understanding of the challenges the field faces in the years to come. The papers in this issue indicate that there are still aspects to address for a wider community adoption of LLD, as well as a lack of resources for specific tasks and (interdisciplinary) domains. Likewise, the integration of LLD resources into Natural Language Processing (NLP) architectures and the search for long-term infrastructure solutions to host LLD resources remain essential points to attend to in the foreseeable future of this research line.
{"title":"Editorial of the Special Issue on Latest Advancements in Linguistic Linked Data","authors":"Julia Bosque-Gil, P. Cimiano, Milan Dojchinovski","doi":"10.3233/sw-223251","DOIUrl":"https://doi.org/10.3233/sw-223251","url":null,"abstract":"Since the inception of the Open Linguistics Working Group in 2010, there have been numerous efforts in transforming language resources into Linked Data. The research field of Linguistic Linked Data (LLD) has gained in importance, visibility and impact, with the Linguistic Linked Open Data (LLOD) cloud gathering nowadays over 200 resources. With this increasing growth, new challenges have emerged concerning particular domain and task applications, quality dimensions, and linguistic features to take into account. This special issue aims to review and summarize the progress and status of LLD research in recent years, as well as to offer an understanding of the challenges ahead of the field for the years to come. The papers in this issue indicate that there are still aspects to address for a wider community adoption of LLD, as well as a lack of resources for specific tasks and (interdisciplinary) domains. Likewise, the integration of LLD resources into Natural Language Processing (NLP) architectures and the search for long-term infrastructure solutions to host LLD resources continue to be essential points to which to attend in the foreseeable future of the research line.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"96 1","pages":"911-916"},"PeriodicalIF":3.0,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80912659","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Basel Shbita, Craig A. Knoblock, Weiwei Duan, Yao-Yi Chiang, Johannes H. Uhl, S. Leyk
Historical maps provide rich information for researchers in many areas, including the social and natural sciences. These maps contain detailed documentation of a wide variety of natural and human-made features and their changes over time, such as changes in transportation networks or the decline of wetlands or forest areas. Analyzing changes over time in such maps can be labor-intensive for a scientist, even after the geographic features have been digitized and converted to a vector format. Knowledge Graphs (KGs) are the appropriate representations to store and link such data and support semantic and temporal querying to facilitate change analysis. KGs combine expressivity, interoperability, and standardization in the Semantic Web stack, thus providing a strong foundation for querying and analysis. In this paper, we present an automatic approach to convert vector geographic features extracted from multiple historical maps into contextualized spatio-temporal KGs. The resulting graphs can be easily queried and visualized to understand the changes in different regions over time. We evaluate our technique on railroad networks and wetland areas extracted from the United States Geological Survey (USGS) historical topographic maps for several regions over multiple map sheets and editions. We also demonstrate how the automatically constructed linked data (i.e., KGs) enable effective querying and visualization of changes over different points in time.
{"title":"Building spatio-temporal knowledge graphs from vectorized topographic historical maps","authors":"Basel Shbita, Craig A. Knoblock, Weiwei Duan, Yao-Yi Chiang, Johannes H. Uhl, S. Leyk","doi":"10.3233/sw-222918","DOIUrl":"https://doi.org/10.3233/sw-222918","url":null,"abstract":"Historical maps provide rich information for researchers in many areas, including the social and natural sciences. These maps contain detailed documentation of a wide variety of natural and human-made features and their changes over time, such as changes in transportation networks or the decline of wetlands or forest areas. Analyzing changes over time in such maps can be labor-intensive for a scientist, even after the geographic features have been digitized and converted to a vector format. Knowledge Graphs (KGs) are the appropriate representations to store and link such data and support semantic and temporal querying to facilitate change analysis. KGs combine expressivity, interoperability, and standardization in the Semantic Web stack, thus providing a strong foundation for querying and analysis. In this paper, we present an automatic approach to convert vector geographic features extracted from multiple historical maps into contextualized spatio-temporal KGs. The resulting graphs can be easily queried and visualized to understand the changes in different regions over time. We evaluate our technique on railroad networks and wetland areas extracted from the United States Geological Survey (USGS) historical topographic maps for several regions over multiple map sheets and editions. We also demonstrate how the automatically constructed linked data (i.e., KGs) enable effective querying and visualization of changes over different points in time.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"3 1","pages":"527-549"},"PeriodicalIF":3.0,"publicationDate":"2022-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90985201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ontology matching is an integral part of establishing semantic interoperability. One of the main challenges in the ontology matching task is semantic heterogeneity, i.e., modeling differences between the two ontologies that are to be integrated. The semantics within most ontologies or schemas are, however, typically incomplete because they are designed within a certain context that is not explicitly modeled. Therefore, external background knowledge plays a major role in (semi-)automated ontology and schema matching. In this survey, we introduce the reader to the general ontology matching problem. We review background knowledge sources as well as the approaches applied to make use of external knowledge. Our survey covers all ontology matching systems presented between 2004 and 2021 at a well-known ontology matching competition, together with systematically selected publications in the research field. We present classification systems for external background knowledge, concept linking strategies, and background knowledge exploitation approaches. We provide extensive examples and classify all ontology matching systems under review in a resource/strategy matrix obtained by coalescing the two classification systems. Lastly, we outline interesting and yet underexplored research directions for applying external knowledge within the ontology matching process.
{"title":"Background knowledge in ontology matching: A survey","authors":"Jan Portisch, M. Hladik, Heiko Paulheim","doi":"10.3233/sw-223085","DOIUrl":"https://doi.org/10.3233/sw-223085","url":null,"abstract":"Ontology matching is an integral part for establishing semantic interoperability. One of the main challenges within the ontology matching operation is semantic heterogeneity, i.e. modeling differences between the two ontologies that are to be integrated. The semantics within most ontologies or schemas are, however, typically incomplete because they are designed within a certain context which is not explicitly modeled. Therefore, external background knowledge plays a major role in the task of (semi-) automated ontology and schema matching. In this survey, we introduce the reader to the general ontology matching problem. We review the background knowledge sources as well as the approaches applied to make use of external knowledge. Our survey covers all ontology matching systems that have been presented within the years 2004–2021 at a well-known ontology matching competition together with systematically selected publications in the research field. We present a classification system for external background knowledge, concept linking strategies, as well as for background knowledge exploitation approaches. We provide extensive examples and classify all ontology matching systems under review in a resource/strategy matrix obtained by coalescing the two classification systems. Lastly, we outline interesting and yet underexplored research directions of applying external knowledge within the ontology matching process.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"37 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2022-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84360736","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, we have witnessed a steady growth of linguistic information represented and exposed as linked data on the Web. Such linguistic linked data have stimulated the development and use of openly available linguistic knowledge graphs, as is the case with the Apertium RDF, a collection of interconnected bilingual dictionaries represented and accessible through Semantic Web standards. In this work, we explore techniques that exploit the graph nature of bilingual dictionaries to automatically infer new links (translations). We build upon a cycle density based method: partitioning the graph into biconnected components for a speed-up, and simplifying the pipeline through a careful structural analysis that reduces hyperparameter tuning requirements. We also analyse the shortcomings of traditional evaluation metrics used for translation inference and propose to complement them with new ones, both-word precision (BWP) and both-word recall (BWR), aimed at being more informative of algorithmic improvements. Over twenty-seven language pairs, our algorithm produces dictionaries about 70% the size of existing Apertium RDF dictionaries at a high BWP of 85% from scratch within a minute. Human evaluation shows that 78% of the additional translations generated for dictionary enrichment are correct as well. We further describe an interesting use-case: inferring synonyms within a single language, on which our initial human-based evaluation shows an average accuracy of 84%. We release our tool as free/open-source software which can not only be applied to RDF data and Apertium dictionaries, but is also easily usable for other formats and communities.
{"title":"Bilingual dictionary generation and enrichment via graph exploration","authors":"Shashwat Goel, Jorge Gracia, M. Forcada","doi":"10.3233/sw-222899","DOIUrl":"https://doi.org/10.3233/sw-222899","url":null,"abstract":"In recent years, we have witnessed a steady growth of linguistic information represented and exposed as linked data on the Web. Such linguistic linked data have stimulated the development and use of openly available linguistic knowledge graphs, as is the case with the Apertium RDF, a collection of interconnected bilingual dictionaries represented and accessible through Semantic Web standards. In this work, we explore techniques that exploit the graph nature of bilingual dictionaries to automatically infer new links (translations). We build upon a cycle density based method: partitioning the graph into biconnected components for a speed-up, and simplifying the pipeline through a careful structural analysis that reduces hyperparameter tuning requirements. We also analyse the shortcomings of traditional evaluation metrics used for translation inference and propose to complement them with new ones, both-word precision (BWP) and both-word recall (BWR), aimed at being more informative of algorithmic improvements. Over twenty-seven language pairs, our algorithm produces dictionaries about 70% the size of existing Apertium RDF dictionaries at a high BWP of 85% from scratch within a minute. Human evaluation shows that 78% of the additional translations generated for dictionary enrichment are correct as well. We further describe an interesting use-case: inferring synonyms within a single language, on which our initial human-based evaluation shows an average accuracy of 84%. We release our tool as free/open-source software which can not only be applied to RDF data and Apertium dictionaries, but is also easily usable for other formats and communities.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"14 1","pages":"1103-1132"},"PeriodicalIF":3.0,"publicationDate":"2022-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76042247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
News consumption has shifted over time from traditional media to online platforms, which use recommendation algorithms to help users navigate through the large incoming streams of daily news by suggesting relevant articles based on their preferences and reading behavior. In comparison to domains such as movies or e-commerce, where recommender systems have proved highly successful, the characteristics of the news domain (e.g., high frequency of articles appearing and becoming outdated, greater dynamics of user interest, less explicit relations between articles, and lack of explicit user feedback) pose additional challenges for the recommendation models. While some of these can be overcome by conventional recommendation techniques, injecting external knowledge into news recommender systems has been proposed in order to enhance recommendations by capturing information and patterns not contained in the text and metadata of articles, and hence, tackle shortcomings of traditional models. This survey provides a comprehensive review of knowledge-aware news recommender systems. We propose a taxonomy that divides the models into three categories: neural methods, non-neural entity-centric methods, and non-neural path-based methods. Moreover, the underlying recommendation algorithms, as well as their evaluations are analyzed. Lastly, open issues in the domain of knowledge-aware news recommendations are identified and potential research directions are proposed.
{"title":"A survey on knowledge-aware news recommender systems","authors":"Andreea Iana, Mehwish Alam, Heiko Paulheim","doi":"10.3233/sw-222991","DOIUrl":"https://doi.org/10.3233/sw-222991","url":null,"abstract":"News consumption has shifted over time from traditional media to online platforms, which use recommendation algorithms to help users navigate through the large incoming streams of daily news by suggesting relevant articles based on their preferences and reading behavior. In comparison to domains such as movies or e-commerce, where recommender systems have proved highly successful, the characteristics of the news domain (e.g., high frequency of articles appearing and becoming outdated, greater dynamics of user interest, less explicit relations between articles, and lack of explicit user feedback) pose additional challenges for the recommendation models. While some of these can be overcome by conventional recommendation techniques, injecting external knowledge into news recommender systems has been proposed in order to enhance recommendations by capturing information and patterns not contained in the text and metadata of articles, and hence, tackle shortcomings of traditional models. This survey provides a comprehensive review of knowledge-aware news recommender systems. We propose a taxonomy that divides the models into three categories: neural methods, non-neural entity-centric methods, and non-neural path-based methods. Moreover, the underlying recommendation algorithms, as well as their evaluations are analyzed. Lastly, open issues in the domain of knowledge-aware news recommendations are identified and potential research directions are proposed.","PeriodicalId":48694,"journal":{"name":"Semantic Web","volume":"24 1","pages":""},"PeriodicalIF":3.0,"publicationDate":"2022-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77169110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}