A semantic similarity measure for linked data: An information content-based approach

Knowledge-Based Systems (IF 7.6; Zone 1, Computer Science; JCR Q1, Computer Science, Artificial Intelligence) Pub Date: 2016-10-01 DOI: 10.1016/j.knosys.2016.07.012

Rouzbeh Meymandpour, Joseph G. Davis

Knowledge-Based Systems, Volume 109, Pages 276-293

Citations: 74

Abstract

Linked Data allows structured data to be published in a standard manner so that datasets from diverse domains can be interlinked. By leveraging Semantic Web standards and technologies, a growing amount of semantic content has been published on the Web as Linked Open Data (LOD). The LOD cloud has made available a large volume of structured data in a range of domains via liberal licenses. The semantic content of LOD in conjunction with the advanced searching and querying mechanisms provided by SPARQL has opened up unprecedented opportunities not only for enhancing existing applications, but also for developing new and innovative semantic applications. However, SPARQL is inadequate to deal with functionalities such as comparing, prioritizing, and ranking search results which are fundamental to applications such as recommendation provision, matchmaking, social network analysis, visualization, and data clustering. This paper addresses this problem by developing a systematic measurement model of semantic similarity between resources in Linked Data. By drawing extensively on a feature-based definition of Linked Data, it proposes a generalized information content-based approach that improves on previous methods which are typically restricted to specific knowledge representation models and less relevant in the context of Linked Data. It is validated and evaluated for measuring item similarity in recommender systems. The experimental evaluation of the proposed measure shows that our approach can outperform comparable recommender systems that use conventional similarity measures.
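
As a rough illustration of the general idea of an information content-based, feature-based similarity, the sketch below treats each resource's outgoing (predicate, object) pairs as its features and weights each feature by how rare it is in the dataset. This is only a minimal sketch under stated assumptions, not the measure defined in the paper: the toy triples, the function names, the restriction to outgoing links, and the ratio used to combine shared and distinct information are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy dataset of (subject, predicate, object) triples.
# These resources are invented for illustration and do not come from the paper.
TRIPLES = [
    ("FilmA", "director", "Dir1"),
    ("FilmA", "genre", "Drama"),
    ("FilmB", "director", "Dir1"),
    ("FilmB", "genre", "Drama"),
    ("FilmC", "director", "Dir2"),
    ("FilmC", "genre", "Drama"),
    ("FilmD", "director", "Dir3"),
    ("FilmD", "genre", "Comedy"),
]


def features_by_resource(triples):
    """Map each subject to its set of (predicate, object) features.

    Assumption: only outgoing links are used; incoming links are ignored.
    """
    feats = {}
    for s, p, o in triples:
        feats.setdefault(s, set()).add((p, o))
    return feats


def feature_counts(feats):
    """Count how many resources carry each feature."""
    counts = Counter()
    for feature_set in feats.values():
        counts.update(feature_set)
    return counts


def information_content(features, counts, n_resources):
    """Sum of -log2(relative frequency) over a set of features.

    Rare features contribute more information than common ones.
    """
    return sum(-math.log2(counts[f] / n_resources) for f in features)


def ic_similarity(a, b, feats, counts):
    """Shared information divided by shared plus distinct information.

    Returns 0.0 when neither resource carries any information
    (for example, when every feature occurs on every resource).
    """
    n = len(feats)
    fa, fb = feats[a], feats[b]
    shared = information_content(fa & fb, counts, n)
    distinct = information_content(fa ^ fb, counts, n)
    total = shared + distinct
    return shared / total if total > 0 else 0.0


FEATS = features_by_resource(TRIPLES)
COUNTS = feature_counts(FEATS)

if __name__ == "__main__":
    for other in ("FilmB", "FilmC", "FilmD"):
        print("FilmA", other, round(ic_similarity("FilmA", other, FEATS, COUNTS), 3))
```

In this toy data, sharing a director (a rarer feature) raises similarity more than sharing the common genre, which is the kind of ranking signal that a plain SPARQL query does not provide on its own.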


Source journal

Knowledge-Based Systems (Engineering & Technology; Computer Science: Artificial Intelligence)

CiteScore: 14.80

Self-citation rate: 12.50%

Articles published: 1245

Review time: 7.8 months

About the journal: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.