Computing User Similarity by Combining SimRank++ and Cosine Similarities to Improve Collaborative Filtering

2017 14th Web Information Systems and Applications Conference (WISA). Pub Date: 2017-11-01. DOI: 10.1109/WISA.2017.22

Xiuli Wang, Zhuoming Xu, Xiutao Xia, Chengwang Mao


Citations: 9

Abstract

This paper addresses the sparsity problem in collaborative filtering (CF) by developing an aggregated user-user similarity measure suited to the user-based CF model. The measure is a weighted aggregation of two similarities: the SimRank++ similarity computed on the user-item bipartite graph, and the cosine similarity of Linked Open Data (LOD)-based user profiles, which are derived from both the rating data and the items' descriptive attributes retrieved from LOD resources. To validate the effectiveness of the aggregated similarity and evaluate the accuracy of rating predictions with the user-based CF method, comparative experiments among four similarity measures (the Pearson correlation coefficient, the SimRank++ similarity, the cosine similarity, and the aggregated similarity) were conducted on the MovieLens 100k dataset and DBpedia. The experimental results indicate that the proposed aggregated similarity measure overall outperforms the other three in terms of both Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE), especially when 30-100 nearest neighbors are used.
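The weighted aggregation and the two error metrics named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact aggregation formula or the weight, so the convex combination below, the parameter name `alpha`, and all function names are assumptions for illustration, not the authors' implementation. The SimRank++ score is taken as a precomputed input rather than computed here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length user profile vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as similar to nothing
    return dot / (norm_a * norm_b)


def aggregated_similarity(graph_sim, profile_sim, alpha=0.5):
    """Weighted aggregation of a graph-based and a profile-based similarity.

    graph_sim:   precomputed SimRank++ similarity for a user pair
    profile_sim: cosine similarity of the two users' LOD-based profiles
    alpha:       hypothetical weight in [0, 1]; the abstract does not
                 state the value or the exact form used in the paper
    """
    return alpha * graph_sim + (1 - alpha) * profile_sim


def rmse(predicted, actual):
    """Root Mean Squared Error between predicted and actual ratings."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))


def mae(predicted, actual):
    """Mean Absolute Error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```

Under these assumptions, `alpha = 1` reduces the measure to the graph-based similarity alone and `alpha = 0` to the profile-based cosine similarity alone, so the weight controls how much each source of evidence contributes when rating data are sparse.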


