High-dimensional Sparse Embeddings for Collaborative Filtering

Proceedings of the Web Conference 2021 | Pub Date: 2021-04-19 | DOI: 10.1145/3442381.3450054

Jan Van Balen, Bart Goethals

Citations: 4

Abstract

A widely adopted paradigm in the design of recommender systems is to represent users and items as vectors, often referred to as latent factors or embeddings. Embeddings can be obtained from a variety of recommendation models and served in production using a variety of data engineering solutions. They also facilitate transfer learning, where trained embeddings from one model are reused in another. In contrast, some of the best-performing collaborative filtering models today are high-dimensional linear models that do not rely on factorization, and so do not produce embeddings [27, 28]. These models also require pruning, which amounts to a trade-off between model size and the density of the predicted affinities. This paper argues instead for the use of high-dimensional, sparse latent factor models. We propose a new recommendation model based on a full-rank factorization of the inverse Gram matrix. The resulting high-dimensional embeddings can be made sparse while still factorizing a dense affinity matrix. We show how the embeddings combine the advantages of latent representations with the performance of high-dimensional linear models.
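
To make the idea of a full-rank factorization of the inverse Gram matrix concrete, the following is a minimal NumPy sketch, not the authors' implementation. The abstract does not say which factorization is used or how the Gram matrix is regularized. The sketch assumes a ridge term `lam` and a Cholesky factorization, and the function name `inverse_gram_embeddings` and the toy interaction matrix `X` are hypothetical.

```python
import numpy as np


def inverse_gram_embeddings(X, lam=1.0):
    """Factor P = (X^T X + lam * I)^{-1} as E @ E.T.

    X   : (n_users, n_items) interaction matrix.
    lam : ridge term (an assumption); lam > 0 keeps the Gram matrix
          positive definite so that the inverse and its factor exist.
    """
    n_items = X.shape[1]
    gram = X.T @ X + lam * np.eye(n_items)
    P = np.linalg.inv(gram)
    # Cholesky is one possible full-rank factorization (an assumption here):
    # E is lower triangular, has one dimension per item, and E @ E.T == P.
    E = np.linalg.cholesky(P)
    return P, E


# Toy user-item matrix (hypothetical data): 4 users, 4 items.
X = np.array(
    [
        [1, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 1, 1],
        [1, 1, 1, 0],
    ],
    dtype=float,
)

P, E = inverse_gram_embeddings(X, lam=0.5)
```

Each row of `E` can be read as an item embedding with as many dimensions as there are items. In this sketch the triangular structure alone leaves the entries above the diagonal at zero, while the product `E @ E.T` reproduces the full matrix `P`.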


