{"title":"High-dimensional Sparse Embeddings for Collaborative Filtering","authors":"J. V. Balen, Bart Goethals","doi":"10.1145/3442381.3450054","DOIUrl":null,"url":null,"abstract":"A widely adopted paradigm in the design of recommender systems is to represent users and items as vectors, often referred to as latent factors or embeddings. Embeddings can be obtained using a variety of recommendation models and served in production using a variety of data engineering solutions. Embeddings also facilitate transfer learning, where trained embeddings from one model are reused in another. In contrast, some of the best-performing collaborative filtering models today are high-dimensional linear models that do not rely on factorization, and so they do not produce embeddings [27, 28]. They also require pruning, amounting to a trade-off between the model size and the density of the predicted affinities. This paper argues for the use of high-dimensional, sparse latent factor models, instead. We propose a new recommendation model based on a full-rank factorization of the inverse Gram matrix. The resulting high-dimensional embeddings can be made sparse while still factorizing a dense affinity matrix. We show how the embeddings combine the advantages of latent representations with the performance of high-dimensional linear models.","PeriodicalId":106672,"journal":{"name":"Proceedings of the Web Conference 2021","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Web Conference 2021","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3442381.3450054","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4
Abstract
A widely adopted paradigm in the design of recommender systems is to represent users and items as vectors, often referred to as latent factors or embeddings. Embeddings can be obtained using a variety of recommendation models and served in production using a variety of data engineering solutions. Embeddings also facilitate transfer learning, where trained embeddings from one model are reused in another. In contrast, some of the best-performing collaborative filtering models today are high-dimensional linear models that do not rely on factorization, and so they do not produce embeddings [27, 28]. They also require pruning, amounting to a trade-off between the model size and the density of the predicted affinities. This paper argues for the use of high-dimensional, sparse latent factor models, instead. We propose a new recommendation model based on a full-rank factorization of the inverse Gram matrix. The resulting high-dimensional embeddings can be made sparse while still factorizing a dense affinity matrix. We show how the embeddings combine the advantages of latent representations with the performance of high-dimensional linear models.