Recommender Systems Algorithm Selection for Ranking Prediction on Implicit Feedback Datasets

arXiv - CS - Information Retrieval, Pub Date: 2024-09-09, DOI: arxiv-2409.05461

Lukas Wegmeth, Tobias Vente, Joeran Beel



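Abstract

The recommender systems algorithm selection problem for ranking prediction on implicit feedback datasets is under-explored. Traditional approaches in recommender systems algorithm selection focus predominantly on rating prediction on explicit feedback datasets, leaving a research gap for ranking prediction on implicit feedback datasets. Algorithm selection is a critical challenge for nearly every practitioner in recommender systems. In this work, we take the first steps toward addressing this research gap. We evaluate the NDCG@10 of 24 recommender systems algorithms, each with two hyperparameter configurations, on 72 recommender systems datasets. We train four optimized machine-learning meta-models and one automated machine-learning meta-model with three different settings on the resulting meta-dataset. Our results show that the predictions of all tested meta-models exhibit a median Spearman correlation ranging from 0.857 to 0.918 with the ground truth. We show that the median Spearman correlation between meta-model predictions and the ground truth increases by an average of 0.124 when the meta-model is optimized to predict the ranking of algorithms instead of their performance. Furthermore, in terms of predicting the best algorithm for an unknown dataset, we demonstrate that the best optimized traditional meta-model, e.g., XGBoost, achieves a recall of 48.6%, outperforming the best tested automated machine learning meta-model, e.g., AutoGluon, which achieves a recall of 47.2%.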
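The abstract relies on three quantities: NDCG@10 per algorithm, the Spearman correlation between predicted and true algorithm orderings, and whether the predicted best algorithm matches the true best one. The following is a minimal sketch of how these could be computed, not the authors' evaluation code. It assumes binary relevance for NDCG (an item counts as relevant if the user interacted with it), and the algorithm names and scores are hypothetical values made up for illustration.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k=10):
    """NDCG@k with binary relevance (assumption: relevant = interacted with)."""
    relevant = set(relevant_items)
    # Discounted gain of every relevant item found in the top-k list.
    dcg = sum(
        1.0 / math.log2(pos + 2)
        for pos, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Hypothetical NDCG@10 values for one held-out dataset (not from the paper).
true_ndcg = {"algo_a": 0.21, "algo_b": 0.35, "algo_c": 0.10, "algo_d": 0.28}
pred_ndcg = {"algo_a": 0.25, "algo_b": 0.30, "algo_c": 0.12, "algo_d": 0.33}

algos = sorted(true_ndcg)
rho = spearman([true_ndcg[a] for a in algos], [pred_ndcg[a] for a in algos])

# Top-1 check: does the meta-model pick the truly best algorithm?
best_true = max(algos, key=true_ndcg.get)
best_pred = max(algos, key=pred_ndcg.get)
top1_hit = best_true == best_pred
```

Under this reading, the median Spearman correlation would be taken over the per-dataset `rho` values, and the reported recall would correspond to the share of datasets where `top1_hit` is true. Both interpretations are assumptions drawn from the abstract alone.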


