On the Opportunities and Challenges of Offline Reinforcement Learning for Recommender Systems

IF 5.4 2区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS ACM Transactions on Information Systems Pub Date : 2024-04-29 DOI:10.1145/3661996

Xiaocong Chen, Siyu Wang, Julian McAuley, Dietmar Jannach, Lina Yao

{"title":"On the Opportunities and Challenges of Offline Reinforcement Learning for Recommender Systems","authors":"Xiaocong Chen, Siyu Wang, Julian McAuley, Dietmar Jannach, Lina Yao","doi":"10.1145/3661996","DOIUrl":null,"url":null,"abstract":"<p>Reinforcement learning serves as a potent tool for modeling dynamic user interests within recommender systems, garnering increasing research attention of late. However, a significant drawback persists: its poor data efficiency, stemming from its interactive nature. The training of reinforcement learning-based recommender systems demands expensive online interactions to amass adequate trajectories, essential for agents to learn user preferences. This inefficiency renders reinforcement learning-based recommender systems a formidable undertaking, necessitating the exploration of potential solutions. Recent strides in offline reinforcement learning present a new perspective. Offline reinforcement learning empowers agents to glean insights from offline datasets and deploy learned policies in online settings. Given that recommender systems possess extensive offline datasets, the framework of offline reinforcement learning aligns seamlessly. Despite being a burgeoning field, works centered on recommender systems utilizing offline reinforcement learning remain limited. This survey aims to introduce and delve into offline reinforcement learning within recommender systems, offering an inclusive review of existing literature in this domain. Furthermore, we strive to underscore prevalent challenges, opportunities, and future pathways, poised to propel research in this evolving field.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"9 1","pages":""},"PeriodicalIF":5.4000,"publicationDate":"2024-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Information Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3661996","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Reinforcement learning serves as a potent tool for modeling dynamic user interests within recommender systems, garnering increasing research attention of late. However, a significant drawback persists: its poor data efficiency, stemming from its interactive nature. The training of reinforcement learning-based recommender systems demands expensive online interactions to amass adequate trajectories, essential for agents to learn user preferences. This inefficiency renders reinforcement learning-based recommender systems a formidable undertaking, necessitating the exploration of potential solutions. Recent strides in offline reinforcement learning present a new perspective. Offline reinforcement learning empowers agents to glean insights from offline datasets and deploy learned policies in online settings. Given that recommender systems possess extensive offline datasets, the framework of offline reinforcement learning aligns seamlessly. Despite being a burgeoning field, works centered on recommender systems utilizing offline reinforcement learning remain limited. This survey aims to introduce and delve into offline reinforcement learning within recommender systems, offering an inclusive review of existing literature in this domain. Furthermore, we strive to underscore prevalent challenges, opportunities, and future pathways, poised to propel research in this evolving field.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

离线强化学习在推荐系统中的机遇与挑战

强化学习是在推荐系统中建立动态用户兴趣模型的有效工具，近来受到越来越多的研究关注。然而，强化学习仍然存在一个显著的缺点：由于其交互性，数据效率较低。基于强化学习的推荐系统的训练需要昂贵的在线交互来积累足够的轨迹，这对代理学习用户偏好至关重要。这种低效率使基于强化学习的推荐系统成为一项艰巨的任务，因此有必要探索潜在的解决方案。离线强化学习的最新进展提供了一个新的视角。离线强化学习使代理能够从离线数据集中获得洞察力，并在在线环境中部署学习到的策略。鉴于推荐系统拥有广泛的离线数据集，离线强化学习的框架可谓天衣无缝。尽管离线强化学习是一个新兴领域，但以利用离线强化学习的推荐系统为中心的研究成果仍然有限。本调查旨在介绍和深入研究推荐系统中的离线强化学习，对该领域的现有文献进行全面回顾。此外，我们还将努力强调当前的挑战、机遇和未来的发展方向，以推动这一不断发展的领域的研究。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

ACM Transactions on Information Systems 工程技术-计算机：信息系统

CiteScore

9.40

自引率

14.30%

发文量

165

审稿时长

>12 weeks

期刊介绍： The ACM Transactions on Information Systems (TOIS) publishes papers on information retrieval (such as search engines, recommender systems) that contain: new principled information retrieval models or algorithms with sound empirical validation; observational, experimental and/or theoretical studies yielding new insights into information retrieval or information seeking; accounts of applications of existing information retrieval techniques that shed light on the strengths and weaknesses of the techniques; formalization of new information retrieval or information seeking tasks and of methods for evaluating the performance on those tasks; development of content (text, image, speech, video, etc) analysis methods to support information retrieval and information seeking; development of computational models of user information preferences and interaction behaviors; creation and analysis of evaluation methodologies for information retrieval and information seeking; or surveys of existing work that propose a significant synthesis. The information retrieval scope of ACM Transactions on Information Systems (TOIS) appeals to industry practitioners for its wealth of creative ideas, and to academic researchers for its descriptions of their colleagues'' work.

期刊最新文献

ROGER: Ranking-oriented Generative Retrieval Adversarial Item Promotion on Visually-Aware Recommender Systems by Guided Diffusion Bridging Dense and Sparse Maximum Inner Product Search MvStHgL: Multi-view Hypergraph Learning with Spatial-temporal Periodic Interests for Next POI Recommendation City Matters! A Dual-Target Cross-City Sequential POI Recommendation Model