G. Adomavicius, Konstantin Bauman, B. Mobasher, Francesco Ricci, Alexander Tuzhilin, Moshe Unger
Contextual information has been widely recognized as an important modeling dimension in the social sciences and in computing. In particular, context plays an acknowledged role in enhancing recommendation results and retrieval performance. While a substantial amount of existing research has focused on context-aware recommender systems (CARS), many interesting problems remain under-explored. The CARS 2022 workshop provides a venue for presenting and discussing the important features of the next generation of CARS, as well as application domains that may require novel types of contextual information and the ability to cope with their dynamic properties in group recommendations and in online environments.
{"title":"CARS: Workshop on Context-Aware Recommender Systems 2022","authors":"G. Adomavicius, Konstantin Bauman, B. Mobasher, Francesco Ricci, Alexander Tuzhilin, Moshe Unger","doi":"10.1145/3523227.3547421","DOIUrl":"https://doi.org/10.1145/3523227.3547421","url":null,"abstract":"Contextual information has been widely recognized as an important modeling dimension in social sciences and in computing. In particular, the role of context has been recognized in enhancing recommendation results and retrieval performance. While a substantial amount of existing research has focused on context-aware recommender systems (CARS), many interesting problems remain under-explored. The CARS 2022 workshop provides a venue for presenting and discussing: the important features of the next generation of CARS; and application domains that may require the use of novel types of contextual information and cope with their dynamic properties in group recommendations and in online environments.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127007846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The main source of knowledge utilized in recommender systems (RS) is users’ feedback. While the usage of implicit feedback (i.e., statistics of user behavior) is gaining in prominence, explicit feedback (i.e., user ratings) remains an important data source. This is especially true for domains where evaluating an object does not require extensive usage and users are well motivated to provide ratings (e.g., video-on-demand services or library archives). Numerous rating schemes for explicit feedback have been proposed, varying in both granularity and presentation style. Several works study the effect of the rating scale and its presentation on users’ rating behavior, e.g., their willingness to provide feedback or various biases in rating behavior. Nonetheless, the effect of rating granularity on RS performance remains largely under-researched. In this paper, we study the combined effect of rating granularity and the assumed probability that feedback is provided on various performance statistics of recommender systems. Results indicate that decreasing feedback granularity may change RS performance w.r.t. nDCG for some recommending algorithms. Nonetheless, in most cases the effect of feedback granularity is outweighed by even a small decrease in feedback quantity. Therefore, our results corroborate the policy of many major real-world applications, i.e., preferring simpler rating schemes with a higher chance of receiving feedback over finer-grained rating scenarios.
{"title":"The Effect of Feedback Granularity on Recommender Systems Performance","authors":"Ladislav Peška, Stepán Balcar","doi":"10.1145/3523227.3551479","DOIUrl":"https://doi.org/10.1145/3523227.3551479","url":null,"abstract":"The main source of knowledge utilized in recommender systems (RS) is users’ feedback. While the usage of implicit feedback (i.e. user’s behavior statistics) is gaining in prominence, the explicit feedback (i.e. user’s ratings) remain an important data source. This is true especially for domains, where evaluation of an object does not require an extensive usage and users are well motivated to do so (e.g., video-on-demand services or library archives). So far, numerous rating schemes for explicit feedback have been proposed, ranging both in granularity and presentation style. There are several works studying the effect of rating’s scale and presentation on user’s rating behavior, e.g. willingness to provide feedback or various biases in rating behavior. Nonetheless, the effect of ratings granularity on RS performance remain largely under-researched. In this paper, we studied the combined effect of ratings granularity and supposed probability of feedback existence on various performance statistics of recommender systems. Results indicate that decreasing feedback granularity may lead to changes in RS’s performance w.r.t. nDCG for some recommending algorithms. Nonetheless, in most cases the effect of feedback granularity is surpassed by even a small decrease in feedback’s quantity. Therefore, our results corroborate the policy of many major real-world applications, i.e. preference of simpler rating schemes with the higher chance of feedback reception instead of finer-grained rating scenarios.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"139 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126898195","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Minmin Chen, Can Xu, Vince Gatto, Devanshu Jain, Aviral Kumar, Ed H. Chi
Industrial recommendation platforms are increasingly concerned with how to make recommendations that cause users to enjoy their long-term experience on the platform. Reinforcement learning (RL) emerged naturally as an appealing approach for its promise in 1) combating the feedback-loop effect resulting from myopic system behaviors; and 2) sequential planning to optimize long-term outcomes. Scaling RL algorithms to production recommender systems serving billions of users and contents, however, remains challenging. Sample inefficiency and instability of online RL hinder its widespread adoption in production. Offline RL enables the use of off-policy data and batch learning; on the other hand, it faces significant challenges in learning due to distribution shift. A REINFORCE agent [3] was successfully tested for YouTube recommendation, significantly outperforming a sophisticated supervised-learning production system. Off-policy correction was employed to learn from logged data; the algorithm partially mitigates the distribution shift by employing a one-step importance weighting. We resort to off-policy actor-critic algorithms to address the distribution shift to a greater extent. Here we share the key designs in setting up an off-policy actor-critic agent for production recommender systems. It extends [3] with a critic network that estimates the value of any state-action pair under the target learned policy through temporal-difference learning. We demonstrate in offline and live experiments that the new framework outperforms the baseline and improves long-term user experience. An interesting discovery along our investigation is that recommendation agents that employ a softmax policy parameterization can end up being too pessimistic about out-of-distribution (OOD) actions. Finding the right balance between pessimism and optimism on OOD actions is critical to the success of offline RL for recommender systems.
{"title":"Off-Policy Actor-critic for Recommender Systems","authors":"Minmin Chen, Can Xu, Vince Gatto, Devanshu Jain, Aviral Kumar, Ed H. Chi","doi":"10.1145/3523227.3546758","DOIUrl":"https://doi.org/10.1145/3523227.3546758","url":null,"abstract":"Industrial recommendation platforms are increasingly concerned with how to make recommendations that cause users to enjoy their long term experience on the platform. Reinforcement learning emerged naturally as an appealing approach for its promise in 1) combating feedback loop effect resulted from myopic system behaviors; and 2) sequential planning to optimize long term outcome. Scaling RL algorithms to production recommender systems serving billions of users and contents, however remain challenging. Sample inefficiency and instability of online RL hinder its widespread adoption in production. Offline RL enables usage of off-policy data and batch learning. It on the other hand faces significant challenges in learning due to the distribution shift. A REINFORCE agent [3] was successfully tested for YouTube recommendation, significantly outperforming a sophisticated supervised learning production system. Off-policy correction was employed to learn from logged data. The algorithm partially mitigates the distribution shift by employing a one-step importance weighting. We resort to the off-policy actor critic algorithms to addresses the distribution shift to a better extent. Here we share the key designs in setting up an off-policy actor-critic agent for production recommender systems. It extends [3] with a critic network that estimates the value of any state-action pairs under the target learned policy through temporal difference learning. We demonstrate in offline and live experiments that the new framework out-performs baseline and improves long term user experience. An interesting discovery along our investigation is that recommendation agents that employ a softmax policy parameterization, can end up being too pessimistic about out-of-distribution (OOD) actions. Finding the right balance between pessimism and optimism on OOD actions is critical to the success of offline RL for recommender systems.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125379971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
J. Konstan, A. Muralidharan, Ankan Saha, Shilad Sen, Mengting Wan, Longqi Yang
As organizations increasingly digitize their business processes, the role of recommender systems in work environments is expanding. The goal of the RecWork workshop is to close the gap in recommender systems research for work environments in areas such as calendaring, productivity, community building, space planning, workforce development, and information routing. RecWork will bring together experts who will collaboratively synthesize a forward-looking research agenda for recommender systems in the workplace. The outcome will be captured in a white paper that will serve as the foundation for future RecWork workshops. These steps will help advance research on workplace recommenders and broaden the reach of the RecSys conference.
{"title":"RecWork: Workshop on Recommender Systems for the Future of Work","authors":"J. Konstan, A. Muralidharan, Ankan Saha, Shilad Sen, Mengting Wan, Longqi Yang","doi":"10.1145/3523227.3547415","DOIUrl":"https://doi.org/10.1145/3523227.3547415","url":null,"abstract":"As organizations increasingly digitize their business processes, the role of recommender systems in work environments is expanding. The goal of the RecWork workshop is closing the gap in recommender systems research for work environments in areas such as calendaring, productivity, community building, space planning, workforce development, and information routing. RecWork will bring together experts who will collaboratively synthesize a forward-looking research agenda for recommender systems in the workplace. The outcome will be captured through a white paper that will serve as the foundation for future RecWork workshops. These steps will help advance research in workplace recommenders and broaden the reach of the RecSys conference.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121495837","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Giacomo Balloccu, Ludovico Boratto, G. Fenu, M. Marras
The goal of this tutorial is to present to the RecSys community recent advances in explainable recommender systems with knowledge graphs. We will first introduce conceptual foundations, surveying the state of the art and describing real-world examples of how knowledge graphs are being integrated into the recommendation pipeline, also for the purpose of providing explanations. The tutorial will continue with a systematic presentation of algorithmic solutions to model, integrate, train, and assess a recommender system with knowledge graphs, with particular attention to the explainability perspective. A practical part will then provide attendees with concrete implementations of recommender systems with knowledge graphs, leveraging open-source tools and public datasets; in this part, tutorial participants will be engaged in designing the explanations that accompany the recommendations and in articulating their impact. We conclude the tutorial by analyzing emerging open issues and future directions. Website: https://explainablerecsys.github.io/recsys2022/.
{"title":"Hands on Explainable Recommender Systems with Knowledge Graphs","authors":"Giacomo Balloccu, Ludovico Boratto, G. Fenu, M. Marras","doi":"10.1145/3523227.3547374","DOIUrl":"https://doi.org/10.1145/3523227.3547374","url":null,"abstract":"The goal of this tutorial is to present the RecSys community with recent advances on explainable recommender systems with knowledge graphs. We will first introduce conceptual foundations, by surveying the state of the art and describing real-world examples of how knowledge graphs are being integrated into the recommendation pipeline, also for the purpose of providing explanations. This tutorial will continue with a systematic presentation of algorithmic solutions to model, integrate, train, and assess a recommender system with knowledge graphs, with particular attention to the explainability perspective. A practical part will then provide attendees with concrete implementations of recommender systems with knowledge graphs, leveraging open-source tools and public datasets; in this part, tutorial participants will be engaged in the design of explanations accompanying the recommendations and in articulating their impact. We conclude the tutorial by analyzing emerging open issues and future directions. Website: https://explainablerecsys.github.io/recsys2022/.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114684640","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In recent years, many methods for machine learning on tabular data have been introduced that use factorization machines, neural networks, or both. This has created a great variety of methods, making it non-obvious which one should be used in practice. We begin by extending the previously established theoretical connection between polynomial neural networks and factorization machines (FM) to recently introduced FM techniques. This allows us to propose a single neural-network-based framework that can switch between the deep learning and FM paradigms by a simple change of an activation function. We further show that an activation function exists which can adaptively learn to select the optimal paradigm. Another key element in our framework is its ability to learn high-dimensional embeddings by low-rank factorization. Our framework can handle numeric and categorical data as well as multiclass outputs. Extensive empirical experiments verify our analytical claims. Source code is available at https://github.com/ChenAlmagor/FiFa
{"title":"You Say Factorization Machine, I Say Neural Network - It’s All in the Activation","authors":"Chen Almagor, Yedid Hoshen","doi":"10.1145/3523227.3551499","DOIUrl":"https://doi.org/10.1145/3523227.3551499","url":null,"abstract":"In recent years, many methods for machine learning on tabular data were introduced that use either factorization machines, neural networks or both. This created a great variety of methods making it non-obvious which method should be used in practice. We begin by extending the previously established theoretical connection between polynomial neural networks and factorization machines (FM) to recently introduced FM techniques. This allows us to propose a single neural-network-based framework that can switch between the deep learning and FM paradigms by a simple change of an activation function. We further show that an activation function exists which can adaptively learn to select the optimal paradigm. Another key element in our framework is its ability to learn high-dimensional embeddings by low-rank factorization. Our framework can handle numeric and categorical data as well as multiclass outputs. Extensive empirical experiments verify our analytical claims. Source code is available at https://github.com/ChenAlmagor/FiFa","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115387742","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karl Higley, Even Oldridge, Ronay Ak, Sara Rabhi, G. Moreira
Newcomers to recommender systems often face challenges related to their lack of understanding of how these systems operate in real life. Most online content on this topic focuses on models and algorithms that score items based on the user’s preferences. However, the recommender model alone does not provide everything needed to serve optimized recommender systems that meet the company’s business objectives. An industry-standard recommender system involves a number of steps, including data preprocessing, defining and training recommender models, as well as filtering and business logic for serving. In this work, we propose the four-stage recommender system, an industry-wide design pattern we have identified for production recommender systems. The four-stage pipeline includes an item retrieval step that prepares a small subset of relevant items for scoring. The filtering stage then cleans up this subset based on business logic, such as removing out-of-stock or previously seen items. The ranking component uses a recommender model to score each item in the candidate list based on the preferences of the user. In the final step, the scores are re-ordered to produce a final recommendation list aligned with other business needs or constraints, such as diversity. The presented demo demonstrates how easy it is to build and deploy a four-stage recommender system pipeline using the NVIDIA Merlin open-source framework.
{"title":"Building and Deploying a Multi-Stage Recommender System with Merlin","authors":"Karl Higley, Even Oldridge, Ronay Ak, Sara Rabhi, G. Moreira","doi":"10.1145/3523227.3551468","DOIUrl":"https://doi.org/10.1145/3523227.3551468","url":null,"abstract":"Newcomers to recommender systems often face challenges related to their lack of understanding of how these systems operate in real life. In most online content related to this topic, the focus is on models and algorithms that score items based on the user’s preferences. However, the recommender model alone does not comprise everything needed for serving optimized recommender systems that meet the company’s business objectives. An industry-standard recommender system involves a number of steps, including data preprocessing, defining and training recommender models, as well as filtering and business logic for serving. In this work, we propose the four-stage recommender system, an industry-wide design pattern we have identified for production recommender systems. The four-stage pipeline includes an item retrieval step that prepares a small subset of relevant items for scoring. The filtering stage then cleans up the subset of items based on business logic such as removing out-of-stock or previously seen items. As for the ranking component, it uses a recommender model to score each item in the presented list based on the preferences of the user. In the final step, the scores are re-ordered to provide a final recommendation list aligned with other business needs or constraints such as diversity. In particular, the presented demo demonstrates how easy it is to build and deploy a four-stage recommender system pipeline using the NVIDIA Merlin open-source framework.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115279616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ziyang Tang, Yiheng Duan, Steven H. Zhu, Stephanie S. Zhang, Lihong Li
A/B testing is a powerful tool for a company to make informed decisions about its services and products. A limitation of A/B tests is that they do not easily extend to measuring post-experiment (long-term) differences. In this talk, we study a different approach inspired by recent advances in off-policy evaluation in reinforcement learning (RL). The basic RL approach assumes customer behavior follows a stationary Markovian process and estimates the average engagement metric once the process reaches its steady state. However, in realistic scenarios, the stationarity assumption is often violated due to weekly variations and seasonality effects. To tackle this challenge, we propose a variation that relaxes the stationarity assumption. We empirically tested both the stationary and nonstationary approaches on a synthetic dataset and an online store dataset.
{"title":"Estimating Long-term Effects from Experimental Data","authors":"Ziyang Tang, Yiheng Duan, Steven H. Zhu, Stephanie S. Zhang, Lihong Li","doi":"10.1145/3523227.3547398","DOIUrl":"https://doi.org/10.1145/3523227.3547398","url":null,"abstract":"A/B testing is a powerful tool for a company to make informed decisions about their services and products. A limitation of A/B tests is that they do not easily extend to measure post-experiment (long-term) differences. In this talk, we study a different approach inspired by recent advances in off-policy evaluation in reinforcement learning (RL). The basic RL approach assumes customer behavior follows a stationary Markovian process, and estimates the average engagement metric when the process reaches the steady state. However, in realistic scenarios, the stationary assumption is often violated due to weekly variations and seasonality effects. To tackle this challenge, we propose a variation by relaxing the stationary assumption. We empirically tested both stationary and nonstationary approaches in a synthetic dataset and an online store dataset.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122655347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weixin Chen, Mingkai He, Yongxin Ni, Weike Pan, L. Chen, Zhong Ming
Heterogeneous sequential recommendation (HSR) is an important recommendation problem that aims to predict the next item a user will interact with under a target behavior type (e.g., purchase in e-commerce sites), based on his/her historical interactions with different behaviors. Though existing sequential methods have achieved strong performance by considering the varied impacts of interactions with sequential information, a large body of them still have two major shortcomings. Firstly, they usually model different behaviors separately without considering the correlations between them, even though the transitions from item to item under diverse behaviors indicate users’ potential behavior patterns. Secondly, though the behavior information contains a user’s fine-grained interests, insufficient consideration of the local context limits their ability to understand user intentions; utilizing the adjacent interactions to better understand a user’s behavior could improve the certainty of prediction. To address these two issues, we propose a novel solution utilizing global and personalized graphs for HSR (GPG4HSR) to learn behavior transitions and user intentions. Specifically, our GPG4HSR consists of two graphs, i.e., a global graph to capture the transitions between different behaviors, and a personalized graph to model items with behaviors by further considering the distinct user intentions of the adjacent contextually relevant nodes. Extensive experiments on four public datasets with state-of-the-art baselines demonstrate the effectiveness and general applicability of our method GPG4HSR.
{"title":"Global and Personalized Graphs for Heterogeneous Sequential Recommendation by Learning Behavior Transitions and User Intentions","authors":"Weixin Chen, Mingkai He, Yongxin Ni, Weike Pan, L. Chen, Zhong Ming","doi":"10.1145/3523227.3546761","DOIUrl":"https://doi.org/10.1145/3523227.3546761","url":null,"abstract":"Heterogeneous sequential recommendation (HSR) is a very important recommendation problem, which aims to predict a user’s next interacted item under a target behavior type (e.g., purchase in e-commerce sites) based on his/her historical interactions with different behaviors. Though existing sequential methods have achieved advanced performance by considering the varied impacts of interactions with sequential information, a large body of them still have two major shortcomings. Firstly, they usually model different behaviors separately without considering the correlations between them. The transitions from item to item under diverse behaviors indicate some users’ potential behavior manner. Secondly, though the behavior information contains a user’s fine-grained interests, the insufficient consideration of the local context information limits them from well understanding user intentions. Utilizing the adjacent interactions to better understand a user’s behavior could improve the certainty of prediction. To address these two issues, we propose a novel solution utilizing global and personalized graphs for HSR (GPG4HSR) to learn behavior transitions and user intentions. Specifically, our GPG4HSR consists of two graphs, i.e., a global graph to capture the transitions between different behaviors, and a personalized graph to model items with behaviors by further considering the distinct user intentions of the adjacent contextually relevant nodes. Extensive experiments on four public datasets with the state-of-the-art baselines demonstrate the effectiveness and general applicability of our method GPG4HSR.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121157522","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reinforcement learning (RL) is gaining traction as a complementary approach to supervised learning for RecSys due to its ability to solve sequential decision-making processes with delayed rewards. Recent advances in offline reinforcement learning, off-policy evaluation, and more scalable, performant system design with the ability to run code in parallel have made RL more tractable for real-time RecSys use cases. This tutorial introduces RLlib [9], a comprehensive open-source Python RL framework built for production workloads. RLlib is built on top of open-source Ray [8], an easy-to-use, distributed computing framework for Python that can handle complex, heterogeneous applications. Ray and RLlib run on compute clusters on any cloud without vendor lock-in. Using Colab notebooks, you will leave this tutorial with a complete, working example of parallelized Python RL code using RLlib for RecSys in a GitHub repo.
{"title":"Hands-on Reinforcement Learning for Recommender Systems - From Bandits to SlateQ to Offline RL with Ray RLlib","authors":"Christy D. Bergman, Kourosh Hakhamaneshi","doi":"10.1145/3523227.3547370","DOIUrl":"https://doi.org/10.1145/3523227.3547370","url":null,"abstract":"Reinforcement learning (RL) is gaining traction as a complementary approach to supervised learning for RecSys due to its ability to solve sequential decision-making processes for delayed rewards. Recent advances in offline reinforcement learning, off-policy evaluation, and more scalable, performant system design with the ability to run code in parallel, have made RL more tractable for the RecSys real time use cases. This tutorial introduces RLlib [9], a comprehensive open-source Python RL framework built for production workloads. RLlib is built on top of open-source Ray [8], an easy-to-use, distributed computing framework for Python that can handle complex, heterogeneous applications. Ray and RLlib run on compute clusters on any cloud without vendor lock. Using Colab notebooks, you will leave this tutorial with a complete, working example of parallelized Python RL code using RLlib for RecSys on a github repo.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125151247","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}