{"title":"Hands-on Reinforcement Learning for Recommender Systems - From Bandits to SlateQ to Offline RL with Ray RLlib","authors":"Christy D. Bergman, Kourosh Hakhamaneshi","doi":"10.1145/3523227.3547370","DOIUrl":null,"url":null,"abstract":"Reinforcement learning (RL) is gaining traction as a complementary approach to supervised learning for RecSys due to its ability to solve sequential decision-making processes for delayed rewards. Recent advances in offline reinforcement learning, off-policy evaluation, and more scalable, performant system design with the ability to run code in parallel, have made RL more tractable for the RecSys real time use cases. This tutorial introduces RLlib [9], a comprehensive open-source Python RL framework built for production workloads. RLlib is built on top of open-source Ray [8], an easy-to-use, distributed computing framework for Python that can handle complex, heterogeneous applications. Ray and RLlib run on compute clusters on any cloud without vendor lock. Using Colab notebooks, you will leave this tutorial with a complete, working example of parallelized Python RL code using RLlib for RecSys on a github repo.","PeriodicalId":443279,"journal":{"name":"Proceedings of the 16th ACM Conference on Recommender Systems","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 16th ACM Conference on Recommender Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3523227.3547370","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Reinforcement learning (RL) is gaining traction as a complementary approach to supervised learning for RecSys due to its ability to solve sequential decision-making processes for delayed rewards. Recent advances in offline reinforcement learning, off-policy evaluation, and more scalable, performant system design with the ability to run code in parallel, have made RL more tractable for the RecSys real time use cases. This tutorial introduces RLlib [9], a comprehensive open-source Python RL framework built for production workloads. RLlib is built on top of open-source Ray [8], an easy-to-use, distributed computing framework for Python that can handle complex, heterogeneous applications. Ray and RLlib run on compute clusters on any cloud without vendor lock. Using Colab notebooks, you will leave this tutorial with a complete, working example of parallelized Python RL code using RLlib for RecSys on a github repo.