RLPS: A Reinforcement Learning–Based Framework for Personalized Search

Jing Yao, Zhicheng Dou, Jun Xu, Jirong Wen
{"title":"RLPS: A Reinforcement Learning–Based Framework for Personalized Search","authors":"Jing Yao, Zhicheng Dou, Jun Xu, Jirong Wen","doi":"10.1145/3446617","DOIUrl":null,"url":null,"abstract":"Personalized search is a promising way to improve search qualities by taking user interests into consideration. Recently, machine learning and deep learning techniques have been successfully applied to search result personalization. Most existing models simply regard the personal search history as a static set of user behaviors and learn fixed ranking strategies based on all the recorded data. Though improvements have been achieved, the essence that the search process is a sequence of interactions between the search engine and user is ignored. The user’s interests may dynamically change during the search process, therefore, it would be more helpful if a personalized search model could track the whole interaction process and adjust its ranking strategy continuously. In this article, we adapt reinforcement learning to personalized search and propose a framework, referred to as RLPS. It utilizes a Markov Decision Process (MDP) to track sequential interactions between the user and search engine, and continuously update the underlying personalized ranking model with the user’s real-time feedback to learn the user’s dynamic interests. Within this framework, we implement two models: the listwise RLPS-L and the hierarchical RLPS-H. RLPS-L interacts with users and trains the ranking model with document lists, while RLPS-H improves model training by designing a layered structure and introducing document pairs. In addition, we also design a feedback-aware personalized ranking component to capture the user’s feedback, which impacts the user interest profile for the next query. Significant improvements over existing personalized search models are observed in the experiments on the public AOL search log and a commercial log.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"24 1","pages":"1 - 29"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Information Systems (TOIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3446617","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Personalized search is a promising way to improve search quality by taking user interests into consideration. Recently, machine learning and deep learning techniques have been successfully applied to search result personalization. Most existing models simply regard the personal search history as a static set of user behaviors and learn fixed ranking strategies from all the recorded data. Though these models achieve improvements, they ignore the fact that the search process is, in essence, a sequence of interactions between the search engine and the user. The user’s interests may change dynamically during the search process; a personalized search model would therefore be more helpful if it could track the whole interaction process and continuously adjust its ranking strategy. In this article, we adapt reinforcement learning to personalized search and propose a framework, referred to as RLPS. It uses a Markov Decision Process (MDP) to track sequential interactions between the user and the search engine, and continuously updates the underlying personalized ranking model with the user’s real-time feedback to learn the user’s dynamic interests. Within this framework, we implement two models: the listwise RLPS-L and the hierarchical RLPS-H. RLPS-L interacts with users and trains the ranking model with document lists, while RLPS-H improves model training with a layered structure and document pairs. In addition, we design a feedback-aware personalized ranking component to capture the user’s feedback, which shapes the user interest profile for the next query. Experiments on the public AOL search log and a commercial search log show significant improvements over existing personalized search models.
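
To make the MDP formulation concrete, the sketch below walks through one search session as an interaction loop: the state is the user's interest profile built from past queries and clicks, the action is a personalized ranking of the candidate documents, and the reward is derived from the user's click feedback. Everything here (the `State` class, the heuristic `rank` function, the reciprocal-rank reward, the toy session) is an illustrative assumption, not the paper's implementation; RLPS learns the ranking policy with reinforcement learning rather than the hand-written rule used in this sketch.

```python
from dataclasses import dataclass, field

@dataclass
class State:
    """MDP state: the user interest profile, here just a (query, clicks) history."""
    history: list = field(default_factory=list)

def rank(state: State, candidates: list) -> list:
    # Action: rank candidates, boosting documents the user clicked before.
    # A hand-written stand-in for the learned personalized ranking model.
    clicked = {doc for _, clicks in state.history for doc in clicks}
    return sorted(candidates, key=lambda d: (d not in clicked, candidates.index(d)))

def reward(ranking: list, clicks: set) -> float:
    # Reward from real-time feedback: reciprocal rank of the first clicked document.
    return next((1.0 / pos for pos, d in enumerate(ranking, 1) if d in clicks), 0.0)

# Toy session: each step is (query, candidate documents, documents the user clicks).
session = [
    ("jaguar", ["car_review", "animal_facts", "os_notes"], {"animal_facts"}),
    ("jaguar habitat", ["car_specs", "animal_facts", "zoo_guide"], {"animal_facts"}),
]

state = State()
for query, candidates, clicks in session:
    ranking = rank(state, candidates)       # take an action in the current state
    r = reward(ranking, clicks)             # observe the reward for this interaction
    state.history.append((query, clicks))   # state transition: the profile is updated
    print(f"{query}: {ranking} reward={r:.2f}")
    # In RLPS, the ranking model would be updated here using r
    # (with document lists in RLPS-L, document pairs in a layered structure in RLPS-H).
```

On the second query the previously clicked document is already ranked first, which is exactly the behavior the reward signal is meant to reinforce once a learned policy replaces the heuristic.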