Intrinsically motivated reinforcement learning based recommendation with counterfactual data augmentation

Xiaocong Chen, Siyu Wang, Lianyong Qi, Yong Li, Lina Yao
{"title":"Intrinsically motivated reinforcement learning based recommendation with counterfactual data augmentation","authors":"Xiaocong Chen, Siyu Wang, Lianyong Qi, Yong Li, Lina Yao","doi":"10.1007/s11280-023-01187-7","DOIUrl":null,"url":null,"abstract":"<p>Deep reinforcement learning (DRL) has shown promising results in modeling dynamic user preferences in RS in recent literature. However, training a DRL agent in the sparse RS environment poses a significant challenge. This is because the agent must balance between exploring informative user-item interaction trajectories and using existing trajectories for policy learning, a known exploration and exploitation trade-off. This trade-off greatly affects the recommendation performance when the environment is sparse. In DRL-based RS, balancing exploration and exploitation is even more challenging as the agent needs to deeply explore informative trajectories and efficiently exploit them in the context of RS. To address this issue, we propose a novel intrinsically motivated reinforcement learning (IMRL) method that enhances the agent’s capability to explore informative interaction trajectories in the sparse environment. We further enrich these trajectories via an adaptive counterfactual augmentation strategy with a customised threshold to improve their efficiency in exploitation. Our approach is evaluated on six offline datasets and three online simulation platforms, demonstrating its superiority over existing state-of-the-art methods. The extensive experiments show that our IMRL method outperforms other methods in terms of recommendation performance in the sparse RS environment.</p>","PeriodicalId":501180,"journal":{"name":"World Wide Web","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"World Wide Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s11280-023-01187-7","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Deep reinforcement learning (DRL) has shown promising results in modeling dynamic user preferences in recommender systems (RS) in recent literature. However, training a DRL agent in a sparse RS environment poses a significant challenge, because the agent must balance exploring informative user-item interaction trajectories against exploiting existing trajectories for policy learning, the well-known exploration-exploitation trade-off. This trade-off strongly affects recommendation performance when the environment is sparse. In DRL-based RS, balancing exploration and exploitation is even more challenging, as the agent needs to explore informative trajectories deeply and exploit them efficiently in the recommendation context. To address this issue, we propose a novel intrinsically motivated reinforcement learning (IMRL) method that enhances the agent's capability to explore informative interaction trajectories in sparse environments. We further enrich these trajectories via an adaptive counterfactual augmentation strategy with a customised threshold to improve their efficiency in exploitation. Our approach is evaluated on six offline datasets and three online simulation platforms, demonstrating its superiority over existing state-of-the-art methods. Extensive experiments show that IMRL outperforms other methods in recommendation performance in sparse RS environments.
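Since only the abstract is reproduced here, the following is a minimal sketch of the two ideas it describes: an intrinsic reward bonus that encourages exploration of novel user-item interactions, and a threshold-gated counterfactual augmentation step that enriches the replay data. All names and details (IntrinsicRewardModule, shaped_reward, augment_if_plausible, the linear forward model, and the constants beta and tau) are illustrative assumptions, not the authors' implementation; in particular, the paper's threshold is adaptive and customised, which is simplified to a constant here.

```python
import numpy as np

# Illustrative sketch only: module names, the linear forward model, and the
# constants beta/tau are assumptions for exposition, not the paper's code.

class IntrinsicRewardModule:
    """Curiosity-style bonus: prediction error of a simple forward model s' ~ W s."""

    def __init__(self, state_dim, lr=1e-2):
        self.W = np.zeros((state_dim, state_dim))
        self.lr = lr

    def bonus(self, state, next_state):
        pred = self.W @ state
        error = next_state - pred
        # One online SGD step on the forward model; transitions it predicts
        # poorly (i.e. novel ones) yield a larger exploration bonus.
        self.W += self.lr * np.outer(error, state)
        return float(np.linalg.norm(error))

def shaped_reward(extrinsic, intrinsic, beta=0.1):
    """Reward the DRL agent optimises: extrinsic user feedback + beta * intrinsic bonus."""
    return extrinsic + beta * intrinsic

def augment_if_plausible(transition, counterfactual_fn, reward_model, tau=0.5):
    """Return a counterfactual copy of a transition only when its estimated reward
    stays within tau of the observed one (one possible reading of the paper's
    threshold-gated augmentation; the actual criterion is adaptive)."""
    state, action, reward, next_state = transition
    cf_state = counterfactual_fn(state)           # perturbed (counterfactual) user state
    cf_reward = reward_model(cf_state, action)    # estimated feedback under that state
    if abs(cf_reward - reward) <= tau:
        return [(cf_state, action, cf_reward, next_state)]
    return []

# Minimal usage with random placeholders standing in for a real RS environment.
rng = np.random.default_rng(0)
irm = IntrinsicRewardModule(state_dim=8)
s, s_next = rng.normal(size=8), rng.normal(size=8)
r_total = shaped_reward(extrinsic=1.0, intrinsic=irm.bonus(s, s_next))
extra = augment_if_plausible(
    (s, 3, 1.0, s_next),
    counterfactual_fn=lambda st: st + rng.normal(scale=0.1, size=st.shape),
    reward_model=lambda st, a: 1.0 + 0.1 * float(st.mean()),
)
```

In a full DRL recommender, r_total would stand in for the raw click or rating reward in the policy update, and any counterfactual transitions that pass the threshold would simply be appended to the replay buffer alongside the observed ones.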

Latest articles from this journal (World Wide Web):
- HetFS: a method for fast similarity search with ad-hoc meta-paths on heterogeneous information networks
- A SHAP-based controversy analysis through communities on Twitter
- pFind: Privacy-preserving lost object finding in vehicular crowdsensing
- Use of prompt-based learning for code-mixed and code-switched text classification
- Drug traceability system based on semantic blockchain and on a reputation method