Wenchao Zhao, Xiaoyi Liu, Ruilin Xu, Lingxi Xiao, Muqing Li
In e-commerce websites, web-mining-based web page recommendation technology is widely used. However, existing recommendation solutions often fail to meet the actual needs of online shopping users. To address this problem, this paper proposes an e-commerce web page recommendation solution that combines semantic web mining with a BP (back-propagation) neural network. First, the web logs of user searches are processed and five features are extracted: content priority, time-consumption priority, online shopping users' explicit/implicit feedback on the website, recommendation semantics, and input deviation amount. Then, these features are fed into the BP neural network, which classifies and identifies the priority of the candidate web pages. Finally, the web pages are sorted by priority and recommended to users. The experiments use book-sales web pages as samples. The results show that this solution can quickly and accurately identify the web pages users need.
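The pipeline the abstract describes (five log-derived features scored by a BP network, then ranked) can be sketched as follows. Everything here is synthetic and illustrative: the feature values, network size, and labels are invented stand-ins, not the paper's actual web-log processing.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5-dimensional feature vector per candidate page, named after the
# five features in the abstract: [content priority, time-consumption priority,
# explicit/implicit feedback, recommendation semantics, input deviation amount].
def make_features(n_pages):
    return rng.random((n_pages, 5))

class TinyBPNet:
    """One-hidden-layer back-propagation (BP) network scoring page priority."""

    def __init__(self, n_in=5, n_hidden=8):
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, 1))

    def forward(self, X):
        self.h = np.tanh(X @ self.W1)
        return 1.0 / (1.0 + np.exp(-(self.h @ self.W2)))  # priority in (0, 1)

    def train(self, X, y, lr=0.5, epochs=500):
        for _ in range(epochs):
            p = self.forward(X)
            err = p - y  # gradient of cross-entropy w.r.t. the output logit
            gW2 = self.h.T @ err
            gW1 = X.T @ ((err @ self.W2.T) * (1.0 - self.h ** 2))
            self.W2 -= lr * gW2 / len(X)
            self.W1 -= lr * gW1 / len(X)

X = make_features(200)
# Synthetic label: pages with a high overall feature score count as "required".
y = (X.sum(axis=1) > 2.5).astype(float).reshape(-1, 1)

net = TinyBPNet()
net.train(X, y)
scores = net.forward(X).ravel()
ranking = np.argsort(-scores)  # pages sorted by predicted priority, best first
```

In a real deployment the synthetic labels would come from logged user behavior, and `ranking` would drive the recommended page order.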
"E-commerce Webpage Recommendation Scheme Base on Semantic Mining and Neural Networks" (arXiv:2409.07033, arXiv - CS - Information Retrieval, 2024-09-11)
E-commerce platforms have a vast catalog of items to cater to their customers' shopping interests. Most of these platforms assist their customers in the shopping process by offering optimized recommendation carousels, designed to help customers quickly locate their desired items. Many models have been proposed in the academic literature to generate and enhance the ranking and recall set of items in these carousels. Conventionally, the accompanying carousel title text (header) remains static. In most instances a generic text such as "Items similar to your current viewing" is used. Fixed variations are also observed, such as headers that mention specific attributes ("Other items from a similar seller", "Items from a similar brand") or relations ("frequently bought together", "considered together"). This work proposes a novel approach to customizing the header generation process for these carousels. Our work leverages user-generated reviews that focus on specific attributes (aspects) of an item that users perceived favorably while interacting with it. We extract these aspects from reviews and train a graph neural network-based model under the framework of a conditional ranking task. We refer to our methodology as Dynamic Text Snippets (DTS), which generates multiple header texts for an anchor item and its recall set. Our approach demonstrates the potential of utilizing user-generated reviews and presents a unique paradigm for exploring increasingly context-aware recommendation systems.
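The front half of the DTS idea, mining favorably-mentioned aspects from reviews and turning them into a carousel header, can be illustrated with a toy keyword extractor. The paper's actual aspect extraction and ranking are model-based; the aspect vocabulary, function names, and header wording below are invented for illustration.

```python
import re
from collections import Counter

# Hypothetical aspect vocabulary; a real system would learn aspects from data.
ASPECTS = {"battery", "camera", "screen", "price", "durability"}

def extract_aspects(reviews):
    """Count mentions of known aspects across a list of review strings."""
    counts = Counter()
    for text in reviews:
        for tok in re.findall(r"[a-z]+", text.lower()):
            if tok in ASPECTS:
                counts[tok] += 1
    return counts

def dynamic_header(anchor_item, reviews, top_k=1):
    """Build a header from the most-praised aspect, else fall back to a static one."""
    counts = extract_aspects(reviews)
    if not counts:
        return f"Items similar to {anchor_item}"  # conventional static header
    top = ", ".join(a for a, _ in counts.most_common(top_k))
    return f"Items with {top} praised, like {anchor_item}"

reviews = ["Great battery life!", "Battery lasts days", "Nice camera"]
header = dynamic_header("Phone X", reviews)
```

The dynamic header replaces the one-size-fits-all "Items similar to your current viewing" text with review-grounded wording per anchor item.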
"Leveraging User-Generated Reviews for Recommender Systems with Dynamic Headers", Shanu Vashishtha, Abhay Kumar, Lalitesh Morishetti, Kaushiki Nag, Kannan Achan (arXiv:2409.07627, arXiv - CS - Information Retrieval, 2024-09-11)
Daniele Malitesta, Alberto Carlo Maria Mancino, Pasquale Minervini, Tommaso Di Noia
Item recommendation (the task of predicting whether a user may interact with new items from the catalogue in a recommendation system) and link prediction (the task of identifying missing links in a knowledge graph) have long been regarded as distinct problems. In this work, we show that the item recommendation problem can be seen as an instance of the link prediction problem, where entities in the graph represent users and items, and the task consists of predicting missing instances of the relation type <<interactsWith>>. As a preliminary demonstration of this claim, we test three popular factorisation-based link prediction models on the item recommendation task, showing that their predictive accuracy is competitive with ten state-of-the-art recommendation models. The purpose is to show that the former can be seamlessly and effectively applied to the recommendation task without any modification to their architectures. Finally, while beginning to unveil the key reasons behind the recommendation performance of the selected link prediction models, we explore different settings for their hyper-parameter values, paving the way for future research directions.
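The bridge the abstract draws is concrete in a factorisation model such as DistMult: with a single relation "interactsWith", scoring a (user, relation, item) triple reduces to a reweighted dot product between user and item embeddings, which is exactly how an embedding recommender ranks items. A minimal sketch with random (untrained) embeddings:

```python
import numpy as np

rng = np.random.default_rng(1)

# DistMult scores a triple (h, r, t) as <e_h, w_r, e_t>, a three-way product.
# With one relation type, this is a dot product reweighted per dimension.
n_users, n_items, dim = 4, 6, 8
E_user = rng.normal(size=(n_users, dim))   # user entity embeddings
E_item = rng.normal(size=(n_items, dim))   # item entity embeddings
w_rel = rng.normal(size=dim)               # diagonal "interactsWith" relation

def score(u, i):
    """Link-prediction score of the triple (user u, interactsWith, item i)."""
    return float(np.sum(E_user[u] * w_rel * E_item[i]))

def recommend(u, k=3):
    """Top-k item recommendation = ranking all tail entities by triple score."""
    scores = (E_user[u] * w_rel) @ E_item.T
    return list(np.argsort(-scores)[:k])
```

Training (e.g. with a cross-entropy objective over corrupted triples) is omitted; the point is that the scoring and ranking machinery needs no architectural change to serve recommendation.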
"Dot Product is All You Need: Bridging the Gap Between Item Recommendation and Link Prediction", Daniele Malitesta, Alberto Carlo Maria Mancino, Pasquale Minervini, Tommaso Di Noia (arXiv:2409.07433, arXiv - CS - Information Retrieval, 2024-09-11)
Large language models (LLMs) are increasingly used in natural language processing tasks. Recommender systems traditionally use methods such as collaborative filtering and matrix factorization, as well as advanced techniques like deep learning and reinforcement learning. Although language models have been applied in recommendation before, the recent trend has focused on leveraging the generative capabilities of LLMs for more personalized suggestions. While current research focuses on English due to its resource richness, this work explores the impact of non-English prompts on recommendation performance. Using OpenP5, a platform for developing and evaluating LLM-based recommenders, we expanded its English prompt templates to include Spanish and Turkish. Evaluation on three real-world datasets, namely ML1M, LastFM, and Amazon-Beauty, showed that using non-English prompts generally reduces performance, especially in less-resourced languages like Turkish. We also retrained an LLM-based recommender model with multilingual prompts to analyze performance variations. Retraining with multilingual prompts resulted in more balanced performance across languages, but slightly reduced English performance. This work highlights the need for diverse language support in LLM-based recommenders and suggests future research on creating evaluation datasets and using newer models and additional languages.
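The "expanded prompt templates" step amounts to keeping one template per language and filling in the same user/history slots before querying the model. The template strings below are invented for illustration and are not OpenP5's actual templates:

```python
# Hypothetical next-item prompt templates in three languages; real OpenP5
# templates differ in wording and task framing.
TEMPLATES = {
    "en": "User {user} has interacted with {history}. What item comes next?",
    "es": "El usuario {user} ha interactuado con {history}. ¿Qué artículo sigue?",
    "tr": "Kullanıcı {user} şu öğelerle etkileşime girdi: {history}. Sıradaki öğe nedir?",
}

def build_prompt(lang, user, history):
    """Fill the language-specific template with the same structured slots."""
    return TEMPLATES[lang].format(user=user, history=", ".join(history))

prompt_tr = build_prompt("tr", "u42", ["item_1", "item_2"])
```

Because only the surface language changes while the slots stay fixed, any performance gap across languages can be attributed to the LLM's handling of the language rather than to different task inputs.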
"Multilingual Prompts in LLM-Based Recommenders: Performance Across Languages", Makbule Gulcin Ozsoy (arXiv:2409.07604, arXiv - CS - Information Retrieval, 2024-09-11)
Qijiong Liu, Jieming Zhu, Lu Fan, Zhou Zhao, Xiao-Ming Wu
Traditional recommendation models often rely on unique item identifiers (IDs) to distinguish between items, which can hinder their ability to effectively leverage item content information and generalize to long-tail or cold-start items. Recently, semantic tokenization has been proposed as a promising solution that aims to tokenize each item's semantic representation into a sequence of discrete tokens. In this way, it preserves the item's semantics within these tokens and ensures that semantically similar items are represented by similar tokens. These semantic tokens have become fundamental in training generative recommendation models. However, existing generative recommendation methods typically involve multiple sub-models for embedding, quantization, and recommendation, leading to an overly complex system. In this paper, we propose to streamline the semantic tokenization and generative recommendation process with a unified framework, dubbed STORE, which leverages a single large language model (LLM) for both tasks. Specifically, we formulate semantic tokenization as a text-to-token task and generative recommendation as a token-to-token task, supplemented by a token-to-text reconstruction task and a text-to-token auxiliary task. All these tasks are framed in a generative manner and trained using a single LLM backbone. Extensive experiments have been conducted to validate the effectiveness of our STORE framework across various recommendation tasks and datasets. We will release the source code and configurations for reproducible research.
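Semantic tokenization as described, turning an item's continuous semantic embedding into a short sequence of discrete tokens so that similar items share similar tokens, is often realized by residual quantization. A toy version with fixed random codebooks (real tokenizers such as RQ-VAE learn them; sizes and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

# Two codebooks of 4 centroids each: every item embedding becomes a 2-token ID.
dim, k = 8, 4
codebooks = [rng.normal(size=(k, dim)) for _ in range(2)]

def tokenize(x):
    """Residual quantization: pick the nearest centroid per stage, then
    quantize what remains. Nearby embeddings map to the same tokens."""
    tokens, residual = [], x
    for cb in codebooks:
        idx = int(np.argmin(((residual - cb) ** 2).sum(axis=1)))
        tokens.append(idx)
        residual = residual - cb[idx]
    return tuple(tokens)

item_a = rng.normal(size=dim)
item_b = item_a + 1e-3 * rng.normal(size=dim)  # a near-duplicate item
```

These discrete token sequences are what a generative recommender then predicts, instead of opaque item IDs.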
"STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM" (arXiv:2409.07276, arXiv - CS - Information Retrieval, 2024-09-11)
Yingling Lu, Yijun Yang, Zhaohu Xing, Qiong Wang, Lei Zhu
Diffusion Probabilistic Models have recently attracted significant attention in the computer vision community due to their outstanding performance. However, while a substantial amount of diffusion-based research has focused on generative tasks, no prior work has introduced diffusion models to advance polyp segmentation in videos, a task frequently challenged by polyps' high camouflage and redundant temporal cues. In this paper, we present a novel diffusion-based network for the video polyp segmentation task, dubbed Diff-VPS. We incorporate multi-task supervision into the diffusion model to promote its discrimination in pixel-by-pixel segmentation; this integrates the contextual high-level information obtained from the joint classification and detection tasks. To exploit temporal dependency, a Temporal Reasoning Module (TRM) is devised that reasons about and reconstructs the target frame from previous frames. We further equip TRM with a generative adversarial self-supervised strategy to produce more realistic frames and thus capture better dynamic cues. Extensive experiments are conducted on SUN-SEG, and the results indicate that our proposed Diff-VPS achieves state-of-the-art performance. Code is available at https://github.com/lydia-yllu/Diff-VPS.
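A diffusion segmentation model learns to reverse a fixed forward process that gradually corrupts the ground-truth mask into Gaussian noise. The forward (noising) half is simple enough to sketch; the schedule length, beta range, and mask below are illustrative, not Diff-VPS's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Standard DDPM forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1-abar_t) * eps.
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(mask, t):
    """Sample x_t ~ q(x_t | x_0) for a {0,1} mask rescaled to [-1, 1]."""
    x0 = mask * 2.0 - 1.0
    noise = rng.normal(size=mask.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise

mask = np.zeros((8, 8))
mask[2:5, 2:5] = 1.0          # toy "polyp" region
x_small_t = q_sample(mask, 0)      # nearly the clean mask
x_large_t = q_sample(mask, T - 1)  # mostly noise
```

At inference, the trained network runs this process in reverse, denoising from pure noise to a mask, with the video frames (and, in Diff-VPS, the multi-task and temporal signals) as conditioning.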
"Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning", Yingling Lu, Yijun Yang, Zhaohu Xing, Qiong Wang, Lei Zhu (arXiv:2409.07238, arXiv - CS - Information Retrieval, 2024-09-11)
Haokai Ma, Ruobing Xie, Lei Meng, Fuli Feng, Xiaoyu Du, Xingwu Sun, Zhanhui Kang, Xiangxu Meng
Recommender systems aim to capture users' personalized preferences from the vast amount of user behaviors, making them pivotal in the era of information explosion. However, dynamic preferences, "information cocoons", and the inherent feedback loops in recommendation mean that users interact with only a limited number of items. Conventional recommendation algorithms typically focus on positive historical behaviors while neglecting the essential role of negative feedback in understanding user interests. As a promising but easily overlooked area, negative sampling excels at revealing the genuine negative signals inherent in user behaviors and has become an indispensable procedure in recommendation. In this survey, we first discuss the role of negative sampling in recommendation and thoroughly analyze the challenges that have consistently impeded its progress. Then, we conduct an extensive literature review of existing negative sampling strategies in recommendation and classify them into five categories according to their distinct techniques. Finally, we detail the insights of tailored negative sampling strategies in diverse recommendation scenarios and outline prospective research directions from which the community may engage and benefit.
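To make the surveyed idea concrete: a negative sampler picks, for each user, items they have not interacted with to serve as negatives during training. Two of the simplest strategies, uniform sampling and popularity-biased sampling (popular unseen items tend to be "harder" negatives), can be sketched as follows; the data and weights are toy values:

```python
import random

random.seed(0)

# Toy interaction log and catalog; popularity counts are invented.
interactions = {"u1": {"i1", "i2"}, "u2": {"i3"}}
catalog = ["i1", "i2", "i3", "i4", "i5"]
popularity = {"i1": 50, "i2": 30, "i3": 10, "i4": 5, "i5": 5}

def uniform_negative(user):
    """Classic uniform sampling: any item the user has not interacted with."""
    candidates = [i for i in catalog if i not in interactions[user]]
    return random.choice(candidates)

def popularity_negative(user):
    """Popularity-biased sampling: weight unseen items by their popularity."""
    candidates = [i for i in catalog if i not in interactions[user]]
    weights = [popularity[i] for i in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]
```

The survey's five categories refine this basic template, e.g. with model-aware hardness or debiasing, but all share the same contract: return plausible non-interacted items.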
"Negative Sampling in Recommendation: A Survey and Future Directions", Haokai Ma, Ruobing Xie, Lei Meng, Fuli Feng, Xiaoyu Du, Xingwu Sun, Zhanhui Kang, Xiangxu Meng (arXiv:2409.07237, arXiv - CS - Information Retrieval, 2024-09-11)
Luo Ji, Gao Liu, Mingyang Yin, Hongxia Yang, Jingren Zhou
Modern listwise recommendation systems need to consider both long-term user perceptions and short-term interest shifts. Reinforcement learning can be applied to recommendation to study such a problem, but it also faces a large search space, sparse user feedback, and long interaction latency. Motivated by recent progress in hierarchical reinforcement learning, we propose a novel framework called mccHRL that provides different levels of temporal abstraction for listwise recommendation. Within the hierarchical framework, the high-level agent studies the evolution of user perception, while the low-level agent produces the item selection policy by modeling the process as a sequential decision-making problem. We argue that such a framework offers a well-defined decomposition of the inter-session context and the intra-session context, which are encoded by the high-level and low-level agents, respectively. To verify this argument, we implement both a simulator-based environment and an industrial dataset-based experiment. The results show significant performance improvements from our method compared with several well-known baselines. Data and code have been made public.
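The two-level control flow of such a hierarchical setup, a high-level agent choosing a coarse session-level decision and a low-level agent selecting items conditioned on it, can be sketched with stub policies. The intent names, state fields, and selection rules below are invented for illustration and are not mccHRL's actual design:

```python
import random

random.seed(0)

INTENTS = ["explore", "exploit"]   # hypothetical high-level action space
CATALOG = list(range(10))

def high_level_policy(user_state):
    """High level: a session-scale decision from long-term user perception."""
    return "explore" if user_state["fatigue"] > 0.5 else "exploit"

def low_level_policy(intent, user_state, k=3):
    """Low level: sequentially pick the item slate, conditioned on the intent."""
    if intent == "explore":
        return random.sample(CATALOG, k)          # diversify
    # exploit: rank by (stub) predicted short-term interest
    return sorted(CATALOG, key=lambda i: -user_state["affinity"][i])[:k]

state = {"fatigue": 0.2, "affinity": {i: (i * 7) % 10 for i in CATALOG}}
slate = low_level_policy(high_level_policy(state), state)
```

In the actual framework both levels are learned policies trained with reinforcement learning, and the high-level action is an abstract state rather than a discrete label; the sketch only shows the decomposition of decision timescales.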
"Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation", Luo Ji, Gao Liu, Mingyang Yin, Hongxia Yang, Jingren Zhou (arXiv:2409.07416, arXiv - CS - Information Retrieval, 2024-09-11)
Modern music streaming services rely heavily on recommendation engines to serve content to users. Sequential recommendation -- continuously providing new items within a single session in a contextually coherent manner -- is an emerging topic in the current literature. User feedback -- a positive or negative response to a presented item -- is used to drive content recommendations by learning user preferences. We extend this idea to session-based recommendation, providing context-coherent music recommendations by modelling negative user feedback, i.e., skips, in the loss function. We propose a sequence-aware contrastive sub-task to structure item embeddings in session-based music recommendation, such that true next-positive items (ignoring skipped items) lie closer in the session embedding space, while skipped tracks lie farther from all items in the session. This directly affects item rankings under a K-nearest-neighbors search for next-item recommendations, while also promoting the rank of the true next item. Experiments incorporating this task into SoTA methods for sequential item recommendation show consistent performance gains in next-item hit rate, item ranking, and skip down-ranking on three music recommendation datasets, benefiting strongly from the increasing presence of user feedback.
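The geometric effect described, pulling the true next-positive track toward the session embedding while pushing skipped tracks away, is the shape of an InfoNCE-style contrastive objective with skips as negatives. A minimal sketch of that idea (not the paper's exact loss; embedding sizes and temperature are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

def skip_aware_contrastive_loss(session_emb, pos_emb, skip_embs, temp=0.1):
    """InfoNCE-style loss: the true next (non-skipped) item is the positive,
    skipped tracks are the negatives. Lower loss = positive closer, skips farther."""
    def sim(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([sim(session_emb, pos_emb)] +
                      [sim(session_emb, s) for s in skip_embs]) / temp
    logits -= logits.max()  # numerical stability before exponentiating
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

sess = rng.normal(size=8)
pos_close = sess + 0.1 * rng.normal(size=8)  # true next item, near the session
pos_far = -sess                               # pathological positive, far away
skips = [rng.normal(size=8) for _ in range(3)]
```

Minimizing this loss during training structures the embedding space exactly as the abstract describes, so a plain K-nearest-neighbors lookup at serving time already down-ranks skip-like tracks.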
"Enhancing Sequential Music Recommendation with Negative Feedback-informed Contrastive Learning", Pavan Seshadri, Shahrzad Shashaani, Peter Knees (arXiv:2409.07367, arXiv - CS - Information Retrieval, 2024-09-11)
Multi-modal models have gained significant attention due to their powerful capabilities. These models effectively align embeddings across diverse data modalities, showcasing superior performance in downstream tasks compared to their unimodal counterparts. A recent study showed that an attacker can manipulate an image or audio file by altering it in such a way that its embedding matches that of an attacker-chosen targeted input, thereby deceiving downstream models. However, this method often underperforms due to inherent disparities in data from different modalities. In this paper, we introduce CrossFire, an innovative approach to attacking multi-modal models. CrossFire begins by transforming the targeted input chosen by the attacker into a format that matches the modality of the original image or audio file. We then formulate our attack as an optimization problem, aiming to minimize the angular deviation between the embeddings of the transformed input and the modified image or audio file. Solving this problem determines the perturbations to be added to the original media. Our extensive experiments on six real-world benchmark datasets reveal that CrossFire can significantly manipulate downstream tasks, surpassing existing attacks. Additionally, we evaluate six defensive strategies against CrossFire, finding that current defenses are insufficient to counter it.
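The optimization step described above can be sketched in miniature. Everything below is an assumption for illustration: `embed` is a toy fixed random linear encoder standing in for a real multi-modal model, and the smooth objective 1 − cosine similarity is used as a proxy for angular deviation (the two share the same minimizer), not the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))  # toy "encoder": 16-dim media -> 8-dim embedding

def embed(x):
    # stand-in for a real multi-modal encoder; outputs a unit vector
    v = W @ x
    return v / np.linalg.norm(v)

def cos_loss(x, target_emb):
    # 1 - cosine similarity, a smooth proxy for angular deviation
    return 1.0 - float(np.dot(embed(x), target_emb))

def attack_step(x, target_emb, lr=0.05, eps=1e-5):
    # one numerical-gradient descent step on the alignment loss w.r.t. x
    grad = np.zeros_like(x)
    base = cos_loss(x, target_emb)
    for i in range(len(x)):
        xp = x.copy()
        xp[i] += eps
        grad[i] = (cos_loss(xp, target_emb) - base) / eps
    return x - lr * grad
```

Iterating `attack_step` drives the media's embedding toward the target embedding; the accumulated change to `x` corresponds to the perturbation added to the original media. A real attack would differentiate through the actual encoder rather than use a numerical gradient.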
{"title":"Adversarial Attacks to Multi-Modal Models","authors":"Zhihao Dou, Xin Hu, Haibo Yang, Zhuqing Liu, Minghong Fang","doi":"arxiv-2409.06793","DOIUrl":"https://doi.org/arxiv-2409.06793","url":null,"abstract":"Multi-modal models have gained significant attention due to their powerful capabilities. These models effectively align embeddings across diverse data modalities, showcasing superior performance in downstream tasks compared to their unimodal counterparts. Recent study showed that the attacker can manipulate an image or audio file by altering it in such a way that its embedding matches that of an attacker-chosen targeted input, thereby deceiving downstream models. However, this method often underperforms due to inherent disparities in data from different modalities. In this paper, we introduce CrossFire, an innovative approach to attack multi-modal models. CrossFire begins by transforming the targeted input chosen by the attacker into a format that matches the modality of the original image or audio file. We then formulate our attack as an optimization problem, aiming to minimize the angular deviation between the embeddings of the transformed input and the modified image or audio file. Solving this problem determines the perturbations to be added to the original media. Our extensive experiments on six real-world benchmark datasets reveal that CrossFire can significantly manipulate downstream tasks, surpassing existing attacks. Additionally, we evaluate six defensive strategies against CrossFire, finding that current defenses are insufficient to counteract our CrossFire.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"44 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142205323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}