STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM

Qijiong Liu, Jieming Zhu, Lu Fan, Zhou Zhao, Xiao-Ming Wu
arXiv - CS - Information Retrieval | arXiv:2409.07276 | Published 2024-09-11 | https://doi.org/arxiv-2409.07276

Abstract

Traditional recommendation models often rely on unique item identifiers (IDs) to distinguish between items, which can hinder their ability to effectively leverage item content information and generalize to long-tail or cold-start items. Recently, semantic tokenization has been proposed as a promising solution that aims to tokenize each item's semantic representation into a sequence of discrete tokens. In this way, it preserves the item's semantics within these tokens and ensures that semantically similar items are represented by similar tokens. These semantic tokens have become fundamental in training generative recommendation models. However, existing generative recommendation methods typically involve multiple sub-models for embedding, quantization, and recommendation, leading to an overly complex system. In this paper, we propose to streamline the semantic tokenization and generative recommendation process with a unified framework, dubbed STORE, which leverages a single large language model (LLM) for both tasks. Specifically, we formulate semantic tokenization as a text-to-token task and generative recommendation as a token-to-token task, supplemented by a token-to-text reconstruction task and a text-to-token auxiliary task. All these tasks are framed in a generative manner and trained using a single LLM backbone. Extensive experiments have been conducted to validate the effectiveness of our STORE framework across various recommendation tasks and datasets. We will release the source code and configurations for reproducible research.
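The abstract does not spell out the training format, and the source code has not yet been released, so the following is only a minimal sketch of how four such tasks could all be cast as (source, target) text pairs for a single LLM backbone. The prompt prefixes (`tokenize:`, `recommend:`, `describe:`, `align:`), the `<tok_i>` token vocabulary, and every function name below are illustrative assumptions, not identifiers from the paper.

```python
# Minimal sketch (not the authors' released code): framing STORE-style
# tasks as (source, target) text pairs for one generative LLM backbone.
# All names, prefixes, and the "<tok_i>" vocabulary are assumptions.
from typing import List, Tuple

def semantic_tokens(ids: List[int]) -> str:
    """Render discrete semantic-token ids as special vocabulary strings."""
    return " ".join(f"<tok_{i}>" for i in ids)

def make_training_pairs(
    item_text: str,                    # item content, e.g. title + description
    item_tokens: List[int],            # semantic tokens assigned to this item
    history_tokens: List[List[int]],   # semantic tokens of a user's past items
    next_item_tokens: List[int],       # semantic tokens of the true next item
) -> List[Tuple[str, str]]:
    """Cast all four tasks as (source, target) pairs for one LLM."""
    history = " | ".join(semantic_tokens(t) for t in history_tokens)
    return [
        # 1. Semantic tokenization: text -> token
        (f"tokenize: {item_text}", semantic_tokens(item_tokens)),
        # 2. Generative recommendation: token -> token
        (f"recommend: {history}", semantic_tokens(next_item_tokens)),
        # 3. Reconstruction: token -> text (keeps tokens grounded in content)
        (f"describe: {semantic_tokens(item_tokens)}", item_text),
        # 4. Auxiliary text -> token alignment task
        (f"align: {item_text}", semantic_tokens(item_tokens)),
    ]

if __name__ == "__main__":
    pairs = make_training_pairs(
        item_text="Wireless noise-cancelling headphones",
        item_tokens=[12, 7, 43],
        history_tokens=[[3, 9, 21], [5, 14, 2]],
        next_item_tokens=[12, 7, 43],
    )
    for src, tgt in pairs:
        print(f"{src}  ->  {tgt}")
```

Because each task reduces to the same sequence-to-sequence shape, the mixed pairs can be fed to one causal or encoder-decoder LLM with a standard language-modeling loss, which is what would allow a single backbone to handle both tokenization and recommendation.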
Latest articles in arXiv - CS - Information Retrieval:
- Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference
- Retrieve, Annotate, Evaluate, Repeat: Leveraging Multimodal LLMs for Large-Scale Product Retrieval Evaluation
- Active Reconfigurable Intelligent Surface Empowered Synthetic Aperture Radar Imaging
- FLARE: Fusing Language Models and Collaborative Architectures for Recommender Enhancement
- Basket-Enhanced Heterogenous Hypergraph for Price-Sensitive Next Basket Recommendation