
arXiv - CS - Information Retrieval: Latest Papers

Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference
Pub Date : 2024-09-18 DOI: arxiv-2409.12150
Najmeh Forouzandehmehr, Nima Farrokhsiar, Ramin Giahi, Evren Korpeoglu, Kannan Achan
Personalized outfit recommendation remains a complex challenge, demanding both fashion compatibility understanding and trend awareness. This paper presents a novel framework that harnesses the expressive power of large language models (LLMs) for this task, mitigating their "black box" and static nature through fine-tuning and direct feedback integration. We bridge the visual-textual gap in item descriptions by employing image captioning with a Multimodal Large Language Model (MLLM). This enables the LLM to extract style and color characteristics from human-curated fashion images, forming the basis for personalized recommendations. The LLM is efficiently fine-tuned on the open-source Polyvore dataset of curated fashion images, optimizing its ability to recommend stylish outfits. A direct preference mechanism using negative examples is employed to enhance the LLM's decision-making process. This creates a self-enhancing AI feedback loop that continuously refines recommendations in line with seasonal fashion trends. Our framework is evaluated on the Polyvore dataset, demonstrating its effectiveness in two key tasks: fill-in-the-blank and complementary item retrieval. These evaluations underline the framework's ability to generate stylish, trend-aligned outfit suggestions, continuously improving through direct feedback. The evaluation results demonstrate that our proposed framework significantly outperforms the base LLM, creating more cohesive outfits. The improved performance in these tasks underscores the proposed framework's potential to enhance the shopping experience with accurate suggestions, proving its effectiveness over vanilla LLM-based outfit generation.
Citations: 0
The Factuality of Large Language Models in the Legal Domain
Pub Date : 2024-09-18 DOI: arxiv-2409.11798
Rajaa El Hamdani, Thomas Bonald, Fragkiskos Malliaros, Nils Holzenberger, Fabian Suchanek
This paper investigates the factuality of large language models (LLMs) as knowledge bases in the legal domain, in a realistic usage scenario: we allow for acceptable variations in the answer, and let the model abstain from answering when uncertain. First, we design a dataset of diverse factual questions about case law and legislation. We then use the dataset to evaluate several LLMs under different evaluation methods, including exact, alias, and fuzzy matching. Our results show that the performance improves significantly under the alias and fuzzy matching methods. Further, we explore the impact of abstaining and in-context examples, finding that both strategies enhance precision. Finally, we demonstrate that additional pre-training on legal documents, as seen with SaulLM, further improves factual precision from 63% to 81%.
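The three matching modes and the effect of abstention on precision can be illustrated with a small sketch. The abstract does not give the exact scoring rules, so everything below is an assumption made for illustration: the `normalize` helper, the use of `difflib.SequenceMatcher` as the fuzzy similarity, the 0.8 threshold, and the convention that an abstained answer is recorded as `None` and excluded from the precision denominator.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(text.lower().split())


def is_correct(prediction, gold, aliases=(), mode="exact", threshold=0.8):
    """Compare a prediction with the gold answer under one matching mode.

    exact: normalized prediction must equal the gold answer.
    alias: any listed alias of the gold answer is also accepted.
    fuzzy: like alias, but a similarity ratio >= threshold is enough.
    """
    pred = normalize(prediction)
    candidates = [normalize(gold)]
    if mode in ("alias", "fuzzy"):
        candidates += [normalize(a) for a in aliases]
    if mode == "fuzzy":
        return any(
            SequenceMatcher(None, pred, c).ratio() >= threshold
            for c in candidates
        )
    return pred in candidates


def precision(records, mode):
    """records: iterable of (prediction or None, gold, aliases).

    A prediction of None means the model abstained; abstentions are
    left out of the denominator, so precision is computed only over
    the questions the model chose to answer.
    """
    answered = [r for r in records if r[0] is not None]
    if not answered:
        return 0.0
    correct = sum(is_correct(p, g, a, mode) for p, g, a in answered)
    return correct / len(answered)
```

Under these assumptions, relaxing the matching mode can only keep or raise the number of answers counted as correct, which is consistent with the direction of the improvement reported in the abstract.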
Citations: 0
Active Reconfigurable Intelligent Surface Empowered Synthetic Aperture Radar Imaging
Pub Date : 2024-09-18 DOI: arxiv-2409.11728
Yifan Sun, Rang Liu, Zhiping Lu, Honghao Luo, Ming Li, Qian Liu
Synthetic Aperture Radar (SAR) utilizes the movement of the radar antenna over a specific area of interest to achieve higher spatial resolution imaging. In this paper, we aim to investigate the realization of SAR imaging for a stationary radar system with the assistance of active reconfigurable intelligent surface (ARIS) mounted on an unmanned aerial vehicle (UAV). As the UAV moves along the stationary trajectory, the ARIS can not only build a high-quality virtual line-of-sight (LoS) propagation path, but its mobility can also effectively create a much larger virtual aperture, which can be utilized to realize a SAR system. In this paper, we first present a range-Doppler (RD) imaging algorithm to obtain imaging results for the proposed ARIS-empowered SAR system. Then, to further improve the SAR imaging performance, we attempt to optimize the reflection coefficients of ARIS to maximize the signal-to-noise ratio (SNR) at the stationary radar receiver under the constraints of ARIS maximum power and amplification factor. An effective algorithm based on fractional programming (FP) and majorization minimization (MM) methods is developed to solve the resulting non-convex problem. Simulation results validate the effectiveness of ARIS-assisted SAR imaging and our proposed RD imaging and ARIS optimization algorithms.
Citations: 0
Basket-Enhanced Heterogenous Hypergraph for Price-Sensitive Next Basket Recommendation
Pub Date : 2024-09-18 DOI: arxiv-2409.11695
Yuening Zhou, Yulin Wang, Qian Cui, Xinyu Guan, Francisco Cisternas
Next Basket Recommendation (NBR) is a new type of recommender system that predicts combinations of items users are likely to purchase together. Existing NBR models often overlook a crucial factor, which is price, and do not fully capture item-basket-user interactions. To address these limitations, we propose a novel method called Basket-augmented Dynamic Heterogeneous Hypergraph (BDHH). BDHH utilizes a heterogeneous multi-relational graph to capture the intricate relationships among item features, with price as a critical factor. Moreover, our approach includes a basket-guided dynamic augmentation network that dynamically enhances item-basket-user interactions. Experiments on real-world datasets demonstrate that BDHH significantly improves recommendation accuracy, providing a more comprehensive understanding of user behavior.
Citations: 0
Retrieve, Annotate, Evaluate, Repeat: Leveraging Multimodal LLMs for Large-Scale Product Retrieval Evaluation
Pub Date : 2024-09-18 DOI: arxiv-2409.11860
Kasra Hosseini, Thomas Kober, Josip Krapac, Roland Vollgraf, Weiwei Cheng, Ana Peleteiro Ramallo
Evaluating production-level retrieval systems at scale is a crucial yet challenging task due to the limited availability of a large pool of well-trained human annotators. Large Language Models (LLMs) have the potential to address this scaling issue and offer a viable alternative to humans for the bulk of annotation tasks. In this paper, we propose a framework for assessing the product search engines in a large-scale e-commerce setting, leveraging Multimodal LLMs for (i) generating tailored annotation guidelines for individual queries, and (ii) conducting the subsequent annotation task. Our method, validated through deployment on a large e-commerce platform, demonstrates comparable quality to human annotations, significantly reduces time and cost, facilitates rapid problem discovery, and provides an effective solution for production-level quality control at scale.
Citations: 0
FLARE: Fusing Language Models and Collaborative Architectures for Recommender Enhancement
Pub Date : 2024-09-18 DOI: arxiv-2409.11699
Liam Hebert, Marialena Kyriakidi, Hubert Pham, Krishna Sayana, James Pine, Sukhdeep Sodhi, Ambarish Jash
Hybrid recommender systems, combining item IDs and textual descriptions, offer potential for improved accuracy. However, previous work has largely focused on smaller datasets and model architectures. This paper introduces Flare (Fusing Language models and collaborative Architectures for Recommender Enhancement), a novel hybrid recommender that integrates a language model (mT5) with a collaborative filtering model (Bert4Rec) using a Perceiver network. This architecture allows Flare to effectively combine collaborative and content information for enhanced recommendations. We conduct a two-stage evaluation, first assessing Flare's performance against established baselines on smaller datasets, where it demonstrates competitive accuracy. Subsequently, we evaluate Flare on a larger, more realistic dataset with a significantly larger item vocabulary, introducing new baselines for this setting. Finally, we showcase Flare's inherent ability to support critiquing, enabling users to provide feedback and refine recommendations. We further leverage critiquing as an evaluation method to assess the model's language understanding and its transferability to the recommendation task.
Citations: 0
Understanding the Effects of the Baidu-ULTR Logging Policy on Two-Tower Models
Pub Date : 2024-09-18 DOI: arxiv-2409.12043
Morris de Haan, Philipp Hager
Despite the popularity of the two-tower model for unbiased learning to rank (ULTR) tasks, recent work suggests that it suffers from a major limitation that could lead to its collapse in industry applications: the problem of logging policy confounding. Several potential solutions have even been proposed; however, the evaluation of these methods was mostly conducted using semi-synthetic simulation experiments. This paper bridges the gap between theory and practice by investigating the confounding problem on the largest real-world dataset, Baidu-ULTR. Our main contributions are threefold: 1) we show that the conditions for the confounding problem hold on Baidu-ULTR, 2) the confounding problem has no significant effect on the two-tower model, and 3) we point to a potential mismatch between expert annotations, the gold standard in ULTR, and user click behavior.
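For readers unfamiliar with the two-tower model, the sketch below shows one common additive formulation, in which a relevance tower and a bias tower each produce a logit and the click probability is the sigmoid of their sum. The abstract does not specify the architecture used in the paper, so the additive form, the function names, and all numeric values here are assumptions for illustration only.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def click_probability(relevance_logit, bias_logit):
    """Additive two-tower form: tower outputs are summed before the sigmoid."""
    return sigmoid(relevance_logit + bias_logit)


# Hypothetical tower outputs, invented for this example.
# Bias tower: depends only on the rank at which the item was displayed.
position_bias_logits = {1: 1.0, 2: 0.2, 3: -0.5}
# Relevance tower: depends only on query-document features.
relevance_logits = {"doc_a": 0.3, "doc_b": 1.2}


def rank_by_relevance(logits):
    """At serving time only the relevance tower is used for ranking."""
    return sorted(logits, key=logits.get, reverse=True)
```

In this formulation, confounding becomes a concern when the logging policy that produced the clicks already places relevant documents near the top: position and relevance are then correlated in the logged data, and the sum of the two logits does not by itself determine how much of a click is due to each tower.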
Citations: 0
An Enhanced-State Reinforcement Learning Algorithm for Multi-Task Fusion in Large-Scale Recommender Systems
Pub Date : 2024-09-18 DOI: arxiv-2409.11678
Peng Liu, Jiawei Zhu, Cong Xu, Ming Zhao, Bin Wang
As the last key stage of Recommender Systems (RSs), Multi-Task Fusion (MTF) is in charge of combining multiple scores predicted by Multi-Task Learning (MTL) into a final score to maximize user satisfaction, which decides the ultimate recommendation results. In recent years, to maximize long-term user satisfaction within a recommendation session, Reinforcement Learning (RL) has been widely used for MTF in large-scale RSs. However, limited by their modeling pattern, all the current RL-MTF methods can only utilize user features as the state to generate actions for each user, and are unable to make use of item features and other valuable features, which leads to suboptimal results. Addressing this problem is a challenge that requires breaking through the current modeling pattern of RL-MTF. To solve this problem, we propose a novel method called Enhanced-State RL for MTF in RSs. Unlike the existing methods mentioned above, our method first defines user features, item features, and other valuable features collectively as the enhanced state; then proposes a novel actor and critic learning process to utilize the enhanced state to take much better actions for each user-item pair. To the best of our knowledge, this novel modeling pattern is being proposed for the first time in the field of RL-MTF. We conduct extensive offline and online experiments in a large-scale RS. The results demonstrate that our model outperforms other models significantly. Enhanced-State RL has been fully deployed in our RS for more than half a year, improving user valid consumption by +3.84% and user duration time by +0.58% compared to baseline.
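The fusion step and the difference between a user-only state and an enhanced state can be sketched as follows. The abstract does not give the fusion formula or the feature layout, so the weighted-product form, the function names, and the `item_` key prefix are all hypothetical choices made for illustration; only the idea that the state may include item features in addition to user features comes from the abstract.

```python
import math


def fuse_scores(predicted, weights):
    """Combine per-task scores into one final score.

    Assumed form: a weighted product, final = prod(score_i ** w_i).
    In an RL-MTF setup the weight vector would be the action chosen
    by the policy; here it is passed in directly.
    """
    return math.prod(predicted[task] ** w for task, w in weights.items())


def build_state(user_features, item_features=None):
    """Build the policy input.

    With item_features=None this is the user-only state; otherwise the
    item features are added under an "item_" prefix, giving a state
    that is specific to a user-item pair.
    """
    state = dict(user_features)
    if item_features is not None:
        state.update({f"item_{k}": v for k, v in item_features.items()})
    return state
```

With a user-only state, one weight vector is produced per user and shared by every candidate item. With the enhanced state, the policy can in principle output different weights for each user-item pair.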
Citations: 0
LLM-Powered Text Simulation Attack Against ID-Free Recommender Systems
Pub Date : 2024-09-18 DOI: arxiv-2409.11690
Zongwei Wang, Min Gao, Junliang Yu, Xinyi Gao, Quoc Viet Hung Nguyen, Shazia Sadiq, Hongzhi Yin
The ID-free recommendation paradigm has been proposed to address the limitation that traditional recommender systems struggle to model cold-start users or items with new IDs. Despite its effectiveness, this study uncovers that ID-free recommender systems are vulnerable to the proposed Text Simulation attack (TextSimu) which aims to promote specific target items. As a novel type of text poisoning attack, TextSimu exploits large language models (LLM) to alter the textual information of target items by simulating the characteristics of popular items. It operates effectively in both black-box and white-box settings, utilizing two key components: a unified popularity extraction module, which captures the essential characteristics of popular items, and an N-persona consistency simulation strategy, which creates multiple personas to collaboratively synthesize refined promotional textual descriptions for target items by simulating the popular items. To withstand TextSimu-like attacks, we further explore the detection approach for identifying LLM-generated promotional text. Extensive experiments conducted on three datasets demonstrate that TextSimu poses a more significant threat than existing poisoning attacks, while our defense method can detect malicious text of target items generated by TextSimu. By identifying the vulnerability, we aim to advance the development of more robust ID-free recommender systems.
Citations: 0
Generalized compression and compressive search of large datasets
Pub Date : 2024-09-18 DOI: arxiv-2409.12161
Morgan E. Prior, Thomas Howard III, Emily Light, Najib Ishaq, Noah M. Daniels
The Big Data explosion has necessitated the development of search algorithms that scale sub-linearly in time and memory. While compression algorithms and search algorithms do exist independently, few algorithms offer both, and those which do are domain-specific. We present panCAKES, a novel approach to compressive search, i.e., a way to perform $k$-NN and $\rho$-NN search on compressed data while only decompressing a small, relevant portion of the data. panCAKES assumes the manifold hypothesis and leverages the low-dimensional structure of the data to compress and search it efficiently. panCAKES is generic over any distance function for which the distance between two points is proportional to the memory cost of storing an encoding of one in terms of the other. This property holds for many widely-used distance functions, e.g. string edit distances (Levenshtein, Needleman-Wunsch, etc.) and set dissimilarity measures (Jaccard, Dice, etc.). We benchmark panCAKES on a variety of datasets, including genomic, proteomic, and set data. We compare compression ratios to gzip, and search performance between the compressed and uncompressed versions of the same dataset. panCAKES achieves compression ratios close to those of gzip, while offering sub-linear time performance for $k$-NN and $\rho$-NN search. We conclude that panCAKES is an efficient, general-purpose algorithm for exact compressive search on large datasets that obey the manifold hypothesis. We provide an open-source implementation of panCAKES in the Rust programming language.
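The property the abstract relies on, that the distance between two points tracks the cost of storing one in terms of the other, can be illustrated with a small sketch. This is not the panCAKES implementation (which the authors provide in Rust); the function names `encode` and `decode` are hypothetical, and Python's `difflib.SequenceMatcher` is used as a stand-in that yields a valid, but not necessarily minimal, edit script.

```python
from difflib import SequenceMatcher


def encode(reference: str, target: str) -> list[tuple[int, int, str]]:
    """Return the edits that turn `reference` into `target` (hypothetical helper)."""
    matcher = SequenceMatcher(None, reference, target, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Store only the reference span to replace and the replacement text.
            edits.append((i1, i2, target[j1:j2]))
    return edits


def decode(reference: str, edits: list[tuple[int, int, str]]) -> str:
    """Rebuild the target from the reference plus its stored edits."""
    pieces = []
    cursor = 0
    for i1, i2, replacement in edits:
        pieces.append(reference[cursor:i1])  # unchanged text before the edit
        pieces.append(replacement)           # inserted or substituted text
        cursor = i2                          # skip the replaced reference span
    pieces.append(reference[cursor:])        # unchanged tail
    return "".join(pieces)


if __name__ == "__main__":
    center = "ACGTACGTACGT"
    point = "ACGTTCGTACGA"
    edits = encode(center, point)
    assert decode(center, edits) == point
    print(len(edits), "edits stored instead of", len(point), "characters")
```

Under these assumptions, a string that is close to its reference needs only a few stored edits, while a distant one needs many, which is the sense in which storage cost follows distance.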
Citations: 0