ACM Transactions on Information Systems: Latest Articles (Page 10)

H3GNN: Hybrid Hierarchical HyperGraph Neural Network for Personalized Session-based Recommendation

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-23 DOI: 10.1145/3630002

Zhizhuo Yin, Kai Han, Pengzi Wang, Xi Zhu

Personalized Session-based recommendation (PSBR) is a general and challenging task in the real world, aiming to recommend a session’s next clicked item based on the session’s item transition information and the corresponding user’s historical sessions. A session is defined as a sequence of interacted items during a short period. The PSBR problem has a natural hierarchical architecture in which each session consists of a series of items, and each user owns a series of sessions. However, the existing PSBR methods can merely capture the pairwise relation information within items and users. To effectively capture the hierarchical information, we propose a novel hierarchical hypergraph neural network to model the hierarchical architecture. Moreover, considering that the items in sessions are sequentially ordered, while the hypergraph can only model the set relation, we propose a directed graph aggregator (DGA) to aggregate the sequential information from the directed global item graph. By attentively combining the embeddings of the above two modules, we propose a framework dubbed H3GNN (Hybrid Hierarchical HyperGraph Neural Network). Extensive experiments on three benchmark datasets demonstrate the superiority of our proposed model compared to the state-of-the-art methods, and ablation experiment results validate the effectiveness of all the proposed components.
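
The hierarchy described in this abstract (items grouped into sessions, sessions grouped under users, plus a directed global item graph for order) can be pictured with a small data-structure sketch. This is only an illustration under assumptions: the function name `build_hierarchy`, the `(user, index)` session identifiers, and the toy data are hypothetical and are not taken from the paper, and no neural network layers are shown.

```python
from collections import defaultdict


def build_hierarchy(user_sessions):
    """Build set-style groupings and a directed item graph from raw sessions.

    user_sessions: dict mapping a user id to a list of sessions,
    where each session is an ordered list of item ids.
    (Hypothetical input format, chosen for illustration.)
    """
    session_items = {}                   # session id -> set of items (one hyperedge per session)
    user_hyperedges = defaultdict(set)   # user id -> set of session ids (one hyperedge per user)
    transitions = defaultdict(int)       # (src item, dst item) -> count, keeps click order

    for user, sessions in user_sessions.items():
        for idx, session in enumerate(sessions):
            sid = (user, idx)
            session_items[sid] = set(session)      # a hyperedge only records membership
            user_hyperedges[user].add(sid)
            for src, dst in zip(session, session[1:]):
                transitions[(src, dst)] += 1       # a directed edge records the sequence

    return session_items, dict(user_hyperedges), dict(transitions)


toy = {"u1": [["a", "b", "c"], ["b", "c"]], "u2": [["c", "a"]]}
session_items, user_hyperedges, transitions = build_hierarchy(toy)
```

The set-valued hyperedges lose the click order, which is why the directed transition counts are kept separately; that mirrors the abstract's motivation for pairing the hypergraph with a directed graph aggregator.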



Citations: 0

Manipulating Visually-aware Federated Recommender Systems and Its Countermeasures

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-23 DOI: 10.1145/3630005

Wei Yuan, Shilong Yuan, Chaoqun Yang, Quoc Viet Hung Nguyen, Hongzhi Yin

Federated recommender systems (FedRecs) have been widely explored recently due to their capability to safeguard user data privacy. These systems enable a central server to collaboratively learn recommendation models by sharing public parameters with clients, providing privacy-preserving solutions. However, this collaborative approach also creates a vulnerability that allows adversaries to manipulate FedRecs. Existing works on FedRec security already reveal that items can easily be promoted by malicious users via model poisoning attacks, but all of them mainly focus on FedRecs with only collaborative information (i.e., user-item interactions). We contend that these attacks are effective primarily due to the data sparsity of collaborative signals. In light of this, we propose a method to address data sparsity and model poisoning threats by incorporating product visual information. Intriguingly, our empirical findings demonstrate that the inclusion of visual information renders all existing model poisoning attacks ineffective. Nevertheless, the integration of visual information also introduces a new avenue for adversaries to manipulate federated recommender systems, as this information typically originates from external sources. To assess such threats, we propose a novel form of poisoning attack tailored for visually-aware FedRecs, namely image poisoning attacks, where adversaries can gradually modify the uploaded image with human-unaware perturbations to manipulate item ranks during the FedRecs’ training process. Moreover, we provide empirical evidence showcasing a heightened threat when image poisoning attacks are combined with model poisoning attacks, resulting in easier manipulation of the federated recommendation systems. To ensure the safe utilization of visual information, we employ a diffusion model in visually-aware FedRecs to purify each uploaded image and detect the adversarial images. Extensive experiments conducted with two FedRecs on two datasets demonstrate the effectiveness and generalization of our proposed attacks and defenses.



Citations: 1

Teach and Explore: A Multiplex Information-guided Effective and Efficient Reinforcement Learning for Sequential Recommendation

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-23 DOI: 10.1145/3630003

Surong Yan, Chenglong Shi, Haosen Wang, Lei Chen, Ling Jiang, Ruilin Guo, Kwei-Jay Lin

Casting sequential recommendation (SR) as a reinforcement learning (RL) problem is promising and some RL-based methods have been proposed for SR. However, these models are sub-optimal due to the following limitations: a) they fail to leverage the supervision signals in the RL training to capture users’ explicit preferences, leading to slow convergence; and b) they do not utilize auxiliary information (e.g., knowledge graph) to avoid blindness when exploring users’ potential interests. To address the above-mentioned limitations, we propose a multiplex information-guided RL model (MELOD), which employs a novel RL training framework with Teach and Explore components for SR. We adopt a Teach component to accurately capture users’ explicit preferences and speed up RL convergence. Meanwhile, we design a dynamic intent induction network (DIIN) as a policy function to generate diverse predictions. We utilize the DIIN for the Explore component to mine users’ potential interests by conducting a sequential and knowledge information joint-guided exploration. Moreover, a sequential and knowledge-aware reward function is designed to achieve stable RL training. These components significantly improve MELOD’s performance and convergence against existing RL algorithms to achieve effectiveness and efficiency. Experimental results on seven real-world datasets show that our model significantly outperforms state-of-the-art methods.



Citations: 0

Robust Collaborative Filtering to Popularity Distribution Shift

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-12 DOI: 10.1145/3627159

An Zhang, Wenchang Ma, Jingnan Zheng, Xiang Wang, Tat-Seng Chua

In leading collaborative filtering (CF) models, representations of users and items are prone to learn popularity bias in the training data as shortcuts. The popularity shortcut tricks are good for in-distribution (ID) performance but poorly generalized to out-of-distribution (OOD) data, i.e., when popularity distribution of test data shifts w.r.t. the training one. To close the gap, debiasing strategies try to assess the shortcut degrees and mitigate them from the representations. However, there exist two deficiencies: (1) when measuring the shortcut degrees, most strategies only use statistical metrics on a single aspect (i.e., item frequency on item and user frequency on user aspect), failing to accommodate the compositional degree of a user-item pair; (2) when mitigating shortcuts, many strategies assume that the test distribution is known in advance. This results in low-quality debiased representations. Worse still, these strategies achieve OOD generalizability with a sacrifice on ID performance. In this work, we present a simple yet effective debiasing strategy, PopGo, which quantifies and reduces the interaction-wise popularity shortcut without any assumptions on the test data. It first learns a shortcut model, which yields a shortcut degree of a user-item pair based on their popularity representations. Then, it trains the CF model by adjusting the predictions with the interaction-wise shortcut degrees. By taking both causal- and information-theoretical looks at PopGo, we can justify why it encourages the CF model to capture the critical popularity-agnostic features while leaving the spurious popularity-relevant patterns out. We use PopGo to debias two high-performing CF models (MF, LightGCN) on four benchmark datasets. On both ID and OOD test sets, PopGo achieves significant gains over the state-of-the-art debiasing strategies (e.g., DICE, MACR).
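
The "statistical metrics on a single aspect" that this abstract criticizes can be made concrete with a short sketch. It counts item frequency and user frequency from an interaction log and attaches both counts to each user-item pair. The function names and the toy log are hypothetical, and this is not PopGo itself: the learned shortcut model and the prediction adjustment are not reproduced here.

```python
from collections import Counter


def popularity_counts(interactions):
    """Return (user_freq, item_freq) from a list of (user, item) pairs."""
    user_freq = Counter(user for user, _ in interactions)
    item_freq = Counter(item for _, item in interactions)
    return user_freq, item_freq


def pair_popularity(interactions):
    """Attach both single-aspect counts to every observed user-item pair."""
    user_freq, item_freq = popularity_counts(interactions)
    return {(u, i): (user_freq[u], item_freq[i]) for u, i in interactions}


# Toy interaction log (illustrative data only).
toy_log = [("u1", "i1"), ("u1", "i2"), ("u2", "i1"), ("u3", "i1")]
pairs = pair_popularity(toy_log)
```

Each count describes only one side of the pair; how the two should be combined into a single interaction-wise degree is the question the abstract says PopGo answers with a learned shortcut model.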



Citations: 1

Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-06 DOI: 10.1145/3606368

Xiaolin Chen, Xuemeng Song, Liqiang Jing, Shuo Li, Linmei Hu, Liqiang Nie

Text response generation for multimodal task-oriented dialog systems, which aims to generate the proper text response given the multimodal context, is an essential yet challenging task. Although existing efforts have achieved compelling success, they still suffer from two pivotal limitations: 1) overlook the benefit of generative pre-training, and 2) ignore the textual context related knowledge. To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation. To be specific, the dual knowledge selection component aims to select the related knowledge according to both textual and visual modalities of the given context. Thereafter, the dual knowledge-enhanced context learning component targets seamlessly integrating the selected knowledge into the multimodal context learning from both global and local perspectives, where the cross-modal semantic relation is also explored. Moreover, the knowledge-enhanced response generation component comprises a revised BART decoder, where an additional dot-product knowledge-decoder attention sub-layer is introduced for explicitly utilizing the knowledge to advance the text response generation. Extensive experiments on a public dataset verify the superiority of the proposed DKMD over state-of-the-art competitors.
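
For readers unfamiliar with the "dot-product attention" mentioned in this abstract, the following is a minimal sketch of generic scaled dot-product attention in plain Python. It assumes that decoder states act as queries and knowledge vectors act as keys and values; that mapping, the scaling by the square root of the dimension, and all names are assumptions for illustration and are not confirmed details of DKMD or its revised BART decoder.

```python
import math


def dot_product_attention(queries, keys, values):
    """Generic scaled dot-product attention over lists of float vectors.

    queries: list of query vectors (assumed here: decoder states)
    keys, values: equal-length lists of vectors (assumed here: knowledge entries)
    Returns one output vector per query.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of the query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0]]
uniform = dot_product_attention([[0.0, 0.0]], keys, values)[0]
focused = dot_product_attention([[10.0, 0.0]], keys, values)[0]
```

A zero query attends to all entries equally, while a query aligned with the first key puts nearly all weight on the first value, which is the mechanism that lets a decoder pull in the most relevant knowledge entry.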



Citations: 4

An Intent Taxonomy of Legal Case Retrieval

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-09-29 DOI: 10.1145/3626093

Yunqiu Shao, Haitao Li, Yueyue Wu, Yiqun Liu, Qingyao Ai, Jiaxin Mao, Yixiao Ma, Shaoping Ma

Legal case retrieval is a special Information Retrieval (IR) task focusing on legal case documents. Depending on the downstream tasks of the retrieved case documents, users’ information needs in legal case retrieval could be significantly different from those in Web search and traditional ad-hoc retrieval tasks. While there are several studies that retrieve legal cases based on text similarity, the underlying search intents of legal retrieval users, as shown in this paper, are more complicated than that yet mostly unexplored. To this end, we present a novel hierarchical intent taxonomy of legal case retrieval. It consists of five intent types categorized by three criteria, i.e., search for Particular Case(s), Characterization, Penalty, Procedure, and Interest. The taxonomy was constructed transparently and evaluated extensively through interviews, editorial user studies, and query log analysis. Through a laboratory user study, we reveal significant differences in user behavior and satisfaction under different search intents in legal case retrieval. Furthermore, we apply the proposed taxonomy to various downstream legal retrieval tasks, e.g., result ranking and satisfaction prediction, and demonstrate its effectiveness. Our work provides important insights into the understanding of user intents in legal case retrieval and potentially leads to better retrieval techniques in the legal domain, such as intent-aware ranking strategies and evaluation methodologies.



Citations: 3

Personalized Query Expansion with Contextual Word Embeddings

Zone 2, Computer Science | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-09-20 DOI: 10.1145/3624988

Elias Bassani, Nicola Tonellotto, Gabriella Pasi

Personalized Query Expansion, the task of expanding queries with additional terms extracted from the user-related vocabulary, is a well-known solution to improve the retrieval performance of a system w.r.t. short queries. Recent approaches rely on word embeddings to select expansion terms from user-related texts. Although delivering promising results with former word embedding techniques, we argue that these methods are not suited for contextual word embeddings, which produce a unique vector representation for each term occurrence. In this article, we propose a Personalized Query Expansion method designed to solve the issues arising from the use of contextual word embeddings with the current Personalized Query Expansion approaches based on word embeddings. Specifically, we employ a clustering-based procedure to identify the terms that better represent the user interests and to improve the diversity of those selected for expansion, achieving improvements up to 4% w.r.t. the best-performing baseline in terms of MAP@100. Moreover, our approach outperforms previous ones in terms of efficiency, allowing us to achieve sub-millisecond expansion times even in data-rich scenarios. Finally, we introduce a novel metric to evaluate the expansion terms diversity and empirically show the unsuitability of previous approaches based on word embeddings when employed along with contextual word embeddings, which cause the selection of semantically overlapping expansion terms.
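
The MAP@100 figure quoted in this abstract is a mean of per-query average precision values, each computed over the top 100 ranked results. The sketch below shows one common formulation. The normalization by min(number of relevant documents, k) is an assumption, since conventions differ between evaluation tools, and the function names and toy data are hypothetical rather than taken from the article.

```python
def average_precision_at_k(ranked, relevant, k=100):
    """Average precision over the top-k of a ranked list without duplicates.

    ranked: list of document ids, best first
    relevant: iterable of relevant document ids for the query
    Normalizes by min(len(relevant), k); other tools may divide by len(relevant).
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    score = 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            score += hits / rank   # precision at this rank
    return score / min(len(relevant), k)


def mean_average_precision(runs, qrels, k=100):
    """Mean of AP@k over all queries in `runs` (query id -> ranked list)."""
    if not runs:
        return 0.0
    total = sum(average_precision_at_k(runs[q], qrels.get(q, ()), k) for q in runs)
    return total / len(runs)


runs = {"q1": ["d1", "d2", "d3"], "q2": ["d5", "d4"]}
qrels = {"q1": {"d1", "d3"}, "q2": {"d4"}}
ap_q1 = average_precision_at_k(runs["q1"], qrels["q1"])
map_score = mean_average_precision(runs, qrels)
```

For the toy query q1, relevant documents sit at ranks 1 and 3, so the average precision is (1/1 + 2/3) / 2; an expansion method that moves relevant documents higher in the ranking raises this value.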



Citations: 0

Semantic Collaborative Learning for Cross-Modal Moment Localization

IF 5.6 Zone 2 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-09-07 DOI: 10.1145/3620669

Yupeng Hu, Kun Wang, Meng Liu, Haoyu Tang, Liqiang Nie

Localizing a desired moment within an untrimmed video via a given natural language query, i.e., cross-modal moment localization, has recently attracted widespread research attention. However, it is a challenging task because it requires not only accurately understanding intra-modal semantic information, but also explicitly capturing inter-modal semantic correlations (consistency and complementarity). Existing efforts mainly focus on intra-modal semantic understanding and inter-modal semantic alignment, while ignoring the necessary semantic supplement. Consequently, we present a cross-modal semantic perception network for more effective intra-modal semantic understanding and inter-modal semantic collaboration. Concretely, we design a dual-path representation network for intra-modal semantic modeling. Meanwhile, we develop a semantic collaborative network to achieve multi-granularity semantic alignment and hierarchical semantic supplement. Thereby, effective moment localization can be achieved based on sufficient semantic collaborative learning. Extensive comparison experiments demonstrate the promising performance of our model compared with existing state-of-the-art competitors.

Citations: 0

Multi-aspect Graph Contrastive Learning for Review-enhanced Recommendation

IF 5.6 Zone 2 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-09-05 DOI: 10.1145/3618106

K. Wang, Yanmin Zhu, Tianzi Zang, Chunyang Wang, Kuan Liu, Peibo Ma

Review-based recommender systems explore semantic aspects of users’ preferences by incorporating user-generated reviews into rating-based models. Recent works have demonstrated the potential of review information to improve recommendation capacity. However, most existing studies focus on optimizing the review-based representation learning component, and thus fail to explicitly capture fine-grained semantic aspects and ignore the intrinsic correlation between ratings and reviews. To address these problems, we propose a multi-aspect graph contrastive learning framework, named MAGCL, with three distinctive designs: (i) a multi-aspect representation learning module, which projects semantic relations into different subspaces by decoupling review information and then obtains high-order decoupled representations in each aspect via a graph encoder; (ii) a contrastive learning module, which performs graph contrastive learning to capture the correlation between rating and review patterns, using unlabeled data to generate self-supervised signals that in turn relieve the sparsity of supervision signals; and (iii) a multi-task learning module, which conducts joint training to learn high-order structure-aware yet self-discriminative node representations by combining the recommendation task and the self-supervised task, helping to alleviate the over-smoothing problem. Extensive experiments on four real-world review datasets show the superiority of the proposed MAGCL framework over several state-of-the-art methods. We also provide further analysis of the multi-aspect representations and graph contrastive learning to verify the advantages of the proposed framework.
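
The contrastive module described above aligns rating-derived and review-derived representations of the same node. The abstract does not name the loss, so the sketch below substitutes InfoNCE, a widely used contrastive objective, purely as an illustration. The function names, the temperature value, and the use of plain Python lists in place of a tensor library are all assumptions.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a: List[Sequence[float]],
             view_b: List[Sequence[float]],
             temperature: float = 0.2) -> float:
    """InfoNCE loss between two views of the same n nodes.

    Row i of view_a and row i of view_b form the positive pair; every
    other row of view_b acts as a negative for node i (hypothetical setup).
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over positives and negatives.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / n
```

The loss is small when each node's two views agree more with each other than with other nodes' views, which is the self-supervised signal the abstract refers to. No labels are needed to compute it.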

Citations: 0

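Relieving Popularity Bias in Interactive Recommendation: A Diversity-Novelty-Aware Reinforcement Learning Approach

IF 5.6 Zone 2 Computer Science Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS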

ACM Transactions on Information Systems

Pub Date : 2023-09-01 DOI: 10.1145/3618107

Xiaoyu Shi, Quanliang Liu, Hong Xie, Di Wu, Boxin Peng, Mingsheng Shang, Defu Lian

While personalization increases the utility of item recommendation, it also suffers from popularity bias. Previous methods emphasize adopting supervised learning models to relieve popularity bias in static recommendation, ignoring the dynamic transfer of user preferences and the amplification effects of the feedback loop in the recommender system (RS). In this paper, we study this issue in interactive recommendation. We argue that diversification and novelty are equally crucial for improving user satisfaction with interactive recommender systems (IRS) in this setting. To achieve this goal, we propose a Diversity-Novelty-aware Interactive Recommendation framework (DNaIR) that augments offline reinforcement learning (RL) to increase the exposure rate of high-quality long-tail items. Its main idea is first to aggregate item similarity, popularity, and quality into the reward model to help plan the RL policy. It then designs a diversity-aware stochastic action generator to achieve an efficient and lightweight DNaIR algorithm. Extensive experiments are conducted on three real-world datasets and an authentic RL environment (Virtual-Taobao). The experiments show that our model makes better and fuller use of long-tail items, especially low-popularity items of high quality, to improve recommendation satisfaction, thus achieving state-of-the-art performance.
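
The abstract says the reward model aggregates item similarity, popularity, and quality, but it does not give the formula. The sketch below shows one plausible shape for such a reward, a weighted sum in which dissimilarity and unpopularity are rewarded alongside quality. The function name, the weights, and the assumption that all three inputs are normalized to [0, 1] are hypothetical and are not taken from the DNaIR paper.

```python
def diversity_novelty_reward(similarity_to_history: float,
                             popularity: float,
                             quality: float,
                             w_diversity: float = 1.0,
                             w_novelty: float = 1.0,
                             w_quality: float = 1.0) -> float:
    """Hypothetical reward for recommending one item.

    All three signals are assumed to be normalized to [0, 1]:
    similarity to previously recommended items, popularity, and quality.
    """
    for name, value in (("similarity_to_history", similarity_to_history),
                        ("popularity", popularity),
                        ("quality", quality)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    diversity = 1.0 - similarity_to_history  # dissimilar items add diversity
    novelty = 1.0 - popularity               # long-tail items add novelty
    return w_diversity * diversity + w_novelty * novelty + w_quality * quality
```

Under this form, a high-quality long-tail item that differs from the user's recent recommendations earns more reward than an equally good popular item that resembles them, which is the behavior the abstract attributes to the framework.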

Citations: 0