ACM Transactions on Information Systems: Latest Articles, Page 9

Efficient Neural Ranking using Forward Indexes and Lightweight Encoders

Zone 2, Computer Science; Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-11-08 DOI: 10.1145/3631939

Jurek Leonhardt, Henrik Müller, Koustav Rudra, Megha Khosla, Abhijit Anand, Avishek Anand

Dual-encoder-based dense retrieval models have become the standard in IR. They employ large Transformer-based language models, which are notoriously inefficient in terms of resources and latency. We propose Fast-Forward indexes—vector forward indexes which exploit the semantic matching capabilities of dual-encoder models for efficient and effective re-ranking. Our framework enables re-ranking at very high retrieval depths and combines the merits of both lexical and semantic matching via score interpolation. Furthermore, in order to mitigate the limitations of dual-encoders, we tackle two main challenges: Firstly, we improve computational efficiency by either pre-computing representations, avoiding unnecessary computations altogether, or reducing the complexity of encoders. This allows us to considerably improve ranking efficiency and latency. Secondly, we optimize the memory footprint and maintenance cost of indexes; we propose two complementary techniques to reduce the index size and show that, by dynamically dropping irrelevant document tokens, the index maintenance efficiency can be improved substantially. We perform evaluation to show the effectiveness and efficiency of Fast-Forward indexes—our method has low latency and achieves competitive results without the need for hardware acceleration, such as GPUs.
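
The score interpolation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear interpolation weight `alpha` and a forward index stored as a plain dictionary from document ID to a pre-computed vector; the function names, the toy data, and the value of `alpha` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rerank(
    query_vec: List[float],
    lexical_hits: List[Tuple[str, float]],
    forward_index: Dict[str, List[float]],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Re-rank lexical hits by interpolating lexical and semantic scores.

    forward_index maps a document ID to its pre-computed vector, so the
    semantic score is a look-up plus a dot product; no document is encoded
    at query time. The linear form below is an assumption for illustration.
    """
    scored = []
    for doc_id, lexical_score in lexical_hits:
        semantic_score = dot(query_vec, forward_index[doc_id])
        combined = alpha * lexical_score + (1 - alpha) * semantic_score
        scored.append((doc_id, combined))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: two documents returned by a lexical ranker.
index = {"d1": [1.0, 0.0], "d2": [0.0, 1.0]}
hits = [("d1", 2.0), ("d2", 1.0)]
ranking = rerank([0.0, 4.0], hits, index, alpha=0.5)
```

In this toy case the lexically weaker document `d2` moves to the top because its stored vector matches the query vector, which is the kind of lexical/semantic trade-off that interpolation is meant to capture.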

Citations: 1

Contrastive Self-supervised Learning in Recommender Systems: A Survey

Pub Date : 2023-11-08 DOI: 10.1145/3627158

Mengyuan Jing, Yanmin Zhu, Tianzi Zang, Ke Wang

Deep learning-based recommender systems have achieved remarkable success in recent years. However, these methods usually heavily rely on labeled data (i.e., user-item interactions), suffering from problems such as data sparsity and cold-start. Self-supervised learning, an emerging paradigm that extracts information from unlabeled data, provides insights into addressing these problems. Specifically, contrastive self-supervised learning, due to its flexibility and promising performance, has attracted considerable interest and recently become a dominant branch in self-supervised learning-based recommendation methods. In this survey, we provide an up-to-date and comprehensive review of current contrastive self-supervised learning-based recommendation methods. Firstly, we propose a unified framework for these methods. We then introduce a taxonomy based on the key components of the framework, including view generation strategy, contrastive task, and contrastive objective. For each component, we provide detailed descriptions and discussions to guide the choice of the appropriate method. Finally, we outline open issues and promising directions for future research.
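
The abstract lists the contrastive objective as one component of the framework without naming a specific loss. As an illustration only, the sketch below uses InfoNCE, a widely used contrastive objective; choosing it here is an assumption, as are the function names, the plain-list embeddings, and the temperature value.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    numerator = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return numerator / (norm_u * norm_v)


def info_nce(
    anchor: List[float],
    positive: List[float],
    negatives: List[List[float]],
    temperature: float = 0.2,
) -> float:
    """InfoNCE loss for one anchor view, one positive view, and negatives.

    The loss is small when the anchor is closer to the positive view than
    to the negative views, and large otherwise.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Hypothetical two-dimensional embeddings of augmented views.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

How the two views are generated (the view generation strategy) and what is contrasted (the contrastive task) are separate choices in the taxonomy; this sketch only covers the objective.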

Citations: 3

Improving First-stage Retrieval of Point-of-Interest Search by Pre-training Models

Pub Date : 2023-11-07 DOI: 10.1145/3631937

Lang Mei, Jiaxin Mao, Juan Hu, Naiqiang Tan, Hua Chai, Ji-rong Wen

Point-of-interest (POI) search is important for location-based services, such as navigation and online ride-hailing services. The goal of POI search is to find the most relevant destinations in a large-scale POI database given a text query. To improve the effectiveness and efficiency of POI search, most existing approaches are based on a multi-stage pipeline that consists of an efficiency-oriented retrieval stage and one or more effectiveness-oriented re-ranking stages. In this paper, we focus on the first, efficiency-oriented retrieval stage of POI search. We first identify the limitations of existing first-stage POI retrieval models in capturing the semantic-geography relationship and modeling fine-grained geographical context information. Then, we propose a Geo-Enhanced Dense Retrieval framework for POI search to alleviate these problems. Specifically, the proposed framework leverages the capacity of pre-trained language models (e.g., BERT) and designs a pre-training approach to better model the semantic match between the query prefix and POIs. With the POI collection, we first perform a token-level pre-training task based on geography-sensitive masked language prediction, and design two retrieval-oriented pre-training tasks that link the address of each POI to its name and geo-location. With the user behavior logs collected from an online POI search system, we design two additional pre-training tasks based on users’ query reformulation behavior and the transitions between POIs. We also utilize a late-interaction network structure to model the fine-grained interactions between the text and geographical context information within an acceptable query latency. Extensive experiments on real-world datasets collected from the Didichuxing application demonstrate that the proposed framework achieves superior retrieval performance over existing first-stage POI retrieval methods.
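
The abstract mentions a late-interaction network structure but does not specify the scoring operator. The sketch below assumes a ColBERT-style sum-of-maxima over token vectors purely for illustration; the function names and toy vectors are hypothetical, and the paper's actual structure (including how geographical context enters the score) may differ.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query_tokens: List[Vector], poi_tokens: List[Vector]) -> float:
    """Sum, over query tokens, of the best-matching POI token similarity.

    Query and POI are encoded independently (so POI vectors can be
    pre-computed); they only interact through this cheap token-level step.
    """
    return sum(max(dot(q, d) for d in poi_tokens) for q in query_tokens)


# Hypothetical token vectors for one query and two candidate POIs.
query = [[1.0, 0.0], [0.0, 1.0]]
poi_a = [[1.0, 0.0], [0.0, 1.0]]  # matches both query tokens
poi_b = [[1.0, 0.0]]              # matches only the first query token
score_a = late_interaction_score(query, poi_a)
score_b = late_interaction_score(query, poi_b)
```

Because the POI side is encoded offline, only the query encoding and the token-level matching run at query time, which is consistent with the latency constraint the abstract describes.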

Citations: 0

SetRank: A Setwise Bayesian Approach for Collaborative Ranking in Recommender System

Pub Date : 2023-11-07 DOI: 10.1145/3626194

Chao Wang, Hengshu Zhu, Chen Zhu, Chuan Qin, Enhong Chen, Hui Xiong

Recent development of recommender systems has focused on collaborative ranking, which provides users with a sorted list rather than rating predictions. Sorted item lists reflect user preferences more directly and usually perform better than rating prediction in practice. While considerable efforts have been made in this direction, the well-known pairwise and listwise approaches are still limited by various challenges. Specifically, for the pairwise approaches, the assumption of independent pairwise preferences does not always hold in practice. Also, the listwise approaches cannot efficiently accommodate “ties” and unobserved data due to the precondition of the entire list permutation. To this end, in this article, we propose a novel setwise Bayesian approach for collaborative ranking, namely, SetRank, to inherently accommodate the characteristics of user feedback in recommender systems. SetRank aims to maximize the posterior probability of novel setwise preference structures, and three implementations of SetRank are presented. We also theoretically prove that the bound of excess risk in SetRank can be proportional to $\sqrt{M/N}$, where M and N are the numbers of items and users, respectively. Finally, extensive experiments on four real-world datasets clearly validate the superiority of SetRank compared with various state-of-the-art baselines.
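
The abstract does not give the form of the setwise preference probability. To make the idea of "one observed item preferred over a set of unobserved items" concrete, the sketch below assumes a softmax-style probability; this form, the function name, and the scores are illustrative assumptions rather than the definition used in SetRank.

```python
import math
from typing import List


def setwise_log_likelihood(pos_score: float, unobserved_scores: List[float]) -> float:
    """Log-probability that the observed item beats every item in the set.

    Assumed form: exp(pos) / (exp(pos) + sum(exp(s) for s in unobserved)).
    No ordering among the unobserved items is required, so ties between
    them do not need to be resolved.
    """
    scores = [pos_score] + unobserved_scores
    peak = max(scores)  # subtract the maximum for numerical stability
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return pos_score - log_norm


# Hypothetical predicted scores for one user.
ll_equal = setwise_log_likelihood(0.0, [0.0, 0.0, 0.0])
ll_strong = setwise_log_likelihood(5.0, [0.0, 0.0, 0.0])
```

With equal scores the observed item wins with probability 1/4, and raising its score moves the log-likelihood towards zero. A single comparison against the whole set avoids both the independence assumption across pairs and the need for a full permutation of the list.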

Citations: 0

Bi-Preference Learning Heterogeneous Hypergraph Networks for Session-based Recommendation

Pub Date : 2023-11-07 DOI: 10.1145/3631940

Xiaokun Zhang, Bo Xu, Fenglong Ma, Chenliang Li, Yuan Lin, Hongfei Lin

Session-based recommendation aims to predict the next purchased items based on anonymous behavior sequences. Numerous economic studies have revealed that item price is a key factor influencing user purchase decisions. Unfortunately, existing methods for session-based recommendation only aim at capturing user interest preference, while ignoring user price preference. There are primarily two challenges that prevent us from capturing price preference. Firstly, price preference is highly associated with various item features (i.e., category and brand), which requires mining price preference from heterogeneous information. Secondly, price preference and interest preference are interdependent and collectively determine user choice, necessitating that we jointly consider both price and interest preference for intent modeling. To handle the above challenges, we propose a novel approach, Bi-Preference Learning Heterogeneous Hypergraph Networks (BiPNet), for session-based recommendation. Specifically, customized heterogeneous hypergraph networks with a triple-level convolution are devised to capture user price and interest preference from heterogeneous features of items. Besides, we develop a Bi-Preference Learning schema to explore mutual relations between price and interest preference and collectively learn these two preferences under a multi-task learning architecture. Extensive experiments on multiple public datasets confirm the superiority of BiPNet over competitive baselines. Additional research also supports the notion that price is crucial for the task.

Citations: 0

Triple Dual Learning for Opinion-Based Explainable Recommendation

Pub Date : 2023-11-02 DOI: 10.1145/3631521

Yuting Zhang, Ying Sun, Fuzhen Zhuang, Yongchun Zhu, Zhulin An, Yongjun Xu

Recently, with the aim of enhancing the trustworthiness of recommender systems, explainable recommendation has attracted much attention from the research community. Intuitively, users’ opinions towards different aspects of an item determine their ratings (i.e., users’ preferences) for the item. Therefore, rating prediction from the perspective of opinions can realize personalized explanations at the level of item aspects and user preferences. However, there are several challenges in developing opinion-based explainable recommendation: (1) the complicated relationship between users’ opinions and ratings; (2) the difficulty of predicting potential (i.e., unseen) user-item opinions because of the sparsity of opinion information. To tackle these challenges, we propose an overall preference-aware opinion-based explainable rating prediction model by jointly modeling the multiple observations of user-item interaction (i.e., review, opinion, rating). To alleviate the sparsity problem and raise the effectiveness of opinion prediction, we further propose a triple dual learning-based framework with a newly designed triple dual constraint. Finally, experiments on three popular datasets show the effectiveness and strong explanation performance of our framework.

Citations: 0

rHDP: An Aspect Sharing-Enhanced Hierarchical Topic Model for Multi-Domain Corpus

Pub Date : 2023-11-02 DOI: 10.1145/3631352

Yitao Zhang, Changxuan Wan, Keli Xiao, Qizhi Wan, Dexi Liu, Xiping Liu

Learning topic hierarchies from a multi-domain corpus is crucial in topic modeling as it reveals valuable structural information embedded within documents. Despite the extensive literature on hierarchical topic models, effectively discovering inter-topic correlations and differences among subtopics at the same level in the topic hierarchy, obtained from multiple domains, remains an unresolved challenge. This paper proposes an enhanced nested Chinese restaurant process (nCRP), nCRP+, by introducing an additional mechanism based on the Chinese restaurant franchise (CRF) for aspect-sharing pattern extraction in the original nCRP. Subsequently, by employing the distribution extracted from nCRP+ as the prior distribution for the topic hierarchy in hierarchical Dirichlet processes (HDP), we develop a hierarchical topic model for multi-domain corpora, named rHDP. We describe the model with the analogy of a Chinese restaurant franchise based on a central kitchen and propose a hierarchical Gibbs sampling scheme to infer the model. Our method effectively constructs well-established topic hierarchies, accurately reflecting diverse parent-child topic relationships, explicit topic aspect sharing correlations for inter-topics, and differences between these shared topics. To validate the efficacy of our approach, we conduct experiments using a renowned public dataset and an online collection of Chinese financial documents. The experimental results confirm the superiority of our method over state-of-the-art techniques in identifying multi-domain topic hierarchies, according to multiple evaluation metrics.
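
For readers unfamiliar with the nested Chinese restaurant process that the paper builds on, the sketch below samples document paths through a topic tree under the standard nCRP prior: at each level an existing child is chosen in proportion to how many earlier documents passed through it, and a new child is created in proportion to a concentration parameter. It does not implement nCRP+ or rHDP; the tree representation, function names, and the `gamma` value are assumptions for illustration.

```python
import random
from typing import Dict, List


def crp_choice(counts: List[int], gamma: float, rng: random.Random) -> int:
    """Pick an existing table with probability proportional to its count,
    or a new table (index len(counts)) with probability proportional to gamma."""
    threshold = rng.random() * (sum(counts) + gamma)
    cumulative = 0.0
    for index, count in enumerate(counts):
        cumulative += count
        if threshold < cumulative:
            return index
    return len(counts)


def sample_path(root: Dict, depth: int, gamma: float, rng: random.Random) -> List[int]:
    """Sample one root-to-leaf path of the given depth and update the counts."""
    node, path = root, []
    for _ in range(depth):
        children = node["children"]
        k = crp_choice([child["count"] for child in children], gamma, rng)
        if k == len(children):  # open a new branch of the topic tree
            children.append({"count": 0, "children": []})
        children[k]["count"] += 1
        path.append(k)
        node = children[k]
    return path


rng = random.Random(0)
tree = {"count": 0, "children": []}
paths = [sample_path(tree, depth=3, gamma=1.0, rng=rng) for _ in range(20)]
```

Popular branches attract more documents over time, while `gamma` controls how readily new subtopics appear. According to the abstract, nCRP+ adds a franchise-based mechanism on top of this process so that aspects can be shared across topics.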

Citations: 0

Understanding Feeling-of-Knowing in Information Search: An EEG Study

Pub Date : 2023-10-30 DOI: 10.1145/3611384

Dominika Michalkova, Mario Parra Rodriguez, Yashar Moshfeghi

The realisation and variability of information needs (IN) with respect to a searcher’s gap in knowledge are driven by the perceived Anomalous State of Knowledge (ASK). The concept of Feeling-of-Knowing (FOK), the introspective feeling of knowledge awareness, shares the characteristics of an ASK state. From an IR perspective, FOK as a premise that triggers IN is unexplored. Motivated by neuroimaging studies in IR, we investigate the neurophysiological drivers associated with FOK to provide evidence validating FOK as a distinctive state in IN realisation. We employ electroencephalography (EEG) to capture the brain activity of 24 healthy participants performing a textual question-answering IR scenario. We analyse the evoked neural patterns corresponding to three states of knowledge: (1) “I know”, (2) “FOK”, and (3) “I do not know”. Our findings show distinct neurophysiological signatures (N1, P2, N400, P6) in response to information segments processed in the context of these three levels. They further reveal that the brain manifestation associated with “FOK” does not differ significantly from that associated with “I do not know”, indicating that both are associated with the recognition of a gap in knowledge, which could further inform IN formation at different levels of knowing.

Citations: 0

Community preserving social recommendation with Cyclic Transfer Learning

Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-28 DOI: 10.1145/3631115

Xuelian Ni, Fei Xiong, Shirui Pan, Jia Wu, Liang Wang, Hongshu Chen

Transfer learning-based recommendation mitigates the sparsity of user-item interactions by introducing auxiliary domains. Social influence extracted from direct connections between users typically serves as an auxiliary domain to improve prediction performance. However, direct social connections also suffer from severe data sparsity, which limits model performance. In contrast, users’ dependency on communities is another valuable source of social information that has not yet received sufficient attention. Although studies have incorporated community information into recommendation by aggregating users’ preferences within the same community, they seldom capture the structural discrepancies among communities or the influence of those discrepancies on users’ preferences. To address these challenges, we propose a community-preserving recommendation framework with cyclic transfer learning that incorporates heterogeneous community influence into the rating domain. We analyze the characteristics of the community domain and its mutual influence with the rating domain, and construct link constraints and preference constraints in the community domain. This allows the shared vectors that bridge the rating domain and the community domain to be more consistent with the characteristics of both domains. Extensive experiments are conducted on four real-world datasets. The results demonstrate that our approach captures real users’ preferences better than other state-of-the-art methods.

Citations: 0

Better Understanding Procedural Search Tasks: Perceptions, Behaviors, and Challenges

Zone 2 (Computer Science); Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems

Pub Date : 2023-10-23 DOI: 10.1145/3630004

Bogeum Choi, Sarah Casteel, Jaime Arguello, Robert Capra

People often search for information to acquire procedural knowledge, that is, “how to” knowledge about step-by-step procedures, methods, algorithms, techniques, heuristics, and skills. A procedural search task might involve implementing a solution to a problem, evaluating different approaches to a problem, or brainstorming on the types of problems that can be solved with a specific resource. We report on a study (N = 36) that aimed to better understand how people search for procedural knowledge. Much research has investigated how search task characteristics impact people’s perceptions and behaviors. Along these lines, we manipulated procedural search tasks along two orthogonal dimensions: product and goal. The product dimension relates to the main outcome of the task, and the goal dimension relates to the task’s success criteria. We manipulated tasks across three product categories and two goal categories. The study investigated four research questions. First, we examined the effects of the product and goal on participants’ (RQ1) pre-task perceptions, (RQ2) post-task perceptions, and (RQ3) search behaviors. Second, regardless of the task product and goal, we closely examined how people search for procedural knowledge by analyzing participants’ think-aloud comments and screen activities. Specifically, we report on (RQ4) important relevance criteria, types of information sought, and challenges.

Citations: 0