Recommender systems have become crucial in information filtering. Existing recommender systems extract user preferences from correlations in data, such as behavioral correlation in collaborative filtering, or feature-feature and feature-behavior correlation in click-through rate prediction. However, the real world is driven by causality, not just correlation, and correlation does not imply causation. For instance, a recommender system might recommend a battery charger to a user who has just bought a phone, where the purchase of the phone is the cause of the interest in the charger; such a causal relation cannot be reversed. To address this, researchers in recommender systems have recently begun using causal inference to extract causality and thereby enhance recommendation. In this survey, we offer a comprehensive review of the literature on causal inference-based recommendation. We first introduce the fundamental concepts of both recommender systems and causal inference as the foundation for subsequent content. We then highlight the typical issues faced by non-causal recommender systems. Following that, we thoroughly review the existing work on causal inference-based recommender systems, organized by a taxonomy of three kinds of challenges that causal inference can address. Finally, we discuss the open problems in this critical research area and suggest important directions for future work.
Causal Inference in Recommender Systems: A Survey and Future Directions. Chen Gao, Yu Zheng, Wenjie Wang, Fuli Feng, Xiangnan He, Yong Li. ACM Transactions on Information Systems, published 2024-01-02. https://doi.org/10.1145/3639048
Mainstream solutions to sequential recommendation represent items with fixed vectors. These vectors have limited capability in capturing items’ latent aspects and users’ diverse preferences. As a new generative paradigm, diffusion models have achieved excellent performance in areas like computer vision and natural language processing. To our understanding, their unique merit in representation generation fits the problem setting of sequential recommendation well. In this article, we make the very first attempt to adapt the diffusion model to sequential recommendation and propose DiffuRec for item representation construction and uncertainty injection. Rather than modeling item representations as fixed vectors, DiffuRec represents them as distributions, which adaptively reflect a user’s multiple interests and an item’s various aspects. In the diffusion phase, DiffuRec corrupts the target item embedding into a Gaussian distribution by adding noise, which is further applied for sequential item distribution representation generation and uncertainty injection. Afterward, the item representation is fed into an approximator for target item representation reconstruction. In the reverse phase, based on a user’s historical interaction behaviors, we reverse Gaussian noise into the target item representation, then apply a rounding operation for target item prediction. Experiments over four datasets show that DiffuRec outperforms strong baselines by a large margin.
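The two ends of the pipeline described above, noise adding and rounding, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the standard DDPM-style corruption formula and assumes that rounding means picking the item whose embedding has the largest inner product with the reconstructed representation. The names `forward_noise`, `rounding`, and `alpha_bar`, and the toy item embeddings, are hypothetical.

```python
import math
import random


def forward_noise(x0, alpha_bar, rng):
    """Assumed DDPM-style corruption: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    scale_signal = math.sqrt(alpha_bar)
    scale_noise = math.sqrt(1.0 - alpha_bar)
    return [scale_signal * x + scale_noise * rng.gauss(0.0, 1.0) for x in x0]


def rounding(representation, item_embeddings):
    """Map a continuous representation to the item id with the largest inner product."""
    def score(item_id):
        return sum(a * b for a, b in zip(representation, item_embeddings[item_id]))
    return max(item_embeddings, key=score)


rng = random.Random(0)
# Toy 2-dimensional item embeddings (illustrative values only).
items = {"phone": [1.0, 0.0], "charger": [0.0, 1.0]}

# alpha_bar close to 0 yields almost pure Gaussian noise; alpha_bar = 1 leaves x0 intact.
noisy = forward_noise(items["charger"], alpha_bar=0.1, rng=rng)
clean = forward_noise(items["charger"], alpha_bar=1.0, rng=rng)
predicted = rounding(clean, items)
```

In a full model the reverse phase would denoise step by step, conditioned on the user's interaction history, before the rounding step is applied; that learned component is omitted here.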
DiffuRec: A Diffusion Model for Sequential Recommendation. Zihao Li, Aixin Sun, Chenliang Li. ACM Transactions on Information Systems, published 2023-12-29. https://doi.org/10.1145/3631116
The emergence of Graph Neural Networks (GNNs) has greatly advanced the development of recommendation systems. Recently, many researchers have leveraged GNN-based models to learn fair representations for users and items. However, current GNN-based models suffer from biased user-item interaction data, which negatively impacts recommendation fairness. Although several studies have employed adversarial learning to mitigate this issue in recommendation systems, they mostly focus on modifying the model training approach with fairness regularization and neglect direct intervention on the biased interactions. Different from these models, this paper introduces a novel perspective by directly intervening in observed interactions to generate a counterfactual graph (called FairGap) that is not influenced by sensitive node attributes, enabling us to learn fair representations for users and items easily. We design FairGap to answer the key counterfactual question: “Would interactions with an item remain unchanged if the user’s sensitive attributes were concealed?” We also provide theoretical proofs to show that our learning strategy via the counterfactual graph is unbiased in expectation. Moreover, we propose a fairness-enhancing mechanism to continuously improve user fairness in graph-based recommendation. Extensive experimental results against state-of-the-art competitors and base models on three real-world datasets validate the effectiveness of our proposed model.
FairGap: Fairness-aware Recommendation via Generating Counterfactual Graph. Wei Chen, Yiqing Wu, Zhao Zhang, Fuzhen Zhuang, Zhongshi He, Ruobing Xie, Feng Xia. ACM Transactions on Information Systems, published 2023-12-22. https://doi.org/10.1145/3638352
Haokai Ma, Ruobing Xie, Lei Meng, Xin Chen, Xu Zhang, Leyu Lin, Jie Zhou
Cross-domain recommendation (CDR) aims to leverage the correlation of users’ behaviors in both the source and target domains to improve user preference modeling in the target domain. Conventional CDR methods typically explore the dual relations between the source and target domains’ behaviors. However, this may ignore the informative mixed behaviors that naturally reflect the user’s global preference. To address this issue, we present a novel framework, termed triple sequence learning for cross-domain recommendation (Tri-CDR), which jointly models the source, target, and mixed behavior sequences to highlight the global and target preferences and precisely model the triple correlation in CDR. Specifically, Tri-CDR independently models the hidden representations for the triple behavior sequences and proposes a triple cross-domain attention (TCA) method to emphasize the informative knowledge related to both the user’s global and target-domain preferences. To comprehensively explore the cross-domain correlations, we design a triple contrastive learning (TCL) strategy that simultaneously considers the coarse-grained similarities and fine-grained distinctions among the triple sequences, ensuring alignment while preserving information diversity across domains. We conduct extensive experiments and analyses on six cross-domain settings. The significant improvements of Tri-CDR with different sequential encoders verify its effectiveness and universality. The source code is available at https://github.com/hulkima/Tri-CDR.
Triple Sequence Learning for Cross-domain Recommendation. ACM Transactions on Information Systems, published 2023-12-22. https://doi.org/10.1145/3638351
Chaoran Cui, Yumo Yao, Chunyun Zhang, Hebo Ma, Yuling Ma, Zhaochun Ren, Chen Zhang, James Ko
Knowledge tracing aims to trace students’ evolving knowledge states by predicting their future performance on concept-related exercises. Recently, some graph-based models have been developed to incorporate the relationships between exercises to improve knowledge tracing, but only a single type of relationship information is generally explored. In this paper, we present a novel Dual Graph Ensemble learning method for Knowledge Tracing (DGEKT), which establishes a dual graph structure of students’ learning interactions to capture the heterogeneous exercise-concept associations and interaction transitions by hypergraph modeling and directed graph modeling, respectively. To combine the dual graph models, we introduce the technique of online knowledge distillation. This choice arises from the observation that, while the knowledge tracing model is designed to predict students’ responses to the exercises related to different concepts, it is optimized merely with respect to the prediction accuracy on a single exercise at each step. With online knowledge distillation, the dual graph models are adaptively combined to form a stronger ensemble teacher model, which provides its predictions on all exercises as extra supervision for better modeling ability. In the experiments, we compare DGEKT against eight knowledge tracing baselines on three benchmark datasets, and the results demonstrate that DGEKT achieves state-of-the-art performance.
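The distillation idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the DGEKT training objective: the ensemble weight `w` is fixed here (the abstract says the combination is adaptive), the loss is plain binary cross-entropy, and the names `ensemble`, `total_loss`, and `lam` are hypothetical. The point it shows is the contrast between the hard label, which covers only the one exercise answered at a step, and the teacher's soft predictions, which cover all exercises.

```python
import math


def bce(p, target):
    """Binary cross-entropy for one prediction; target may be a soft label in [0, 1]."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def ensemble(preds_a, preds_b, w):
    """Teacher prediction as a weighted combination of the two graph models' outputs."""
    return [w * a + (1.0 - w) * b for a, b in zip(preds_a, preds_b)]


def total_loss(student_preds, teacher_preds, observed_idx, observed_label, lam=0.5):
    """Hard loss on the single observed exercise plus soft loss over all exercises."""
    hard = bce(student_preds[observed_idx], observed_label)
    soft = sum(bce(p, t) for p, t in zip(student_preds, teacher_preds)) / len(student_preds)
    return hard + lam * soft


# Predicted probability of a correct answer on each of three exercises (toy values).
hypergraph_preds = [0.2, 0.8, 0.5]
digraph_preds = [0.6, 0.4, 0.7]
teacher = ensemble(hypergraph_preds, digraph_preds, w=0.5)

# The student answered exercise 1 correctly at this step; the other exercises have no label.
loss = total_loss(hypergraph_preds, teacher, observed_idx=1, observed_label=1.0)
```

Each of the two models would be trained with such a loss against the shared teacher, so that supervision reaches exercises the student did not attempt at the current step.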
DGEKT: A Dual Graph Ensemble Learning Method for Knowledge Tracing. ACM Transactions on Information Systems, published 2023-12-22. https://doi.org/10.1145/3638350
Hédi Razgallah, Michalis Vlachos, Ahmad Ajalloeian, Ninghao Liu, Johannes Schneider, Alexis Steinmann
The application of recommendation technologies has been crucial in the promotion of physical and digital content across numerous global platforms such as Amazon, Apple, and Netflix. Our study aims to investigate the advantages of employing recommendation technologies on educational platforms, with a particular focus on an educational platform for learning and practicing music.
Our research is based on data from Tomplay, a music platform that offers sheet music with professional audio recordings, enabling users to discover and practice music content at varying levels of difficulty. Through our analysis, we emphasize the distinct interaction patterns on educational platforms like Tomplay, which we compare with other commonly used recommendation datasets. We find that interactions are comparatively sparse on educational platforms, with users often focusing on specific content as they learn, rather than interacting with a broader range of material. Therefore, our primary goal is to address the issue of data sparsity. We achieve this through entity resolution principles and propose a neural network (NN) based recommendation model. Further, we improve this model by utilizing graph neural networks (GNNs), which provide superior predictive accuracy compared to NNs. Notably, our study demonstrates that GNNs are highly effective even for users with little or no historical preferences (cold-start problem).
Our cold-start experiments also provide valuable insights into an independent issue, namely the number of historical interactions needed by a recommendation model to gain a comprehensive understanding of a user. Our findings demonstrate that a platform acquires a solid knowledge of a user’s general preferences and characteristics with 50 past interactions. Overall, our study makes significant contributions to information systems research on business analytics and prescriptive analytics. Moreover, our framework and evaluation results offer implications for various stakeholders, including online educational institutions, education policymakers, and learning platform users.
Using Neural and Graph Neural Recommender systems to Overcome Choice Overload: Evidence from a Music Education Platform. ACM Transactions on Information Systems, published 2023-12-20. https://doi.org/10.1145/3637873
Jiechen Xu, Lei Han, Shazia Sadiq, Gianluca Demartini
Misinformation has been spreading rapidly online. The common approach to dealing with it is to deploy expert fact-checkers who follow forensic processes to establish the veracity of statements. Unfortunately, such an approach does not scale well. To deal with this, crowdsourcing has been considered as an opportunity to complement the work done by trained journalists. In this paper, we look at the effect of presenting the crowd with evidence from others while judging the veracity of statements. We implement several variants of the judgment task design to understand whether and how the presented evidence affects the way crowd workers judge truthfulness and their performance. Our results show that, in certain cases, the presented evidence and the way in which it is presented may mislead crowd workers who would otherwise be more accurate if judging independently of others. Those who make appropriate use of the provided evidence, however, can benefit from it and generate better judgments.
On the Impact of Showing Evidence from Peers in Crowdsourced Truthfulness Assessments. ACM Transactions on Information Systems, published 2023-12-19. https://doi.org/10.1145/3637872
Self-attention models have achieved state-of-the-art performance in sequential recommender systems by capturing the sequential dependencies among user-item interactions. However, they rely on adding positional embeddings to the item sequence to retain the sequential information, which may break the semantics of item embeddings due to the heterogeneity between these two types of embeddings. In addition, most existing works assume that such dependencies exist solely in the item embeddings, but neglect their existence among the item features. In our previous study, we proposed a novel sequential recommendation model, MLP4Rec, based on recent advances in MLP-Mixer architectures, which is naturally sensitive to the order of items in a sequence because matrix elements related to different positions of a sequence are given different weights in training. We developed a tri-directional fusion scheme to coherently capture sequential, cross-channel, and cross-feature correlations with linear computational complexity and far fewer model parameters than existing self-attention methods. However, the cascading mixer structure, the large number of normalization layers between different mixer layers, and the noise generated by these operations limit the efficiency of information extraction and the effectiveness of MLP4Rec. In this extended version, we propose a novel framework, SMLP4Rec, for sequential recommendation to address the aforementioned issues. The new framework changes the flawed cascading structure to a parallel mode, and integrates normalization layers to minimize their impact on the model’s efficiency while maximizing their effectiveness. As a result, the training speed and prediction accuracy of SMLP4Rec are vastly improved in comparison to MLP4Rec. Extensive experimental results demonstrate that the proposed method is significantly superior to the state-of-the-art approaches.
The implementation code is available online to ease reproducibility.
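The claim that a mixer over the sequence dimension is naturally order-sensitive can be checked with a small sketch. This is only an illustration of the general idea, not the MLP4Rec or SMLP4Rec architecture: it uses a single linear mixing step with hand-picked weights, and the names `sequence_mix` and `mean_pool` are hypothetical. Because each output position has its own weight for every input position, reordering the items changes the result, whereas an order-agnostic operation such as mean pooling does not.

```python
def sequence_mix(seq, weights):
    """Mix across positions: out[i][c] = sum over j of weights[i][j] * seq[j][c]."""
    length, channels = len(seq), len(seq[0])
    return [
        [sum(weights[i][j] * seq[j][c] for j in range(length)) for c in range(channels)]
        for i in range(length)
    ]


def mean_pool(seq):
    """Order-agnostic baseline: average each channel over all positions."""
    length, channels = len(seq), len(seq[0])
    return [sum(seq[j][c] for j in range(length)) / length for c in range(channels)]


# Three items with 2-dimensional embeddings (toy values).
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Position-specific mixing weights; in a trained model these would be learned.
weights = [
    [1.0, 0.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.2, 0.3, 0.5],
]

mixed = sequence_mix(seq, weights)
mixed_reversed = sequence_mix(seq[::-1], weights)
order_matters = mixed != mixed_reversed
pooling_ignores_order = mean_pool(seq) == mean_pool(seq[::-1])
```

This is why such a model needs no separate positional embedding: position information lives in the mixing weights themselves.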
SMLP4Rec: An Efficient all-MLP Architecture for Sequential Recommendations. Jingtong Gao, Xiangyu Zhao, Muyang Li, Minghao Zhao, Runze Wu, Ruocheng Guo, Yiding Liu, Dawei Yin. ACM Transactions on Information Systems, published 2023-12-18. https://doi.org/10.1145/3637871
Wayne Xin Zhao, Jing Liu, Ruiyang Ren, Ji-Rong Wen
Text retrieval is a long-standing research topic in information seeking, where a system is required to return relevant information resources in response to users' natural-language queries. From heuristic-based retrieval methods to learning-based ranking functions, the underlying retrieval models have continually evolved with ongoing technical innovation. To design effective retrieval models, a key point lies in how to learn text representations and model relevance matching. The recent success of pretrained language models (PLMs) sheds light on developing more capable text retrieval approaches by leveraging the excellent modeling capacity of PLMs. With powerful PLMs, we can effectively learn the semantic representations of queries and texts in the latent representation space, and further construct the semantic matching function between the dense vectors for relevance modeling. Such a retrieval approach is called dense retrieval, since it employs dense vectors to represent the texts. Considering the rapid progress on dense retrieval, this survey systematically reviews the recent progress on PLM-based dense retrieval. Unlike previous surveys on dense retrieval, we take a new perspective and organize the related studies around four major aspects (architecture, training, indexing, and integration), thoroughly summarizing the mainstream techniques for each aspect. We extensively collect the recent advances on this topic and include 300+ reference papers. To support our survey, we create a website providing useful resources and release a code repository for dense retrieval. This survey aims to provide a comprehensive, practical reference focused on the major progress in dense text retrieval.
{"title":"Dense Text Retrieval based on Pretrained Language Models: A Survey","authors":"Wayne Xin Zhao, Jing Liu, Ruiyang Ren, Ji-Rong Wen","doi":"10.1145/3637870","DOIUrl":"https://doi.org/10.1145/3637870","url":null,"abstract":"<p>Text retrieval is a long-standing research topic on information seeking, where a system is required to return relevant information resources to user’s queries in natural language. From heuristic-based retrieval methods to learning-based ranking functions, the underlying retrieval models have been continually evolved with the ever-lasting technical innovation. To design effective retrieval models, a key point lies in how to learn text representations and model the relevance matching. The recent success of pretrained language models (PLM) sheds light on developing more capable text retrieval approaches by leveraging the excellent modeling capacity of PLMs. With powerful PLMs, we can effectively learn the semantic representations of queries and texts in the latent representation space, and further construct the semantic matching function between the dense vectors for relevance modeling. Such a retrieval approach is called <i>dense retrieval</i>, since it employs dense vectors to represent the texts. Considering the rapid progress on dense retrieval, this survey systematically reviews the recent progress on PLM-based dense retrieval. Different from previous surveys on dense retrieval, we take a new perspective to organize the related studies by four major aspects, including architecture, training, indexing and integration, and thoroughly summarize the mainstream techniques for each aspect. We extensively collect the recent advances on this topic, and include 300+ reference papers. To support our survey, we create a website for providing useful resources, and release a code repository for dense retrieval. 
This survey aims to provide a comprehensive, practical reference focused on the major progress for dense text retrieval.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"70 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138716968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
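The dense retrieval abstract above describes representing queries and texts as dense vectors and scoring relevance with a matching function between them. The sketch below shows that scoring step under loud assumptions: the vectors are hand-written toy values rather than PLM outputs, the matching function is assumed to be a plain inner product, and `dense_retrieve`, `TOY_TEXT_VECS`, and `TOY_QUERY_VEC` are hypothetical names.

```python
from typing import Dict, List, Tuple


def dot(u: List[float], v: List[float]) -> float:
    """Inner product, used here as the assumed semantic matching function."""
    return sum(a * b for a, b in zip(u, v))


def dense_retrieve(
    query_vec: List[float],
    text_vecs: Dict[str, List[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every text against the query and return the top-k (id, score) pairs."""
    scored = [(text_id, dot(query_vec, vec)) for text_id, vec in text_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings standing in for encoder outputs (assumed values).
TOY_TEXT_VECS = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.1, 0.8, 0.1],
    "d3": [0.5, 0.5, 0.0],
}
TOY_QUERY_VEC = [1.0, 0.0, 0.0]

top = dense_retrieve(TOY_QUERY_VEC, TOY_TEXT_VECS, k=2)
top_ids = [text_id for text_id, _ in top]
```

This exhaustive scan is only for illustration. The indexing aspect that the survey lists concerns how such vector search is made efficient at scale, which this sketch does not attempt.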
Ziyi Ye, Xiaohui Xie, Qingyao Ai, Yiqun Liu, Zhihong Wang, Weihang Su, Min Zhang
The Relevance Feedback (RF) process relies on accurate and real-time relevance estimation of feedback documents to improve retrieval performance. Since collecting explicit relevance annotations imposes an extra burden on the user, extensive studies have explored using pseudo-relevance signals and implicit feedback signals as substitutes. However, such signals are indirect indicators of relevance and struggle in complex search scenarios where user interactions are absent or biased.
Recent advances in portable and high-precision brain-computer interface (BCI) devices have made it possible to monitor users' brain activity during the search process. Brain signals can directly reflect users' psychological responses to search results and can thus act as additional, unbiased RF signals. To explore the effectiveness of brain signals in the context of RF, we propose a novel RF framework that combines BCI-based relevance feedback with pseudo-relevance signals and implicit signals to improve the performance of document re-ranking. The experimental results on the user study dataset show that incorporating brain signals leads to significant performance improvement in our RF framework. In addition, we observe that brain signals perform particularly well in several hard search scenarios, especially when implicit feedback signals are missing or noisy. This reveals when and how to exploit brain signals in the context of RF.
{"title":"Relevance Feedback with Brain Signals","authors":"Ziyi Ye, Xiaohui Xie, Qingyao Ai, Yiqun Liu, Zhihong Wang, Weihang Su, Min Zhang","doi":"10.1145/3637874","DOIUrl":"https://doi.org/10.1145/3637874","url":null,"abstract":"<p>The Relevance Feedback (RF) process relies on accurate and real-time relevance estimation of feedback documents to improve retrieval performance. Since collecting explicit relevance annotations imposes an extra burden on the user, extensive studies have explored using pseudo-relevance signals and implicit feedback signals as substitutes. However, such signals are indirect indicators of relevance and suffer from complex search scenarios where user interactions are absent or biased. </p><p>Recently, the advances in portable and high-precision brain-computer interface (BCI) devices have shown the possibility to monitor user’s brain activities during search process. Brain signals can directly reflect user’s psychological responses to search results and thus it can act as additional and unbiased RF signals. To explore the effectiveness of brain signals in the context of RF, we propose a novel RF framework that combines BCI-based relevance feedback with pseudo-relevance signals and implicit signals to improve the performance of document re-ranking. The experimental results on the user study dataset show that incorporating brain signals leads to significant performance improvement in our RF framework. Besides, we observe that brain signals perform particularly well in several hard search scenarios, especially when implicit signals as feedback are missing or noisy. 
This reveals when and how to exploit brain signals in the context of RF.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"9 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2023-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138717032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
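The relevance feedback abstract above describes combining brain-based, pseudo-relevance, and implicit signals to re-rank documents, and notes that brain signals matter most when implicit signals are missing. The sketch below is one simple way such a combination could look, not the authors' framework: it assumes a weighted average over whichever signals are present, and the function names, the weights, and the toy scores are all hypothetical.

```python
from typing import Dict, List, Optional

Signals = Dict[str, Optional[float]]


def fuse(signals: Signals, weights: Dict[str, float]) -> float:
    """Weighted average of the available relevance signals.

    A signal set to None is treated as missing, and the remaining
    weights are renormalized so the estimate stays on the same scale.
    """
    present = {name: value for name, value in signals.items() if value is not None}
    total = sum(weights[name] for name in present)
    if total == 0:
        return 0.0
    return sum(weights[name] * value for name, value in present.items()) / total


def rerank(docs: Dict[str, Signals], weights: Dict[str, float]) -> List[str]:
    """Order document ids by fused relevance estimate, highest first."""
    return sorted(docs, key=lambda doc_id: fuse(docs[doc_id], weights), reverse=True)


# Assumed weights and toy per-document signal scores in [0, 1].
TOY_WEIGHTS = {"brain": 0.5, "pseudo": 0.3, "implicit": 0.2}
TOY_DOCS = {
    "a": {"brain": 0.2, "pseudo": 0.9, "implicit": None},  # no interaction logged
    "b": {"brain": 0.9, "pseudo": 0.4, "implicit": None},  # no interaction logged
    "c": {"brain": 0.5, "pseudo": 0.5, "implicit": 0.5},
}

ranking = rerank(TOY_DOCS, TOY_WEIGHTS)
```

In this toy setup, documents "a" and "b" have no implicit feedback, so their ranking depends on the brain and pseudo-relevance scores alone, mirroring the hard scenario the abstract mentions.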