Jiaul H. Paik, Yash Agrawal, Sahil Rishi, Vaishal Shah
Existing probabilistic retrieval models do not restrict the domain of the random variables that they deal with. In this article, we show that the upper bound of the normalized term frequency (tf) from the relevant documents is much smaller than the upper bound of the normalized tf from the whole collection. As a result, the existing models suffer from two major problems: (i) the domain mismatch causes data modeling error, and (ii) because the outliers have very large magnitudes and the retrieval models follow the tf hypothesis, the combination of these two factors tends to overestimate the relevance score. In an attempt to address these problems, we propose novel weighted probabilistic models based on truncated distributions. We evaluate our models on a set of large document collections. Significant performance improvement over six existing probabilistic models is demonstrated.
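As a rough illustration of what restricting the domain of a random variable means, the sketch below truncates an exponential density to the interval [0, upper] and renormalizes it so that it still integrates to one. The exponential distribution, the function name, and the parameters `rate` and `upper` are assumptions chosen for illustration; the abstract does not say which distributions the article truncates or how the bound is chosen.

```python
import math


def truncated_exponential_pdf(x, rate, upper):
    """Density of an exponential(rate) variable restricted to [0, upper].

    Illustrative only: the distribution family is an assumption, not the
    article's model.
    """
    if x < 0 or x > upper:
        # Values outside the truncated domain get zero density.
        return 0.0
    # Probability mass the untruncated exponential places on [0, upper].
    mass = 1.0 - math.exp(-rate * upper)
    # Renormalize so the truncated density integrates to one.
    return rate * math.exp(-rate * x) / mass


if __name__ == "__main__":
    # An outlier beyond the upper bound contributes nothing.
    print(truncated_exponential_pdf(5.0, rate=1.0, upper=2.0))
    # Inside the domain the density is scaled up relative to the untruncated one.
    print(truncated_exponential_pdf(0.5, rate=1.0, upper=2.0))
```

Under this construction, a very large normalized tf value that falls above the bound no longer stretches the fitted distribution, which is the intuition behind the domain-mismatch argument in the abstract.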
{"title":"Truncated Models for Probabilistic Weighted Retrieval","authors":"Jiaul H. Paik, Yash Agrawal, Sahil Rishi, Vaishal Shah","doi":"10.1145/3476837","DOIUrl":"https://doi.org/10.1145/3476837","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"21 1","pages":"1 - 24"},"PeriodicalIF":0.0,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89004164","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As a major component of strategic talent management, learning and development (L&D) aims at improving individual and organizational performance by planning tailored training for employees to increase and improve their skills and knowledge. While many companies have developed learning management systems (LMSs) to facilitate the online training of employees, a long-standing important issue is how to achieve personalized training recommendations that take into account employees’ needs for future career development. To this end, in this article, we present a focused study on an explainable personalized online course recommender system for enhancing employee training and development. Specifically, we first propose a novel end-to-end hierarchical framework, namely the Demand-aware Collaborative Bayesian Variational Network (DCBVN), to jointly model both the employees’ current competencies and their career development preferences in an explainable way. In DCBVN, we first extract the latent interpretable representations of the employees’ competencies from their skill profiles with topic modeling based on autoencoding variational inference. Then, we develop an effective demand recognition mechanism for learning the employees’ personal demands for career development. In particular, all the above processes are integrated into a unified Bayesian inference view for obtaining both accurate and explainable recommendations. Furthermore, to handle employees with sparse or missing skill profiles, we develop an improved version of DCBVN, called the Demand-aware Collaborative Competency Attentive Network (DCCAN) framework, by considering the connectivity among employees. In DCCAN, we first build two employee competency graphs from learning and working aspects. Then, we design a graph-attentive network and a multi-head integration mechanism to infer an employee’s competency information from neighboring employees. Finally, we can generate explainable recommendation results based on the competency representations. Extensive experimental results on real-world data clearly demonstrate the effectiveness and the interpretability of both of our frameworks, as well as their robustness in sparse and cold-start scenarios.
{"title":"Personalized and Explainable Employee Training Course Recommendations: A Bayesian Variational Approach","authors":"Chao Wang, Hengshu Zhu, Peng Wang, Chen Zhu, Xi Zhang, Enhong Chen, Hui Xiong","doi":"10.1145/3490476","DOIUrl":"https://doi.org/10.1145/3490476","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"42 1","pages":"1 - 32"},"PeriodicalIF":0.0,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81014618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ji Fang, Shangsong Liang, Zaiqiao Meng, M. de Rijke
Network-based information has been widely explored and exploited in the information retrieval literature. Attributed networks, consisting of nodes, edges, and attributes describing properties of nodes, are a basic type of network-based data, and are especially useful for many applications. Examples include user profiling in social networks and item recommendation in user-item purchase networks. Learning useful and expressive representations of entities in attributed networks can provide more effective building blocks for downstream network-based tasks such as link prediction and attribute inference. In practice, input features of attributed networks are normalized as unit directional vectors. However, most network embedding techniques ignore the spherical nature of inputs and focus on learning representations in a Gaussian or Euclidean space, which, we hypothesize, might lead to less effective representations. To obtain more effective representations of attributed networks, we investigate the problem of mapping an attributed network with unit normalized directional features into a non-Gaussian and non-Euclidean space. Specifically, we propose a hyperspherical variational co-embedding for attributed networks (HCAN), which is based on generalized variational auto-encoders for heterogeneous data with multiple types of entities. HCAN jointly learns latent embeddings for both nodes and attributes in a unified hyperspherical space such that the affinities between nodes and attributes can be captured effectively. We argue that this is a crucial feature in many real-world applications of attributed networks. Previous Gaussian network embedding algorithms break the assumption of an uninformative prior, which leads to unstable results and poor performance. In contrast, HCAN embeds nodes and attributes as von Mises-Fisher distributions, and allows one to capture the uncertainty of the inferred representations. Experimental results on eight datasets show that HCAN yields better performance in a number of applications compared with nine state-of-the-art baselines.
{"title":"Hyperspherical Variational Co-embedding for Attributed Networks","authors":"Ji Fang, Shangsong Liang, Zaiqiao Meng, M. de Rijke","doi":"10.1145/3478284","DOIUrl":"https://doi.org/10.1145/3478284","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"101 1","pages":"1 - 36"},"PeriodicalIF":0.0,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88584297","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In Information Retrieval, numerous retrieval models or document ranking functions have been developed in the quest for better retrieval effectiveness. Apart from some formal retrieval models formulated on a theoretical basis, various recent works have applied heuristic constraints to guide the derivation of document ranking functions. While many recent methods are shown to improve over established and successful models, comparison among these new methods under a common environment is often missing. To address this issue, we perform an extensive and up-to-date comparison of leading term-independence retrieval models implemented in our own retrieval system. Our study focuses on the following questions: (RQ1) Is there a retrieval model that consistently outperforms all other models across multiple collections? (RQ2) What are the important features of an effective document ranking function? Our retrieval experiments, performed on several TREC test collections of a wide range of sizes (up to the terabyte-sized ClueWeb09 Category B), enable us to answer these research questions. This work also serves as a reproducibility study for leading retrieval models. While our experiments show that no single retrieval model outperforms all others across all tested collections, some recent retrieval models, such as MATF and MVD, consistently perform better than the common baselines.
{"title":"A Comparison between Term-Independence Retrieval Models for Ad Hoc Retrieval","authors":"E. K. F. Dang, R. Luk, James Allan","doi":"10.1145/3483612","DOIUrl":"https://doi.org/10.1145/3483612","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"77 1","pages":"1 - 37"},"PeriodicalIF":0.0,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80788567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chuxu Zhang, Julia Kiseleva, S. Jauhar, Ryen W. White
People rely on task management applications and digital assistants to capture and track their tasks, and to help with executing them. The burden of organizing and scheduling time for tasks continues to reside with users of these systems, despite the high cognitive load associated with these activities. Users stand to benefit greatly from a task management system capable of prioritizing their pending tasks, thus saving them time and effort. In this article, we make three main contributions. First, we propose the problem of task prioritization, formulating it as a ranking over a user’s pending tasks given a history of previous interactions with a task management system. Second, we perform an extensive analysis of the large-scale anonymized, de-identified logs of a popular task management application, deriving a dataset of grounded, real-world tasks from which to learn and evaluate our proposed system. We also identify patterns in how people record tasks as complete, which vary consistently with the nature of the task. Third, we propose a novel contextual deep learning solution capable of performing personalized task prioritization. In a battery of tests, we show that this approach outperforms several operational baselines and other sequential ranking models from previous work. Our findings have implications for understanding the ways people prioritize and manage tasks with digital tools, and for the design of support for users of task management applications.
{"title":"Grounded Task Prioritization with Context-Aware Sequential Ranking","authors":"Chuxu Zhang, Julia Kiseleva, S. Jauhar, Ryen W. White","doi":"10.1145/3486861","DOIUrl":"https://doi.org/10.1145/3486861","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"75 1","pages":"1 - 28"},"PeriodicalIF":0.0,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86428384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Session-based recommendation aims to generate recommendations based merely on the ongoing session, which is a challenging task. Previous methods mainly focus on modeling the sequential signals or the transition relations between items in the current session using RNNs or GNNs to identify the user’s intent for recommendation. Such models generally ignore the dynamic connections between the local and global item transition patterns, although the global information is taken into consideration by exploiting the global-level pair-wise item transitions. Moreover, existing methods that mainly adopt the cross-entropy loss with softmax generally face a serious over-fitting problem, harming the recommendation accuracy. Thus, in this article, we propose a Graph Co-Attentive Recommendation Machine (GCARM) for session-based recommendation. In detail, we first design a Graph Co-Attention Network (GCAT) to consider the dynamic correlations between the local and global neighbors of each node during the information propagation. Then, the item-level dynamic connections between the output of the local and global graphs are modeled to generate the final item representations. After that, we produce the prediction scores and design a Max Cross-Entropy (MCE) loss to prevent over-fitting. Extensive experiments are conducted on three benchmark datasets, i.e., Diginetica, Gowalla, and Yoochoose. The experimental results show that GCARM can achieve state-of-the-art performance in terms of Recall and MRR, especially in boosting the ranking of the target item.
{"title":"Graph Co-Attentive Session-based Recommendation","authors":"Zhiqiang Pan, Fei Cai, Wanyu Chen, Honghui Chen","doi":"10.1145/3486711","DOIUrl":"https://doi.org/10.1145/3486711","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"12 1","pages":"1 - 31"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89639125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Factorization models have been successfully applied to recommendation problems and have had a significant impact on both academia and industry in the field of Collaborative Filtering (CF). However, the intermediate data generated in factorization models’ decision-making process (or training process, footprint) have been overlooked even though they may provide rich information to further improve recommendations. In this article, we introduce the concept of Convergence Pattern, which records how ratings are learned step-by-step in factorization models in the field of CF. We show that the concept of Convergence Pattern exists in both the model perspective (e.g., classical Matrix Factorization (MF) and deep-learning factorization) and the training (learning) perspective (e.g., stochastic gradient descent (SGD), alternating least squares (ALS), and Markov Chain Monte Carlo (MCMC)). By utilizing the Convergence Pattern, we propose a prediction model to estimate the prediction reliability of missing ratings and then improve the quality of recommendations. Two applications have been investigated: (1) how to evaluate the reliability of predicted missing ratings and thus recommend those ratings with high reliability, and (2) how to exploit the estimated reliability to adjust the predicted ratings to further improve the prediction accuracy. Extensive experiments have been conducted on several benchmark datasets on three recommendation tasks: decision-aware recommendation, rating prediction, and Top-N recommendation. The experimental results have verified the effectiveness of the proposed methods in various aspects.
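To make the idea of a training footprint concrete, the sketch below trains a small matrix factorization model with SGD and records, after every epoch, the current prediction for one tracked user-item pair. This is only a guess at what recording how a rating is learned step by step could look like; the function name, the hyperparameters, and the toy data are hypothetical, and the article's actual definition of Convergence Pattern may differ.

```python
import random


def mf_sgd_with_footprint(ratings, n_users, n_items, track,
                          k=2, lr=0.05, reg=0.02, epochs=30, seed=0):
    """Train matrix factorization with SGD and log one pair's prediction per epoch.

    ratings: list of (user, item, rating) triples with 0-based indices.
    track:   (user, item) pair whose predicted rating is recorded each epoch.
    All names and defaults here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Small random initial latent factors for users (P) and items (Q).
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    footprint = []
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
        tu, ti = track
        # Record the tracked pair's prediction at the end of this epoch.
        footprint.append(sum(P[tu][f] * Q[ti][f] for f in range(k)))
    return footprint


if __name__ == "__main__":
    toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0)]
    # Track a missing rating: user 2 has not rated item 0.
    trace = mf_sgd_with_footprint(toy, n_users=3, n_items=3, track=(2, 0))
    print(trace[:3], trace[-3:])
```

A sequence such as `trace` could then serve as input to a separate reliability estimator, which is the role the abstract assigns to the proposed prediction model.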
{"title":"The Footprint of Factorization Models and Their Applications in Collaborative Filtering","authors":"Jinze Wang, Yongli Ren, Jie Li, Ke Deng","doi":"10.1145/3490475","DOIUrl":"https://doi.org/10.1145/3490475","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"33 1","pages":"1 - 32"},"PeriodicalIF":0.0,"publicationDate":"2021-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82436598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Peijie Sun, Le Wu, Kun Zhang, Yuxuan Su, Meng Wang
Review-based recommendation utilizes both users’ rating records and the associated reviews for recommendation. Recently, with the rapidly growing demand for explanations of recommendation results, reviews have been used to train encoder–decoder models for explanation text generation. As most reviews are general text without detailed evaluation, some researchers have leveraged auxiliary information about users or items to enrich the generated explanation text. Nevertheless, the auxiliary data is not available in most scenarios and may suffer from data privacy problems. In this article, we argue that reviews contain abundant semantic information expressing the users’ feelings about various aspects of items, while this information is not fully explored in the current explanation text generation task. To this end, we study how to generate more fine-grained explanation text in review-based recommendation without any auxiliary data. Though the idea is simple, it is non-trivial since the aspect is hidden and unlabeled. Besides, it is also very challenging to inject aspect information for generating explanation text with noisy review input. To solve these challenges, we first leverage an advanced unsupervised neural aspect extraction model to learn the aspect-aware representation of each review sentence. Thus, users and items can be represented in the aspect space based on their historical associated reviews. After that, we detail how to better predict ratings and generate explanation text with the user and item representations in the aspect space. We further dynamically assign larger weights to review sentences that contain a larger proportion of aspect words in order to control the text generation process, and jointly optimize rating prediction accuracy and explanation text generation quality with a multi-task learning framework. Finally, extensive experimental results on three real-world datasets demonstrate the superiority of our proposed model in both recommendation accuracy and explainability.
{"title":"An Unsupervised Aspect-Aware Recommendation Model with Explanation Text Generation","authors":"Peijie Sun, Le Wu, Kun Zhang, Yuxuan Su, Meng Wang","doi":"10.1145/3483611","DOIUrl":"https://doi.org/10.1145/3483611","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"218 1","pages":"1 - 29"},"PeriodicalIF":0.0,"publicationDate":"2021-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85540746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Label Propagation Algorithm (LPA) and Graph Convolutional Neural Networks (GCN) are both message-passing algorithms on graphs. Both solve the task of node classification, but LPA propagates node label information across the edges of the graph, while GCN propagates and transforms node feature information. However, although the two are conceptually similar, the theoretical relationship between LPA and GCN has not yet been systematically investigated. Moreover, it is unclear how LPA and GCN can be combined under a unified framework to improve performance. Here we study the relationship between LPA and GCN in terms of feature/label influence, in which we characterize how much the initial feature/label of one node influences the final feature/label of another node in GCN/LPA. Based on our theoretical analysis, we propose an end-to-end model that combines GCN and LPA. In our unified model, edge weights are learnable, and the LPA serves as regularization to assist the GCN in learning proper edge weights that lead to improved performance. Our model can also be seen as learning the weights of edges based on node labels, which is more direct and efficient than existing feature-based attention models or topology-based diffusion models. In a number of experiments for semi-supervised node classification and knowledge-graph-aware recommendation, our model shows superiority over state-of-the-art baselines.
{"title":"Combining Graph Convolutional Neural Networks and Label Propagation","authors":"Hongwei Wang, J. Leskovec","doi":"10.1145/3490478","DOIUrl":"https://doi.org/10.1145/3490478","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"13 1","pages":"1 - 27"},"PeriodicalIF":0.0,"publicationDate":"2021-11-29","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"75803302","RegionNum":0,"Total":0}
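The label-propagation half of the GCN/LPA abstract above can be illustrated with a minimal sketch. It assumes the common clamped variant of LPA (row-normalize the adjacency matrix, average neighbor labels, then reset the known labels after each step), which is not necessarily the exact formulation used by the authors. The function name `label_propagation` and its parameters are hypothetical, and the learnable edge weights and the GCN component of the proposed model are not covered.

```python
import numpy as np


def label_propagation(adj, labels, labeled_mask, num_iters=50):
    """Clamped label propagation (illustrative sketch, not the paper's code).

    adj:          (n, n) array of nonnegative edge weights.
    labels:       (n, c) array, one-hot rows for labeled nodes, zeros elsewhere.
    labeled_mask: (n,) boolean array, True where the label is known.
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0.0] = 1.0          # isolated nodes: avoid division by zero
    transition = adj / deg         # row-normalized weights, i.e. D^-1 A

    y0 = np.asarray(labels, dtype=float)
    y = y0.copy()
    for _ in range(num_iters):
        y = transition @ y                  # each node averages its neighbors' labels
        y[labeled_mask] = y0[labeled_mask]  # clamp the known labels
    return y


# Path graph 0 - 1 - 2 - 3 with node 0 in class 0 and node 3 in class 1.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
labels = np.array([[1, 0], [0, 0], [0, 0], [0, 1]])
labeled_mask = np.array([True, False, False, True])

scores = label_propagation(adj, labels, labeled_mask)
predicted = scores.argmax(axis=1)
```

On this toy graph each unlabeled node ends up with the class of its nearer labeled endpoint. In the unified model described in the abstract, the entries of `adj` would be learned rather than fixed.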
POI recommendation has become an essential means to help people discover attractive places. Intuitively, activities have an important impact on users’ decision-making, because users select POIs to attend corresponding activities. However, many existing studies ignore the social motivation of user behaviors and regard all check-ins as influenced only by individual user interests. As a result, they cannot model user preferences accurately, which degrades recommendation effectiveness. This article takes the perspective of activities and proposes a probabilistic generative model called STARec. Specifically, based on the social effect of activities, STARec defines users’ social preferences as distinct from their individual interests and combines these with individual user activity interests to effectively depict user preferences. Moreover, the inconsistency between users’ social preferences and their decisions is modeled. An activity frequency feature is introduced to acquire accurate user social preferences because of the close correlation between these and the key impact factor of corresponding check-ins. An alias sampling-based training method is used to accelerate training. Extensive experiments were conducted on two real-world datasets. Experimental results demonstrate that the proposed STARec model achieves superior performance in terms of recommendation accuracy, robustness to data sparsity, effectiveness in handling cold-start problems, efficiency, and interpretability.
{"title":"STARec: Adaptive Learning with Spatiotemporal and Activity Influence for POI Recommendation","authors":"Weiyun Ji, Xiang-wu Meng, Yujie Zhang","doi":"10.1145/3485631","DOIUrl":"https://doi.org/10.1145/3485631","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"66 1","pages":"1 - 40"},"PeriodicalIF":0.0,"publicationDate":"2021-11-29","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"89324699","RegionNum":0,"Total":0}
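The STARec abstract above mentions an alias sampling-based training method without giving details. As background only, here is a minimal sketch of the general alias method for drawing from a fixed discrete distribution in constant time per draw after linear-time setup. How STARec applies it during training is not stated in the abstract, so this is an assumption-free illustration of the generic technique, not the authors' procedure. The names `build_alias_table` and `alias_draw` are hypothetical.

```python
import random


def build_alias_table(probs):
    """Build alias-method tables for a discrete distribution (probs sums to 1)."""
    n = len(probs)
    scaled = [p * n for p in probs]   # rescale so the average bucket mass is 1
    prob = [1.0] * n                  # chance of keeping bucket i's own outcome
    alias = list(range(n))            # fallback outcome for bucket i
    small = [i for i, s in enumerate(scaled) if s < 1.0]
    large = [i for i, s in enumerate(scaled) if s >= 1.0]

    while small and large:
        s = small.pop()
        l = large.pop()
        prob[s] = scaled[s]           # bucket s keeps its own mass ...
        alias[s] = l                  # ... and is topped up from outcome l
        scaled[l] = scaled[l] + scaled[s] - 1.0
        (small if scaled[l] < 1.0 else large).append(l)

    # Leftover buckets keep prob 1.0 and alias themselves (the initial values).
    return prob, alias


def alias_draw(prob, alias, rng=random):
    """Draw one outcome: pick a bucket uniformly, then keep it or take its alias."""
    i = rng.randrange(len(prob))
    return i if rng.random() < prob[i] else alias[i]


prob, alias = build_alias_table([0.5, 0.3, 0.2])
rng = random.Random(0)
samples = [alias_draw(prob, alias, rng) for _ in range(1000)]
```

The benefit in a sampling-heavy training loop is that each draw costs two random numbers and one table lookup, regardless of the number of outcomes.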