Zhongwei Xie, Ling Liu, Yanzhao Wu, Luo Zhong, Lin Li
This article introduces a two-phase deep feature engineering framework for efficient learning of semantics enhanced joint embedding, which clearly separates the deep feature engineering in data preprocessing from training the text-image joint embedding model. We use the Recipe1M dataset for the technical description and empirical validation. In preprocessing, we perform deep feature engineering by combining deep feature engineering with semantic context features derived from raw text-image input data. We leverage LSTM to identify key terms, deep NLP models from the BERT family, TextRank, or TF-IDF to produce ranking scores for key terms before generating the vector representation for each key term by using Word2vec. We leverage Wide ResNet50 and Word2vec to extract and encode the image category semantics of food images to help semantic alignment of the learned recipe and image embeddings in the joint latent space. In joint embedding learning, we perform deep feature engineering by optimizing the batch-hard triplet loss function with soft-margin and double negative sampling, taking into account also the category-based alignment loss and discriminator-based alignment loss. Extensive experiments demonstrate that our SEJE approach with deep feature engineering significantly outperforms the state-of-the-art approaches.
{"title":"Learning Text-image Joint Embedding for Efficient Cross-modal Retrieval with Deep Feature Engineering","authors":"Zhongwei Xie, Ling Liu, Yanzhao Wu, Luo Zhong, Lin Li","doi":"10.1145/3490519","DOIUrl":"https://doi.org/10.1145/3490519","url":null,"abstract":"This article introduces a two-phase deep feature engineering framework for efficient learning of semantics enhanced joint embedding, which clearly separates the deep feature engineering in data preprocessing from training the text-image joint embedding model. We use the Recipe1M dataset for the technical description and empirical validation. In preprocessing, we perform deep feature engineering by combining deep feature engineering with semantic context features derived from raw text-image input data. We leverage LSTM to identify key terms, deep NLP models from the BERT family, TextRank, or TF-IDF to produce ranking scores for key terms before generating the vector representation for each key term by using Word2vec. We leverage Wide ResNet50 and Word2vec to extract and encode the image category semantics of food images to help semantic alignment of the learned recipe and image embeddings in the joint latent space. In joint embedding learning, we perform deep feature engineering by optimizing the batch-hard triplet loss function with soft-margin and double negative sampling, taking into account also the category-based alignment loss and discriminator-based alignment loss. Extensive experiments demonstrate that our SEJE approach with deep feature engineering significantly outperforms the state-of-the-art approaches.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"25 1","pages":"1 - 27"},"PeriodicalIF":0.0,"publicationDate":"2021-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87256841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The cold-start recommendation is an urgent problem in contemporary online applications. It aims to provide users whose behaviors are literally sparse with as accurate recommendations as possible. Many data-driven algorithms, such as the widely used matrix factorization, underperform because of data sparseness. This work adopts the idea of meta-learning to solve the user’s cold-start recommendation problem. We propose a meta-learning-based cold-start sequential recommendation framework called metaCSR, including three main components: Diffusion Representer for learning better user/item embedding through information diffusion on the interaction graph; Sequential Recommender for capturing temporal dependencies of behavior sequences; and Meta Learner for extracting and propagating transferable knowledge of prior users and learning a good initialization for new users. metaCSR holds the ability to learn the common patterns from regular users’ behaviors and optimize the initialization so that the model can quickly adapt to new users after one or a few gradient updates to achieve optimal performance. The extensive quantitative experiments on three widely used datasets show the remarkable performance of metaCSR in dealing with the user cold-start problem. Meanwhile, a series of qualitative analysis demonstrates that the proposed metaCSR has good generalization.
{"title":"Learning to Learn a Cold-start Sequential Recommender","authors":"Xiaowen Huang, J. Sang, Jian Yu, Changsheng Xu","doi":"10.1145/3466753","DOIUrl":"https://doi.org/10.1145/3466753","url":null,"abstract":"The cold-start recommendation is an urgent problem in contemporary online applications. It aims to provide users whose behaviors are literally sparse with as accurate recommendations as possible. Many data-driven algorithms, such as the widely used matrix factorization, underperform because of data sparseness. This work adopts the idea of meta-learning to solve the user’s cold-start recommendation problem. We propose a meta-learning-based cold-start sequential recommendation framework called metaCSR, including three main components: Diffusion Representer for learning better user/item embedding through information diffusion on the interaction graph; Sequential Recommender for capturing temporal dependencies of behavior sequences; and Meta Learner for extracting and propagating transferable knowledge of prior users and learning a good initialization for new users. metaCSR holds the ability to learn the common patterns from regular users’ behaviors and optimize the initialization so that the model can quickly adapt to new users after one or a few gradient updates to achieve optimal performance. The extensive quantitative experiments on three widely used datasets show the remarkable performance of metaCSR in dealing with the user cold-start problem. Meanwhile, a series of qualitative analysis demonstrates that the proposed metaCSR has good generalization.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"102 1","pages":"1 - 25"},"PeriodicalIF":0.0,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78165036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When conducting a search task, users may find it difficult to articulate their need, even more so when the task is complex. To help them complete their search, search engine usually provide query suggestions. A good query suggestion system requires to model user behavior during the search session. In this article, we study multiple Transformer architectures applied to the query suggestion task and compare them with recurrent neural network (RNN)-based models. We experiment Transformer models with different tokenizers, with different Encoders (large pretrained models or fully trained ones), and with two kinds of architectures (flat or hierarchic). We study the performance and the behaviors of these various models, and observe that Transformer-based models outperform RNN-based ones. We show that while the hierarchical architectures exhibit very good performances for query suggestion, the flat models are more suitable for complex and long search tasks. Finally, we investigate the flat models behavior and demonstrate that they indeed learn to recover the hierarchy of a search session.
{"title":"On the Study of Transformers for Query Suggestion","authors":"Agnès Mustar, S. Lamprier, Benjamin Piwowarski","doi":"10.1145/3470562","DOIUrl":"https://doi.org/10.1145/3470562","url":null,"abstract":"When conducting a search task, users may find it difficult to articulate their need, even more so when the task is complex. To help them complete their search, search engine usually provide query suggestions. A good query suggestion system requires to model user behavior during the search session. In this article, we study multiple Transformer architectures applied to the query suggestion task and compare them with recurrent neural network (RNN)-based models. We experiment Transformer models with different tokenizers, with different Encoders (large pretrained models or fully trained ones), and with two kinds of architectures (flat or hierarchic). We study the performance and the behaviors of these various models, and observe that Transformer-based models outperform RNN-based ones. We show that while the hierarchical architectures exhibit very good performances for query suggestion, the flat models are more suitable for complex and long search tasks. Finally, we investigate the flat models behavior and demonstrate that they indeed learn to recover the hierarchy of a search session.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"10 1","pages":"1 - 27"},"PeriodicalIF":0.0,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78954365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Social recommendation has achieved great success in many domains including e-commerce and location-based social networks. Existing methods usually explore the user-item interactions or user-user connections to predict users’ preference behaviors. However, they usually learn both user and item representations in Euclidean space, which has large limitations for exploring the latent hierarchical property in the data. In this article, we study a novel problem of hyperbolic social recommendation, where we aim to learn the compact but strong representations for both users and items. Meanwhile, this work also addresses two critical domain-issues, which are under-explored. First, users often make trade-offs with multiple underlying aspect factors to make decisions during their interactions with items. Second, users generally build connections with others in terms of different aspects, which produces different influences with aspects in social network. To this end, we propose a novel graph neural network (GNN) framework with multiple aspect learning, namely, HyperSoRec. Specifically, we first embed all users, items, and aspects into hyperbolic space with superior representations to ensure their hierarchical properties. Then, we adapt a GNN with novel multi-aspect message-passing-receiving mechanism to capture different influences among users. Next, to characterize the multi-aspect interactions of users on items, we propose an adaptive hyperbolic metric learning method by introducing learnable interactive relations among different aspects. Finally, we utilize the hyperbolic translational distance to measure the plausibility in each user-item pair for recommendation. Experimental results on two public datasets clearly demonstrate that our HyperSoRec not only achieves significant improvement for recommendation performance but also shows better representation ability in hyperbolic space with strong robustness and reliability.
{"title":"HyperSoRec: Exploiting Hyperbolic User and Item Representations with Multiple Aspects for Social-aware Recommendation","authors":"Hao Wang, Defu Lian, Hanghang Tong, Qi Liu, Zhenya Huang, Enhong Chen","doi":"10.1145/3463913","DOIUrl":"https://doi.org/10.1145/3463913","url":null,"abstract":"Social recommendation has achieved great success in many domains including e-commerce and location-based social networks. Existing methods usually explore the user-item interactions or user-user connections to predict users’ preference behaviors. However, they usually learn both user and item representations in Euclidean space, which has large limitations for exploring the latent hierarchical property in the data. In this article, we study a novel problem of hyperbolic social recommendation, where we aim to learn the compact but strong representations for both users and items. Meanwhile, this work also addresses two critical domain-issues, which are under-explored. First, users often make trade-offs with multiple underlying aspect factors to make decisions during their interactions with items. Second, users generally build connections with others in terms of different aspects, which produces different influences with aspects in social network. To this end, we propose a novel graph neural network (GNN) framework with multiple aspect learning, namely, HyperSoRec. Specifically, we first embed all users, items, and aspects into hyperbolic space with superior representations to ensure their hierarchical properties. Then, we adapt a GNN with novel multi-aspect message-passing-receiving mechanism to capture different influences among users. Next, to characterize the multi-aspect interactions of users on items, we propose an adaptive hyperbolic metric learning method by introducing learnable interactive relations among different aspects. Finally, we utilize the hyperbolic translational distance to measure the plausibility in each user-item pair for recommendation. Experimental results on two public datasets clearly demonstrate that our HyperSoRec not only achieves significant improvement for recommendation performance but also shows better representation ability in hyperbolic space with strong robustness and reliability.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"466 1","pages":"1 - 28"},"PeriodicalIF":0.0,"publicationDate":"2021-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86715419","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conversational systems now attract great attention due to their promising potential and commercial values. To build a conversational system with moderate intelligence is challenging and requires big (conversational) data, as well as interdisciplinary techniques. Thanks to the prosperity of the Web, the massive data available greatly facilitate data-driven methods such as deep learning for human-computer conversational systems. In general, retrieval-based conversational systems apply various matching schema between query utterances and responses, but the classic retrieval paradigm suffers from prominent weakness for conversations: the system finds similar responses given a particular query. For real human-to-human conversations, on the contrary, responses can be greatly different yet all are possibly appropriate. The observation reveals the diversity phenomenon in conversations. In this article, we ascribe the lack of conversational diversity to the reason that the query utterances are statically modeled regardless of candidate responses through traditional methods. To this end, we propose a dynamic representation learning strategy that models the query utterances and different response candidates in an interactive way. To be more specific, we propose a Respond-with-Diversity model augmented by the memory module interacting with both the query utterances and multiple candidate responses. Hence, we obtain dynamic representations for the input queries conditioned on different response candidates. We frame the model as an end-to-end learnable neural network. In the experiments, we demonstrate the effectiveness of the proposed model by achieving a good appropriateness score and much better diversity in retrieval-based conversations between humans and computers.
{"title":"Multi-Response Awareness for Retrieval-Based Conversations: Respond with Diversity via Dynamic Representation Learning","authors":"Rui Yan, Weiheng Liao, Dongyan Zhao, Ji-rong Wen","doi":"10.1145/3470450","DOIUrl":"https://doi.org/10.1145/3470450","url":null,"abstract":"Conversational systems now attract great attention due to their promising potential and commercial values. To build a conversational system with moderate intelligence is challenging and requires big (conversational) data, as well as interdisciplinary techniques. Thanks to the prosperity of the Web, the massive data available greatly facilitate data-driven methods such as deep learning for human-computer conversational systems. In general, retrieval-based conversational systems apply various matching schema between query utterances and responses, but the classic retrieval paradigm suffers from prominent weakness for conversations: the system finds similar responses given a particular query. For real human-to-human conversations, on the contrary, responses can be greatly different yet all are possibly appropriate. The observation reveals the diversity phenomenon in conversations. In this article, we ascribe the lack of conversational diversity to the reason that the query utterances are statically modeled regardless of candidate responses through traditional methods. To this end, we propose a dynamic representation learning strategy that models the query utterances and different response candidates in an interactive way. To be more specific, we propose a Respond-with-Diversity model augmented by the memory module interacting with both the query utterances and multiple candidate responses. Hence, we obtain dynamic representations for the input queries conditioned on different response candidates. We frame the model as an end-to-end learnable neural network. In the experiments, we demonstrate the effectiveness of the proposed model by achieving a good appropriateness score and much better diversity in retrieval-based conversations between humans and computers.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"26 1","pages":"1 - 29"},"PeriodicalIF":0.0,"publicationDate":"2021-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73941464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. In recent years, there has been renewed interest in the notion of “query variations” which are essentially multiple user formulations for an information need. Like many retrieval models, some queries are highly effective while others are not. This is often an artifact of the collection being searched which might be more or less sensitive to word choice. Users rarely have perfect knowledge about the underlying collection, and so finding queries that work is often a trial-and-error process. In this work, we explore the fundamental problem of system interaction effects between collections, ranking models, and queries. To answer this important question, we formalize the analysis using ANalysis Of VAriance (ANOVA) models to measure multiple components effects across collections and topics by nesting multiple query variations within each topic. Our findings show that query formulations have a comparable effect size of the topic factor itself, which is known to be the factor with the greatest effect size in prior ANOVA studies. Both topic and formulation have a substantially larger effect size than any other factor, including the ranking algorithms and, surprisingly, even query expansion. This finding reinforces the importance of further research in understanding the role of query rewriting in IR related tasks.
{"title":"Topic Difficulty: Collection and Query Formulation Effects","authors":"J. Culpepper, G. Faggioli, N. Ferro, Oren Kurland","doi":"10.1145/3470563","DOIUrl":"https://doi.org/10.1145/3470563","url":null,"abstract":"Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. In recent years, there has been renewed interest in the notion of “query variations” which are essentially multiple user formulations for an information need. Like many retrieval models, some queries are highly effective while others are not. This is often an artifact of the collection being searched which might be more or less sensitive to word choice. Users rarely have perfect knowledge about the underlying collection, and so finding queries that work is often a trial-and-error process. In this work, we explore the fundamental problem of system interaction effects between collections, ranking models, and queries. To answer this important question, we formalize the analysis using ANalysis Of VAriance (ANOVA) models to measure multiple components effects across collections and topics by nesting multiple query variations within each topic. Our findings show that query formulations have a comparable effect size of the topic factor itself, which is known to be the factor with the greatest effect size in prior ANOVA studies. Both topic and formulation have a substantially larger effect size than any other factor, including the ranking algorithms and, surprisingly, even query expansion. This finding reinforces the importance of further research in understanding the role of query rewriting in IR related tasks.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"434 1","pages":"1 - 36"},"PeriodicalIF":0.0,"publicationDate":"2021-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76655865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The job interview is considered as one of the most essential tasks in talent recruitment, which forms a bridge between candidates and employers in fitting the right person for the right job. While substantial efforts have been made on improving the job interview process, it is inevitable to have biased or inconsistent interview assessment due to the subjective nature of the traditional interview process. To this end, in this article, we propose three novel approaches to intelligent job interview by learning the large-scale real-world interview data. Specifically, we first develop a preliminary model, named Joint Learning Model on Interview Assessment (JLMIA), to mine the relationship among job description, candidate resume, and interview assessment. Then, we further design an enhanced model, named Neural-JLMIA, to improve the representative capability by applying neural variance inference. Last, we propose to refine JLMIA with Refined-JLMIA (R-JLMIA) by modeling individual characteristics for each collection, i.e., disentangling the core competences from resume and capturing the evolution of the semantic topics over different interview rounds. As a result, our approaches can effectively learn the representative perspectives of different job interview processes from the successful job interview records in history. In addition, we exploit our approaches for two real-world applications, i.e., person-job fit and skill recommendation for interview assessment. Extensive experiments conducted on real-world data clearly validate the effectiveness of our models, which can lead to substantially less bias in job interviews and provide an interpretable understanding of job interview assessment.
{"title":"Joint Representation Learning with Relation-Enhanced Topic Models for Intelligent Job Interview Assessment","authors":"Dazhong Shen, Chuan Qin, Hengshu Zhu, Tong Xu, Enhong Chen, Hui Xiong","doi":"10.1145/3469654","DOIUrl":"https://doi.org/10.1145/3469654","url":null,"abstract":"The job interview is considered as one of the most essential tasks in talent recruitment, which forms a bridge between candidates and employers in fitting the right person for the right job. While substantial efforts have been made on improving the job interview process, it is inevitable to have biased or inconsistent interview assessment due to the subjective nature of the traditional interview process. To this end, in this article, we propose three novel approaches to intelligent job interview by learning the large-scale real-world interview data. Specifically, we first develop a preliminary model, named Joint Learning Model on Interview Assessment (JLMIA), to mine the relationship among job description, candidate resume, and interview assessment. Then, we further design an enhanced model, named Neural-JLMIA, to improve the representative capability by applying neural variance inference. Last, we propose to refine JLMIA with Refined-JLMIA (R-JLMIA) by modeling individual characteristics for each collection, i.e., disentangling the core competences from resume and capturing the evolution of the semantic topics over different interview rounds. As a result, our approaches can effectively learn the representative perspectives of different job interview processes from the successful job interview records in history. In addition, we exploit our approaches for two real-world applications, i.e., person-job fit and skill recommendation for interview assessment. Extensive experiments conducted on real-world data clearly validate the effectiveness of our models, which can lead to substantially less bias in job interviews and provide an interpretable understanding of job interview assessment.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"96 1","pages":"1 - 36"},"PeriodicalIF":0.0,"publicationDate":"2021-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80116646","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Traditionally, capsule wardrobes are manually designed by expert fashionistas through their creativity and technical prowess. The goal is to curate minimal fashion items that can be assembled into several compatible and versatile outfits. It is usually a cost and time intensive process, and hence lacks scalability. Although there are a few approaches that attempt to automate the process, they tend to ignore the price of items or shopping budget. In this article, we formulate this task as a multi-objective budget constrained capsule wardrobe recommendation (MOBCCWR) problem. It is modeled as a bipartite graph having two disjoint vertex sets corresponding to top-wear and bottom-wear items, respectively. An edge represents compatibility between the corresponding item pairs. The objective is to find a 1-neighbor subset of fashion items as a capsule wardrobe that jointly maximize compatibility and versatility scores by considering corresponding user-specified preference weight coefficients and an overall shopping budget as a means of achieving personalization. We study the complexity class of MOBCCWR, show that it is NP-Complete, and propose a greedy algorithm for finding a near-optimal solution in real time. We also analyze the time complexity and approximation bound for our algorithm. Experimental results show the effectiveness of the proposed approach on both real and synthetic datasets.
{"title":"A Graph Theoretic Approach for Multi-Objective Budget Constrained Capsule Wardrobe Recommendation","authors":"Shubham Patil, Debopriyo Banerjee, S. Sural","doi":"10.1145/3457182","DOIUrl":"https://doi.org/10.1145/3457182","url":null,"abstract":"Traditionally, capsule wardrobes are manually designed by expert fashionistas through their creativity and technical prowess. The goal is to curate minimal fashion items that can be assembled into several compatible and versatile outfits. It is usually a cost and time intensive process, and hence lacks scalability. Although there are a few approaches that attempt to automate the process, they tend to ignore the price of items or shopping budget. In this article, we formulate this task as a multi-objective budget constrained capsule wardrobe recommendation (MOBCCWR) problem. It is modeled as a bipartite graph having two disjoint vertex sets corresponding to top-wear and bottom-wear items, respectively. An edge represents compatibility between the corresponding item pairs. The objective is to find a 1-neighbor subset of fashion items as a capsule wardrobe that jointly maximize compatibility and versatility scores by considering corresponding user-specified preference weight coefficients and an overall shopping budget as a means of achieving personalization. We study the complexity class of MOBCCWR, show that it is NP-Complete, and propose a greedy algorithm for finding a near-optimal solution in real time. We also analyze the time complexity and approximation bound for our algorithm. Experimental results show the effectiveness of the proposed approach on both real and synthetic datasets.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"1 1","pages":"1 - 33"},"PeriodicalIF":0.0,"publicationDate":"2021-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89693934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As the deep learning techniques have expanded to real-world recommendation tasks, many deep neural network based Collaborative Filtering (CF) models have been developed to project user-item interactions into latent feature space, based on various neural architectures, such as multi-layer perceptron, autoencoder, and graph neural networks. However, the majority of existing collaborative filtering systems are not well designed to handle missing data. Particularly, in order to inject the negative signals in the training phase, these solutions largely rely on negative sampling from unobserved user-item interactions and simply treating them as negative instances, which brings the recommendation performance degradation. To address the issues, we develop a Collaborative Reflection-Augmented Autoencoder Network (CRANet), that is capable of exploring transferable knowledge from observed and unobserved user-item interactions. The network architecture of CRANet is formed of an integrative structure with a reflective receptor network and an information fusion autoencoder module, which endows our recommendation framework with the ability of encoding implicit user’s pairwise preference on both interacted and non-interacted items. Additionally, a parametric regularization-based tied-weight scheme is designed to perform robust joint training of the two-stage CRANetmodel. We finally experimentally validate CRANeton four diverse benchmark datasets corresponding to two recommendation tasks, to show that debiasing the negative signals of user-item interactions improves the performance as compared to various state-of-the-art recommendation techniques. Our source code is available at https://github.com/akaxlh/CRANet.
{"title":"Collaborative Reflection-Augmented Autoencoder Network for Recommender Systems","authors":"Lianghao Xia, Chao Huang, Yong Xu, Huance Xu, Xiang Li, Weiguo Zhang","doi":"10.1145/3467023","DOIUrl":"https://doi.org/10.1145/3467023","url":null,"abstract":"As the deep learning techniques have expanded to real-world recommendation tasks, many deep neural network based Collaborative Filtering (CF) models have been developed to project user-item interactions into latent feature space, based on various neural architectures, such as multi-layer perceptron, autoencoder, and graph neural networks. However, the majority of existing collaborative filtering systems are not well designed to handle missing data. Particularly, in order to inject the negative signals in the training phase, these solutions largely rely on negative sampling from unobserved user-item interactions and simply treating them as negative instances, which brings the recommendation performance degradation. To address the issues, we develop a Collaborative Reflection-Augmented Autoencoder Network (CRANet), that is capable of exploring transferable knowledge from observed and unobserved user-item interactions. The network architecture of CRANet is formed of an integrative structure with a reflective receptor network and an information fusion autoencoder module, which endows our recommendation framework with the ability of encoding implicit user’s pairwise preference on both interacted and non-interacted items. Additionally, a parametric regularization-based tied-weight scheme is designed to perform robust joint training of the two-stage CRANetmodel. We finally experimentally validate CRANeton four diverse benchmark datasets corresponding to two recommendation tasks, to show that debiasing the negative signals of user-item interactions improves the performance as compared to various state-of-the-art recommendation techniques. Our source code is available at https://github.com/akaxlh/CRANet.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"71 1","pages":"1 - 22"},"PeriodicalIF":0.0,"publicationDate":"2021-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83987575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiao Zhang, Meng Liu, Jianhua Yin, Z. Ren, Liqiang Nie
With the increasing prevalence of portable devices and the popularity of community Question Answering (cQA) sites, users can seamlessly post and answer many questions. To effectively organize the information for precise recommendation and easy searching, these platforms require users to select topics for their raised questions. However, due to the limited experience, certain users fail to select appropriate topics for their questions. Thereby, automatic question tagging becomes an urgent and vital problem for the cQA sites, yet it is non-trivial due to the following challenges. On the one hand, vast and meaningful topics are available yet not utilized in the cQA sites; how to model and tag them to relevant questions is a highly challenging problem. On the other hand, related topics in the cQA sites may be organized into a directed acyclic graph. In light of this, how to exploit relations among topics to enhance their representations is critical. To settle these challenges, we devise a graph-guided topic ranking model to tag questions in the cQA sites appropriately. In particular, we first design a topic information fusion module to learn the topic representation by jointly considering the name and description of the topic. Afterwards, regarding the special structure of topics, we propose an information propagation module to enhance the topic representation. As the comprehension of questions plays a vital role in question tagging, we design a multi-level context-modeling-based question encoder to obtain the enhanced question representation. Moreover, we introduce an interaction module to extract topic-aware question information and capture the interactive information between questions and topics. Finally, we utilize the interactive information to estimate the ranking scores for topics. Extensive experiments on three Chinese cQA datasets have demonstrated that our proposed model outperforms several state-of-the-art competitors.
{"title":"Question Tagging via Graph-guided Ranking","authors":"Xiao Zhang, Meng Liu, Jianhua Yin, Z. Ren, Liqiang Nie","doi":"10.1145/3468270","DOIUrl":"https://doi.org/10.1145/3468270","url":null,"abstract":"With the increasing prevalence of portable devices and the popularity of community Question Answering (cQA) sites, users can seamlessly post and answer many questions. To effectively organize the information for precise recommendation and easy searching, these platforms require users to select topics for their raised questions. However, due to the limited experience, certain users fail to select appropriate topics for their questions. Thereby, automatic question tagging becomes an urgent and vital problem for the cQA sites, yet it is non-trivial due to the following challenges. On the one hand, vast and meaningful topics are available yet not utilized in the cQA sites; how to model and tag them to relevant questions is a highly challenging problem. On the other hand, related topics in the cQA sites may be organized into a directed acyclic graph. In light of this, how to exploit relations among topics to enhance their representations is critical. To settle these challenges, we devise a graph-guided topic ranking model to tag questions in the cQA sites appropriately. In particular, we first design a topic information fusion module to learn the topic representation by jointly considering the name and description of the topic. Afterwards, regarding the special structure of topics, we propose an information propagation module to enhance the topic representation. As the comprehension of questions plays a vital role in question tagging, we design a multi-level context-modeling-based question encoder to obtain the enhanced question representation. Moreover, we introduce an interaction module to extract topic-aware question information and capture the interactive information between questions and topics. Finally, we utilize the interactive information to estimate the ranking scores for topics. Extensive experiments on three Chinese cQA datasets have demonstrated that our proposed model outperforms several state-of-the-art competitors.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"22 1","pages":"1 - 23"},"PeriodicalIF":0.0,"publicationDate":"2021-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80097198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}