T. Vuong, S. Andolina, Giulio Jacucci, Tuukka Ruotsalo
Web searches often originate from conversations in which people engage before they perform a search. Therefore, conversations can be a valuable source of context with which to support the search process. We investigate whether spoken input from conversations can be used as a context to improve query auto-completion. We model the temporal dynamics of the spoken conversational context preceding queries and use these models to re-rank the query auto-completion suggestions. Data were collected from a controlled experiment and comprised conversations among 12 participant pairs conversing about movies or traveling. Search query logs during the conversations were recorded and temporally associated with the conversations. We compared the effects of spoken conversational input in four conditions: a control condition without contextualization; an experimental condition with the model using search query logs; an experimental condition with the model using spoken conversational input; and an experimental condition with the model using both search query logs and spoken conversational input. We show the advantage of combining the spoken conversational context with the Web-search context for improved retrieval performance. Our results suggest that spoken conversations provide a rich context for supporting information searches beyond current user-modeling approaches.
Spoken Conversational Context Improves Query Auto-completion in Web Search. ACM Transactions on Information Systems (TOIS), pp. 1-32. Published 2021-05-06. https://doi.org/10.1145/3447875
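As a rough illustration of re-ranking auto-completion suggestions with temporally weighted conversational context, the sketch below weights each spoken term by an exponential decay of its age and orders candidate completions by the summed weight of their terms. The decay form, the `half_life` parameter, and the names `context_weights` and `rerank` are assumptions made for this example; the abstract does not specify the authors' model.

```python
from collections import Counter


def context_weights(utterances, query_time, half_life=60.0):
    """Weight each spoken term by recency (hypothetical decay model).

    utterances: list of (timestamp_seconds, text) pairs.
    A term's weight halves every `half_life` seconds before the query.
    """
    weights = Counter()
    for timestamp, text in utterances:
        age = query_time - timestamp
        if age < 0:
            continue  # ignore speech that happened after the query
        decay = 0.5 ** (age / half_life)
        for term in text.lower().split():
            weights[term] += decay
    return weights


def rerank(suggestions, weights):
    """Order suggestions by the summed context weight of their terms.

    sorted() is stable, so suggestions with equal scores keep the
    order given by the underlying auto-completion service.
    """
    def score(suggestion):
        return sum(weights.get(term, 0.0) for term in suggestion.lower().split())

    return sorted(suggestions, key=score, reverse=True)


# Illustrative data only; not taken from the study.
utterances = [
    (0.0, "we could watch a movie tonight"),
    (50.0, "maybe something by nolan"),
]
suggestions = ["inception cast", "nolan movie list", "movie times"]
ranked = rerank(suggestions, context_weights(utterances, query_time=60.0))
```

Here the more recent mention of "nolan" outweighs the older mention of "movie", so the suggestion containing both terms moves to the top while the suggestion with no overlap moves to the bottom.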
Conversational search systems, such as Google Assistant and Microsoft Cortana, enable users to interact with search systems in multiple rounds through natural language dialogues. Evaluating such systems is very challenging, given that any natural language response could be generated, and users commonly interact for multiple semantically coherent rounds to accomplish a search task. Although prior studies proposed many evaluation metrics, the extent to which those measures capture user preference remains to be investigated. In this article, we systematically meta-evaluate a variety of conversational search metrics. We specifically study three perspectives on those metrics: (1) reliability: the ability to detect “actual” performance differences as opposed to those observed by chance; (2) fidelity: the ability to agree with ultimate user preference; and (3) intuitiveness: the ability to capture any property deemed important, namely adequacy, informativeness, and fluency in the context of conversational search. By conducting experiments on two test collections, we find that the performance of different metrics varies significantly across scenarios and that, consistent with prior studies, existing metrics achieve only weak correlation with ultimate user preference and satisfaction. METEOR is, comparatively speaking, the best existing single-turn metric considering all three perspectives. We also demonstrate that adapted session-based evaluation metrics can be used to measure multi-turn conversational search, achieving moderate concordance with user satisfaction. To our knowledge, our work establishes the most comprehensive meta-evaluation for conversational search to date.
Meta-evaluation of Conversational Search Evaluation Metrics. Zeyang Liu, K. Zhou, Max L. Wilson. ACM Transactions on Information Systems (TOIS), pp. 1-42. Published 2021-04-27. https://doi.org/10.1145/3445029
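One plausible way to read the fidelity perspective described above is as pairwise agreement: for every pair of responses, does the metric prefer the same one that users prefer? The sketch below computes that fraction. The function name `pairwise_agreement`, the handling of ties, and the sample scores are assumptions for illustration; the abstract does not give the exact definition used in the article.

```python
from itertools import combinations


def pairwise_agreement(metric_scores, user_scores):
    """Fraction of response pairs ordered the same way by metric and users.

    Pairs that are tied under either score are skipped (an assumption).
    Returns NaN when no comparable pair exists.
    """
    agree = 0
    total = 0
    for i, j in combinations(range(len(metric_scores)), 2):
        metric_diff = metric_scores[i] - metric_scores[j]
        user_diff = user_scores[i] - user_scores[j]
        if metric_diff == 0 or user_diff == 0:
            continue
        total += 1
        if (metric_diff > 0) == (user_diff > 0):
            agree += 1
    return agree / total if total else float("nan")


# Illustrative scores for four responses; not taken from the article.
metric = [0.2, 0.5, 0.4, 0.9]
users = [1, 3, 4, 5]
fidelity = pairwise_agreement(metric, users)  # 5 of the 6 pairs agree
```

A value near 0.5 would mean the metric orders pairs no better than chance, while a value near 1.0 would mean it closely tracks user preference.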
Inverted indexes continue to be a mainstay of text search engines, allowing efficient querying of large document collections. While there are a number of possible organizations, document-ordered indexes are the most common, since they are amenable to various query types, support index updates, and allow for efficient dynamic pruning operations. One disadvantage of document-ordered indexes is that high-scoring documents can be distributed across the document identifier space, meaning that index traversal algorithms that terminate early might put search effectiveness at risk. The alternative is impact-ordered indexes, which primarily support top-k disjunctions but also allow for anytime query processing, where the search can be terminated at any time, with search quality improving as processing latency increases. Anytime query processing can be used to reduce high-percentile tail latency, which is essential in operational scenarios where a service level agreement (SLA) imposes response time requirements. In this work, we show how document-ordered indexes can be organized such that they can be queried in an anytime fashion, enabling strict latency control with effective early termination. Our experiments show that processing document-ordered topical segments selected by a simple score estimator outperforms existing anytime algorithms, and allows query runtimes to be accurately limited to comply with SLA requirements.
Anytime Ranking on Document-Ordered Indexes. J. Mackenzie, M. Petri, Alistair Moffat. ACM Transactions on Information Systems (TOIS), pp. 1-32. Published 2021-04-18. https://doi.org/10.1145/3467890
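The general idea of anytime processing over segments can be sketched as follows: visit segments in decreasing order of an estimated score, keep a running top-k, and stop as soon as a budget is exhausted. This is a simplified illustration, not the authors' algorithm. The budget here counts scored documents as a deterministic stand-in for elapsed time, and the segment estimates and document scores are assumed to be given.

```python
import heapq


def anytime_top_k(segments, k, budget):
    """Return the best top-k found after scoring at most `budget` documents.

    segments: list of (estimated_score, [(doc_id, score), ...]) pairs.
    Segments are visited in decreasing order of their estimate, so the
    most promising part of the index is processed first.
    """
    heap = []  # min-heap of (score, doc_id) holding the current top-k
    scored = 0
    for _, docs in sorted(segments, key=lambda seg: seg[0], reverse=True):
        for doc_id, score in docs:
            if scored >= budget:
                return sorted(heap, reverse=True)  # budget spent: stop early
            scored += 1
            if len(heap) < k:
                heapq.heappush(heap, (score, doc_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, doc_id))
    return sorted(heap, reverse=True)


# Illustrative index with three segments; values are made up.
segments = [
    (2.0, [(1, 1.5), (2, 0.7)]),
    (5.0, [(10, 4.1), (11, 3.0), (12, 0.2)]),
    (1.0, [(20, 0.9)]),
]
early = anytime_top_k(segments, k=2, budget=1)
full = anytime_top_k(segments, k=2, budget=100)
```

With a budget of one document the result already contains the best document, because the segment with the highest estimate is visited first; a larger budget fills in the rest of the top-k.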
Recommendation systems are often evaluated based on users’ interactions that were collected from an existing, already deployed recommendation system. In this situation, users only provide feedback on the exposed items and they may not leave feedback on other items since they have not been exposed to them by the deployed system. As a result, the collected feedback dataset that is used to evaluate a new model is influenced by the deployed system, as a form of closed loop feedback. In this article, we show that the typical offline evaluation of recommender systems suffers from the so-called Simpson’s paradox. Simpson’s paradox is the name given to a phenomenon observed when a significant trend appears in several different sub-populations of observational data but disappears or is even reversed when these sub-populations are combined. Our in-depth experiments based on stratified sampling reveal that a very small minority of items that are frequently exposed by the deployed system acts as a confounding factor in the offline evaluation of recommendation systems. In addition, we propose a novel evaluation methodology that takes into account the confounder, i.e., the deployed system’s characteristics. Using the relative comparison of many recommendation models as in the typical offline evaluation of recommender systems, and based on the Kendall rank correlation coefficient, we show that our proposed evaluation methodology exhibits statistically significant improvements of 14% and 40% on the examined open loop datasets (Yahoo! and Coat), respectively, in reflecting the true ranking of systems with an open loop (randomised) evaluation in comparison to the standard evaluation.
The Simpson’s Paradox in the Offline Evaluation of Recommendation Systems. A. H. Jadidinejad, C. Macdonald, I. Ounis. ACM Transactions on Information Systems (TOIS), pp. 1-22. Published 2021-04-18. https://doi.org/10.1145/3458509
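A small numeric example may help make Simpson's paradox concrete in this setting. The counts below are invented for illustration and are not from the article: model A has the higher hit rate within each exposure stratum, yet model B looks better once the strata are pooled, because B's interactions are concentrated in the frequently exposed stratum where hit rates are high for everyone.

```python
# Hypothetical (hits, trials) per model and exposure stratum.
results = {
    "A": {"frequently_exposed": (81, 87), "rarely_exposed": (192, 263)},
    "B": {"frequently_exposed": (234, 270), "rarely_exposed": (55, 80)},
}


def stratum_rate(model, stratum):
    """Hit rate of one model within one stratum."""
    hits, trials = results[model][stratum]
    return hits / trials


def pooled_rate(model):
    """Hit rate of one model when all strata are combined."""
    hits = sum(h for h, _ in results[model].values())
    trials = sum(t for _, t in results[model].values())
    return hits / trials


for stratum in ("frequently_exposed", "rarely_exposed"):
    print(stratum, round(stratum_rate("A", stratum), 3), round(stratum_rate("B", stratum), 3))
print("pooled", round(pooled_rate("A"), 3), round(pooled_rate("B"), 3))
```

Within each stratum A wins (about 0.931 vs 0.867 and 0.730 vs 0.688), but pooled the order flips (0.780 vs about 0.826). Stratifying by exposure is what reveals the per-stratum trend.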
Hao Peng, Ruitong Zhang, Yingtong Dou, Renyu Yang, Jingyi Zhang, Philip S. Yu
Graph Neural Networks (GNNs) have been widely used for the representation learning of various structured graph data, typically through message passing among nodes by aggregating their neighborhood information via different operations. While promising, most existing GNNs oversimplify the complexity and diversity of the edges in the graph and thus cope poorly with ubiquitous heterogeneous graphs, which are typically in the form of multi-relational graph representations. In this article, we propose RioGNN, a novel Reinforced, recursive, and flexible neighborhood selection guided multi-relational Graph Neural Network architecture, to navigate the complexity of neural network structures whilst maintaining relation-dependent representations. We first construct a multi-relational graph, according to the practical task, to reflect the heterogeneity of nodes, edges, attributes, and labels. To avoid the embedding over-assimilation among different types of nodes, we employ a label-aware neural similarity measure to ascertain the most similar neighbors based on node attributes. A reinforced relation-aware neighbor selection mechanism is developed to choose the most similar neighbors of a target node within a relation before aggregating all neighborhood information from different relations to obtain the eventual node embedding. In particular, to improve the efficiency of neighbor selection, we propose a new recursive and scalable reinforcement learning framework with estimable depth and width for different scales of multi-relational graphs. RioGNN can learn more discriminative node embeddings with enhanced explainability due to the recognition of the individual importance of each relation via the filtering threshold mechanism. Comprehensive experiments on real-world graph data and practical tasks demonstrate improvements in effectiveness, efficiency, and model explainability over comparable GNN models.
Reinforced Neighborhood Selection Guided Multi-Relational Graph Neural Networks. ACM Transactions on Information Systems (TOIS), pp. 1-46. Published 2021-04-16. https://doi.org/10.1145/3490181
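The select-then-aggregate pattern described above can be sketched in a few lines: under each relation, keep only the neighbors most similar to the target node, average their features, and combine the per-relation results. This is a toy illustration only. It uses a fixed negative L1 distance in place of the label-aware neural similarity measure, and fixed per-relation `keep_ratios` in place of thresholds learned by reinforcement learning; all names are hypothetical.

```python
import math


def l1_similarity(x, y):
    """Negative L1 distance: larger means more similar (stand-in measure)."""
    return -sum(abs(a - b) for a, b in zip(x, y))


def select_neighbors(target, neighbors, features, keep_ratio):
    """Keep the top `keep_ratio` fraction of neighbors by similarity to target."""
    ranked = sorted(
        neighbors,
        key=lambda n: l1_similarity(features[target], features[n]),
        reverse=True,
    )
    keep = max(1, math.ceil(keep_ratio * len(ranked)))
    return ranked[:keep]


def aggregate(target, relations, features, keep_ratios):
    """Mean of the selected neighbors per relation, concatenated across relations."""
    dim = len(features[target])
    embedding = []
    for relation in sorted(relations):
        chosen = select_neighbors(target, relations[relation], features, keep_ratios[relation])
        if not chosen:
            embedding += [0.0] * dim  # no neighbors under this relation
            continue
        embedding += [sum(features[n][d] for n in chosen) / len(chosen) for d in range(dim)]
    return embedding


# Illustrative graph: node "a" has neighbors under two relations.
features = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.8, 0.0]}
relations = {"r1": ["b", "c"], "r2": ["c", "d"]}
embedding = aggregate("a", relations, features, {"r1": 0.5, "r2": 0.5})
```

In this example the dissimilar neighbor "c" is filtered out under both relations, so it does not dilute the aggregated representation of "a".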
Conversational search is a relatively young area of research that aims at automating an information-seeking dialogue. In this article, we help to position it with respect to other research areas within conversational artificial intelligence (AI) by analysing the structural properties of an information-seeking dialogue. To this end, we perform a large-scale dialogue analysis of more than 150K transcripts from 16 publicly available dialogue datasets. These datasets were collected to inform different dialogue-based tasks including conversational search. We extract different patterns of mixed initiative from these dialogue transcripts and use them to compare dialogues of different types. Moreover, we contrast the patterns found in information-seeking dialogues that are being used for research purposes with the patterns found in virtual reference interviews that were conducted by professional librarians. The insights we provide (1) establish close relations between conversational search and other conversational AI tasks and (2) uncover limitations of existing conversational datasets to inform future data collection tasks.
A Large-scale Analysis of Mixed Initiative in Information-Seeking Dialogues for Conversational Search. Svitlana Vakulenko, E. Kanoulas, M. de Rijke. ACM Transactions on Information Systems (TOIS), pp. 1-32. Published 2021-04-14. https://doi.org/10.1145/3466796
Yang Deng, Yuexiang Xie, Yaliang Li, Min Yang, W. Lam, Ying Shen
Answer selection, which is involved in many natural language processing applications, such as dialog systems and question answering (QA), is an important yet challenging task in practice, since conventional methods typically ignore diverse real-world background knowledge. In this article, we extensively investigate approaches to enhancing the answer selection model with external knowledge from a knowledge graph (KG). First, we present a context-knowledge interaction learning framework, Knowledge-aware Neural Network, which learns the QA sentence representations by considering a tight interaction between the external knowledge from the KG and the textual information. Then, we develop two kinds of knowledge-aware attention mechanisms to summarize both the context-based and knowledge-based interactions between questions and answers. To handle the diversity and complexity of KG information, we further propose a Contextualized Knowledge-aware Attentive Neural Network, which improves the knowledge representation learning with structure information via a customized Graph Convolutional Network and comprehensively learns context-based and knowledge-based sentence representations via the multi-view knowledge-aware attention mechanism. We evaluate our method on four widely used benchmark QA datasets: WikiQA, TREC QA, InsuranceQA, and Yahoo QA. Results verify the benefits of incorporating external knowledge from the KG and show the robust superiority and extensive applicability of our method.
Contextualized Knowledge-aware Attentive Neural Network: Enhancing Answer Selection with Knowledge. ACM Transactions on Information Systems (TOIS), pp. 1-33. Published 2021-04-12. https://doi.org/10.1145/3457533
Lei Guo, Hongzhi Yin, Tong Chen, Xiangliang Zhang, Kai Zheng
Group recommendation aims to recommend items to a group of users. In this work, we study group recommendation in a particular scenario, namely occasional group recommendation, where groups are formed ad hoc and users may just constitute a group for the first time; that is, the historical group-item interaction records are highly limited. Most state-of-the-art works have addressed the challenge by aggregating group members’ personal preferences to learn the group representation. However, representation learning for a group is more complex than the aggregation or fusion of group member representations, as the personal preferences and group preferences may be in different spaces and even orthogonal. In addition, the learned user representation is not accurate due to the sparsity of users’ interaction data. Moreover, the group similarity in terms of common group members has been overlooked, although it has great potential to improve the group representation learning. In this work, we focus on addressing the aforementioned challenges in the group representation learning task, and devise a hierarchical hyperedge embedding-based group recommender, namely HyperGroup. Specifically, we propose to leverage the user-user interactions to alleviate the sparsity issue of user-item interactions, and design a graph neural network-based representation learning network to enhance the learning of individuals’ preferences from their friends’ preferences, which provides a solid foundation for learning groups’ preferences. To exploit the group similarity (i.e., overlapping relationships among groups) to learn a more accurate group representation from highly limited group-item interactions, we connect all groups as a network of overlapping sets (a.k.a. hypergraph), and treat the task of group preference learning as embedding hyperedges (i.e., user sets/groups) in a hypergraph, where an inductive hyperedge embedding method is proposed. To further enhance the group-level preference modeling, we develop a joint training strategy to learn both user-item and group-item interactions in the same process. We conduct extensive experiments on two real-world datasets, and the experimental results demonstrate the superiority of our proposed HyperGroup in comparison to the state-of-the-art baselines.
Hierarchical Hyperedge Embedding-Based Representation Learning for Group Recommendation. ACM Transactions on Information Systems (TOIS), pp. 1-27. Published 2021-03-24. https://doi.org/10.1145/3457949
Juntao Li, Chang Liu, Chongyang Tao, Zhangming Chan, Dongyan Zhao, Min Zhang, Rui Yan
Existing multi-turn context-response matching methods mainly concentrate on obtaining multi-level and multi-dimensional representations and better interactions between context utterances and the response. However, in real-world conversation scenarios, whether a response candidate is suitable depends not only on the given dialogue context but also on other background information, e.g., wording habits and user-specific dialogue history. To bridge the gap between these recent methods and real-world applications, we incorporate user-specific dialogue history into response selection and propose a personalized hybrid matching network (PHMN). Our contributions are two-fold: (1) our model extracts personalized wording behaviors from user-specific dialogue history as extra matching information; (2) we perform hybrid representation learning on context-response utterances and explicitly incorporate a customized attention mechanism to extract vital information from context-response interactions so as to improve matching accuracy. We evaluate our model on two large datasets with user identification, i.e., the personalized Ubuntu Dialogue Corpus (P-Ubuntu) and the personalized Weibo dataset (P-Weibo). Experimental results confirm that our method significantly outperforms several strong models by combining personalized attention, wording behaviors, and hybrid representation learning.
{"title":"Dialogue History Matters! Personalized Response Selection in Multi-Turn Retrieval-Based Chatbots","authors":"Juntao Li, Chang Liu, Chongyang Tao, Zhangming Chan, Dongyan Zhao, Min Zhang, Rui Yan","doi":"10.1145/3453183","DOIUrl":"https://doi.org/10.1145/3453183","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"6 1","pages":"1 - 25"},"PeriodicalIF":0.0,"publicationDate":"2021-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80269305","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-stage ranking pipelines have been a practical solution in modern search systems, where first-stage retrieval returns a subset of candidate documents and later stages re-rank those candidates. Unlike the re-ranking stages, which have undergone rapid technical shifts over the past decades, first-stage retrieval has long been dominated by classical term-based models. Unfortunately, these models suffer from the vocabulary mismatch problem, which may prevent relevant documents from ever reaching the re-ranking stages. Therefore, there has been a long-standing desire to build semantic models for first-stage retrieval that can achieve high recall efficiently. Recently, we have witnessed explosive growth of research interest in first-stage semantic retrieval models. We believe it is the right time to survey the current status, learn from existing methods, and gain insights for future development. In this article, we describe the current landscape of first-stage retrieval models under a unified framework to clarify the connection between classical term-based retrieval methods, early semantic retrieval methods, and neural semantic retrieval methods. Moreover, we identify some open challenges and envision some future directions, with the hope of inspiring more research on these important yet less-investigated topics.
{"title":"Semantic Models for the First-Stage Retrieval: A Comprehensive Review","authors":"Yinqiong Cai, Yixing Fan, Jiafeng Guo, Fei Sun, Ruqing Zhang, Xueqi Cheng","doi":"10.1145/3486250","DOIUrl":"https://doi.org/10.1145/3486250","url":null,"PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"183 1","pages":"1 - 42"},"PeriodicalIF":0.0,"publicationDate":"2021-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77597594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
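The vocabulary mismatch problem named in the review's abstract can be illustrated with a toy two-stage pipeline. This is a minimal sketch under stated assumptions, not a method from the surveyed work: the corpus `DOCS`, the overlap-count scorer in `first_stage`, and the density-based `rerank` are all hypothetical stand-ins chosen for brevity. It shows that a document sharing no terms with the query never reaches the re-ranking stage, however good that stage is.

```python
from collections import Counter

# Hypothetical toy corpus; d2 is relevant to "car repair" but shares no terms with it.
DOCS = {
    "d1": "car repair manual for beginners",
    "d2": "automobile maintenance guide",
    "d3": "cooking pasta at home",
}


def tokenize(text):
    """Lowercase whitespace tokenization (an assumption made for simplicity)."""
    return text.lower().split()


def first_stage(query, docs, k=2):
    """Term-based retrieval: score is the count of query-term occurrences in the doc.

    Documents with no overlapping term are dropped entirely.
    """
    q_terms = set(tokenize(query))
    scores = {}
    for doc_id, text in docs.items():
        counts = Counter(tokenize(text))
        score = sum(counts[t] for t in q_terms)
        if score > 0:
            scores[doc_id] = score
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))[:k]


def rerank(query, candidates, docs):
    """Illustrative second stage: it can only reorder the candidates it receives."""
    q_terms = set(tokenize(query))

    def density(doc_id):
        tokens = tokenize(docs[doc_id])
        return sum(t in q_terms for t in tokens) / len(tokens)

    return sorted(candidates, key=lambda d: (-density(d), d))


candidates = first_stage("car repair", DOCS)
ranked = rerank("car repair", candidates, DOCS)
print(candidates, ranked)
```

Because `d2` is filtered out in `first_stage`, no change to `rerank` can recover it; this is the recall gap that semantic first-stage models aim to close.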