The current research is devoted to the challenging and under-investigated task of multi-source answer generation for complex non-factoid questions. We will start by experimenting with generative models on one particular type of non-factoid question: instrumental/procedural questions, which often start with "how to". For this, we will use a new dataset of more than 100,000 QA pairs crawled from a dedicated web resource, where each answer has a set of references to the articles it was based on. We will also compare different ways of evaluating models in order to choose a metric that correlates better with human assessment. To do this, we need to understand how people evaluate answers to non-factoid questions and to set formal criteria for what makes a good-quality answer. We will employ eye-tracking and crowdsourcing methods to study how users interact with answers and evaluate them, and how answer features correlate with task complexity. We hope that our research will help redefine the way users interact and work with search engines, finally transforming IR systems into the answer retrieval systems that users have always desired.
From: "Multi-Document Answer Generation for Non-Factoid Questions" by Valeriia Bolotova-Baranova. Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2020), 2020-07-25. DOI: 10.1145/3397271.3401449.
Recent progress in deep learning has brought tremendous improvements in conversational AI, leading to a plethora of commercial conversational services that allow naturally spoken interactions and increasing the need for more human-centric interactions in IR. As a result, we have witnessed a resurgent interest in developing modern conversational information retrieval (CIR) systems in research communities and industry. This tutorial presents recent advances in CIR, focusing mainly on neural approaches and new applications developed in the past five years. Our goal is to provide a thorough and in-depth overview of the general definition of CIR, the components of CIR systems, the new applications raised by its conversational aspects, and the (neural) techniques recently developed for it.
From: "Recent Advances in Conversational Information Retrieval" by Jianfeng Gao, Chenyan Xiong, Paul N. Bennett. SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401418.
Wen Wang, Wei Zhang, Jun Rao, Zhijie Qiu, Bo Zhang, Leyu Lin, H. Zha
Sequential recommendation and group recommendation are two important branches in the field of recommender systems. While considerable effort has been devoted to each branch independently, we combine them by proposing the novel problem of sequential group recommendation, which enables modeling dynamic group representations and is crucial for achieving better group recommendation performance. The major challenge is how to effectively learn dynamic group representations from the sequential user-item interactions of group members in past time frames. To address this, we devise a Group-aware Long- and Short-term Graph Representation Learning approach, GLS-GRL, for sequential group recommendation. Specifically, for a target group, we construct a group-aware long-term graph to capture user-item interactions and item-item co-occurrence over the whole history, and a group-aware short-term graph containing the same information for only the current time frame. Based on these graphs, GLS-GRL performs graph representation learning to obtain long-term and short-term user representations, and then adaptively fuses them to gain integrated user representations. Finally, group representations are obtained through a constrained user-interacted attention mechanism that encodes the correlations between group members. Comprehensive experiments demonstrate that GLS-GRL achieves better performance than several strong alternatives from sequential recommendation and group recommendation methods, validating the effectiveness of its core components.
From: "Group-Aware Long- and Short-Term Graph Representation Learning for Sequential Group Recommendation". SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401136.
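The final step described above, aggregating member representations into a group representation with attention, can be sketched as follows. The real GLS-GRL uses graph-learned, long/short-term fused user vectors and a constrained user-interacted attention; this sketch substitutes random vectors and plain dot-product attention purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
members = rng.normal(size=(4, d))  # one vector per group member (stand-ins)
query = rng.normal(size=d)         # e.g. a candidate item embedding

scores = members @ query           # relevance of each member to the item
weights = np.exp(scores - scores.max())
weights /= weights.sum()           # softmax attention weights over members

group_repr = weights @ members     # weighted sum = group representation
print(weights.round(3), group_repr.shape)
```

Members who react more strongly to the candidate item dominate the group vector, which is the intuition behind attention-based group aggregation.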
Kunal Goyal, Utkarsh Gupta, A. De, Soumen Chakrabarti
Graph retrieval from a large corpus of graphs has a wide variety of applications, e.g., sentence retrieval using words and dependency parse trees for question answering, image retrieval using scene graphs, and molecule discovery from a set of existing molecular graphs. In such graph-search applications, nodes, edges and associated features bear distinctive physical significance. Therefore, a unified, trainable search model that efficiently returns corpus graphs highly relevant to a query graph has immense potential impact. In this paper, we present an effective, feature- and structure-aware, end-to-end trainable neural match scoring system for graphs. We achieve this by constructing the product graph between the query and a candidate graph in the corpus, then conducting a family of random walks on the product graph; the walks are aggregated into the match score by a network whose parameters can be trained. Experiments show the efficacy of our method compared to competitive baseline approaches.
From: "Deep Neural Matching Models for Graph Retrieval". SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401216.
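The product-graph construction above generalizes the classical random-walk graph kernel. A minimal, non-trainable sketch of that underlying idea: the adjacency matrix of the (tensor) product graph is the Kronecker product of the two adjacency matrices, and a walk on it corresponds to a pair of simultaneous walks in both graphs. The paper's model replaces these fixed walk counts with learned, feature-aware scoring:

```python
import numpy as np

def product_walk_score(A_q, A_c, K=3, lam=0.5):
    # Kronecker product = adjacency of the tensor product graph.
    W = np.kron(A_q, A_c)
    n = W.shape[0]
    ones = np.ones(n)
    score, Wk = 0.0, np.eye(n)
    for k in range(1, K + 1):
        Wk = Wk @ W                       # counts walks of length k
        score += (lam ** k) * (ones @ Wk @ ones)  # damped sum over lengths
    return score

tri  = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], float)  # triangle
path = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)  # 3-node path

# A graph scores highest against itself under this similarity.
print(product_walk_score(tri, tri), product_walk_score(tri, path))
```

The damping factor `lam` keeps long walks from dominating; in the neural model, both the node-pair compatibilities and the aggregation are trained rather than fixed.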
In professional search tasks such as precision medicine literature search, queries often involve multiple aspects. To assess the relevance of a document, a searcher often painstakingly validates each aspect in the query and follows a task-specific logic to make a relevance decision. In such scenarios, we say the searcher makes a structured relevance judgment, as opposed to the traditional univariate (binary or graded) relevance judgment. Ideally, a search engine could support the searcher's workflow and follow the same steps to predict document relevance. This approach may not only yield highly effective retrieval models, but also open up opportunities for the model to explain its decision in the same "lingo" as the searcher. Using structured relevance judgment data from the TREC Precision Medicine track, we propose novel retrieval models that emulate how medical experts make structured relevance judgments. Our experiments demonstrate that these simple, explainable models can outperform complex, black-box learning-to-rank models.
From: "Towards Explainable Retrieval Models for Precision Medicine Literature Search" by Jiaming Qu, Jaime Arguello, Yue Wang. SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401277.
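The shape of a structured relevance judgment can be sketched as: score each query aspect separately, then combine the aspect scores with a task-specific rule rather than one opaque number. The aspect names, the term-overlap scorer, and the combination rule below are all invented for illustration; the paper learns such steps from TREC Precision Medicine data:

```python
def aspect_scores(doc, query):
    # One score per aspect: here, naive term matching per aspect.
    return {
        aspect: float(any(t in doc for t in terms))
        for aspect, terms in query.items()
    }

def judge(doc, query):
    s = aspect_scores(doc, query)
    # Task-specific logic (hypothetical): the disease aspect is mandatory;
    # the remaining aspects contribute additively.
    if s["disease"] == 0.0:
        return 0.0, s
    return 1.0 + s["gene"] + s["treatment"], s

query = {"disease": ["melanoma"], "gene": ["braf"], "treatment": ["vemurafenib"]}
doc = "braf-mutant melanoma responds to vemurafenib"
score, s = judge(doc, query)
print(score, s)  # the per-aspect dict doubles as the model's explanation
```

Because the final score decomposes into named aspect decisions, the model can explain itself in the searcher's own "lingo", which is the point the abstract makes.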
Ao Liu, Shuai Yuan, Chenbin Zhang, Congjian Luo, Yaqing Liao, Kun Bai, Zenglin Xu
Multimodal Machine Comprehension (M^3C) is a challenging task that requires understanding both language and vision, as well as their integration and interaction. For example, the RecipeQA challenge, which provides several M^3C tasks, requires deep neural models to understand textual instructions, images of different steps, and the logical order of cooking steps. To address this challenge, we propose a Multi-Level Multi-Modal Transformer (MLMM-Trans) framework to integrate and understand multiple textual instructions and multiple images. Our model applies intensive attention at multiple levels of objects (e.g., the step level and the passage-image level) for sequences of different modalities. Experiments show that our model achieves state-of-the-art results on the three multimodal tasks of RecipeQA.
From: "Multi-Level Multimodal Transformer Network for Multimodal Recipe Comprehension". SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401247.
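The core operation at each of those levels is attention between a sequence of textual step embeddings and a sequence of image embeddings. A minimal single-level sketch, with made-up dimensions and random stand-in embeddings (the real model stacks such attention at step level and passage-image level inside a transformer):

```python
import numpy as np

def cross_attention(text, images):
    # text: (n_steps, d), images: (n_imgs, d)
    d = text.shape[1]
    logits = text @ images.T / np.sqrt(d)        # (n_steps, n_imgs)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)      # each step attends over images
    return attn @ images                         # image-informed step vectors

rng = np.random.default_rng(1)
text, images = rng.normal(size=(5, 16)), rng.normal(size=(3, 16))
out = cross_attention(text, images)
print(out.shape)  # (5, 16)
```

Each textual step ends up represented as a mixture of the images it attends to, which is how the two modalities get integrated before answering.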
Dong-Kyu Chae, Jihoo Kim, Duen Horng Chau, Sang-Wook Kim
Cold-start problems are arguably the biggest challenges faced by collaborative filtering (CF) in recommender systems. When few ratings are available, CF models typically fail to provide satisfactory recommendations for cold-start users or to display cold-start items in users' top-N recommendation lists. Data imputation has been a popular way to deal with such problems in CF, filling empty ratings with inferred scores. Different from (and complementary to) data imputation, this paper presents AR-CF, which stands for Augmented Reality CF, a novel framework that addresses cold-start problems by generating virtual but plausible neighbors for cold-start users or items and appending them to the rating matrix as additional information for CF models. Notably, AR-CF not only directly tackles cold-start problems but is also effective in improving overall recommendation quality. Via extensive experiments on real-world datasets, AR-CF is shown to (1) significantly improve recommendation accuracy for cold-start users, (2) surface a meaningful number of cold-start items in users' top-N lists, and (3) achieve the best accuracy in basic top-N recommendation as well, all in comparison with recent state-of-the-art methods.
From: "AR-CF: Augmenting Virtual Users and Items in Collaborative Filtering for Addressing Cold-Start Problems". SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401038.
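The augmentation idea can be sketched in a few lines: generate a plausible "virtual neighbor" for a cold-start user and append it as an extra row of the rating matrix before running any CF model. AR-CF generates neighbors with trained generative models; the sketch below fakes one by perturbing the most similar warm user, purely to show how augmentation feeds the downstream model:

```python
import numpy as np

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [0, 0, 5, 4],
              [5, 4, 0, 0]], float)    # users x items, 0 = unrated
cold = np.array([5, 0, 0, 0], float)   # cold-start user: a single rating

# Pick the warm user most similar to the cold-start user (cosine).
sims = R @ cold / (np.linalg.norm(R, axis=1) * np.linalg.norm(cold) + 1e-9)
nearest = R[np.argmax(sims)]

rng = np.random.default_rng(0)
virtual = np.clip(nearest + rng.normal(0, 0.3, size=4), 0, 5)  # plausible neighbor

# Any CF model can now be trained on the augmented matrix.
R_aug = np.vstack([R, cold, virtual])
print(R_aug.shape)  # (6, 4)
```

The virtual row gives the cold-start user dense "neighbors" to borrow signal from, which is the mechanism the abstract credits for the accuracy gains.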
Information retrieval (IR) is the science of search: the search for pieces of information relevant to a user's query in a collection of unstructured resources. Information in this context includes text, imagery, audio, video, XML, programs, and metadata. The journey of an IR process begins with a user query sent to the IR system, which encodes the query, compares it with the available resources, and returns the most relevant pieces of information. Thus, the system is equipped with the ability to store, retrieve and maintain information. In the early era of IR, the whole process relied on handcrafted features and ad-hoc relevance measures. Later, principled frameworks for relevance measurement were developed on the basis of statistical learning. Recently, deep learning has brought many more opportunities to IR, because data-driven features combined with data-driven relevance measures can effectively eliminate human bias in feature and relevance-measure design. Deep learning has shown significant potential to transform IR, as evidenced by abundant empirical results. However, we continue to strive for a comprehensive understanding of deep learning, answering questions such as why deep structures are superior to shallow ones, how skip connections affect a model's performance, what relationships hold between hyper-parameters and a model's performance, and how to reduce the chance that deep models are fooled by adversaries. Answering such questions can help design more effective deep models and devise more efficient schemes for model training.
From: "How Deep Learning Works for Information Retrieval" by D. Tao. SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3402429.
Paul Thomas, Daniel J. McDuff, M. Czerwinski, Nick Craswell
Past work in information-seeking conversation has demonstrated that people exhibit different conversational styles---for example, in word choice or prosody---that differences in style lead to poorer conversations, and that partners actively align their styles over time. One might assume that this would also be true for conversations with an artificial agent such as Cortana, Siri, or Alexa, and that agents should therefore track and mimic a user's style. We examine this hypothesis with reference to a lab study, where 24 participants carried out relatively long information-seeking tasks with an embodied conversational agent. The agent combined topical language models with a conversational dialogue engine, style recognition and alignment modules. We see that "style" can be measured in human-to-agent conversation, although it looks somewhat different from style in human-to-human conversation and does not correlate with self-reported preferences. There is evidence that people align their style to the agent, and that conversations run more smoothly if the agent detects, and aligns to, the human's style as well.
From: "Expressions of Style in Information Seeking Conversation with an Agent". SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401127.
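One common proxy for measuring style from word choice, language style matching, compares the rates at which two speakers use function words. This is only a toy version of that idea (the study's actual style features, including prosody, are much richer, and the word list below is an arbitrary stand-in):

```python
# Toy language-style-matching score over function-word rates.
FUNCTION_WORDS = {"the", "a", "an", "i", "you", "it", "and", "but", "of", "to"}

def fw_rate(text):
    toks = text.lower().split()
    return sum(t in FUNCTION_WORDS for t in toks) / max(len(toks), 1)

def style_match(a, b):
    ra, rb = fw_rate(a), fw_rate(b)
    return 1 - abs(ra - rb) / (ra + rb + 1e-9)  # 1.0 = identical rates

human = "i want to find a cheap flight to rome and a hotel"
agent = "you can find a flight to rome and i can book it"
print(round(style_match(human, agent), 3))
```

An agent that tracks this score over turns could nudge its own phrasing toward the user's, which is the alignment behavior the study tests.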
Term frequency is a common way to estimate the importance of a term in a document, but it ignores how the term interacts with its text context, which is key to estimating document-specific term weights. This paper proposes a Deep Contextualized Term Weighting framework (DeepCT) that maps contextualized term representations from BERT into context-aware term weights for passage retrieval. The new, deep term weights can be stored in an ordinary inverted index for efficient retrieval. Experiments on two datasets demonstrate that DeepCT greatly improves the accuracy of first-stage passage retrieval algorithms.
From: "Context-Aware Term Weighting For First Stage Passage Retrieval" by Zhuyun Dai, Jamie Callan. SIGIR 2020, 2020-07-25. DOI: 10.1145/3397271.3401204.
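The pipeline shape described above can be sketched as: a learned head maps each token's contextual vector to a scalar weight, which is then scaled and quantized so it can stand in for term frequency in an ordinary inverted index. Random vectors stand in for BERT embeddings and for the trained regression head here; only the data flow matches the framework:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
tokens = ["deep", "term", "weighting", "for", "retrieval"]
H = rng.normal(size=(len(tokens), d))  # stand-in contextual embeddings
w = rng.normal(size=d)                 # stand-in trained regression head

weights = np.clip(H @ w, 0, None)            # predicted importance, >= 0
tf_like = np.round(weights * 10).astype(int)  # quantize for the index

# Posting entries (term -> pseudo term frequency); zero-weight terms drop out,
# so unimportant words never enter the index at all.
index = {t: int(f) for t, f in zip(tokens, tf_like) if f > 0}
print(index)
```

Because the output is just an integer per term, any existing BM25-style first-stage retriever can consume it unchanged, which is why the weights "can be stored in an ordinary inverted index".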