Injecting the BM25 Score as Text Improves BERT-Based Re-rankers
Pub Date: 2023-01-23 | DOI: 10.48550/arXiv.2301.09728
Arian Askari, Amin Abolghasemi, G. Pasi, Wessel Kraaij, S. Verberne
In this paper we propose a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers: we inject the relevance score of the lexical model as a token in the middle of the input of the cross-encoder re-ranker. Prior work has shown that interpolation between the relevance scores of lexical retrievers and BERT-based re-rankers may not consistently result in higher effectiveness. Our idea is motivated by the finding that BERT models can capture numeric information. We compare several representations of the BM25 score and inject them as text in the input of four different cross-encoders. We additionally analyze the effect for different query types, and investigate the effectiveness of our method for capturing exact matching relevance. Evaluation on the MS MARCO Passage collection and the TREC DL collections shows that the proposed method significantly improves over all cross-encoder re-rankers as well as the common interpolation methods. We show that the improvement is consistent for all query types. We also find an improvement in exact matching capabilities over both BM25 and the cross-encoders. Our findings indicate that cross-encoder re-rankers can be efficiently improved, without additional computational burden or extra pipeline steps, by explicitly adding the output of the first-stage ranker to the model input, and this effect is robust across models and query types.
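A minimal sketch of the core mechanism, assuming a public MS MARCO cross-encoder checkpoint and a rounded-integer text representation of the score (both are illustrative assumptions; the paper compares several representations and its exact input layout may differ):

```python
# Sketch: inject a first-stage BM25 score as text between the query and the
# passage before cross-encoding. Not the authors' code; checkpoint and score
# representation are assumptions for illustration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "cross-encoder/ms-marco-MiniLM-L-6-v2"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

def rerank_score(query: str, passage: str, bm25_score: float) -> float:
    # Represent the BM25 score as text; here a rounded integer.
    score_text = str(round(bm25_score))
    # Place the score between the query and the passage.
    inputs = tokenizer(query, f"{score_text} [SEP] {passage}",
                       truncation=True, return_tensors="pt")
    with torch.no_grad():
        return model(**inputs).logits.squeeze().item()

print(rerank_score("what is bm25",
                   "BM25 is a bag-of-words ranking function ...", 21.7))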
{"title":"Injecting the BM25 Score as Text Improves BERT-Based Re-rankers","authors":"Arian Askari, Amin Abolghasemi, G. Pasi, Wessel Kraaij, S. Verberne","doi":"10.48550/arXiv.2301.09728","DOIUrl":"https://doi.org/10.48550/arXiv.2301.09728","url":null,"abstract":"In this paper we propose a novel approach for combining first-stage lexical retrieval models and Transformer-based re-rankers: we inject the relevance score of the lexical model as a token in the middle of the input of the cross-encoder re-ranker. It was shown in prior work that interpolation between the relevance score of lexical and BERT-based re-rankers may not consistently result in higher effectiveness. Our idea is motivated by the finding that BERT models can capture numeric information. We compare several representations of the BM25 score and inject them as text in the input of four different cross-encoders. We additionally analyze the effect for different query types, and investigate the effectiveness of our method for capturing exact matching relevance. Evaluation on the MSMARCO Passage collection and the TREC DL collections shows that the proposed method significantly improves over all cross-encoder re-rankers as well as the common interpolation methods. We show that the improvement is consistent for all query types. We also find an improvement in exact matching capabilities over both BM25 and the cross-encoders. Our findings indicate that cross-encoder re-rankers can efficiently be improved without additional computational burden and extra steps in the pipeline by explicitly adding the output of the first-stage ranker to the model input, and this effect is robust for different models and query types.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126906690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
New Metrics to Encourage Innovation and Diversity in Information Retrieval Approaches
Pub Date: 2023-01-19 | DOI: 10.48550/arXiv.2301.08062
Mehmet Deniz Turkmen, Matthew Lease, Mucahid Kutlu
In evaluation campaigns, participants often explore variations of popular, state-of-the-art baselines as a low-risk strategy to achieve competitive results. While effective, this can lead to local "hill climbing" rather than more radical and innovative departures from standard methods. Moreover, if many participants build on similar baselines, the overall diversity of approaches considered may be limited. In this work, we propose a new class of IR evaluation metrics intended to promote greater diversity of approaches in evaluation campaigns. Whereas traditional IR metrics focus on user experience, our two "innovation" metrics instead reward exploration of more divergent, higher-risk strategies that find relevant documents missed by other systems. Experiments on four TREC collections show that our metrics do change system rankings by rewarding systems that find such rare, relevant documents. This result is further supported by a controlled, synthetic data experiment and a qualitative analysis. In addition, we show that our metrics achieve higher evaluation stability and discriminative power than the standard metrics we modify. To support reproducibility, we share our source code.
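The idea of rewarding rare relevant documents can be illustrated with a toy metric; the weighting below (inverse of the number of competing systems that also retrieve the document) is an illustrative assumption, not the paper's exact definition:

```python
# Toy innovation-oriented metric: precision@k where each relevant document
# is down-weighted by how many competing systems also retrieve it.
from collections import Counter

def innovation_at_k(run, other_runs, qrels, k=10):
    """run, other_runs: ranked lists of doc ids; qrels: set of relevant ids."""
    # How often each document appears in competitors' top-k results.
    pool = Counter(doc for other in other_runs for doc in other[:k])
    score = 0.0
    for doc in run[:k]:
        if doc in qrels:
            score += 1.0 / (1 + pool[doc])  # rare relevant docs earn more
    return score / k

sys_a = ["d1", "d2", "d3"]
sys_b = ["d1", "d4", "d5"]
qrels = {"d2", "d4"}
print(innovation_at_k(sys_a, [sys_b], qrels, k=3))  # d2 found only by sys_a
```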
{"title":"New Metrics to Encourage Innovation and Diversity in Information Retrieval Approaches","authors":"Mehmet Deniz Turkmen, Matthew Lease, Mucahid Kutlu","doi":"10.48550/arXiv.2301.08062","DOIUrl":"https://doi.org/10.48550/arXiv.2301.08062","url":null,"abstract":"In evaluation campaigns, participants often explore variations of popular, state-of-the-art baselines as a low-risk strategy to achieve competitive results. While effective, this can lead to local\"hill climbing\"rather than more radical and innovative departure from standard methods. Moreover, if many participants build on similar baselines, the overall diversity of approaches considered may be limited. In this work, we propose a new class of IR evaluation metrics intended to promote greater diversity of approaches in evaluation campaigns. Whereas traditional IR metrics focus on user experience, our two\"innovation\"metrics instead reward exploration of more divergent, higher-risk strategies finding relevant documents missed by other systems. Experiments on four TREC collections show that our metrics do change system rankings by rewarding systems that find such rare, relevant documents. This result is further supported by a controlled, synthetic data experiment, and a qualitative analysis. In addition, we show that our metrics achieve higher evaluation stability and discriminative power than the standard metrics we modify. To support reproducibility, we share our source code.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127068078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Keyword Embeddings for Query Suggestion
Pub Date: 2023-01-19 | DOI: 10.48550/arXiv.2301.08006
Jorge Gabín, M. E. Ares, Javier Parapar
Nowadays, search engine users commonly rely on query suggestions to improve their initial inputs. Current systems are very good at recommending lexical adaptations or spelling corrections to users' queries. However, they often struggle to suggest semantically related keywords given a user's query. The construction of a detailed query is crucial in some tasks, such as legal retrieval or academic search. In these scenarios, keyword suggestion methods are critical to guide the user during the query formulation. This paper proposes two novel models for the keyword suggestion task trained on scientific literature. Our techniques adapt the architecture of Word2Vec and FastText to generate keyword embeddings by leveraging documents' keyword co-occurrence. Along with these models, we also present a specially tailored negative sampling approach that exploits how keywords appear in academic publications. We devise a ranking-based evaluation methodology following both known-item and ad-hoc search scenarios. Finally, we evaluate our proposals against state-of-the-art word and sentence embedding models, showing considerable improvements over the baselines for the tasks.
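A minimal sketch of the underlying idea: treating each paper's keyword list as one Word2Vec "sentence", so that embeddings are driven by keyword co-occurrence rather than running text. Plain gensim defaults stand in here for the paper's tailored negative sampling:

```python
# Sketch: keyword embeddings from per-paper keyword co-occurrence.
# The keyword lists below are made up for illustration.
from gensim.models import Word2Vec

papers_keywords = [
    ["information retrieval", "query suggestion", "word embeddings"],
    ["information retrieval", "learning to rank", "neural networks"],
    ["query suggestion", "query logs", "search engines"],
]
model = Word2Vec(sentences=papers_keywords,
                 vector_size=100,
                 window=100,   # large window: all keywords of a paper co-occur
                 min_count=1,
                 sg=1)         # skip-gram
print(model.wv.most_similar("query suggestion", topn=2))
```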
{"title":"Keyword Embeddings for Query Suggestion","authors":"Jorge Gab'in, M. E. Ares, Javier Parapar","doi":"10.48550/arXiv.2301.08006","DOIUrl":"https://doi.org/10.48550/arXiv.2301.08006","url":null,"abstract":"Nowadays, search engine users commonly rely on query suggestions to improve their initial inputs. Current systems are very good at recommending lexical adaptations or spelling corrections to users' queries. However, they often struggle to suggest semantically related keywords given a user's query. The construction of a detailed query is crucial in some tasks, such as legal retrieval or academic search. In these scenarios, keyword suggestion methods are critical to guide the user during the query formulation. This paper proposes two novel models for the keyword suggestion task trained on scientific literature. Our techniques adapt the architecture of Word2Vec and FastText to generate keyword embeddings by leveraging documents' keyword co-occurrence. Along with these models, we also present a specially tailored negative sampling approach that exploits how keywords appear in academic publications. We devise a ranking-based evaluation methodology following both known-item and ad-hoc search scenarios. Finally, we evaluate our proposals against the state-of-the-art word and sentence embedding models showing considerable improvements over the baselines for the tasks.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124340407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting Stance of Authorities towards Rumors in Arabic Tweets: A Preliminary Study
Pub Date: 2023-01-14 | DOI: 10.48550/arXiv.2301.05863
Fatima Haouari, Tamer Elsayed
A myriad of studies have addressed the problem of rumor verification on Twitter by utilizing either evidence from the propagation networks or external evidence from the Web. However, none of these studies exploited evidence from trusted authorities. In this paper, we define the task of detecting the stance of authorities towards rumors in tweets, i.e., whether a tweet from an authority agrees with, disagrees with, or is unrelated to the rumor. We believe the task is useful for augmenting the sources of evidence utilized by existing rumor verification systems. We construct and release the first Authority STance towards Rumors (AuSTR) dataset, where evidence is retrieved from authority timelines in Arabic Twitter. Due to the relatively limited size of our dataset, we study the usefulness of existing stance detection datasets for our task. We show that existing datasets are somewhat useful for the task; however, they are clearly insufficient, which motivates the need to augment them with annotated data capturing the stance of authorities on Twitter.
{"title":"Detecting Stance of Authorities towards Rumors in Arabic Tweets: A Preliminary Study","authors":"Fatima Haouari, Tamer Elsayed","doi":"10.48550/arXiv.2301.05863","DOIUrl":"https://doi.org/10.48550/arXiv.2301.05863","url":null,"abstract":"A myriad of studies addressed the problem of rumor verification in Twitter by either utilizing evidence from the propagation networks or external evidence from the Web. However, none of these studies exploited evidence from trusted authorities. In this paper, we define the task of detecting the stance of authorities towards rumors in tweets, i.e., whether a tweet from an authority agrees, disagrees, or is unrelated to the rumor. We believe the task is useful to augment the sources of evidence utilized by existing rumor verification systems. We construct and release the first Authority STance towards Rumors (AuSTR) dataset, where evidence is retrieved from authority timelines in Arabic Twitter. Due to the relatively limited size of our dataset, we study the usefulness of existing datasets for stance detection in our task. We show that existing datasets are somewhat useful for the task; however, they are clearly insufficient, which motivates the need to augment them with annotated data constituting stance of authorities from Twitter.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125644301","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Knowledge is Power, Understanding is Impact: Utility and Beyond Goals, Explanation Quality, and Fairness in Path Reasoning Recommendation
Pub Date: 2023-01-14 | DOI: 10.48550/arXiv.2301.05944
Giacomo Balloccu, Ludovico Boratto, Christian Cancedda, G. Fenu, M. Marras
Path reasoning is a notable recommendation approach that models high-order user-product relations based on a Knowledge Graph (KG). This approach can extract reasoning paths between recommended products and already experienced products and then turn such paths into textual explanations for the user. Unfortunately, evaluation protocols in this field appear heterogeneous and limited, making it hard to contextualize the impact of existing methods. In this paper, we replicated three state-of-the-art path reasoning recommendation methods proposed at top-tier conferences. Under a common evaluation protocol, based on two public data sets and in comparison with other knowledge-aware methods, we then studied the extent to which they meet recommendation utility and goals beyond it: explanation quality, and consumer and provider fairness. Our study provides a picture of the progress in this field, highlighting open issues and future directions. Source code: https://github.com/giacoballoccu/rep-path-reasoning-recsys.
{"title":"Knowledge is Power, Understanding is Impact: Utility and Beyond Goals, Explanation Quality, and Fairness in Path Reasoning Recommendation","authors":"Giacomo Balloccu, Ludovico Boratto, Christian Cancedda, G. Fenu, M. Marras","doi":"10.48550/arXiv.2301.05944","DOIUrl":"https://doi.org/10.48550/arXiv.2301.05944","url":null,"abstract":"Path reasoning is a notable recommendation approach that models high-order user-product relations, based on a Knowledge Graph (KG). This approach can extract reasoning paths between recommended products and already experienced products and, then, turn such paths into textual explanations for the user. Unfortunately, evaluation protocols in this field appear heterogeneous and limited, making it hard to contextualize the impact of the existing methods. In this paper, we replicated three state-of-the-art relevant path reasoning recommendation methods proposed in top-tier conferences. Under a common evaluation protocol, based on two public data sets and in comparison with other knowledge-aware methods, we then studied the extent to which they meet recommendation utility and beyond objectives, explanation quality, and consumer and provider fairness. Our study provides a picture of the progress in this field, highlighting open issues and future directions. Source code: url{https://github.com/giacoballoccu/rep-path-reasoning-recsys}.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117182679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
It's Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers
Pub Date: 2023-01-13 | DOI: 10.48550/arXiv.2301.05453
Ana-Maria Bucur, Adrian Cosma, Paolo Rosso, Liviu P. Dinu
Depression detection from user-generated content on the internet has been a long-lasting topic of interest in the research community, providing valuable screening tools for psychologists. The ubiquitous use of social media platforms lays out the perfect avenue for exploring mental health manifestations in posts and interactions with other users. Current methods for depression detection from social media mainly focus on text processing, and only a few also utilize images posted by users. In this work, we propose a flexible time-enriched multimodal transformer architecture for detecting depression from social media posts, using pretrained models to extract image and text embeddings. Our model operates directly at the user level, and we enrich it with the relative time between posts by using time2vec positional embeddings. Moreover, we propose another model variant, which can operate on randomly sampled and unordered sets of posts to be more robust to dataset noise. We show that our method, using EmoBERTa and CLIP embeddings, surpasses other methods on two multimodal datasets, obtaining state-of-the-art results of a 0.931 F1 score on a popular multimodal Twitter dataset and a 0.902 F1 score on the only multimodal Reddit dataset.
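The time2vec component mentioned above can be sketched as a small layer; the dimensions and wiring below are illustrative assumptions, not the authors' implementation:

```python
# Sketch of a time2vec layer: one linear (trend) component plus periodic
# sine components, applied to relative times between posts.
import torch
import torch.nn as nn

class Time2Vec(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w0 = nn.Parameter(torch.randn(1))       # linear component
        self.b0 = nn.Parameter(torch.randn(1))
        self.w = nn.Parameter(torch.randn(dim - 1))  # periodic components
        self.b = nn.Parameter(torch.randn(dim - 1))

    def forward(self, t: torch.Tensor) -> torch.Tensor:
        # t: (batch, num_posts) relative times -> (batch, num_posts, dim)
        t = t.unsqueeze(-1)
        linear = self.w0 * t + self.b0
        periodic = torch.sin(self.w * t + self.b)
        return torch.cat([linear, periodic], dim=-1)

# Hours elapsed between a user's posts; the embedding would be added to the
# post representations as a positional signal.
emb = Time2Vec(dim=64)(torch.tensor([[0.0, 3.5, 24.0]]))
print(emb.shape)  # torch.Size([1, 3, 64])
```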
{"title":"It's Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers","authors":"Ana-Maria Bucur, Adrian Cosma, Paolo Rosso, Liviu P. Dinu","doi":"10.48550/arXiv.2301.05453","DOIUrl":"https://doi.org/10.48550/arXiv.2301.05453","url":null,"abstract":"Depression detection from user-generated content on the internet has been a long-lasting topic of interest in the research community, providing valuable screening tools for psychologists. The ubiquitous use of social media platforms lays out the perfect avenue for exploring mental health manifestations in posts and interactions with other users. Current methods for depression detection from social media mainly focus on text processing, and only a few also utilize images posted by users. In this work, we propose a flexible time-enriched multimodal transformer architecture for detecting depression from social media posts, using pretrained models for extracting image and text embeddings. Our model operates directly at the user-level, and we enrich it with the relative time between posts by using time2vec positional embeddings. Moreover, we propose another model variant, which can operate on randomly sampled and unordered sets of posts to be more robust to dataset noise. We show that our method, using EmoBERTa and CLIP embeddings, surpasses other methods on two multimodal datasets, obtaining state-of-the-art results of 0.931 F1 score on a popular multimodal Twitter dataset, and 0.902 F1 score on the only multimodal Reddit dataset.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130833961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multilingual Detection of Check-Worthy Claims using World Languages and Adapter Fusion
Pub Date: 2023-01-13 | DOI: 10.48550/arXiv.2301.05494
I. Baris Schlicht, Lucie Flek, Paolo Rosso
Check-worthiness detection is the task of identifying claims worthy of investigation by fact-checkers. Resource scarcity for non-world languages and model learning costs remain major challenges for the creation of models supporting multilingual check-worthiness detection. This paper proposes cross-training adapters on a subset of world languages, combined by adapter fusion, to detect claims emerging globally in multiple languages. (1) With a vast number of annotators available for world languages and the storage-efficient adapter models, this approach is more cost-efficient: models can be updated more frequently and thus stay up-to-date. (2) Adapter fusion provides insights and allows for interpretation regarding the influence of each adapter model on a particular language. The proposed solution often outperformed the top multilingual approaches in our benchmark tasks.
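A rough sketch of an adapter-fusion setup, assuming the HuggingFace `adapters` library (the concrete languages, base model, and head are illustrative; the paper's exact configuration may differ):

```python
# Sketch: one adapter per world language, combined via adapter fusion for
# binary check-worthiness classification.
from adapters import AutoAdapterModel
import adapters.composition as ac

model = AutoAdapterModel.from_pretrained("bert-base-multilingual-cased")
for lang in ["en", "ar", "es", "tr"]:   # illustrative language choice
    model.add_adapter(lang)             # one task adapter per language

fusion = ac.Fuse("en", "ar", "es", "tr")
model.add_adapter_fusion(fusion)        # fusion layer learns to weight adapters
model.add_classification_head("checkworthy", num_labels=2)
model.train_adapter_fusion(fusion)      # freeze base model and adapters,
                                        # train only the fusion weights
```

The learned fusion attention can then be inspected per language, which is the kind of interpretability the abstract's point (2) refers to.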
{"title":"Multilingual Detection of Check-Worthy Claims using World Languages and Adapter Fusion","authors":"I. Baris Schlicht, Lucie Flek, Paolo Rosso","doi":"10.48550/arXiv.2301.05494","DOIUrl":"https://doi.org/10.48550/arXiv.2301.05494","url":null,"abstract":"Check-worthiness detection is the task of identifying claims, worthy to be investigated by fact-checkers. Resource scarcity for non-world languages and model learning costs remain major challenges for the creation of models supporting multilingual check-worthiness detection. This paper proposes cross-training adapters on a subset of world languages, combined by adapter fusion, to detect claims emerging globally in multiple languages. (1) With a vast number of annotators available for world languages and the storage-efficient adapter models, this approach is more cost efficient. Models can be updated more frequently and thus stay up-to-date. (2) Adapter fusion provides insights and allows for interpretation regarding the influence of each adapter model on a particular language. The proposed solution often outperformed the top multilingual approaches in our benchmark tasks.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128927975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Do the Findings of Document and Passage Retrieval Generalize to the Retrieval of Responses for Dialogues?
Pub Date: 2023-01-13 | DOI: 10.48550/arXiv.2301.05508
Gustavo Penha, C. Hauff
A number of learned sparse and dense retrieval approaches have recently been proposed and proven effective in tasks such as passage retrieval and document retrieval. In this paper we conduct a replicability study to analyze whether the lessons learned generalize to the retrieval of responses for dialogues, an important task for the increasingly popular field of conversational search. Unlike passage and document retrieval, where documents are usually longer than queries, in response ranking for dialogues the queries (dialogue contexts) are often longer than the documents (responses). Additionally, dialogues have a particular structure, i.e., multiple utterances by different users. With these differences in mind, we evaluate how generalizable the following major findings from previous work are: (F1) query expansion outperforms a no-expansion baseline; (F2) document expansion outperforms a no-expansion baseline; (F3) zero-shot dense retrieval underperforms sparse baselines; (F4) dense retrieval outperforms sparse baselines; (F5) hard negative sampling is better than random sampling for training dense models. Our experiments, based on three different information-seeking dialogue datasets, reveal that four out of five findings (F2-F5) generalize to our domain.
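The structural difference the abstract highlights, a long multi-utterance query against short response "documents", can be made concrete with a tiny sparse-retrieval example; rank_bm25 stands in here for the paper's sparse baselines and the dialogue is invented:

```python
# Sketch: the "query" is a concatenated multi-utterance dialogue context,
# typically longer than the candidate responses being ranked.
import re
from rank_bm25 import BM25Okapi

def tokenize(text):
    return re.findall(r"\w+", text.lower())

dialogue_context = [
    "I want to learn about dense retrieval.",
    "Do you mean neural retrieval models?",
    "Yes, especially bi-encoders.",
]
responses = [
    "Bi-encoders embed queries and documents separately for fast search.",
    "The weather today is sunny and warm.",
]

# Concatenate all utterances into one (long) lexical query.
query_tokens = tokenize(" ".join(dialogue_context))
bm25 = BM25Okapi([tokenize(r) for r in responses])
print(bm25.get_scores(query_tokens))  # the on-topic response scores higher
```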
{"title":"Do the Findings of Document and Passage Retrieval Generalize to the Retrieval of Responses for Dialogues?","authors":"Gustavo Penha, C. Hauff","doi":"10.48550/arXiv.2301.05508","DOIUrl":"https://doi.org/10.48550/arXiv.2301.05508","url":null,"abstract":"A number of learned sparse and dense retrieval approaches have recently been proposed and proven effective in tasks such as passage retrieval and document retrieval. In this paper we analyze with a replicability study if the lessons learned generalize to the retrieval of responses for dialogues, an important task for the increasingly popular field of conversational search. Unlike passage and document retrieval where documents are usually longer than queries, in response ranking for dialogues the queries (dialogue contexts) are often longer than the documents (responses). Additionally, dialogues have a particular structure, i.e. multiple utterances by different users. With these differences in mind, we here evaluate how generalizable the following major findings from previous works are: (F1) query expansion outperforms a no-expansion baseline; (F2) document expansion outperforms a no-expansion baseline; (F3) zero-shot dense retrieval underperforms sparse baselines; (F4) dense retrieval outperforms sparse baselines; (F5) hard negative sampling is better than random sampling for training dense models. Our experiments -- based on three different information-seeking dialogue datasets -- reveal that four out of five findings (F2-F5) generalize to our domain","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130169842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multimodal Inverse Cloze Task for Knowledge-based Visual Question Answering
Pub Date: 2023-01-11 | DOI: 10.48550/arXiv.2301.04366
Paul Lerner, O. Ferret, C. Guinaudeau
We present a new pre-training method, the Multimodal Inverse Cloze Task, for Knowledge-based Visual Question Answering about named Entities (KVQAE). KVQAE is a recently introduced task that consists in answering questions about named entities grounded in a visual context using a Knowledge Base. The interaction between the modalities is therefore paramount for retrieving information and must be captured with complex fusion models. As these models require a lot of training data, we design this pre-training task based on existing work in textual Question Answering: a sentence is treated as a pseudo-question and its context as a pseudo-relevant passage, and we extend this by considering images near texts in multimodal documents. Our method is applicable to different neural network architectures and leads to a 9% relative MRR gain in retrieval and a 15% relative F1 gain in reading comprehension over a no-pre-training baseline.
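Inverse Cloze Task pair construction can be sketched in a few lines; the keep-probability trick is standard for textual ICT, and the `image` field is an illustrative stand-in for the multimodal extension (pairing a nearby image with the pseudo-relevant passage):

```python
# Sketch: build (pseudo-question, pseudo-passage, image) training triples
# from the sentences of a document.
import random

def ict_pairs(sentences, image=None, keep_prob=0.1):
    pairs = []
    for i, sentence in enumerate(sentences):
        # Occasionally keep the pseudo-question inside its context so the
        # model cannot rely on pure lexical exclusion.
        if random.random() < keep_prob:
            context = sentences
        else:
            context = sentences[:i] + sentences[i + 1:]
        pairs.append({"pseudo_question": sentence,
                      "pseudo_passage": " ".join(context),
                      "image": image})
    return pairs

doc = ["The Eiffel Tower is in Paris.",
       "It was completed in 1889.",
       "Gustave Eiffel's company designed it."]
print(ict_pairs(doc, image="eiffel_tower.jpg")[0])
```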
{"title":"Multimodal Inverse Cloze Task for Knowledge-based Visual Question Answering","authors":"Paul Lerner, O. Ferret, C. Guinaudeau","doi":"10.48550/arXiv.2301.04366","DOIUrl":"https://doi.org/10.48550/arXiv.2301.04366","url":null,"abstract":"We present a new pre-training method, Multimodal Inverse Cloze Task, for Knowledge-based Visual Question Answering about named Entities (KVQAE). KVQAE is a recently introduced task that consists in answering questions about named entities grounded in a visual context using a Knowledge Base. Therefore, the interaction between the modalities is paramount to retrieve information and must be captured with complex fusion models. As these models require a lot of training data, we design this pre-training task from existing work in textual Question Answering. It consists in considering a sentence as a pseudo-question and its context as a pseudo-relevant passage and is extended by considering images near texts in multimodal documents. Our method is applicable to different neural network architectures and leads to a 9% relative-MRR and 15% relative-F1 gain for retrieval and reading comprehension, respectively, over a no-pre-training baseline.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114365908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Topics in Contextualised Attention Embeddings
Pub Date: 2023-01-11 | DOI: 10.48550/arXiv.2301.04339
Mozhgan Talebpour, A. G. S. D. Herrera, Shoaib Jameel
Contextualised word vectors obtained via pre-trained language models encode a variety of knowledge that has already been exploited in applications. Complementary to these language models are probabilistic topic models that learn thematic patterns from text. Recent work has demonstrated that clustering the word-level contextual representations from a language model emulates the word clusters discovered in the latent topics of Latent Dirichlet Allocation. The important question is how such topical word clusters are automatically formed, through clustering, in a language model that has not been explicitly designed to model latent topics. To address this question, we design different probe experiments. Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters. We strongly believe that our work paves the way for further research into the relationships between probabilistic topic models and pre-trained language models.
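A minimal version of such a probe, clustering contextualised token vectors and inspecting the resulting word groups (the model, documents, and cluster count are arbitrary illustration choices, not the paper's protocol):

```python
# Sketch: cluster BERT token embeddings and inspect whether the clusters
# resemble topical word groups.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.cluster import KMeans

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

docs = ["the bank raised interest rates on loans",
        "the river bank was muddy after the rain"]
words, vecs = [], []
for doc in docs:
    enc = tok(doc, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (seq_len, hidden_size)
    for token_id, vec in zip(enc["input_ids"][0], hidden):
        token = tok.convert_ids_to_tokens(token_id.item())
        if token not in ("[CLS]", "[SEP]"):
            words.append(token)
            vecs.append(vec.numpy())

labels = KMeans(n_clusters=2, n_init=10).fit_predict(vecs)
for cluster in sorted(set(labels)):
    print(cluster, [w for w, l in zip(words, labels) if l == cluster])
```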
{"title":"Topics in Contextualised Attention Embeddings","authors":"Mozhgan Talebpour, A. G. S. D. Herrera, Shoaib Jameel","doi":"10.48550/arXiv.2301.04339","DOIUrl":"https://doi.org/10.48550/arXiv.2301.04339","url":null,"abstract":"Contextualised word vectors obtained via pre-trained language models encode a variety of knowledge that has already been exploited in applications. Complementary to these language models are probabilistic topic models that learn thematic patterns from the text. Recent work has demonstrated that conducting clustering on the word-level contextual representations from a language model emulates word clusters that are discovered in latent topics of words from Latent Dirichlet Allocation. The important question is how such topical word clusters are automatically formed, through clustering, in the language model when it has not been explicitly designed to model latent topics. To address this question, we design different probe experiments. Using BERT and DistilBERT, we find that the attention framework plays a key role in modelling such word topic clusters. We strongly believe that our work paves way for further research into the relationships between probabilistic topic models and pre-trained language models.","PeriodicalId":126309,"journal":{"name":"European Conference on Information Retrieval","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133533480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}