
Proceedings of the 2015 International Conference on The Theory of Information Retrieval: Latest Papers

Optimal Packing in Simple-Family Codecs
A. Trotman, Michael H. Albert, Blake Burgess
The Simple family of codecs is popular for encoding postings lists for a search engine because these codecs are both space-effective and time-efficient at decoding. These algorithms pack as many integers into a codeword as possible before moving on to the next codeword. This technique is known as left-greedy. This contribution proves that left-greedy is not optimal and then introduces a dynamic programming solution to find the optimal packing. Experiments on .gov2 and INEX Wikipedia 2009 show that although this is an interesting theoretical result, left-greedy is empirically near optimal in effectiveness and efficiency.
DOI: https://doi.org/10.1145/2808194.2809483
Published: 2015-09-27
Citations: 2
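The left-greedy and dynamic-programming packings described in the abstract of "Optimal Packing in Simple-Family Codecs" can be illustrated with a small sketch. It assumes the Simple-9 selector table (28 payload bits per codeword) and, for simplicity, allows a selector only when that many integers remain (no padding of the last codeword). The function names are illustrative, and the dynamic program here is a generic fewest-codewords formulation, not necessarily the one in the paper.

```python
# Simple-9 selectors as (integers per codeword, bits per integer); 28 payload bits each.
SIMPLE9 = [(28, 1), (14, 2), (9, 3), (7, 4), (5, 5), (4, 7), (3, 9), (2, 14), (1, 28)]


def fits(values, start, count, bits):
    """True if `count` integers starting at `start` exist and each fits in `bits` bits."""
    if start + count > len(values):
        return False  # assumption: no padding of a partially filled codeword
    return all(v.bit_length() <= bits for v in values[start:start + count])


def left_greedy(values):
    """Pack as many integers as possible into each codeword, left to right."""
    layout, i = [], 0
    while i < len(values):
        for count, bits in SIMPLE9:  # ordered from most to fewest integers
            if fits(values, i, count, bits):
                layout.append(count)
                i += count
                break
        else:
            raise ValueError("value does not fit in 28 bits")
    return layout


def optimal(values):
    """Fewest codewords via dynamic programming over suffixes of the list."""
    n = len(values)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[i] = fewest codewords needed for values[i:]
    choice = [0] * (n + 1)  # choice[i] = integers placed in the codeword at i
    best[n] = 0
    for i in range(n - 1, -1, -1):
        for count, bits in SIMPLE9:
            if fits(values, i, count, bits) and 1 + best[i + count] < best[i]:
                best[i] = 1 + best[i + count]
                choice[i] = count
    if best[0] == inf:
        raise ValueError("value does not fit in 28 bits")
    layout, i = [], 0
    while i < n:
        layout.append(choice[i])
        i += choice[i]
    return layout


# One 14-bit value, nine 3-bit values, then one 28-bit value.
values = [1 << 13] + [5] * 9 + [1 << 27]
greedy_layout = left_greedy(values)
optimal_layout = optimal(values)
```

On this input, left-greedy puts the first two integers into one codeword and ends up with four codewords (`[2, 7, 1, 1]`), while the dynamic program isolates the first integer so that the nine 3-bit values share a codeword, giving three (`[1, 9, 1]`).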
Dynamic Information Retrieval: Theoretical Framework and Application
Marc Sloan, Jun Wang
Theoretical frameworks like the Probability Ranking Principle and its more recent Interactive Information Retrieval variant have guided the development of ranking and retrieval algorithms for decades, yet they are not capable of helping us model problems in Dynamic Information Retrieval, which exhibit the following three properties: an observable user signal, retrieval over multiple stages, and an overall search intent. In this paper, a new theoretical framework for retrieval in these scenarios is proposed. We derive a general dynamic utility function for optimizing over these types of tasks that takes into account the utility of each stage and the probability of observing user feedback. We apply our framework to experiments over TREC data in the dynamic multi-page search scenario as a practical demonstration of its effectiveness, to frame the discussion of its use and limitations, and to compare it against the existing frameworks.
DOI: https://doi.org/10.1145/2808194.2809457
Published: 2015-09-27
Citations: 16
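The abstract of "Dynamic Information Retrieval: Theoretical Framework and Application" mentions a utility function that combines the utility of each stage with the probability of observing user feedback. A minimal two-stage sketch of that idea follows. The function name, the two-stage restriction, and the numbers are assumptions made for illustration; the paper's actual utility function is not given in this listing.

```python
def dynamic_utility(stage1_utility, feedback):
    """Expected utility of a two-stage search.

    `feedback` is a list of (probability of observing a user signal,
    utility of the second-stage ranking shown after that signal).
    """
    total_probability = sum(p for p, _ in feedback)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("feedback probabilities must sum to 1")
    # Stage-1 utility plus the stage-2 utility averaged over possible signals.
    return stage1_utility + sum(p * u for p, u in feedback)


# Hypothetical numbers: a click is observed with probability 0.7.
expected = dynamic_utility(0.5, [(0.7, 0.9), (0.3, 0.4)])
```

Under these made-up numbers the expected utility is 0.5 + 0.7 × 0.9 + 0.3 × 0.4 = 1.25. A first-stage ranking with lower immediate utility can still be preferable if it makes informative feedback more likely.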
Building a Self-Contained Search Engine in the Browser
Jimmy J. Lin
JavaScript engines inside modern web browsers are capable of running sophisticated multi-player games, rendering impressive 3D scenes, and supporting complex, interactive visualizations. Can this processing power be harnessed for information retrieval? This paper explores the feasibility of building a JavaScript search engine that runs completely self-contained on the client side within the browser. This includes building the inverted index, gathering term statistics for scoring, and performing query evaluation. The design takes advantage of the IndexedDB API, which is implemented by the LevelDB key-value store inside Google's Chrome browser. Experiments show that although the performance of the JavaScript prototype falls far short of the open-source Lucene search engine, it is sufficiently responsive for interactive applications. This feasibility demonstration opens the door to interesting applications and architectures.
DOI: https://doi.org/10.1145/2808194.2809478
Published: 2015-09-27
Citations: 8
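The three steps named in the abstract of "Building a Self-Contained Search Engine in the Browser" (building the inverted index, gathering term statistics, evaluating queries) can be sketched in a few lines. This sketch is in Python rather than JavaScript, keeps the index in memory instead of a key-value store, and scores with a plain TF-IDF sum; all three are simplifying assumptions, not the paper's design.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def search(index, num_docs, query):
    """Rank documents by a summed TF-IDF score (assumed scoring function)."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term not in the collection
        idf = math.log(num_docs / len(postings))  # document frequency = postings length
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {
    1: "search engine in the browser",
    2: "browser games",
    3: "inverted index for a search engine",
}
index = build_index(docs)
results = search(index, len(docs), "inverted index")
```

In a browser setting, the `index` dictionary would be replaced by reads and writes against a persistent key-value store, with terms as keys and postings as values.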
An Analysis of Theories of Search and Search Behavior
L. Azzopardi, G. Zuccon
Theories of search and search behavior can be used to glean insights and generate hypotheses about how people interact with retrieval systems. This paper examines three such theories: the long-standing Information Foraging Theory, along with the more recently proposed Search Economic Theory and the Interactive Probability Ranking Principle. Our goal is to develop a model for ad-hoc topic retrieval using each approach, all within a common framework, in order to (1) determine what predictions each approach makes about search behavior, and (2) show the relationships, equivalences and differences between the approaches. While each approach takes a different perspective on modeling searcher interactions, we show that under certain assumptions, they lead to similar hypotheses regarding search behavior. Moreover, we show that the models are complementary to each other, but operate at different levels (i.e., sessions, patches and situations). We further show how the differences between the approaches lead to new insights into the theories and new models. This contribution will not only lead to further theoretical developments, but also enable practitioners to employ one of the three equivalent models depending on the data available.
DOI: https://doi.org/10.1145/2808194.2809447
Published: 2015-09-27
Citations: 22
Embedded Representations of Lexical and Knowledge-Base Semantics
A. McCallum
Bio: Andrew McCallum is a Professor, Director of the Center for Data Science, and Director of the Information Extraction and Synthesis Laboratory in the College of Information and Computer Sciences at the University of Massachusetts Amherst. He has published over 250 papers in many areas of AI, including natural language processing, machine learning, data mining and reinforcement learning, and his work has received over 40,000 citations.
DOI: https://doi.org/10.1145/2808194.2808195
Published: 2015-09-27
Citations: 1
Learning Asymmetric Co-Relevance
Fiana Raiber, Oren Kurland, Filip Radlinski, Milad Shokouhi
Several applications in information retrieval rely on asymmetric co-relevance estimation; that is, estimating the relevance of a document to a query under the assumption that another document is relevant. We present a supervised model for learning an asymmetric co-relevance estimate. The model uses different types of similarities with the assumed relevant document and the query, as well as document-quality measures. Empirical evaluation demonstrates the merits of using the co-relevance estimate in various applications, including cluster-based and graph-based document retrieval. Specifically, the resultant performance transcends that of using a wide variety of alternative estimates, mostly symmetric inter-document similarity measures that dominate past work.
DOI: https://doi.org/10.1145/2808194.2809454
Published: 2015-09-27
Citations: 8
Entity Linking in Queries: Tasks and Evaluation
Faegheh Hasibi, K. Balog, Svein Erik Bratsberg
Annotating queries with entities is one of the core problem areas in query understanding. While seeming similar, the task of entity linking in queries is different from entity linking in documents and requires a methodological departure due to the inherent ambiguity of queries. We differentiate between two specific tasks, semantic mapping and interpretation finding, discuss current evaluation methodology, and propose refinements. We examine publicly available datasets for these tasks and introduce a new manually curated dataset for interpretation finding. To further deepen the understanding of task differences, we present a set of approaches for effectively addressing these tasks and report on experimental results.
DOI: https://doi.org/10.1145/2808194.2809473
Published: 2015-09-27
Citations: 55
Using Part-of-Speech N-grams for Sensitive-Text Classification
G. Mcdonald, C. Macdonald, I. Ounis
Freedom of Information legislation in many Western democracies, including the United Kingdom (UK) and the United States of America (USA), states that citizens typically have the right to access government documents. However, certain sensitive information is exempt from release into the public domain. For example, in the UK, FOIA Exemption 27 (International Relations) excludes the release of information that might damage the interests of the UK abroad. Therefore, the process of reviewing government documents for sensitivity is essential to determine if a document must be redacted before it is archived, or closed until the information is no longer sensitive. With the increased volume of digital government documents in recent years, there is a need for new tools to assist the digital sensitivity review process. Therefore, in this paper we propose an automatic approach for identifying sensitive text in documents by measuring the amount of sensitivity in sequences of text. Using government documents reviewed by trained sensitivity reviewers, we focus on an aspect of FOIA Exemption 27 which can have a major impact on international relations, namely, information supplied in confidence. We show that our approach leads to markedly increased recall of sensitive text, while achieving a very high level of precision, when compared to a baseline that has been shown to be effective at identifying sensitive text in other domains.
DOI: https://doi.org/10.1145/2808194.2809496
Published: 2015-09-27
Citations: 18
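The title of "Using Part-of-Speech N-grams for Sensitive-Text Classification" refers to n-grams over part-of-speech tags. As a rough illustration of that feature type only, the sketch below counts tag n-grams in a sentence that is assumed to be tagged already. The function name, the Penn Treebank-style tags, and the example sentence are hypothetical; how the paper turns such sequences into a sensitivity measure is not described in this listing.

```python
from collections import Counter


def pos_ngrams(tagged_tokens, n):
    """Count the n-grams of part-of-speech tags in one tagged sentence."""
    tags = [tag for _, tag in tagged_tokens]
    # Slide a window of length n over the tag sequence.
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


# Hypothetical pre-tagged sentence: (word, tag) pairs.
sentence = [("the", "DT"), ("ambassador", "NN"), ("said", "VBD"),
            ("in", "IN"), ("confidence", "NN")]
features = pos_ngrams(sentence, 2)
```

Counts like these could serve as features for any standard text classifier; the tags abstract away from specific words, so the features describe how something is phrased rather than which entities are named.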
Bayesian Inference for Information Retrieval Evaluation
Ben Carterette
A key component of experimentation in IR is statistical hypothesis testing, which researchers and developers use to make inferences about the effectiveness of their system relative to others. A statistical hypothesis test can tell us the likelihood that small mean differences in effectiveness (on the order of 5%, say) are due to randomness or measurement error, and thus is critical for making progress in research. But the tests typically used in IR (the t-test, the Wilcoxon signed-rank test) are very general, not developed specifically for the problems we face in information retrieval evaluation. A better approach would take advantage of the fact that the atomic unit of measurement in IR is the relevance judgment rather than the effectiveness measure, and develop tests that model relevance directly. In this work we present such an approach, showing theoretically that modeling relevance in this way naturally gives rise to the effectiveness measures we care about. We demonstrate the usefulness of our model on both simulated data and a diverse set of runs from various TREC tracks.
DOI: https://doi.org/10.1145/2808194.2809469
Published: 2015-09-27
Citations: 23
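The idea in the abstract of "Bayesian Inference for Information Retrieval Evaluation", modeling relevance judgments directly rather than summary effectiveness scores, can be illustrated with a deliberately simplified Beta-Bernoulli model. This is a stand-in chosen for brevity and is not the paper's model: it treats each retrieved document's relevance as an independent coin flip per system, uses a uniform Beta(1, 1) prior, and ignores topics and rank positions.

```python
import random


def prob_a_better(rel_a, rel_b, samples=20000, seed=0):
    """Posterior probability that system A retrieves relevant documents
    at a higher rate than system B, given binary relevance judgments."""
    rng = random.Random(seed)
    # Beta(1, 1) prior updated with counts of relevant / non-relevant judgments.
    post_a = (1 + sum(rel_a), 1 + len(rel_a) - sum(rel_a))
    post_b = (1 + sum(rel_b), 1 + len(rel_b) - sum(rel_b))
    wins = sum(
        rng.betavariate(*post_a) > rng.betavariate(*post_b)
        for _ in range(samples)
    )
    return wins / samples


# Hypothetical judgments for the top 10 documents of two systems (1 = relevant).
system_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
system_b = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
p = prob_a_better(system_a, system_b)
```

The posterior mean of each rate tracks precision at the judged depth, which loosely mirrors the abstract's point that modeling relevance can give rise to familiar effectiveness measures. The result is a direct probability statement about the two systems rather than a p-value.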
Dynamic Test Collections for Retrieval Evaluation
Ben Carterette, Ashraf Bah Rabiou, M. Zengin
Batch evaluation with test collections of documents, search topics, and relevance judgments has been the bedrock of IR evaluation since its adoption by Salton for his experiments on vector space systems. Such test collections have limitations: they contain no user interaction data; there is typically only one query per topic; they have limited size due to the cost of constructing them. In the last 15-20 years, it has become evident that having a log of user interactions and a large space of queries is invaluable for building effective retrieval systems, but such data is generally only available to search engine companies. Thus there is a gap between what academics can study using static test collections and what industrial researchers can study using dynamic user data. In this work we propose dynamic test collections to help bridge this gap. Like traditional test collections, a dynamic test collection consists of a set of topics and relevance judgments. But instead of static one-time queries, dynamic test collections generate queries in response to the system. They can generate other actions such as clicks and time spent reading documents. Like static test collections, there is no human in the loop, but since the queries are dynamic they can generate much more data for evaluation than static test collections can. And since they can simulate user interactions across a session, they can be used for evaluating retrieval systems that make use of session history or other user information to try to improve results.
DOI: https://doi.org/10.1145/2808194.2809470
Published: 2015-09-27
Citations: 33
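The abstract of "Dynamic Test Collections for Retrieval Evaluation" describes collections that generate queries and clicks in response to the system, with no human in the loop. A toy simulation of one such session is sketched below. The topic structure (`initial_query`, `after_click`, `after_miss`), the click rule, and the stopping rule are all hypothetical simplifications, not the user model from the paper.

```python
def simulate_session(system, topic, max_queries=5, depth=3):
    """Simulate a user who clicks relevant results and reformulates
    the query depending on what the system returned."""
    log, found = [], set()
    query = topic["initial_query"]
    for _ in range(max_queries):
        ranking = system(query)[:depth]  # user inspects the top `depth` results
        clicks = [doc for doc in ranking if doc in topic["relevant"]]
        found.update(clicks)
        log.append((query, ranking, clicks))
        if found >= topic["relevant"]:  # every relevant document seen: stop
            break
        # The next query depends on the system's response to the last one.
        query = topic["after_click"] if clicks else topic["after_miss"]
    return log


# Toy retrieval system: a fixed ranking per query string.
rankings = {
    "jaguar": ["d1", "d2", "d3"],
    "jaguar car": ["d4", "d5"],
    "jaguar animal": ["d6", "d3"],
}


def toy_system(query):
    return rankings.get(query, [])


topic = {
    "initial_query": "jaguar",
    "relevant": {"d3", "d6"},
    "after_click": "jaguar animal",
    "after_miss": "jaguar car",
}
session = simulate_session(toy_system, topic)
```

Because the second query depends on the first ranking, two systems evaluated on the same topic can produce different sessions. A fixed list of one-time queries cannot capture that behavior.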