The Probabilistic Relevance Framework: BM25 and Beyond

Foundations and Trends in Information Retrieval, pp. 333-389. Pub Date: 2009-04-01. DOI: 10.1561/1500000019
S. Robertson, H. Zaragoza
Impact factor: 8.3. CAS Zone 2 (Computer Science). JCR Q1 (Computer Science, Information Systems).
Citations: 2328

Abstract

The Probabilistic Relevance Framework (PRF) is a formal framework for document retrieval, grounded in work done in the 1970s and 1980s, which led to the development of one of the most successful text-retrieval algorithms, BM25. In recent years, research in the PRF has yielded new retrieval models capable of taking into account document meta-data (especially structure and link-graph information). Again, this has led to one of the most successful Web-search and corporate-search algorithms, BM25F. This work presents the PRF from a conceptual point of view, describing the probabilistic modelling assumptions behind the framework and the different ranking algorithms that result from its application: the binary independence model, relevance feedback models, BM25 and BM25F. It also discusses the relation between the PRF and other statistical models for IR, and covers some related topics, such as the use of non-textual features and parameter optimisation for models with free parameters.
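For orientation, the BM25 ranking function named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `bm25_scores`, the whitespace-free pre-tokenised input, and the defaults `k1=1.2` and `b=0.75` are assumptions chosen for the example (they are commonly used values, not taken from this abstract). The sketch uses the widely used variant with a `(k1 + 1)` factor in the numerator and an IDF term computed without relevance information, which can go negative for terms that occur in more than half of the documents.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenised document in `docs` against the tokenised `query`.

    Hypothetical helper for illustration; `k1` controls term-frequency
    saturation and `b` controls document-length normalisation.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        # Soft length normalisation: 1.0 for a document of average length.
        length_norm = (1 - b) + b * len(d) / avg_len
        score = 0.0
        for term in set(query):
            if tf[term] == 0:
                continue
            # IDF-like weight in the absence of relevance information.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term-frequency component.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


docs = [["bm25", "ranking"], ["web", "search"], ["link", "graph"]]
scores = bm25_scores(["bm25"], docs)
```

Only the first document contains the query term, so it is the only one with a non-zero score. BM25F, also mentioned in the abstract, extends this idea to documents with multiple weighted fields and is not covered by this sketch.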
Source journal: Foundations and Trends in Information Retrieval (Computer Science, Information Systems)
CiteScore: 39.10
Self-citation rate: 0.00%
Articles published: 3
Journal description: Research output across all domains has surged over the past decade, and the volume of published work has grown with it. Keeping up with this literature has become time-consuming. Electronic publishing gives instant access to more articles than ever, but identifying the ones essential for a thorough understanding of a topic remains difficult. Foundations and Trends® in Information Retrieval (FnTIR) addresses this by publishing high-quality survey and tutorial monographs in the field. Each issue features a 50-100 page monograph authored by research leaders, covering tutorial subjects, research retrospectives, and survey papers that provide state-of-the-art reviews within the scope of the journal.