Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, Fine-tuning and Deploying Rerankers for RAG

Gabriel de Souza P. Moreira, Ronay Ak, Benedikt Schifferer, Mengyao Xu, Radek Osmulski, Even Oldridge
{"title":"利用排名模型加强问答文本检索:为 RAG 制定基准、微调和部署 Rerankers","authors":"Gabriel de Souza P. Moreira, Ronay Ak, Benedikt Schifferer, Mengyao Xu, Radek Osmulski, Even Oldridge","doi":"arxiv-2409.07691","DOIUrl":null,"url":null,"abstract":"Ranking models play a crucial role in enhancing overall accuracy of text\nretrieval systems. These multi-stage systems typically utilize either dense\nembedding models or sparse lexical indices to retrieve relevant passages based\non a given query, followed by ranking models that refine the ordering of the\ncandidate passages by its relevance to the query. This paper benchmarks various publicly available ranking models and examines\ntheir impact on ranking accuracy. We focus on text retrieval for\nquestion-answering tasks, a common use case for Retrieval-Augmented Generation\nsystems. Our evaluation benchmarks include models some of which are\ncommercially viable for industrial applications. We introduce a state-of-the-art ranking model, NV-RerankQA-Mistral-4B-v3,\nwhich achieves a significant accuracy increase of ~14% compared to pipelines\nwith other rerankers. We also provide an ablation study comparing the\nfine-tuning of ranking models with different sizes, losses and self-attention\nmechanisms. Finally, we discuss challenges of text retrieval pipelines with ranking\nmodels in real-world industry applications, in particular the trade-offs among\nmodel size, ranking accuracy and system requirements like indexing and serving\nlatency / throughput.","PeriodicalId":501281,"journal":{"name":"arXiv - CS - Information Retrieval","volume":"6 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG\",\"authors\":\"Gabriel de Souza P. Moreira, Ronay Ak, Benedikt Schifferer, Mengyao Xu, Radek Osmulski, Even Oldridge\",\"doi\":\"arxiv-2409.07691\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Ranking models play a crucial role in enhancing overall accuracy of text\\nretrieval systems. These multi-stage systems typically utilize either dense\\nembedding models or sparse lexical indices to retrieve relevant passages based\\non a given query, followed by ranking models that refine the ordering of the\\ncandidate passages by its relevance to the query. This paper benchmarks various publicly available ranking models and examines\\ntheir impact on ranking accuracy. We focus on text retrieval for\\nquestion-answering tasks, a common use case for Retrieval-Augmented Generation\\nsystems. Our evaluation benchmarks include models some of which are\\ncommercially viable for industrial applications. We introduce a state-of-the-art ranking model, NV-RerankQA-Mistral-4B-v3,\\nwhich achieves a significant accuracy increase of ~14% compared to pipelines\\nwith other rerankers. We also provide an ablation study comparing the\\nfine-tuning of ranking models with different sizes, losses and self-attention\\nmechanisms. 
Finally, we discuss challenges of text retrieval pipelines with ranking\\nmodels in real-world industry applications, in particular the trade-offs among\\nmodel size, ranking accuracy and system requirements like indexing and serving\\nlatency / throughput.\",\"PeriodicalId\":501281,\"journal\":{\"name\":\"arXiv - CS - Information Retrieval\",\"volume\":\"6 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Information Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07691\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07691","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Ranking models play a crucial role in enhancing the overall accuracy of text retrieval systems. These multi-stage systems typically use either dense embedding models or sparse lexical indices to retrieve candidate passages for a given query, followed by ranking models that refine the ordering of those candidates by their relevance to the query. This paper benchmarks various publicly available ranking models and examines their impact on ranking accuracy. We focus on text retrieval for question-answering tasks, a common use case for Retrieval-Augmented Generation systems. Our evaluation benchmarks include models, some of which are commercially viable for industrial applications. We introduce a state-of-the-art ranking model, NV-RerankQA-Mistral-4B-v3, which achieves a significant accuracy increase of ~14% compared to pipelines with other rerankers. We also provide an ablation study comparing the fine-tuning of ranking models of different sizes, with different losses and self-attention mechanisms. Finally, we discuss the challenges of text retrieval pipelines with ranking models in real-world industry applications, in particular the trade-offs among model size, ranking accuracy, and system requirements such as indexing and serving latency/throughput.
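The paper studies two-stage retrieval pipelines: a dense embedding model or sparse lexical index first retrieves candidate passages, and a reranking model then re-scores them against the query. The sketch below illustrates that general retrieve-then-rerank pattern using generic, publicly available sentence-transformers checkpoints; the model names, example passages, and top-k value are illustrative stand-ins, not the models or benchmarks evaluated in the paper.

```python
# Minimal retrieve-then-rerank sketch. The bi-encoder and cross-encoder
# checkpoints below are generic public models chosen for illustration;
# they are not the rerankers benchmarked in the paper.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

passages = [
    "Dense retrieval uses embedding similarity to find candidate passages.",
    "Cross-encoder rerankers score each query-passage pair jointly.",
    "Sparse lexical indices match queries and passages on overlapping terms.",
]
query = "How do rerankers improve retrieval accuracy?"

# Stage 1: candidate generation with a bi-encoder (dense retrieval).
bi_encoder = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
passage_emb = bi_encoder.encode(passages, convert_to_tensor=True)
query_emb = bi_encoder.encode(query, convert_to_tensor=True)
hits = util.semantic_search(query_emb, passage_emb, top_k=3)[0]

# Stage 2: rerank the candidates with a cross-encoder, which attends
# jointly over the query and each candidate passage.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pairs = [(query, passages[h["corpus_id"]]) for h in hits]
scores = reranker.predict(pairs)

# Sort candidates by the reranker's relevance scores.
for score, (_, passage) in sorted(zip(scores, pairs), reverse=True):
    print(f"{score:.3f}  {passage}")
```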
Enhancing Q&A Text Retrieval with Ranking Models: Benchmarking, fine-tuning and deploying Rerankers for RAG
Ranking models play a crucial role in enhancing overall accuracy of text retrieval systems. These multi-stage systems typically utilize either dense embedding models or sparse lexical indices to retrieve relevant passages based on a given query, followed by ranking models that refine the ordering of the candidate passages by its relevance to the query. This paper benchmarks various publicly available ranking models and examines their impact on ranking accuracy. We focus on text retrieval for question-answering tasks, a common use case for Retrieval-Augmented Generation systems. Our evaluation benchmarks include models some of which are commercially viable for industrial applications. We introduce a state-of-the-art ranking model, NV-RerankQA-Mistral-4B-v3, which achieves a significant accuracy increase of ~14% compared to pipelines with other rerankers. We also provide an ablation study comparing the fine-tuning of ranking models with different sizes, losses and self-attention mechanisms. Finally, we discuss challenges of text retrieval pipelines with ranking models in real-world industry applications, in particular the trade-offs among model size, ranking accuracy and system requirements like indexing and serving latency / throughput.
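The ablation study mentioned in the abstract compares fine-tuning rerankers of different sizes, with different losses and self-attention mechanisms. As a rough illustration of what fine-tuning a cross-encoder reranker on (query, passage, relevance) pairs looks like in general, and not the paper's actual recipe, here is a minimal sketch using the sentence-transformers CrossEncoder API; the base checkpoint, toy training pairs, and hyperparameters are placeholders.

```python
# Hedged sketch of cross-encoder reranker fine-tuning. The base checkpoint,
# toy training pairs, and hyperparameters are illustrative only; the paper's
# own setup (model sizes, losses, attention variants) is not reproduced here.
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

# Toy positive/negative query-passage pairs with binary relevance labels.
train_samples = [
    InputExample(texts=["what is a reranker?",
                        "A reranker re-scores candidate passages for a query."],
                 label=1.0),
    InputExample(texts=["what is a reranker?",
                        "The recipe calls for two cups of flour."],
                 label=0.0),
]
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=2)

# num_labels=1 yields a single relevance score per query-passage pair.
model = CrossEncoder("distilroberta-base", num_labels=1)
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=0)

# After fine-tuning, the model scores unseen query-passage pairs.
print(model.predict([["what is a reranker?",
                      "Rerankers refine the ordering produced by a retriever."]]))
```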