Title: Crafting the Path: Robust Query Rewriting for Information Retrieval
Authors: Ingeol Baek; Jimin Lee; Joonho Yang; Hwanhee Lee
Journal: IEEE Access, vol. 13, pp. 24171-24180 (Journal Article)
Published: 2025-02-04
DOI: 10.1109/ACCESS.2025.3538665
URL: https://ieeexplore.ieee.org/document/10870252/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10870252
Impact factor: 3.6
JCR: Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Region: 3 (Computer Science)
Citations: 0
Abstract
Query rewriting aims to generate a new query that complements the original query and thereby improves the information retrieval system. Recent query rewriting methods, such as query2doc, query2expand, and query2cot, rely on the internal knowledge of Large Language Models (LLMs) to generate a relevant passage that adds information to the query. However, the effectiveness of these methods may decline markedly when the required knowledge is not encoded in the model's parameters. In this paper, we propose Crafting The Path, a novel structured query rewriting method tailored for retrieval systems. Crafting The Path is a three-step process in which each step crafts query-related information needed to find the passages to be retrieved. Specifically, Crafting The Path begins with Query Concept Comprehension, proceeds to Query Type Identification, and finally conducts Expected Answer Extraction. Experimental results show that our method outperforms previous rewriting methods, especially in domains less familiar to LLMs. We demonstrate that our method is less dependent on the model's internal parametric knowledge and generates queries with fewer factual inaccuracies. Furthermore, we observe that Crafting The Path achieves superior performance in retrieval-augmented generation scenarios.
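The three-step process named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the prompts, so the templates in `STEP_PROMPTS` are invented for illustration, `llm` is a hypothetical callable that maps a prompt string to a completion string, and the choice to append the three step outputs to the original query in `rewrite_query` is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical prompt templates. The abstract names the three steps but does
# not specify the prompts, so these wordings are assumptions.
STEP_PROMPTS = {
    "concept": "Explain the key concepts in this query.\nQuery: {query}",
    "query_type": (
        "State what kind of information the query is asking for.\n"
        "Query: {query}\nConcepts: {concept}"
    ),
    "expected_answer": (
        "Describe what a passage answering the query should contain.\n"
        "Query: {query}\nConcepts: {concept}\nQuery type: {query_type}"
    ),
}


@dataclass
class CraftedPath:
    """Outputs of the three steps, in the order the abstract lists them."""
    concept: str          # Query Concept Comprehension
    query_type: str       # Query Type Identification
    expected_answer: str  # Expected Answer Extraction


def craft_the_path(query: str, llm: Callable[[str], str]) -> CraftedPath:
    """Run the three steps in order, feeding earlier outputs to later steps."""
    concept = llm(STEP_PROMPTS["concept"].format(query=query))
    query_type = llm(
        STEP_PROMPTS["query_type"].format(query=query, concept=concept)
    )
    expected_answer = llm(
        STEP_PROMPTS["expected_answer"].format(
            query=query, concept=concept, query_type=query_type
        )
    )
    return CraftedPath(concept, query_type, expected_answer)


def rewrite_query(query: str, path: CraftedPath) -> str:
    """Assumption: the rewritten query is the original plus the step outputs."""
    return " ".join([query, path.concept, path.query_type, path.expected_answer])
```

Because `llm` is just a callable, a stub function can stand in for a real model when testing the flow, and the string returned by `rewrite_query` can be passed to whichever retriever is in use.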
IEEE Access | COMPUTER SCIENCE, INFORMATION SYSTEMS | ENGINEERING, ELECTRICAL & ELECTRONIC
CiteScore: 9.80
Self-citation rate: 7.70%
Articles published: 6673
Review time: 6 weeks
About the journal:
IEEE Access® is a multidisciplinary, open access (OA), applications-oriented, all-electronic archival journal that continuously presents the results of original research or development across all of IEEE's fields of interest.
IEEE Access will publish articles that are of high interest to readers, original, technically correct, and clearly presented. Supported by author publication charges (APC), its hallmarks are a rapid peer review and publication process with open access to all readers. Unlike IEEE's traditional Transactions or Journals, reviews are "binary", in that reviewers will either Accept or Reject an article in the form it is submitted in order to achieve rapid turnaround. Especially encouraged are submissions on:
Multidisciplinary topics, or applications-oriented articles and negative results that do not fit within the scope of IEEE's traditional journals.
Practical articles discussing new experiments or measurement techniques, or interesting solutions to engineering problems.
Development of new or improved fabrication or manufacturing techniques.
Reviews or survey articles of new or evolving fields oriented to assist others in understanding the new area.