Enhancing Large Language Models with Retrieval-augmented Generation: A Radiology-specific Approach
Authors: Dane A Weinert, Andreas M Rauschecker
Journal: Radiology: Artificial Intelligence, e240313 (Journal Article)
Publication date: 2025-03-12
DOI: 10.1148/ryai.240313 (https://doi.org/10.1148/ryai.240313)
Abstract
"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence. This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content.

Retrieval-augmented generation (RAG) is a strategy to improve the performance of large language models (LLMs) by providing the LLM with an updated corpus of knowledge that can be used for answer generation in real time. RAG may improve LLM performance and clinical applicability in radiology by providing citable, up-to-date information without requiring model fine-tuning. In this retrospective study, a radiology-specific RAG system was developed using a vector database of 3,689 RadioGraphics articles published from January 1999 to December 2023. The performance of 5 LLMs with and without RAG on a 192-question radiology examination was compared. RAG significantly improved examination scores for GPT-4 (81.2% versus 75.5%, P = .04) and Command R+ (70.3% versus 62.0%, P = .02), but not for Claude Opus, Mixtral, or Gemini 1.5 Pro. The RAG system performed significantly better than pure LLMs on a 24-question subset directly sourced from RadioGraphics (85% versus 76%, P = .03). The RAG system retrieved 21/24 (87.5%, P < .001) relevant RadioGraphics references cited in the examination's answer explanations and successfully cited them in 18/21 (85.7%, P < .001) outputs. The results suggest that RAG is a promising approach to enhance LLM capabilities for radiology knowledge tasks, providing transparent, domain-specific information retrieval. ©RSNA, 2025.
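The retrieve-then-generate idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the abstract does not specify the embedding model, the vector database, or the prompt format, so a bag-of-words cosine similarity stands in for the vector search. The function names (`retrieve`, `build_prompt`), the document ids, and the three toy passages are all hypothetical and are not taken from RadioGraphics.

```python
import math
import re
from collections import Counter

# Toy stand-in corpus. The ids and passages are invented for illustration;
# the study indexed 3,689 RadioGraphics articles in a vector database.
CORPUS = {
    "doc-a": "Pneumothorax appears as a visceral pleural line without "
             "peripheral lung markings on chest radiography.",
    "doc-b": "Hepatic hemangioma shows peripheral nodular enhancement with "
             "centripetal fill-in on contrast-enhanced CT.",
    "doc-c": "Acute appendicitis on CT shows a dilated appendix with wall "
             "thickening and periappendiceal fat stranding.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, corpus, k=2):
    """Return up to k document ids ranked by similarity to the question.

    Assumption: term counts replace the dense embeddings that a real
    vector database would store. Documents with zero overlap are dropped.
    """
    q_vec = Counter(tokenize(question))
    scored = [
        (cosine(q_vec, Counter(tokenize(text))), doc_id)
        for doc_id, text in corpus.items()
    ]
    scored.sort(reverse=True)
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(question, corpus, k=2):
    """Assemble a prompt that carries the retrieved passages and their ids,
    so that the LLM can cite its sources (hypothetical prompt format)."""
    ids = retrieve(question, corpus, k)
    context = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below and cite them by id.\n"
        f"{context}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("Which CT finding suggests acute appendicitis?", CORPUS, k=1))
```

Because the retrieved passages travel with their ids inside the prompt, the model's answer can point back to a specific source, which is the kind of citable output the abstract refers to. A production system would swap `cosine` over term counts for nearest-neighbor search over learned embeddings.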