Evaluating the utility of large language models in generating search strings for systematic reviews in anesthesiology: a comparative analysis of top-ranked journals.

Regional Anesthesia and Pain Medicine (IF 5.1; Zone 2, Medicine; JCR Q1, Anesthesiology) Pub Date: 2025-01-19 DOI: 10.1136/rapm-2024-106231
Alessandro De Cassai, Burhan Dost, Yunus Emre Karapinar, Müzeyyen Beldagli, Mirac Selcen Ozkal Yalin, Esra Turunc, Engin Ihsan Turan, Nicolò Sella

Abstract

Background: This study evaluated the effectiveness of large language models (LLMs), specifically ChatGPT 4o and a custom-designed model, Meta-Analysis Librarian, in generating accurate search strings for systematic reviews (SRs) in the field of anesthesiology.

Methods: We selected 85 SRs from the top 10 anesthesiology journals, as ranked by Web of Science, and extracted their reference lists as benchmarks. Using study titles as input, we generated four search strings per SR: three with ChatGPT 4o using general prompts and one with the Meta-Analysis Librarian model, which follows a structured Population, Intervention, Comparator, Outcome (PICO)-based approach aligned with Cochrane Handbook standards. Each search string was used to query PubMed, and the retrieved results were compared with the studies retrieved from PubMed by the original search string of each SR to assess retrieval accuracy. Statistical analysis compared the performance of each model.
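
The retrieval comparison described in the Methods can be illustrated with a minimal sketch. It assumes the retrieval rate is the share of benchmark studies that a given search string returns; the abstract does not spell out the formula, so this definition, the function name `retrieval_rate`, and the PMIDs below are hypothetical and not the authors' implementation.

```python
def retrieval_rate(benchmark_pmids, retrieved_pmids):
    """Return the fraction of benchmark studies found by a search string.

    Assumed definition (not stated in the abstract):
    |benchmark AND retrieved| / |benchmark|.
    """
    benchmark = set(benchmark_pmids)
    if not benchmark:
        raise ValueError("benchmark set is empty")
    return len(benchmark & set(retrieved_pmids)) / len(benchmark)


# Hypothetical PMIDs, for illustration only.
benchmark = ["101", "102", "103", "104"]
llm_results = ["102", "104", "999"]

rate = retrieval_rate(benchmark, llm_results)
print(f"Retrieval rate: {rate:.0%}")
```

Records that the search string returns but that are not in the benchmark (here `"999"`) do not affect the rate under this definition, so it measures recall rather than precision.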

Results: Original search strings performed best, with a median retrieval rate of 65% (IQR: 43%-81%), which differed significantly from both LLM groups for PubMed-retrieved studies (p=0.001). The Meta-Analysis Librarian achieved a higher median retrieval rate than ChatGPT 4o: 24% (IQR: 13%-38%) vs 6% (IQR: 0%-14%).
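
Summaries of the form "median (IQR)" can be computed from per-review retrieval rates as in the sketch below. The input values are made up for illustration, and the quartile method (`inclusive`) is an assumption, since the abstract does not say which convention the authors used.

```python
import statistics


def median_iqr(rates):
    """Return (median, first quartile, third quartile) of a list of rates."""
    # quantiles(n=4) returns the three quartile cut points; the "inclusive"
    # method treats the data as the whole population (an assumption here).
    q1, q2, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical per-review retrieval rates in percent, not data from the study.
rates_percent = [40, 0, 20, 10, 30]

median, q1, q3 = median_iqr(rates_percent)
print(f"{median:.0f}% (IQR: {q1:.0f}%-{q3:.0f}%)")
```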

Conclusion: The findings of this study highlight the significant advantage of original search strings over LLM-generated search strings for retrieving studies from PubMed. The Meta-Analysis Librarian clearly outperformed ChatGPT 4o in retrieval performance. Further research is needed to assess the broader applicability of LLM-generated search strings, especially across multiple databases.

Source journal
CiteScore: 8.50
Self-citation rate: 11.80%
Articles published: 175
Review time: 6-12 weeks
Journal description: Regional Anesthesia & Pain Medicine, the official publication of the American Society of Regional Anesthesia and Pain Medicine (ASRA), is a monthly journal that publishes peer-reviewed scientific and clinical studies to advance the understanding and clinical application of regional techniques for surgical anesthesia and postoperative analgesia. Coverage includes intraoperative regional techniques, perioperative pain, chronic pain, obstetric anesthesia, pediatric anesthesia, outcome studies, and complications. Published for over thirty years, this respected journal also serves as the official publication of the European Society of Regional Anaesthesia and Pain Therapy (ESRA), the Asian and Oceanic Society of Regional Anesthesia (AOSRA), the Latin American Society of Regional Anesthesia (LASRA), the African Society for Regional Anesthesia (AFSRA), and the Academy of Regional Anaesthesia of India (AORA).