Learning to match patients to clinical trials using large language models

IF 4 2区 医学 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Journal of Biomedical Informatics Pub Date : 2024-10-09 DOI:10.1016/j.jbi.2024.104734
Maciej Rybinski , Wojciech Kusa , Sarvnaz Karimi , Allan Hanbury
{"title":"Learning to match patients to clinical trials using large language models","authors":"Maciej Rybinski ,&nbsp;Wojciech Kusa ,&nbsp;Sarvnaz Karimi ,&nbsp;Allan Hanbury","doi":"10.1016/j.jbi.2024.104734","DOIUrl":null,"url":null,"abstract":"<div><h3>Objective:</h3><div>This study investigates the use of Large Language Models (LLMs) for matching patients to clinical trials (CTs) within an information retrieval pipeline. Our objective is to enhance the process of patient-trial matching by leveraging the semantic processing capabilities of LLMs, thereby improving the effectiveness of patient recruitment for clinical trials.</div></div><div><h3>Methods:</h3><div>We employed a multi-stage retrieval pipeline integrating various methodologies, including BM25 and Transformer-based rankers, along with LLM-based methods. Our primary datasets were the TREC Clinical Trials 2021–23 track collections. We compared LLM-based approaches, focusing on methods that leverage LLMs in query formulation, filtering, relevance ranking, and re-ranking of CTs.</div></div><div><h3>Results:</h3><div>Our results indicate that LLM-based systems, particularly those involving re-ranking with a fine-tuned LLM, outperform traditional methods in terms of nDCG and Precision measures. The study demonstrates that fine-tuning LLMs enhances their ability to find eligible trials. Moreover, our LLM-based approach is competitive with state-of-the-art systems in the TREC challenges.</div><div>The study shows the effectiveness of LLMs in CT matching, highlighting their potential in handling complex semantic analysis and improving patient-trial matching. However, the use of LLMs increases the computational cost and reduces efficiency. We provide a detailed analysis of effectiveness-efficiency trade-offs.</div></div><div><h3>Conclusion:</h3><div>This research demonstrates the promising role of LLMs in enhancing the patient-to-clinical trial matching process, offering a significant advancement in the automation of patient recruitment. Future work should explore optimising the balance between computational cost and retrieval effectiveness in practical applications.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"159 ","pages":"Article 104734"},"PeriodicalIF":4.0000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Biomedical Informatics","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1532046424001527","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0

Abstract

Objective:

This study investigates the use of Large Language Models (LLMs) for matching patients to clinical trials (CTs) within an information retrieval pipeline. Our objective is to enhance the process of patient-trial matching by leveraging the semantic processing capabilities of LLMs, thereby improving the effectiveness of patient recruitment for clinical trials.

Methods:

We employed a multi-stage retrieval pipeline integrating various methodologies, including BM25 and Transformer-based rankers, along with LLM-based methods. Our primary datasets were the TREC Clinical Trials 2021–23 track collections. We compared LLM-based approaches, focusing on methods that leverage LLMs in query formulation, filtering, relevance ranking, and re-ranking of CTs.

Results:

Our results indicate that LLM-based systems, particularly those involving re-ranking with a fine-tuned LLM, outperform traditional methods in terms of nDCG and Precision measures. The study demonstrates that fine-tuning LLMs enhances their ability to find eligible trials. Moreover, our LLM-based approach is competitive with state-of-the-art systems in the TREC challenges.
The study shows the effectiveness of LLMs in CT matching, highlighting their potential in handling complex semantic analysis and improving patient-trial matching. However, the use of LLMs increases the computational cost and reduces efficiency. We provide a detailed analysis of effectiveness-efficiency trade-offs.

Conclusion:

This research demonstrates the promising role of LLMs in enhancing the patient-to-clinical trial matching process, offering a significant advancement in the automation of patient recruitment. Future work should explore optimising the balance between computational cost and retrieval effectiveness in practical applications.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
利用大型语言模型学习将患者与临床试验相匹配。
研究目的本研究探讨了在信息检索管道中使用大语言模型(LLM)将患者与临床试验(CT)进行匹配的问题。我们的目标是利用大型语言模型的语义处理能力,加强患者与试验的匹配过程,从而提高临床试验患者招募的效率:我们采用了一个多阶段检索管道,其中整合了各种方法,包括基于 BM25 和 Transformer 的排序器以及基于 LLM 的方法。我们的主要数据集是TREC临床试验2021-23轨迹集。我们对基于 LLM 的方法进行了比较,重点关注在查询制定、过滤、相关性排序和 CT 重新排序中利用 LLM 的方法:我们的结果表明,基于 LLM 的系统,特别是那些涉及使用微调 LLM 重新排序的系统,在 nDCG 和精度测量方面优于传统方法。研究表明,对 LLM 进行微调可增强其发现合格试验的能力。此外,在 TREC 挑战赛中,我们基于 LLM 的方法与最先进的系统相比具有竞争力。研究显示了 LLM 在 CT 匹配中的有效性,突出了其在处理复杂语义分析和改善患者-试验匹配方面的潜力。但是,LLM 的使用增加了计算成本,降低了效率。我们详细分析了有效性和效率之间的权衡:这项研究证明了 LLMs 在加强患者与临床试验匹配过程中的重要作用,为患者招募自动化提供了重大进展。未来的工作应探索如何在实际应用中优化计算成本与检索效率之间的平衡。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Journal of Biomedical Informatics
Journal of Biomedical Informatics 医学-计算机:跨学科应用
CiteScore
8.90
自引率
6.70%
发文量
243
审稿时长
32 days
期刊介绍: The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.
期刊最新文献
Early multi-cancer detection through deep learning: An anomaly detection approach using Variational Autoencoder. Importance of variables from different time frames for predicting self-harm using health system data Cross-Modal self-supervised vision language pre-training with multiple objectives for medical visual question answering Machine learning approaches for the discovery of clinical pathways from patient data: A systematic review MultiADE: A Multi-domain benchmark for Adverse Drug Event extraction
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1