利用大型语言模型学习将患者与临床试验相匹配。

IF 4 2区医学 Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS Journal of Biomedical Informatics Pub Date : 2024-10-09 DOI:10.1016/j.jbi.2024.104734

Maciej Rybinski , Wojciech Kusa , Sarvnaz Karimi , Allan Hanbury

{"title":"利用大型语言模型学习将患者与临床试验相匹配。","authors":"Maciej Rybinski , Wojciech Kusa , Sarvnaz Karimi , Allan Hanbury","doi":"10.1016/j.jbi.2024.104734","DOIUrl":null,"url":null,"abstract":"<div><h3>Objective:</h3><div>This study investigates the use of Large Language Models (LLMs) for matching patients to clinical trials (CTs) within an information retrieval pipeline. Our objective is to enhance the process of patient-trial matching by leveraging the semantic processing capabilities of LLMs, thereby improving the effectiveness of patient recruitment for clinical trials.</div></div><div><h3>Methods:</h3><div>We employed a multi-stage retrieval pipeline integrating various methodologies, including BM25 and Transformer-based rankers, along with LLM-based methods. Our primary datasets were the TREC Clinical Trials 2021–23 track collections. We compared LLM-based approaches, focusing on methods that leverage LLMs in query formulation, filtering, relevance ranking, and re-ranking of CTs.</div></div><div><h3>Results:</h3><div>Our results indicate that LLM-based systems, particularly those involving re-ranking with a fine-tuned LLM, outperform traditional methods in terms of nDCG and Precision measures. The study demonstrates that fine-tuning LLMs enhances their ability to find eligible trials. Moreover, our LLM-based approach is competitive with state-of-the-art systems in the TREC challenges.</div><div>The study shows the effectiveness of LLMs in CT matching, highlighting their potential in handling complex semantic analysis and improving patient-trial matching. However, the use of LLMs increases the computational cost and reduces efficiency. We provide a detailed analysis of effectiveness-efficiency trade-offs.</div></div><div><h3>Conclusion:</h3><div>This research demonstrates the promising role of LLMs in enhancing the patient-to-clinical trial matching process, offering a significant advancement in the automation of patient recruitment. Future work should explore optimising the balance between computational cost and retrieval effectiveness in practical applications.</div></div>","PeriodicalId":15263,"journal":{"name":"Journal of Biomedical Informatics","volume":"159 ","pages":"Article 104734"},"PeriodicalIF":4.0000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Learning to match patients to clinical trials using large language models\",\"authors\":\"Maciej Rybinski , Wojciech Kusa , Sarvnaz Karimi , Allan Hanbury\",\"doi\":\"10.1016/j.jbi.2024.104734\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Objective:</h3><div>This study investigates the use of Large Language Models (LLMs) for matching patients to clinical trials (CTs) within an information retrieval pipeline. Our objective is to enhance the process of patient-trial matching by leveraging the semantic processing capabilities of LLMs, thereby improving the effectiveness of patient recruitment for clinical trials.</div></div><div><h3>Methods:</h3><div>We employed a multi-stage retrieval pipeline integrating various methodologies, including BM25 and Transformer-based rankers, along with LLM-based methods. Our primary datasets were the TREC Clinical Trials 2021–23 track collections. We compared LLM-based approaches, focusing on methods that leverage LLMs in query formulation, filtering, relevance ranking, and re-ranking of CTs.</div></div><div><h3>Results:</h3><div>Our results indicate that LLM-based systems, particularly those involving re-ranking with a fine-tuned LLM, outperform traditional methods in terms of nDCG and Precision measures. The study demonstrates that fine-tuning LLMs enhances their ability to find eligible trials. Moreover, our LLM-based approach is competitive with state-of-the-art systems in the TREC challenges.</div><div>The study shows the effectiveness of LLMs in CT matching, highlighting their potential in handling complex semantic analysis and improving patient-trial matching. However, the use of LLMs increases the computational cost and reduces efficiency. We provide a detailed analysis of effectiveness-efficiency trade-offs.</div></div><div><h3>Conclusion:</h3><div>This research demonstrates the promising role of LLMs in enhancing the patient-to-clinical trial matching process, offering a significant advancement in the automation of patient recruitment. Future work should explore optimising the balance between computational cost and retrieval effectiveness in practical applications.</div></div>\",\"PeriodicalId\":15263,\"journal\":{\"name\":\"Journal of Biomedical Informatics\",\"volume\":\"159 \",\"pages\":\"Article 104734\"},\"PeriodicalIF\":4.0000,\"publicationDate\":\"2024-10-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Biomedical Informatics\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S1532046424001527\",\"RegionNum\":2,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Biomedical Informatics","FirstCategoryId":"3","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1532046424001527","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}

引用次数: 0

摘要

研究目的本研究探讨了在信息检索管道中使用大语言模型（LLM）将患者与临床试验（CT）进行匹配的问题。我们的目标是利用大型语言模型的语义处理能力，加强患者与试验的匹配过程，从而提高临床试验患者招募的效率：我们采用了一个多阶段检索管道，其中整合了各种方法，包括基于 BM25 和 Transformer 的排序器以及基于 LLM 的方法。我们的主要数据集是TREC临床试验2021-23轨迹集。我们对基于 LLM 的方法进行了比较，重点关注在查询制定、过滤、相关性排序和 CT 重新排序中利用 LLM 的方法：我们的结果表明，基于 LLM 的系统，特别是那些涉及使用微调 LLM 重新排序的系统，在 nDCG 和精度测量方面优于传统方法。研究表明，对 LLM 进行微调可增强其发现合格试验的能力。此外，在 TREC 挑战赛中，我们基于 LLM 的方法与最先进的系统相比具有竞争力。研究显示了 LLM 在 CT 匹配中的有效性，突出了其在处理复杂语义分析和改善患者-试验匹配方面的潜力。但是，LLM 的使用增加了计算成本，降低了效率。我们详细分析了有效性和效率之间的权衡：这项研究证明了 LLMs 在加强患者与临床试验匹配过程中的重要作用，为患者招募自动化提供了重大进展。未来的工作应探索如何在实际应用中优化计算成本与检索效率之间的平衡。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Learning to match patients to clinical trials using large language models

Objective:

This study investigates the use of Large Language Models (LLMs) for matching patients to clinical trials (CTs) within an information retrieval pipeline. Our objective is to enhance the process of patient-trial matching by leveraging the semantic processing capabilities of LLMs, thereby improving the effectiveness of patient recruitment for clinical trials.

Methods:

We employed a multi-stage retrieval pipeline integrating various methodologies, including BM25 and Transformer-based rankers, along with LLM-based methods. Our primary datasets were the TREC Clinical Trials 2021–23 track collections. We compared LLM-based approaches, focusing on methods that leverage LLMs in query formulation, filtering, relevance ranking, and re-ranking of CTs.

Results:

Our results indicate that LLM-based systems, particularly those involving re-ranking with a fine-tuned LLM, outperform traditional methods in terms of nDCG and Precision measures. The study demonstrates that fine-tuning LLMs enhances their ability to find eligible trials. Moreover, our LLM-based approach is competitive with state-of-the-art systems in the TREC challenges.

The study shows the effectiveness of LLMs in CT matching, highlighting their potential in handling complex semantic analysis and improving patient-trial matching. However, the use of LLMs increases the computational cost and reduces efficiency. We provide a detailed analysis of effectiveness-efficiency trade-offs.

Conclusion:

This research demonstrates the promising role of LLMs in enhancing the patient-to-clinical trial matching process, offering a significant advancement in the automation of patient recruitment. Future work should explore optimising the balance between computational cost and retrieval effectiveness in practical applications.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Biomedical Informatics 医学-计算机：跨学科应用

CiteScore

8.90

自引率

6.70%

发文量

243

审稿时长

32 days

期刊介绍： The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.