Ki Young Huh, Ildae Song, Yoonjin Kim, Jiyeon Park, Hyunwook Ryu, JaeEun Koh, Kyung-Sang Yu, Kyung Hwan Kim, SeungHwan Lee
Exploration of Using an Open-Source Large Language Model for Analyzing Trial Information: A Case Study of Clinical Trials With Decentralized Elements
Clinical and Translational Science, 18(3). Published 2025-03-02. Journal Article. Impact factor 3.1 (JCR Q2, Medicine, Research & Experimental).
DOI: 10.1111/cts.70183
https://onlinelibrary.wiley.com/doi/10.1111/cts.70183
Citation count: 0
Abstract
Despite interest in clinical trials with decentralized elements (DCTs), analysis of their trends in trial registries is lacking because designs are heterogeneous and terminology is unstandardized. We explored whether Llama 3, an open-source large language model, could evaluate these trends efficiently. Trial data were sourced from Aggregate Analysis of ClinicalTrials.gov, focusing on drug trials conducted between 2018 and 2023. We used three Llama 3 models: 8b (model 1), 8b fine-tuned with curated data (model 2), and 70b (model 3). Prompt engineering enabled sophisticated tasks such as classifying DCTs with explanations and extracting decentralized elements. Model performance, evaluated on a 3-month exploratory test dataset, showed that fine-tuning improved sensitivity from 0.0357 to 0.5385. The low positive predictive value of the fine-tuned model 2 improved from 0.5385 to 0.9167 when the analysis focused on trials containing DCT-associated expressions. However, only model 3, which had a larger number of parameters, properly extracted decentralized elements. Based on these results, we screened the entire 6-year dataset after applying DCT-associated expressions. After subsequently applying models 2 and 3, we identified 692 DCTs. Of these, 213 trials were classified as phase 2, followed by 162 phase 4 trials, 112 phase 3 trials, and 92 phase 1 trials. In conclusion, our study demonstrated the potential of large language models for analyzing clinical trial information that is not structured in a machine-readable format. Managing potential biases during model application is crucial.
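The two metrics reported in the abstract, sensitivity and positive predictive value (PPV), and the effect of restricting evaluation to trials that contain DCT-associated expressions can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the regular expressions in `DCT_PATTERNS`, the helper names, and the five toy trial records are hypothetical and are not the expressions, code, or data used in the study.

```python
import re
from dataclasses import dataclass

# Hypothetical DCT-associated expressions (assumption: the abstract does not list the real ones).
DCT_PATTERNS = [r"\btelemedicine\b", r"\bhome visits?\b", r"\bremote(ly)?\b", r"\bwearables?\b"]
_DCT_RE = re.compile("|".join(DCT_PATTERNS), re.IGNORECASE)


def has_dct_expression(text: str) -> bool:
    """Return True if the trial description contains any DCT-associated expression."""
    return _DCT_RE.search(text) is not None


@dataclass
class Confusion:
    tp: int = 0
    fp: int = 0
    fn: int = 0
    tn: int = 0

    @property
    def sensitivity(self) -> float:
        # Sensitivity = TP / (TP + FN): share of true DCTs the model found.
        denom = self.tp + self.fn
        return self.tp / denom if denom else float("nan")

    @property
    def ppv(self) -> float:
        # PPV = TP / (TP + FP): share of predicted DCTs that are true DCTs.
        denom = self.tp + self.fp
        return self.tp / denom if denom else float("nan")


def evaluate(records) -> Confusion:
    """Tally a confusion matrix from (text, is_dct, predicted_dct) records."""
    c = Confusion()
    for _text, label, pred in records:
        if label and pred:
            c.tp += 1
        elif pred:
            c.fp += 1
        elif label:
            c.fn += 1
        else:
            c.tn += 1
    return c


# Toy records: (description, reference label, model prediction). Purely illustrative.
trials = [
    ("Participants attend telemedicine visits from home.", True, True),
    ("Study drug is shipped and remote monitoring is used.", True, True),
    ("All visits take place at the study site.", False, True),
    ("Wearable sensors record activity between clinic visits.", True, False),
    ("Standard inpatient dosing.", False, False),
]

overall = evaluate(trials)
filtered = evaluate([t for t in trials if has_dct_expression(t[0])])
```

In this toy data the false positive has no DCT-associated expression, so the prefilter removes it and PPV rises while sensitivity is unchanged. A keyword prefilter can also discard true DCTs that are phrased differently, which is one source of the bias the abstract warns about.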
About the journal
Clinical and Translational Science (CTS), an official journal of the American Society for Clinical Pharmacology and Therapeutics, highlights original translational medicine research that helps bridge laboratory discoveries with the diagnosis and treatment of human disease. Translational medicine is a multi-faceted discipline with a focus on translational therapeutics. In a broad sense, translational medicine bridges across the discovery, development, regulation, and utilization spectrum. Research may appear as Full Articles, Brief Reports, Commentaries, Phase Forwards (clinical trials), Reviews, or Tutorials. CTS also includes invited didactic content that covers the connections between clinical pharmacology and translational medicine. Best-in-class methodologies and best practices are also welcomed as Tutorials. These additional features provide context for research articles and facilitate understanding for a wide array of individuals interested in clinical and translational science. CTS welcomes high quality, scientifically sound, original manuscripts focused on clinical pharmacology and translational science, including animal, in vitro, in silico, and clinical studies supporting the breadth of drug discovery, development, regulation and clinical use of both traditional drugs and innovative modalities.