The Potential Clinical Utility of the Customized Large Language Model in Gastroenterology: A Pilot Study.

IF 3.7 3区 医学 Q2 ENGINEERING, BIOMEDICAL Bioengineering Pub Date : 2024-12-24 DOI:10.3390/bioengineering12010001
Eun Jeong Gong, Chang Seok Bang, Jae Jun Lee, Jonghyung Park, Eunsil Kim, Subeen Kim, Minjae Kimm, Seoung-Ho Choi
{"title":"The Potential Clinical Utility of the Customized Large Language Model in Gastroenterology: A Pilot Study.","authors":"Eun Jeong Gong, Chang Seok Bang, Jae Jun Lee, Jonghyung Park, Eunsil Kim, Subeen Kim, Minjae Kimm, Seoung-Ho Choi","doi":"10.3390/bioengineering12010001","DOIUrl":null,"url":null,"abstract":"<p><p><b>Background:</b> The large language model (LLM) has the potential to be applied to clinical practice. However, there has been scarce study on this in the field of gastroenterology. Aim: This study explores the potential clinical utility of two LLMs in the field of gastroenterology: a customized GPT model and a conventional GPT-4o, an advanced LLM capable of retrieval-augmented generation (RAG). <b>Method:</b> We established a customized GPT with the BM25 algorithm using Open AI's GPT-4o model, which allows it to produce responses in the context of specific documents including textbooks of internal medicine (in English) and gastroenterology (in Korean). Also, we prepared a conventional ChatGPT 4o (accessed on 16 October 2024) access. The benchmark (written in Korean) consisted of 15 clinical questions developed by four clinical experts, representing typical questions for medical students. The two LLMs, a gastroenterology fellow, and an expert gastroenterologist were tested to assess their performance. <b>Results:</b> While the customized LLM correctly answered 8 out of 15 questions, the fellow answered 10 correctly. When the standardized Korean medical terms were replaced with English terminology, the LLM's performance improved, answering two additional knowledge-based questions correctly, matching the fellow's score. However, judgment-based questions remained a challenge for the model. Even with the implementation of 'Chain of Thought' prompt engineering, the customized GPT did not achieve improved reasoning. Conventional GPT-4o achieved the highest score among the AI models (14/15). Although both models performed slightly below the expert gastroenterologist's level (15/15), they show promising potential for clinical applications (scores comparable with or higher than that of the gastroenterology fellow). <b>Conclusions:</b> LLMs could be utilized to assist with specialized tasks such as patient counseling. However, RAG capabilities by enabling real-time retrieval of external data not included in the training dataset, appear essential for managing complex, specialized content, and clinician oversight will remain crucial to ensure safe and effective use in clinical practice.</p>","PeriodicalId":8874,"journal":{"name":"Bioengineering","volume":"12 1","pages":""},"PeriodicalIF":3.7000,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11760845/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioengineering","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.3390/bioengineering12010001","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, BIOMEDICAL","Score":null,"Total":0}
引用次数: 0

Abstract

Background: The large language model (LLM) has the potential to be applied to clinical practice. However, there has been scarce study on this in the field of gastroenterology. Aim: This study explores the potential clinical utility of two LLMs in the field of gastroenterology: a customized GPT model and a conventional GPT-4o, an advanced LLM capable of retrieval-augmented generation (RAG). Method: We established a customized GPT with the BM25 algorithm using Open AI's GPT-4o model, which allows it to produce responses in the context of specific documents including textbooks of internal medicine (in English) and gastroenterology (in Korean). Also, we prepared a conventional ChatGPT 4o (accessed on 16 October 2024) access. The benchmark (written in Korean) consisted of 15 clinical questions developed by four clinical experts, representing typical questions for medical students. The two LLMs, a gastroenterology fellow, and an expert gastroenterologist were tested to assess their performance. Results: While the customized LLM correctly answered 8 out of 15 questions, the fellow answered 10 correctly. When the standardized Korean medical terms were replaced with English terminology, the LLM's performance improved, answering two additional knowledge-based questions correctly, matching the fellow's score. However, judgment-based questions remained a challenge for the model. Even with the implementation of 'Chain of Thought' prompt engineering, the customized GPT did not achieve improved reasoning. Conventional GPT-4o achieved the highest score among the AI models (14/15). Although both models performed slightly below the expert gastroenterologist's level (15/15), they show promising potential for clinical applications (scores comparable with or higher than that of the gastroenterology fellow). Conclusions: LLMs could be utilized to assist with specialized tasks such as patient counseling. However, RAG capabilities by enabling real-time retrieval of external data not included in the training dataset, appear essential for managing complex, specialized content, and clinician oversight will remain crucial to ensure safe and effective use in clinical practice.

Abstract Image

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
定制大语言模型在胃肠病学中的潜在临床应用:一项初步研究。
背景:大语言模型(LLM)具有应用于临床实践的潜力。然而,在胃肠病学领域对此的研究很少。目的:本研究探讨了两种LLM在胃肠病学领域的潜在临床应用:定制的GPT模型和传统的GPT- 40,一种能够检索增强生成(RAG)的先进LLM。方法:利用Open AI的GPT- 40模型,利用BM25算法建立了一个定制化的GPT,该GPT可以根据内科学教科书(英文)和胃肠病学教科书(韩文)等特定文档的背景产生响应。此外,我们还准备了常规的ChatGPT 40(于2024年10月16日访问)访问。基准(韩文)由4名临床专家开发的15个临床问题组成,代表了医学生的典型问题。两位法学硕士、一位胃肠病学研究员和一位胃肠病学专家接受了测试,以评估他们的表现。结果:在15个问题中,定制LLM正确回答了8个,而同伴正确回答了10个。当标准化的韩语医学术语被英语术语取代时,法学硕士的表现有所提高,他正确回答了另外两个基于知识的问题,与他的得分相当。然而,基于判断的问题对该模型来说仍然是一个挑战。即使实施了“思维链”提示工程,定制的GPT也没有实现改进的推理。常规gpt - 40在人工智能模型中得分最高(14/15)。尽管这两种模型的表现都略低于胃肠病学专家的水平(15/15),但它们在临床应用方面表现出了很大的潜力(得分与胃肠病学专家相当或更高)。结论:法学硕士可用于辅助专业任务,如患者咨询。然而,通过实时检索未包含在训练数据集中的外部数据,RAG功能对于管理复杂的专业内容至关重要,并且临床医生的监督对于确保临床实践中安全有效的使用仍然至关重要。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Bioengineering
Bioengineering Chemical Engineering-Bioengineering
CiteScore
4.00
自引率
8.70%
发文量
661
期刊介绍: Aims Bioengineering (ISSN 2306-5354) provides an advanced forum for the science and technology of bioengineering. It publishes original research papers, comprehensive reviews, communications and case reports. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. All aspects of bioengineering are welcomed from theoretical concepts to education and applications. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced. There are, in addition, four key features of this Journal: ● We are introducing a new concept in scientific and technical publications “The Translational Case Report in Bioengineering”. It is a descriptive explanatory analysis of a transformative or translational event. Understanding that the goal of bioengineering scholarship is to advance towards a transformative or clinical solution to an identified transformative/clinical need, the translational case report is used to explore causation in order to find underlying principles that may guide other similar transformative/translational undertakings. ● Manuscripts regarding research proposals and research ideas will be particularly welcomed. ● Electronic files and software regarding the full details of the calculation and experimental procedure, if unable to be published in a normal way, can be deposited as supplementary material. ● We also accept manuscripts communicating to a broader audience with regard to research projects financed with public funds. Scope ● Bionics and biological cybernetics: implantology; bio–abio interfaces ● Bioelectronics: wearable electronics; implantable electronics; “more than Moore” electronics; bioelectronics devices ● Bioprocess and biosystems engineering and applications: bioprocess design; biocatalysis; bioseparation and bioreactors; bioinformatics; bioenergy; etc. ● Biomolecular, cellular and tissue engineering and applications: tissue engineering; chromosome engineering; embryo engineering; cellular, molecular and synthetic biology; metabolic engineering; bio-nanotechnology; micro/nano technologies; genetic engineering; transgenic technology ● Biomedical engineering and applications: biomechatronics; biomedical electronics; biomechanics; biomaterials; biomimetics; biomedical diagnostics; biomedical therapy; biomedical devices; sensors and circuits; biomedical imaging and medical information systems; implants and regenerative medicine; neurotechnology; clinical engineering; rehabilitation engineering ● Biochemical engineering and applications: metabolic pathway engineering; modeling and simulation ● Translational bioengineering
期刊最新文献
Engineering a Quantitative Organ-on-a-Chip Platform for Myogenic Mechanobiology. Cross-Modal Alignment and Rectified Flow-Based Latent Representation Synthesis for Enhanced Speech-Driven Alzheimer's Disease Detection. Transcranial Alternating Current Stimulation as an Adjuvant for Nonfluent Aphasia: A Proof-of-Concept Study. Comparative Evaluation of Beverage-Induced Surface Alterations on Dental Enamel: An In Vitro Biomaterial Study. Effect of Internal Structural Design on Stress Distribution in 3D-Printed Subperiosteal Implants Under Mechanical Loading.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1