Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports

IF 1.7 · Zone 4 (Biology) · Q3 GENETICS & HEREDITY · American Journal of Medical Genetics Part A · Pub Date: 2024-09-13 · DOI: 10.1002/ajmg.a.63878
Cameron C. Young, Ellie Enichen, Christian Rivera, Corinne A. Auger, Nathan Grant, Arya Rao, Marc D. Succi
Citation count: 0

Abstract

Accurately diagnosing rare pediatric diseases frequently represents a clinical challenge due to their complex and unusual clinical presentations. Here, we explore the capabilities of three large language models (LLMs), GPT‐4, Gemini Pro, and a custom‐built LLM (GPT‐4 integrated with the Human Phenotype Ontology [GPT‐4 HPO]), by evaluating their diagnostic performance on 61 rare pediatric disease case reports. The performance of the LLMs was assessed for accuracy in identifying specific diagnoses, listing the correct diagnosis in a differential list, and identifying broad disease categories. In addition, GPT‐4 HPO was tested on 100 general pediatrics case reports previously assessed on other LLMs to further validate its performance. The results indicated that GPT‐4 predicted the correct diagnosis with a diagnostic accuracy of 13.1%, whereas GPT‐4 HPO and Gemini Pro each had a diagnostic accuracy of 8.2%. Further, GPT‐4 HPO outperformed the other two LLMs in including the correct diagnosis in its differential list and in identifying the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain‐specific ontologies, they also stress the need for further improvement prior to integration into clinical practice.
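The abstract reports exact-diagnosis accuracy only as percentages of the 61 rare-disease case reports. As a rough sanity check, the sketch below back-calculates which whole-number counts of correct cases are consistent with those percentages. The counts are inferred from rounding, not reported in the abstract, and the helper name `counts_matching` is hypothetical.

```python
def counts_matching(reported_pct: float, n_cases: int, decimals: int = 1) -> list[int]:
    """Return every correct-case count k whose accuracy k/n_cases rounds to reported_pct."""
    return [
        k
        for k in range(n_cases + 1)
        if round(100 * k / n_cases, decimals) == reported_pct
    ]


N_CASES = 61  # rare pediatric disease case reports, from the abstract

# Exact-diagnosis accuracies (percent), from the abstract.
reported = {"GPT-4": 13.1, "GPT-4 HPO": 8.2, "Gemini Pro": 8.2}

# Inferred counts: an assumption derived from rounding, not a reported result.
implied = {model: counts_matching(pct, N_CASES) for model, pct in reported.items()}

for model, ks in implied.items():
    print(f"{model}: counts consistent with {reported[model]}% of {N_CASES} -> {ks}")
```

Each percentage matches exactly one count (8 of 61 for GPT‐4, 5 of 61 for GPT‐4 HPO and Gemini Pro), so under this assumption the gap in exact-diagnosis accuracy corresponds to three case reports.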
Source journal
CiteScore: 3.50
Self-citation rate: 5.00%
Articles published: 432
Review time: 2-4 weeks
Journal description: The American Journal of Medical Genetics - Part A (AJMG) gives you continuous coverage of all biological and medical aspects of genetic disorders and birth defects, as well as in-depth documentation of phenotype analysis within the current context of genotype/phenotype correlations. In addition to Part A, AJMG also publishes two other parts: Part B: Neuropsychiatric Genetics, covering experimental and clinical investigations of the genetic mechanisms underlying neurologic and psychiatric disorders; and Part C: Seminars in Medical Genetics, guest-edited collections of thematic reviews of topical interest to the readership of AJMG.
Latest articles from this journal
Associated Anomalies in Radial Ray Deficiency
Hospital Visits Associated With Oral Infections in Patients With Neurofibromatosis Type 1: A Register-Based Analysis
Evaluating the Influence of Social Determinants of Health on Blood Phenylalanine Levels in Phenylketonuria Patients
SF3B2 Haploinsufficiency Associated With Hirschprung Disease and Complex Cardiac Defect Without Craniofacial Microsomia
Craniotubular Dysplasia Ikegawa Type: Further Delineation of the Phenotype