Evaluating ChatGPT for neurocognitive disorder diagnosis: a multicenter study.

IF 3.0 | Zone 3 (Psychology) | Q2 Clinical Neurology | Clinical Neuropsychologist | Pub Date: 2025-03-16 | DOI: 10.1080/13854046.2025.2475567
A Andrew Dimmick, Charlie C Su, Hanan S Rafiuddin, David C Cicero
Citation count: 0

Abstract

Objective: To evaluate the accuracy and reliability of ChatGPT 4 Omni in diagnosing neurocognitive disorders using comprehensive clinical data and compare its performance to previous versions of ChatGPT.

Method: This project utilized a two-part design: Study 1 examined diagnostic agreement between ChatGPT 4 Omni and clinicians using a few-shot prompt approach, and Study 2 compared the diagnostic performance of ChatGPT models using a zero-shot prompt approach using data from the National Alzheimer's Coordinating Center (NACC) Uniform Data Set 3. Study 1 included 12,922 older adults (M_age = 69.13, SD = 9.87), predominantly female (57%) and White (80%). Study 2 involved 537 older adults (M_age = 67.88, SD = 9.52), majority female (57%) and White (81%). Diagnoses included no cognitive impairment, amnestic mild cognitive impairment (MCI), nonamnestic MCI, and dementia.

Results: In Study 1, ChatGPT 4 Omni showed fair association with clinician diagnoses (χ²(9) = 6021.96, p < .001; κ = .33). Notable predictive measures of agreement included the MoCA and memory recall tests. ChatGPT 4 Omni demonstrated high internal reliability (α = .96). In Study 2, no significant diagnostic agreement was found between ChatGPT versions and clinicians.

Conclusions: Although ChatGPT 4 Omni shows potential in aligning with clinician diagnoses, its diagnostic accuracy is insufficient for clinical application without human oversight. Continued refinement and comprehensive training of AI models are essential to enhance their utility in neuropsychological assessment. With rapidly developing technological innovations, integrating AI tools in clinical practice could soon improve diagnostic efficiency and accessibility to neuropsychological services.
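For readers unfamiliar with the agreement statistics reported in the Results, the following is a minimal sketch of how Cohen's κ and a Pearson χ² statistic can be computed from a clinician-by-model contingency table. It is not the authors' analysis code. The function names and the 4 × 4 table are hypothetical and were invented for illustration; the counts are not study data. With four diagnostic categories, the degrees of freedom are (4 − 1) × (4 − 1) = 9, which matches the χ²(9) reported in the abstract.

```python
from typing import List, Sequence, Tuple


def _totals(table: Sequence[Sequence[int]]) -> Tuple[List[int], List[int], int]:
    """Return row totals, column totals, and the grand total of a table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


def cohens_kappa(table: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for a square table (rows: rater A, columns: rater B)."""
    k = len(table)
    if k == 0 or any(len(row) != k for row in table):
        raise ValueError("table must be square and non-empty")
    row_totals, col_totals, n = _totals(table)
    if n == 0:
        raise ValueError("table must contain at least one observation")
    # Observed agreement: proportion of cases on the diagonal.
    p_observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


def chi_square(table: Sequence[Sequence[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals, col_totals, n = _totals(table)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


# HYPOTHETICAL counts, invented for illustration only (not study data).
# Rows: clinician diagnosis; columns: model diagnosis. Category order:
# no impairment, amnestic MCI, nonamnestic MCI, dementia.
HYPOTHETICAL_TABLE = [
    [50, 10, 8, 2],
    [12, 30, 10, 8],
    [10, 9, 15, 6],
    [3, 7, 5, 40],
]

if __name__ == "__main__":
    kappa = cohens_kappa(HYPOTHETICAL_TABLE)
    stat, df = chi_square(HYPOTHETICAL_TABLE)
    print(f"kappa = {kappa:.2f}, chi-square({df}) = {stat:.2f}")
```

Under the commonly cited Landis and Koch benchmarks, κ values between .21 and .40 are labeled "fair", which is consistent with the abstract's description of κ = .33.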

Source journal: Clinical Neuropsychologist (Medicine: Clinical Neurology)
CiteScore: 8.40
Self-citation rate: 12.80%
Publication volume: 61
Review time: 6-12 weeks
Journal description: The Clinical Neuropsychologist (TCN) serves as the premier forum for (1) state-of-the-art clinically-relevant scientific research, (2) in-depth professional discussions of matters germane to evidence-based practice, and (3) clinical case studies in neuropsychology. Of particular interest are papers that can make definitive statements about a given topic (thereby having implications for the standards of clinical practice) and those with the potential to expand today's clinical frontiers. Research on all age groups, and on both clinical and normal populations, is considered.