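Accuracy and consistency of ChatGPT-3.5 and - 4 in providing differential diagnoses in oral and maxillofacial diseases: a comparative diagnostic performance analysis.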

IF 3.1 | Zone 2 (Medicine) | Q1 DENTISTRY, ORAL SURGERY & MEDICINE | Clinical Oral Investigations | Pub Date: 2024-09-24 | DOI: 10.1007/s00784-024-05939-1
Saygo Tomo, Jérôme R Lechien, Hugo Sobrinho Bueno, Daniela Filié Cantieri-Debortoli, Luciana Estevam Simonato
Citations: 0

Abstract

Accuracy and consistency of ChatGPT-3.5 and - 4 in providing differential diagnoses in oral and maxillofacial diseases: a comparative diagnostic performance analysis.

Objective: To investigate the performance of ChatGPT in the differential diagnosis of oral and maxillofacial diseases.

Methods: Findings from thirty-seven oral and maxillofacial lesions were presented to ChatGPT-3.5 and -4, 18 dental surgeons trained in oral medicine/pathology (OMP), 23 general dental surgeons (DDS), and 16 dental students (DS) for differential diagnosis. Additionally, a group of 15 general dentists was asked to describe 11 cases to both ChatGPT versions. The primary and alternative diagnoses given by ChatGPT-3.5, ChatGPT-4, and the human participants were rated by 2 independent investigators on a 4-point Likert scale. The consistency of ChatGPT-3.5 and -4 was evaluated with regenerated inputs.

Results: Moderate consistency of outputs was observed for ChatGPT-3.5 and -4 in providing primary (κ = 0.532 and κ = 0.533, respectively) and alternative (κ = 0.337 and κ = 0.367, respectively) hypotheses. The mean rate of correct diagnoses was 64.86% for ChatGPT-3.5, 80.18% for ChatGPT-4, 86.64% for OMP, 24.32% for DDS, and 16.67% for DS. The mean correct primary hypothesis rates were 45.95% for ChatGPT-3.5, 61.80% for ChatGPT-4, 82.28% for OMP, 22.72% for DDS, and 15.77% for DS. The mean correct diagnosis rate for ChatGPT-3.5 was 64.86% with standard descriptions, compared to 45.95% with participants' descriptions. For ChatGPT-4, the mean was 80.18% with standard descriptions and 61.80% with participants' descriptions.
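The abstract reports κ values for output consistency but does not say which kappa variant was used or how the regenerated outputs were paired. As an illustration only, the sketch below assumes Cohen's kappa computed between two regenerations of the same cases, each rated on a 4-point scale. The function name `cohen_kappa` and the ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def cohen_kappa(run_a, run_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    n = len(run_a)
    # Observed agreement: share of cases rated identically in both runs.
    p_o = sum(a == b for a, b in zip(run_a, run_b)) / n
    # Chance agreement, from each run's marginal rating frequencies.
    freq_a, freq_b = Counter(run_a), Counter(run_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both runs used one and the same category throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical 4-point ratings of the same 10 cases in two regenerations
# (invented for illustration; not data from the paper).
first_run = [4, 4, 3, 1, 2, 4, 3, 1, 2, 4]
second_run = [4, 3, 3, 1, 2, 4, 1, 1, 4, 4]
kappa = cohen_kappa(first_run, second_run)
print(f"kappa = {kappa:.3f}")
```

In this invented example the two runs agree on 7 of 10 cases, but part of that agreement is expected by chance from how often each rating appears, which is why kappa comes out lower than the raw agreement rate.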

Conclusion: ChatGPT-4 demonstrates accuracy comparable to that of specialists in providing differential diagnoses for oral and maxillofacial diseases. The consistency of ChatGPT in providing diagnostic hypotheses for oral disease cases is moderate, which is a weakness for clinical application. The quality of case documentation and descriptions significantly affects the performance of ChatGPT.

Clinical relevance: General dentists, dental students, and specialists in oral medicine and pathology may benefit from ChatGPT-4 as an auxiliary method for establishing differential diagnoses of oral and maxillofacial lesions, but its accuracy depends on precise case descriptions.

Source journal
Clinical Oral Investigations (Medicine: Dentistry and Oral Surgery)
CiteScore: 6.30
Self-citation rate: 5.90%
Articles published: 484
Review time: 3 months
Journal description: The journal Clinical Oral Investigations is a multidisciplinary, international forum for publication of research from all fields of oral medicine. The journal publishes original scientific articles and invited reviews which provide up-to-date results of basic and clinical studies in oral and maxillofacial science and medicine. The aim is to clarify the relevance of new results to modern practice, for an international readership. Coverage includes maxillofacial and oral surgery, prosthetics and restorative dentistry, operative dentistry, endodontics, periodontology, orthodontics, dental materials science, clinical trials, epidemiology, pedodontics, oral implant, preventive dentistry, oral pathology, oral basic sciences and more.