A content-aware chatbot based on GPT 4 provides trustworthy recommendations for Cone-Beam CT guidelines in dental imaging.

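IF 2.9 | Zone 2 (Medicine) | JCR Q1, Dentistry, Oral Surgery & Medicine | Dentomaxillofacial Radiology | Pub Date: 2024-02-08 | Pages: 109-114 | DOI: 10.1093/dmfr/twad015
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11003655/pdf/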
Maximilian Frederik Russe, Alexander Rau, Michael Andreas Ermer, René Rothweiler, Sina Wenger, Klara Klöble, Ralf K W Schulze, Fabian Bamberg, Rainer Schmelzeisen, Marco Reisert, Wiebke Semper-Hogg
Citations: 0

Abstract

Objectives: To develop a content-aware chatbot based on GPT-3.5-Turbo and GPT-4 with specialized knowledge of the German S2 Cone-Beam CT (CBCT) dental imaging guideline and to compare its performance against that of humans.

Methods: The LlamaIndex software library was used to integrate the guideline context into the chatbots. Based on the CBCT S2 guideline, 40 questions were posed to the content-aware chatbots; early career and senior practitioners with different levels of experience served as reference. The chatbots' performance was compared in terms of recommendation accuracy and explanation quality. A chi-square test and a one-tailed Wilcoxon signed-rank test were used to evaluate accuracy and explanation quality, respectively.
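
The idea of integrating guideline context into a chatbot can be illustrated with a minimal retrieval sketch. This is not the LlamaIndex API and not the authors' pipeline: it is a standard-library illustration of the general pattern (split the guideline into chunks, retrieve the chunk most similar to the question, and place it in the prompt). The function names, the bag-of-words similarity, and the placeholder excerpts are all assumptions made for illustration; the excerpts are invented and are not text from the S2 guideline.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, chunks, top_k=1):
    """Assemble a prompt that restricts the model to the retrieved excerpts."""
    context = "\n".join(retrieve(question, chunks, top_k))
    return (
        "Answer using only the guideline excerpts below.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical placeholder excerpts, NOT taken from the actual guideline.
PLACEHOLDER_CHUNKS = [
    "Placeholder excerpt A about implant planning.",
    "Placeholder excerpt B about caries diagnosis.",
    "Placeholder excerpt C about orthodontic assessment.",
]

question = "Is CBCT indicated for implant planning?"
prompt = build_prompt(question, PLACEHOLDER_CHUNKS)
```

The resulting `prompt` string would then be sent to a language model. A library such as LlamaIndex typically replaces the bag-of-words step with embedding-based retrieval, but the flow of question, retrieved context, and grounded answer is the same.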

Results: The GPT-4-based chatbot provided 100% correct recommendations and superior explanation quality compared to the one based on GPT-3.5-Turbo (87.5% vs. 57.5%; P = .003). Moreover, it outperformed early career practitioners in correct answers (P = .002 and P = .032) and earned higher trust than the chatbot using GPT-3.5-Turbo (P = .006).
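
As a rough illustration of how a chi-square comparison of two proportions works, the sketch below computes an uncorrected Pearson chi-square statistic for a 2x2 table using only the standard library. The counts are an assumption: they are back-computed from the reported percentages under the premise that both refer to the 40 questions (35 of 40 and 23 of 40). The abstract does not give the underlying table or state which test produced which P value, so this is not a reproduction of the authors' analysis. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square test for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts (not stated in the abstract): 35/40 vs. 23/40 rated favorably.
chi2, p = chi_square_2x2(35, 5, 23, 17)
```

With these assumed counts the statistic is about 9.03 on one degree of freedom. In practice a statistics package would also offer a continuity correction or an exact test, which matter when cell counts are small.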

Conclusions: A content-aware chatbot using GPT-4 reliably provided recommendations according to current consensus guidelines. The responses were deemed trustworthy and transparent, and therefore facilitate the integration of artificial intelligence into clinical decision-making.

Source journal
CiteScore: 5.60 | Self-citation rate: 9.10% | Articles published: 65 | Review time: 4-8 weeks
About the journal: Dentomaxillofacial Radiology (DMFR) is the journal of the International Association of Dentomaxillofacial Radiology (IADMFR) and covers the closely related fields of oral radiology and head and neck imaging. Established in 1972, DMFR is a key resource keeping dentists, radiologists, clinicians, and scientists with an interest in head and neck imaging abreast of important research and developments in oral and maxillofacial radiology. The DMFR editorial board features a panel of international experts, including Editor-in-Chief Professor Ralf Schulze, who provide their expertise and guidance in shaping the content and direction of the journal. Quick Facts: 2015 Impact Factor: 1.919; receipt to first decision: average of 3 weeks; acceptance to online publication: average of 3 weeks; open access option; ISSN: 0250-832X; eISSN: 1476-542X.