Evidence-Based Potential of Generative Artificial Intelligence Large Language Models on Dental Avulsion: ChatGPT Versus Gemini
Authors: Taibe Tokgöz Kaplan, Muhammet Cankar
Journal: Dental Traumatology (Journal Article), published 2024-11-02
DOI: 10.1111/edt.12999 (https://doi.org/10.1111/edt.12999)
Impact factor: 2.3; JCR: Q2 (Dentistry, Oral Surgery & Medicine)
Citations: 0
Abstract
Background: This study compared the accuracy and comprehensiveness of the answers given to questions about dental avulsion by two artificial intelligence-based language models, ChatGPT and Gemini.
Materials and methods: Based on the guidelines of the International Association of Dental Traumatology (IADT), a total of 33 questions about dental avulsion were prepared. They comprised multiple-choice, binary (true/false), and open-ended questions and covered both technical and patient questions. The questions were posed to ChatGPT and Gemini. Responses were recorded and scored by four pediatric dentists. Statistical analyses, including intraclass correlation coefficient (ICC) analysis, were performed to determine the agreement and accuracy of the responses. The significance level was set at p < 0.050.
Results: The mean score of the Gemini model was statistically significantly higher than that of ChatGPT (p = 0.001). ChatGPT answered open-ended and true/false (T/F) questions on dental avulsion more accurately and showed its lowest accuracy on the multiple-choice questions (MCQ). Gemini's median scores did not differ significantly across question types (p = 0.088). When ChatGPT and Gemini were compared with the Mann-Whitney U test without distinguishing between question types, Gemini's answers were statistically significantly more accurate (p = 0.004).
Conclusions: Evaluated against the IADT guideline for dental avulsion, the Gemini and ChatGPT language models show promise. Successful incorporation of large language models (LLMs) into practice will require additional research, clinical validation, and improvements to the models.
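For readers unfamiliar with the Mann-Whitney U test named in the Results, the following is a minimal sketch of how two sets of rater scores could be compared. It is not the authors' analysis: the abstract does not report the raw scores, the scoring scale, or the software used, so the score lists and the function names `average_ranks` and `mann_whitney_u` are hypothetical, and the two-sided p-value uses the normal approximation with a tie correction and no continuity correction.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # i and j are 0-based sorted positions
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = list(x) + list(y)
    ranks = average_ranks(combined)

    # U statistic for x from its rank sum; U_x + U_y always equals n1 * n2.
    u_x = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x

    # Variance of U under the null hypothesis, corrected for ties.
    counts = {}
    for v in combined:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation identical: no evidence of a shift
        return u_x, u_y, 1.0

    z = (u_x - n1 * n2 / 2) / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, u_y, p_two_sided


if __name__ == "__main__":
    # Hypothetical scores, invented for illustration only (not study data).
    chatgpt_scores = [3, 4, 2, 5, 3, 4, 2, 3]
    gemini_scores = [5, 4, 5, 5, 4, 5, 3, 5]
    u_chatgpt, u_gemini, p = mann_whitney_u(chatgpt_scores, gemini_scores)
    print(f"U (ChatGPT) = {u_chatgpt}, U (Gemini) = {u_gemini}, p = {p:.4f}")
```

With samples this small, an exact test would normally be preferred over the normal approximation; the approximation is used here only to keep the sketch dependency-free.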
About the journal:
Dental Traumatology is an international journal that aims to convey scientific and clinical progress in all areas related to adult and pediatric dental traumatology. This includes the following topics:
- Epidemiology, Social Aspects, Education, Diagnostics
- Esthetics / Prosthetics / Restorative
- Evidence Based Traumatology & Study Design
- Oral & Maxillofacial Surgery/Transplant/Implant
- Pediatrics and Orthodontics
- Prevention and Sports Dentistry
- Endodontics and Periodontal Aspects
The journal's aim is to promote communication among clinicians, educators, researchers, and others interested in the field of dental traumatology.