ScholarGPT's performance in oral and maxillofacial surgery.
Author: Yunus Balel
Journal: Journal of Stomatology Oral and Maxillofacial Surgery, p. 102114 (impact factor 2.2; JCR Q2, Dentistry)
Published: 2024-10-09
DOI: 10.1016/j.jormas.2024.102114 (https://doi.org/10.1016/j.jormas.2024.102114)
Citations: 0
Abstract
Objective: The purpose of this study is to evaluate the performance of Scholar GPT in answering technical questions in the field of oral and maxillofacial surgery and to conduct a comparative analysis with the results of a previous study that assessed the performance of ChatGPT.
Materials and methods: Scholar GPT was accessed via ChatGPT (www.chatgpt.com) on March 20, 2024. A total of 60 technical questions (15 each on impacted teeth, dental implants, temporomandibular joint disorders, and orthognathic surgery) from our previous study were used. Scholar GPT's responses were evaluated using a modified Global Quality Scale (GQS). The questions were randomized before scoring using an online randomizer (www.randomizer.org). A single researcher performed the evaluations at three different times, three weeks apart, with each evaluation preceded by a new randomization. In cases of score discrepancies, a fourth evaluation was conducted to determine the final score.
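The scoring procedure described above can be sketched as follows. This is only an illustrative sketch: the study used an online randomizer, and Python's `random` module stands in for it here. The function names, the seeds, and the rule that three matching scores stand as the final score are assumptions, not details given in the abstract.

```python
import random


def randomized_order(question_ids, seed):
    """Return a freshly shuffled copy of the question IDs for one scoring round.

    Assumption: a seeded pseudo-random shuffle stands in for the online
    randomizer mentioned in the abstract.
    """
    rng = random.Random(seed)
    order = list(question_ids)
    rng.shuffle(order)
    return order


def final_score(round_scores, fourth_score=None):
    """Resolve the final score for one question from three scoring rounds.

    Assumption: if all three rounds agree, that score is final; otherwise
    the fourth evaluation decides, as the abstract describes.
    """
    if len(set(round_scores)) == 1:
        return round_scores[0]
    if fourth_score is None:
        raise ValueError("scores disagree; a fourth evaluation is needed")
    return fourth_score


# 60 questions, re-randomized before each of the three scoring rounds.
question_ids = list(range(1, 61))
orders = [randomized_order(question_ids, seed) for seed in (1, 2, 3)]
```

Re-shuffling before each round is meant to keep the rater from recalling earlier scores by position, which is the stated purpose of the repeated randomization.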
Results: Scholar GPT performed well across all technical questions, with an average GQS score of 4.48 (SD=0.93). By comparison, ChatGPT's average GQS score in the previous study was 3.1 (SD=1.492). The Wilcoxon Signed-Rank Test indicated a statistically significantly higher average score for Scholar GPT than for ChatGPT (Mean Difference = 2.00, SE = 0.163, p < 0.001). The Kruskal-Wallis Test showed no statistically significant differences among the topic groups (χ² = 0.799, df = 3, p = 0.850, ε² = 0.0135).
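The reported Kruskal-Wallis effect size can be checked from the other reported figures. The sketch below assumes the commonly used definition ε² = H / (n − 1), where H is the Kruskal-Wallis statistic (reported as χ²) and n is the total number of observations. The abstract does not state which formula was used, so this is a consistency check, not a reconstruction of the author's analysis.

```python
def epsilon_squared(h_statistic, n_total):
    """Epsilon-squared effect size for a Kruskal-Wallis test.

    Assumption: the common definition H / (n - 1) is the one behind
    the reported value.
    """
    return h_statistic / (n_total - 1)


n_questions = 60      # 15 questions in each of 4 topic groups
n_groups = 4
h_reported = 0.799    # Kruskal-Wallis statistic from the abstract

degrees_of_freedom = n_groups - 1
eps2 = epsilon_squared(h_reported, n_questions)

print(degrees_of_freedom)  # 3
print(round(eps2, 4))      # 0.0135
```

Under this assumption, both the degrees of freedom (3) and the effect size (0.0135) agree with the values given in the Results.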
Conclusion: Scholar GPT demonstrated generally high performance on technical questions in oral and maxillofacial surgery and produced more consistent and higher-quality responses than ChatGPT. The findings suggest that GPT models based on academic databases can provide more accurate and reliable information. Additionally, developing a specialized GPT model for oral and maxillofacial surgery could ensure higher quality and consistency in artificial intelligence-generated information.
About the journal:
J Stomatol Oral Maxillofac Surg publishes research papers and techniques - (guest) editorials, original articles, reviews, technical notes, case reports, images, letters to the editor, guidelines - dedicated to enhancing surgical expertise in all fields relevant to oral and maxillofacial surgery: from plastic and reconstructive surgery of the face, oral surgery and medicine, … to dentofacial and maxillofacial orthopedics.
Original articles include clinical or laboratory investigations and clinical or equipment reports. Reviews include narrative reviews, systematic reviews and meta-analyses.
All manuscripts submitted to the journal are subjected to peer review by international experts, and must:
Be written in excellent English, clear and easy to understand, precise and concise;
Bring new, interesting, valid information - and improve clinical care or guide future research;
Be solely the work of the author(s) stated;
Not have been previously published elsewhere and not be under consideration by another journal;
Be in accordance with the journal's Guide for Authors' instructions: manuscripts that fail to comply with these rules may be returned to the authors without being reviewed.
Under no circumstances does the journal guarantee publication before the editorial board makes its final decision.
The journal is indexed in the main international databases and is accessible worldwide through the ScienceDirect and ClinicalKey Platforms.