Utility of Generative Artificial Intelligence for Patient Care Counseling for Mandibular Fractures.

Journal of Craniofacial Surgery | IF 1.0 | Zone 4 (Medicine) | JCR Q3 (Surgery) | Pub Date: 2025-07-01 | Epub Date: 2024-11-04 | DOI: 10.1097/SCS.0000000000010832

Ariana L Shaari, Disha P Patil, Saad Mohammed, Parsa P Salehi

Journal of Craniofacial Surgery, pp. 1459-1463. Publication type: Journal Article. https://doi.org/10.1097/SCS.0000000000010832

Citations: 0

Abstract

Objective: To determine the readability and accuracy of information regarding mandible fractures generated by Chat Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4o.

Background: Patients are increasingly turning to generative artificial intelligence to answer medical queries. To date, the accuracy and readability of responses regarding mandible fractures have not been assessed.

Methods: Twenty patient questions regarding mandible fractures were developed by querying AlsoAsked (https://alsoasked.com), SearchResponse (https://searchresponse.io), and Answer the Public (https://answerthepublic.com/). Each question was posed to ChatGPT 3.5 and 4o. Readability was assessed by calculating the Flesch-Kincaid Reading Ease, Flesch-Kincaid Grade Level, number of sentences, and percentage of complex words. Accuracy was assessed by a board-certified facial plastic and reconstructive otolaryngologist using a 5-point Likert scale.
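The abstract does not say which tool was used to compute the readability metrics. As a rough illustration only, the sketch below applies the standard published Flesch Reading Ease and Flesch-Kincaid Grade Level formulas to a block of text. The syllable counter is a crude vowel-group heuristic, and treating a "complex word" as one with three or more syllables is an assumption borrowed from common readability tools, not a definition taken from the paper. The function names `count_syllables` and `readability` are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic, not exact)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in endings such as "-le" or "-ee".
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return sentence count, Flesch scores, and percentage of complex words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(syllables) / len(words)

    return {
        "sentences": len(sentences),
        # Flesch Reading Ease: higher scores mean easier text.
        "reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
        # Flesch-Kincaid Grade Level: approximate US school grade.
        "grade_level": 0.39 * words_per_sentence
        + 11.8 * syllables_per_word
        - 15.59,
        # Assumption: a "complex" word has three or more syllables.
        "complex_word_pct": 100 * sum(1 for s in syllables if s >= 3) / len(words),
    }


if __name__ == "__main__":
    sample = "A mandible fracture is a break in the lower jaw. It may need surgery."
    for name, value in readability(sample).items():
        print(f"{name}: {value:.2f}")
```

Because the syllable count is heuristic, scores from this sketch can differ slightly from those of dedicated readability software, so it is better suited to comparing responses against each other than to reproducing the study's exact values.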

Results: No significant differences in readability or accuracy were observed between the two versions. Responses were written above the reading level recommended for patient education materials. Accuracy was low, and a majority of responses were deemed inappropriate for patient use because of multiple inaccuracies and/or missing information.

Conclusion: ChatGPT produced responses written at a high level inappropriate for the average patient, in addition to containing several inaccurate statements. Patients and clinicians should be aware of the limitations of generative artificial intelligence when seeking medical information regarding mandible fractures.


Source journal

Journal of Craniofacial Surgery (Medicine: Surgery)

CiteScore: 1.70

Self-citation rate: 11.10%

Articles published: 968

Review time: 1.5 months

Journal description: The Journal of Craniofacial Surgery serves as a forum of communication for all those involved in craniofacial surgery, maxillofacial surgery and pediatric plastic surgery. Coverage ranges from practical aspects of craniofacial surgery to the basic science that underlies surgical practice. The journal publishes original articles, scientific reviews, editorials and invited commentary, abstracts and selected articles from international journals, and occasional international bibliographies in craniofacial surgery.