Emily Ajit-Roger, Alexander Moise, Carolina Peralta, Ostap Orishchak, Sam J Daniel
Title: Enhancing Multilingual Patient Education: ChatGPT's Accuracy and Readability for SSNHL Queries in English and Spanish
Journal: OTO Open, 8(4), e70048 (Journal Article, eCollection)
Published: 2024-12-11 (electronic publication 2024-10-01)
Impact factor: 1.8; JCR: Q2 (Otorhinolaryngology)
DOI: https://doi.org/10.1002/oto2.70048
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11633712/pdf/
Abstract
Objective: This study investigates ChatGPT's accuracy, readability, understandability, and actionability in responding to patient queries on sudden sensorineural hearing loss (SSNHL) in English and Spanish, compared with Google responses. The aim is to address concerns about its proficiency in answering medical inquiries posed in a language other than its primary language.
Study design: Observational.
Setting: Virtual environment.
Methods: Questions from the AAO-HNSF guidelines were presented to ChatGPT 3.5 and Google in English and Spanish. Responses were graded by 2 otolaryngologists proficient in both languages using a 4-point Likert scale and the PEMAT-P tool. To ensure uniform application of the Likert scale, a third independent evaluator reviewed the consistency of the grading. Readability was evaluated using 3 different tools specific to each language. Statistical analysis was performed in IBM SPSS Version 29 using one-way analysis of variance.
Results: Across both languages, the responses displayed a native-level language proficiency. Accuracy was comparable between sources and languages. Google's Spanish responses had better readability (effect size 0.35, P < .001), while Google's English responses were more understandable (effect size 0.67, P = .018). ChatGPT's English responses demonstrated the highest level of actionability (60%), though not significantly different when compared to other sources (effect size 0.47, P = .14).
Conclusion: ChatGPT offers patients comprehensive and guideline-conforming answers to SSNHL patient medical queries in the 2 most spoken languages in the United States. However, improvements in its readability and understandability are warranted for more accessible patient education.
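The one-way analysis of variance named in the Methods can be illustrated with a minimal sketch. The study ran its analysis in IBM SPSS; the Python below is only an illustration of the underlying calculation. The scores are hypothetical and are not data from the study, the group names are placeholders, and eta squared is assumed as the effect-size measure because the abstract does not say which one was reported.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, eta squared) for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Eta squared (assumed effect-size measure): share of variance between groups.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


# Hypothetical 4-point Likert scores per source and language (NOT study data).
scores = {
    "chatgpt_en": [1, 2, 3],
    "chatgpt_es": [2, 3, 4],
    "google_en": [2, 3, 4],
    "google_es": [1, 2, 3],
}

f_stat, eta_sq = one_way_anova(list(scores.values()))
print(f"F = {f_stat:.2f}, eta squared = {eta_sq:.3f}")
```

Turning the F statistic into a P value requires the F distribution with (3, 8) degrees of freedom for this toy input, which a statistics package such as SPSS supplies.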