Quynh-Lam Tran, Pauline P Huynh, Bryan Le, Nancy Jiang
{"title":"Utilization of Artificial Intelligence in the Creation of Patient Information on Laryngology Topics.","authors":"Quynh-Lam Tran, Pauline P Huynh, Bryan Le, Nancy Jiang","doi":"10.1002/lary.31891","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>To evaluate and compare the readability and quality of patient information generated by Chat-Generative Pre-Trained Transformer-3.5 (ChatGPT) and the American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) using validated instruments including Flesch-Kincaid Grade Level (FKGL), Flesch Reading Ease, DISCERN, and Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P).</p><p><strong>Methods: </strong>ENTHealth.org and ChatGPT-3.5 were queried for patient information on laryngology topics. ChatGPT-3.5 was queried twice for a given topic to evaluate for reliability. This generated three de-identified text documents for each topic: one from AAO-HNS and two from ChatGPT (ChatGPT Output 1, ChatGPT Output 2). Grade level and reading ease were compared between the three sources using a one-way analysis of variance and Tukey's post hoc test. Independent t-tests were used to compare DISCERN and PEMAT understandability and actionability scores between AAO-HNS and ChatGPT Output 1.</p><p><strong>Results: </strong>Material generated from ChatGPT Output 1 and ChatGPT Output 2 were at least two reading grade levels higher than that of material from AAO-HNS (p < 0.001). Regarding reading ease, ChatGPT Output 1 and ChatGPT Output 2 documents had significantly lower mean scores compared to AAO-HNS (p < 0.001). Moreover, ChatGPT Output 1 material on vocal cord paralysis had a lower PEMAT-P understandability compared to that of AAO-HNS material (p > 0.05).</p><p><strong>Conclusion: </strong>Patient information on the ENTHealth.org website for select laryngology topics was, on average, of a lower grade level and higher reading ease compared to that produced by ChatGPT, but interestingly with largely no difference in the quality of information provided.</p><p><strong>Level of evidence: </strong>NA Laryngoscope, 2024.</p>","PeriodicalId":49921,"journal":{"name":"Laryngoscope","volume":" ","pages":""},"PeriodicalIF":2.2000,"publicationDate":"2024-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Laryngoscope","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/lary.31891","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"MEDICINE, RESEARCH & EXPERIMENTAL","Score":null,"Total":0}
Citations: 0
Abstract
Objective: To evaluate and compare the readability and quality of patient information generated by Chat-Generative Pre-Trained Transformer-3.5 (ChatGPT) and the American Academy of Otolaryngology-Head and Neck Surgery (AAO-HNS) using validated instruments including Flesch-Kincaid Grade Level (FKGL), Flesch Reading Ease, DISCERN, and Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P).
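Both readability indices are closed-form functions of word, sentence, and syllable counts. The following Python sketch shows the standard FKGL and Flesch Reading Ease formulas; the vowel-group syllable counter is a naive stand-in for illustration, not the validated calculator used in the study.

```python
import re

def count_syllables(word: str) -> int:
    """Naive heuristic: count groups of consecutive vowels.
    Published readability calculators use more refined rules."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def readability(text: str) -> tuple[float, float]:
    """Return (Flesch-Kincaid Grade Level, Flesch Reading Ease)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)

    words_per_sentence = n_words / sentences
    syllables_per_word = n_syllables / n_words

    # Standard formula constants for FKGL and Flesch Reading Ease.
    fkgl = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    fre = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    return fkgl, fre

print(readability("The vocal cords vibrate to produce sound. Paralysis can change the voice."))
```

Higher FKGL means a higher school grade level is needed to read the text; higher Reading Ease means the opposite, which is why the two scores move in opposite directions in the results below.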
Methods: ENTHealth.org and ChatGPT-3.5 were queried for patient information on laryngology topics. ChatGPT-3.5 was queried twice for a given topic to evaluate for reliability. This generated three de-identified text documents for each topic: one from AAO-HNS and two from ChatGPT (ChatGPT Output 1, ChatGPT Output 2). Grade level and reading ease were compared between the three sources using a one-way analysis of variance and Tukey's post hoc test. Independent t-tests were used to compare DISCERN and PEMAT understandability and actionability scores between AAO-HNS and ChatGPT Output 1.
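As a rough sketch of this statistical workflow (using made-up score vectors, not the study's data), the described comparisons map onto standard SciPy and statsmodels calls:

```python
import numpy as np
from scipy.stats import f_oneway, ttest_ind
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical per-topic FKGL scores for each source (illustrative only).
aao_hns  = np.array([7.1, 6.8, 7.5, 6.9, 7.2])
gpt_out1 = np.array([9.8, 10.1, 9.5, 10.4, 9.9])
gpt_out2 = np.array([9.6, 10.0, 9.9, 10.2, 9.7])

# One-way ANOVA across the three sources.
f_stat, p_anova = f_oneway(aao_hns, gpt_out1, gpt_out2)
print(f"ANOVA: F={f_stat:.2f}, p={p_anova:.4f}")

# Tukey's post hoc test to identify which pairs of sources differ.
scores = np.concatenate([aao_hns, gpt_out1, gpt_out2])
groups = ["AAO-HNS"] * 5 + ["GPT-1"] * 5 + ["GPT-2"] * 5
print(pairwise_tukeyhsd(scores, groups))

# Independent t-test, as used for DISCERN and PEMAT-P scores
# (AAO-HNS vs. ChatGPT Output 1).
t_stat, p_t = ttest_ind(aao_hns, gpt_out1)
print(f"t-test: t={t_stat:.2f}, p={p_t:.4f}")
```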
Results: Materials generated by ChatGPT Output 1 and ChatGPT Output 2 were written at least two reading grade levels higher than material from AAO-HNS (p < 0.001). Regarding reading ease, ChatGPT Output 1 and ChatGPT Output 2 documents had significantly lower mean scores than AAO-HNS (p < 0.001). Moreover, ChatGPT Output 1 material on vocal cord paralysis had lower PEMAT-P understandability than the corresponding AAO-HNS material (p < 0.05).
Conclusion: Patient information on the ENTHealth.org website for select laryngology topics was, on average, written at a lower grade level and with higher reading ease than that produced by ChatGPT, but, interestingly, with largely no difference in the quality of the information provided.
Level of evidence: NA. Laryngoscope, 2024.
Journal Introduction
The Laryngoscope has been the leading source of information on advances in the diagnosis and treatment of head and neck disorders since 1890. The Laryngoscope is the first choice among otolaryngologists for publication of their important findings and techniques. Each monthly issue of The Laryngoscope features peer-reviewed medical, clinical, and research contributions in general otolaryngology, allergy/rhinology, otology/neurotology, laryngology/bronchoesophagology, head and neck surgery, sleep medicine, pediatric otolaryngology, facial plastics and reconstructive surgery, oncology, and communicative disorders. Contributions include papers and posters presented at the Annual and Section Meetings of the Triological Society, as well as independent papers, "How I Do It", "Triological Best Practice" articles, and contemporary reviews. Theses authored by the Triological Society’s new Fellows as well as papers presented at meetings of the American Laryngological Association are published in The Laryngoscope.
• Broncho-esophagology
• Communicative disorders
• Head and neck surgery
• Plastic and reconstructive facial surgery
• Oncology
• Speech and hearing defects