Comparing ChatGPT and clinical nurses’ performances on tracheostomy care: A cross-sectional study
Tongyao Wang, Juan Mu, Jialing Chen, Chia-Chin Lin
International Journal of Nursing Studies Advances, Volume 6, Article 100181 (2024). DOI: 10.1016/j.ijnsa.2024.100181
Abstract
Background
OpenAI’s release of ChatGPT for general use in 2023 has significantly expanded the possible applications of generative artificial intelligence in the healthcare sector, particularly for information retrieval by patients, medical and nursing students, and healthcare personnel.
Objective
To compare the performance of ChatGPT-3.5 and ChatGPT-4.0 with that of clinical nurses in answering questions about tracheostomy care, and to determine whether using different prompts to pre-define ChatGPT’s scope affects the accuracy of its responses.
Design
Cross-sectional study.
Setting
The ChatGPT responses were collected from ChatGPT-3.5 and ChatGPT-4.0 using access provided by the University of Hong Kong. The data from clinical nurses working in mainland China were collected using the Qualtrics survey program.
Participants
No participants were needed to collect the ChatGPT responses. A total of 272 clinical nurses, 98.5 % of whom worked in tertiary care hospitals in mainland China, were recruited using a snowball sampling approach.
Method
We used 43 tracheostomy care-related questions in a multiple-choice format to evaluate the performance of ChatGPT-3.5, ChatGPT-4.0, and clinical nurses. ChatGPT-3.5 and ChatGPT-4.0 were each queried three times with the same questions, once under each of three prompt conditions: no prompt, a patient-friendly prompt, and an act-as-nurse prompt. All responses were independently graded by two qualified otorhinolaryngology nurses on a 3-point accuracy scale (correct, partially correct, and incorrect). Chi-squared tests and Fisher’s exact tests with post-hoc Bonferroni adjustment were used to assess differences in performance among the three groups, as well as differences in accuracy across the prompt conditions.
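The contingency-table analysis described above can be reproduced in outline with standard tools. The sketch below, in Python with SciPy, uses hypothetical rating counts (the paper’s per-question data are not reproduced here); it runs the omnibus chi-squared test across the three responder groups, then Bonferroni-adjusted pairwise follow-ups. Note that SciPy’s `fisher_exact` handles only 2×2 tables, so this sketch uses chi-squared tests throughout, whereas the study additionally applied Fisher’s exact test where expected cell counts were small.

```python
# Illustrative sketch (not the authors' code): comparing accuracy-rating
# distributions across three responder groups, as described in the Method.
# All counts below are hypothetical.
from itertools import combinations
from scipy.stats import chi2_contingency

# Rows: responder groups; columns: counts of responses rated
# correct / partially correct / incorrect over the 43 questions.
groups = ["ChatGPT-4.0", "ChatGPT-3.5", "Nurses"]
table = [
    [28, 10, 5],   # ChatGPT-4.0 (hypothetical)
    [26, 11, 6],   # ChatGPT-3.5 (hypothetical)
    [16, 14, 13],  # Nurses      (hypothetical)
]

# Omnibus chi-squared test across all three groups.
chi2, p, dof, _ = chi2_contingency(table)
print(f"Omnibus: chi2 = {chi2:.3f}, dof = {dof}, p = {p:.4f}")

# Post-hoc pairwise tests with Bonferroni adjustment: three pairwise
# comparisons, so the significance threshold becomes 0.05 / 3.
pairs = list(combinations(range(len(groups)), 2))
alpha = 0.05 / len(pairs)
for i, j in pairs:
    chi2_ij, p_ij, _, _ = chi2_contingency([table[i], table[j]])
    flag = "significant" if p_ij < alpha else "not significant"
    print(f"{groups[i]} vs {groups[j]}: chi2 = {chi2_ij:.3f}, "
          f"p = {p_ij:.4f} ({flag} at adjusted alpha = {alpha:.4f})")
```

The same pattern extends to the prompt-condition comparison: build one row per prompt condition for a given model and run `chi2_contingency` over that table.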
Results
ChatGPT-4.0 showed significantly higher accuracy, with 64.3 % of responses rated as ‘correct’, compared with 60.5 % for ChatGPT-3.5 and 36.7 % for clinical nurses (χ² = 74.192, p < .001). Except for the ‘care for the tracheostomy stoma and surrounding skin’ domain (χ² = 6.227, p = .156), ChatGPT-3.5 and ChatGPT-4.0 scored significantly better than the nurses on the domains of airway humidification, cuff management, tracheostomy tube care, suction techniques, and management of complications. Overall, ChatGPT-4.0 performed consistently well, achieving over 50 % accuracy in every domain. Altering the prompt had no impact on the performance of either ChatGPT-3.5 or ChatGPT-4.0.
Conclusion
ChatGPT may serve as a complementary medical information tool for patients and physicians to improve knowledge in tracheostomy care.
Tweetable abstract
ChatGPT-4.0 can answer tracheostomy care questions better than most clinical nurses. There is no reason nurses should not be using it.