The evaluation of the performance of ChatGPT in the management of labor analgesia
Nada Ismaiel MD, Teresa Phuongtram Nguyen MD, Nan Guo PhD, Brendan Carvalho MBBCh, Pervez Sultan MBChB
Journal of Clinical Anesthesia, volume 98, Article 111582. Published 2024-08-20. DOI: 10.1016/j.jclinane.2024.111582
https://www.sciencedirect.com/science/article/pii/S0952818024002113
Abstract
ChatGPT4 is a leading large language model (LLM) chatbot released by OpenAI in 2023. ChatGPT4 can respond to free-text queries, answer questions and make suggestions regarding virtually any topic. ChatGPT4 has successfully answered anesthesia and even obstetric anesthesia knowledge-based questions with reasonable accuracy. However, ChatGPT4 has yet to be challenged in obstetric anesthesia clinical decision-making.
Study Objective: In this study, we evaluated the performance of ChatGPT4 in the management of clinical labor analgesia scenarios compared to expert obstetric anesthesiologists.
Intervention: Eight clinical questions with progressively increasing medical complexity were posed to ChatGPT4.
Measurements: The ChatGPT4 responses were rated by seven expert obstetric anesthesiologists based on safety, accuracy and completeness of each response using a five-point Likert rating scale.
Main Results: ChatGPT4 was deemed safe in 73% of responses to the presented obstetric anesthesia clinical scenarios (27% of responses were deemed unsafe). None of the ChatGPT4 responses were unanimously deemed to be safe by all seven expert obstetric anesthesiologists. Moreover, ChatGPT4 responses were overall partly accurate (score 4 out of 5) and somewhat incomplete (score 3.5 out of 5).
Conclusions: In summary, approximately one quarter of all responses by ChatGPT4 were deemed unsafe by expert obstetric anesthesiologists. These findings may suggest the need for more fine-tuning and training of LLMs such as ChatGPT4 specifically for clinical decision making in obstetric anesthesia or other specialized medical fields. These LLMs may come to play an important future role in assisting obstetric anesthesiologists in clinical decision making and enhancing overall patient care.
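As an illustration of how rater scores of the kind described under Measurements could be aggregated, the following Python sketch tallies the share of ratings counted as safe, the number of responses every rater counted as safe, and the median score. The abstract does not state how Likert scores were mapped to "safe" or how the summary scores were computed, so the threshold of 4, the function name `summarize_safety`, and the example ratings are all assumptions made for illustration; none of the numbers are the study's data.

```python
from statistics import median

# Assumption: a Likert score of 4 or 5 counts as "safe".
# The abstract does not say how the authors dichotomized the scale.
SAFE_THRESHOLD = 4


def summarize_safety(ratings, threshold=SAFE_THRESHOLD):
    """Summarize a table of 1-5 Likert scores.

    ratings[q][r] is the score that rater r gave to the response for question q.
    """
    all_scores = [score for row in ratings for score in row]
    safe_count = sum(score >= threshold for score in all_scores)
    # A response is unanimously safe only if every rater scored it at or above the threshold.
    unanimous = sum(all(score >= threshold for score in row) for row in ratings)
    return {
        "percent_safe": 100 * safe_count / len(all_scores),
        "unanimously_safe_responses": unanimous,
        "median_score": median(all_scores),
    }


if __name__ == "__main__":
    # Hypothetical ratings (3 questions x 4 raters), invented for this sketch.
    example = [
        [5, 4, 4, 2],
        [4, 3, 5, 5],
        [2, 4, 4, 3],
    ]
    print(summarize_safety(example))
```

With eight questions and seven raters, the same function would take an 8 x 7 table. A single low score from any rater is enough to keep a response out of the unanimous count, which is why a response set can have a majority of safe ratings and still have no unanimously safe response.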
About the journal:
The Journal of Clinical Anesthesia (JCA) addresses all aspects of anesthesia practice, including anesthetic administration, pharmacokinetics, preoperative and postoperative considerations, coexisting disease and other complicating factors, cost issues, and similar concerns anesthesiologists contend with daily. Exceptionally high standards of presentation and accuracy are maintained.
The core of the journal is rigorously peer-reviewed original contributions on subjects relevant to clinical practice. Highly respected international experts have joined together to form the Editorial Board, sharing their years of experience and clinical expertise. Specialized section editors cover the various subspecialties within the field. To keep your practical clinical skills current, the journal bridges the gap between the laboratory and the clinical practice of anesthesiology and critical care to clarify how new insights can improve daily practice.