Mark Miller, William T DiCiurcio, Matthew Meade, Levi Buchan, Jeffrey Gleimer, Barrett Woods, Christopher Kepler
Title: Appropriateness and Consistency of an Online Artificial Intelligence System's Response to Common Questions Regarding Cervical Fusion
Journal: Clinical Spine Surgery (impact factor 1.6; JCR Q3, Clinical Neurology)
Publication date: 2025-02-10
Publication type: Journal Article
DOI: https://doi.org/10.1097/BSD.0000000000001768
Citation count: 0
Abstract
Study design: Prospective survey study.
Objective: To address a gap in the literature concerning ChatGPT's ability to respond to various types of questions regarding cervical surgery.
Summary of background data: Artificial Intelligence (AI) and machine learning have substantially changed the landscape of scientific research. Chat Generative Pre-trained Transformer (ChatGPT), an online AI language model, has emerged as a powerful tool in clinical medicine and surgery. Previous studies have demonstrated appropriate and reliable responses from ChatGPT to patient questions regarding total joint arthroplasty, distal radius fractures, and lumbar laminectomy. However, the accuracy and reliability of ChatGPT responses to common questions related to cervical surgery have not been examined.
Materials and methods: Twenty questions regarding cervical surgery were presented to the online ChatGPT-3.5 web application 3 separate times, creating 60 responses. Responses were then analyzed by 3 fellowship-trained spine surgeons across 2 institutions using a modified Global Quality Scale (1-5 rating) to evaluate accuracy and utility. Descriptive statistics were reported based on responses, and intraclass correlation coefficients were then calculated to assess the consistency of response quality.
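The test-retest consistency check described above can be sketched in code. The abstract does not state which ICC form was used, so the one-way random-effects, single-measurement form (often written ICC(1,1)) below is an assumption, and the function name `icc_oneway` and the toy scores are hypothetical illustrations, not study data.

```python
from statistics import mean

def icc_oneway(scores):
    """One-way random-effects, single-measurement ICC (assumed form).

    scores: one row per question, one column per repeated run.
    """
    n = len(scores)       # number of questions
    k = len(scores[0])    # number of repeated runs per question
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    # Between-question mean square: spread of per-question averages
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    # Within-question mean square: spread of repeated runs around each average
    msw = sum(
        (v - m) ** 2 for row, m in zip(scores, row_means) for v in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Made-up ratings for 4 hypothetical questions, 3 runs each (not study data)
toy = [
    [3, 4, 2],
    [4, 3, 3],
    [2, 3, 4],
    [3, 2, 4],
]
print(round(icc_oneway(toy), 3))
```

When repeated runs of the same question vary about as much as scores vary between questions, the within-question term dominates and the coefficient falls toward or below zero, which is the pattern a poor test-retest result reflects.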
Results: Across all questions posed to the AI platform, the average score was 3.17 (95% CI, 2.92, 3.42), with 66.7% of responses rated at least "moderate" quality by 1 reviewer. Nine (45%) questions yielded responses that were graded at least "moderate" quality by all 3 reviewers. Test-retest reliability was poor, with an intraclass correlation coefficient (ICC) of 0.0941 (-0.222, 0.135).
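A mean with a 95% confidence interval, in the format reported above, can be computed as in the following minimal sketch. The abstract does not say how the interval was derived, so the normal approximation used here is an assumption, and `mean_ci` and the toy ratings are hypothetical, not study data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_ci(values, level=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(values)
    m = mean(values)
    # Two-sided critical value, about 1.96 for a 95% interval
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * stdev(values) / sqrt(n)
    return m, m - half_width, m + half_width

# Made-up 1-5 quality ratings (not study data)
toy_scores = [3, 4, 2, 4, 3, 3, 2, 3, 4, 3]
m, lo, hi = mean_ci(toy_scores)
print(f"{m:.2f} (95% CI, {lo:.2f}, {hi:.2f})")
```

For small samples a t-distribution critical value would give a slightly wider interval; the normal approximation is used here only to keep the sketch within the standard library.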
Conclusion: This study demonstrated that ChatGPT can answer common patient questions concerning cervical surgery with moderate quality in the majority of responses. Further research within AI is necessary to improve its responses.
About the journal:
Clinical Spine Surgery is the ideal journal for the busy practicing spine surgeon or trainee, as it is the only journal necessary to keep up to date with new clinical research and surgical techniques. Readers get to watch leaders in the field debate controversial topics in a new controversies section, and gain access to evidence-based reviews of important pathologies in the systematic reviews section. The journal features a surgical technique complete with a video, and a tips and tricks section that allows surgeons to review the important steps prior to a complex procedure.
Clinical Spine Surgery provides readers with primary research studies, specifically level 1, 2 and 3 studies, ensuring that articles that may actually change a surgeon’s practice will be read and published. Each issue includes a brief article that will help a surgeon better understand the business of healthcare, as well as an article that will help a surgeon understand how to interpret increasingly complex research methodology. Clinical Spine Surgery is your single source for up-to-date, evidence-based recommendations for spine care.