Evaluating ChatGPT o1’s Capabilities in Peripheral Nerve Surgery: Advancing Artificial Intelligence in Clinical Practice

IF 2.1 | Zone 4 (Medicine) | Q3 Clinical Neurology | World Neurosurgery | Pub Date: 2025-04-01 | Epub Date: 2025-03-06 | DOI: 10.1016/j.wneu.2025.123753
Tim Leypold, Jörg Bahm, Justus P. Beier, Vincent GJ. Guillaume, Tekoshin Ammo, Henrik Lauer, Jonas Kolbenschlag, Benedikt Schäfer
World Neurosurgery, volume 196, Article 123753 (Journal Article). https://www.sciencedirect.com/science/article/pii/S1878875025001093

Abstract

Objective

Artificial intelligence (AI) continues to advance in healthcare, offering innovative approaches to enhance clinical decision-making and patient management. Peripheral nerve surgery poses unique challenges due to the complexity of cases and the need for precise diagnostic and therapeutic strategies. This study investigates the application of OpenAI's generative AI model, o1, in assisting with intricate decision-making processes in peripheral nerve surgery.

Methods

Using advanced prompt engineering techniques, o1 was configured as a virtual medical assistant (Generative Pretrained Transformer–Nerve Surgery [GPT-NS]) to process 5 simulated clinical scenarios modeled after real-world cases. The AI guided surgeons through medical history, diagnostics, and treatment planning, culminating in case summaries. A panel of nerve surgery specialists and residents evaluated the AI's performance using a Likert scale across 7 criteria.

Results

GPT-NS demonstrated strong capabilities, achieving an average score of 4.3. High ratings were observed for understanding clinical issues and case presentation clarity. However, areas for improvement were noted in diagnostic sequencing and treatment recommendations. Although a lower score indicated that the human evaluators perceived themselves as superior to the AI in handling the cases, GPT-NS showed promise as a supportive tool in clinical practice.
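
The abstract does not say how the average score was aggregated. As a minimal sketch, assuming each evaluator rates each scenario on each criterion and that all ratings are pooled with equal weight, per-criterion means and an overall mean could be computed as follows. The criterion names and every rating below are hypothetical placeholders, not data from the study.

```python
from statistics import mean

# Hypothetical placeholder ratings, NOT study data.
# ratings[criterion] holds one Likert score per (evaluator, scenario) pair.
ratings = {
    "criterion_a": [5, 4, 5, 4],
    "criterion_b": [4, 4, 3, 5],
    "criterion_c": [3, 4, 4, 3],
}


def criterion_means(ratings):
    """Return the mean Likert score for each criterion."""
    return {name: mean(scores) for name, scores in ratings.items()}


def overall_mean(ratings):
    """Return the mean over all pooled ratings (not a mean of per-criterion means)."""
    all_scores = [s for scores in ratings.values() for s in scores]
    return mean(all_scores)


per_criterion = criterion_means(ratings)
overall = overall_mean(ratings)
print(per_criterion, overall)
```

Pooling all ratings and averaging the per-criterion means give the same result only when every criterion has the same number of ratings, so the choice matters if some ratings are missing.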

Conclusions

As the performance of large language model AI continues to improve, it is becoming increasingly important that domain experts assess the accuracy of its answers to ensure reliable and clinically sound integration into healthcare practice. This study underscores the potential of large language model AI in augmenting clinical decision-making in highly specialized fields like peripheral nerve surgery while demonstrating the ongoing importance of human expertise. Future research should explore ways to further refine AI capabilities and assess their integration into routine surgical workflows.
Source journal
World Neurosurgery (Clinical Neurology; Surgery)
CiteScore: 3.90
Self-citation rate: 15.00%
Articles published: 1765
Review time: 47 days
Journal description: World Neurosurgery has an open access mirror journal, World Neurosurgery: X, sharing the same aims and scope, editorial team, submission system, and rigorous peer review. The journal's mission is to:
- Provide a first-class international forum and a two-way conduit for dialogue that is relevant to neurosurgeons and providers who care for neurosurgery patients. The categories of exchanged information include clinical and basic science, as well as global information that provides social, political, educational, economic, cultural, or societal insights and knowledge of significance and relevance to worldwide neurosurgery patient care.
- Act as a primary intellectual catalyst for the stimulation of creativity, the creation of new knowledge, and the enhancement of quality neurosurgical care worldwide.
- Provide a forum for communication that enriches the lives of all neurosurgeons and their colleagues and, in so doing, enriches the lives of their patients.
Topics addressed in World Neurosurgery include education, economics, research, politics, history, culture, clinical science, laboratory science, technology, operative techniques, clinical images, and videos.