Evaluating ChatGPT o1’s Capabilities in Peripheral Nerve Surgery: Advancing Artificial Intelligence in Clinical Practice
Tim Leypold, Jörg Bahm, Justus P. Beier, Vincent GJ. Guillaume, Tekoshin Ammo, Henrik Lauer, Jonas Kolbenschlag, Benedikt Schäfer
World Neurosurgery, volume 196, Article 123753. Published 2025-03-06.
DOI: 10.1016/j.wneu.2025.123753
URL: https://www.sciencedirect.com/science/article/pii/S1878875025001093
Impact factor: 1.9 (JCR Q3, Clinical Neurology)
Citation count: 0
Abstract
Objective
Artificial intelligence (AI) continues to advance in healthcare, offering innovative approaches to enhance clinical decision-making and patient management. Peripheral nerve surgery poses unique challenges due to the complexity of cases and the need for precise diagnostic and therapeutic strategies. This study investigates the application of OpenAI's generative AI model, o1, in assisting with intricate decision-making processes in peripheral nerve surgery.
Methods
Using advanced prompt engineering techniques, o1 was configured as a virtual medical assistant (Generative Pretrained Transformer–Nerve Surgery [GPT-NS]) to process 5 simulated clinical scenarios modeled after real-world cases. The AI guided surgeons through medical history, diagnostics, and treatment planning, culminating in case summaries. A panel of nerve surgery specialists and residents evaluated the AI's performance using a Likert scale across 7 criteria.
Results
GPT-NS demonstrated strong capabilities, achieving an average score of 4.3. High ratings were observed for understanding clinical issues and case presentation clarity. However, areas for improvement were noted in diagnostic sequencing and treatment recommendations. One criterion scored lower, indicating that the human evaluators perceived themselves as superior to the AI in handling the cases; even so, GPT-NS showed promise as a supportive tool in clinical practice.
Conclusions
As the performance of large language model AI continues to improve, it is becoming increasingly important that recognized experts assess the accuracy of its answers to ensure reliable and clinically sound integration into healthcare practice. This study underscores the potential of large language model AI to augment clinical decision-making in highly specialized fields like peripheral nerve surgery while demonstrating the ongoing importance of human expertise. Future research should explore ways to further refine AI capabilities and assess their integration into routine surgical workflows.
About the journal:
World Neurosurgery has an open access mirror journal World Neurosurgery: X, sharing the same aims and scope, editorial team, submission system and rigorous peer review.
The journal's mission is to:
- Provide a first-class international forum and a 2-way conduit for dialogue that is relevant to neurosurgeons and providers who care for neurosurgery patients. The categories of exchanged information include clinical and basic science, as well as global information that provides social, political, educational, economic, cultural or societal insights and knowledge of significance and relevance to worldwide neurosurgery patient care.
- Act as a primary intellectual catalyst for the stimulation of creativity, the creation of new knowledge, and the enhancement of quality neurosurgical care worldwide.
- Provide a forum for communication that enriches the lives of all neurosurgeons and their colleagues and, in so doing, enriches the lives of their patients.
Topics to be addressed in World Neurosurgery include: EDUCATION, ECONOMICS, RESEARCH, POLITICS, HISTORY, CULTURE, CLINICAL SCIENCE, LABORATORY SCIENCE, TECHNOLOGY, OPERATIVE TECHNIQUES, CLINICAL IMAGES, VIDEOS