Artificial Intelligence in Diagnosing and Managing Vascular Surgery Patients: An Experimental Study Using the GPT-4 Model
Vangelis G Alexiou, Bauer E Sumpio, Areti Vassiliou, Stavros K Kakkos, George Geroulakos
Annals of Vascular Surgery. Published 2024-11-23. DOI: 10.1016/j.avsg.2024.11.014
Abstract
Objective: The introduction of artificial intelligence (AI) has led to groundbreaking advancements across many scientific fields. Machine learning algorithms have enabled AI models to learn, adapt, and solve complex problems in previously unimaginable ways. Natural language processing (NLP) allows these models to comprehend and respond to inquiries in a natural and humanly understandable way. We sought to investigate the application and performance of an AI chatbot in the diagnosis and management of vascular surgery patients.
Design: An experimental study evaluating the performance of the GPT-4 AI model across 57 clinical scenarios derived from a vascular surgery textbook.
Methods: Specific prompts were devised to task the AI model with identifying symptoms, diagnosing conditions, and selecting appropriate therapeutic approaches. Answers were scored, descriptive statistics were produced, and means were compared across topics. The reasoning and evidence used in the cases in which the AI performed poorly were critically reviewed.
Results: The AI model correctly answered over 65% of the 385 questions. Performance did not differ significantly between or within the 13 vascular surgery topics. Analysis of the questions where the model failed by more than 50% suggests a gap in the ability to interpret and process multifaceted medical information. Of these errors, 27% were attributed to a potential lack of understanding of complex clinical scenarios. The AI model also quoted incorrect or outdated information in 14% of cases and showed an inability to comprehend context, nuances, and medical classification systems in 11% of cases.
Conclusion: GPT-4 demonstrated potential to provide clinically relevant answers for most of the tested scenarios. However, its reasoning must still be carefully analyzed for exactitude and clinical validity. While language models show promise as valuable tools for clinicians, it is essential to recognize their role as supportive mechanisms rather than standalone solutions.
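The Methods state that answers were scored and that means were compared across topics, but the abstract does not name the statistical test used. As a rough illustration only, the sketch below assumes a one-way ANOVA and computes per-topic means and the F statistic with the Python standard library. The topic names and scores are hypothetical placeholders, not data from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a dict of topic -> list of scores."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of topics
    n = len(all_scores)    # total number of scored answers
    # Variation of the topic means around the grand mean
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values())
    # Variation of individual scores around their own topic mean
    ss_within = sum((s - mean(v)) ** 2 for v in groups.values() for s in v)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical per-question scores (1 = correct, 0 = incorrect); NOT study data.
scores_by_topic = {
    "topic_a": [1, 1, 0, 1],
    "topic_b": [1, 0, 0, 1],
    "topic_c": [1, 1, 1, 0],
}

topic_means = {topic: mean(scores) for topic, scores in scores_by_topic.items()}
f_stat = one_way_anova_f(scores_by_topic)
print(topic_means)
print(f_stat)
```

A small F statistic relative to the critical value for (k - 1, n - k) degrees of freedom would be consistent with the reported absence of statistically significant differences between topics. The study's actual scoring scheme and test may differ from this assumption.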
About the journal:
Annals of Vascular Surgery, published eight times a year, invites original manuscripts reporting clinical and experimental work in vascular surgery for peer review. Articles may be submitted for the following sections of the journal:
Clinical Research (reports of clinical series, new drug or medical device trials)
Basic Science Research (new investigations, experimental work)
Case Reports (reports on a limited series of patients)
General Reviews (scholarly review of the existing literature on a relevant topic)
Developments in Endovascular and Endoscopic Surgery
Selected Techniques (technical maneuvers)
Historical Notes (interesting vignettes from the early days of vascular surgery)
Editorials/Correspondence