Anna Stroop, Tabea Stroop, Samer Zawy Alsofy, Makoto Nakamura, Frank Möllmann, Christoph Greiner, Ralf Stroop
{"title":"Large language models: Are artificial intelligence-based chatbots a reliable source of patient information for spinal surgery?","authors":"Anna Stroop, Tabea Stroop, Samer Zawy Alsofy, Makoto Nakamura, Frank Möllmann, Christoph Greiner, Ralf Stroop","doi":"10.1007/s00586-023-07975-z","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Large language models (LLM) have recently attracted attention because of their enormous performance. Based on artificial intelligence, LLM enable dialogic communication using quasi-natural language that approximates the quality of human communication. Thus, LLM could play an important role for patients to become informed. To evaluate the validity of an LLM in providing medical information, we used one of the first high-performance LLM (ChatGPT) on the clinical example of acute lumbar disc herniation (LDH).</p><p><strong>Methods: </strong>Twenty-four spinal surgeons experienced in LDH surgery directed questions to ChatGPT about the clinical picture of LDH from a patient's perspective. They evaluated the quality of ChatGPT responses and its potential use in medical communication. The responses were compared with the information content of a standard informed consent form.</p><p><strong>Results: </strong>ChatGPT provided good results in terms of comprehensibility, specificity, and satisfaction of responses and in terms of medical accuracy and completeness. ChatGPT was not able to provide all the information that was provided in the informed consent form, but did communicate information that was not listed there. In some cases, albeit minor, ChatGPT made medically inaccurate claims, such as listing kyphoplasty and vertebroplasty as surgical options for LDH.</p><p><strong>Conclusion: </strong>With the incipient use of artificial intelligence in communication, LLM will certainly become increasingly important to patients. Even if LLM are unlikely to play a role in clinical communication between physicians and patients at the moment, the opportunities-but also the risks-of this novel technology should be alertly monitored.</p>","PeriodicalId":12323,"journal":{"name":"European Spine Journal","volume":" ","pages":"4135-4143"},"PeriodicalIF":2.6000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Spine Journal","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s00586-023-07975-z","RegionNum":3,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2023/10/11 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"CLINICAL NEUROLOGY","Score":null,"Total":0}
引用次数: 0
Abstract
Purpose: Large language models (LLM) have recently attracted attention because of their enormous performance. Based on artificial intelligence, LLM enable dialogic communication using quasi-natural language that approximates the quality of human communication. Thus, LLM could play an important role for patients to become informed. To evaluate the validity of an LLM in providing medical information, we used one of the first high-performance LLM (ChatGPT) on the clinical example of acute lumbar disc herniation (LDH).
Methods: Twenty-four spinal surgeons experienced in LDH surgery directed questions to ChatGPT about the clinical picture of LDH from a patient's perspective. They evaluated the quality of ChatGPT responses and its potential use in medical communication. The responses were compared with the information content of a standard informed consent form.
Results: ChatGPT provided good results in terms of comprehensibility, specificity, and satisfaction of responses and in terms of medical accuracy and completeness. ChatGPT was not able to provide all the information that was provided in the informed consent form, but did communicate information that was not listed there. In some cases, albeit minor, ChatGPT made medically inaccurate claims, such as listing kyphoplasty and vertebroplasty as surgical options for LDH.
Conclusion: With the incipient use of artificial intelligence in communication, LLM will certainly become increasingly important to patients. Even if LLM are unlikely to play a role in clinical communication between physicians and patients at the moment, the opportunities-but also the risks-of this novel technology should be alertly monitored.
期刊介绍:
"European Spine Journal" is a publication founded in response to the increasing trend toward specialization in spinal surgery and spinal pathology in general. The Journal is devoted to all spine related disciplines, including functional and surgical anatomy of the spine, biomechanics and pathophysiology, diagnostic procedures, and neurology, surgery and outcomes. The aim of "European Spine Journal" is to support the further development of highly innovative spine treatments including but not restricted to surgery and to provide an integrated and balanced view of diagnostic, research and treatment procedures as well as outcomes that will enhance effective collaboration among specialists worldwide. The “European Spine Journal” also participates in education by means of videos, interactive meetings and the endorsement of educative efforts.
Official publication of EUROSPINE, The Spine Society of Europe