Sherif Ramadan, Adam Mutsaers, Po-Hsuan Cameron Chen, Glenn Bauman, Vikram Velker, Belal Ahmad, Andrew J Arifin, Timothy K Nguyen, David Palma, Christopher D Goodman
{"title":"评估 ChatGPT 在放射肿瘤学方面的能力:跨临床场景的综合评估。","authors":"Sherif Ramadan, Adam Mutsaers, Po-Hsuan Cameron Chen, Glenn Bauman, Vikram Velker, Belal Ahmad, Andrew J Arifin, Timothy K Nguyen, David Palma, Christopher D Goodman","doi":"10.1016/j.radonc.2024.110645","DOIUrl":null,"url":null,"abstract":"<p><strong>Purpose: </strong>Artificial intelligence (AI) and machine learning present an opportunity to enhance clinical decision-making in radiation oncology. This study aims to evaluate the competency of ChatGPT, an AI language model, in interpreting clinical scenarios and assessing its oncology knowledge.</p><p><strong>Methods and materials: </strong>A series of clinical cases were designed covering 12 disease sites. Questions were grouped into domains: epidemiology, staging and workup, clinical management, treatment planning, cancer biology, physics, and surveillance. Royal College-certified radiation oncologists (ROs) reviewed cases and provided solutions. ROs scored responses on 3 criteria: conciseness (focused answers), completeness (addressing all aspects of the question), and correctness (answer aligns with expert opinion) using a standardized rubric. Scores ranged from 0 to 5 for each criterion for a total possible score of 15.</p><p><strong>Results: </strong>Across 12 cases, 182 questions were answered with a total AI score of 2317/2730 (84 %). Scores by criteria were: completeness (79 %, range: 70-99 %), conciseness (92 %, range: 83-99 %), and correctness (81 %, range: 72-92 %). AI performed best in the domains of epidemiology (93 %) and cancer biology (93 %) and reasonably in staging and workup (89 %), physics (86 %) and surveillance (82 %). Weaker domains included treatment planning (78 %) and clinical management (81 %). Statistical differences were driven by variations in the completeness (p < 0.01) and correctness (p = 0.04) criteria, whereas conciseness scored universally high (p = 0.91). These trends were consistent across disease sites.</p><p><strong>Conclusions: </strong>ChatGPT showed potential as a tool in radiation oncology, demonstrating a high degree of accuracy in several oncologic domains. However, this study highlights limitations with incorrect and incomplete answers in complex cases.</p>","PeriodicalId":21041,"journal":{"name":"Radiotherapy and Oncology","volume":" ","pages":"110645"},"PeriodicalIF":4.9000,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Evaluating ChatGPT's competency in radiation oncology: A comprehensive assessment across clinical scenarios.\",\"authors\":\"Sherif Ramadan, Adam Mutsaers, Po-Hsuan Cameron Chen, Glenn Bauman, Vikram Velker, Belal Ahmad, Andrew J Arifin, Timothy K Nguyen, David Palma, Christopher D Goodman\",\"doi\":\"10.1016/j.radonc.2024.110645\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><strong>Purpose: </strong>Artificial intelligence (AI) and machine learning present an opportunity to enhance clinical decision-making in radiation oncology. This study aims to evaluate the competency of ChatGPT, an AI language model, in interpreting clinical scenarios and assessing its oncology knowledge.</p><p><strong>Methods and materials: </strong>A series of clinical cases were designed covering 12 disease sites. Questions were grouped into domains: epidemiology, staging and workup, clinical management, treatment planning, cancer biology, physics, and surveillance. 
Royal College-certified radiation oncologists (ROs) reviewed cases and provided solutions. ROs scored responses on 3 criteria: conciseness (focused answers), completeness (addressing all aspects of the question), and correctness (answer aligns with expert opinion) using a standardized rubric. Scores ranged from 0 to 5 for each criterion for a total possible score of 15.</p><p><strong>Results: </strong>Across 12 cases, 182 questions were answered with a total AI score of 2317/2730 (84 %). Scores by criteria were: completeness (79 %, range: 70-99 %), conciseness (92 %, range: 83-99 %), and correctness (81 %, range: 72-92 %). AI performed best in the domains of epidemiology (93 %) and cancer biology (93 %) and reasonably in staging and workup (89 %), physics (86 %) and surveillance (82 %). Weaker domains included treatment planning (78 %) and clinical management (81 %). Statistical differences were driven by variations in the completeness (p < 0.01) and correctness (p = 0.04) criteria, whereas conciseness scored universally high (p = 0.91). These trends were consistent across disease sites.</p><p><strong>Conclusions: </strong>ChatGPT showed potential as a tool in radiation oncology, demonstrating a high degree of accuracy in several oncologic domains. However, this study highlights limitations with incorrect and incomplete answers in complex cases.</p>\",\"PeriodicalId\":21041,\"journal\":{\"name\":\"Radiotherapy and Oncology\",\"volume\":\" \",\"pages\":\"110645\"},\"PeriodicalIF\":4.9000,\"publicationDate\":\"2024-11-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Radiotherapy and Oncology\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1016/j.radonc.2024.110645\",\"RegionNum\":1,\"RegionCategory\":\"医学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ONCOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiotherapy and Oncology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1016/j.radonc.2024.110645","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ONCOLOGY","Score":null,"Total":0}
Evaluating ChatGPT's competency in radiation oncology: A comprehensive assessment across clinical scenarios.
Purpose: Artificial intelligence (AI) and machine learning present an opportunity to enhance clinical decision-making in radiation oncology. This study aims to evaluate the competency of ChatGPT, an AI language model, in interpreting clinical scenarios and to assess its oncology knowledge.
Methods and materials: A series of clinical cases was designed covering 12 disease sites. Questions were grouped into domains: epidemiology, staging and workup, clinical management, treatment planning, cancer biology, physics, and surveillance. Royal College-certified radiation oncologists (ROs) reviewed the cases and provided reference solutions. Using a standardized rubric, ROs scored each response on three criteria: conciseness (focused answers), completeness (addressing all aspects of the question), and correctness (alignment with expert opinion). Each criterion was scored from 0 to 5, for a total possible score of 15 per question.
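To make the rubric arithmetic concrete, the sketch below reproduces the maximum-score calculation implied by the abstract (three criteria, 0-5 points each, 182 questions). It is an illustrative reconstruction, not the study's own code, and all names in it are hypothetical.

```python
# Illustrative reconstruction of the scoring arithmetic described in the
# abstract; the study's actual scoring code is not published, and these
# names (CRITERIA, percent, ...) are hypothetical.

CRITERIA = ("conciseness", "completeness", "correctness")
MAX_PER_CRITERION = 5                                 # each criterion scored 0-5
MAX_PER_QUESTION = MAX_PER_CRITERION * len(CRITERIA)  # 3 x 5 = 15

def percent(score: int, maximum: int) -> float:
    """Express a raw score as a percentage of its maximum."""
    return 100.0 * score / maximum

n_questions = 182
total_possible = n_questions * MAX_PER_QUESTION  # 182 x 15 = 2730
total_awarded = 2317                             # aggregate AI score reported

print(f"Overall AI score: {percent(total_awarded, total_possible):.1f} %")
# -> Overall AI score: 84.9 % (reported as 84 % in the abstract)
```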
Results: Across the 12 cases, 182 questions were answered, for a total AI score of 2317/2730 (84 %). Scores by criterion were: completeness 79 % (range 70-99 %), conciseness 92 % (range 83-99 %), and correctness 81 % (range 72-92 %). The AI performed best in epidemiology (93 %) and cancer biology (93 %), and reasonably well in staging and workup (89 %), physics (86 %), and surveillance (82 %). Weaker domains included treatment planning (78 %) and clinical management (81 %). Statistical differences across domains were driven by variation in the completeness (p < 0.01) and correctness (p = 0.04) criteria, whereas conciseness scored uniformly high (p = 0.91). These trends were consistent across disease sites.
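The abstract reports between-domain p-values but does not name the statistical test used. Purely as an illustration, the sketch below applies a Kruskal-Wallis H-test, one common non-parametric choice for comparing ordinal rubric scores across groups; neither the test choice nor the numbers below come from the study.

```python
# Hypothetical illustration only: the abstract does not name its statistical
# test. A Kruskal-Wallis H-test is one common non-parametric option for
# comparing ordinal rubric scores (0-5) across domains. The per-domain score
# lists below are made-up placeholders, not study data.
from scipy.stats import kruskal

completeness_by_domain = {
    "epidemiology":        [5, 5, 4, 5, 5, 4],
    "treatment_planning":  [3, 4, 3, 4, 5, 3],
    "clinical_management": [4, 4, 3, 5, 4, 4],
}

# kruskal takes one sample per group and returns (statistic, p-value).
statistic, p_value = kruskal(*completeness_by_domain.values())
print(f"H = {statistic:.2f}, p = {p_value:.3f}")
```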
Conclusions: ChatGPT showed potential as a tool in radiation oncology, demonstrating a high degree of accuracy across several oncologic domains. However, this study also highlights its limitations: incorrect and incomplete answers in complex cases.
Journal description:
Radiotherapy and Oncology publishes papers describing original research as well as review articles. It covers areas of interest relating to radiation oncology, including: clinical radiotherapy, combined-modality treatment, translational studies, epidemiological outcomes, imaging, dosimetry and radiation therapy planning, experimental work in radiobiology, chemobiology, hyperthermia and tumour biology, as well as data science in radiation oncology and physics aspects relevant to oncology. Papers on more general aspects of interest to the radiation oncologist, including chemotherapy, surgery and immunology, are also published.