Guidance on selecting and evaluating AI auto-segmentation systems in clinical radiotherapy: insights from a six-vendor analysis

Branimir Rusanov, Martin A Ebert, Mahsheed Sabet, Pejman Rowshanfarzad, Nathaniel Barry, Jake Kendrick, Zaid Alkhatib, Suki Gill, Joshua Dass, Nicholas Bucknell, Jeremy Croker, Colin Tang, Rohen White, Sean Bydder, Mandy Taylor, Luke Slama, Godfrey Mukwada

Physical and Engineering Sciences in Medicine, published online 2025-01-13. DOI: 10.1007/s13246-024-01513-x (https://doi.org/10.1007/s13246-024-01513-x)
Abstract
Artificial Intelligence (AI) based auto-segmentation has demonstrated numerous benefits to clinical radiotherapy workflows. However, the rapidly changing regulatory, research, and market environment presents challenges around selecting and evaluating the most suitable solution. To support the clinical adoption of AI auto-segmentation systems, Selection Criteria recommendations were developed to enable a holistic evaluation of vendors, considering not only raw performance but also the risks uniquely associated with the clinical deployment of AI. In-house experience and key bodies of work on ethics, standards, and best practices for AI in Radiation Oncology were reviewed to inform the selection criteria and evaluation strategies. A retrospective analysis using the criteria was performed across six vendors, including a quantitative assessment of their AI models as of March 2023 using five metrics (Dice, Hausdorff Distance, Average Surface Distance, Surface Dice, Added Path Length) across 20 head and neck, 20 thoracic, and 19 male pelvis patients. A total of 47 selection criteria were identified across seven categories. The retrospective analysis showed that, overall, no vendor performed exceptionally well, with systematically poor performance in Data Security & Responsibility, Vendor Support Tools, and Transparency & Ethics. In terms of raw performance, vendors varied widely from excellent to poor. As new regulations come into force and the scope of AI auto-segmentation systems adapts to clinical needs, continued interest in ensuring safe, fair, and transparent AI will persist. The selection and evaluation framework provided herein aims to promote user confidence by exploring the breadth of clinically relevant factors to support informed decision-making.
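The quantitative assessment compares vendor AI contours against reference contours using overlap and surface-distance metrics. As a rough illustration of two of the five metrics named above, the sketch below computes the Dice coefficient and a symmetric Hausdorff distance for binary masks. It is a minimal sketch, not the evaluation code used in the study: it assumes NumPy/SciPy are available, that voxel spacing is supplied by the caller, and it uses toy masks in place of real DICOM-RT structure sets.

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import directed_hausdorff


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    # Convention: two empty masks are treated as perfect agreement.
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom


def surface_points(mask: np.ndarray, spacing) -> np.ndarray:
    """Physical coordinates (mm) of the boundary voxels of a binary mask."""
    mask = mask.astype(bool)
    boundary = mask & ~binary_erosion(mask)
    return np.argwhere(boundary) * np.asarray(spacing, dtype=float)


def hausdorff(a: np.ndarray, b: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    """Symmetric Hausdorff distance (mm) between the surfaces of two masks."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    return max(directed_hausdorff(pa, pb)[0], directed_hausdorff(pb, pa)[0])


# Toy example: reference vs. AI mask shifted by one voxel (1 mm isotropic spacing assumed).
gt = np.zeros((32, 32, 32), dtype=bool)
ai = np.zeros_like(gt)
gt[10:20, 10:20, 10:20] = True
ai[11:21, 10:20, 10:20] = True
print(f"Dice = {dice(gt, ai):.3f}, HD = {hausdorff(gt, ai):.1f} mm")
```

In practice, published evaluations typically also report percentile Hausdorff distances, Average Surface Distance, Surface Dice at a clinically motivated tolerance, and Added Path Length, all computed on anisotropic voxel grids taken from the planning CT; the sketch above only conveys the general form of such mask-versus-mask comparisons.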