Guidance on selecting and evaluating AI auto-segmentation systems in clinical radiotherapy: insights from a six-vendor analysis

Branimir Rusanov, Martin A Ebert, Mahsheed Sabet, Pejman Rowshanfarzad, Nathaniel Barry, Jake Kendrick, Zaid Alkhatib, Suki Gill, Joshua Dass, Nicholas Bucknell, Jeremy Croker, Colin Tang, Rohen White, Sean Bydder, Mandy Taylor, Luke Slama, Godfrey Mukwada

Physical and Engineering Sciences in Medicine, published online 2025-01-13. DOI: 10.1007/s13246-024-01513-x (https://doi.org/10.1007/s13246-024-01513-x)
Abstract
Artificial Intelligence (AI) based auto-segmentation has demonstrated numerous benefits to clinical radiotherapy workflows. However, the rapidly changing regulatory, research, and market environment presents challenges around selecting and evaluating the most suitable solution. To support the clinical adoption of AI auto-segmentation systems, Selection Criteria recommendations were developed to enable a holistic evaluation of vendors, considering not only raw performance but also the risks uniquely associated with the clinical deployment of AI. In-house experience and key bodies of work on ethics, standards, and best practices for AI in Radiation Oncology were reviewed to inform the selection criteria and evaluation strategies. A retrospective analysis using the criteria was performed across six vendors, including a quantitative assessment of their AI models as of March 2023 using five metrics (Dice, Hausdorff Distance, Average Surface Distance, Surface Dice, Added Path Length) across 20 head and neck, 20 thoracic, and 19 male pelvis patients. A total of 47 selection criteria were identified across seven categories. The retrospective analysis showed that, overall, no vendor performed exceptionally well, with systematically poor performance in Data Security & Responsibility, Vendor Support Tools, and Transparency & Ethics. In terms of raw performance, vendors varied widely from excellent to poor. As new regulations come into force and the scope of AI auto-segmentation systems adapts to clinical needs, continued interest in ensuring safe, fair, and transparent AI will persist. The selection and evaluation framework provided herein aims to promote user confidence by exploring the breadth of clinically relevant factors to support informed decision-making.
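The quantitative assessment compares vendor AI contours against reference contours using overlap and surface-distance metrics. As a rough illustration of two of the five metrics named above, the sketch below computes the Dice coefficient and a symmetric Hausdorff distance for binary masks. It is a minimal sketch, not the evaluation code used in the study: it assumes NumPy/SciPy are available, that voxel spacing is supplied by the caller, and it uses toy masks in place of real DICOM-RT structure sets.

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.spatial.distance import directed_hausdorff


def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice similarity coefficient between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    denom = a.sum() + b.sum()
    # Convention: two empty masks are treated as perfect agreement.
    return 1.0 if denom == 0 else 2.0 * np.logical_and(a, b).sum() / denom


def surface_points(mask: np.ndarray, spacing) -> np.ndarray:
    """Physical coordinates (mm) of the boundary voxels of a binary mask."""
    mask = mask.astype(bool)
    boundary = mask & ~binary_erosion(mask)
    return np.argwhere(boundary) * np.asarray(spacing, dtype=float)


def hausdorff(a: np.ndarray, b: np.ndarray, spacing=(1.0, 1.0, 1.0)) -> float:
    """Symmetric Hausdorff distance (mm) between the surfaces of two masks."""
    pa, pb = surface_points(a, spacing), surface_points(b, spacing)
    return max(directed_hausdorff(pa, pb)[0], directed_hausdorff(pb, pa)[0])


# Toy example: reference vs. AI mask shifted by one voxel (1 mm isotropic spacing assumed).
gt = np.zeros((32, 32, 32), dtype=bool)
ai = np.zeros_like(gt)
gt[10:20, 10:20, 10:20] = True
ai[11:21, 10:20, 10:20] = True
print(f"Dice = {dice(gt, ai):.3f}, HD = {hausdorff(gt, ai):.1f} mm")
```

In practice, published evaluations typically also report percentile Hausdorff distances, Average Surface Distance, Surface Dice at a clinically motivated tolerance, and Added Path Length, all computed on anisotropic voxel grids taken from the planning CT; the sketch above only conveys the general form of such mask-versus-mask comparisons.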