Associations Between Radiation Oncologist Demographic Factors and Segmentation Similarity Benchmarks: Insights From a Crowd-Sourced Challenge Using Bayesian Estimation.

JCO Clinical Cancer Informatics (IF 2.8; JCR Q2, Oncology) | Pub Date: 2024-06-01 | DOI: 10.1200/CCI.23.00174

Kareem A Wahid, Onur Sahin, Suprateek Kundu, Diana Lin, Anthony Alanis, Salik Tehami, Serageldin Kamel, Simon Duke, Michael V Sherer, Mathis Rasmussen, Stine Korreman, David Fuentes, Michael Cislo, Benjamin E Nelms, John P Christodouleas, James D Murphy, Abdallah S R Mohamed, Renjie He, Mohammed A Naser, Erin F Gillespie, Clifton D Fuller

JCO Clinical Cancer Informatics, volume 8, e2300174. Publication type: Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11214868/pdf/

Citations: 0

Abstract

Purpose: The quality of radiotherapy auto-segmentation training data, primarily derived from clinician observers, is of utmost importance. However, the factors influencing the quality of clinician-derived segmentations are poorly understood; our study aims to quantify these factors.

Methods: Organ at risk (OAR) and tumor-related segmentations provided by radiation oncologists from the Contouring Collaborative for Consensus in Radiation Oncology data set were used. Segmentations were derived from five disease sites: breast, sarcoma, head and neck (H&N), gynecologic (GYN), and GI. Segmentation quality was determined on a structure-by-structure basis by comparing the observer segmentations with an expert-derived consensus, which served as a reference standard benchmark. The Dice similarity coefficient (DSC) was primarily used as a metric for the comparisons. DSC was stratified into binary groups on the basis of structure-specific expert-derived interobserver variability (IOV) cutoffs. Generalized linear mixed-effects models using Bayesian estimation were used to investigate the association between demographic variables and the binarized DSC for each disease site. Variables with a highest density interval excluding zero were considered to substantially affect the outcome measure.
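The scoring steps described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The toy masks, the cutoff of 0.75, the "at or above the cutoff" rule, and the 95% interval mass are all assumptions made for illustration, since the abstract does not specify them. The sketch computes the Dice similarity coefficient between an observer mask and a reference mask, binarizes it against a structure-specific cutoff, and checks whether a highest density interval computed from posterior samples excludes zero.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two boolean masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def binarize(dsc, cutoff):
    """Return 1 if the DSC is at or above the cutoff, else 0 (threshold rule assumed)."""
    return int(dsc >= cutoff)


def hdi(samples, mass=0.95):
    """Narrowest interval that contains `mass` of the samples (empirical HDI)."""
    s = np.sort(np.asarray(samples, dtype=float))
    n = s.size
    k = int(np.ceil(mass * n))           # number of samples inside the interval
    widths = s[k - 1:] - s[: n - k + 1]  # width of every window of k sorted samples
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]


def excludes_zero(interval):
    """True if the interval lies entirely above or entirely below zero."""
    low, high = interval
    return low > 0 or high < 0


# Toy 2D example: the reference covers 4 pixels, the observer covers 6, and 4 overlap.
reference = np.zeros((4, 4), dtype=bool)
reference[1:3, 1:3] = True
observer = np.zeros((4, 4), dtype=bool)
observer[1:3, 1:4] = True

dsc = dice(observer, reference)        # 2 * 4 / (4 + 6) = 0.8
label = binarize(dsc, cutoff=0.75)     # hypothetical structure-specific cutoff
print(dsc, label)
```

In the study the binarized outcome enters a Bayesian generalized linear mixed-effects model; here `hdi` would be applied to posterior draws of a regression coefficient obtained from whatever sampler is used, which this sketch leaves out.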

Results: Five hundred seventy-four, 110, 452, 112, and 48 segmentations were used for the breast, sarcoma, H&N, GYN, and GI cases, respectively. The median percentage of segmentations that crossed the expert DSC IOV cutoff when stratified by structure type was 55% and 31% for OARs and tumors, respectively. Regression analysis revealed that the structure being tumor-related had a substantial negative impact on binarized DSC for the breast, sarcoma, H&N, and GI cases. There were no recurring relationships between segmentation quality and demographic variables across the cases, with most variables demonstrating large standard deviations.

Conclusion: Our study highlights substantial uncertainty surrounding conventionally presumed factors influencing segmentation quality relative to benchmarks.

