Deep learning with uncertainty estimation for automatic tumor segmentation in PET/CT of head and neck cancers: impact of model complexity, image processing and augmentation.

IF 1.3 Q3 RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING Biomedical Physics & Engineering Express Pub Date : 2024-08-30 DOI:10.1088/2057-1976/ad6dcd

Bao Ngoc Huynh, Aurora Rosvoll Groendahl, Oliver Tomic, Kristian Hovde Liland, Ingerid Skjei Knudtsen, Frank Hoebers, Wouter van Elmpt, Einar Dale, Eirik Malinen, Cecilia Marie Futsaether

{"title":"Deep learning with uncertainty estimation for automatic tumor segmentation in PET/CT of head and neck cancers: impact of model complexity, image processing and augmentation.","authors":"Bao Ngoc Huynh, Aurora Rosvoll Groendahl, Oliver Tomic, Kristian Hovde Liland, Ingerid Skjei Knudtsen, Frank Hoebers, Wouter van Elmpt, Einar Dale, Eirik Malinen, Cecilia Marie Futsaether","doi":"10.1088/2057-1976/ad6dcd","DOIUrl":null,"url":null,"abstract":"Objective.Target volumes for radiotherapy are usually contoured manually, which can be time-consuming and prone to inter- and intra-observer variability. Automatic contouring by convolutional neural networks (CNN) can be fast and consistent but may produce unrealistic contours or miss relevant structures. We evaluate approaches for increasing the quality and assessing the uncertainty of CNN-generated contours of head and neck cancers with PET/CT as input.Approach.Two patient cohorts with head and neck squamous cell carcinoma and baseline18F-fluorodeoxyglucose positron emission tomography and computed tomography images (FDG-PET/CT) were collected retrospectively from two centers. The union of manual contours of the gross primary tumor and involved nodes was used to train CNN models for generating automatic contours. The impact of image preprocessing, image augmentation, transfer learning and CNN complexity, architecture, and dimension (2D or 3D) on model performance and generalizability across centers was evaluated. A Monte Carlo dropout technique was used to quantify and visualize the uncertainty of the automatic contours.Main results. CNN models provided contours with good overlap with the manually contoured ground truth (median Dice Similarity Coefficient: 0.75-0.77), consistent with reported inter-observer variations and previous auto-contouring studies. Image augmentation and model dimension, rather than model complexity, architecture, or advanced image preprocessing, had the largest impact on model performance and cross-center generalizability. Transfer learning on a limited number of patients from a separate center increased model generalizability without decreasing model performance on the original training cohort. High model uncertainty was associated with false positive and false negative voxels as well as low Dice coefficients.Significance.High quality automatic contours can be obtained using deep learning architectures that are not overly complex. Uncertainty estimation of the predicted contours shows potential for highlighting regions of the contour requiring manual revision or flagging segmentations requiring manual inspection and intervention.","PeriodicalId":8896,"journal":{"name":"Biomedical Physics & Engineering Express","volume":" ","pages":""},"PeriodicalIF":1.3000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biomedical Physics & Engineering Express","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1088/2057-1976/ad6dcd","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING","Score":null,"Total":0}

引用次数: 0

Abstract

Objective.Target volumes for radiotherapy are usually contoured manually, which can be time-consuming and prone to inter- and intra-observer variability. Automatic contouring by convolutional neural networks (CNN) can be fast and consistent but may produce unrealistic contours or miss relevant structures. We evaluate approaches for increasing the quality and assessing the uncertainty of CNN-generated contours of head and neck cancers with PET/CT as input.Approach.Two patient cohorts with head and neck squamous cell carcinoma and baseline¹⁸F-fluorodeoxyglucose positron emission tomography and computed tomography images (FDG-PET/CT) were collected retrospectively from two centers. The union of manual contours of the gross primary tumor and involved nodes was used to train CNN models for generating automatic contours. The impact of image preprocessing, image augmentation, transfer learning and CNN complexity, architecture, and dimension (2D or 3D) on model performance and generalizability across centers was evaluated. A Monte Carlo dropout technique was used to quantify and visualize the uncertainty of the automatic contours.Main results. CNN models provided contours with good overlap with the manually contoured ground truth (median Dice Similarity Coefficient: 0.75-0.77), consistent with reported inter-observer variations and previous auto-contouring studies. Image augmentation and model dimension, rather than model complexity, architecture, or advanced image preprocessing, had the largest impact on model performance and cross-center generalizability. Transfer learning on a limited number of patients from a separate center increased model generalizability without decreasing model performance on the original training cohort. High model uncertainty was associated with false positive and false negative voxels as well as low Dice coefficients.Significance.High quality automatic contours can be obtained using deep learning architectures that are not overly complex. Uncertainty estimation of the predicted contours shows potential for highlighting regions of the contour requiring manual revision or flagging segmentations requiring manual inspection and intervention.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

深度学习与不确定性估计用于头颈部癌症 PET/CT 中的自动肿瘤分割：模型复杂性、图像处理和增强的影响。

目的：放射治疗的靶体积通常由人工绘制，这不仅耗时，而且容易造成观察者之间和观察者内部的差异。使用卷积神经网络（CNN）进行自动轮廓绘制既快速又一致，但可能会产生不切实际的轮廓或遗漏相关结构。我们以 PET/CT 为输入，评估了提高 CNN 生成的头颈部癌症轮廓质量和评估其不确定性的方法。从两个中心回顾性地收集了两组头颈部鳞状细胞癌患者和基线 18F- 氟脱氧葡萄糖正电子发射断层扫描和计算机断层扫描图像（FDG-PET/CT）。原发肿瘤和受累结节的人工轮廓联合用于训练 CNN 模型，以生成自动轮廓。评估了图像预处理、图像增强、迁移学习和 CNN 复杂性、架构和维度（二维或三维）对模型性能和跨中心通用性的影响。蒙特卡洛放弃技术用于量化和可视化自动轮廓的不确定性。CNN 模型提供的轮廓与人工绘制的地面真实轮廓有很好的重合度（中位数 Dice 相似系数：0.75 - 0.77），与报告的观察者之间的差异和之前的自动轮廓绘制研究一致。对模型性能和跨中心通用性影响最大的是图像增强和模型维度，而不是模型复杂性、结构或高级图像预处理。对来自另一个中心的有限数量的患者进行迁移学习可提高模型的可推广性，而不会降低模型在原始训练队列中的性能。高模型不确定性与假阳性和假阴性体素以及低 Dice 系数有关。利用不过分复杂的深度学习架构可以获得高质量的自动轮廓。对预测轮廓的不确定性估计表明，有可能突出需要人工修改的轮廓区域，或标记需要人工检查和干预的分段。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

Biomedical Physics & Engineering Express RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING-

CiteScore

2.80

自引率

0.00%

发文量

153

期刊介绍： BPEX is an inclusive, international, multidisciplinary journal devoted to publishing new research on any application of physics and/or engineering in medicine and/or biology. Characterized by a broad geographical coverage and a fast-track peer-review process, relevant topics include all aspects of biophysics, medical physics and biomedical engineering. Papers that are almost entirely clinical or biological in their focus are not suitable. The journal has an emphasis on publishing interdisciplinary work and bringing research fields together, encompassing experimental, theoretical and computational work.