Development and validation of fully automated robust deep learning models for multi-organ segmentation from whole-body CT images

Physica Medica-European Journal of Medical Physics | IF 2.7 | Region 3 (Medicine) | JCR Q1: Radiology, Nuclear Medicine & Medical Imaging | Pub Date: 2025-02-01 | DOI: 10.1016/j.ejmp.2025.104911

Yazdan Salimi, Isaac Shiri, Zahra Mansouri, Habib Zaidi

Physica Medica-European Journal of Medical Physics, volume 130, Article 104911. Journal Article. https://www.sciencedirect.com/science/article/pii/S1120179725000213

Citations: 0

Abstract

Purpose

This study aimed to develop a deep-learning framework to generate multi-organ masks from CT images in adult and pediatric patients.

Methods

A dataset of 4082 CT images with ground-truth manual segmentations, including 300 pediatric cases, was collected from various databases. In strategy #1, the manual segmentation masks provided by public databases were split into training (90%) and testing (10% of each database, named subset #1) cohorts. The training set was used to train multiple nnU-Net networks in five-fold cross-validation (CV) for 26 separate organs. In the next step, the trained models from strategy #1 were used to generate the missing organ masks for the entire dataset. These generated data were then used to train a multi-organ nnU-Net segmentation model in five-fold CV (strategy #2). Model performance was evaluated in terms of the Dice coefficient (DSC) and other well-established image segmentation metrics.
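The Dice coefficient used for evaluation can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `dice_coefficient` and `per_organ_dice`, the integer-label encoding of organs, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration only.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice similarity coefficient, 2|A ∩ B| / (|A| + |B|), for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumption: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def per_organ_dice(pred_labels, truth_labels, organ_ids):
    """Compute the DSC separately for each organ label in a multi-organ label map."""
    pred_labels = np.asarray(pred_labels)
    truth_labels = np.asarray(truth_labels)
    return {
        organ: dice_coefficient(pred_labels == organ, truth_labels == organ)
        for organ in organ_ids
    }
```

For a multi-organ label map, `per_organ_dice(pred, truth, organ_ids=[1, 2])` would return one DSC per organ label, which could then be summarized across cases as a mean and standard deviation.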

Results

The lowest CV DSC for strategy #1 was 0.804 ± 0.094 for the adrenal glands, while an average DSC > 0.90 was achieved for 17/26 organs. The lowest DSC for strategy #2 (0.833 ± 0.177) was obtained for the pancreas, whereas DSC > 0.90 was achieved for 13/19 organs. For all shared organs included in subset #1 and subset #2, our model outperformed the TotalSegmentator models in both strategies. In addition, our models outperformed the TotalSegmentator models on subset #3.

Conclusions

Our model was trained on images with significant variability from different databases and produced acceptable results on both pediatric and adult cases, making it well suited for implementation in clinical settings.


Source journal

Physica Medica-European Journal of Medical Physics (Biology: Biophysics)

CiteScore: 6.80
Self-citation rate: 14.70%
Publication volume: 493
Review time: 78 days

Journal description: Physica Medica, European Journal of Medical Physics, published with Elsevier since 2007, provides an international forum for research and reviews on the following main topics: medical imaging; radiation therapy; radiation protection; measuring systems and signal processing; education and training in medical physics; and professional issues in medical physics.