Julius C Holzschuh, Michael Mix, Martin T Freitag, Tobias Hölscher, Anja Braune, Jörg Kotzerke, Alexis Vrachimis, Paul Doolan, Harun Ilhan, Ioana M Marinescu, Simon K B Spohn, Tobias Fechter, Dejan Kuhn, Christian Gratzke, Radu Grosu, Anca-Ligia Grosu, C Zamboglou
Radiation Oncology 19(1):106. Published 2024-08-07. Impact factor: 3.3 (JCR Q2, Oncology). DOI: https://doi.org/10.1186/s13014-024-02491-w. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11304577/pdf/
The impact of multicentric datasets for the automated tumor delineation in primary prostate cancer using convolutional neural networks on 18F-PSMA-1007 PET.
Purpose: Convolutional neural networks (CNNs) have emerged as transformative tools in radiation oncology, significantly advancing the precision of contouring practices. However, the adaptability of these algorithms across diverse scanners, institutions, and imaging protocols remains a considerable obstacle. This study investigates how incorporating institution-specific datasets into CNN training affects generalization in real-world clinical environments. In a data-centric analysis, it compares the influence of multi-center and single-center training approaches on algorithm performance.
Methods: nnU-Net is trained on a dataset of 161 18F-PSMA-1007 PET images collected from four institutions (Freiburg: n = 96, Munich: n = 19, Cyprus: n = 32, Dresden: n = 14). The dataset is partitioned so that data from each center are, in turn, excluded from training and used solely for testing, to assess the model's generalizability and adaptability to data from unfamiliar sources. Performance is compared through 5-fold cross-validation, contrasting models trained on single-center datasets with those trained on aggregated multi-center datasets. Dice Similarity Score (DSC), Hausdorff distance, and volumetric analysis are used as primary evaluation metrics.
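The partitioning and the overlap metric described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. The case identifiers, the function names `dice_score` and `leave_one_center_out`, and the convention that two empty masks score 1.0 are all assumptions made for illustration. Only the cohort sizes come from the abstract.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Cohort sizes as reported in the Methods section.
CENTER_SIZES: Dict[str, int] = {"Freiburg": 96, "Munich": 19, "Cyprus": 32, "Dresden": 14}


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity between two flattened binary masks (1 = tumor voxel)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def leave_one_center_out(
    cases: Sequence[Tuple[str, str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_center, train_ids, test_ids) once per center.

    `cases` holds (case_id, center) pairs; the held-out center
    contributes no cases to the training list.
    """
    centers = sorted({center for _, center in cases})
    for held_out in centers:
        train = [cid for cid, c in cases if c != held_out]
        test = [cid for cid, c in cases if c == held_out]
        yield held_out, train, test


# Hypothetical case identifiers built from the reported cohort sizes.
cases = [(f"{center}_{i:03d}", center) for center, n in CENTER_SIZES.items() for i in range(n)]
splits = {held_out: (train, test) for held_out, train, test in leave_one_center_out(cases)}

for held_out, (train, test) in splits.items():
    print(f"test on {held_out}: {len(test)} cases, train on {len(train)} cases")
```

Each split keeps every case from the held-out center out of training, so the test score reflects performance on an unseen institution rather than on unseen patients from a familiar one.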
Results: The mixed training approach yielded a median DSC of 0.76 (IQR: 0.64-0.84) in five-fold cross-validation, not significantly different (p = 0.18) from models trained with each center's data excluded, which achieved a median DSC of 0.74 (IQR: 0.56-0.86). Multi-center training significantly improved performance for the Dresden cohort (multi-center median DSC 0.71, IQR: 0.58-0.80 vs. single-center 0.68, IQR: 0.50-0.80, p < 0.001) and the Cyprus cohort (multi-center 0.74, IQR: 0.62-0.83 vs. single-center 0.72, IQR: 0.54-0.82, p < 0.01). Munich and Freiburg also showed improvements with multi-center training, but these were not statistically significant (Munich: multi-center DSC 0.74, IQR: 0.60-0.80 vs. single-center 0.72, IQR: 0.59-0.82, p > 0.05; Freiburg: multi-center 0.78, IQR: 0.53-0.87 vs. single-center 0.71, IQR: 0.53-0.83, p = 0.23).
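The results are reported as a median DSC with an interquartile range. The sketch below shows one way such a summary could be computed from per-case scores. The scores are made up for illustration and are not study data. The function names `median_iqr` and `format_summary` are hypothetical, and the quartile method is an assumption, since the abstract does not state which definition was used.

```python
import statistics
from typing import Sequence, Tuple


def median_iqr(scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (median, first quartile, third quartile) of per-case scores."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to estimate quartiles")
    # Assumption: 'inclusive' quartiles (linear interpolation within the sample).
    q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return q2, q1, q3


def format_summary(scores: Sequence[float]) -> str:
    """Format the scores in the style used in the Results section."""
    med, q1, q3 = median_iqr(scores)
    return f"median DSC {med:.2f} (IQR: {q1:.2f}-{q3:.2f})"


# Made-up per-case DSC values, for illustration only (not study data).
example_scores = [0.50, 0.60, 0.70, 0.80, 0.90]
print(format_summary(example_scores))
```

Other quartile definitions can shift the reported IQR bounds slightly on small cohorts such as Dresden (n = 14), so the method is worth stating alongside the numbers.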
Conclusion: CNNs trained for auto-contouring the intraprostatic gross tumor volume (GTV) in 18F-PSMA-1007 PET on a diverse multi-center dataset mostly generalize well to unseen data from other centers. For intraprostatic 18F-PSMA-1007 PET GTV segmentation, training on a multicentric dataset can improve performance compared with training exclusively on a single-center dataset. The segmentation performance of the same CNN can vary depending on the datasets used for training and testing.
Radiation Oncology
ONCOLOGY - RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
CiteScore: 6.50
Self-citation rate: 2.80%
Articles published: 181
Review time: 3-6 weeks
About the journal:
Radiation Oncology encompasses all aspects of research that impacts on the treatment of cancer using radiation. It publishes findings in molecular and cellular radiation biology, radiation physics, radiation technology, and clinical oncology.