Optimizing Performance of Transformer-based Models for Fetal Brain MR Image Segmentation.
Nicolò Pecco, Pasquale Anthony Della Rosa, Matteo Canini, Gianluca Nocera, Paola Scifo, Paolo Ivo Cavoretto, Massimo Candiani, Andrea Falini, Antonella Castellano, Cristina Baldoli
求助PDF
{"title":"Optimizing Performance of Transformer-based Models for Fetal Brain MR Image Segmentation.","authors":"Nicolò Pecco, Pasquale Anthony Della Rosa, Matteo Canini, Gianluca Nocera, Paola Scifo, Paolo Ivo Cavoretto, Massimo Candiani, Andrea Falini, Antonella Castellano, Cristina Baldoli","doi":"10.1148/ryai.230229","DOIUrl":null,"url":null,"abstract":"<p><p>Purpose To test the performance of a transformer-based model when manipulating pretraining weights, dataset size, and input size and comparing the best model with the reference standard and state-of-the-art models for a resting-state functional (rs-fMRI) fetal brain extraction task. Materials and Methods An internal retrospective dataset (172 fetuses, 519 images; collected 2018-2022) was used to investigate influence of dataset size, pretraining approaches, and image input size on Swin-U-Net transformer (UNETR) and UNETR models. The internal and external (131 fetuses, 561 images) datasets were used to cross-validate and to assess generalization capability of the best model versus state-of-the-art models on different scanner types and number of gestational weeks (GWs). The Dice similarity coefficient (DSC) and the balanced average Hausdorff distance (BAHD) were used as segmentation performance metrics. Generalized equation estimation multifactorial models were used to assess significant model and interaction effects of interest. Results The Swin-UNETR model was not affected by the pretraining approach and dataset size and performed best with the mean dataset image size, with a mean DSC of 0.92 and BAHD of 0.097. Swin-UNETR was not affected by scanner type. Generalization results on the internal dataset showed that Swin-UNETR had lower performance compared with the reference standard models and comparable performance on the external dataset. Cross-validation on internal and external test sets demonstrated better and comparable performance of Swin-UNETR versus convolutional neural network architectures during the late-fetal period (GWs > 25) but lower performance during the midfetal period (GWs ≤ 25). Conclusion Swin-UNTER showed flexibility in dealing with smaller datasets, regardless of pretraining approaches. For fetal brain extraction from rs-fMR images, Swin-UNTER showed comparable performance with that of reference standard models during the late-fetal period and lower performance during the early GW period. <b>Keywords:</b> Transformers, CNN, Medical Imaging Segmentation, MRI, Dataset Size, Input Size, Transfer Learning <i>Supplemental material is available for this article.</i> © RSNA, 2024.</p>","PeriodicalId":29787,"journal":{"name":"Radiology-Artificial Intelligence","volume":" ","pages":"e230229"},"PeriodicalIF":8.1000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Radiology-Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1148/ryai.230229","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
引用
批量引用
Abstract
Purpose To test the performance of a transformer-based model when manipulating pretraining weights, dataset size, and input size and comparing the best model with the reference standard and state-of-the-art models for a resting-state functional (rs-fMRI) fetal brain extraction task. Materials and Methods An internal retrospective dataset (172 fetuses, 519 images; collected 2018-2022) was used to investigate influence of dataset size, pretraining approaches, and image input size on Swin-U-Net transformer (UNETR) and UNETR models. The internal and external (131 fetuses, 561 images) datasets were used to cross-validate and to assess generalization capability of the best model versus state-of-the-art models on different scanner types and number of gestational weeks (GWs). The Dice similarity coefficient (DSC) and the balanced average Hausdorff distance (BAHD) were used as segmentation performance metrics. Generalized equation estimation multifactorial models were used to assess significant model and interaction effects of interest. Results The Swin-UNETR model was not affected by the pretraining approach and dataset size and performed best with the mean dataset image size, with a mean DSC of 0.92 and BAHD of 0.097. Swin-UNETR was not affected by scanner type. Generalization results on the internal dataset showed that Swin-UNETR had lower performance compared with the reference standard models and comparable performance on the external dataset. Cross-validation on internal and external test sets demonstrated better and comparable performance of Swin-UNETR versus convolutional neural network architectures during the late-fetal period (GWs > 25) but lower performance during the midfetal period (GWs ≤ 25). Conclusion Swin-UNTER showed flexibility in dealing with smaller datasets, regardless of pretraining approaches. For fetal brain extraction from rs-fMR images, Swin-UNTER showed comparable performance with that of reference standard models during the late-fetal period and lower performance during the early GW period. Keywords: Transformers, CNN, Medical Imaging Segmentation, MRI, Dataset Size, Input Size, Transfer Learning Supplemental material is available for this article. © RSNA, 2024.