Pub Date: 2026-03-04; eCollection Date: 2026-03-01; DOI: 10.1093/radadv/umag012
Art of imaging: Aurora Cerebralis Rivers of the Mind. Radiology Advances 3(2):umag012. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12975340/pdf/
Pub Date: 2026-02-05; eCollection Date: 2026-03-01; DOI: 10.1093/radadv/umag007
Matthew Allan Thomas, Ryan C Lee, Tharun Alamuri, Dan Giardina, John Karageorgiou, Naganathan Mani, Daniel A Braga, Christopher D Malone
Background: Lung shunt fraction (LSF) derived from macroaggregated albumin (MAA)-based nuclear medicine imaging is a standard component of yttrium-90 selective internal radiation therapy (90Y-SIRT) treatment planning. Elimination of MAA-based LSF determination has been suggested in selected cases.
Purpose: To propose and evaluate a pretreatment identification method for patient-specific LSF that may influence treatment planning in 90Y-SIRT and necessitate LSF determination using MAA-based imaging.
Methods: MAA SPECT/CT-based LSF (LSF_SPECT) was analyzed retrospectively in glass 90Y-SIRT cases from September 2022 to June 2025 at a single center. A new metric (LSF_bound) was defined as the minimum LSF value where the maximum achievable perfused volume (PV) dose is determined by a selected lung dose threshold (Lungs_max) instead of a designated whole-liver dose threshold (Liver_max). LSF_bound values computed using both clinical and simulated treatment planning parameters were quantitatively evaluated relative to LSF_SPECT. A clinical workflow based on this new metric was evaluated.
Results: A total of 354 cases were analyzed from 297 patients (92 females and 205 males). Median (interquartile range) age at MAA-SPECT/CT was 69 (63-74) years. LSF_bound depends only on liver mass, lung mass, Liver_max, and Lungs_max, whereas PV size plays no role. Using observed LSF_SPECT distributions, the median (max) probability for LSF_SPECT to exceed LSF_bound was ≤1% (≤4%) for hepatocellular carcinoma ≤ 8 cm and non-hepatocellular carcinoma cases without macrovascular invasion (87% of all cases). Receiver operating characteristic analysis showed that pretreatment use of LSF_bound could achieve 100% sensitivity and >60% specificities at Liver_max values up to 180 Gy.
Conclusion: Patient-specific, MAA-based LSF determination may be obviated in most 90Y-SIRT cases as LSF and Lungs_max play no role in limiting the achievable PV dose. Pretreatment calculation of LSF_bound provides individualized, quantitative guidance for identifying when MAA-based, patient-specific LSF assessment is warranted.
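The abstract does not give a formula for the LSF bound, so the following is only a sketch of one reading that is consistent with the stated dependencies. It assumes that lung dose and whole-liver dose both scale linearly with administered activity through the same energy-per-activity constant: lung dose proportional to A × LSF / lung mass, and whole-liver dose proportional to A × (1 − LSF) / liver mass. Under that assumption the lung threshold becomes the limiting one when LSF / (1 − LSF) ≥ (lung threshold × lung mass) / (liver threshold × liver mass), and both the constant and the PV size cancel. The function name, argument names, and example values are hypothetical.

```python
def lsf_bound(liver_mass_kg, lung_mass_kg, liver_max_gy, lungs_max_gy):
    """Smallest LSF (as a fraction) at which the lung dose threshold,
    rather than the whole-liver dose threshold, caps the activity.

    Assumed model (not taken from the paper):
        lung dose        ~ A * LSF / lung_mass
        whole-liver dose ~ A * (1 - LSF) / liver_mass
    with the same proportionality constant in both.
    """
    lung_term = lungs_max_gy * lung_mass_kg
    liver_term = liver_max_gy * liver_mass_kg
    return lung_term / (lung_term + liver_term)


# Hypothetical patient: 1.8 kg liver, 1.0 kg lungs,
# 150 Gy whole-liver threshold, 30 Gy lung threshold.
example_bound = lsf_bound(1.8, 1.0, 150.0, 30.0)
print(f"LSF bound: {example_bound:.1%}")
```

In this sketch a larger liver or a higher whole-liver threshold lowers the bound, which matches the abstract's statement that the metric depends only on the two masses and the two thresholds.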
When is patient-specific lung shunt fraction necessary in 90Y selective internal radiation therapy of liver cancer? Radiology Advances 3(2):umag007. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13005926/pdf/
Pub Date: 2026-02-03; eCollection Date: 2026-01-01; DOI: 10.1093/radadv/umag003
Eduardo J Mortani Barbosa, Yohan Kim, Yanbo Zhang, Arnaud A A Setio, Francois Mellot, Philippe A Grenier, Mathis Zimmermann, Bogdan Georgescu, Sasa Grbic, Warren B Gefter
Background: Pulmonary nodules are commonly encountered in lung cancer screening. The risk of malignancy varies widely and is generally estimated using expert consensus guidelines (Lung CT Imaging Reporting and Data Systems [Lung-RADS]).
Purpose: To assess the performance of a deep learning algorithm (Deep Pulmonary Nodule Profiler [DeepPNP]) for pulmonary nodule malignancy risk estimation in a lung cancer screening dataset and the effect of data enrichment in model training.
Materials and methods: A retrospective analysis was conducted using 3 datasets. DeepPNP is a 3D convolutional network (EfficientNet-B0-based) operating on nodule-centered 3D patches. For the DeepPNP model training and validation, the National Lung Screening Trial (NLST) dataset was combined with 2 independent malignant nodule-only datasets, resulting in a merged dataset of 28 057 nodules, including 2362 malignant nodules. An ablation model (DeepPNP-NLST) was trained on NLST only. Testing was conducted on a held-out subset of the NLST dataset. Performance metrics, including sensitivity, specificity, precision, F1 score, and accuracy, were analyzed at 3 operating thresholds chosen on the validation set to give specificities of 0.80, 0.85, and 0.90. Benchmarks included Lung-RADS v2022 and the PanCan model.
Results: On the NLST test set (including 2597 nodules from 1243 CT scans), DeepPNP achieved an area under the receiver operating characteristic curve (ROC AUC) of 0.96 (95% confidence interval [CI], 0.95-0.97), outperforming Lung-RADS AUC = 0.91 (95% CI, 0.89-0.94; P < .001) and PanCan AUC = 0.93 (95% CI, 0.91-0.95; P < .001). DeepPNP-NLST had an AUC of 0.95 (95% CI, 0.93-0.97; P = .045 vs DeepPNP), indicating a modest gain from positive-only supplementation. Subgroup analyses showed consistent outperformance across nodule sizes and types. Operating-point metrics at 0.80/0.85/0.90 specificity are reported; at 0.80 specificity, DeepPNP achieved sensitivity of 0.94 (100/107; 95% CI, 0.88-0.98) and specificity of 0.88 (2196/2490; 95% CI, 0.87-0.90).
Conclusion: DeepPNP outperformed established malignancy risk models in lung cancer screening. The inclusion of biopsy-confirmed malignant nodules from 2 external datasets provided a measurable performance gain, underscoring the importance of data enrichment during model training.
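The operating-point procedure described in the methods (pick a threshold on the validation set for a target specificity, then report sensitivity and specificity on the test set) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and all scores are hypothetical, and a nodule is assumed to be called positive when its score is at or above the threshold.

```python
import math


def specificity(neg_scores, threshold):
    """Fraction of benign nodules scored below the threshold."""
    return sum(s < threshold for s in neg_scores) / len(neg_scores)


def sensitivity(pos_scores, threshold):
    """Fraction of malignant nodules scored at or above the threshold."""
    return sum(s >= threshold for s in pos_scores) / len(pos_scores)


def threshold_for_specificity(neg_scores, target):
    """Lowest candidate threshold whose specificity on the given
    (validation) negatives reaches the target."""
    candidates = sorted(set(neg_scores)) + [math.inf]
    for t in candidates:
        if specificity(neg_scores, t) >= target:
            return t


# Hypothetical validation scores for benign nodules.
val_neg = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
threshold = threshold_for_specificity(val_neg, 0.80)

# Hypothetical held-out test scores, evaluated at the frozen threshold.
test_pos = [0.95, 0.80, 0.65, 0.75]
test_neg = [0.10, 0.20, 0.72, 0.30, 0.50]
test_sens = sensitivity(test_pos, threshold)
test_spec = specificity(test_neg, threshold)
print(threshold, test_sens, test_spec)
```

Because the threshold is frozen before testing, the specificity measured on the test set need not equal the validation target, which is why the abstract can report a test specificity of 0.88 at the nominal 0.80 operating point.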
Deep learning-based pulmonary nodule risk assessment outperforms established malignancy risk scores in lung cancer screening. Radiology Advances 3(1):umag003. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12944827/pdf/
Pub Date: 2026-01-25; eCollection Date: 2026-01-01; DOI: 10.1093/radadv/umag005
Alexis B Slutsky-Ganesh, Salomé Baup, Upasana U Bharadwaj, Jake A Slaton, Melanie Valencia, Jed A Diekfuss, Taylor M Zuleger, Shayla M Warren, Kim D Barber Foss, Kyle Hammond, John W Xerogeanes, Ruud B van Heeswijk, Gregory D Myer, Augustin C Ogier
Background: Accurate thigh muscle segmentation from magnetic resonance imaging (MRI) enables quantitative assessment of muscle health. Although manual segmentation is the gold standard, it is labor-intensive and variable, and existing automated/semi-automatic approaches remain limited by segmentation errors/user dependence, restricting scalability. Defining data requirements for robust automated segmentation therefore remains a critical unmet need.
Purpose: To determine the number of annotated lower extremity MRI studies needed to train an accurate deep learning (DL) model for thigh muscle segmentation and to assess the effect of training size on agreement of downstream quantitative measures.
Materials and methods: Lower extremity MR images were obtained from competitive athletes with anterior cruciate ligament injuries and professional-level football athletes scanned at a single site on a 3 T GE Premier system. Fourteen thigh muscles were segmented using semi-automatic propagation followed by manual correction to generate high-quality ground-truth assisted manual segmentations (Seg_M). Thirteen DL models (nnU-Net) were trained with Seg_M on increasing numbers of training subjects (N_train) ranging from N_train = 5 up to N_train = 120, each evaluated on a fixed independent test set of 41 subjects. Automated segmentation (Seg_A) performance was evaluated using standard geometric accuracy metrics (Dice similarity coefficient [DSC], relative volume difference [RVD], Hausdorff Distance [HD], HD95, and average symmetric surface distance [ASSD]). To determine whether Seg_A would lead to meaningful quantitative MRI results, we also compared fat fraction and diffusion-tensor imaging measures extracted from Seg_A to those derived from Seg_M.
Results: DL model training on N_train = 20 subjects achieved high accuracy on the fixed test set (mean ± SD: DSC 0.94 ± 0.02; RVD 4.9% ± 5.2%; ASSD 0.8 ± 0.4 mm; HD95 3.2 ± 2.8 mm), with modest improvement at 50 subjects.
Conclusion: Twenty annotated images were sufficient for clinically acceptable performance, supporting streamlined segmentation and quantitative reporting in athlete care and research.
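Two of the geometric metrics named in the methods, the Dice similarity coefficient and the relative volume difference, have simple overlap-based definitions. The sketch below represents each mask as a set of voxel coordinates; the masks are toy data, the function names are hypothetical, and the signed form of RVD is an assumption, since the abstract does not say whether signed or absolute differences were reported.

```python
def dice(auto_mask, ref_mask):
    """Dice similarity coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not auto_mask and not ref_mask:
        return 1.0  # two empty masks agree perfectly
    overlap = len(auto_mask & ref_mask)
    return 2 * overlap / (len(auto_mask) + len(ref_mask))


def relative_volume_difference(auto_mask, ref_mask):
    """Signed volume difference of the automated mask relative to the
    reference, in percent (assumes equal voxel size in both masks)."""
    return 100.0 * (len(auto_mask) - len(ref_mask)) / len(ref_mask)


# Toy 2D masks given as sets of (row, column) voxel indices.
reference = {(0, 0), (0, 1), (1, 0), (1, 1)}
automated = {(0, 1), (1, 0), (1, 1), (2, 1), (2, 2)}

dsc = dice(automated, reference)
rvd = relative_volume_difference(automated, reference)
print(f"DSC = {dsc:.2f}, RVD = {rvd:.1f}%")
```

The two metrics answer different questions: DSC penalizes voxels placed in the wrong location, whereas RVD only compares total volume and can be small even when the overlap is poor.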
Optimizing MRI annotation workflows for high-accuracy deep learning thigh muscle segmentation in athletes. Radiology Advances 3(1):umag005. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12906233/pdf/
Pub Date: 2026-01-21; eCollection Date: 2026-03-01; DOI: 10.1093/radadv/umag004
Bingjing Zhou, Su Wu, James Francis Griffith, Fan Xiao, Miaoru Zhang, Takeshi Fukuda, Lai-Shan Tam
Background: Synovitis is the key inflammatory feature of rheumatoid arthritis (RA). Quantitative assessment of synovitis better correlates with patient outcomes than semiquantitative assessment but it is time-consuming.
Purpose: To develop and validate an automated model for segmentation and quantification of wrist synovial tissue volume on postcontrast fat-suppressed T1-weighted MRI.
Material and methods: Patients with early RA (symptoms for ≤24 months) at a single center were recruited at baseline and were followed up at year 1 and year 8. Postcontrast axial fat-suppressed T1-weighted images of the most symptomatic wrist were acquired at 3.0 T. One observer manually segmented consecutive synovitis areas on all MRI datasets. A framework based on the convolutional neural network nnU-Net was trained and validated on 295 image datasets (5-fold cross-validation with image-level splits). The rheumatoid arthritis MRI score was used to semiquantitatively grade synovitis. Synovial volume manually segmented by a single reader was used as the reference standard. Forty-five external image datasets from 2 different imaging centers were used to test generalizability.
Results: For automated synovitis segmentation, the overall Sørensen-Dice similarity coefficient (DSC) was 0.75 ± 0.11 (mean ± SD) compared to manual segmentation. Higher DSC values were found in patients with moderate (0.80 ± 0.06) and severe (0.84 ± 0.05) degrees of synovitis. The model had a similar performance with externally acquired data (DSC value: 0.70 ± 0.20). Predicted and manually segmented synovitis volume measurements showed excellent agreement (Pearson correlation: r = 0.975, P < .001).
Conclusion: A fully automated model quantified wrist synovial tissue volume with good agreement to manual reference and maintained performance on external data, supporting potential use in clinical studies and prospective evaluation in practice.
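The volume agreement reported in the results is a Pearson correlation between predicted and manually segmented volumes. A minimal sketch of that calculation is shown here with hypothetical volumes; the function name and the data are illustrative only.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical synovial volumes in mL (manual vs. model prediction).
manual_ml = [2.1, 4.8, 7.5, 10.2, 15.0]
predicted_ml = [2.4, 4.5, 7.9, 9.6, 14.1]

r = pearson_r(manual_ml, predicted_ml)
print(f"r = {r:.3f}")
```

Pearson's r measures linear association, so a systematic offset or scale factor between the two volume series would not lower it.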
nnUnet-based automated quantification of wrist joint synovitis volume in patients with rheumatoid arthritis: a feasibility study. Radiology Advances 3(2):umag004. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12975715/pdf/
Pub Date: 2026-01-21; eCollection Date: 2026-03-01; DOI: 10.1093/radadv/umag006
Kathrin Bäumler, Marina Codari, Domenico Mastrodicasa, Gabriel Mistelbauer, Martin J Willemink, Shannon Walters, Virginia Hinostroza, Valery Turner, Leonid Chepelev, Apichaya Sriprachyakul, Mohammad H Madani, Alex Ewane, Edward P Chen, Alison L Marsden, Benoit Desjardins, Dominik Fleischmann
Background: Long-term aortic dissection monitoring requires consistent, landmark-based measurements over time.
Purpose: To evaluate the performance of deep reinforcement learning (DRL) agents for the detection of anatomic landmarks in patients with Stanford Type B aortic dissection (TBAD).
Materials and methods: This is an international retrospective study of 396 CT angiography scans of patients with TBAD from 9 participating sites (mean age, 57.6 years ± 13.7 [SD]; 236 male, 160 female). Aortic landmarks, including the aortic annulus and 8 aortic branch vessels, were manually labeled. Additionally, interobserver variability data were collected between 2 observers for 30 scans. DRL agents were trained independently for each landmark with the manual labels serving as the reference standard. Unique landmark locations were obtained from (1) single agents' predictions and (2) clusters of landmark predictions using the DBSCAN clustering algorithm. The performance was analyzed based on distance metrics (mean, median, quantiles) and failure rates, defined as a distance error of more than 10 mm. Interobserver variability data were analyzed with a pairwise Wilcoxon test.
Results: On the internal test set, DRL single agents predicted landmark locations with median errors of 2.7 (95% CI, 2.2-3.3) mm and 4.8% failure rate. Cluster-based predictions resulted in a median error of 2.5 (95% CI, 2.4-2.7) mm and 4.0% failure rate. Pooled over all landmarks, cluster-based predictions outperformed single-agent predictions (P < 1e-5). In the external test set, cluster-based DRL models demonstrated significantly lower localization errors and fewer failures compared to single-agent DRL models (P < .01). Relative to human interobserver variability, single-agent predictions were not significantly different and cluster-based predictions were significantly better (P < .05). The median processing time for a single agent's prediction was 1.0 second (IQR, 0.7-1.4 seconds).
Conclusion: Single-agent and cluster-based DRL predict aortic landmarks in patients with TBAD with high accuracy and precision, comparable to the variability between human observers.
Deep reinforcement learning for automatic anatomic CT landmark localization in Stanford Type B aortic dissection. Radiology Advances 3(2):umag006. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12956048/pdf/
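The evaluation metrics for the landmark study above (distance error per landmark, median error, and failure rate with a 10 mm cutoff) can be sketched as follows. Coordinates are hypothetical and assumed to be in millimetres; the function names are illustrative, and the DBSCAN clustering step is not reproduced here.

```python
import math
import statistics


def localization_errors(predicted, reference):
    """Euclidean distance (mm) between each predicted landmark and its
    manually labeled reference position."""
    return [math.dist(p, r) for p, r in zip(predicted, reference)]


def failure_rate(errors_mm, limit_mm=10.0):
    """Fraction of predictions whose error exceeds the limit
    (the abstract defines a failure as an error of more than 10 mm)."""
    return sum(e > limit_mm for e in errors_mm) / len(errors_mm)


# Hypothetical landmark positions (x, y, z) in mm.
reference = [(0, 0, 0), (10, 0, 0), (0, 20, 0), (5, 5, 5)]
predicted = [(3, 4, 0), (10, 0, 2), (0, 20, 12), (5, 5, 5)]

errors = localization_errors(predicted, reference)
median_error = statistics.median(errors)
failures = failure_rate(errors)
print(errors, median_error, failures)
```

Reporting the median alongside the failure rate keeps a few gross misses from dominating the summary while still making them visible.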
Pub Date: 2026-01-19; eCollection Date: 2026-01-01; DOI: 10.1093/radadv/umag002
Wenwen Zhang, Sevgi Gokce Kafali, Timothy Adamos, Kelsey Kuwahara, Ashley Dong, Jessica Li, Shu-Fu Shih, Timoteo Delgado-Esbenshade, Shilpy Chowdhury, Spencer Loong, Jeremy Moretz, Samuel R Barnes, Zhaoping Li, Shahnaz Ghahremani, Kara L Calkins, Holden H Wu
Background: Pediatric abdominal visceral and subcutaneous adipose tissue (VAT, SAT) quantified on magnetic resonance imaging (MRI) can assess risk for metabolic diseases. However, the complex structure of VAT in children and the lack of sufficient MRI datasets pose challenges for developing automated segmentation methods.
Purpose: To achieve accurate and rapid automated segmentation of pediatric abdominal VAT and SAT on motion-robust free-breathing (FB) 3D Dixon MRI by developing a cross-cohort federated learning (FL) framework that leverages adult datasets.
Materials and methods: 3D FB-MRI datasets were prospectively acquired in children 6-18 years old (single center, 2 scanners; 2016-2023) and used to train 3D neural network models for segmenting abdominal VAT and SAT. The FL model was trained across the pediatric cohort and a separate adult cohort (5 centers, 7 scanners) without requiring direct data sharing. Segmentation performance of the FL model was assessed by Dice scores with respect to references and compared with standalone local training and joint training with full data access. Quantification of VAT and SAT volume and proton-density fat fraction (PDFF) was compared against references using intraclass correlation coefficients (ICCs) and Bland-Altman analysis. Differences between training approaches were analyzed using the Kruskal-Wallis test followed by paired Wilcoxon signed-rank tests.
Results: The FL model, trained and tested with 134 children (mean age, 13.3 years ± 2.7 [standard deviation]; 71 males) and 920 adults (50.4 years ± 14.0; 677 females), achieved mean Dice scores of 91.09% (VAT) and 95.55% (SAT), outperforming standalone training (VAT: P < .001) and performing comparably to joint training (VAT: P = .21). Volume quantification demonstrated strong agreement (VAT: ICC = 0.99, SAT: ICC = 1.00). PDFF quantification showed small mean differences (VAT: 0.21%, SAT: -1.19%). Inference time was <3 seconds for each subject.
Conclusion: The proposed FL framework achieved accurate and rapid automated segmentation and quantification of pediatric abdominal VAT and SAT on 3D FB-MRI.
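The Bland-Altman analysis named in the methods summarizes agreement as a mean difference (bias) with 95% limits of agreement. The sketch below shows that calculation under the usual assumption of roughly normally distributed differences; the function name `bland_altman` and the PDFF values are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def bland_altman(measured, reference):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(measured) != len(reference) or len(measured) < 2:
        raise ValueError("need at least two paired measurements")
    diffs = [m - r for m, r in zip(measured, reference)]
    bias = mean(diffs)   # mean difference between methods
    sd = stdev(diffs)    # sample standard deviation of the differences
    # 95% limits of agreement: bias +/- 1.96 SD.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical PDFF values (%) from automated vs reference segmentation.
auto_pdff = [10.0, 12.0, 14.0, 16.0]
ref_pdff = [9.0, 12.0, 15.0, 15.0]
bias, lower, upper = bland_altman(auto_pdff, ref_pdff)
print(f"bias = {bias:.2f}%, limits of agreement = [{lower:.2f}%, {upper:.2f}%]")
```

A bias near zero with narrow limits indicates that automated and reference quantification can be used interchangeably for that measure.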
{"title":"Cross-cohort federated learning for pediatric abdominal adipose tissue segmentation and quantification using free-breathing 3D MRI.","authors":"Wenwen Zhang, Sevgi Gokce Kafali, Timothy Adamos, Kelsey Kuwahara, Ashley Dong, Jessica Li, Shu-Fu Shih, Timoteo Delgado-Esbenshade, Shilpy Chowdhury, Spencer Loong, Jeremy Moretz, Samuel R Barnes, Zhaoping Li, Shahnaz Ghahremani, Kara L Calkins, Holden H Wu","doi":"10.1093/radadv/umag002","DOIUrl":"10.1093/radadv/umag002","url":null,"abstract":"<p><strong>Background: </strong>Pediatric abdominal visceral and subcutaneous adipose tissue (VAT, SAT) quantified on magnetic resonance imaging (MRI) can assess risk for metabolic diseases. However, the complex structure of VAT in children and the lack of sufficient MRI datasets pose challenges for developing automated segmentation methods.</p><p><strong>Purpose: </strong>To achieve accurate and rapid automated segmentation of pediatric abdominal VAT and SAT on motion-robust free-breathing (FB) 3D Dixon MRI by developing a cross-cohort federated learning (FL) framework that leverages adult datasets.</p><p><strong>Materials and methods: </strong>3D FB-MRI datasets were prospectively acquired in children 6-18 years old (single center, 2 scanners; 2016-2023) and used to train 3D neural network models for segmenting abdominal VAT and SAT. The FL model was trained across the pediatric cohort and a separate adult cohort (5 centers, 7 scanners) without requiring direct data sharing. Segmentation performance of the FL model was assessed by Dice scores with respect to references and compared with standalone local training and joint training with full data access. Quantification of VAT and SAT volume and proton-density fat fraction (PDFF) was compared against references using intraclass correlation coefficients (ICCs) and Bland-Altman analysis. 
Differences between training approaches were analyzed using the Kruskal-Wallis test followed by paired Wilcoxon signed-rank tests.</p><p><strong>Results: </strong>The FL model, trained and tested with 134 children (mean age, 13.3 years ± 2.7 [standard deviation]; 71 males) and 920 adults (50.4 years ± 14.0; 677 females), achieved mean Dice scores of 91.09% (VAT) and 95.55% (SAT), outperforming standalone training (VAT: <i>P</i> < .001) and performing comparably to joint training (VAT: <i>P</i> = .21). Volume quantification demonstrated strong agreement (VAT: ICC = 0.99, SAT: ICC = 1.00). PDFF quantification showed small mean differences (VAT: 0.21%, SAT: -1.19%). Inference time was <3 seconds for each subject.</p><p><strong>Conclusion: </strong>The proposed FL framework achieved accurate and rapid automated segmentation and quantification of pediatric abdominal VAT and SAT on 3D FB-MRI.</p>","PeriodicalId":519940,"journal":{"name":"Radiology advances","volume":"3 1","pages":"umag002"},"PeriodicalIF":0.0,"publicationDate":"2026-01-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12916172/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146230475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-13 eCollection Date: 2026-01-01 DOI: 10.1093/radadv/umag001
Richard B Schonour, Felicia I Tang, Kiara A Bowers, Pan Su, Michael A Ohliger, Yoo Jin Lee, Jonathan A Liu, Peder E Z Larson, Yang Yang, Jae Ho Sohn
Background: Mid-field (0.55 T) MRI has gained attention for pulmonary imaging because of reduced susceptibility artifact, but image quality remains variable across patient populations, limiting clinical adoption.
Purpose: To identify technical and clinical factors associated with poor image quality in 0.55 T lung MRI in adults with various pulmonary diseases.
Materials and methods: Adults with major pulmonary disease (eg, infection, pulmonary fibrosis, cancer) who were scheduled for chest CT or PET/CT at a single health care site were prospectively recruited from January to August 2023 to undergo same-day 0.55 T MRI scans. Exclusion criteria included inability to communicate in English, overlapping pulmonary diagnoses already represented, or declining consent. Respiratory-triggered T2-weighted BLADE and T1-weighted UTE sequences were acquired on a Siemens MAGNETOM Free.Max (0.55 T) scanner. Six radiologists independently graded overall image quality (1 = poor, 2 = fine, 3 = excellent). Respiratory metrics were quantified, including tidal depth, respiratory rate, and respiration length. Body mass index (BMI) and body surface area were calculated. One-way analysis of variance was used to test the association between these factors and image quality. Interreader agreement was assessed using Fleiss kappa and the intraclass correlation coefficient.
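The one-way analysis of variance mentioned above compares the mean of a factor across groups. The sketch below computes the F statistic by hand, assuming (as an illustration only) that participants are grouped by image-quality grade and the factor is BMI; the function name `one_way_anova_f` and all values are hypothetical. Obtaining a P value additionally requires the F distribution, for example via `scipy.stats.f_oneway`, which is not used here so that the block stays dependency-free.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical BMI values grouped by image-quality grade (3, 2, 1).
bmi_by_grade = [
    [22.0, 24.0, 26.0],  # grade 3 = excellent
    [25.0, 27.0, 29.0],  # grade 2 = fine
    [30.0, 32.0, 34.0],  # grade 1 = poor
]
f_stat, df_b, df_w = one_way_anova_f(bmi_by_grade)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # prints F(2, 6) = 12.25
```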
Results: Twenty-eight participants (mean age, 59 years ± 19; 17 women) were evaluated. Fibrotic interstitial lung disease was linked to degraded image quality. Deeper tidal depth (P = .04), longer respiration length (P = .002), and higher BMI (P = .02) were significant predictors of degradation on univariate analysis. Respiratory rate and body surface area were not significantly associated (P > .05).
Conclusion: This preliminary study suggests that higher BMI, pulmonary fibrosis, and deep or slow breathing patterns may be associated with degraded image quality in respiratory-triggered 0.55 T lung MRI. If confirmed in larger, more diverse cohorts, these findings could help identify patients at risk for lower-quality imaging and inform strategies to optimize image quality in clinical practice.
{"title":"Determinants of image quality in respiratory triggered free breathing lung MRI at 0.55 T in adults.","authors":"Richard B Schonour, Felicia I Tang, Kiara A Bowers, Pan Su, Michael A Ohliger, Yoo Jin Lee, Jonathan A Liu, Peder E Z Larson, Yang Yang, Jae Ho Sohn","doi":"10.1093/radadv/umag001","DOIUrl":"10.1093/radadv/umag001","url":null,"abstract":"<p><strong>Background: </strong>Mid-field MRI scans (0.55 T) have gained attention for pulmonary imaging because of reduced susceptibility artifact, but image quality remains variable across patient populations, limiting its clinical adoption.</p><p><strong>Purpose: </strong>To identify technical and clinical factors associated with poor image quality in 0.55 T lung MRI in adults with various pulmonary diseases.</p><p><strong>Materials and methods: </strong>Adults with major pulmonary disease (eg, infection, pulmonary fibrosis, cancer) who were scheduled for chest CT or PET/CT at a single health care site were prospectively recruited from January to August 2023 to undergo same-day 0.55 T MRI scans. Exclusion criteria included inability to communicate in English, overlapping pulmonary diagnoses already represented, or declining consent. Respiratory-triggered T2-weighted BLADE and T1-weighted UTE sequences were acquired on a Siemens MAGNETOM Free.Max (0.55 T) scanner. Six radiologists independently graded overall image quality (1 = poor, 2 = fine, 3 = excellent). Respiratory metrics were quantified, including tidal depth, respiratory rate, and respiration length. Body mass index (BMI) and body surface area were calculated. One-way analysis of variance was used to test the association between these factors and image quality. Interreader agreement was assessed using Fleiss kappa and the intraclass correlation coefficient.</p><p><strong>Results: </strong>Twenty-eight participants (mean age, 59 years ± 19; 17 women) were evaluated. Fibrotic interstitial lung disease was linked to degraded image quality. 
Deeper tidal depth (<i>P </i>= .04), longer respiration length (<i>P </i>= .002), and higher BMI (<i>P </i>= 0.02) were significant predictors of degradation on univariate analysis. Respiratory rate and body surface area were not significantly associated (<i>P </i>> .05).</p><p><strong>Conclusion: </strong>This preliminary study suggests that BMI, pulmonary fibrosis, and deep/slow breathing patterns may be associated with degraded respiratory triggered 0.55 T lung MRI. If confirmed in larger, more diverse cohorts, these findings could help identify patients at risk for lower quality imaging and inform strategies to optimize image quality in clinical practice.</p>","PeriodicalId":519940,"journal":{"name":"Radiology advances","volume":"3 1","pages":"umag001"},"PeriodicalIF":0.0,"publicationDate":"2026-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12902788/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146204412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-02 eCollection Date: 2026-01-01 DOI: 10.1093/radadv/umaf043
Michael L Wells, Jeff L Fidler
Background: The evaluation of gastrointestinal (GI) bleeding is frequently negative despite a prolonged workup involving several different radiologic and endoscopic tests. Ferumoxytol, a superparamagnetic iron oxide agent used for treatment of iron deficiency anemia, can be used off-label as a blood pool contrast agent at MRI.
Purpose: To perform a proof-of-concept study to determine if ferumoxytol-enhanced MRI (FeMRI) can be performed to assist in detecting the presence and location of GI bleeding in patients who have undergone a negative standard evaluation.
Methods: A retrospective convenience cohort of patients examined with FeMRI at a single center over a 2-year period was evaluated. Inclusion criteria included clinical suspicion for slow or intermittent, active GI bleeding not localized following conventional testing. Exclusion criteria included any allergy to ferumoxytol or iron-containing agents, additional MRI scans scheduled within the following 72 hours, current pregnancy or breastfeeding, history of medication-related hypotensive episodes not related to an acute illness, syncopal events, arrhythmia, or low resting blood pressure. Patients' clinical characteristics, examination results, and outcomes during their hospital stay were examined.
Results: Five males and 1 female with an average age of 66 years (range, 37-79 years) were imaged with FeMRI. All had undergone full diagnostic evaluation of the GI tract including upper and lower endoscopy and small bowel evaluation with imaging and/or endoscopic testing; in all patients, repeated rounds of testing were performed. Diagnostic images were obtained in all patients who underwent FeMRI. In 4/6 (67%) patients, FeMRI demonstrated active bleeding in the small bowel (n = 3) and colon (n = 1).
Conclusion: FeMRI was able to demonstrate the presence and location of active GI bleeding in hospitalized patients with previous extensive negative evaluation. FeMRI should be further evaluated to determine its performance characteristics as an examination in evaluating patients with GI bleeding.
{"title":"Feasibility of ferumoxytol-enhanced MRI for detection of gastrointestinal bleeding when conventional evaluation is negative.","authors":"Michael L Wells, Jeff L Fidler","doi":"10.1093/radadv/umaf043","DOIUrl":"https://doi.org/10.1093/radadv/umaf043","url":null,"abstract":"<p><strong>Background: </strong>The evaluation of gastrointestinal (GI) bleeding is frequently negative despite a prolonged workup involving several different radiologic and endoscopic tests. Ferumoxytol, a superparamagnetic iron oxide agent used for treatment of iron deficiency anemia, can be used off-label as a blood pool contrast agent at MRI.</p><p><strong>Purpose: </strong>To perform a proof-of-concept study to determine if ferumoxytol-enhanced MRI (FeMRI) can be performed to assist in detecting the presence and location of GI bleeding in patients who have undergone a negative standard evaluation.</p><p><strong>Methods: </strong>A retrospective convenience cohort of patients examined with FeMRI at a single center over a 2-year period was evaluated. Inclusion criteria included clinical suspicion for slow or intermittent, active GI bleeding not localized following conventional testing. Exclusion criteria included any allergy to ferumoxytol or iron-containing agents, additional MRI scans scheduled within the following 72 hours, current pregnancy or breastfeeding, history of medication-related hypotensive episodes not related to an acute illness, syncopal events, arrhythmia, or low resting blood pressure. The patient's clinical characteristics, examination results, and outcome during their hospital stay were examined.</p><p><strong>Results: </strong>Five males and 1 female with average age of 66 years (range, 37-79 years) were imaged with FeMRI. All had undergone full diagnostic evaluation of the GI tract including upper and lower endoscopy and small bowel evaluation with imaging and/or endoscopic testing; in all patients, repeated rounds of testing were performed. 
Diagnostic images were obtained in all patients who underwent FeMRI. In 4/6 (67%) patients, FeMRI demonstrated active bleeding in the small bowel (<i>n</i> = 3) and colon (<i>n</i> = 1).</p><p><strong>Conclusion: </strong>FeMRI was able to demonstrate the presence and location of active GI bleeding in hospitalized patients with previous extensive negative evaluation. FeMRI should be further evaluated to determine its performance characteristics as an examination in evaluating patients with GI bleeding.</p>","PeriodicalId":519940,"journal":{"name":"Radiology advances","volume":"3 1","pages":"umaf043"},"PeriodicalIF":0.0,"publicationDate":"2026-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12948164/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147328977","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}