Pub Date : 2025-01-30 DOI: 10.1007/s00259-025-07090-9
Chong Jiang, Zekun Jiang, Zitong Zhang, Hexiao Huang, Hang Zhou, Qiuhui Jiang, Yue Teng, Hai Li, Bing Xu, Xin Li, Jingyan Xu, Chongyang Ding, Kang Li, Rong Tian
Background
Pathological grade is a critical determinant of clinical outcomes and decision-making in follicular lymphoma (FL). This study aimed to develop a deep learning model as a digital biopsy for the non-invasive identification of FL grade.
Methods
This study retrospectively included 513 FL patients from five independent hospital centers, randomly divided into training, internal validation, and external validation cohorts. A multimodal fusion Transformer model integrating 3D PET tumor images with tabular data was developed to predict FL grade. The model was also equipped with explainable modules to enhance clinical interpretability and reliability: Gradient-weighted Class Activation Mapping (Grad-CAM) for PET images, SHapley Additive exPlanations analysis for tabular data, and the calculation of predictive contribution ratios for both modalities. Predictive performance was evaluated using the area under the receiver operating characteristic curve (AUC) and accuracy, and the model's prognostic value was also assessed.
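The two evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It assumes a binary grading task with one predicted probability per patient; the abstract does not state how grades were binarized or which threshold was used, so the function names, the 0.5 threshold, and the toy data are all hypothetical.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == t for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as one half, which is the Mann-Whitney U formulation of the AUC.
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc_value = roc_auc(labels, scores)
acc_value = accuracy(labels, scores)
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-based implementation would be the usual choice for larger data.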
Results
The Transformer model demonstrated high accuracy in grading FL, with AUCs of 0.964–0.985 and accuracies of 90.2–96.7% in the training cohort, and similar performance in the validation cohorts (AUCs: 0.936–0.971, accuracies: 86.4–97.0%). Ablation studies confirmed that the fusion model outperformed single-modality models (AUCs: 0.974 − 0.956, accuracies: 89.8%-85.8%). Interpretability analysis revealed that PET images contributed 81–89% of the predictive value. Grad-CAM highlighted the tumor and peri-tumor regions. The model also effectively stratified patients by survival risk (P < 0.05), highlighting its prognostic value.
Conclusions
Our study developed an explainable multimodal fusion Transformer model for accurate grading and prognosis of FL, with the potential to aid clinical decision-making.
Title: An explainable transformer model integrating PET and tabular data for histologic grading and prognosis of follicular lymphoma: a multi-institutional digital biopsy study
Journal: European Journal of Nuclear Medicine and Molecular Imaging
Purpose
To evaluate the diagnostic accuracy and clinical impact of fibroblast activation protein (FAP)-targeted PET/CT imaging in primary and metastatic breast cancer and compare the results with those of standard-of-care imaging (SCI) and [18F]FDG PET/CT.
Methods
We prospectively analyzed patients with diagnosed or suspected breast cancer who underwent concomitant FAP-targeted PET/CT (radiotracers including either [68Ga]Ga-FAPI-46 or [18F]FAPI-42) and [18F]FDG PET/CT scans from June 2020 to January 2024 at two medical centers. Breast ultrasound (US) imaging was performed in all treatment-naïve patients as SCI. The SUVmax, tumor-to-background ratio (TBR), lesion detection rate, and tumor-node-metastasis (TNM) classifications between FAP-targeted and [18F]FDG PET/CT were evaluated and compared.
Results
Sixty-one female patients (median age, 52 y; range, 28–82 y) were included. Among them, 23 patients underwent evaluation for a definitive diagnosis of suspected breast lesions, 15 underwent initial staging, and 23 were evaluated for the detection of recurrence. The sensitivities of breast US, [18F]FDG, and FAP-targeted PET/CT for detecting primary breast tumors were 82%, 79%, and 100%, respectively. For recurrent/metastatic lesions, the lesion-based detection rate of FAP-targeted PET/CT was significantly higher than that of [18F]FDG; this held for local and regional recurrence and for neck lymph node (LN), abdominal LN, bone, and liver metastases. Compared with [18F]FDG PET/CT, FAP-targeted PET/CT altered thirteen patients’ TNM staging/restaging (13/59, 22%) and nine patients’ clinical management (9/59, 15%). Compared with SCI, FAP-targeted PET/CT changed fourteen patients’ TNM staging/restaging (14/59, 24%) and eleven patients’ therapeutic regimens (11/59, 19%). There was no significant association between FAPI-derived SUVmax and receptor status or histologic type in either primary or metastatic lesions.
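The count/percentage pairs reported in the Results can be checked with simple arithmetic. The short Python sketch below recomputes each percentage from its fraction; the denominator of 59 is taken as given in the abstract, and the dictionary labels are only illustrative names.

```python
def percent(numerator: int, denominator: int) -> int:
    """Proportion expressed as a percentage, rounded to the nearest integer."""
    return round(100 * numerator / denominator)


# (count, denominator, percentage as printed in the abstract)
reported = {
    "TNM change vs [18F]FDG": (13, 59, 22),
    "management change vs [18F]FDG": (9, 59, 15),
    "TNM change vs SCI": (14, 59, 24),
    "regimen change vs SCI": (11, 59, 19),
}

# Collect any entry whose recomputed percentage differs from the printed one.
mismatches = [
    name for name, (count, total, printed) in reported.items()
    if percent(count, total) != printed
]
```

An empty `mismatches` list means every printed percentage agrees with its fraction after rounding.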
Conclusion
FAP-targeted PET/CT was superior to [18F]FDG in diagnosing primary and metastatic breast cancer, with higher radiotracer uptake and TBR, especially in the detection of primary/recurrent tumors, abdominal LN metastases, liver, and bone metastases. FAP-targeted PET/CT is superior to [18F]FDG and SCI in TNM staging and may improve tumor staging, recurrence detection, and implementation of necessary treatment modifications.
Title: FAP-targeted PET/CT imaging in patients with breast cancer from a prospective bi-center study: insights into diagnosis and clinic management
Authors: Wei Guo, Weizhi Xu, Tinghua Meng, Chunlei Fan, Hao Fu, Yizhen Pang, Liang Zhao, Long Sun, Jingxiong Huang, Yanjun Mi, Xinlu Wang, Haojun Chen
Pub Date : 2025-01-30 DOI: 10.1007/s00259-025-07108-2
Journal: European Journal of Nuclear Medicine and Molecular Imaging
Pub Date : 2025-01-30 DOI: 10.1007/s00259-025-07073-w
Eduardo Calderón, Lena S. Kiefer, Fabian P. Schmidt, Wenhong Lan, Andreas S. Brendlin, Christian P. Reinert, Stephan Singer, Gerald Reischl, Martina Hinterleitner, Helmut Dittmann, Christian la Fougère, Nils F. Trautwein
Purpose
Somatostatin receptor (SSTR)-PET is crucial for effective treatment stratification of neuroendocrine neoplasms (NENs). In highly proliferating or poorly differentiated NENs, dual-tracer approaches using additional [18F]FDG PET can effectively identify SSTR-negative disease, but they usually require separate imaging sessions. We evaluated the feasibility of a one-day dual-tracer imaging protocol with a low-activity [18F]FDG PET followed by an SSTR-PET using the recently introduced [18F]SiFAlin-TATE tracer in a long axial field-of-view (LAFOV) PET/CT scanner, and its implications for patient management.
Methods
Twenty NEN patients were included in this study. Initially, a low-activity [18F]FDG PET was performed (0.5 ± 0.01 MBq/kg; PET scan 60 min p.i.). At 4.2 ± 0.09 h after completion of the [18F]FDG PET, a standard activity of [18F]SiFAlin-TATE was administered (3.0 MBq/kg; PET scan 90 min p.i.). To ensure the quantification accuracy of the second scan, we evaluated the potential impact of residual [18F]FDG activity by segmenting organs with minimal physiological SSTR-tracer uptake, such as the brain and myocardium, and assessing the activity concentrations (ACTs) of tumor lesions. Residual tumor lesion ACTs of [18F]FDG were calculated by factoring in fluorine-18 decay, identifying a maximum residual ACT of 15% (R15%). To account for increased [18F]FDG trapping over time, higher residual ACTs of 20% (R20%) were considered. These simulated [18F]FDG ACTs were compared with those measured in the second PET scan with [18F]SiFAlin-TATE. The influence of the dual-tracer PET/CT results on therapeutic strategies was evaluated.
Results
[18F]FDG cerebral uptake significantly decreased in the subsequent SSTR-PET (mean uptake in [18F]FDG PET: SUVmean 6.0 ± 0.4; mean uptake in [18F]SiFAlin-TATE PET: SUVmean 0.2 ± 0.01; p < 0.0001), with similar results recorded for the myocardium. Simulated residual [18F]FDG ACTs represented only a minimal percentage of the ACTs measured in the tumor lesions from the second PET scan (R15%: mean 5.2 ± 0.9% and R20%: mean 6.8 ± 1.2%), indicating only minimal residual [18F]FDG activity that might interfere with the second PET scan using [18F]SiFAlin-TATE, and preserved semi-quantification of the latter. Dual-tracer PET/CT findings directly influenced changes in therapy plans in eleven (55%) of the examined patients.
Conclusion
LAFOV PET scanners enable a one-day dual-tracer protocol, providing diagnostic image quality while preserving the semi-quantification of two 18F-labeled radiotracers, potent
Title: One-day dual-tracer examination in neuroendocrine neoplasms: a real advantage of low activity LAFOV PET imaging
Journal: European Journal of Nuclear Medicine and Molecular Imaging
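The residual-activity estimate described in the Methods of this abstract rests on physical decay of fluorine-18. A minimal Python sketch of that arithmetic follows; it uses the physical half-life of fluorine-18 (109.77 min), while the function names, the elapsed times, and the activity concentrations are hypothetical and not taken from the study. It models physical decay only and ignores biological clearance or continued tracer trapping.

```python
F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18, in minutes


def remaining_fraction(elapsed_min: float, half_life_min: float = F18_HALF_LIFE_MIN) -> float:
    """Fraction of the original activity left after `elapsed_min` of physical decay."""
    return 0.5 ** (elapsed_min / half_life_min)


def simulated_residual_share(first_scan_act: float, residual_fraction: float,
                             second_scan_act: float) -> float:
    """Simulated leftover first-tracer concentration as a share of the second-scan value.

    `residual_fraction` plays the role of the R15% or R20% assumption (0.15 or 0.20).
    """
    return first_scan_act * residual_fraction / second_scan_act


# Hypothetical numbers for illustration only (kBq/mL); not study data.
fraction_after_5h = remaining_fraction(300.0)
share = simulated_residual_share(first_scan_act=4.0, residual_fraction=0.15,
                                 second_scan_act=12.0)
```

Keeping the decay factor and the share computation separate lets the same residual fraction be reused across lesions, which mirrors how a fixed R15% or R20% value is applied in the abstract.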
Pub Date : 2025-01-29 DOI: 10.1007/s00259-024-07061-6
Prodromos Gavriilidis, Felix M. Mottaghy, Michel Koole, Tineke van de Weijer, Cristina Mitea, Jochem A. J. van der Pol, Thiemo J. A. van Nijnatten, Floris P. Jansen, Roel Wierts
Purpose
The positron range effect can impair PET image quality of Gallium-68 (68Ga). A positron range correction (PRC) can be applied to reduce this effect. In this study, the effect of a tissue-independent PRC for 68Ga was investigated on patient data.
Methods
PET/CT data (40 patients: [68Ga]Ga-DOTATOC or [68Ga]Ga-PSMA) were reconstructed using the Q.Clear reconstruction algorithm. Two reconstructions were performed per patient: Q.Clear with and without PRC. SUVmax and contrast-to-noise ratio (CNR) values per lesion were compared between PRC and non-PRC images. Five experienced nuclear medicine physicians reviewed the images and chose the preferred reconstruction based on image quality, lesion detectability, and diagnostic confidence.
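The per-lesion comparison described above can be sketched in Python. The abstract does not define CNR, so the sketch assumes one common definition (lesion mean minus background mean, divided by the background standard deviation); the function names and voxel values are hypothetical.

```python
from statistics import mean, stdev
from typing import Sequence


def contrast_to_noise(lesion_voxels: Sequence[float], background_voxels: Sequence[float]) -> float:
    """CNR under an assumed definition: (lesion mean - background mean) / background SD."""
    return (mean(lesion_voxels) - mean(background_voxels)) / stdev(background_voxels)


def percent_change(with_prc: float, without_prc: float) -> float:
    """Relative change of a metric in the PRC image versus the non-PRC image, in percent."""
    return 100.0 * (with_prc - without_prc) / without_prc


# Hypothetical voxel values (SUV) for one lesion and its background region.
cnr_example = contrast_to_noise([10.0, 12.0], [1.0, 2.0, 3.0])
suvmax_change = percent_change(with_prc=11.0, without_prc=10.0)
```

Other CNR definitions exist, for example ones that normalize by the background mean, so values from this sketch are not directly comparable to the reported ones.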
Results
A total of 155 lesions were identified. The PRC resulted in a statistically significant increase in SUVmax and CNR for soft tissue lesions (6.4%, p < 0.001; 8.6%, p < 0.001), bone lesions (14.6%, p < 0.001; 12.5%, p < 0.001), and lung lesions (3.6%, p = 0.010; 6.3%, p = 0.001). This effect was most prominent in small lesions (SUVmax: 12.0%, p < 0.001, and CNR: 13.0%, p < 0.001). As assessed by the expert readers, PRC images achieved similar or better image quality, lesion detectability, and diagnostic confidence compared with the non-PRC images.
Conclusions
A tissue-independent PRC increased the SUVmax and CNR in soft tissue, bone, and lung lesions with a larger effect for the small lesions. Visual assessment demonstrated similar or better image quality, lesion detectability, and diagnostic confidence in PRC images compared to the non-PRC images.
Title: Impact of tissue-independent positron range correction on [68Ga]Ga-DOTATOC and [68Ga]Ga-PSMA PET image reconstructions: a patient data study
Journal: European Journal of Nuclear Medicine and Molecular Imaging
Pub Date : 2025-01-29 DOI: 10.1007/s00259-025-07091-8
David Haberl, Jing Ning, Kilian Kluge, Katarina Kumpf, Josef Yu, Zewen Jiang, Claudia Constantino, Alice Monaci, Maria Starace, Alexander R. Haug, Raffaella Calabretta, Luca Camoni, Francesco Bertagna, Katharina Mascherbauer, Felix Hofer, Domenico Albano, Roberto Sciagra, Francisco Oliveira, Durval Costa, Christian Nitsche, Marcus Hacker, Clemens P. Spielvogel
Purpose
Advances in deep learning for medical imaging are often constrained by the limited availability of large, annotated datasets, resulting in models that underperform when deployed under real-world conditions. This study investigated a generative artificial intelligence (AI) approach to creating synthetic medical images, using bone scintigraphy scans as an example, to increase the data diversity of small-scale datasets for more effective model training and improved generalization.
Methods
We trained a generative model on 99mTc-bone scintigraphy scans from 9,170 patients in one center to generate high-quality, fully anonymized, annotated scans representing two distinct disease patterns: (i) abnormal uptake indicative of bone metastases and (ii) cardiac uptake indicative of cardiac amyloidosis. A blinded reader study was performed to assess the clinical validity and quality of the generated data. We investigated the added value of the generated data by augmenting an independent small single-center dataset with synthetic data and training a deep learning model to detect abnormal uptake in a downstream classification task. We tested this model on 7,472 scans from 6,448 patients across four external sites in a cross-tracer and cross-scanner setting and associated the resulting model predictions with clinical outcomes.
Results
The clinical value and high quality of the synthetic imaging data were confirmed by four readers, who were unable to distinguish synthetic scans from real scans (average accuracy: 0.48 [95% CI 0.46–0.51]), disagreeing in 239 (60%) of 400 cases (Fleiss’ kappa: 0.18). Adding synthetic data to the training set improved model performance by a mean (± SD) of 33 (± 10)% AUC (p < 0.0001) for detecting abnormal uptake indicative of bone metastases and by 5 (± 4)% AUC (p < 0.0001) for detecting uptake indicative of cardiac amyloidosis across both internal and external testing cohorts, compared to models without synthetic training data. Patients with predicted abnormal uptake had adverse clinical outcomes (log-rank: p < 0.0001).
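The inter-reader agreement statistic cited above, Fleiss' kappa, can be computed from a table of rating counts. The Python sketch below implements the standard formula; the input layout, function name, and toy tables are hypothetical and do not reproduce the study's reader data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of raters
    who assigned case i to category j. Every case needs the same rater count."""
    n_cases = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every case must be rated by the same number of raters")

    # Observed agreement: share of agreeing rater pairs, averaged over cases.
    per_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_case) / n_cases

    # Chance agreement from the overall share of ratings in each category.
    n_categories = len(counts[0])
    shares = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    p_chance = sum(s * s for s in shares)

    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical toy tables: four raters, two categories ("real", "synthetic").
kappa_perfect = fleiss_kappa([[4, 0], [0, 4]])
kappa_split = fleiss_kappa([[2, 2], [2, 2]])
```

A value near 1 indicates agreement well above chance and values near or below 0 indicate agreement no better than chance. The function is undefined when all ratings fall into a single category, because chance agreement then equals 1.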
Conclusions
Generative AI enables the targeted generation of bone scintigraphy images representing different clinical conditions. Our findings point to the potential of synthetic data to overcome challenges in data sharing and in developing reliable and prognostic deep learning models in data-limited environments.