Extended Technical and Clinical Validation of Deep Learning-Based Brainstem Segmentation for Application in Neurodegenerative Diseases
Benno Gesierich, Laura Sander, Lukas Pirpamer, Dominik S. Meier, Esther Ruberte, Michael Amann, Tim Sinnecker, Antal Huck, Frank-Erik de Leeuw, Pauline Maillard, Sue Moy, Karl G. Helmer, MarkVCID Consortium, Johannes Levin, Günter U. Höglinger, PROMESA Study Group, Michael Kühne, Leo H. Bonati, Jens Kuhle, Philippe Cattin, Cristina Granziera, Regina Schlaeger, Marco Duering
Human Brain Mapping, volume 46, issue 3. Journal Article. Published 2025-02-12. DOI: 10.1002/hbm.70141
https://onlinelibrary.wiley.com/doi/10.1002/hbm.70141
Abstract
Disorders of the central nervous system, including neurodegenerative diseases, frequently affect the brainstem and can present with focal atrophy. This study aimed to (1) optimize deep learning-based brainstem segmentation for a wide range of pathologies and T1-weighted image acquisition parameters, (2) conduct a systematic technical and clinical validation, (3) improve segmentation quality in the presence of brainstem lesions, and (4) make an optimized brainstem segmentation tool available for public use. An intentionally heterogeneous ground truth dataset (n = 257) was employed in the training of deep learning models based on multi-dimensional gated recurrent units (MD-GRU) or the nnU-Net method. Segmentation performance was evaluated against ground truth labels. FreeSurfer was used for benchmarking in subsequent validation. Technical validation, including scan–rescan repeatability (n = 46) and inter-scanner reproducibility (n = 20, 3 different scanners) in unseen data, was conducted in patients with cerebral small vessel disease. Clinical validation in unseen data was performed in 1-year follow-up data of 16 patients with multiple system atrophy, evaluating the annual percentage volume change. Two lesion filling algorithms were investigated to improve segmentation performance in 23 patients with multiple sclerosis. The MD-GRU and nnU-Net models demonstrated very good segmentation performance (median Dice coefficients ≥ 0.95 each) and outperformed a previously published model trained on a narrower dataset. Scan–rescan repeatability and inter-scanner reproducibility yielded similar Bland–Altman-derived limits of agreement for longitudinal FreeSurfer (total brainstem volume repeatability/reproducibility 0.68/1.85), MD-GRU (0.72/1.46), and nnU-Net (0.48/1.52). All methods showed comparable performance in the detection of atrophy in the total brainstem (atrophy detected in 100% of patients) and its substructures. In patients with multiple sclerosis, lesion filling further improved the accuracy of brainstem segmentation. We enhanced and systematically validated two fully automated deep learning brainstem segmentation methods and released them publicly. This enables a broader evaluation of brainstem volume as a candidate biomarker for neurodegeneration.
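The abstract reports three kinds of summary measures: Dice coefficients against ground truth labels, Bland–Altman-derived limits of agreement for repeated volume measurements, and annual percentage volume change. The following is a minimal sketch of their textbook definitions in Python with NumPy. It is not the authors' released tool or analysis code; the function names are hypothetical, and the abstract does not state the exact formulation or units behind its reported limits of agreement, so the conventional bias ± 1.96 SD form is assumed here.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        # Convention chosen for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / total


def limits_of_agreement(first, second):
    """Bland-Altman bias and 95% limits of agreement for paired measurements.

    Assumes the conventional form: mean difference +/- 1.96 times the
    sample standard deviation of the paired differences.
    """
    diff = np.asarray(first, dtype=float) - np.asarray(second, dtype=float)
    bias = diff.mean()
    spread = 1.96 * diff.std(ddof=1)
    return bias, bias - spread, bias + spread


def annual_percent_change(baseline, follow_up, years=1.0):
    """Percentage volume change per year relative to the baseline volume."""
    return 100.0 * (follow_up - baseline) / baseline / years
```

For example, `dice_coefficient([1, 1, 0, 0], [1, 0, 0, 0])` evaluates to 2/3, since one voxel overlaps and the two masks contain three foreground voxels in total. With toy volumes of 25.0 at baseline and 24.5 one year later, `annual_percent_change(25.0, 24.5)` gives -2.0, that is, a 2% annual volume loss.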
About the journal:
Human Brain Mapping publishes peer-reviewed basic, clinical, technical, and theoretical research in the interdisciplinary and rapidly expanding field of human brain mapping. The journal features research derived from non-invasive brain imaging modalities used to explore the spatial and temporal organization of the neural systems supporting human behavior. Imaging modalities of interest include positron emission tomography, event-related potentials, electro- and magnetoencephalography, magnetic resonance imaging, and single-photon emission tomography. Brain mapping research in both normal and clinical populations is encouraged.
Article formats include Research Articles, Review Articles, Clinical Case Studies, and Technique, as well as Technological Developments, Theoretical Articles, and Synthetic Reviews. Technical advances, such as novel brain imaging methods, analyses for detecting or localizing neural activity, synergistic uses of multiple imaging modalities, and strategies for the design of behavioral paradigms and neural-systems modeling are of particular interest. The journal endorses the propagation of methodological standards and encourages database development in the field of human brain mapping.