Knowledge distillation on individual vertebrae segmentation exploiting 3D U-Net

IF 4.9 | Zone 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Computerized Medical Imaging and Graphics | Pub Date: 2024-02-08 | DOI: 10.1016/j.compmedimag.2024.102350

Luís Serrador, Francesca Pia Villani, Sara Moccia, Cristina P. Santos

Computerized Medical Imaging and Graphics, volume 113, Article 102350. Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0895611124000272


Abstract

Recent advances in medical imaging have driven the development of algorithms for individual vertebral segmentation on computed tomography (CT) scans. Although essential for diagnostic accuracy and treatment planning in orthopaedics, neurosurgery and oncology, these algorithms face challenges in clinical implementation, including integration into healthcare systems. We therefore explore knowledge distillation (KD) methods to train shallower networks capable of efficiently segmenting vertebrae in CT scans. This approach aims to reduce segmentation time, improve suitability for emergency cases, and make more efficient use of computational and memory resources. Building upon prior research in the field, a two-step segmentation approach was employed. First, the spine was located by predicting a heatmap indicating the probability of each voxel belonging to the spine. Then, vertebrae were segmented iteratively from the top to the bottom of the CT volume over the located spine, using a memory instance to record the already segmented vertebrae. KD was implemented by training a teacher network with performance similar to that reported in the literature and distilling its knowledge into a shallower network (the student). Two KD methods were applied: (1) using the soft outputs of both networks and (2) matching logits. Two publicly available datasets were used, comprising 319 CT scans from 300 patients and a total of 611 cervical, 2387 thoracic, and 1507 lumbar vertebrae. To ensure dataset balance and robustness, data augmentation was applied, including cleaning the memory instance to replicate the segmentation of the first vertebra. The teacher network achieved an average Dice similarity coefficient (DSC) of 88.22% and a Hausdorff distance (HD) of 7.71 mm, similar to other approaches in the literature. Through knowledge distillation from the teacher network, the student network’s average DSC increased from 75.78% to 84.70% and its HD decreased from 15.17 mm to 8.08 mm. Compared to other methods, our teacher network had up to 99.09% fewer parameters, 90.02% faster inference, 88.46% shorter total segmentation time, and an 89.36% lower associated carbon (CO₂) emission rate. Our student network had 75.00% fewer parameters than our teacher, resulting in a 36.15% reduction in inference time, a 33.33% decrease in total segmentation time, and a 42.96% reduction in CO₂ emissions. This study is the first to apply KD to individual vertebrae segmentation in CT, demonstrating that smaller neural networks can achieve performance comparable to existing methods.
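The abstract names two distillation strategies (soft outputs and logit matching) and reports results as a Dice similarity coefficient, but it does not give the loss formulas. The following is a minimal NumPy sketch of one common way to write these quantities for a binary, per-voxel segmentation output. It is not the authors' implementation: the function names, the sigmoid output, the cross-entropy form of the soft-output loss, and the temperature value are all assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(x):
    """Logistic function applied element-wise to per-voxel logits."""
    return 1.0 / (1.0 + np.exp(-x))


def dice_coefficient(pred_mask, true_mask, eps=1e-7):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    intersection = np.logical_and(pred, true).sum()
    return (2.0 * intersection + eps) / (pred.sum() + true.sum() + eps)


def soft_output_kd_loss(student_logits, teacher_logits, temperature=2.0, eps=1e-7):
    """Cross-entropy between temperature-softened teacher and student outputs.

    Assumption: a binary (sigmoid) output per voxel and a hypothetical
    temperature of 2.0; the paper may use a different formulation.
    """
    p_teacher = sigmoid(np.asarray(teacher_logits, dtype=float) / temperature)
    p_student = sigmoid(np.asarray(student_logits, dtype=float) / temperature)
    p_student = np.clip(p_student, eps, 1.0 - eps)  # avoid log(0)
    cross_entropy = -(
        p_teacher * np.log(p_student)
        + (1.0 - p_teacher) * np.log(1.0 - p_student)
    )
    return float(cross_entropy.mean())


def logit_matching_kd_loss(student_logits, teacher_logits):
    """Mean squared error between raw student and teacher logits."""
    diff = np.asarray(student_logits, dtype=float) - np.asarray(teacher_logits, dtype=float)
    return float(np.mean(diff ** 2))


# Toy 3D "volume" of logits standing in for network outputs (synthetic data).
rng = np.random.default_rng(0)
teacher_logits = rng.normal(size=(4, 4, 4))
student_logits = teacher_logits + rng.normal(scale=0.5, size=(4, 4, 4))

print("soft-output KD loss:", soft_output_kd_loss(student_logits, teacher_logits))
print("logit-matching KD loss:", logit_matching_kd_loss(student_logits, teacher_logits))
print("Dice (student vs. teacher masks):",
      dice_coefficient(student_logits > 0, teacher_logits > 0))
```

In practice a distillation term like these is usually combined with a supervised segmentation loss against the ground-truth mask; how the two are weighted is a design choice the abstract does not specify.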




Source journal

Computerized Medical Imaging and Graphics (Medicine: Nuclear Medicine)

CiteScore: 10.70
Self-citation rate: 3.50%
Number of publications: (not listed)
Review time: 26 days

Journal description: The purpose of the journal Computerized Medical Imaging and Graphics is to act as a source for the exchange of research results concerning algorithmic advances, development, and application of digital imaging in disease detection, diagnosis, intervention, prevention, precision medicine, and population health. Included in the journal will be articles on novel computerized imaging or visualization techniques, including artificial intelligence and machine learning, augmented reality for surgical planning and guidance, big biomedical data visualization, computer-aided diagnosis, computerized-robotic surgery, image-guided therapy, imaging scanning and reconstruction, mobile and tele-imaging, radiomics, and imaging integration and modeling with other information relevant to digital health. The types of biomedical imaging include: magnetic resonance, computed tomography, ultrasound, nuclear medicine, X-ray, microwave, optical and multi-photon microscopy, video and sensory imaging, and the convergence of biomedical images with other non-imaging datasets.