Pub Date: 2024-02-09; DOI: 10.1016/j.compmedimag.2024.102356
Amine Sadikine, Bogdan Badic, Jean-Pierre Tasu, Vincent Noblet, Pascal Ballet, Dimitris Visvikis, Pierre-Henri Conze
The extraction of abdominal structures using deep learning has recently attracted widespread interest in medical image analysis. Automatic abdominal organ and vessel segmentation is highly desirable to guide clinicians in computer-assisted diagnosis, therapy, or surgical planning. Although U-Net-inspired architectures extract large organs well, their capacity to automatically delineate smaller structures remains a major issue, especially given the increase in receptive field size deeper in the network. To handle abdominal structures of various sizes while exploiting efficient geometric constraints, we present a novel approach that integrates shape priors from a semi-overcomplete convolutional auto-encoder (S-OCAE) embedding into deep segmentation. Compared to standard convolutional auto-encoders (CAE), it exploits an over-complete branch that projects data onto higher dimensions to better characterize anatomical structures with a small spatial extent. Experiments on abdominal organ and vessel delineation performed on various publicly available datasets highlight the effectiveness of our method compared to the state of the art, including U-Net trained without and with shape priors from a traditional CAE. Exploiting a semi-overcomplete convolutional auto-encoder embedding as shape priors improves the ability of deep segmentation models to provide realistic and accurate abdominal structure contours.
Title: Improving abdominal image segmentation with overcomplete shape priors. Computerized Medical Imaging and Graphics, vol. 113, Article 102356.
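The abstract above does not give the training objective. As a rough sketch only, one common way to inject auto-encoder shape priors into a segmentation network is to penalize the distance between the embeddings of the predicted and reference masks. All symbols below are assumptions introduced for illustration and are not taken from the paper: S_θ is the segmentation network, E is the frozen encoder of a pre-trained shape auto-encoder, L_seg is a standard segmentation loss, and λ is a weighting factor.

```latex
% Hypothetical shape-prior objective (illustrative assumption, not the authors' formula)
\mathcal{L}(\theta) \;=\; \mathcal{L}_{\mathrm{seg}}\big(S_\theta(x),\, y\big)
\;+\; \lambda \,\big\lVert E\big(S_\theta(x)\big) - E(y) \big\rVert_2^2
```

Under this reading, the choice of E (a standard CAE or an S-OCAE) changes only the embedding space in which predicted and reference shapes are compared.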
Pub Date: 2024-02-09; DOI: 10.1016/j.compmedimag.2024.102347
Yiqing Liu, Farhad R. Nezami, Elazer R. Edelman
Characterizing coronary calcified plaque (CCP) provides essential insight into the diagnosis and treatment of atherosclerosis. Intravascular optical coherence tomography (OCT) offers significant advantages for detecting CCP and, with recent advances in deep learning techniques, even for automated segmentation. Most current methods have achieved promising results by adopting existing convolutional neural networks (CNNs) from the computer vision domain. However, their performance can be detrimentally affected by unseen plaque patterns and artifacts due to the inherent limitation of CNNs in contextual reasoning. To overcome this obstacle, we propose a Transformer-based pyramid network called AFS-TPNet for robust, end-to-end segmentation of CCP from OCT images. Its encoder is built upon the CSWin Transformer architecture, allowing for better perceptual understanding of calcified arteries at a higher semantic level. Specifically, an augmented feature split (AFS) module and a residual convolutional position encoding (RCPE) mechanism are designed to enhance the capability of the Transformer in capturing both fine-grained features and global contexts. Extensive experiments showed that AFS-TPNet trained using Lovasz Loss achieved superior performance in segmenting CCP under various contexts, surpassing prior state-of-the-art CNN and Transformer architectures by more than 6.58% in intersection over union (IoU) score. The application of this promising method to extract CCP features is expected to enhance clinical intervention and translational research using OCT.
Title: A transformer-based pyramid network for coronary calcified plaque segmentation in intravascular optical coherence tomography images. Computerized Medical Imaging and Graphics, vol. 113, Article 102347.
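The entry above reports its gain as an intersection over union (IoU) score. For reference, a minimal sketch of how IoU is typically computed for a pair of binary masks is given below. The function name `iou`, the nested-list mask format, and the convention for two empty masks are assumptions made for illustration, not details from the paper.

```python
def iou(pred, target):
    """Intersection over union of two binary masks (nested lists of 0/1)."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0   # pixel set in both masks
            union += 1 if (p or t) else 0    # pixel set in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return inter / union if union else 1.0


pred = [[0, 1, 1],
        [0, 1, 0]]
target = [[0, 1, 0],
          [0, 1, 1]]
print(iou(pred, target))  # 2 shared pixels out of 4 in the union
```

Averaging this value over classes or images gives the mean IoU usually quoted in segmentation benchmarks; how the paper aggregates it is not stated in the abstract.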
Pub Date: 2024-02-08; DOI: 10.1016/j.compmedimag.2024.102350
Luís Serrador, Francesca Pia Villani, Sara Moccia, Cristina P. Santos
Recent advances in medical imaging have highlighted the critical development of algorithms for individual vertebral segmentation on computed tomography (CT) scans. Essential for diagnostic accuracy and treatment planning in orthopaedics, neurosurgery and oncology, these algorithms face challenges in clinical implementation, including integration into healthcare systems. Consequently, our focus lies in exploring the application of knowledge distillation (KD) methods to train shallower networks capable of efficiently segmenting vertebrae in CT scans. This approach aims to reduce segmentation time, enhance suitability for emergency cases, and optimize computational and memory resource efficiency. Building upon prior research in the field, a two-step segmentation approach was employed. Firstly, the spine’s location was determined by predicting a heatmap, indicating the probability of each voxel belonging to the spine. Subsequently, an iterative segmentation of vertebrae was performed from the top to the bottom of the CT volume over the located spine, using a memory instance to record the already segmented vertebrae. KD methods were implemented by training a teacher network with performance similar to that found in the literature, and this knowledge was distilled to a shallower network (student). Two KD methods were applied: (1) using the soft outputs of both networks and (2) matching logits. Two publicly available datasets, comprising 319 CT scans from 300 patients and a total of 611 cervical, 2387 thoracic, and 1507 lumbar vertebrae, were used. To ensure dataset balance and robustness, effective data augmentation methods were applied, including cleaning the memory instance to replicate the first vertebra segmentation. The teacher network achieved an average Dice similarity coefficient (DSC) of 88.22% and a Hausdorff distance (HD) of 7.71 mm, showcasing performance similar to other approaches in the literature. 
Through knowledge distillation from the teacher network, the student network’s performance improved, with the average DSC increasing from 75.78% to 84.70% and the HD decreasing from 15.17 mm to 8.08 mm. Compared to other methods, our teacher network had up to 99.09% fewer parameters, 90.02% faster inference, 88.46% shorter total segmentation time, and an 89.36% lower associated carbon (CO2) emission rate. Our student network had 75.00% fewer parameters than our teacher, resulting in a 36.15% reduction in inference time, a 33.33% decrease in total segmentation time, and a 42.96% reduction in CO2 emissions. This study marks the first exploration of applying KD to the problem of individual vertebrae segmentation in CT, demonstrating the feasibility of achieving performance comparable to existing methods using smaller neural networks.
Title: Knowledge distillation on individual vertebrae segmentation exploiting 3D U-Net. Computerized Medical Imaging and Graphics, vol. 113, Article 102350.
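The entry above names two distillation variants: using the soft outputs of both networks, and matching logits. The abstract does not give the loss formulas, so the sketch below assumes the widely used formulations: a temperature-scaled KL divergence between softened class probabilities, and a mean squared error between raw logits. The function names and the default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature (assumed hyperparameter)."""
    m = max(logits)
    exps = [math.exp((z - m) / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_output_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened probabilities, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    return temperature ** 2 * kl


def logit_matching_loss(student_logits, teacher_logits):
    """Mean squared error between student and teacher logits."""
    n = len(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n


# Per-voxel logits for (background, vertebra); the values are made up.
teacher = [0.5, 3.0]
student = [1.0, 1.5]
print(soft_output_kd_loss(student, teacher))
print(logit_matching_loss(student, teacher))
```

In practice either term would be averaged over all voxels and combined with the usual supervised segmentation loss; the weighting used in the paper is not given in the abstract.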
Pub Date: 2024-02-08; DOI: 10.1016/j.compmedimag.2024.102348
Kai Lønning, Matthan W.A. Caan, Marlies E. Nowee, Jan-Jakob Sonke
The recurrent inference machine (RIM), a deep learning model that learns an iterative scheme for reconstructing sparsely sampled MRI, has been shown to perform well on accelerated 2D and 3D MRI scans, learn from small datasets, and generalize well to unseen types of data. Here we propose the dynamic recurrent inference machine (DRIM) for reconstructing sparsely sampled 4D MRI by exploiting correlations between respiratory states. The DRIM was applied to a 4D protocol for MR-guided radiotherapy of liver lesions based on repetitive interleaved coronal 2D multi-slice T2-weighted acquisitions. We demonstrate with an ablation study that the DRIM outperforms the RIM, increasing the SSIM score from about 0.89 to 0.95. The DRIM allowed for an approximately 2.7 times faster scan time than the current clinical protocol with only a slight loss in image sharpness. Correlations between slice locations can also be used, but were found to be of less importance, as were a majority of tested variations in network architecture, as long as the respiratory states are processed by the network. Through cross-validation, the DRIM is also shown to be robust in terms of training data. We further demonstrate good performance across a large range of subsampling factors, and conclude through an evaluation by a radiation oncologist that reconstructed images of the liver contour and inner structures are of a clinically acceptable standard at acceleration factors of 10x and 8x, respectively. Finally, we show that binning the data with respect to respiratory states prior to reconstruction comes at a slight cost to reconstruction quality, but increases the speed of the overall protocol.
Title: Dynamic recurrent inference machines for accelerated MRI-guided radiotherapy of the liver. Computerized Medical Imaging and Graphics, vol. 113, Article 102348.
Pub Date: 2024-02-07; DOI: 10.1016/j.compmedimag.2024.102354
Yu Jin, Juan Liu, Yuanyuan Zhou, Rong Chen, Hua Chen, Wensi Duan, Yuqi Chen, Xiao-Lian Zhang
Lung granuloma is a very common lung disease, and its specific diagnosis is important for determining the exact cause of the disease as well as the prognosis of the patient. An effective lung granuloma detection model based on computer-aided diagnostics (CAD) can help pathologists localize granulomas, thereby improving the efficiency of the specific diagnosis. However, CAD-based lung granuloma detection models face two critical challenges: the significant size differences between granulomas, and how to better utilize their morphological features. In this paper, we propose an automatic method, CRDet, to localize granulomas in histopathological images and address these challenges. We first introduce a multi-scale feature extraction network with self-attention to extract features at different scales simultaneously. The features are then converted to circle representations of granulomas by circle representation detection heads, aligning the features with the ground truth. In this way, we can also use the circular morphological features of granulomas more effectively. Finally, we propose a center point calibration method at the inference stage to further optimize the circle representation. For model evaluation, we built a lung granuloma circle representation dataset named LGCR, including 288 images from 50 subjects. Our method yielded 0.316 mAP and 0.571 mAR, outperforming state-of-the-art object detection methods on our proposed LGCR.
Title: CRDet: A circle representation detector for lung granulomas based on multi-scale attention features with center point calibration. Computerized Medical Imaging and Graphics, vol. 113, Article 102354.
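Detection scores such as mAP and mAR depend on an overlap measure between predicted and ground-truth objects. When objects are represented as circles, a natural choice is the IoU of two circles, which has a closed form. The abstract does not state which overlap measure CRDet uses, so the sketch below is an assumption for illustration; circles are given as hypothetical `(x, y, r)` tuples.

```python
import math


def circle_intersection_area(c1, c2):
    """Area shared by two circles, each given as (x, y, r)."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    d = math.hypot(x2 - x1, y2 - y1)
    if d >= r1 + r2:            # circles are disjoint or touch at one point
        return 0.0
    if d <= abs(r1 - r2):       # one circle lies entirely inside the other
        return math.pi * min(r1, r2) ** 2
    # General case: sum of two circular sectors minus the kite between them.
    a1 = r1 * r1 * math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
    a2 = r2 * r2 * math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
    kite = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                           * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - kite


def circle_iou(c1, c2):
    """Intersection over union of two circles."""
    inter = circle_intersection_area(c1, c2)
    union = math.pi * (c1[2] ** 2 + c2[2] ** 2) - inter
    return inter / union if union > 0 else 0.0


print(circle_iou((0.0, 0.0, 2.0), (0.0, 0.0, 1.0)))  # nested circles
print(circle_iou((0.0, 0.0, 1.0), (1.0, 0.0, 1.0)))  # partial overlap
```

A circle needs only three parameters and is invariant to rotation, which is one reason it can suit roughly round structures better than an axis-aligned box.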
Pub Date: 2024-02-07; DOI: 10.1016/j.compmedimag.2024.102349
Pierre-Henri Conze, Gustavo Andrade-Miranda, Yannick Le Meur, Emilie Cornec-Le Gall, François Rousseau
Autosomal-dominant polycystic kidney disease is a prevalent genetic disorder characterized by the development of renal cysts, leading to kidney enlargement and renal failure. Accurate measurement of total kidney volume through polycystic kidney segmentation is crucial to assess disease severity, predict progression and evaluate treatment effects. Traditional manual segmentation suffers from intra- and inter-expert variability, prompting the exploration of automated approaches. In recent years, convolutional neural networks have been employed for polycystic kidney segmentation from magnetic resonance images. However, the use of Transformer-based models, which have shown remarkable performance in a wide range of computer vision and medical image analysis tasks, remains unexplored in this area. With their self-attention mechanism, Transformers excel in capturing global context information, which is crucial for accurate organ delineations. In this paper, we evaluate and compare various convolutional-based, Transformers-based, and hybrid convolutional/Transformers-based networks for polycystic kidney segmentation. Additionally, we propose a dual-task learning scheme, where a common feature extractor is followed by per-kidney decoders, towards better generalizability and efficiency. We extensively evaluate various architectures and learning schemes on a heterogeneous magnetic resonance imaging dataset collected from 112 patients with polycystic kidney disease. Our results highlight the effectiveness of Transformer-based models for polycystic kidney segmentation and the relevancy of exploiting dual-task learning to improve segmentation accuracy and mitigate data scarcity issues. A promising ability in accurately delineating polycystic kidneys is especially shown in the presence of heterogeneous cyst distributions and adjacent cyst-containing organs. 
This work contributes to the advancement of reliable delineation methods in nephrology, paving the way for a broad spectrum of clinical applications.
Title: Dual-task kidney MR segmentation with transformers in autosomal-dominant polycystic kidney disease. Computerized Medical Imaging and Graphics, vol. 113, Article 102349.
Pub Date: 2024-02-06; DOI: 10.1016/j.compmedimag.2024.102351
Guangtong Yang, Chen Li, Yudong Yao, Ge Wang, Yueyang Teng
The low resolution of positron emission tomography (PET) limits its diagnostic performance. Deep learning has been successfully applied to achieve super-resolution PET. However, commonly used supervised learning methods in this context require many pairs of low- and high-resolution (LR and HR) PET images. Although unsupervised learning utilizes unpaired images, the results are not as good as those obtained with supervised deep learning. In this paper, we propose a quasi-supervised learning method, a new type of weakly-supervised learning, to recover HR PET images from LR counterparts by leveraging the similarity between unpaired LR and HR image patches. Specifically, LR image patches are taken from a patient as inputs, while the most similar HR patches from other patients are found as labels. The similarity between the matched HR and LR patches serves as a prior for network construction. Our proposed method can be implemented by designing a new network or modifying an existing network. As an example in this study, we have modified the cycle-consistent generative adversarial network (CycleGAN) for super-resolution PET. Our numerical and experimental results qualitatively and quantitatively show the merits of our method relative to state-of-the-art methods. The code is publicly available at https://github.com/PigYang-ops/CycleGAN-QSDL.
{"title":"Quasi-supervised learning for super-resolution PET","authors":"Guangtong Yang , Chen Li , Yudong Yao , Ge Wang , Yueyang Teng","doi":"10.1016/j.compmedimag.2024.102351","DOIUrl":"https://doi.org/10.1016/j.compmedimag.2024.102351","url":null,"abstract":"<div><p>Low resolution of positron emission tomography (PET) limits its diagnostic performance. Deep learning has been successfully applied to achieve super-resolution PET. However, commonly used supervised learning methods in this context require many pairs of low- and high-resolution (LR and HR) PET images. Although unsupervised learning utilizes unpaired images, the results are not as good as that obtained with supervised deep learning. In this paper, we propose a quasi-supervised learning method, which is a new type of weakly-supervised learning methods, to recover HR PET images from LR counterparts by leveraging similarity between unpaired LR and HR image patches. Specifically, LR image patches are taken from a patient as inputs, while the most similar HR patches from other patients are found as labels. The similarity between the matched HR and LR patches serves as a prior for network construction. Our proposed method can be implemented by designing a new network or modifying an existing network. As an example in this study, we have modified the cycle-consistent generative adversarial network (CycleGAN) for super-resolution PET. Our numerical and experimental results qualitatively and quantitatively show the merits of our method relative to the state-of-the-art methods. 
The code is publicly available at https://github.com/PigYang-ops/CycleGAN-QSDL.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"113 ","pages":"Article 102351"},"PeriodicalIF":5.7,"publicationDate":"2024-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139710315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated medical image segmentation plays a crucial role in diverse clinical applications. The high annotation costs of fully-supervised medical segmentation methods have spurred a growing interest in semi-supervised methods. Existing semi-supervised medical segmentation methods train the teacher segmentation network using labeled data to establish pseudo labels for unlabeled data. The quality of these pseudo labels is limited because these methods fail to address the significant bias in the data distribution learned from the limited labeled data. To address these challenges, this paper introduces a Correspondence-based Generative Bayesian Deep Learning (C-GBDL) model. Built upon the teacher–student architecture, we design a multi-scale semantic correspondence method to aid the teacher model in generating high-quality pseudo labels. Specifically, our teacher model, embedded with the multi-scale semantic correspondence, learns a better-generalized data distribution from input volumes by feature matching with the reference volumes. Additionally, a double uncertainty estimation scheme is proposed to further rectify the noisy pseudo labels. It uses the predictive entropy as the first uncertainty estimate and the structural similarity between the input volume and its corresponding reference volumes as the second. Four groups of comparative experiments conducted on two public medical datasets demonstrate the effectiveness and superior performance of our proposed model. Our code is available at https://github.com/yumjoo/C-GBDL.
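As a rough illustration of the first uncertainty estimate, the sketch below computes voxel-wise predictive entropy from the teacher's softmax output and keeps only low-entropy pseudo labels. The hard threshold `max_entropy` and the function names are assumptions made for this example; the abstract does not say how the entropy is used to rectify labels, and the structural-similarity term is not shown.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Entropy of a class-probability map; classes are on axis 0."""
    return -np.sum(probs * np.log(probs + eps), axis=0)

def filter_pseudo_labels(probs, max_entropy):
    """Return pseudo labels, a keep-mask, and the entropy map.

    probs: array of shape (n_classes, ...) with softmax outputs.
    max_entropy: hypothetical cut-off above which a voxel is discarded.
    """
    entropy = predictive_entropy(probs)
    labels = probs.argmax(axis=0)        # hard pseudo label per voxel
    keep = entropy < max_entropy         # True where the teacher is confident
    return labels, keep, entropy
```

For two classes the entropy ranges from 0 (fully confident) to ln 2 ≈ 0.693 (maximally uncertain), so a cut-off between these values separates confident from ambiguous voxels.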
{"title":"Correspondence-based Generative Bayesian Deep Learning for semi-supervised volumetric medical image segmentation","authors":"Yuzhou Zhao , Xinyu Zhou , Tongxin Pan , Shuyong Gao , Wenqiang Zhang","doi":"10.1016/j.compmedimag.2024.102352","DOIUrl":"10.1016/j.compmedimag.2024.102352","url":null,"abstract":"<div><p>Automated medical image segmentation plays a crucial role in diverse clinical applications. The high annotation costs of fully-supervised medical segmentation methods have spurred a growing interest in semi-supervised methods. Existing semi-supervised medical segmentation methods train the teacher segmentation network using labeled data to establish pseudo labels for unlabeled data. The quality of these pseudo labels is constrained as these methods fail to effectively address the significant bias in the data distribution learned from the limited labeled data. To address these challenges, this paper introduces an innovative Correspondence-based Generative Bayesian Deep Learning (C-GBDL) model. Built upon the teacher–student architecture, we design a multi-scale semantic correspondence method to aid the teacher model in generating high-quality pseudo labels. Specifically, our teacher model, embedded with the multi-scale semantic correspondence, learns a better-generalized data distribution from input volumes by feature matching with the reference volumes. Additionally, a double uncertainty estimation schema is proposed to further rectify the noisy pseudo labels. The double uncertainty estimation takes the predictive entropy as the first uncertainty estimation and takes the structural similarity between the input volume and its corresponding reference volumes as the second uncertainty estimation. Four groups of comparative experiments conducted on two public medical datasets demonstrate the effectiveness and the superior performance of our proposed model. 
Our code is available on https://github.com/yumjoo/C-GBDL.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"113 ","pages":"Article 102352"},"PeriodicalIF":5.7,"publicationDate":"2024-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139718036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-05 DOI: 10.1016/j.compmedimag.2024.102345
Binchun Lu , Lidan Fu , Yixuan Pan , Yonggui Dong
Robust and interpretable image reconstruction is central to imaging applications in clinical practice. Prevalent deep networks, despite their strong ability to extract implicit information from the data manifold, still lack prior knowledge from mathematics or physics, which leads to instability, poor structural interpretability, and high computational cost. To address this issue, we propose two prior knowledge-driven networks that combine the good interpretability of mathematical methods with the powerful learnability of deep learning methods. Incorporating different kinds of prior knowledge, we propose subband-adaptive wavelet iterative shrinkage thresholding networks (SWISTA-Nets), in which almost every network module corresponds one-to-one to a step of the iterative algorithm. Through end-to-end training of the proposed SWISTA-Nets, implicit information is extracted from the training data and guides the tuning of key parameters that have a mathematical definition. The networks are validated on the inverse problems associated with two medical imaging modalities, namely electromagnetic tomography and X-ray computed tomography. Both visual and quantitative results indicate that the SWISTA-Nets outperform mathematical methods and state-of-the-art prior knowledge-driven networks, while offering fewer training parameters, interpretable network structures, and good robustness. We expect that our analysis will support further investigation of prior knowledge-driven networks in the field of ill-posed image reconstruction.
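For readers unfamiliar with the iterative algorithm that SWISTA-Nets unroll, here is a minimal sketch of the classical iterative shrinkage thresholding algorithm (ISTA) for the problem min_x 0.5·||Ax − y||² + λ||x||₁. It is not the authors' network: the wavelet transform and learned subband thresholds are omitted, and `lam` is a fixed value here, although passing an array of per-coefficient thresholds would loosely mimic the subband-adaptive idea.

```python
import numpy as np

def soft_threshold(x, tau):
    """Shrink each entry of x towards zero by tau (proximal map of the L1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def ista(A, y, lam, n_iter=200):
    """Classical ISTA for 0.5*||A x - y||^2 + lam*||x||_1.

    A: (m, n) forward operator, y: (m,) measurements.
    lam: scalar or (n,) array of thresholds (fixed here, learned in an unrolled net).
    """
    # Step size 1/L, where L is the squared spectral norm of A.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                     # gradient of the data term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

In an unrolled network, each loop iteration typically becomes one layer, with the step size and thresholds turned into trainable parameters; that mapping is the general idea behind algorithm unrolling rather than a detail confirmed by the abstract.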
{"title":"SWISTA-Nets: Subband-adaptive wavelet iterative shrinkage thresholding networks for image reconstruction","authors":"Binchun Lu , Lidan Fu , Yixuan Pan , Yonggui Dong","doi":"10.1016/j.compmedimag.2024.102345","DOIUrl":"10.1016/j.compmedimag.2024.102345","url":null,"abstract":"<div><p>Robust and interpretable image reconstruction is central to imageology applications in clinical practice. Prevalent deep networks, with strong learning ability to extract implicit information from data manifold, are still lack of prior knowledge introduced from mathematics or physics, leading to instability, poor structure interpretability and high computation cost. As to this issue, we propose two prior knowledge-driven networks to combine the good interpretability of mathematical methods and the powerful learnability of deep learning methods. Incorporating different kinds of prior knowledge, we propose subband-adaptive wavelet iterative shrinkage thresholding networks (SWISTA-Nets), where almost every network module is in one-to-one correspondence with each step involved in the iterative algorithm. By end-to-end training of proposed SWISTA-Nets, implicit information can be extracted from training data and guide the tuning process of key parameters that possess mathematical definition. The inverse problems associated with two medical imaging modalities, i.e., electromagnetic tomography and X-ray computational tomography are applied to validate the proposed networks. Both visual and quantitative results indicate that the SWISTA-Nets outperform mathematical methods and state-of-the-art prior knowledge-driven networks, especially with fewer training parameters, interpretable network structures and well robustness. 
We assume that our analysis will support further investigation of prior knowledge-driven networks in the field of ill-posed image reconstruction.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"113 ","pages":"Article 102345"},"PeriodicalIF":5.7,"publicationDate":"2024-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139688749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-02-02 DOI: 10.1016/j.compmedimag.2024.102344
Linjie Fu , Xia Li , Xiuding Cai , Dong Miao , Yu Yao , Yali Shen
Cone Beam Computed Tomography (CBCT) plays a crucial role in Image-Guided Radiation Therapy (IGRT), helping to ensure accurate radiation treatment by monitoring changes in anatomical structures during the treatment process. However, CBCT images often suffer from scatter noise and artifacts, which poses a significant challenge when relying solely on CBCT for precise dose calculation and accurate tissue localization. There is an urgent need to enhance the quality of CBCT images to make them more practical for IGRT. This study introduces EGDiff, a novel framework based on the diffusion model, designed to address the challenges posed by scatter noise and artifacts in CBCT images. In our approach, a forward diffusion process adds Gaussian noise to CT images, and a reverse denoising process uses a ResUNet with an attention mechanism to predict the noise, ultimately synthesizing CT images from CBCT. Additionally, we design an energy-guided function that retains domain-independent features and discards domain-specific features during the denoising process, enhancing the effectiveness of CBCT-to-CT generation. We conduct extensive experiments on a thorax dataset and a pancreas dataset. On the thoracic tumor dataset, EGDiff achieves an SSIM of 0.850, MAE of 26.87 HU, PSNR of 19.83 dB, and NCC of 0.874. On the pancreas dataset, EGDiff outperforms state-of-the-art CBCT-to-CT synthesis methods with an SSIM of 0.754, MAE of 32.19 HU, PSNR of 19.35 dB, and NCC of 0.846. By improving the accuracy and reliability of CBCT images, EGDiff can enhance the precision of radiation therapy, minimize radiation exposure to healthy tissues, and ultimately contribute to more effective and personalized cancer treatment strategies.
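The forward diffusion step mentioned above can be sketched with the standard closed-form noising used in denoising diffusion models, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. Whether EGDiff uses exactly this formulation is an assumption; the linear noise schedule, its values, and the function names are illustrative only, and the energy guidance and the ResUNet denoiser are not shown.

```python
import numpy as np

def make_alpha_bar(n_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear beta schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, n_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, rng):
    """Draw a noisy image x_t from a clean image x0 at timestep t.

    Returns the noisy image and the Gaussian noise that was added, which is
    the regression target for a noise-predicting denoiser.
    """
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise
```

A typical training step would sample a random `t`, call `q_sample` on a normalized CT slice, and train the network to recover `noise` from `xt`.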
{"title":"Energy-guided diffusion model for CBCT-to-CT synthesis","authors":"Linjie Fu , Xia Li , Xiuding Cai , Dong Miao , Yu Yao , Yali Shen","doi":"10.1016/j.compmedimag.2024.102344","DOIUrl":"10.1016/j.compmedimag.2024.102344","url":null,"abstract":"<div><p>Cone Beam Computed Tomography (CBCT) plays a crucial role in Image-Guided Radiation Therapy (IGRT), providing essential assurance of accuracy in radiation treatment by monitoring changes in anatomical structures during the treatment process. However, CBCT images often face interference from scatter noise and artifacts, posing a significant challenge when relying solely on CBCT for precise dose calculation and accurate tissue localization. There is an urgent need to enhance the quality of CBCT images, enabling a more practical application in IGRT. This study introduces EGDiff, a novel framework based on the diffusion model, designed to address the challenges posed by scatter noise and artifacts in CBCT images. In our approach, we employ a forward diffusion process by adding Gaussian noise to CT images, followed by a reverse denoising process using ResUNet with an attention mechanism to predict noise intensity, ultimately synthesizing CBCT-to-CT images. Additionally, we design an energy-guided function to retain domain-independent features and discard domain-specific features during the denoising process, enhancing the effectiveness of CBCT-CT generation. We conduct numerous experiments on the thorax dataset and pancreas dataset. The results demonstrate that EGDiff performs better on the thoracic tumor dataset with SSIM of 0.850, MAE of 26.87 HU, PSNR of 19.83 dB, and NCC of 0.874. EGDiff outperforms SoTA CBCT-to-CT synthesis methods on the pancreas dataset with SSIM of 0.754, MAE of 32.19 HU, PSNR of 19.35 dB, and NCC of 0.846. 
By improving the accuracy and reliability of CBCT images, EGDiff can enhance the precision of radiation therapy, minimize radiation exposure to healthy tissues, and ultimately contribute to more effective and personalized cancer treatment strategies.</p></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"113 ","pages":"Article 102344"},"PeriodicalIF":5.7,"publicationDate":"2024-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139669890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}