Portable head CT images often suffer from motion artifacts due to prolonged scanning times and critically ill patients who are unable to hold still. Image-domain motion correction is attractive for this application because it does not require CT projection data. This paper describes and evaluates a generative model based on conditional diffusion to correct motion artifacts in portable head CT scans. The model was trained to recover the motion-free CT image conditioned on the paired motion-corrupted image. Our method uses histogram equalization to resolve the intensity range discrepancy between skull and brain tissue, and an advanced Elucidated Diffusion Model (EDM) framework for faster sampling and better motion correction performance. In a simulation study and a phantom study with known motion-free ground truth, our EDM framework outperformed CNN-based methods and the standard diffusion approach (DDPM) in correcting artifacts both in the brain tissue region and across the entire image. Furthermore, we conducted a reader study on real-world portable CT scans to demonstrate the improvement in image quality achieved by our method.
{"title":"Portable head CT motion artifact correction via diffusion-based generative model","authors":"Zhennong Chen , Siyeop Yoon , Quirin Strotzer , Rehab Naeem Khalid , Matthew Tivnan , Quanzheng Li , Rajiv Gupta , Dufan Wu","doi":"10.1016/j.compmedimag.2024.102478","DOIUrl":"10.1016/j.compmedimag.2024.102478","url":null,"abstract":"<div><div>Portable head CT images often suffer motion artifacts due to the prolonged scanning time and critically ill patients who are unable to hold still. Image-domain motion correction is attractive for this application as it does not require CT projection data. This paper describes and evaluates a generative model based on conditional diffusion to correct motion artifacts in portable head CT scans. This model was trained to find the motion-free CT image conditioned on the paired motion-corrupted image. Our method utilizes histogram equalization to resolve the intensity range discrepancy of skull and brain tissue and an advanced Elucidated Diffusion Model (EDM) framework for faster sampling and better motion correction performance. Our EDM framework is superior in correcting artifacts in the brain tissue region and across the entire image compared to CNN-based methods and standard diffusion approach (DDPM) in a simulation study and a phantom study with known motion-free ground truth. Furthermore, we conducted a reader study on real-world portable CT scans to demonstrate improvement of image quality using our method.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102478"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142873358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | DOI: 10.1016/j.compmedimag.2024.102482
Yiming Liu, Ling Zhang, Mingxue Gu, Yaoxing Xiao, Ting Yu, Xiang Tao, Qing Zhang, Yan Wang, Dinggang Shen, Qingli Li
Pathological analysis of the placenta is a valuable tool for gaining insights into pregnancy outcomes. In placental histopathology, multiple functional tissues can be inspected as potential signals reflecting the transfer functionality between fetal and maternal circulations. However, the identification of multiple functional tissues is challenging due to (1) severe heterogeneity in texture, size and shape, (2) distribution across different scales and (3) the need for comprehensive assessment at the whole slide image (WSI) level. To address these problems, we establish a new dataset and propose a computer-aided segmentation framework based on multi-model fusion and distillation to identify multiple functional tissues in placental histopathologic images, including villi, capillaries, fibrin deposits and trophoblast aggregations. Specifically, we propose a two-stage Multi-model Fusion and Distillation (MMFD) framework. Considering the multi-scale distribution and heterogeneity of multiple functional tissues, the first stage enhances the visual representation by fusing features from multiple models to boost the effectiveness of the network. However, the multi-model fusion stage introduces extra parameters and a significant computational burden, which is impractical for processing gigapixel WSIs in clinical practice. In the second stage, we therefore propose a straightforward plug-in feature distillation method that transfers knowledge from the large fused model to a compact student model. On our self-collected placental dataset, the proposed MMFD framework demonstrates an improvement of 4.3% in mean Intersection over Union (mIoU) while achieving an approximately 50% increase in inference speed and utilizing only 10% of the parameters and computational resources, compared to the parameter-efficient fine-tuned Segment Anything Model (SAM) baseline. Visualization of segmentation results across entire WSIs on unseen cases demonstrates the generalizability of the proposed framework, and experimental results on a public dataset further demonstrate its effectiveness on other tasks. Our work provides a fundamental method to expedite quantitative analysis of placental histopathology.
{"title":"Inspect quantitative signals in placental histopathology: Computer-assisted multiple functional tissues identification through multi-model fusion and distillation framework","authors":"Yiming Liu , Ling Zhang , Mingxue Gu , Yaoxing Xiao , Ting Yu , Xiang Tao , Qing Zhang , Yan Wang , Dinggang Shen , Qingli Li","doi":"10.1016/j.compmedimag.2024.102482","DOIUrl":"10.1016/j.compmedimag.2024.102482","url":null,"abstract":"<div><div>Pathological analysis of placenta is currently a valuable tool for gaining insights into pregnancy outcomes. In placental histopathology, multiple functional tissues can be inspected as potential signals reflecting the transfer functionality between fetal and maternal circulations. However, the identification of multiple functional tissues is challenging due to (1) severe heterogeneity in texture, size and shape, (2) distribution across different scales and (3) the need for comprehensive assessment at the whole slide image (WSI) level. To solve aforementioned problems, we establish a brand new dataset and propose a computer-aided segmentation framework through multi-model fusion and distillation to identify multiple functional tissues in placental histopathologic images, including villi, capillaries, fibrin deposits and trophoblast aggregations. Specifically, we propose a two-stage Multi-model Fusion and Distillation (MMFD) framework. Considering the multi-scale distribution and heterogeneity of multiple functional tissues, we enhance the visual representation in the first stage by fusing feature from multiple models to boost the effectiveness of the network. However, the multi-model fusion stage contributes to extra parameters and a significant computational burden, which is impractical for recognizing gigapixels of WSIs within clinical practice. In the second stage, we propose straightforward plug-in feature distillation method that transfers knowledge from the large fused model to a compact student model. In self-collected placental dataset, our proposed MMFD framework demonstrates an improvement of 4.3% in mean Intersection over Union (mIoU) while achieving an approximate 50% increase in inference speed and utilizing only 10% of parameters and computational resources, compared to the parameter-efficient fine-tuned Segment Anything Model (SAM) baseline. Visualization of segmentation results across entire WSIs on unseen cases demonstrates the generalizability of our proposed MMFD framework. Besides, experimental results on a public dataset further prove the effectiveness of MMFD framework on other tasks. Our work can present a fundamental method to expedite quantitative analysis of placental histopathology.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102482"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142923514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | DOI: 10.1016/j.compmedimag.2024.102474
Pierre Rougé, Pierre-Henri Conze, Nicolas Passat, Odyssée Merveille
Segmentation in medical imaging is an essential and often preliminary task in the image processing chain, driving numerous efforts towards the design of robust segmentation algorithms. Supervised learning methods achieve excellent performance when fed with a sufficient amount of labeled data. However, such labels are typically highly time-consuming, error-prone and expensive to produce. Alternatively, semi-supervised learning approaches leverage both labeled and unlabeled data, and are very useful when only a small fraction of the dataset is labeled. They are particularly relevant for cerebrovascular segmentation, given that labeling a single volume requires several hours for an expert. Beyond the challenge posed by insufficient annotations, there are also concerns regarding annotation consistency. The task of annotating the cerebrovascular tree is inherently ambiguous: due to the discrete nature of images, the borders and extremities of vessels are often unclear, so annotations heavily depend on expert subjectivity and on the underlying clinical objective. These discrepancies significantly increase the complexity of the segmentation task for the model and consequently impair the results. It therefore becomes imperative to provide clinicians with precise guidelines to improve the annotation process and construct more uniform datasets. In this article, we investigate the data dependency of deep learning methods for cerebrovascular segmentation in the context of imperfect data and semi-supervised learning. Specifically, this study compares various state-of-the-art semi-supervised methods based on unsupervised regularization and evaluates their performance in diverse data-quantity and data-quality scenarios. Based on these experiments, we provide guidelines for the annotation and training of cerebrovascular segmentation models.
{"title":"Guidelines for cerebrovascular segmentation: Managing imperfect annotations in the context of semi-supervised learning","authors":"Pierre Rougé , Pierre-Henri Conze , Nicolas Passat , Odyssée Merveille","doi":"10.1016/j.compmedimag.2024.102474","DOIUrl":"10.1016/j.compmedimag.2024.102474","url":null,"abstract":"<div><div>Segmentation in medical imaging is an essential and often preliminary task in the image processing chain, driving numerous efforts towards the design of robust segmentation algorithms. Supervised learning methods achieve excellent performances when fed with a sufficient amount of labeled data. However, such labels are typically highly time-consuming, error-prone and expensive to produce. Alternatively, semi-supervised learning approaches leverage both labeled and unlabeled data, and are very useful when only a small fraction of the dataset is labeled. They are particularly useful for cerebrovascular segmentation, given that labeling a single volume requires several hours for an expert. In addition to the challenge posed by insufficient annotations, there are concerns regarding annotation consistency. The task of annotating the cerebrovascular tree is inherently ambiguous. Due to the discrete nature of images, the borders and extremities of vessels are often unclear. Consequently, annotations heavily rely on the expert subjectivity and on the underlying clinical objective. These discrepancies significantly increase the complexity of the segmentation task for the model and consequently impair the results. Consequently, it becomes imperative to provide clinicians with precise guidelines to improve the annotation process and construct more uniform datasets. In this article, we investigate the data dependency of deep learning methods within the context of imperfect data and semi-supervised learning, for cerebrovascular segmentation. Specifically, this study compares various state-of-the-art semi-supervised methods based on unsupervised regularization and evaluates their performance in diverse quantity and quality data scenarios. Based on these experiments, we provide guidelines for the annotation and training of cerebrovascular segmentation models.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102474"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142873357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | DOI: 10.1016/j.compmedimag.2024.102473
Mudassar Ali, Tong Wu, Haoji Hu, Qiong Luo, Dong Xu, Weizeng Zheng, Neng Jin, Chen Yang, Jincao Yao
This paper provides an overview of developments in the Segment Anything Model (SAM) for medical image segmentation over the past year. Although directly applying SAM to medical datasets has shown mixed results, SAM has demonstrated notable achievements in adapting to medical image segmentation tasks through fine-tuning on medical datasets, transitioning from 2D to 3D data, and optimizing prompt engineering. Despite these difficulties, the paper emphasizes the significant potential that SAM holds for medical segmentation. Suggested directions for future work include constructing large-scale medical datasets, handling multi-modal and multi-scale information, integrating SAM with semi-supervised learning structures, and extending its application in clinical settings.
{"title":"A review of the Segment Anything Model (SAM) for medical image analysis: Accomplishments and perspectives","authors":"Mudassar Ali , Tong Wu , Haoji Hu , Qiong Luo , Dong Xu , Weizeng Zheng , Neng Jin , Chen Yang , Jincao Yao","doi":"10.1016/j.compmedimag.2024.102473","DOIUrl":"10.1016/j.compmedimag.2024.102473","url":null,"abstract":"<div><div>The purpose of this paper is to provide an overview of the developments that have occurred in the Segment Anything Model (SAM) within the medical image segmentation category over the course of the past year. However, SAM has demonstrated notable achievements in adapting to medical image segmentation tasks through fine-tuning on medical datasets, transitioning from 2D to 3D datasets, and optimizing prompting engineering. This is despite the fact that direct application on medical datasets has shown mixed results. Despite the difficulties, the paper emphasizes the significant potential that SAM possesses in the field of medical segmentation. One of the suggested directions for the future is to investigate the construction of large-scale datasets, to address multi-modal and multi-scale information, to integrate with semi-supervised learning structures, and to extend the application methods of SAM in clinical settings. In addition to making a significant contribution to the field of medical segmentation.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102473"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142824125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | DOI: 10.1016/j.compmedimag.2024.102477
Chi Dong, Yujiao Wu, Bo Sun, Jiayi Bo, Yufei Huang, Yikang Geng, Qianhui Zhang, Ruixiang Liu, Wei Guo, Xingling Wang, Xiran Jiang
Objective
This study presents a novel framework that integrates contrastive learning and knowledge distillation to improve early ovarian cancer (OC) recurrence prediction, addressing the challenges posed by limited labeled data and tumor heterogeneity.
Methods
The research utilized CT imaging data from 585 OC patients, including 142 cases with complete follow-up information and 125 cases with unknown recurrence status. To pre-train the teacher network, 318 unlabeled images were sourced from public datasets (TCGA-OV and PLAGH-202-OC). Multi-view contrastive learning (MVCL) was employed to generate multi-view 2D tumor slices, enhancing the teacher network’s ability to extract features from complex, heterogeneous tumors with high intra-class variability. Building on this foundation, the proposed semi-supervised multi-task self-distillation (Semi-MTSD) framework integrated OC subtyping as an auxiliary task using multi-task learning (MTL). This approach allowed the co-training of a student network for recurrence prediction, leveraging both labeled and unlabeled data to improve predictive performance in data-limited settings. The student network's performance was assessed using preoperative CT images with known recurrence outcomes. Evaluation metrics included area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), specificity (SPE), F1 score, floating-point operations (FLOPs), parameter count, training time, inference time, and mean corruption error (mCE).
Results
The proposed framework achieved an ACC of 0.862, an AUC of 0.916, a SPE of 0.895, and an F1 score of 0.831, surpassing existing methods for OC recurrence prediction. Comparative and ablation studies validated the model’s robustness, particularly in scenarios characterized by data scarcity and tumor heterogeneity.
Conclusion
The MVCL and Semi-MTSD framework demonstrates significant advancements in OC recurrence prediction, showcasing strong generalization capabilities in complex, data-constrained environments. This approach offers a promising pathway toward more personalized treatment strategies for OC patients.
{"title":"A multi-view contrastive learning and semi-supervised self-distillation framework for early recurrence prediction in ovarian cancer","authors":"Chi Dong , Yujiao Wu , Bo Sun , Jiayi Bo , Yufei Huang , Yikang Geng , Qianhui Zhang , Ruixiang Liu , Wei Guo , Xingling Wang , Xiran Jiang","doi":"10.1016/j.compmedimag.2024.102477","DOIUrl":"10.1016/j.compmedimag.2024.102477","url":null,"abstract":"<div><h3>Objective</h3><div>This study presents a novel framework that integrates contrastive learning and knowledge distillation to improve early ovarian cancer (OC) recurrence prediction, addressing the challenges posed by limited labeled data and tumor heterogeneity.</div></div><div><h3>Methods</h3><div>The research utilized CT imaging data from 585 OC patients, including 142 cases with complete follow-up information and 125 cases with unknown recurrence status. To pre-train the teacher network, 318 unlabeled images were sourced from public datasets (TCGA-OV and PLAGH-202-OC). Multi-view contrastive learning (MVCL) was employed to generate multi-view 2D tumor slices, enhancing the teacher network’s ability to extract features from complex, heterogeneous tumors with high intra-class variability. Building on this foundation, the proposed semi-supervised multi-task self-distillation (Semi-MTSD) framework integrated OC subtyping as an auxiliary task using multi-task learning (MTL). This approach allowed the co-training of a student network for recurrence prediction, leveraging both labeled and unlabeled data to improve predictive performance in data-limited settings. The student network's performance was assessed using preoperative CT images with known recurrence outcomes. Evaluation metrics included area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), specificity (SPE), F1 score, floating-point operations (FLOPs), parameter count, training time, inference time, and mean corruption error (mCE).</div></div><div><h3>Results</h3><div>The proposed framework achieved an ACC of 0.862, an AUC of 0.916, a SPE of 0.895, and an F1 score of 0.831, surpassing existing methods for OC recurrence prediction. Comparative and ablation studies validated the model’s robustness, particularly in scenarios characterized by data scarcity and tumor heterogeneity.</div></div><div><h3>Conclusion</h3><div>The MVCL and Semi-MTSD framework demonstrates significant advancements in OC recurrence prediction, showcasing strong generalization capabilities in complex, data-constrained environments. This approach offers a promising pathway toward more personalized treatment strategies for OC patients.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102477"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142824120","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | DOI: 10.1016/j.compmedimag.2024.102458
Liangchen Liu, Jianfei Liu, Bikash Santra, Christopher Parnell, Pritam Mukherjee, Tejas Mathai, Yingying Zhu, Akshaya Anand, Ronald M. Summers
Multiple intravenous contrast phases of CT scans are commonly used in clinical practice to facilitate disease diagnosis. However, contrast phase information is often missing or incorrect due to discrepancies in CT series descriptions and imaging practices. This work aims to develop a classification algorithm that automatically determines the contrast phase of a CT scan. We hypothesize that the image intensities of key organs affected by contrast enhancement (e.g., aorta, inferior vena cava) provide the inherent feature information needed to decide the contrast phase. These organs are segmented by TotalSegmentator, and intensity features are then generated for each segmented organ region. Two internal datasets and one external dataset were collected to validate the classification accuracy. Compared with a baseline ResNet classifier that did not use key-organ features, the proposed method achieved a comparable accuracy of 92.5% and F1 score of 92.5% on one internal dataset. On the other internal dataset, the proposed method improved accuracy from 63.9% to 79.8% and the F1 score from 43.9% to 65.0%; on the external dataset, accuracy improved from 63.5% to 85.1% and the F1 score from 56.4% to 83.9%. Image intensity features from key organs are therefore critical for improving the classification accuracy of contrast phases of CT scans, and the classification method based on these features is robust to different scanners and imaging protocols from different institutes. Our results suggest improved classification accuracy over existing approaches, advancing automatic contrast phase classification toward real clinical practice. The code for this work can be found here: https://github.com/rsummers11/CT_Contrast_Phase_Classifier.
{"title":"Utilizing domain knowledge to improve the classification of intravenous contrast phase of CT scans","authors":"Liangchen Liu , Jianfei Liu , Bikash Santra , Christopher Parnell , Pritam Mukherjee , Tejas Mathai , Yingying Zhu , Akshaya Anand , Ronald M. Summers","doi":"10.1016/j.compmedimag.2024.102458","DOIUrl":"10.1016/j.compmedimag.2024.102458","url":null,"abstract":"<div><div>Multiple intravenous contrast phases of CT scans are commonly used in clinical practice to facilitate disease diagnosis. However, contrast phase information is commonly missing or incorrect due to discrepancies in CT series descriptions and imaging practices. This work aims to develop a classification algorithm to automatically determine the contrast phase of a CT scan. We hypothesize that image intensities of key organs (e.g. aorta, inferior vena cava) affected by contrast enhancement are inherent feature information to decide the contrast phase. These organs are segmented by TotalSegmentator followed by generating intensity features on each segmented organ region. Two internal and one external dataset were collected to validate the classification accuracy. In comparison with the baseline ResNet classification method that did not make use of key organs features, the proposed method achieved the comparable accuracy of 92.5% and F1 score of 92.5% in one internal dataset. The accuracy was improved from 63.9% to 79.8% and F1 score from 43.9% to 65.0% using the proposed method on the other internal dataset. The accuracy improved from 63.5% to 85.1% and the F1 score from 56.4% to 83.9% on the external dataset. Image intensity features from key organs are critical for improving the classification accuracy of contrast phases of CT scans. The classification method based on these features is robust to different scanners and imaging protocols from different institutes. Our results suggested improved classification accuracy over existing approaches, which advances the application of automatic contrast phase classification toward real clinical practice. The code for this work can be found here: (<span><span>https://github.com/rsummers11/CT_Contrast_Phase_Classifier</span><svg><path></path></svg></span>).</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102458"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142911012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In real-world scenarios, medical image segmentation models encounter input images that may deviate from the training images in various ways. These differences can arise from changes in image scanners and acquisition protocols, or the images can even come from a different modality or domain. When the model encounters these out-of-distribution (OOD) images, it can behave unpredictably. Therefore, it is important to develop a system that handles such out-of-distribution images to ensure the safe usage of the models in clinical practice. In this paper, we propose a post-hoc out-of-distribution (OOD) detection method that can be used with any pre-trained segmentation model. Our method utilizes multi-scale representations extracted from the encoder blocks of the segmentation model and employs the Mahalanobis distance as a metric to measure the similarity between the input image and the in-distribution images. The segmentation model is pre-trained on a publicly available cardiac short-axis cine MRI dataset. The detection performance of the proposed method is evaluated on 13 different OOD datasets, which can be categorized as near, mild, and far OOD datasets based on their similarity to the in-distribution dataset. The results show that our method outperforms state-of-the-art feature space-based and uncertainty-based OOD detection methods across the various OOD datasets. Our method successfully detects near, mild, and far OOD images with high detection accuracy, showcasing the advantage of using the multi-scale and semantically rich representations of the encoder. In addition to the feature-based approach, we also propose a Dice coefficient-based OOD detection method, which demonstrates superior performance for adversarial OOD detection and shows a high correlation with segmentation quality. The uncertainty-based methods, despite having a strong correlation with the quality of the segmentation results on the near OOD datasets, failed to detect mild and far OOD images, indicating their weakness when the images are more dissimilar. Future work will explore combining Mahalanobis distance and uncertainty scores for improved detection of challenging OOD images that are difficult to segment.
{"title":"Post-hoc out-of-distribution detection for cardiac MRI segmentation","authors":"Tewodros Weldebirhan Arega , Stéphanie Bricq , Fabrice Meriaudeau","doi":"10.1016/j.compmedimag.2024.102476","DOIUrl":"10.1016/j.compmedimag.2024.102476","url":null,"abstract":"<div><div>In real-world scenarios, medical image segmentation models encounter input images that may deviate from the training images in various ways. These differences can arise from changes in image scanners and acquisition protocols, or even the images can come from a different modality or domain. When the model encounters these out-of-distribution (OOD) images, it can behave unpredictably. Therefore, it is important to develop a system that handles such out-of-distribution images to ensure the safe usage of the models in clinical practice. In this paper, we propose a post-hoc out-of-distribution (OOD) detection method that can be used with any pre-trained segmentation model. Our method utilizes multi-scale representations extracted from the encoder blocks of the segmentation model and employs Mahalanobis distance as a metric to measure the similarity between the input image and the in-distribution images. The segmentation model is pre-trained on a publicly available cardiac short-axis cine MRI dataset. The detection performance of the proposed method is evaluated on 13 different OOD datasets, which can be categorized as near, mild, and far OOD datasets based on their similarity to the in-distribution dataset. The results show that our method outperforms state-of-the-art feature space-based and uncertainty-based OOD detection methods across the various OOD datasets. Our method successfully detects near, mild, and far OOD images with high detection accuracy, showcasing the advantage of using the multi-scale and semantically rich representations of the encoder. In addition to the feature-based approach, we also propose a Dice coefficient-based OOD detection method, which demonstrates superior performance for adversarial OOD detection and shows a high correlation with segmentation quality. For the uncertainty-based method, despite having a strong correlation with the quality of the segmentation results in the near OOD datasets, they failed to detect mild and far OOD images, indicating the weakness of these methods when the images are more dissimilar. Future work will explore combining Mahalanobis distance and uncertainty scores for improved detection of challenging OOD images that are difficult to segment.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102476"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142866125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-01 | DOI: 10.1016/j.compmedimag.2024.102479
Yaolin He, Bowen Li, Ruimin He, Guangming Fu, Dan Sun, Dongyong Shan, Zijian Zhang
Accurate preoperative grading of prostate cancer is crucial for assisted diagnosis. Multi-parametric magnetic resonance imaging (MRI) is a commonly used non-invasive approach; however, the interpretation of MRI images remains subject to significant subjectivity due to variations in physicians’ expertise and experience. To achieve accurate, non-invasive, and efficient grading of prostate cancer, this paper proposes a deep learning method that adaptively fuses dual-view MRI images. Specifically, a dual-view adaptive fusion model is designed. The model employs encoders to extract embedded features from two MRI sequences: T2-weighted imaging (T2WI) and the apparent diffusion coefficient (ADC). The model reconstructs the original input images using the embedded features and adopts a cross-embedding fusion module to adaptively fuse the embedded features from the two views. Adaptive fusion refers to dynamically adjusting the fusion weights of the features from the two views according to different input samples, thereby fully utilizing complementary information. Furthermore, the model adaptively weights the prediction results from the two views based on uncertainty estimation, further enhancing the grading performance. To verify the importance of effective multi-view fusion for prostate cancer grading, extensive experiments were designed, evaluating the performance of single-view models, dual-view models, and state-of-the-art multi-view fusion algorithms. The results demonstrate that the proposed dual-view adaptive fusion method achieves the best grading performance, confirming its effectiveness for assisted grading diagnosis of prostate cancer. This study provides a novel deep learning solution for preoperative grading of prostate cancer, which has the potential to assist clinical physicians in making more accurate diagnostic decisions and has significant clinical application value.
{"title":"Adaptive fusion of dual-view for grading prostate cancer","authors":"Yaolin He , Bowen Li , Ruimin He , Guangming Fu , Dan Sun , Dongyong Shan , Zijian Zhang","doi":"10.1016/j.compmedimag.2024.102479","DOIUrl":"10.1016/j.compmedimag.2024.102479","url":null,"abstract":"<div><div>Accurate preoperative grading of prostate cancer is crucial for assisted diagnosis. Multi-parametric magnetic resonance imaging (MRI) is a commonly used non-invasive approach, however, the interpretation of MRI images is still subject to significant subjectivity due to variations in physicians’ expertise and experience. To achieve accurate, non-invasive, and efficient grading of prostate cancer, this paper proposes a deep learning method that adaptively fuses dual-view MRI images. Specifically, a dual-view adaptive fusion model is designed. The model employs encoders to extract embedded features from two MRI sequences: T2-weighted imaging (T2WI) and apparent diffusion coefficient (ADC). The model reconstructs the original input images using the embedded features and adopts a cross-embedding fusion module to adaptively fuse the embedded features from the two views. Adaptive fusion refers to dynamically adjusting the fusion weights of the features from the two views according to different input samples, thereby fully utilizing complementary information. Furthermore, the model adaptively weights the prediction results from the two views based on uncertainty estimation, further enhancing the grading performance. To verify the importance of effective multi-view fusion for prostate cancer grading, extensive experiments are designed. The experiments evaluate the performance of single-view models, dual-view models, and state-of-the-art multi-view fusion algorithms. The results demonstrate that the proposed dual-view adaptive fusion method achieves the best grading performance, confirming its effectiveness for assisted grading diagnosis of prostate cancer. This study provides a novel deep learning solution for preoperative grading of prostate cancer, which has the potential to assist clinical physicians in making more accurate diagnostic decisions and has significant clinical application value.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"119 ","pages":"Article 102479"},"PeriodicalIF":5.4,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142873356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-12-30 | DOI: 10.1016/j.compmedimag.2024.102483
Yifei Yang, Jingfan Fan, Tianyu Fu, Deqiang Xiao, Dongsheng Ma, Hong Song, Zhengkai Feng, Youping Liu, Jian Yang
In skull base surgery, acquiring intraoperative facial point clouds for spatial registration, whether by tracing with a probe or using 3D scanners, presents several issues. Manual manipulation is inefficient and yields poor consistency. Traditional point cloud registration algorithms are highly dependent on the initial pose, and their complexity can also extend the required time. To address these issues, we used an RGB-D camera to capture real-time facial point clouds during surgery. The initial registration of the 3D model reconstructed from preoperative CT/MR images and the point cloud collected during surgery is accomplished through corresponding facial landmarks. The facial point clouds collected intraoperatively often contain rotations caused by the free-angle camera. Benefiting from the close spatial geometric relationship between head pose and facial landmark coordinates, we propose a facial landmark localization network assisted by head pose estimation. The shared-representation head pose estimation module boosts network performance by enhancing its perception of global facial features. The proposed network facilitates the localization of landmark points in both preoperative and intraoperative point clouds, enabling rapid automatic registration. A free-view human facial landmarks dataset called 3D-FVL was synthesized from clinical CT images for training. The proposed network achieves leading localization accuracy and robustness on two public datasets and on 3D-FVL. In clinical experiments using the Artec Eva scanner, the trained network reduced the average registration time to 0.28 s with an average registration error of 2.33 mm. The proposed method significantly reduces registration time while meeting clinical accuracy requirements for surgical navigation. Our research will help to improve the efficiency and quality of skull base surgery.
{"title":"Head pose-assisted localization of facial landmarks for enhanced fast registration in skull base surgery","authors":"Yifei Yang , Jingfan Fan , Tianyu Fu , Deqiang Xiao , Dongsheng Ma , Hong Song , Zhengkai Feng , Youping Liu , Jian Yang","doi":"10.1016/j.compmedimag.2024.102483","DOIUrl":"10.1016/j.compmedimag.2024.102483","url":null,"abstract":"<div><div>In skull base surgery, the method of using a probe to draw or 3D scanners to acquire intraoperative facial point clouds for spatial registration presents several issues. Manual manipulation results in inefficiency and poor consistency. Traditional registration algorithms based on point clouds are highly dependent on the initial pose. The complexity of registration algorithms can also extend the required time. To address these issues, we used an RGB-D camera to capture real-time facial point clouds during surgery. The initial registration of the 3D model reconstructed from preoperative CT/MR images and the point cloud collected during surgery is accomplished through corresponding facial landmarks. The facial point clouds collected intraoperatively often contain rotations caused by the free-angle camera. Benefit from the close spatial geometric relationship between head pose and facial landmarks coordinates, we propose a facial landmarks localization network assisted by estimating head pose. The shared representation head pose estimation module boosts network performance by enhancing its perception of global facial features. The proposed network facilitates the localization of landmark points in both preoperative and intraoperative point clouds, enabling rapid automatic registration. A free-view human facial landmarks dataset called 3D-FVL was synthesized from clinical CT images for training. The proposed network achieves leading localization accuracy and robustness on two public datasets and the 3D-FVL. In clinical experiments, using the Artec Eva scanner, the trained network achieved a concurrent reduction in average registration time to 0.28 s, with an average registration error of 2.33 mm. The proposed method significantly reduced registration time, while meeting clinical accuracy requirements for surgical navigation. Our research will help to improving the efficiency and quality of skull base surgery.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"120 ","pages":"Article 102483"},"PeriodicalIF":5.4,"publicationDate":"2024-12-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142958412","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-12-28 | DOI: 10.1016/j.compmedimag.2024.102475
Siying Xu, Kerstin Hammernik, Andreas Lingg, Jens Kübler, Patrick Krumm, Daniel Rueckert, Sergios Gatidis, Thomas Küstner
Cardiac Cine Magnetic Resonance Imaging (MRI) provides an accurate assessment of heart morphology and function in clinical practice. However, MRI requires long acquisition times, and recent deep learning-based methods show great promise for accelerating imaging and enhancing reconstruction quality. Existing networks exhibit some common limitations that constrain further acceleration, including single-domain learning, reliance on a single regularization term, and equal feature contribution. To address these limitations, we propose to embed information from multiple domains, including low-rank, image, and k-space, in a novel deep learning network for MRI reconstruction, which we denote A-LIKNet. A-LIKNet adopts a parallel-branch structure, enabling independent learning in the k-space and image domains. Coupled information-sharing layers realize the information exchange between domains. Furthermore, we introduce attention mechanisms into the network to assign greater weights to more critical coils or important temporal frames. Training and testing were conducted on an in-house dataset, including 91 cardiovascular patients and 38 healthy subjects scanned with 2D cardiac Cine using retrospective undersampling. Additionally, we evaluated A-LIKNet on real-time prospectively undersampled data from the OCMR dataset. The results demonstrate that the proposed A-LIKNet outperforms existing methods and provides high-quality reconstructions. The network can effectively reconstruct highly retrospectively undersampled dynamic MR images at up to 24× acceleration, indicating its potential for single breath-hold imaging.
{"title":"Attention incorporated network for sharing low-rank, image and k-space information during MR image reconstruction to achieve single breath-hold cardiac Cine imaging","authors":"Siying Xu , Kerstin Hammernik , Andreas Lingg , Jens Kübler , Patrick Krumm , Daniel Rueckert , Sergios Gatidis , Thomas Küstner","doi":"10.1016/j.compmedimag.2024.102475","DOIUrl":"10.1016/j.compmedimag.2024.102475","url":null,"abstract":"<div><div>Cardiac Cine Magnetic Resonance Imaging (MRI) provides an accurate assessment of heart morphology and function in clinical practice. However, MRI requires long acquisition times, with recent deep learning-based methods showing great promise to accelerate imaging and enhance reconstruction quality. Existing networks exhibit some common limitations that constrain further acceleration possibilities, including single-domain learning, reliance on a single regularization term, and equal feature contribution. To address these limitations, we propose to embed information from multiple domains, including low-rank, image, and k-space, in a novel deep learning network for MRI reconstruction, which we denote as A-LIKNet. A-LIKNet adopts a parallel-branch structure, enabling independent learning in the k-space and image domain. Coupled information sharing layers realize the information exchange between domains. Furthermore, we introduce attention mechanisms into the network to assign greater weights to more critical coils or important temporal frames. Training and testing were conducted on an in-house dataset, including 91 cardiovascular patients and 38 healthy subjects scanned with 2D cardiac Cine using retrospective undersampling. Additionally, we evaluated A-LIKNet on the real-time prospectively undersampled data from the OCMR dataset. The results demonstrate that our proposed A-LIKNet outperforms existing methods and provides high-quality reconstructions. The network can effectively reconstruct highly retrospectively undersampled dynamic MR images up to <span><math><mrow><mn>24</mn><mo>×</mo></mrow></math></span> accelerations, indicating its potential for single breath-hold imaging.</div></div>","PeriodicalId":50631,"journal":{"name":"Computerized Medical Imaging and Graphics","volume":"120 ","pages":"Article 102475"},"PeriodicalIF":5.4,"publicationDate":"2024-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142985476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}