Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention - Latest Publications
Hallucination Index: An Image Quality Metric for Generative Reconstruction Models
Pub Date: 2024-10-01 | Epub Date: 2024-10-03 | DOI: 10.1007/978-3-031-72117-5_42 | Vol. 15010, pp. 449-458
Matthew Tivnan, Siyeop Yoon, Zhennong Chen, Xiang Li, Dufan Wu, Quanzheng Li
Generative image reconstruction algorithms such as measurement-conditioned diffusion models are increasingly popular in the field of medical imaging. These powerful models can transform low signal-to-noise ratio (SNR) inputs into outputs with the appearance of high SNR. However, the outputs can contain a new type of error known as hallucinations. In medical imaging, these hallucinations may not be obvious to a radiologist but could cause diagnostic errors. Generally, hallucination refers to errors in the estimation of object structure caused by a machine learning model, but there is no widely accepted method to evaluate hallucination magnitude. In this work, we propose a new image quality metric called the hallucination index. Our approach is to compute the Hellinger distance from the distribution of reconstructed images to a zero-hallucination reference distribution. To evaluate our approach, we conducted a numerical experiment with electron microscopy images, simulated noisy measurements, and applied diffusion-based reconstructions. We sampled the measurements and the generative reconstructions repeatedly to compute the sample mean and covariance. For the zero-hallucination reference, we used the forward diffusion process applied to the ground truth. Our results show that higher measurement SNR leads to a lower hallucination index for the same apparent image quality. We also evaluated the impact of early stopping in the reverse diffusion process and found that more modest denoising strengths can reduce hallucination. We believe this metric could be useful for the evaluation of generative image reconstructions, or as a warning label to inform radiologists about the degree of hallucination in medical images.
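For concreteness, the closed-form Hellinger distance between two multivariate Gaussians offers one way to turn sample means and covariances into such an index. The sketch below assumes Gaussian approximations of both the reconstruction and the reference distributions; the paper's exact estimator is not reproduced here.

```python
import numpy as np

def gaussian_hellinger(mu1, cov1, mu2, cov2):
    """Hellinger distance between N(mu1, cov1) and N(mu2, cov2), using
    H^2 = 1 - [det(S1)^(1/4) det(S2)^(1/4) / det(Sbar)^(1/2)]
              * exp(-(1/8) (mu1-mu2)^T Sbar^{-1} (mu1-mu2)),
    where Sbar = (S1 + S2) / 2."""
    cov_bar = 0.5 * (cov1 + cov2)
    # slogdet for numerical stability on high-dimensional covariances
    _, logdet1 = np.linalg.slogdet(cov1)
    _, logdet2 = np.linalg.slogdet(cov2)
    _, logdet_bar = np.linalg.slogdet(cov_bar)
    diff = mu1 - mu2
    maha = diff @ np.linalg.solve(cov_bar, diff)
    log_bc = 0.25 * logdet1 + 0.25 * logdet2 - 0.5 * logdet_bar - 0.125 * maha
    return float(np.sqrt(max(1.0 - np.exp(log_bc), 0.0)))

# Hypothetical usage: rows are repeated samples of flattened image patches.
recon = np.random.randn(500, 16)      # generative reconstructions
reference = np.random.randn(500, 16)  # zero-hallucination reference samples
hi = gaussian_hellinger(recon.mean(0), np.cov(recon, rowvar=False),
                        reference.mean(0), np.cov(reference, rowvar=False))
```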
{"title":"Hallucination Index: An Image Quality Metric for Generative Reconstruction Models.","authors":"Matthew Tivnan, Siyeop Yoon, Zhennong Chen, Xiang Li, Dufan Wu, Quanzheng Li","doi":"10.1007/978-3-031-72117-5_42","DOIUrl":"10.1007/978-3-031-72117-5_42","url":null,"abstract":"<p><p>Generative image reconstruction algorithms such as measurement conditioned diffusion models are increasingly popular in the field of medical imaging. These powerful models can transform low signal-to-noise ratio (SNR) inputs into outputs with the appearance of high SNR. However, the outputs can have a new type of error called hallucinations. In medical imaging, these hallucinations may not be obvious to a Radiologist but could cause diagnostic errors. Generally, hallucination refers to error in estimation of object structure caused by a machine learning model, but there is no widely accepted method to evaluate hallucination magnitude. In this work, we propose a new image quality metric called the hallucination index. Our approach is to compute the Hellinger distance from the distribution of reconstructed images to a zero hallucination reference distribution. To evaluate our approach, we conducted a numerical experiment with electron microscopy images, simulated noisy measurements, and applied diffusion based reconstructions. We sampled the measurements and the generative reconstructions repeatedly to compute the sample mean and covariance. For the zero hallucination reference, we used the forward diffusion process applied to ground truth. Our results show that higher measurement SNR leads to lower hallucination index for the same apparent image quality. We also evaluated the impact of early stopping in the reverse diffusion process and found that more modest denoising strengths can reduce hallucination. We believe this metric could be useful for evaluation of generative image reconstructions or as a warning label to inform radiologists about the degree of hallucinations in medical images.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15010 ","pages":"449-458"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11956116/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143757111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An approach to building foundation models for brain image analysis
Pub Date: 2024-10-01 | Epub Date: 2024-10-23 | DOI: 10.1007/978-3-031-72390-2_40 | Vol. 15012, pp. 421-431
Davood Karimi
Existing machine learning methods for brain image analysis are mostly based on supervised training. They require large labeled datasets, which can be costly or impossible to obtain, and the trained models are useful only for the narrow task defined by the labels. In this work, we developed a new method, based on the concept of foundation models, to overcome these limitations. Our model is an attention-based neural network trained with a novel self-supervised approach: the model learns brain structure by generating brain images in a patch-wise manner. To facilitate the learning of image details, we propose a new method that encodes high-frequency information using convolutional kernels with random weights. We trained our model on a pool of 10 public datasets and then applied it to five independent datasets to perform segmentation, lesion detection, denoising, and brain age estimation. Results showed that the foundation model achieved competitive or better results on all tasks while significantly reducing the required amount of labeled training data. Our method enables leveraging large unlabeled neuroimaging datasets to effectively address diverse brain image analysis tasks, reducing the time and cost of acquiring labels.
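As an illustration of the high-frequency encoding idea, the sketch below applies frozen, randomly initialized 3D convolution kernels to an image patch. The abstract does not specify the exact kernel design, so the channel counts and the freezing scheme here are assumptions.

```python
import torch
import torch.nn as nn

class RandomHighFreqEncoder(nn.Module):
    """Frozen random 3D convolutions as a fixed projection of local detail.

    The kernels are drawn once at initialization and never trained; their
    responses preserve fine, high-frequency image structure that a learned,
    smoothing encoder might discard. Sizes are illustrative.
    """

    def __init__(self, in_channels=1, n_filters=64, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv3d(in_channels, n_filters, kernel_size,
                              padding=kernel_size // 2)
        for p in self.conv.parameters():
            p.requires_grad = False  # keep the random weights fixed

    def forward(self, x):  # x: (B, C, D, H, W) image patch
        return self.conv(x)

# Hypothetical usage on a 32^3 MRI patch:
feats = RandomHighFreqEncoder()(torch.randn(1, 1, 32, 32, 32))
```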
{"title":"An approach to building foundation models for brain image analysis.","authors":"Davood Karimi","doi":"10.1007/978-3-031-72390-2_40","DOIUrl":"https://doi.org/10.1007/978-3-031-72390-2_40","url":null,"abstract":"<p><p>Existing machine learning methods for brain image analysis are mostly based on supervised training. They require large labeled datasets, which can be costly or impossible to obtain. Moreover, the trained models are useful only for the narrow task defined by the labels. In this work, we developed a new method, based on the concept of foundation models, to overcome these limitations. Our model is an attention-based neural network that is trained using a novel self-supervised approach. Specifically, the model is trained to generate brain images in a patch-wise manner, thereby learning the brain structure. To facilitate learning of image details, we propose a new method that encodes high-frequency information using convolutional kernels with random weights. We trained our model on a pool of 10 public datasets. We then applied the model on five independent datasets to perform segmentation, lesion detection, denoising, and brain age estimation. Results showed that the foundation model achieved competitive or better results on all tasks, while significantly reducing the required amount of labeled training data. Our method enables leveraging large unlabeled neuroimaging datasets to effectively address diverse brain image analysis tasks and reduce the time and cost requirements of acquiring labels.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15012 ","pages":"421-431"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033034/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vessel-aware aneurysm detection using multi-scale deformable 3D attention
Pub Date: 2024-10-01 | Epub Date: 2024-10-04 | DOI: 10.1007/978-3-031-72086-4_71 | Vol. 15005, pp. 754-765
Alberto M Ceballos-Arroyo, Hieu T Nguyen, Fangrui Zhu, Shrikanth M Yadav, Jisoo Kim, Lei Qin, Geoffrey Young, Huaizu Jiang
Manual detection of intracranial aneurysms (IAs) in computed tomography (CT) scans is a complex, time-consuming task even for expert clinicians, and automating the process is no less challenging. Critical difficulties include the small (yet varied) size of aneurysms relative to the full scan and a high potential for false-positive (FP) predictions. To address these issues, we propose a 3D, multi-scale neural architecture that detects aneurysms via a deformable attention mechanism operating on vessel distance maps derived from vessel segmentations and on 3D features extracted from the layers of a convolutional network. In addition, we reformulate aneurysm segmentation as bounding-cuboid prediction using binary cross-entropy and three localization losses (location, size, and IoU). On three validation sets comprising 152/138/38 CT scans and containing 126/101/58 aneurysms, we achieved sensitivities of 91.3%/97.0%/74.1% at FP rates of 0.53/0.56/0.87, with sensitivity around 80% on small aneurysms. Manual inspection of the outputs by experts showed that our model tends to miss only aneurysms in unusual locations. Code and model weights are available online.
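The loss formulation lends itself to a compact sketch. Below is one plausible composition of binary cross-entropy with location, size, and IoU terms for axis-aligned cuboids parameterized as (center, extent); the equal weighting and the parameterization are illustrative assumptions, not the paper's exact choices.

```python
import torch
import torch.nn.functional as F

def iou3d(a, b):
    """Axis-aligned 3D IoU for cuboids given as (cx, cy, cz, w, h, d)."""
    a_min, a_max = a[:, :3] - a[:, 3:] / 2, a[:, :3] + a[:, 3:] / 2
    b_min, b_max = b[:, :3] - b[:, 3:] / 2, b[:, :3] + b[:, 3:] / 2
    inter = (torch.min(a_max, b_max) - torch.max(a_min, b_min)).clamp(min=0).prod(dim=1)
    union = a[:, 3:].prod(dim=1) + b[:, 3:].prod(dim=1) - inter
    return inter / (union + 1e-6)

def cuboid_loss(pred_logits, pred_boxes, gt_labels, gt_boxes):
    """Objectness BCE plus location, size, and IoU localization terms."""
    cls = F.binary_cross_entropy_with_logits(pred_logits, gt_labels.float())
    loc = F.l1_loss(pred_boxes[:, :3], gt_boxes[:, :3])   # centers
    size = F.l1_loss(pred_boxes[:, 3:], gt_boxes[:, 3:])  # extents
    iou = (1.0 - iou3d(pred_boxes, gt_boxes)).mean()
    return cls + loc + size + iou
```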
{"title":"Vessel-aware aneurysm detection using multi-scale deformable 3D attention.","authors":"Alberto M Ceballos-Arroyo, Hieu T Nguyen, Fangrui Zhu, Shrikanth M Yadav, Jisoo Kim, Lei Qin, Geoffrey Young, Huaizu Jiang","doi":"10.1007/978-3-031-72086-4_71","DOIUrl":"https://doi.org/10.1007/978-3-031-72086-4_71","url":null,"abstract":"<p><p>Manual detection of intracranial aneurysms (IAs) in computed tomography (CT) scans is a complex, time-consuming task even for expert clinicians, and automating the process is no less challenging. Critical difficulties associated with detecting aneurysms include their small (yet varied) size compared to scans and a high potential for false positive (FP) predictions. To address these issues, we propose a 3D, multi-scale neural architecture that detects aneurysms via a deformable attention mechanism that operates on vessel distance maps derived from vessel segmentations and 3D features extracted from the layers of a convolutional network. Likewise, we reformulate aneurysm segmentation as bounding cuboid prediction using binary cross entropy and three localization losses (location, size, IoU). Given three validation sets comprised of 152/138/38 CT scans and containing 126/101/58 aneurysms, we achieved a Sensitivity of 91.3%/97.0%/74.1% @ FP rates 0.53/0.56/0.87, with Sensitivity around 80% on small aneurysms. Manual inspection of outputs by experts showed our model only tends to miss aneurysms located in unusual locations. Code and model weights are available online.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15005 ","pages":"754-765"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11986933/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144013943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images
Pub Date: 2024-10-01 | DOI: 10.1007/978-3-031-72390-2_51 | Vol. 15012, pp. 542-552
Yiqing Shen, Jingxing Li, Xinyuan Shao, Blanca Inigo Romillo, Ankush Jindal, David Dreizin, Mathias Unberath
Segment anything models (SAMs) are gaining attention for their zero-shot generalization capability in segmenting objects of unseen classes and in unseen domains when properly prompted. Interactivity is a key strength of SAMs, allowing users to iteratively provide prompts that specify objects of interest and refine the outputs. However, realizing interactive use of SAMs for 3D medical imaging tasks requires rapid inference times; high memory requirements and long processing delays remain constraints that hinder their adoption. Specifically, while 2D SAMs applied to 3D volumes contend with repetitive computation to process all slices independently, 3D SAMs suffer from an exponential increase in model parameters and FLOPs. To address these challenges, we present FastSAM3D, which accelerates SAM inference to 8 milliseconds per 128 × 128 × 128 3D volumetric image on an NVIDIA A100 GPU. This speedup is accomplished through (1) a novel layer-wise progressive distillation scheme that enables knowledge transfer from a complex 12-layer ViT-B encoder to a lightweight 6-layer ViT-Tiny variant encoder without training from scratch; and (2) a novel 3D sparse flash attention that replaces vanilla attention operators, substantially reducing memory needs and improving parallelization. Experiments on three diverse datasets reveal that FastSAM3D achieves a remarkable speedup of 527.38× compared to 2D SAMs and 8.75× compared to 3D SAMs on the same volumes without significant performance decline. Thus, FastSAM3D opens the door to low-cost, truly interactive SAM-based 3D medical image segmentation with commonly used GPU hardware. Code is available at https://github.com/arcadelab/FastSAM3D.
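A minimal sketch of the layer-wise distillation idea: selected student layers are trained to match corresponding teacher layers, so the 6-layer encoder need not be trained from scratch. The layer pairing and the plain feature-MSE loss below are illustrative assumptions, not the paper's exact scheme.

```python
import torch.nn.functional as F

def layerwise_distillation_loss(student_feats, teacher_feats, pairs):
    """Match selected student layers to teacher layers via feature MSE.

    `pairs` lists (student_layer, teacher_layer) index pairs; features are
    assumed to be projected to a common dimension beforehand.
    """
    loss = sum(F.mse_loss(student_feats[s], teacher_feats[t]) for s, t in pairs)
    return loss / len(pairs)

# Hypothetical pairing: student layer i mimics teacher layer 2i + 1,
# so a 6-layer student covers a 12-layer teacher.
pairs = [(i, 2 * i + 1) for i in range(6)]
```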
{"title":"FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images.","authors":"Yiqing Shen, Jingxing Li, Xinyuan Shao, Blanca Inigo Romillo, Ankush Jindal, David Dreizin, Mathias Unberath","doi":"10.1007/978-3-031-72390-2_51","DOIUrl":"10.1007/978-3-031-72390-2_51","url":null,"abstract":"<p><p>Segment anything models (SAMs) are gaining attention for their zero-shot generalization capability in segmenting objects of unseen classes and in unseen domains when properly prompted. Interactivity is a key strength of SAMs, allowing users to iteratively provide prompts that specify objects of interest to refine outputs. However, to realize the interactive use of SAMs for 3D medical imaging tasks, rapid inference times are necessary. High memory requirements and long processing delays remain constraints that hinder the adoption of SAMs for this purpose. Specifically, while 2D SAMs applied to 3D volumes contend with repetitive computation to process all slices independently, 3D SAMs suffer from an exponential increase in model parameters and FLOPS. To address these challenges, we present FastSAM3D which accelerates SAM inference to 8 milliseconds per 128 × 128 × 128 3D volumetric image on an NVIDIA A100 GPU. This speedup is accomplished through 1) a novel layer-wise progressive distillation scheme that enables knowledge transfer from a complex 12-layer ViT-B to a lightweight 6-layer ViT-Tiny variant encoder without training from scratch; and 2) a novel 3D sparse flash attention to replace vanilla attention operators, substantially reducing memory needs and improving parallelization. Experiments on three diverse datasets reveal that FastSAM3D achieves a remarkable speedup of 527.38× compared to 2D SAMs and 8.75× compared to 3D SAMs on the same volumes without significant performance decline. Thus, FastSAM3D opens the door for low-cost truly interactive SAM-based 3D medical imaging segmentation with commonly used GPU hardware. Code is available at https://github.com/arcadelab/FastSAM3D.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15012 ","pages":"542-552"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12377522/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144984624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing Whole Slide Image Classification with Discriminative and Contrastive Learning
Pub Date: 2024-10-01 | DOI: 10.1007/978-3-031-72083-3_10 | Vol. 15004, pp. 102-112
Peixian Liang, Hao Zheng, Hongming Li, Yuxin Gong, Spyridon Bakas, Yong Fan
Whole slide image (WSI) classification plays a crucial role in digital pathology data analysis. However, the immense size of WSIs and the absence of fine-grained sub-region labels pose significant challenges for accurate WSI classification. Typical classification-driven deep learning methods often struggle to generate informative image representations, which can compromise the robustness of WSI classification. In this study, we address this challenge by combining discriminative and contrastive learning for WSI classification. Unlike existing contrastive learning methods for WSI classification, which primarily rely on pseudo labels assigned to patches based on WSI-level labels, our approach constructs positive and negative samples directly at the WSI level. Specifically, we select a subset of representative image patches to represent each WSI and create positive and negative samples at the WSI level, facilitating effective learning of informative image features. Experimental results on two datasets and ablation studies demonstrate that our method significantly improves WSI classification performance compared to state-of-the-art deep learning methods and learns informative features that promote the robustness of WSI classification.
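As a sketch of WSI-level contrastive learning, an InfoNCE-style objective over whole-slide representations could look like the following. The loss form, temperature, and the way positives and negatives are drawn are assumptions for illustration, not the paper's exact construction.

```python
import torch
import torch.nn.functional as F

def wsi_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE over WSI-level embeddings: pull each anchor toward its
    positive (e.g., another view of the same WSI) and push it from
    WSI-level negatives. anchor, positive: (B, D); negatives: (K, D)."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_sim = (anchor * positive).sum(-1, keepdim=True) / temperature  # (B, 1)
    neg_sim = anchor @ negatives.T / temperature                       # (B, K)
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    # The positive is always at index 0 of each row of logits.
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```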
{"title":"Enhancing Whole Slide Image Classification with Discriminative and Contrastive Learning.","authors":"Peixian Liang, Hao Zheng, Hongming Li, Yuxin Gong, Spyridon Bakas, Yong Fan","doi":"10.1007/978-3-031-72083-3_10","DOIUrl":"10.1007/978-3-031-72083-3_10","url":null,"abstract":"<p><p>Whole slide image (WSI) classification plays a crucial role in digital pathology data analysis. However, the immense size of WSIs and the absence of fine-grained sub-region labels pose significant challenges for accurate WSI classification. Typical classification-driven deep learning methods often struggle to generate informative image representations, which can compromise the robustness of WSI classification. In this study, we address this challenge by incorporating both discriminative and contrastive learning techniques for WSI classification. Different from the existing contrastive learning methods for WSI classification that primarily rely on pseudo labels assigned to patches based on the WSI-level labels, our approach takes a different route to directly focus on constructing positive and negative samples at the WSI-level. Specifically, we select a subset of representative image patches to represent WSIs and create positive and negative samples at the WSI-level, facilitating effective learning of informative image features. Experimental results on two datasets and ablation studies have demonstrated that our method significantly improved the WSI classification performance compared to state-of-the-art deep learning methods and enabled learning of informative features that promoted robustness of the WSI classification.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15004 ","pages":"102-112"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11877581/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143568358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HAMIL-QA: Hierarchical Approach to Multiple Instance Learning for Atrial LGE MRI Quality Assessment
Pub Date: 2024-10-01 | Epub Date: 2024-10-03 | DOI: 10.1007/978-3-031-72378-0_26 | Vol. 15001, pp. 275-284
K M Arefeen Sultan, Md Hasibul Husain Hisham, Benjamin Orkild, Alan Morris, Eugene Kholmovski, Erik Bieging, Eugene Kwan, Ravi Ranjan, Ed DiBella, Shireen Elhabian
The accurate evaluation of left atrial fibrosis via high-quality 3D Late Gadolinium Enhancement (LGE) MRI is crucial for atrial fibrillation management but is hindered by factors such as patient movement and imaging variability. Automated LGE MRI quality assessment is therefore critical for enhancing diagnostic accuracy, standardizing evaluations, and improving patient outcomes. Deep learning models aimed at automating this process face significant challenges due to the scarcity of expert annotations, high computational costs, and the need to capture subtle diagnostic details in highly variable images. This study introduces HAMIL-QA, a multiple instance learning (MIL) framework designed to overcome these obstacles. HAMIL-QA employs a hierarchical bag and sub-bag structure that allows for targeted analysis within sub-bags and aggregates insights at the volume level. This hierarchical MIL approach reduces reliance on extensive annotations, lessens the computational load, and ensures clinically relevant quality predictions by focusing on diagnostically critical image features. Our experiments show that HAMIL-QA surpasses existing MIL methods and traditional supervised approaches in accuracy, AUROC, and F1-score on an LGE MRI scan dataset, demonstrating its potential as a scalable solution for automated LGE MRI quality assessment. The code is available at: https://github.com/arf111/HAMIL-QA.
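The hierarchical bag/sub-bag aggregation can be sketched as two stages of attention pooling: one within each sub-bag, and one across the resulting sub-bag embeddings to produce the volume-level prediction. The dimensions and the attention form below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class HierarchicalMILPooling(nn.Module):
    """Two-level attention MIL: instances -> sub-bag embedding -> volume score."""

    def __init__(self, dim=256):
        super().__init__()
        self.instance_attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.subbag_attn = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))
        self.head = nn.Linear(dim, 1)  # volume-level quality logit

    @staticmethod
    def pool(feats, attn):
        w = torch.softmax(attn(feats), dim=0)  # (n, 1), weights over members
        return (w * feats).sum(dim=0)          # (dim,)

    def forward(self, subbags):                # list of (n_i, dim) instance features
        subbag_embs = torch.stack([self.pool(s, self.instance_attn) for s in subbags])
        volume_emb = self.pool(subbag_embs, self.subbag_attn)
        return self.head(volume_emb)
```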
{"title":"HAMIL-QA: Hierarchical Approach to Multiple Instance Learning for Atrial LGE MRI Quality Assessment.","authors":"K M Arefeen Sultan, Md Hasibul Husain Hisham, Benjamin Orkild, Alan Morris, Eugene Kholmovski, Erik Bieging, Eugene Kwan, Ravi Ranjan, Ed DiBella, Shireen Elhabian","doi":"10.1007/978-3-031-72378-0_26","DOIUrl":"10.1007/978-3-031-72378-0_26","url":null,"abstract":"<p><p>The accurate evaluation of left atrial fibrosis via high-quality 3D Late Gadolinium Enhancement (LGE) MRI is crucial for atrial fibrillation management but is hindered by factors like patient movement and imaging variability. The pursuit of automated LGE MRI quality assessment is critical for enhancing diagnostic accuracy, standardizing evaluations, and improving patient outcomes. The deep learning models aimed at automating this process face significant challenges due to the scarcity of expert annotations, high computational costs, and the need to capture subtle diagnostic details in highly variable images. This study introduces HAMIL-QA, a multiple instance learning (MIL) framework, designed to overcome these obstacles. HAMIL-QA employs a hierarchical bag and sub-bag structure that allows for targeted analysis within sub-bags and aggregates insights at the volume level. This hierarchical MIL approach reduces reliance on extensive annotations, lessens computational load, and ensures clinically relevant quality predictions by focusing on diagnostically critical image features. Our experiments show that HAMIL-QA surpasses existing MIL methods and traditional supervised approaches in accuracy, AUROC, and F1-Score on an LGE MRI scan dataset, demonstrating its potential as a scalable solution for LGE MRI quality assessment automation. The code is available at: https://github.com/arf111/HAMIL-QA.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15001 ","pages":"275-284"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12745976/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145866931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-Slice Attention and Evidential Critical Loss for Uncertainty-Aware Prostate Cancer Detection
Pub Date: 2024-10-01 | Epub Date: 2024-10-06 | DOI: 10.1007/978-3-031-72111-3_11 | Vol. 15008, pp. 113-123
Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Kaifeng Pang, Demetri Terzopoulos, Kyunghyun Sung
Current deep learning-based models typically analyze medical images either in 2D, disregarding volumetric information, or in 3D, often with sub-optimal performance due to the anisotropic resolution of MR data. Furthermore, providing an accurate uncertainty estimate is beneficial to clinicians, as it indicates how confident a model is about its prediction. We propose a novel 2.5D cross-slice attention model that utilizes both global and local information, along with an evidential critical loss, to perform evidential deep learning for the detection in MR images of prostate cancer, one of the most common cancers and a leading cause of cancer-related death in men. We perform extensive experiments with our model on two different datasets and achieve state-of-the-art performance in prostate cancer detection along with improved epistemic uncertainty estimation. The implementation of the model is available at https://github.com/aL3x-O-o-Hung/GLCSA_ECLoss.
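To illustrate the evidential part: a common parameterization maps network outputs to non-negative class evidence and a Dirichlet distribution over class probabilities, from which both a prediction and an epistemic uncertainty fall out. The sketch below uses this standard construction; it is not the paper's evidential critical loss itself.

```python
import torch
import torch.nn.functional as F

def evidential_binary_prediction(logits):
    """Map raw outputs to class evidence and a Dirichlet over probabilities.

    With alpha = evidence + 1, the expected class probabilities are
    alpha / sum(alpha), and a simple epistemic uncertainty is
    u = K / sum(alpha) (K = 2 classes): u is high exactly when the
    network has gathered little evidence.
    """
    evidence = F.softplus(logits)              # (B, 2), non-negative
    alpha = evidence + 1.0                     # Dirichlet concentrations
    strength = alpha.sum(dim=1, keepdim=True)  # Dirichlet strength
    prob = alpha / strength                    # expected probabilities
    uncertainty = 2.0 / strength               # epistemic uncertainty
    return prob, uncertainty
```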
{"title":"Cross-Slice Attention and Evidential Critical Loss for Uncertainty-Aware Prostate Cancer Detection.","authors":"Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Kaifeng Pang, Demetri Terzopoulos, Kyunghyun Sung","doi":"10.1007/978-3-031-72111-3_11","DOIUrl":"10.1007/978-3-031-72111-3_11","url":null,"abstract":"<p><p>Current deep learning-based models typically analyze medical images in either 2D or 3D albeit disregarding volumetric information or suffering sub-optimal performance due to the anisotropic resolution of MR data. Furthermore, providing an accurate uncertainty estimation is beneficial to clinicians, as it indicates how confident a model is about its prediction. We propose a novel 2.5D cross-slice attention model that utilizes both global and local information, along with an evidential critical loss, to perform evidential deep learning for the detection in MR images of prostate cancer, one of the most common cancers and a leading cause of cancer-related death in men. We perform extensive experiments with our model on two different datasets and achieve state-of-the-art performance in prostate cancer detection along with improved epistemic uncertainty estimation. The implementation of the model is available at https://github.com/aL3x-O-o-Hung/GLCSA_ECLoss.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15008 ","pages":"113-123"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142831545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditional Diffusion Model with Spatial Attention and Latent Embedding for Medical Image Segmentation
Pub Date: 2024-10-01 | Epub Date: 2024-10-03 | DOI: 10.1007/978-3-031-72114-4_20 | Vol. 15009, pp. 202-212
Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong
Diffusion models have been used extensively for high-quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between generated labels and real ones. A spatial attention map is computed from the features learned by the discriminator to help cDAL generate more accurate segmentations of discriminative regions in an input image. Additionally, we incorporate a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, making it much faster than other diffusion models for image segmentation. We applied cDAL to three publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray, and Hippocampus) and observed significant qualitative and quantitative improvements, with higher Dice scores and mIoU than state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.
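One way to realize a discriminator-derived spatial attention map is to pool the discriminator's feature channels, squash the result to [0, 1], and use it to re-weight the denoiser's features at each time-step. The construction below is an illustrative assumption, not cDAL's exact definition.

```python
import torch

def discriminator_spatial_attention(disc_features):
    """Turn a discriminator feature map into a spatial attention map.

    disc_features: (B, C, H, W) features from an intermediate discriminator
    layer. Channels are pooled by mean absolute activation, then squashed to
    [0, 1]; strongly discriminative regions get weights near 1.
    """
    attn = disc_features.abs().mean(dim=1, keepdim=True)  # (B, 1, H, W)
    return torch.sigmoid(attn)

# Hypothetical use inside the denoiser at a given time-step:
# x = x * (1.0 + discriminator_spatial_attention(d_feats))
```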
{"title":"Conditional Diffusion Model with Spatial Attention and Latent Embedding for Medical Image Segmentation.","authors":"Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong","doi":"10.1007/978-3-031-72114-4_20","DOIUrl":"10.1007/978-3-031-72114-4_20","url":null,"abstract":"<p><p>Diffusion models have been used extensively for high quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. A spatial attention map is computed based on the features learned by the discriminator to help cDAL generate more accurate segmentation of discriminative regions in an input image. Additionally, we incorporated a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, thereby making it much faster than other diffusion models for image segmentation. We applied cDAL on 3 publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray and Hippocampus) and observed significant qualitative and quantitative improvements with higher Dice scores and mIoU over the state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15009 ","pages":"202-212"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11974562/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143805308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intraoperative Registration by Cross-Modal Inverse Neural Rendering
Pub Date: 2024-10-01 | Epub Date: 2024-10-03 | DOI: 10.1007/978-3-031-72089-5_30 | Vol. 15006, pp. 317-327
Maximilian Fehrentz, Mohammad Farid Azampour, Reuben Dorent, Hassan Rasheed, Colin Galvin, Alexandra Golby, William M Wells, Sarah Frisken, Nassir Navab, Nazim Haouchine
In this paper, we present a novel approach to 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering. Our approach separates the implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively. This disentanglement is achieved by controlling a Neural Radiance Field's appearance with a multi-style hypernetwork. Once trained, the implicit neural representation serves as a differentiable rendering engine that can be used to estimate the surgical camera pose by minimizing the dissimilarity between its rendered images and the target intraoperative image. We tested our method on retrospective patient data from clinical cases, showing that it outperforms the state of the art while meeting current clinical standards for registration. Code and additional resources can be found at https://maxfehrentz.github.io/style-ngp/.
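Once the renderer is differentiable, pose estimation reduces to gradient descent on an image dissimilarity. A minimal sketch, assuming a 6-DoF pose vector and an MSE dissimilarity (`render_fn` stands in for the trained renderer; the paper's parameterization and loss may differ):

```python
import torch

def estimate_camera_pose(render_fn, target_image, pose_init, steps=200, lr=1e-2):
    """Inverse-rendering pose estimation: treat the camera pose as a learnable
    parameter and descend the dissimilarity between the rendered image and the
    observed intraoperative image."""
    pose = pose_init.clone().requires_grad_(True)  # e.g., (rx, ry, rz, tx, ty, tz)
    opt = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        rendered = render_fn(pose)  # differentiable render at this pose
        loss = torch.nn.functional.mse_loss(rendered, target_image)
        loss.backward()
        opt.step()
    return pose.detach()
```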
{"title":"Intraoperative Registration by Cross-Modal Inverse Neural Rendering.","authors":"Maximilian Fehrentz, Mohammad Farid Azampour, Reuben Dorent, Hassan Rasheed, Colin Galvin, Alexandra Golby, William M Wells, Sarah Frisken, Nassir Navab, Nazim Haouchine","doi":"10.1007/978-3-031-72089-5_30","DOIUrl":"10.1007/978-3-031-72089-5_30","url":null,"abstract":"<p><p>We present in this paper a novel approach for 3D/2D intraoperative registration during neurosurgery via cross-modal inverse neural rendering. Our approach separates implicit neural representation into two components, handling anatomical structure preoperatively and appearance intraoperatively. This disentanglement is achieved by controlling a Neural Radiance Field's appearance with a multi-style hypernetwork. Once trained, the implicit neural representation serves as a differentiable rendering engine, which can be used to estimate the surgical camera pose by minimizing the dissimilarity between its rendered images and the target intraoperative image. We tested our method on retrospective patients' data from clinical cases, showing that our method outperforms state-of-the-art while meeting current clinical standards for registration. Code and additional resources can be found at https://maxfehrentz.github.io/style-ngp/.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15006 ","pages":"317-327"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12714352/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145807091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Weakly Supervised Cerebellar Cortical Surface Parcellation with Self-Visual Representation Learning
Pub Date: 2023-10-01 | DOI: 10.1007/978-3-031-43993-3_42 | Vol. 14227, pp. 429-438
Zhengwang Wu, Jiale Cheng, Fenqiang Zhao, Ya Wang, Yue Sun, Dajiang Zhu, Tianming Liu, Valerie Jewells, Weili Lin, Li Wang, Gang Li
The cerebellum (i.e., the "little brain") plays an important role in motion and balance control, despite its much smaller size and deeper sulci compared to the cerebrum. Previous cerebellum studies mainly focused on conventional volumetric analysis, which ignores the extremely deep and highly convoluted nature of the cerebellar cortex. To better reveal localized functional and structural changes, we propose cortical surface-based analysis of the cerebellar cortex. Specifically, we first reconstruct the cerebellar cortical surfaces to represent and characterize the highly folded cerebellar cortex in a geometrically accurate and topologically correct manner. Then, we propose a novel method to automatically parcellate the cerebellar cortical surface into anatomically meaningful regions using a weakly supervised graph convolutional neural network. Instead of relying on registration or on mapping the cerebellar surface to a sphere, which is either inaccurate or introduces large geometric distortions due to the deep cerebellar sulci, our learning-based model deals directly with the original cerebellar cortical surface by decomposing this challenging task into two steps. First, we learn effective representations of cerebellar cortical surface patches with a contrastive self-learning framework. Then, we map the learned representations to parcellation labels. We validated our method using data from the Baby Connectome Project; the experimental results demonstrate its superior effectiveness and accuracy compared to existing methods.
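Mapping learned patch representations to labels directly on the mesh can be done with graph convolutions over vertex neighborhoods, avoiding any spherical mapping. A single illustrative layer is sketched below; the adjacency normalization and feature sizes are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SurfaceGCNLayer(nn.Module):
    """One graph-convolution layer operating directly on a cortical surface
    mesh: vertex features are averaged over mesh neighbors (given by a sparse,
    row-normalized adjacency) and linearly transformed. A stack of such layers
    could map patch representations to per-vertex parcellation logits."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj_norm):
        # x: (V, in_dim) vertex features; adj_norm: sparse (V, V) adjacency
        return torch.relu(self.lin(torch.sparse.mm(adj_norm, x)))
```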
{"title":"Weakly Supervised Cerebellar Cortical Surface Parcellation with Self-Visual Representation Learning.","authors":"Zhengwang Wu, Jiale Cheng, Fenqiang Zhao, Ya Wang, Yue Sun, Dajiang Zhu, Tianming Liu, Valerie Jewells, Weili Lin, Li Wang, Gang Li","doi":"10.1007/978-3-031-43993-3_42","DOIUrl":"https://doi.org/10.1007/978-3-031-43993-3_42","url":null,"abstract":"<p><p>The cerebellum (i.e., little brain) plays an important role in motion and balances control abilities, despite its much smaller size and deeper sulci compared to the cerebrum. Previous cerebellum studies mainly relied on and focused on conventional volumetric analysis, which ignores the extremely deep and highly convoluted nature of the cerebellar cortex. To better reveal localized functional and structural changes, we propose cortical surface-based analysis of the cerebellar cortex. Specifically, we first reconstruct the cerebellar cortical surfaces to represent and characterize the highly folded cerebellar cortex in a geometrically accurate and topologically correct manner. Then, we propose a novel method to automatically parcellate the cerebellar cortical surface into anatomically meaningful regions by a weakly supervised graph convolutional neural network. Instead of relying on registration or requiring mapping the cerebellar surface to a sphere, which are either inaccurate or have large geometric distortions due to the deep cerebellar sulci, our learning-based model directly deals with the original cerebellar cortical surface by decomposing this challenging task into two steps. First, we learn the effective representation of the cerebellar cortical surface patches with a contrastive self-learning framework. Then, we map the learned representations to parcellation labels. We have validated our method using data from the Baby Connectome Project and the experimental results demonstrate its superior effectiveness and accuracy, compared to existing methods.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"14227 ","pages":"429-438"},"PeriodicalIF":0.0,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12030008/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144036370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}