Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention: Latest Publications
Spatial Diffusion for Cell Layout Generation
Pub Date: 2024-10-01. DOI: 10.1007/978-3-031-72083-3_45
Chen Li, Xiaoling Hu, Shahira Abousamra, Meilong Xu, Chao Chen
Generative models, such as GANs and diffusion models, have been used to augment training sets and boost performance on various tasks. Here, we instead focus on generative models for cell detection, i.e., locating and classifying cells in given pathology images. One important source of information that has been largely overlooked is the spatial pattern of the cells. In this paper, we propose a spatial-pattern-guided generative model for cell layout generation. Specifically, we propose a novel diffusion model that is guided by spatial features and generates realistic cell layouts. We explore different density models as the spatial features for the diffusion model. In downstream tasks, we show that the generated cell layouts can be used to guide the generation of high-quality pathology images, and that augmenting training data with these images significantly boosts the performance of state-of-the-art (SOTA) cell detection methods. The code is available at https://github.com/superlc1995/Diffusion-cell.
{"title":"Spatial Diffusion for Cell Layout Generation.","authors":"Chen Li, Xiaoling Hu, Shahira Abousamra, Meilong Xu, Chao Chen","doi":"10.1007/978-3-031-72083-3_45","DOIUrl":"10.1007/978-3-031-72083-3_45","url":null,"abstract":"<p><p>Generative models, such as GANs and diffusion models, have been used to augment training sets and boost performances in different tasks. We focus on generative models for cell detection instead, i.e., locating and classifying cells in given pathology images. One important information that has been largely overlooked is the spatial patterns of the cells. In this paper, we propose a spatial-pattern-guided generative model for cell layout generation. Specifically, a novel diffusion model guided by spatial features and generates realistic cell layouts has been proposed. We explore different density models as spatial features for the diffusion model. In downstream tasks, we show that the generated cell layouts can be used to guide the generation of high-quality pathology images. Augmenting with these images can significantly boost the performance of SOTA cell detection methods. The code is available at https://github.com/superlc1995/Diffusion-cell.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15004 ","pages":"481-491"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12206494/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144532224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation
Pub Date: 2024-10-01. Epub Date: 2024-10-04. DOI: 10.1007/978-3-031-72069-7_22
Haoteng Tang, Guodong Liu, Siyuan Dai, Kai Ye, Kun Zhao, Wenlu Wang, Carl Yang, Lifang He, Alex Leow, Paul Thompson, Heng Huang, Liang Zhan
The MRI-derived brain network serves as a pivotal instrument in elucidating both the structural and functional aspects of the brain, encompassing the ramifications of diseases and developmental processes. However, prevailing methodologies, often focusing on synchronous BOLD signals from functional MRI (fMRI), may not capture directional influences among brain regions and rarely tackle temporal functional dynamics. In this study, we first construct the brain effective network via a dynamic causal model. Subsequently, we introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE). This framework incorporates specifically designed directed node embedding layers aimed at capturing the dynamic interplay between structural and effective networks via an ordinary differential equation (ODE) model, which characterizes spatio-temporal brain dynamics. Our framework is validated on several clinical phenotype prediction tasks using two independent, publicly available datasets (HCP and OASIS). The experimental results clearly demonstrate the advantages of our model compared to several state-of-the-art methods.
{"title":"Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation.","authors":"Haoteng Tang, Guodong Liu, Siyuan Dai, Kai Ye, Kun Zhao, Wenlu Wang, Carl Yang, Lifang He, Alex Leow, Paul Thompson, Heng Huang, Liang Zhan","doi":"10.1007/978-3-031-72069-7_22","DOIUrl":"10.1007/978-3-031-72069-7_22","url":null,"abstract":"<p><p>The MRI-derived brain network serves as a pivotal instrument in elucidating both the structural and functional aspects of the brain, encompassing the ramifications of diseases and developmental processes. However, prevailing methodologies, often focusing on synchronous BOLD signals from functional MRI (fMRI), may not capture directional influences among brain regions and rarely tackle temporal functional dynamics. In this study, we first construct the brain-effective network via the dynamic causal model. Subsequently, we introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE). This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic inter-play between structural and effective networks via an ordinary differential equation (ODE) model, which characterizes spatial-temporal brain dynamics. Our framework is validated on several clinical phenotype prediction tasks using two independent publicly available datasets (HCP and OASIS). The experimental results clearly demonstrate the advantages of our model compared to several state-of-the-art methods.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15002 ","pages":"227-237"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11513182/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142515737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hallucination Index: An Image Quality Metric for Generative Reconstruction Models
Pub Date: 2024-10-01. Epub Date: 2024-10-03. DOI: 10.1007/978-3-031-72117-5_42
Matthew Tivnan, Siyeop Yoon, Zhennong Chen, Xiang Li, Dufan Wu, Quanzheng Li
Generative image reconstruction algorithms such as measurement-conditioned diffusion models are increasingly popular in the field of medical imaging. These powerful models can transform low signal-to-noise ratio (SNR) inputs into outputs with the appearance of high SNR. However, the outputs can exhibit a new type of error: hallucinations. In medical imaging, these hallucinations may not be obvious to a radiologist but could cause diagnostic errors. Generally, hallucination refers to errors in the estimation of object structure introduced by a machine learning model, but there is no widely accepted method to evaluate hallucination magnitude. In this work, we propose a new image quality metric called the hallucination index. Our approach is to compute the Hellinger distance from the distribution of reconstructed images to a zero-hallucination reference distribution. To evaluate our approach, we conducted a numerical experiment with electron microscopy images, simulated noisy measurements, and applied diffusion-based reconstructions. We sampled the measurements and the generative reconstructions repeatedly to compute the sample mean and covariance. For the zero-hallucination reference, we used the forward diffusion process applied to the ground truth. Our results show that higher measurement SNR leads to a lower hallucination index for the same apparent image quality. We also evaluated the impact of early stopping in the reverse diffusion process and found that more modest denoising strengths can reduce hallucination. We believe this metric could be useful for evaluating generative image reconstructions or as a warning label to inform radiologists about the degree of hallucination in medical images.
{"title":"Hallucination Index: An Image Quality Metric for Generative Reconstruction Models.","authors":"Matthew Tivnan, Siyeop Yoon, Zhennong Chen, Xiang Li, Dufan Wu, Quanzheng Li","doi":"10.1007/978-3-031-72117-5_42","DOIUrl":"10.1007/978-3-031-72117-5_42","url":null,"abstract":"<p><p>Generative image reconstruction algorithms such as measurement conditioned diffusion models are increasingly popular in the field of medical imaging. These powerful models can transform low signal-to-noise ratio (SNR) inputs into outputs with the appearance of high SNR. However, the outputs can have a new type of error called hallucinations. In medical imaging, these hallucinations may not be obvious to a Radiologist but could cause diagnostic errors. Generally, hallucination refers to error in estimation of object structure caused by a machine learning model, but there is no widely accepted method to evaluate hallucination magnitude. In this work, we propose a new image quality metric called the hallucination index. Our approach is to compute the Hellinger distance from the distribution of reconstructed images to a zero hallucination reference distribution. To evaluate our approach, we conducted a numerical experiment with electron microscopy images, simulated noisy measurements, and applied diffusion based reconstructions. We sampled the measurements and the generative reconstructions repeatedly to compute the sample mean and covariance. For the zero hallucination reference, we used the forward diffusion process applied to ground truth. Our results show that higher measurement SNR leads to lower hallucination index for the same apparent image quality. We also evaluated the impact of early stopping in the reverse diffusion process and found that more modest denoising strengths can reduce hallucination. We believe this metric could be useful for evaluation of generative image reconstructions or as a warning label to inform radiologists about the degree of hallucinations in medical images.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15010 ","pages":"449-458"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11956116/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143757111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An approach to building foundation models for brain image analysis
Pub Date: 2024-10-01. Epub Date: 2024-10-23. DOI: 10.1007/978-3-031-72390-2_40
Davood Karimi
Existing machine learning methods for brain image analysis are mostly based on supervised training. They require large labeled datasets, which can be costly or impossible to obtain. Moreover, the trained models are useful only for the narrow task defined by the labels. In this work, we developed a new method, based on the concept of foundation models, to overcome these limitations. Our model is an attention-based neural network that is trained using a novel self-supervised approach. Specifically, the model is trained to generate brain images in a patch-wise manner, thereby learning the brain structure. To facilitate learning of image details, we propose a new method that encodes high-frequency information using convolutional kernels with random weights. We trained our model on a pool of 10 public datasets and then applied it to five independent datasets to perform segmentation, lesion detection, denoising, and brain age estimation. The foundation model achieved competitive or better results on all tasks while significantly reducing the required amount of labeled training data. Our method enables leveraging large unlabeled neuroimaging datasets to effectively address diverse brain image analysis tasks and reduces the time and cost of acquiring labels.
{"title":"An approach to building foundation models for brain image analysis.","authors":"Davood Karimi","doi":"10.1007/978-3-031-72390-2_40","DOIUrl":"https://doi.org/10.1007/978-3-031-72390-2_40","url":null,"abstract":"<p><p>Existing machine learning methods for brain image analysis are mostly based on supervised training. They require large labeled datasets, which can be costly or impossible to obtain. Moreover, the trained models are useful only for the narrow task defined by the labels. In this work, we developed a new method, based on the concept of foundation models, to overcome these limitations. Our model is an attention-based neural network that is trained using a novel self-supervised approach. Specifically, the model is trained to generate brain images in a patch-wise manner, thereby learning the brain structure. To facilitate learning of image details, we propose a new method that encodes high-frequency information using convolutional kernels with random weights. We trained our model on a pool of 10 public datasets. We then applied the model on five independent datasets to perform segmentation, lesion detection, denoising, and brain age estimation. Results showed that the foundation model achieved competitive or better results on all tasks, while significantly reducing the required amount of labeled training data. Our method enables leveraging large unlabeled neuroimaging datasets to effectively address diverse brain image analysis tasks and reduce the time and cost requirements of acquiring labels.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15012 ","pages":"421-431"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033034/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144048319","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vessel-aware aneurysm detection using multi-scale deformable 3D attention
Pub Date: 2024-10-01. Epub Date: 2024-10-04. DOI: 10.1007/978-3-031-72086-4_71
Alberto M Ceballos-Arroyo, Hieu T Nguyen, Fangrui Zhu, Shrikanth M Yadav, Jisoo Kim, Lei Qin, Geoffrey Young, Huaizu Jiang
Manual detection of intracranial aneurysms (IAs) in computed tomography (CT) scans is a complex, time-consuming task even for expert clinicians, and automating the process is no less challenging. Critical difficulties associated with detecting aneurysms include their small (yet varied) size relative to the full scan and a high potential for false-positive (FP) predictions. To address these issues, we propose a 3D, multi-scale neural architecture that detects aneurysms via a deformable attention mechanism operating on vessel distance maps derived from vessel segmentations and on 3D features extracted from the layers of a convolutional network. In addition, we reformulate aneurysm segmentation as bounding cuboid prediction using binary cross-entropy and three localization losses (location, size, IoU). On three validation sets comprising 152/138/38 CT scans and containing 126/101/58 aneurysms, we achieved sensitivities of 91.3%/97.0%/74.1% at FP rates of 0.53/0.56/0.87, with sensitivity of around 80% on small aneurysms. Manual inspection of the outputs by experts showed that our model tends to miss only aneurysms in unusual locations. Code and model weights are available online.
{"title":"Vessel-aware aneurysm detection using multi-scale deformable 3D attention.","authors":"Alberto M Ceballos-Arroyo, Hieu T Nguyen, Fangrui Zhu, Shrikanth M Yadav, Jisoo Kim, Lei Qin, Geoffrey Young, Huaizu Jiang","doi":"10.1007/978-3-031-72086-4_71","DOIUrl":"https://doi.org/10.1007/978-3-031-72086-4_71","url":null,"abstract":"<p><p>Manual detection of intracranial aneurysms (IAs) in computed tomography (CT) scans is a complex, time-consuming task even for expert clinicians, and automating the process is no less challenging. Critical difficulties associated with detecting aneurysms include their small (yet varied) size compared to scans and a high potential for false positive (FP) predictions. To address these issues, we propose a 3D, multi-scale neural architecture that detects aneurysms via a deformable attention mechanism that operates on vessel distance maps derived from vessel segmentations and 3D features extracted from the layers of a convolutional network. Likewise, we reformulate aneurysm segmentation as bounding cuboid prediction using binary cross entropy and three localization losses (location, size, IoU). Given three validation sets comprised of 152/138/38 CT scans and containing 126/101/58 aneurysms, we achieved a Sensitivity of 91.3%/97.0%/74.1% @ FP rates 0.53/0.56/0.87, with Sensitivity around 80% on small aneurysms. Manual inspection of outputs by experts showed our model only tends to miss aneurysms located in unusual locations. Code and model weights are available online.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15005 ","pages":"754-765"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11986933/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144013943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images
Pub Date: 2024-10-01. DOI: 10.1007/978-3-031-72390-2_51
Yiqing Shen, Jingxing Li, Xinyuan Shao, Blanca Inigo Romillo, Ankush Jindal, David Dreizin, Mathias Unberath
Segment anything models (SAMs) are gaining attention for their zero-shot generalization capability in segmenting objects of unseen classes and in unseen domains when properly prompted. Interactivity is a key strength of SAMs, allowing users to iteratively provide prompts that specify objects of interest to refine outputs. However, to realize the interactive use of SAMs for 3D medical imaging tasks, rapid inference times are necessary. High memory requirements and long processing delays remain constraints that hinder the adoption of SAMs for this purpose. Specifically, while 2D SAMs applied to 3D volumes contend with repetitive computation to process all slices independently, 3D SAMs suffer from an exponential increase in model parameters and FLOPs. To address these challenges, we present FastSAM3D, which accelerates SAM inference to 8 milliseconds per 128 × 128 × 128 volumetric image on an NVIDIA A100 GPU. This speedup is accomplished through (1) a novel layer-wise progressive distillation scheme that enables knowledge transfer from a complex 12-layer ViT-B encoder to a lightweight 6-layer ViT-Tiny variant encoder without training from scratch; and (2) a novel 3D sparse flash attention that replaces vanilla attention operators, substantially reducing memory needs and improving parallelization. Experiments on three diverse datasets reveal that FastSAM3D achieves a remarkable speedup of 527.38× compared to 2D SAMs and 8.75× compared to 3D SAMs on the same volumes without significant performance decline. Thus, FastSAM3D opens the door for low-cost, truly interactive SAM-based 3D medical image segmentation with commonly used GPU hardware. Code is available at https://github.com/arcadelab/FastSAM3D.
{"title":"FastSAM3D: An Efficient Segment Anything Model for 3D Volumetric Medical Images.","authors":"Yiqing Shen, Jingxing Li, Xinyuan Shao, Blanca Inigo Romillo, Ankush Jindal, David Dreizin, Mathias Unberath","doi":"10.1007/978-3-031-72390-2_51","DOIUrl":"10.1007/978-3-031-72390-2_51","url":null,"abstract":"<p><p>Segment anything models (SAMs) are gaining attention for their zero-shot generalization capability in segmenting objects of unseen classes and in unseen domains when properly prompted. Interactivity is a key strength of SAMs, allowing users to iteratively provide prompts that specify objects of interest to refine outputs. However, to realize the interactive use of SAMs for 3D medical imaging tasks, rapid inference times are necessary. High memory requirements and long processing delays remain constraints that hinder the adoption of SAMs for this purpose. Specifically, while 2D SAMs applied to 3D volumes contend with repetitive computation to process all slices independently, 3D SAMs suffer from an exponential increase in model parameters and FLOPS. To address these challenges, we present FastSAM3D which accelerates SAM inference to 8 milliseconds per 128 × 128 × 128 3D volumetric image on an NVIDIA A100 GPU. This speedup is accomplished through 1) a novel layer-wise progressive distillation scheme that enables knowledge transfer from a complex 12-layer ViT-B to a lightweight 6-layer ViT-Tiny variant encoder without training from scratch; and 2) a novel 3D sparse flash attention to replace vanilla attention operators, substantially reducing memory needs and improving parallelization. Experiments on three diverse datasets reveal that FastSAM3D achieves a remarkable speedup of 527.38× compared to 2D SAMs and 8.75× compared to 3D SAMs on the same volumes without significant performance decline. Thus, FastSAM3D opens the door for low-cost truly interactive SAM-based 3D medical imaging segmentation with commonly used GPU hardware. Code is available at https://github.com/arcadelab/FastSAM3D.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15012 ","pages":"542-552"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12377522/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144984624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enhancing Whole Slide Image Classification with Discriminative and Contrastive Learning
Pub Date: 2024-10-01. DOI: 10.1007/978-3-031-72083-3_10
Peixian Liang, Hao Zheng, Hongming Li, Yuxin Gong, Spyridon Bakas, Yong Fan
Whole slide image (WSI) classification plays a crucial role in digital pathology data analysis. However, the immense size of WSIs and the absence of fine-grained sub-region labels pose significant challenges for accurate WSI classification. Typical classification-driven deep learning methods often struggle to generate informative image representations, which can compromise the robustness of WSI classification. In this study, we address this challenge by incorporating both discriminative and contrastive learning techniques for WSI classification. Unlike existing contrastive learning methods for WSI classification, which primarily rely on pseudo labels assigned to patches based on WSI-level labels, our approach directly constructs positive and negative samples at the WSI level. Specifically, we select a subset of representative image patches to represent each WSI and create positive and negative samples at the WSI level, facilitating effective learning of informative image features. Experimental results on two datasets and ablation studies demonstrate that our method significantly improves WSI classification performance compared to state-of-the-art deep learning methods and enables learning of informative features that promote robust WSI classification.
{"title":"Enhancing Whole Slide Image Classification with Discriminative and Contrastive Learning.","authors":"Peixian Liang, Hao Zheng, Hongming Li, Yuxin Gong, Spyridon Bakas, Yong Fan","doi":"10.1007/978-3-031-72083-3_10","DOIUrl":"10.1007/978-3-031-72083-3_10","url":null,"abstract":"<p><p>Whole slide image (WSI) classification plays a crucial role in digital pathology data analysis. However, the immense size of WSIs and the absence of fine-grained sub-region labels pose significant challenges for accurate WSI classification. Typical classification-driven deep learning methods often struggle to generate informative image representations, which can compromise the robustness of WSI classification. In this study, we address this challenge by incorporating both discriminative and contrastive learning techniques for WSI classification. Different from the existing contrastive learning methods for WSI classification that primarily rely on pseudo labels assigned to patches based on the WSI-level labels, our approach takes a different route to directly focus on constructing positive and negative samples at the WSI-level. Specifically, we select a subset of representative image patches to represent WSIs and create positive and negative samples at the WSI-level, facilitating effective learning of informative image features. Experimental results on two datasets and ablation studies have demonstrated that our method significantly improved the WSI classification performance compared to state-of-the-art deep learning methods and enabled learning of informative features that promoted robustness of the WSI classification.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15004 ","pages":"102-112"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11877581/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143568358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
HAMIL-QA: Hierarchical Approach to Multiple Instance Learning for Atrial LGE MRI Quality Assessment
Pub Date: 2024-10-01. Epub Date: 2024-10-03. DOI: 10.1007/978-3-031-72378-0_26
K M Arefeen Sultan, Md Hasibul Husain Hisham, Benjamin Orkild, Alan Morris, Eugene Kholmovski, Erik Bieging, Eugene Kwan, Ravi Ranjan, Ed DiBella, Shireen Elhabian
The accurate evaluation of left atrial fibrosis via high-quality 3D late gadolinium enhancement (LGE) MRI is crucial for atrial fibrillation management but is hindered by factors such as patient movement and imaging variability. Automated LGE MRI quality assessment is therefore critical for enhancing diagnostic accuracy, standardizing evaluations, and improving patient outcomes. Deep learning models aimed at automating this process face significant challenges due to the scarcity of expert annotations, high computational costs, and the need to capture subtle diagnostic details in highly variable images. This study introduces HAMIL-QA, a multiple instance learning (MIL) framework designed to overcome these obstacles. HAMIL-QA employs a hierarchical bag and sub-bag structure that allows for targeted analysis within sub-bags and aggregates insights at the volume level. This hierarchical MIL approach reduces reliance on extensive annotations, lessens computational load, and ensures clinically relevant quality predictions by focusing on diagnostically critical image features. Our experiments show that HAMIL-QA surpasses existing MIL methods and traditional supervised approaches in accuracy, AUROC, and F1-score on an LGE MRI scan dataset, demonstrating its potential as a scalable solution for automated LGE MRI quality assessment. The code is available at: https://github.com/arf111/HAMIL-QA.
{"title":"HAMIL-QA: Hierarchical Approach to Multiple Instance Learning for Atrial LGE MRI Quality Assessment.","authors":"K M Arefeen Sultan, Md Hasibul Husain Hisham, Benjamin Orkild, Alan Morris, Eugene Kholmovski, Erik Bieging, Eugene Kwan, Ravi Ranjan, Ed DiBella, Shireen Elhabian","doi":"10.1007/978-3-031-72378-0_26","DOIUrl":"10.1007/978-3-031-72378-0_26","url":null,"abstract":"<p><p>The accurate evaluation of left atrial fibrosis via high-quality 3D Late Gadolinium Enhancement (LGE) MRI is crucial for atrial fibrillation management but is hindered by factors like patient movement and imaging variability. The pursuit of automated LGE MRI quality assessment is critical for enhancing diagnostic accuracy, standardizing evaluations, and improving patient outcomes. The deep learning models aimed at automating this process face significant challenges due to the scarcity of expert annotations, high computational costs, and the need to capture subtle diagnostic details in highly variable images. This study introduces HAMIL-QA, a multiple instance learning (MIL) framework, designed to overcome these obstacles. HAMIL-QA employs a hierarchical bag and sub-bag structure that allows for targeted analysis within sub-bags and aggregates insights at the volume level. This hierarchical MIL approach reduces reliance on extensive annotations, lessens computational load, and ensures clinically relevant quality predictions by focusing on diagnostically critical image features. Our experiments show that HAMIL-QA surpasses existing MIL methods and traditional supervised approaches in accuracy, AUROC, and F1-Score on an LGE MRI scan dataset, demonstrating its potential as a scalable solution for LGE MRI quality assessment automation. The code is available at: https://github.com/arf111/HAMIL-QA.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15001 ","pages":"275-284"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12745976/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145866931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cross-Slice Attention and Evidential Critical Loss for Uncertainty-Aware Prostate Cancer Detection
Pub Date: 2024-10-01. Epub Date: 2024-10-06. DOI: 10.1007/978-3-031-72111-3_11
Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Kaifeng Pang, Demetri Terzopoulos, Kyunghyun Sung
Current deep learning-based models typically analyze medical images either in 2D, disregarding volumetric information, or in 3D, where performance suffers due to the anisotropic resolution of MR data. Furthermore, providing an accurate uncertainty estimate is beneficial to clinicians, as it indicates how confident a model is about its prediction. We propose a novel 2.5D cross-slice attention model that utilizes both global and local information, along with an evidential critical loss, to perform evidential deep learning for the detection of prostate cancer in MR images; prostate cancer is one of the most common cancers and a leading cause of cancer-related death in men. We perform extensive experiments with our model on two different datasets and achieve state-of-the-art performance in prostate cancer detection along with improved epistemic uncertainty estimation. The implementation of the model is available at https://github.com/aL3x-O-o-Hung/GLCSA_ECLoss.
{"title":"Cross-Slice Attention and Evidential Critical Loss for Uncertainty-Aware Prostate Cancer Detection.","authors":"Alex Ling Yu Hung, Haoxin Zheng, Kai Zhao, Kaifeng Pang, Demetri Terzopoulos, Kyunghyun Sung","doi":"10.1007/978-3-031-72111-3_11","DOIUrl":"10.1007/978-3-031-72111-3_11","url":null,"abstract":"<p><p>Current deep learning-based models typically analyze medical images in either 2D or 3D albeit disregarding volumetric information or suffering sub-optimal performance due to the anisotropic resolution of MR data. Furthermore, providing an accurate uncertainty estimation is beneficial to clinicians, as it indicates how confident a model is about its prediction. We propose a novel 2.5D cross-slice attention model that utilizes both global and local information, along with an evidential critical loss, to perform evidential deep learning for the detection in MR images of prostate cancer, one of the most common cancers and a leading cause of cancer-related death in men. We perform extensive experiments with our model on two different datasets and achieve state-of-the-art performance in prostate cancer detection along with improved epistemic uncertainty estimation. The implementation of the model is available at https://github.com/aL3x-O-o-Hung/GLCSA_ECLoss.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15008 ","pages":"113-123"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11646698/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142831545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conditional Diffusion Model with Spatial Attention and Latent Embedding for Medical Image Segmentation
Pub Date: 2024-10-01. Epub Date: 2024-10-03. DOI: 10.1007/978-3-031-72114-4_20
Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong
Diffusion models have been used extensively for high-quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN)-based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. A spatial attention map is computed based on the features learned by the discriminator to help cDAL generate more accurate segmentations of discriminative regions in an input image. Additionally, we incorporate a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, making cDAL much faster than other diffusion models for image segmentation. We applied cDAL to three publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray, and Hippocampus) and observed significant qualitative and quantitative improvements, with higher Dice scores and mIoU than state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.
{"title":"Conditional Diffusion Model with Spatial Attention and Latent Embedding for Medical Image Segmentation.","authors":"Behzad Hejrati, Soumyanil Banerjee, Carri Glide-Hurst, Ming Dong","doi":"10.1007/978-3-031-72114-4_20","DOIUrl":"10.1007/978-3-031-72114-4_20","url":null,"abstract":"<p><p>Diffusion models have been used extensively for high quality image and video generation tasks. In this paper, we propose a novel conditional diffusion model with spatial attention and latent embedding (cDAL) for medical image segmentation. In cDAL, a convolutional neural network (CNN) based discriminator is used at every time-step of the diffusion process to distinguish between the generated labels and the real ones. A spatial attention map is computed based on the features learned by the discriminator to help cDAL generate more accurate segmentation of discriminative regions in an input image. Additionally, we incorporated a random latent embedding into each layer of our model to significantly reduce the number of training and sampling time-steps, thereby making it much faster than other diffusion models for image segmentation. We applied cDAL on 3 publicly available medical image segmentation datasets (MoNuSeg, Chest X-ray and Hippocampus) and observed significant qualitative and quantitative improvements with higher Dice scores and mIoU over the state-of-the-art algorithms. The source code is publicly available at https://github.com/Hejrati/cDAL/.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15009 ","pages":"202-212"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11974562/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143805308","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}