Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention: Latest Articles
Weakly supervised whole slide image (WSI) classification is challenging due to the lack of patch-level labels and high computational costs. State-of-the-art methods use self-supervised patch-wise feature representations for multiple instance learning (MIL). Recently, methods have been proposed to fine-tune the feature representation on the downstream task using pseudo labeling, but they mostly focus on selecting high-quality positive patches. In this paper, we propose to mine hard negative samples during fine-tuning. This allows us to obtain better feature representations and reduce the training cost. Furthermore, we propose a novel patch-wise ranking loss in MIL to better exploit these hard negative samples. Experiments on two public datasets demonstrate the efficacy of these proposed ideas. Our code is available at https://github.com/winston52/HNM-WSI.
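As a rough illustration of the two ideas in this abstract, the following Python sketch shows one way hard negative mining and a patch-wise ranking loss could look. It is not the authors' implementation (that lives in the linked repository). It assumes each patch already has a scalar score, treats the top-scoring patches of a negative slide as hard negatives, and uses a standard pairwise hinge ranking loss; the function names, the top-k rule, and the margin value are all illustrative assumptions.

```python
def mine_hard_negatives(scores, k):
    """Return indices of the k highest-scoring patches of a negative slide.

    Every patch in a negative slide is negative, so the patches the model
    scores highest are the ones it finds hardest (assumed selection rule).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def pairwise_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Mean hinge loss over all (positive, negative) patch score pairs.

    Each positive patch is pushed to score at least `margin` above each
    negative patch; pairs that already satisfy the margin contribute zero.
    """
    pairs = [(p, n) for p in pos_scores for n in neg_scores]
    if not pairs:
        return 0.0
    return sum(max(0.0, margin - (p - n)) for p, n in pairs) / len(pairs)


# Hypothetical patch scores from one negative slide.
negative_slide_scores = [0.1, 0.9, 0.4, 0.7]
hard_idx = mine_hard_negatives(negative_slide_scores, k=2)
hard_scores = [negative_slide_scores[i] for i in hard_idx]

# Hypothetical scores of pseudo-labeled positive patches.
loss = pairwise_ranking_loss([2.0, 1.5], hard_scores)
```

Restricting the loss to mined hard negatives keeps the number of pairs small, which is one plausible reading of how such mining could lower training cost.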
Hard Negative Sample Mining for Whole Slide Image Classification. Wentao Huang, Xiaoling Hu, Shahira Abousamra, Prateek Prasanna, Chao Chen. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15004, pp. 144-154. Published 2024-10-01. DOI: 10.1007/978-3-031-72083-3_14. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12185924/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-03; DOI: 10.1007/978-3-031-72104-5_67
Xiaofeng Liu, Fangxu Xing, Zhangxing Bian, Tomas Arias-Vergara, Paula Andrea Pérez-Toro, Andreas Maier, Maureen Stone, Jiachen Zhuo, Jerry L Prince, Jonghye Woo
Tagged magnetic resonance imaging (MRI) has been successfully used to track the motion of internal tissue points within moving organs. Typically, to analyze motion using tagged MRI, cine MRI data in the same coordinate system are acquired, incurring additional time and costs. Consequently, tagged-to-cine MR synthesis holds the potential to reduce the extra acquisition time and costs associated with cine MRI, without disrupting downstream motion analysis tasks. Previous approaches have processed each frame independently, thereby overlooking the fact that complementary information from occluded regions of the tag patterns could be present in neighboring frames exhibiting motion. Furthermore, inconsistent visual appearance across frames, e.g., tag fading, can reduce synthesis performance. To address these issues, we propose an efficient framework for tagged-to-cine MR sequence synthesis, leveraging both spatial and temporal information with relatively limited data. Specifically, we follow a split-and-integral protocol to balance spatial-temporal modeling efficiency and consistency. The light spatial-temporal transformer (LiST²) is designed to exploit local and global attention in motion sequences with relatively few trainable parameters. The directional product relative position-time bias is adapted to make the model aware of the spatial-temporal correlation, while the shifted window is used for motion alignment. Then, a recurrent sliding fine-tuning (ReST) scheme is applied to further enhance the temporal consistency. Our framework is evaluated on paired tagged and cine MRI sequences, demonstrating superior performance over comparison methods.
Tagged-to-Cine MRI Sequence Synthesis via Light Spatial-Temporal Transformer. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15007, pp. 701-711. Published 2024-10-01. DOI: 10.1007/978-3-031-72104-5_67. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11517403/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-23; DOI: 10.1007/978-3-031-72390-2_44
Yuexi Du, Brian Chang, Nicha C Dvornek
Recent advancements in Contrastive Language-Image Pre-training (CLIP) [21] have demonstrated notable success in self-supervised representation learning across various tasks. However, existing CLIP-like approaches often demand extensive GPU resources and prolonged training times due to the considerable size of the model and dataset, making them poorly suited to medical applications, where large datasets are often unavailable. Meanwhile, the language model prompts are mainly manually derived from labels tied to images, potentially overlooking the richness of information within training samples. We introduce a novel language-image Contrastive Learning method with an Efficient large language model and prompt Fine-Tuning (CLEFT) that harnesses the strengths of extensive pre-trained language and visual models. Furthermore, we present an efficient strategy for learning context-based prompts that mitigates the gap between informative clinical diagnostic data and simple class labels. Our method demonstrates state-of-the-art performance on multiple chest X-ray and mammography datasets compared with various baselines. The proposed parameter-efficient framework can reduce the total trainable model size by 39% and reduce the trainable language model to only 4% compared with the current BERT encoder.
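For readers unfamiliar with the CLIP-style objective this abstract builds on, the sketch below shows the generic symmetric contrastive loss over paired image and text embeddings in plain Python. It illustrates the standard CLIP formulation only, not CLEFT's efficient language model or prompt fine-tuning; the function names and the temperature value of 0.07 are assumptions chosen for illustration.

```python
import math


def _normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _log_softmax(row):
    """Numerically stable log-softmax of one row of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric cross-entropy over cosine-similarity logits.

    Row k of image_emb and row k of text_emb are assumed to be a matched
    pair; every other row in the batch serves as a negative.
    """
    imgs = [_normalize(v) for v in image_emb]
    txts = [_normalize(v) for v in text_emb]
    n = len(imgs)
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-image view
    loss_i2t = -sum(_log_softmax(logits[k])[k] for k in range(n)) / n
    loss_t2i = -sum(_log_softmax(logits_t[k])[k] for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)


# Toy batch: correctly paired embeddings versus swapped pairs.
matched = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[0.0, 1.0], [1.0, 0.0]])
```

The loss is near zero when each image is most similar to its own text and grows large when the pairing is wrong, which is the signal that pulls matched image and text representations together.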
CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15012, pp. 465-475. Published 2024-10-01. DOI: 10.1007/978-3-031-72390-2_44. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11709740/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-04; DOI: 10.1007/978-3-031-72069-7_68
Nian Wu, Jiarui Xing, Miaomiao Zhang
This paper presents a novel approach, termed Temporal Latent Residual Network (TLRN), to predict a sequence of deformation fields in time-series image registration. The challenge of registering time-series images often lies in the occurrence of large motions, especially when images differ significantly from a reference (e.g., the start of a cardiac cycle compared to the peak stretching phase). To achieve accurate and robust registration results, we leverage the nature of motion continuity and exploit the temporal smoothness in consecutive image frames. Our proposed TLRN features a temporal residual network with residual blocks carefully designed in latent deformation spaces, which are parameterized by time-sequential initial velocity fields. We treat a sequence of residual blocks over time as a dynamic training system, where each block is designed to learn the residual function between desired deformation features and current input accumulated from previous time frames. We validate the effectiveness of TLRN on both synthetic data and real-world cine cardiac magnetic resonance (CMR) image videos. Our experimental results show that TLRN achieves substantially improved registration accuracy compared to the state of the art. Our code is publicly available at https://github.com/nellie689/TLRN.
TLRN: Temporal Latent Residual Networks For Large Deformation Image Registration. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15002, pp. 728-738. Published 2024-10-01. DOI: 10.1007/978-3-031-72069-7_68. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11929566/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-14; DOI: 10.1007/978-3-031-72083-3_15
Ruining Deng, Quan Liu, Can Cui, Tianyuan Yao, Juming Xiong, Shunxing Bao, Hao Li, Mengmeng Yin, Yu Wang, Shilin Zhao, Yucheng Tang, Haichun Yang, Yuankai Huo
Panoramic image segmentation in computational pathology presents a remarkable challenge due to the morphologically complex and variably scaled anatomy. For instance, the intricate organization in kidney pathology spans multiple layers, from regions like the cortex and medulla to functional units such as glomeruli, tubules, and vessels, down to various cell types. In this paper, we propose a novel Hierarchical Adaptive Taxonomy Segmentation (HATs) method, which is designed to thoroughly segment panoramic views of kidney structures by leveraging detailed anatomical insights. Our approach entails (1) the innovative HATs technique, which translates spatial relationships among 15 distinct object classes into a versatile "plug-and-play" loss function that spans regions, functional units, and cells; (2) the incorporation of anatomical hierarchies and scale considerations into a unified simple matrix representation for all panoramic entities; and (3) the adoption of the latest AI foundation model (EfficientSAM) as a feature extraction tool to boost the model's adaptability while eliminating the need for the manual prompt generation required by the conventional Segment Anything Model (SAM). Experimental findings demonstrate that the HATs method offers an efficient and effective strategy for integrating clinical insights and imaging precedents into a unified segmentation model across more than 15 categories. The official implementation is publicly available at https://github.com/hrlblab/HATs.
HATs: Hierarchical Adaptive Taxonomy Segmentation for Panoramic Pathology Image Analysis. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15004, pp. 155-166. Published 2024-10-01. DOI: 10.1007/978-3-031-72083-3_15. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11927787/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-06; DOI: 10.1007/978-3-031-72111-3_56
Wenhui Zhu, Xiwen Chen, Peijie Qiu, Mohammad Farazi, Aristeidis Sotiras, Abolfazl Razi, Yalin Wang
Since its introduction, UNet has been leading a variety of medical image segmentation tasks. Although numerous follow-up studies have been dedicated to improving the performance of the standard UNet, few have conducted in-depth analyses of the underlying interest pattern of UNet in medical image segmentation. In this paper, we explore the patterns learned in a UNet and observe two important factors that potentially affect its performance: (i) irrelevant features learned due to asymmetric supervision; (ii) feature redundancy in the feature map. To this end, we propose to balance the supervision between encoder and decoder and reduce the redundant information in the UNet. Specifically, we use the feature map that contains the most semantic information (i.e., the last layer of the decoder) to provide additional supervision to other blocks, and we reduce feature redundancy by leveraging feature distillation. The proposed method can be easily integrated into existing UNet architectures in a plug-and-play fashion with negligible computational cost. The experimental results suggest that the proposed method consistently improves the performance of standard UNets on four medical image segmentation datasets. The code is available at https://github.com/ChongQingNoSubway/SelfReg-UNet.
SelfReg-UNet: Self-Regularized UNet for Medical Image Segmentation. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15008, pp. 601-611. Published 2024-10-01. DOI: 10.1007/978-3-031-72111-3_56. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12408486/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-03; DOI: 10.1007/978-3-031-72120-5_11
Yuqi Fang, Wei Wang, Qianqian Wang, Hong-Jun Li, Mingxia Liu
Asymptomatic neurocognitive impairment (ANI) is a predominant form of cognitive impairment among individuals infected with human immunodeficiency virus (HIV). The current diagnostic criteria for ANI primarily rely on subjective clinical assessments, possibly leading to different interpretations among clinicians. Some recent studies leverage structural or functional MRI containing objective biomarkers for ANI analysis, offering clinicians companion diagnostic tools. However, they mainly utilize a single imaging modality, neglecting complementary information provided by structural and functional MRI. To this end, we propose an attention-enhanced structural and functional MRI fusion (ASFF) framework for HIV-associated ANI analysis. Specifically, the ASFF first extracts data-driven and human-engineered features from structural MRI, and also captures functional MRI features via a graph isomorphism network and Transformer. A mutual cross-attention fusion module is then designed to model the underlying relationship between structural and functional MRI. Additionally, a semantic inter-modality constraint is introduced to encourage consistency of multimodal features, facilitating effective feature fusion. Experimental results on 137 subjects from an HIV-associated ANI dataset with T1-weighted MRI and resting-state functional MRI show the effectiveness of our ASFF in ANI identification. Furthermore, our method can identify both modality-shared and modality-specific brain regions, which may advance our understanding of the structural and functional pathology underlying ANI.
Attention-Enhanced Fusion of Structural and Functional MRI for Analyzing HIV-Associated Asymptomatic Neurocognitive Impairment. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15011, pp. 113-123. Published 2024-10-01. DOI: 10.1007/978-3-031-72120-5_11. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11512738/pdf/
Pub Date: 2024-10-01; Epub Date: 2024-10-03; DOI: 10.1007/978-3-031-72089-5_45
Reuben Dorent, Erickson Torio, Nazim Haouchine, Colin Galvin, Sarah Frisken, Alexandra Golby, Tina Kapur, William Wells
Intraoperative ultrasound (iUS) imaging has the potential to improve surgical outcomes in brain surgery. However, its interpretation is challenging, even for expert neurosurgeons. In this work, we designed the first patient-specific framework that performs brain tumor segmentation in trackerless iUS. To disambiguate ultrasound imaging and adapt to the neurosurgeon's surgical objective, a patient-specific real-time network is trained using synthetic ultrasound data generated by simulating virtual iUS sweep acquisitions in pre-operative MR data. Extensive experiments on real ultrasound data demonstrate the effectiveness of the proposed approach, which adapts to the surgeon's definition of surgical targets and outperforms non-patient-specific models, neurosurgeon experts, and high-end tracking systems. Our code is available at: https://github.com/ReubenDo/MHVAE-Seg.
Patient-Specific Real-Time Segmentation in Trackerless Brain Ultrasound. Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention, vol. 15006, pp. 477-487. Published 2024-10-01. DOI: 10.1007/978-3-031-72089-5_45. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12714359/pdf/
Pub Date: 2024-10-01 | Epub Date: 2024-10-04 | DOI: 10.1007/978-3-031-72069-7_9
Dan Hu, Kangfu Han, Jiale Cheng, Gang Li
Individualized brain parcellations derived from functional MRI (fMRI) are essential for discerning unique functional patterns of individuals, facilitating personalized diagnoses and treatments. Unfortunately, as fMRI signals are inherently noisy, establishing reliable individualized parcellations typically necessitates long-duration fMRI scans (> 25 min), posing a major challenge and resulting in the exclusion of numerous short-duration fMRI scans from individualized studies. To address this issue, we develop a novel Consecutive-Contrastive Spherical U-Net (CC-SUnet) to enable the prediction of reliable individualized brain parcellation using short-duration fMRI data, greatly expanding its practical applicability. Specifically, 1) the widely used functional diffusion map (DM), obtained from functional connectivity, is carefully selected as the predictive feature, for its advantage in tracing the transitions between regions while reducing noise. To ensure a robust depiction of the brain network, we propose a dual-task model to predict DM and cortical parcellation simultaneously, fully utilizing their reciprocal relationship. 2) By constructing a stepwise dataset to capture the gradual changes of DM over increasing scan durations, a consecutive prediction framework is designed to predict gradually from short to long scan durations. 3) A stepwise-denoising-prediction module is further proposed. The noise representations are separated and replaced by the latent representations of a group-level diffusion map, realizing informative guidance and denoising concurrently. 4) Additionally, an N-pair contrastive loss is introduced to strengthen the discriminability of the individualized parcellations. Extensive experimental results demonstrated the superiority of our proposed CC-SUnet in enhancing the reliability of the individualized parcellation with short-duration fMRI data, thereby significantly boosting their utility in individualized studies.
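The N-pair contrastive loss mentioned in point 4 can be illustrated with a minimal sketch of its generic form. This is not the CC-SUnet implementation: the function name `n_pair_loss`, the use of plain Python lists as embeddings, and the pairing of anchors with positives are all assumptions made for illustration. The abstract does not say how anchors and positives are formed for parcellations.

```python
import math


def n_pair_loss(anchors, positives):
    """Generic N-pair loss over N (anchor, positive) embedding pairs.

    anchors[i] should score high against positives[i] and low against
    positives[j] for j != i. Each embedding is a list of floats.
    Hypothetical helper for illustration only.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    total = 0.0
    for i, anchor in enumerate(anchors):
        pos_score = dot(anchor, positives[i])
        # Sum of exp(negative score - positive score) over the N-1 negatives.
        neg_sum = sum(
            math.exp(dot(anchor, p) - pos_score)
            for j, p in enumerate(positives)
            if j != i
        )
        total += math.log(1.0 + neg_sum)
    return total / len(anchors)


# Correctly paired embeddings give a lower loss than swapped ones.
matched = n_pair_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = n_pair_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss shrinks as each anchor scores higher against its own positive than against the other pairs' positives, which is the sense in which it encourages discriminability. A production version would work on tensors and guard against overflow in the exponential.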
{"title":"Consecutive-Contrastive Spherical U-Net: Enhancing Reliability of Individualized Functional Brain Parcellation for Short-Duration fMRI Scans.","authors":"Dan Hu, Kangfu Han, Jiale Cheng, Gang Li","doi":"10.1007/978-3-031-72069-7_9","DOIUrl":"10.1007/978-3-031-72069-7_9","url":null,"abstract":"<p><p>Individualized brain parcellations derived from functional MRI (fMRI) are essential for discerning unique functional patterns of individuals, facilitating personalized diagnoses and treatments. Unfortunately, as fMRI signals are inherently noisy, establishing reliable individualized parcellations typically necessitates long-duration fMRI scan (> 25 min), posing a major challenge and resulting in the exclusion of numerous short-duration fMRI scans from individualized studies. To address this issue, we develop a novel Consecutive-Contrastive Spherical U-net (CC-SUnet) to enable the prediction of reliable individualized brain parcellation using short-duration fMRI data, greatly expanding its practical applicability. Specifically, 1) the widely used functional diffusion map (DM), obtained from functional connectivity, is carefully selected as the predictive feature, for its advantage in tracing the transitions between regions while reducing noise. To ensure a robust depiction of brain network, we propose a dual-task model to predict DM and cortical parcellation simultaneously, fully utilizing their reciprocal relationship. 2) By constructing a stepwise dataset to capture the gradual changes of DM over increasing scan durations, a consecutive prediction framework is designed to realize the prediction from short-to-long gradually. 3) A stepwise-denoising-prediction module is further proposed. The noise representations are separated and replaced by the latent representations of a group-level diffusion map, realizing informative guidance and denoising concurrently. 
4) Additionally, an N-pair contrastive loss is introduced to strengthen the discriminability of the individualized parcellations. Extensive experimental results demonstrated the superiority of our proposed CC-SUnet in enhancing the reliability of the individualized parcellation with short-duration fMRI data, thereby significantly boosting their utility in individualized studies.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15002 ","pages":"88-98"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12716869/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145807080","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-10-01 | Epub Date: 2024-10-06 | DOI: 10.1007/978-3-031-72111-3_12
Yizhou Zhao, Hengwei Bian, Michael Mu, Mostofa R Uddin, Zhenyang Li, Xiang Li, Tianyang Wang, Min Xu
Cryogenic Electron Tomography (CryoET) is a useful imaging technology in structural biology that is hindered by its need for manual annotations, especially in particle picking. Recent works have endeavored to remedy this issue with few-shot learning or contrastive learning techniques. However, these approaches still require supervised training. We instead choose to leverage the power of existing 2D foundation models and present a novel, training-free framework, CryoSAM. In addition to prompt-based single-particle instance segmentation, our approach can automatically search for similar features, facilitating full tomogram semantic segmentation with only one prompt. CryoSAM is composed of two major parts: 1) a prompt-based 3D segmentation system that uses prompts to complete single-particle instance segmentation recursively with Cross-Plane Self-Prompting, and 2) a Hierarchical Feature Matching mechanism that efficiently matches relevant features with extracted tomogram features. They collaborate to enable the segmentation of all particles of one category with just one particle-specific prompt. Our experiments show that CryoSAM outperforms existing works by a significant margin and requires even fewer annotations in particle picking. Further visualizations demonstrate its capability in full tomogram segmentation of various subcellular structures. Our code is available at: https://github.com/xulabs/aitom.
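The idea of matching one prompt feature against many tomogram features can be sketched as a coarse-to-fine cosine-similarity search. This is an illustrative assumption, not CryoSAM's actual Hierarchical Feature Matching mechanism: the function names, the flat list of feature vectors, the block-averaging step, and both thresholds are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def coarse_to_fine_match(query, features, block=2,
                         coarse_thresh=0.5, fine_thresh=0.9):
    """Return indices of features similar to the query.

    Coarse level: average each block of `block` consecutive features and
    skip blocks whose mean is dissimilar to the query.
    Fine level: compare the query with each feature in surviving blocks.
    """
    matches = []
    for start in range(0, len(features), block):
        chunk = features[start:start + block]
        mean = [sum(col) / len(chunk) for col in zip(*chunk)]
        if cosine(query, mean) < coarse_thresh:
            continue  # prune the whole block without per-feature checks
        for offset, feat in enumerate(chunk):
            if cosine(query, feat) >= fine_thresh:
                matches.append(start + offset)
    return matches


prompt_feature = [1.0, 0.0]
tomogram_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
                     [0.0, 1.0], [1.0, 0.05], [0.0, 1.0]]
hits = coarse_to_fine_match(prompt_feature, tomogram_features)
```

The coarse pass trades some recall for speed: a block dominated by dissimilar features can be pruned even when it contains one good match, so the thresholds would need tuning on real data.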
{"title":"CryoSAM: Training-free CryoET Tomogram Segmentation with Foundation Models.","authors":"Yizhou Zhao, Hengwei Bian, Michael Mu, Mostofa R Uddin, Zhenyang Li, Xiang Li, Tianyang Wang, Min Xu","doi":"10.1007/978-3-031-72111-3_12","DOIUrl":"10.1007/978-3-031-72111-3_12","url":null,"abstract":"<p><p>Cryogenic Electron Tomography (CryoET) is a useful imaging technology in structural biology that is hindered by its need for manual annotations, especially in particle picking. Recent works have endeavored to remedy this issue with few-shot learning or contrastive learning techniques. However, supervised training is still inevitable for them. We instead choose to leverage the power of existing 2D foundation models and present a novel, training-free framework, CryoSAM. In addition to prompt-based single-particle instance segmentation, our approach can automatically search for similar features, facilitating full tomogram semantic segmentation with only one prompt. CryoSAM is composed of two major parts: 1) a prompt-based 3D segmentation system that uses prompts to complete single-particle instance segmentation recursively with Cross-Plane Self-Prompting, and 2) a Hierarchical Feature Matching mechanism that efficiently matches relevant features with extracted tomogram features. They collaborate to enable the segmentation of all particles of one category with just one particle-specific prompt. Our experiments show that CryoSAM outperforms existing works by a significant margin and requires even fewer annotations in particle picking. Further visualizations demonstrate its ability when dealing with full tomogram segmentation for various subcellular structures. Our code is available at: https://github.com/xulabs/aitom.</p>","PeriodicalId":94280,"journal":{"name":"Medical image computing and computer-assisted intervention : MICCAI ... 
International Conference on Medical Image Computing and Computer-Assisted Intervention","volume":"15008 ","pages":"124-134"},"PeriodicalIF":0.0,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12923679/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147273604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}