Blind CT Image Quality Assessment Using DDPM-derived Content and Transformer-based Evaluator
Pub Date: 2024-06-24 | DOI: 10.1109/TMI.2024.3418652
Yongyi Shi, Wenjun Xia, Ge Wang, Xuanqin Mou
Lowering the radiation dose per view and utilizing sparse views per scan are two common CT scan modes, albeit often leading to distorted images characterized by noise and streak artifacts. Blind image quality assessment (BIQA) strives to evaluate perceptual quality in alignment with what radiologists perceive, which plays an important role in advancing low-dose CT reconstruction techniques. An intriguing direction involves developing BIQA methods that mimic the operational characteristics of the human visual system (HVS). The internal generative mechanism (IGM) theory holds that the HVS actively deduces primary content to enhance comprehension. In this study, we introduce an innovative BIQA metric that emulates the active inference process of the IGM. Initially, an active inference module, implemented as a denoising diffusion probabilistic model (DDPM), is constructed to predict the primary content. Then, a dissimilarity map is derived by assessing the interrelation between the distorted image and its primary content. Subsequently, the distorted image and the dissimilarity map are combined into a multi-channel image, which is fed into a transformer-based image quality evaluator. By leveraging the DDPM-derived primary content, our approach achieves competitive performance on a low-dose CT dataset.
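As a rough illustration of the data flow described above (DDPM-predicted primary content, a dissimilarity map, and a multi-channel input to the evaluator), a minimal sketch is given below; the absolute-difference dissimilarity and the function name are assumptions for illustration, not the authors' implementation.

```python
import torch

def build_evaluator_input(distorted: torch.Tensor, primary: torch.Tensor) -> torch.Tensor:
    """Stack a distorted CT slice with a dissimilarity map into a multi-channel
    tensor for a transformer-based quality evaluator. The absolute difference
    used here is only a placeholder for the paper's dissimilarity measure."""
    dissimilarity = (distorted - primary).abs()
    return torch.cat([distorted, dissimilarity], dim=1)  # (N, 2C, H, W)
```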
MCPL: Multi-modal Collaborative Prompt Learning for Medical Vision-Language Model
Pub Date: 2024-06-24 | DOI: 10.1109/TMI.2024.3418408
Pengyu Wang, Huaqi Zhang, Yixuan Yuan
Multi-modal prompt learning is a high-performance and cost-effective learning paradigm that learns text as well as image prompts to tune pre-trained vision-language (V-L) models such as CLIP for multiple downstream tasks. However, recent methods typically treat text and image prompts as independent components without considering the dependency between prompts. Moreover, extending multi-modal prompt learning into the medical field poses challenges due to the significant gap between general- and medical-domain data. To this end, we propose a Multi-modal Collaborative Prompt Learning (MCPL) pipeline to tune a frozen V-L model for aligning medical text-image representations and thereby supporting medical downstream tasks. We first construct the anatomy-pathology (AP) prompt for multi-modal prompting jointly with the text and image prompts. The AP prompt introduces instance-level anatomy and pathology information, helping the V-L model better comprehend medical reports and images. Next, we propose a graph-guided prompt collaboration module (GPCM), which explicitly establishes multi-way couplings between the AP, text, and image prompts, enabling collaborative production and updating of multi-modal prompts for more effective prompting. Finally, we develop a novel prompt configuration scheme, which attaches the AP prompt to the query and key, and the text/image prompt to the value, in self-attention layers to improve the interpretability of multi-modal prompts. Extensive experiments on numerous medical classification and object detection datasets show that the proposed pipeline achieves excellent effectiveness and generalization. Compared with state-of-the-art prompt learning methods, MCPL provides a more reliable multi-modal prompt paradigm for reducing the tuning cost of V-L models on medical downstream tasks. Our code is available at https://github.com/CUHK-AIM-Group/MCPL.
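The described prompt configuration (AP prompt attached to the query and key, text/image prompt attached to the value) could look roughly like the single-head sketch below; the tensor names, shapes, and the assumption that both prompts contain the same number of tokens are ours, not the authors' (see their repository for the actual module).

```python
import torch
import torch.nn.functional as F

def prompted_attention(x, ap_prompt, modal_prompt, w_q, w_k, w_v):
    """Single-head sketch: the AP prompt is appended to the query/key sequences
    and the text/image prompt to the value sequence. Both prompts are assumed
    to have the same token count P so that keys and values stay aligned.
    x: (B, L, D); prompts: (B, P, D); projection weights: (D, D)."""
    q = torch.cat([x, ap_prompt], dim=1) @ w_q                   # (B, L+P, D)
    k = torch.cat([x, ap_prompt], dim=1) @ w_k                   # (B, L+P, D)
    v = torch.cat([x, modal_prompt], dim=1) @ w_v                # (B, L+P, D)
    attn = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v                                              # (B, L+P, D)
```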
Time-reversion Fast-sampling Score-based Model for Limited-angle CT Reconstruction
Pub Date: 2024-06-24 | DOI: 10.1109/TMI.2024.3418838
Yanyang Wang, Zirong Li, Weiwen Wu
The score-based generative model (SGM) has received significant attention in the field of medical imaging, particularly in the context of limited-angle computed tomography (LACT). Traditional SGM approaches achieve robust reconstruction performance by incorporating a substantial number of sampling steps during the inference phase. However, these established SGM-based methods incur a large computational cost to reconstruct a single case. The main challenge lies in achieving high-quality images with rapid sampling while preserving sharp edges and small features. In this study, we propose an innovative rapid-sampling strategy for SGMs, named the time-reversion fast-sampling (TIFA) score-based model, for LACT reconstruction. The entire sampling procedure adheres to the principles of robust optimization theory and is grounded in a comprehensive mathematical model. TIFA's rapid-sampling mechanism comprises several essential components: jump sampling, time-reversion with re-sampling, and compressed sampling. In the initial jump-sampling stage, multiple sampling steps are bypassed to quickly obtain preliminary results. Subsequently, during the time-reversion process, the initial results undergo controlled corruption through the introduction of small-scale noise. The re-sampling process then refines these corrupted results. Finally, compressed sampling fine-tunes the refined results by imposing a regularization term. Quantitative and qualitative assessments conducted on numerical simulations, a real physical phantom, and clinical cardiac datasets demonstrate that the TIFA method (using 200 steps) outperforms other state-of-the-art methods (using 2,000 steps) for available scanning ranges of [0°, 90°] and [0°, 60°]. Furthermore, experimental results underscore that our TIFA method continues to reconstruct high-quality images even with 10 steps. Our code is available at https://github.com/tianzhijiaoziA/TIFADiffusion.
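The three-part sampling strategy lends itself to a compact illustration: skip timesteps for a coarse pass, then corrupt with small noise and re-sample. The sketch below assumes a generic one-step reverse update `score_step(x, t)` and omits the compressed-sampling regularization; it is not the authors' implementation (see their repository for that).

```python
import torch

def fast_sampling_sketch(score_step, x, timesteps, jump=10, revert_noise=0.05, n_refine=2):
    """Illustration only: jump sampling followed by time-reversion with
    re-sampling. `score_step`, `jump`, `revert_noise`, and `n_refine` are
    hypothetical placeholders, not parameters from the paper."""
    for t in timesteps[::jump]:                     # jump sampling: visit every jump-th step
        x = score_step(x, t)
    for _ in range(n_refine):                       # time-reversion with re-sampling
        x = x + revert_noise * torch.randn_like(x)  # controlled small-scale corruption
        for t in timesteps[-jump:]:                 # refine over a few late timesteps
            x = score_step(x, t)
    return x
```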
MedSyn: Text-guided Anatomy-aware Synthesis of High-Fidelity 3D CT Images
Pub Date: 2024-06-20 | DOI: 10.1109/TMI.2024.3415032
Yanwu Xu, Li Sun, Wei Peng, Shuyue Jia, Katelyn Morrison, Adam Perer, Afrooz Zandifar, Shyam Visweswaran, Motahhare Eslami, Kayhan Batmanghelich
This paper introduces an innovative methodology for producing high-quality 3D lung CT images guided by textual information. While diffusion-based generative models are increasingly used in medical imaging, current state-of-the-art approaches are limited to low-resolution outputs and underutilize the abundant information in radiology reports. Radiology reports can enhance the generation process by providing additional guidance and offering fine-grained control over image synthesis. Nevertheless, extending text-guided generation to high-resolution 3D images poses significant challenges in memory consumption and in preserving anatomical detail. To address the memory issue, we introduce a hierarchical scheme that uses a modified UNet architecture. We start by synthesizing low-resolution images conditioned on the text, which serve as a foundation for subsequent generators of complete volumetric data. To ensure the anatomical plausibility of the generated samples, we provide further guidance by generating vascular, airway, and lobular segmentation masks in conjunction with the CT images. The model demonstrates the capability to use textual input and segmentation tasks to generate synthesized images. Algorithmic comparative assessments and blind evaluations conducted by 10 board-certified radiologists indicate that our approach exhibits superior performance compared to the most advanced GAN- and diffusion-based models, especially in accurately retaining crucial anatomical features such as fissure lines and airways. This innovation introduces novel possibilities. This study focuses on two main objectives: (1) the development of a method for creating images based on textual prompts and anatomical components, and (2) the capability to generate new images conditioned on anatomical elements. The advancements in image generation can be applied to enhance numerous downstream tasks.
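The coarse-to-fine generation scheme can be summarized by the placeholder sketch below: a low-resolution volume (plus auxiliary segmentation channels) is produced from the report embedding and then refined to full resolution. All callables and shapes are assumptions, not the authors' API.

```python
import torch

def synthesize_volume(report_emb, coarse_gen, refine_gen):
    """Hierarchical sketch: coarse text-conditioned generation, then refinement
    to the full-resolution CT volume. `coarse_gen` and `refine_gen` are
    hypothetical stand-ins for the paper's two generator stages."""
    coarse, seg_masks = coarse_gen(report_emb)      # low-resolution volume + auxiliary masks
    return refine_gen(torch.cat([coarse, seg_masks], dim=1), report_emb)
```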
PolarFormer: A Transformer-based Method for Multi-lesion Segmentation in Intravascular OCT
Pub Date: 2024-06-20 | DOI: 10.1109/TMI.2024.3417007
Zhili Huang, Jingyi Sun, Yifan Shao, Zixuan Wang, Su Wang, Qiyong Li, Jinsong Li, Qian Yu
Several deep learning-based methods have been proposed to extract vulnerable plaques of a single class from intravascular optical coherence tomography (OCT) images. However, further research is limited by the lack of publicly available large-scale intravascular OCT datasets with multi-class vulnerable plaque annotations. Additionally, multi-class vulnerable plaque segmentation is extremely challenging due to the irregular distribution of plaques, their unique geometric shapes, and fuzzy boundaries. Existing methods have not adequately addressed the geometric features and spatial prior information of vulnerable plaques. To address these issues, we collected a dataset containing 70 pullbacks and developed a multi-class vulnerable plaque segmentation model, called PolarFormer, that incorporates prior knowledge of the spatial distribution of vulnerable plaques. The key module of our proposed model is Polar Attention, which models the spatial relationship of vulnerable plaques in the radial direction. Extensive experiments conducted on the new dataset demonstrate that our proposed method outperforms other baseline methods. Code and data can be accessed via this link: https://github.com/sunjingyi0415/IVOCT-segementaion.
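One plausible reading of attention along the radial direction is sketched below on a polar-resampled feature map; this is an assumption about how such a module might operate, not the released PolarFormer code (see the linked repository for the actual implementation).

```python
import torch
import torch.nn.functional as F

def radial_attention(feat, w_q, w_k, w_v):
    """Attention applied independently along the radial axis of a polar feature
    map. feat: (B, A, R, C) with A angular and R radial positions; the (C, C)
    projection weights and the overall layout are illustrative assumptions."""
    q, k, v = feat @ w_q, feat @ w_k, feat @ w_v
    attn = F.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)  # (B, A, R, R)
    return attn @ v                                                         # (B, A, R, C)
```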
CNN-O-ELMNet: Optimized Lightweight and Generalized Model for Lung Disease Classification and Severity Assessment
Pub Date: 2024-06-19 | DOI: 10.1109/TMI.2024.3416744
Saurabh Agarwal, K V Arya, Yogesh Kumar Meena
The high burden of lung diseases on healthcare necessitates effective detection methods. Current computer-aided diagnosis (CAD) systems are limited by their focus on specific diseases and by computationally demanding deep learning models. To overcome these challenges, we introduce CNN-O-ELMNet, a lightweight classification model designed to efficiently detect various lung diseases, surpassing the limitations of disease-specific CAD systems and the complexity of deep learning models. The model combines a convolutional neural network for deep feature extraction with an optimized extreme learning machine, utilizing the imperialist competitive algorithm for enhanced predictions. We then evaluated the effectiveness of CNN-O-ELMNet using benchmark datasets for lung diseases: distinguishing pneumothorax vs. non-pneumothorax, tuberculosis vs. normal, and lung cancer vs. healthy cases. Our findings demonstrate that CNN-O-ELMNet significantly outperformed (p < 0.05) state-of-the-art methods in binary classification for tuberculosis and cancer, achieving accuracies of 97.85% and 97.70%, respectively, while maintaining low computational complexity with only 2,481 trainable parameters. We also extended the model to categorize lung disease severity based on Brixia scores. The model achieves 96.20% accuracy in multi-class assessment of mild, moderate, and severe cases, making it suitable for deployment in lightweight healthcare devices.
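The extreme learning machine head has a simple closed-form fit (random hidden weights, least-squares output weights), sketched below for reference; the paper additionally tunes the ELM with the imperialist competitive algorithm, which this vanilla sketch omits, and the function names are ours.

```python
import numpy as np

def elm_fit(features, targets, n_hidden=256, seed=0):
    """Vanilla ELM: a fixed random hidden layer followed by a closed-form
    least-squares solve for the output weights. `targets` is expected to be
    one-hot encoded, shape (n_samples, n_classes)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((features.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    h = np.tanh(features @ w + b)          # hidden-layer activations
    beta = np.linalg.pinv(h) @ targets     # output weights via pseudoinverse
    return w, b, beta

def elm_predict(features, w, b, beta):
    """Class scores for deep features extracted by the CNN backbone."""
    return np.tanh(features @ w + b) @ beta
```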
PhraseAug: An Augmented Medical Report Generation Model with Phrasebook
Pub Date: 2024-06-18 | DOI: 10.1109/TMI.2024.3416190
Xin Mei, Libin Yang, Denghong Gao, Xiaoyan Cai, Junwei Han, Tianming Liu
Medical report generation is a valuable and challenging task that automatically generates accurate and fluent diagnostic reports for medical images, reducing the workload of radiologists and improving the efficiency of disease diagnosis. Fine-grained alignment of medical images and reports facilitates the exploration of close correlations between images and texts, which is crucial for cross-modal generation. However, visual and linguistic biases caused by radiologists' writing styles make cross-modal image-text alignment difficult. To alleviate this visual-linguistic bias, this paper discretizes medical reports and introduces an intermediate modality, i.e., a phrasebook consisting of key noun phrases. As a discretized representation of medical reports, the phrasebook contains both disease-related medical terms and synonymous phrases representing different writing styles, which helps identify synonymous sentences and thereby promotes fine-grained alignment between images and reports. In this paper, an augmented two-stage medical report generation model with a phrasebook (PhraseAug) is developed, which combines medical images, clinical histories, and writing styles to generate diagnostic reports. In the first stage, the phrasebook is used to extract semantically relevant features and predict the key phrases contained in the report. In the second stage, medical reports are generated according to the predicted key phrases, which contain synonymous phrases, enabling our model to adapt to different writing styles and generate diverse medical reports. Experimental results on two public datasets, IU-Xray and MIMIC-CXR, demonstrate that our proposed PhraseAug outperforms state-of-the-art baselines.
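The two-stage pipeline can be outlined as below: a multi-label predictor scores phrasebook entries, and a decoder then generates the report conditioned on the top-scoring phrases. Every callable and parameter here is a hypothetical placeholder, not the authors' interface.

```python
import torch

def generate_report(image_feat, history_feat, phrase_predictor, report_decoder, top_k=10):
    """Stage 1: predict key noun phrases from the phrasebook.
    Stage 2: decode the report conditioned on the predicted phrases."""
    phrase_logits = phrase_predictor(image_feat, history_feat)   # (len(phrasebook),)
    key_phrases = torch.topk(phrase_logits, k=top_k).indices     # ids of predicted key phrases
    return report_decoder(image_feat, key_phrases)
```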
Pathological Asymmetry-Guided Progressive Learning for Acute Ischemic Stroke Infarct Segmentation
Pub Date: 2024-06-14 | DOI: 10.1109/TMI.2024.3414842
Jiarui Sun, Qiuxuan Li, Yuhao Liu, Yichuan Liu, Gouenou Coatrieux, Jean-Louis Coatrieux, Yang Chen, Jie Lu
Quantitative infarct estimation is crucial for diagnosis, treatment, and prognosis in acute ischemic stroke (AIS) patients. Because the early changes in ischemic tissue are subtle and easily confounded with normal brain tissue, it remains a very challenging task. Moreover, existing methods often ignore or confuse the contributions to segmentation of different types of anatomical asymmetry caused by intrinsic and pathological changes, and inefficient use of domain knowledge leads to mis-segmentation of AIS infarcts. To address these issues, we propose a pathological asymmetry-guided progressive learning (PAPL) method for AIS infarct segmentation. PAPL mimics the step-by-step learning patterns observed in humans and comprises three progressive stages: a knowledge preparation stage, a formal learning stage, and an examination improvement stage. First, the knowledge preparation stage accumulates preparatory domain knowledge for the infarct segmentation task, learning domain-specific knowledge representations through a constructed contrastive learning task to enhance the discriminative ability for pathological asymmetries. Then, the formal learning stage efficiently performs end-to-end training guided by the learned knowledge representations, in which the designed feature compensation module (FCM) leverages the anatomical similarity between adjacent slices of the volumetric medical image to aggregate rich anatomical context information. Finally, the examination improvement stage refines the infarct prediction from the previous stage, where the proposed perception refinement strategy (RPRS) further exploits bilateral difference comparison to correct mis-segmented infarct regions through adaptive regional shrinking and expansion. Extensive experiments on public and in-house NCCT datasets demonstrate the superiority of the proposed PAPL, which promises to support better stroke evaluation and treatment.
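As a minimal illustration of the bilateral (left-right) comparison that motivates the refinement strategy, the sketch below contrasts a roughly midline-aligned brain volume with its mirror image; the actual RPRS is considerably more involved, and this snippet is only an assumption-laden asymmetry cue.

```python
import torch

def bilateral_difference(volume):
    """Simple asymmetry cue: difference between a brain volume and its
    left-right mirror, assuming approximate midline alignment."""
    mirrored = torch.flip(volume, dims=[-1])   # flip along the left-right axis
    return volume - mirrored
```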
Assessing the capacity of a denoising diffusion probabilistic model to reproduce spatial context
Pub Date: 2024-06-14 | DOI: 10.1109/TMI.2024.3414931
Rucha Deshpande, Muzaffer Ozbey, Hua Li, Mark A Anastasio, Frank J Brooks
Diffusion models have emerged as a popular family of deep generative models (DGMs). In the literature, it has been claimed that one class of diffusion models, denoising diffusion probabilistic models (DDPMs), demonstrates superior image synthesis performance compared to generative adversarial networks (GANs). To date, these claims have been evaluated using either ensemble-based methods designed for natural images or conventional measures of image quality such as structural similarity. However, there remains an important need to understand the extent to which DDPMs can reliably learn medical imaging domain-relevant information, referred to as 'spatial context' in this work. To address this, a systematic assessment of the ability of DDPMs to learn spatial context relevant to medical imaging applications is reported for the first time. A key aspect of the studies is the use of stochastic context models (SCMs) to produce training data. In this way, the ability of the DDPMs to reliably reproduce spatial context can be quantitatively assessed by use of post-hoc image analyses. Error rates in DDPM-generated ensembles are reported and compared with those of other modern DGMs. The studies reveal new and important insights regarding the capacity of DDPMs to learn spatial context. Notably, the results demonstrate that DDPMs hold significant capacity for generating contextually correct images that are 'interpolated' between training samples, which may benefit data-augmentation tasks in ways that GANs cannot.
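A post-hoc error-rate check of the kind described above can be expressed as a small predicate-based routine; the context rule itself is dataset-specific (defined by the stochastic context model), so it is left as a user-supplied function in this sketch.

```python
def ensemble_error_rate(generated_images, satisfies_context_rule):
    """Fraction of generated images that violate a prescribed SCM context rule.
    `satisfies_context_rule` is a hypothetical per-image predicate."""
    violations = sum(not satisfies_context_rule(img) for img in generated_images)
    return violations / len(generated_images)
```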
BrainMass: Advancing Brain Network Analysis for Diagnosis with Large-scale Self-Supervised Learning
Pub Date: 2024-06-14 | DOI: 10.1109/TMI.2024.3414476
Yanwu Yang, Chenfei Ye, Guinan Su, Ziyao Zhang, Zhikai Chang, Hairui Chen, Piu Chan, Yue Yu, Ting Ma
Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there has been limited investigation into brain network foundation models, limiting their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, (1) we curated a comprehensive dataset by collating images from 30 datasets, comprising 70,781 samples from 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal. (2) We propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module is utilized to regularize augmented brain networks of the same participant, which share similar topological properties, so that they yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. Moreover, BrainMass demonstrates powerful few-shot and zero-shot learning abilities and exhibits meaningful interpretations for various diseases, showcasing its potential for clinical applications.
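The pFC augmentation described above (drop random BOLD timepoints, then recompute connectivity) admits a compact sketch; the correlation-based connectivity and the `keep_ratio` value are assumptions for illustration, not the authors' exact settings.

```python
import numpy as np

def pseudo_functional_connectivity(bold, keep_ratio=0.9, seed=None):
    """Randomly drop a fraction of BOLD timepoints and recompute the
    ROI-by-ROI correlation matrix. bold: (T, R) array of T timepoints
    for R regions; returns an (R, R) augmented connectivity matrix."""
    rng = np.random.default_rng(seed)
    n_keep = int(bold.shape[0] * keep_ratio)
    kept = np.sort(rng.choice(bold.shape[0], size=n_keep, replace=False))
    return np.corrcoef(bold[kept].T)
```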