Video-Instrument Synergistic Network for Referring Video Instrument Segmentation in Robotic Surgery.
Pub Date : 2024-07-11, DOI: 10.1109/TMI.2024.3426953
Hongqiu Wang, Guang Yang, Shichen Zhang, Jing Qin, Yike Guo, Bo Xu, Yueming Jin, Lei Zhu
Surgical instrument segmentation is fundamentally important for facilitating cognitive intelligence in robot-assisted surgery. Although existing methods achieve accurate instrument segmentation, they generate segmentation masks for all instruments simultaneously and therefore cannot specify a target object or support an interactive experience. This paper focuses on a novel and essential task in robotic surgery, i.e., Referring Surgical Video Instrument Segmentation (RSVIS), which aims to automatically identify and segment the target surgical instruments in each video frame, referred to by a given language expression. This interactive capability offers enhanced user engagement and customized experiences, greatly benefiting the development of the next generation of surgical education systems. To this end, this paper constructs two surgical video datasets to promote RSVIS research. We then devise a novel Video-Instrument Synergistic Network (VIS-Net) that learns both video-level and instrument-level knowledge to boost performance, whereas previous work utilized only video-level information. Meanwhile, we design a Graph-based Relation-aware Module (GRM) to model the correlation between multi-modal information (i.e., the textual description and the video frame) and thereby facilitate the extraction of instrument-level information. Extensive experimental results on two RSVIS datasets show that VIS-Net significantly outperforms existing state-of-the-art referring segmentation methods. We will release our code and dataset for future research (Git).
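The abstract does not include implementation details; as a rough illustration of the general idea of correlating a referring expression with frame features in a relation-aware module, here is a minimal cross-attention sketch in PyTorch. The module name, dimensions, and design are assumptions for illustration, not the authors' GRM.

```python
import torch
import torch.nn as nn

class CrossModalRelation(nn.Module):
    """Toy relation module: every visual token attends to the words of the
    referring expression, loosely mirroring the idea of correlating textual
    and visual cues (illustrative sketch, not the paper's GRM)."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, visual: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # visual: (B, N, D) flattened frame features; text: (B, L, D) word features
        fused, _ = self.attn(query=visual, key=text, value=text)
        return self.norm(visual + fused)  # residual fusion of language-conditioned features


# Example: an 8x8 feature map (64 tokens) conditioned on a 12-word expression.
module = CrossModalRelation(dim=256)
out = module(torch.randn(2, 64, 256), torch.randn(2, 12, 256))
print(out.shape)  # torch.Size([2, 64, 256])
```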
Counterfactual Causal-Effect Intervention for Interpretable Medical Visual Question Answering.
Pub Date : 2024-07-09, DOI: 10.1109/TMI.2024.3425533
Linqin Cai, Haodu Fang, Nuoying Xu, Bo Ren
Medical Visual Question Answering (VQA-Med) is a challenging task that involves answering clinical questions related to medical images. However, most current VQA-Med methods ignore the causal correlation between specific lesion or abnormality features and answers, while also failing to provide accurate explanations for their decisions. To explore the interpretability of VQA-Med, this paper proposes a novel CCIS-MVQA model for VQA-Med based on a counterfactual causal-effect intervention strategy. This model consists of a modified ResNet for image feature extraction, a GloVe decoder for question feature extraction, a bilinear attention network for vision-language feature fusion, and an interpretability generator for producing the interpretability and prediction results. The proposed CCIS-MVQA introduces a layer-wise relevance propagation method to automatically generate counterfactual samples. Additionally, CCIS-MVQA applies counterfactual causal reasoning throughout the training phase to enhance interpretability and generalization. Extensive experiments on three benchmark datasets show that CCIS-MVQA outperforms state-of-the-art methods. Abundant visualization results are provided to analyze the interpretability and performance of CCIS-MVQA.
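Bilinear attention between image and question features is a well-known fusion pattern; below is a minimal low-rank bilinear pooling sketch in PyTorch. The dimensions, rank, and class name are assumptions for illustration, not the CCIS-MVQA implementation.

```python
import torch
import torch.nn as nn

class LowRankBilinearFusion(nn.Module):
    """Minimal low-rank bilinear fusion of an image feature and a question
    feature, the general idea behind bilinear attention networks
    (illustrative sketch; dimensions and rank are assumptions)."""

    def __init__(self, img_dim=2048, q_dim=300, rank=512, out_dim=1024):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, rank)
        self.q_proj = nn.Linear(q_dim, rank)
        self.out = nn.Linear(rank, out_dim)

    def forward(self, img_feat, q_feat):
        # Element-wise product of projected features approximates a bilinear form.
        joint = torch.tanh(self.img_proj(img_feat)) * torch.tanh(self.q_proj(q_feat))
        return self.out(joint)


fusion = LowRankBilinearFusion()
z = fusion(torch.randn(4, 2048), torch.randn(4, 300))
print(z.shape)  # torch.Size([4, 1024])
```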
Attribute Prototype-guided Iterative Scene Graph for Explainable Radiology Report Generation.
Pub Date : 2024-07-08, DOI: 10.1109/TMI.2024.3424505
Ke Zhang, Yan Yang, Jun Yu, Jianping Fan, Hanliang Jiang, Qingming Huang, Weidong Han
The potential of automated radiology report generation in alleviating the time-consuming tasks of radiologists is increasingly being recognized in medical practice. Existing report generation methods have evolved from using image-level features to the latest approach of utilizing anatomical regions, significantly enhancing interpretability. However, directly and simplistically using region features for report generation compromises the capability of relation reasoning and overlooks the common attributes potentially shared across regions. To address these limitations, we propose a novel region-based Attribute Prototype-guided Iterative Scene Graph generation framework (AP-ISG) for report generation, which uses scene graph generation as an auxiliary task to further enhance interpretability and relational reasoning capability. The core components of AP-ISG are the Iterative Scene Graph Generation (ISGG) module and the Attribute Prototype-guided Learning (APL) module. Specifically, ISGG employs an autoregressive scheme for structural edge reasoning and a contextualization mechanism for relational reasoning. APL enhances intra-prototype matching and reduces inter-prototype semantic overlap in the visual space to fully model the potential attribute commonalities among regions. Extensive experiments on the MIMIC-CXR dataset with Chest ImaGenome annotations demonstrate the superiority of AP-ISG across multiple metrics.
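The described prototype objective (tighten intra-prototype matching, reduce inter-prototype semantic overlap) can be illustrated by a simple loss that pulls region features toward their assigned prototype and pushes distinct prototypes apart. The formulation below is a generic sketch under assumed shapes and margin, not the paper's APL module.

```python
import torch
import torch.nn.functional as F

def prototype_loss(features, labels, prototypes, margin=0.5):
    """Toy attribute-prototype objective (illustrative, not the paper's APL):
    - pull each region feature toward its attribute prototype (intra-prototype matching)
    - push normalized prototypes apart so they overlap less (inter-prototype separation)
    features: (N, D) region features, labels: (N,) prototype indices, prototypes: (K, D)."""
    pull = F.mse_loss(features, prototypes[labels])           # intra-prototype term
    p = F.normalize(prototypes, dim=1)
    sim = p @ p.t()                                           # (K, K) cosine similarities
    off_diag = sim - torch.eye(len(prototypes), device=sim.device)
    push = F.relu(off_diag - margin).mean()                   # penalize overly similar prototypes
    return pull + push


feats = torch.randn(32, 128)
labels = torch.randint(0, 10, (32,))
protos = torch.randn(10, 128, requires_grad=True)
print(prototype_loss(feats, labels, protos))
```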
Carotid Vessel Wall Segmentation Through Domain Aligner, Topological Learning, and Segment Anything Model for Sparse Annotation in MR Images.
Pub Date : 2024-07-08, DOI: 10.1109/TMI.2024.3424884
Xibao Li, Xi Ouyang, Jiadong Zhang, Zhongxiang Ding, Yuyao Zhang, Zhong Xue, Feng Shi, Dinggang Shen
Medical image analysis poses significant challenges due to the limited availability of clinical data, which is crucial for training accurate models. This limitation is further compounded by the specialized and labor-intensive nature of the data annotation process. For example, although computed tomography angiography (CTA) is popular for diagnosing atherosclerosis and has an abundance of annotated datasets, magnetic resonance (MR) images offer better visualization for soft plaque and vessel wall characterization. However, the higher cost and limited accessibility of MR, as well as the time-consuming nature of manual labeling, result in fewer annotated datasets. To address these issues, we formulate a multi-modal transfer learning network, named MT-Net, designed to learn from unpaired CTA and sparsely-annotated MR data. Additionally, we harness the Segment Anything Model (SAM) to synthesize additional MR annotations, enriching the training process. Specifically, our method first segments vessel lumen regions and then precisely characterizes the carotid artery vessel walls, thereby ensuring both segmentation accuracy and clinical relevance. We validated our method through rigorous experimentation on publicly available datasets from the COSMOS and CARE-II challenges, demonstrating its superior performance compared to existing state-of-the-art techniques.
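One common way to synthesize extra annotations with SAM, as the abstract mentions, is to prompt it with a point or box derived from a sparse annotation and keep the predicted mask as a pseudo-label. The snippet below sketches that idea with the public segment_anything package; the checkpoint path, prompt choice, and how the paper actually drives SAM are assumptions.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM backbone (the checkpoint path is a placeholder, not from the paper).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

def pseudo_label(image_rgb: np.ndarray, click_xy) -> np.ndarray:
    """Prompt SAM with a single foreground click (e.g., a sparse vessel annotation)
    and return the best-scoring binary mask as a pseudo-label."""
    predictor.set_image(image_rgb)                      # HxWx3 uint8 image
    masks, scores, _ = predictor.predict(
        point_coords=np.array([click_xy], dtype=np.float32),
        point_labels=np.array([1]),                     # 1 = foreground point
        multimask_output=True,
    )
    return masks[int(np.argmax(scores))]                # (H, W) boolean mask
```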
Unified Multi-Modal Image Synthesis for Missing Modality Imputation.
Pub Date : 2024-07-08, DOI: 10.1109/TMI.2024.3424785
Yue Zhang, Chengtao Peng, Qiuli Wang, Dan Song, Kaiyan Li, S Kevin Zhou
Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption, and varying imaging protocols often result in incomplete multi-modal images, limiting the use of multi-modal data for clinical purposes. To address this issue, in this paper we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method adopts a generative adversarial architecture, which aims to synthesize missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both modality-invariant and modality-specific information contained in the input modalities. The incorporation of both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. In addition, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which makes the network robust to randomly missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Verified on two public multi-modal magnetic resonance datasets, the proposed method is effective in handling various synthesis tasks and shows superior performance compared to previous methods.
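A generic way to combine features from a variable set of available modalities with both hard and soft integration is element-wise max pooling (hard) plus attention-weighted averaging (soft). The PyTorch sketch below is an assumption-laden illustration of that idea, not the paper's Dynamic Feature Unification Module.

```python
import torch
import torch.nn as nn

class DynamicFusion(nn.Module):
    """Illustrative fusion over a variable number of modality features:
    hard integration = element-wise max, soft integration = learned attention average."""

    def __init__(self, dim=256):
        super().__init__()
        self.score = nn.Linear(dim, 1)     # per-modality attention score
        self.merge = nn.Linear(2 * dim, dim)

    def forward(self, feats):              # feats: (B, M, D), M = number of available modalities
        hard = feats.max(dim=1).values                   # (B, D) hard integration
        w = torch.softmax(self.score(feats), dim=1)      # (B, M, 1) soft weights
        soft = (w * feats).sum(dim=1)                    # (B, D) soft integration
        return self.merge(torch.cat([hard, soft], dim=-1))


fuse = DynamicFusion()
print(fuse(torch.randn(2, 3, 256)).shape)  # works for any M: torch.Size([2, 256])
```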
HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation.
Pub Date : 2024-07-08, DOI: 10.1109/TMI.2024.3424471
Tao Chen, Chenhui Wang, Zhihao Chen, Yiming Lei, Hongming Shan
Medical image segmentation has been significantly advanced by the rapid development of deep learning (DL) techniques. Existing DL-based segmentation models are typically discriminative; i.e., they aim to learn a mapping from the input image to segmentation masks. However, these discriminative methods neglect the underlying data distribution and intrinsic class characteristics, and suffer from an unstable feature space. In this work, we propose to complement discriminative segmentation methods with knowledge of the underlying data distribution from generative models. To that end, we propose a novel hybrid diffusion framework for medical image segmentation, termed HiDiff, which can synergize the strengths of existing discriminative segmentation models and new generative diffusion models. HiDiff comprises two key components: a discriminative segmentor and a diffusion refiner. First, we utilize any conventional trained segmentation model as the discriminative segmentor, which provides a segmentation mask prior for the diffusion refiner. Second, we propose a novel binary Bernoulli diffusion model (BBDM) as the diffusion refiner, which can effectively, efficiently, and interactively refine the segmentation mask by modeling the underlying data distribution. Third, we train the segmentor and BBDM in an alternate-collaborative manner so that they mutually boost each other. Extensive experimental results on abdominal organ, brain tumor, polyp, and retinal vessel segmentation datasets, covering four widely used modalities, demonstrate the superior performance of HiDiff over existing medical segmentation algorithms, including state-of-the-art transformer- and diffusion-based ones. In addition, HiDiff excels at segmenting small objects and generalizing to new datasets. Source code is available at https://github.com/takimailto/HiDiff.
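A binary (Bernoulli) diffusion over segmentation masks can be illustrated by a forward corruption step that, with a schedule-dependent probability, resamples each pixel from Bernoulli(0.5) while otherwise keeping its clean value. The code below is a toy forward process only; the schedule and parameterization are assumptions, not the paper's BBDM.

```python
import torch

def bernoulli_forward(x0: torch.Tensor, t: int, betas: torch.Tensor) -> torch.Tensor:
    """Toy forward step of a binary diffusion: with probability (1 - alpha_bar_t) a pixel
    is resampled from Bernoulli(0.5); otherwise it keeps its clean value x0 in {0, 1}.
    Illustrative corruption process, not the paper's exact BBDM."""
    alpha_bar = torch.cumprod(1.0 - betas, dim=0)[t]
    keep = torch.bernoulli(torch.full_like(x0, float(alpha_bar)))   # 1 = keep clean pixel
    noise = torch.bernoulli(torch.full_like(x0, 0.5))               # uniform binary noise
    return keep * x0 + (1.0 - keep) * noise


mask = (torch.rand(1, 1, 64, 64) > 0.5).float()   # a binary "segmentation mask"
betas = torch.linspace(1e-4, 0.02, 1000)          # linear noise schedule (assumed)
noisy = bernoulli_forward(mask, t=500, betas=betas)
print(noisy.unique())                             # tensor([0., 1.])
```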
Joint regional uptake quantification of thorium-227 and radium-223 using a multiple-energy-window projection-domain quantitative SPECT method.
Pub Date : 2024-07-05, DOI: 10.1109/TMI.2024.3420228
Zekun Li, Nadia Benabdallah, Richard Laforest, Richard L Wahl, Daniel L J Thorek, Abhinav K Jha
Thorium-227 (227Th)-based α-particle radiopharmaceutical therapies (α-RPTs) are currently being investigated in several clinical and pre-clinical studies. After administration, 227Th decays to 223Ra, another α-particle-emitting isotope, which redistributes within the patient. Reliable dose quantification of both 227Th and 223Ra is clinically important, and SPECT can perform this quantification since these isotopes also emit X- and γ-ray photons. However, reliable quantification is challenging for several reasons: the orders-of-magnitude lower activity compared to conventional SPECT, resulting in a very low number of detected counts; the presence of multiple photopeaks; substantial overlap in the emission spectra of these isotopes; and the image-degrading effects in SPECT. To address these issues, we propose a multiple-energy-window projection-domain quantification (MEW-PDQ) method that jointly estimates the regional activity uptake of both 227Th and 223Ra directly from the SPECT projection data of multiple energy windows. We evaluated the method with realistic simulation studies conducted with anthropomorphic digital phantoms, including a virtual imaging trial, in the context of imaging patients with bone metastases of prostate cancer treated with 227Th-based α-RPTs. The proposed method yielded reliable (accurate and precise) regional uptake estimates of both isotopes and outperformed state-of-the-art methods across different lesion sizes and contrasts, as well as in the virtual imaging trial. This reliable performance was also observed with moderate levels of intra-regional heterogeneous uptake and with moderate inaccuracies in the definitions of the support of various regions. Additionally, we demonstrated the effectiveness of using multiple energy windows and showed that the variance of the uptake estimated with the proposed method approached the theoretical limit defined by the Cramér-Rao lower bound. These results provide strong evidence in support of this method for reliable uptake quantification in 227Th-based α-RPTs.
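The core idea of projection-domain joint estimation, recovering the regional uptakes of two isotopes with overlapping spectra directly from multi-energy-window Poisson data, can be illustrated with a tiny maximum-likelihood fit. The sensitivity matrix, counts, and SciPy-based optimization below are purely illustrative and are not the MEW-PDQ implementation.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Made-up sensitivity matrix: expected counts per energy window (rows) per unknown
# (columns = [Th-227 uptake, Ra-223 uptake] for one region); overlapping emission
# spectra are what make the two columns correlated.
A = np.array([[0.8, 0.3],
              [0.5, 0.6],
              [0.2, 0.9],
              [0.1, 0.4]])
true_uptake = np.array([40.0, 25.0])
counts = rng.poisson(A @ true_uptake)          # low-count Poisson measurements

def neg_log_likelihood(x):
    lam = A @ x
    return np.sum(lam - counts * np.log(lam))  # Poisson NLL up to a constant

res = minimize(neg_log_likelihood, x0=np.array([10.0, 10.0]),
               bounds=[(1e-6, None), (1e-6, None)])
print("estimated uptakes:", res.x)             # jointly recovered Th-227 / Ra-223 activities
```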
A denoising diffusion probabilistic model for metal artifact reduction in CT.
Pub Date : 2024-07-04, DOI: 10.1109/TMI.2024.3416398
Grigorios M Karageorgos, Jiayong Zhang, Nils Peters, Wenjun Xia, Chuang Niu, Harald Paganetti, Ge Wang, Bruno De Man
The presence of metal objects leads to corrupted CT projection measurements, resulting in metal artifacts in the reconstructed CT images. AI promises to offer improved solutions to estimate missing sinogram data for metal artifact reduction (MAR), as previously shown with convolutional neural networks (CNNs) and generative adversarial networks (GANs). Recently, denoising diffusion probabilistic models (DDPM) have shown great promise in image generation tasks, potentially outperforming GANs. In this study, a DDPM-based approach is proposed for inpainting of missing sinogram data for improved MAR. The proposed model is unconditionally trained, free from information on metal objects, which can potentially enhance its generalization capabilities across different types of metal implants compared to conditionally trained approaches. The performance of the proposed technique was evaluated and compared to the state-of-the-art normalized MAR (NMAR) approach as well as to CNN-based and GAN-based MAR approaches. The DDPM-based approach provided significantly higher SSIM and PSNR, as compared to NMAR (SSIM: p < 10^-26; PSNR: p < 10^-21), the CNN (SSIM: p < 10^-25; PSNR: p < 10^-9) and the GAN (SSIM: p < 10^-6; PSNR: p < 0.05) methods. The DDPM-MAR technique was further evaluated based on clinically relevant image quality metrics on clinical CT images with virtually introduced metal objects and metal artifacts, demonstrating superior quality relative to the other three models. In general, the AI-based techniques showed improved MAR performance compared to the non-AI-based NMAR approach. The proposed methodology shows promise in enhancing the effectiveness of MAR, and therefore improving the diagnostic accuracy of CT.
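Sinogram inpainting with an unconditionally trained DDPM is often done by running the reverse diffusion while, at every step, overwriting the uncorrupted (metal-free) sinogram bins with an appropriately noised copy of the measured data, so that only the metal-trace region is synthesized. The loop below sketches that general idea with a placeholder denoising network; the sampler, schedule, shapes, and the denoiser itself are assumptions, not the paper's model.

```python
import torch

def inpaint_sinogram(denoiser, sino, trace_mask, betas):
    """Illustrative inpainting loop in the spirit of RePaint (not the paper's sampler).
    sino: measured sinogram (B,1,H,W); trace_mask: 1 where data is corrupted by metal."""
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn_like(sino)
    for t in reversed(range(len(betas))):
        # Keep measured bins consistent: noise the known data to level t and paste it in.
        known = torch.sqrt(alpha_bar[t]) * sino + torch.sqrt(1 - alpha_bar[t]) * torch.randn_like(sino)
        x = trace_mask * x + (1 - trace_mask) * known
        # Standard DDPM reverse step using the (placeholder) noise predictor.
        eps = denoiser(x, t)
        mean = (x - betas[t] / torch.sqrt(1 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
        x = mean + (torch.sqrt(betas[t]) * torch.randn_like(x) if t > 0 else 0.0)
    return trace_mask * x + (1 - trace_mask) * sino   # final composite sinogram


# Dummy usage with a placeholder "denoiser" (a trained noise-prediction network would go here).
denoiser = lambda x, t: torch.zeros_like(x)
sino = torch.randn(1, 1, 64, 64)
trace = torch.zeros_like(sino); trace[..., 28:36] = 1.0   # hypothetical metal-trace region
betas = torch.linspace(1e-4, 0.02, 50)
print(inpaint_sinogram(denoiser, sino, trace, betas).shape)
```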
Mitigating Aberration-Induced Noise: A Deep Learning-Based Aberration-to-Aberration Approach.
Pub Date : 2024-07-03, DOI: 10.1109/TMI.2024.3422027
Mostafa Sharifzadeh, Sobhan Goudarzi, An Tang, Habib Benali, Hassan Rivaz
One of the primary sources of suboptimal image quality in ultrasound imaging is phase aberration. It is caused by spatial changes in sound speed over a heterogeneous medium, which disturbs the transmitted waves and prevents coherent summation of echo signals. Obtaining non-aberrated ground truths in real-world scenarios can be extremely challenging, if not impossible. This challenge hinders the performance of deep learning-based techniques due to the domain shift between simulated and experimental data. Here, for the first time, we propose a deep learning-based method that does not require ground truth to correct the phase aberration problem and, as such, can be directly trained on real data. We train a network wherein both the input and target output are randomly aberrated radio frequency (RF) data. Moreover, we demonstrate that a conventional loss function such as mean square error is inadequate for training such a network to achieve optimal performance. Instead, we propose an adaptive mixed loss function that employs both B-mode and RF data, resulting in more efficient convergence and enhanced performance. Finally, we publicly release our dataset, comprising over 180,000 aberrated single plane-wave images (RF data), wherein phase aberrations are modeled as near-field phase screens. Although not utilized in the proposed method, each aberrated image is paired with its corresponding aberration profile and the non-aberrated version, aiming to mitigate the data scarcity problem in developing deep learning-based techniques for phase aberration correction. Source code and trained model are also available along with the dataset at http://code.sonography.ai/main-aaa.
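A mixed RF/B-mode objective like the one described can be illustrated by differentiably forming a B-mode image from RF data (envelope via an FFT-based Hilbert transform, then log compression) and mixing the two errors. The sketch below uses a fixed mixing weight; the paper's adaptive weighting and other details are not reproduced, and all names are assumptions.

```python
import torch

def bmode(rf: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Differentiable B-mode from RF lines (..., n_samples): analytic-signal envelope
    via an FFT-based Hilbert transform, followed by log compression."""
    n = rf.shape[-1]
    spec = torch.fft.fft(rf, dim=-1)
    h = torch.zeros(n, dtype=spec.dtype, device=rf.device)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    envelope = torch.abs(torch.fft.ifft(spec * h, dim=-1))
    return 20.0 * torch.log10(envelope + eps)

def mixed_loss(rf_pred, rf_target, alpha=0.5):
    """Illustrative mixed objective: weighted sum of RF-domain and B-mode-domain MSE
    (fixed weight here; the paper's adaptive scheme is not reproduced)."""
    rf_term = torch.mean((rf_pred - rf_target) ** 2)
    bm_term = torch.mean((bmode(rf_pred) - bmode(rf_target)) ** 2)
    return alpha * rf_term + (1.0 - alpha) * bm_term


print(mixed_loss(torch.randn(2, 128, 2048), torch.randn(2, 128, 2048)))
```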
A Convolutional-Transformer Model for FFR and iFR Assessment From Coronary Angiography
Pub Date : 2024-07-02, DOI: 10.1109/TMI.2024.3383283
Raffaele Mineo;F. Proietto Salanitri;G. Bellitto;I. Kavasidis;O. De Filippo;M. Millesimo;G. M. De Ferrari;M. Aldinucci;D. Giordano;S. Palazzo;F. D’Ascenzo;C. Spampinato
The quantification of stenosis severity from X-ray catheter angiography is a challenging task. Indeed, it requires fully understanding the lesion's geometry by analyzing the dynamics of the contrast material, relying only on visual observation by clinicians. To support decision making for cardiac intervention, we propose a hybrid CNN-Transformer model for the assessment of angiography-based non-invasive fractional flow reserve (FFR) and instantaneous wave-free ratio (iFR) of intermediate coronary stenosis. Our approach predicts whether a coronary artery stenosis is hemodynamically significant and provides direct FFR and iFR estimates. This is achieved through a combination of regression and classification branches that forces the model to focus on the cut-off region of FFR (around the 0.8 FFR value), which is highly critical for decision-making. We also propose a spatio-temporal factorization mechanism that redesigns the transformer's self-attention mechanism to capture both local spatial and temporal interactions between vessel geometry, blood flow dynamics, and lesion morphology. The proposed method achieves state-of-the-art performance on a dataset of 778 exams from 389 patients. Unlike existing methods, our approach employs a single angiography view and does not require knowledge of the key frame; supervision at training time is provided by a classification loss (based on a threshold of the FFR/iFR values) and a regression loss for direct estimation. Finally, the analysis of model interpretability and calibration shows that, in spite of the complexity of angiographic imaging data, our method can robustly identify the location of the stenosis and correlate prediction uncertainty with the provided output scores.
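The combination of a regression head with a classification head around the clinical 0.8 FFR cut-off can be written as a simple joint loss; below is a generic PyTorch sketch in which the loss weight, head outputs, and cut-off handling are assumptions rather than the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def ffr_joint_loss(reg_pred, cls_logit, ffr_true, cutoff=0.8, lam=1.0):
    """Illustrative joint objective for FFR estimation:
    - MSE on the continuous FFR value (regression branch)
    - BCE on hemodynamic significance, i.e. FFR <= 0.8 (classification branch)
    The weighting lam and the cut-off handling are assumptions."""
    reg_loss = F.mse_loss(reg_pred, ffr_true)
    significant = (ffr_true <= cutoff).float()
    cls_loss = F.binary_cross_entropy_with_logits(cls_logit, significant)
    return reg_loss + lam * cls_loss


reg_pred = torch.tensor([0.78, 0.91])     # predicted FFR values
cls_logit = torch.tensor([0.3, -1.2])     # predicted significance logits
ffr_true = torch.tensor([0.75, 0.93])     # ground-truth invasive FFR
print(ffr_joint_loss(reg_pred, cls_logit, ffr_true))
```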