Cone-beam CT (CBCT) is extensively used in medical diagnosis and treatment. Despite its large longitudinal field of view (FoV), the horizontal FoV of CBCT systems is severely limited by the detector width. Certain commercial CBCT systems enlarge the horizontal FoV with an offset-detector method. However, this method requires a 360° full circular scanning trajectory, which increases the scanning time and is incompatible with certain CBCT system designs. In this paper, we investigate the feasibility of large-FoV imaging under short-scan trajectories with an additional X-ray source. A dual-source CBCT geometry is proposed, along with two corresponding image reconstruction algorithms: the first is based on cone-parallel rebinning, and the second employs a modified Parker weighting scheme. Theoretical calculations demonstrate that the proposed geometry achieves a wider horizontal FoV than the 90% detector-offset geometry (radius of 214.83 mm vs. 198.99 mm) with a significantly reduced rotation angle (less than 230° vs. 360°). Experiments demonstrate that the proposed geometry and reconstruction algorithms achieve imaging quality within the FoV comparable to conventional CBCT imaging techniques. The proposed geometry is straightforward to implement, does not substantially increase development expenses, and has the potential to further expand CBCT applications.
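The abstract above mentions a modified Parker weighting scheme for short-scan reconstruction. The paper's modified, dual-source version is not reproduced here; as a reference point, below is a minimal NumPy sketch of the classical Parker weights for a single-source fan-beam short scan over π plus twice the fan half-angle (the function and variable names are illustrative).

```python
import numpy as np

def parker_weights(beta, gamma, fan_half_angle):
    """Classical Parker weights for a fan-beam short scan over [0, pi + 2*fan_half_angle].

    beta  : source angles in radians, shape (B,)
    gamma : in-fan ray angles in radians, shape (G,), within [-fan_half_angle, fan_half_angle]
    Returns a (B, G) weight map in [0, 1]; complementary rays receive weights summing to 1.
    """
    b = beta[:, None]
    g = gamma[None, :]
    d = fan_half_angle
    eps = 1e-12                                    # guard against division by zero at the fan edge
    w = np.ones((beta.size, gamma.size))

    ramp_up = b < 2.0 * (d - g)                    # start of the scan
    ramp_down = b > np.pi - 2.0 * g                # end of the scan
    w_up = np.sin(np.pi / 4.0 * b / np.maximum(d - g, eps)) ** 2
    w_down = np.sin(np.pi / 4.0 * (np.pi + 2.0 * d - b) / np.maximum(d + g, eps)) ** 2
    w[ramp_up] = w_up[ramp_up]
    w[ramp_down] = w_down[ramp_down]
    return np.clip(w, 0.0, 1.0)

# Example: 30 degree half fan angle -> short scan of 180 + 60 = 240 degrees
gamma = np.linspace(-np.pi / 6, np.pi / 6, 256)
beta = np.linspace(0.0, np.pi + np.pi / 3, 400)
weights = parker_weights(beta, gamma, np.pi / 6)
```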
{"title":"Dual-Source CBCT for Large FoV Imaging Under Short-Scan Trajectories","authors":"Tianling Lyu;Xusheng Zhang;Xinyun Zhong;Zhan Wu;Yan Xi;Wei Zhao;Yang Chen;Yuanjing Feng;Wentao Zhu","doi":"10.1109/TMI.2025.3586622","DOIUrl":"10.1109/TMI.2025.3586622","url":null,"abstract":"Cone-beam CT is extensively used in medical diagnosis and treatment. Despite its large longitudinal field of view (FoV), the horizontal FoV of CBCT systems is severely limited due to the detector width. Certain commercial CBCT systems increase the horizontal FoV by employing the offset detector method. However, this method necessitates 360° full circular scanning trajectory which increases the scanning time and is not compatible with specific CBCT system models. In this paper, we investigate the feasibility of large FoV imaging under short scan trajectories with an additional X-ray source. A dual-source CBCT geometry is proposed as well as two corresponding image reconstruction algorithms. The first one is based on cone-parallel rebinning and the subsequent employs a modified Parker weighting scheme. Theoretical calculations demonstrate that the proposed geometry achieves a wider horizontal FoV than the <inline-formula> <tex-math>${90}%$ </tex-math></inline-formula> detector offset geometry (radius of <inline-formula> <tex-math>${214}.{83}textit {mm}$ </tex-math></inline-formula> vs. <inline-formula> <tex-math>${198}.{99}textit {mm}$ </tex-math></inline-formula>) with a significantly reduced rotation angle (less than 230° vs. 360°). As demonstrated by experiments, the proposed geometry and reconstruction algorithms obtain comparable imaging qualities within the FoV to conventional CBCT imaging techniques. Implementing the proposed geometry is straightforward and does not substantially increase development expenses. It possesses the capacity to expand CBCT applications even further.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 12","pages":"5051-5064"},"PeriodicalIF":0.0,"publicationDate":"2025-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144577997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-25 DOI: 10.1109/TMI.2025.3579214
Jieru Yao;Guangyu Guo;Zhaohui Zheng;Qiang Xie;Longfei Han;Dingwen Zhang;Junwei Han
Nuclei instance segmentation and classification are fundamental and challenging tasks in whole slide imaging (WSI) analysis. Most dense nuclei prediction studies rely heavily on crowd-labelled data on high-resolution digital images, leading to a time-consuming, expertise-demanding paradigm. Recently, Vision-Language Models (VLMs) have been intensively investigated; they learn rich cross-modal correlations from large-scale image-text pairs without tedious annotations. Inspired by this, we build a novel framework, called PromptNu, aiming to infuse abundant nuclei knowledge into the training of the nuclei instance recognition model through vision-language contrastive learning and prompt engineering techniques. Specifically, our approach starts with the creation of multifaceted prompts that integrate comprehensive nuclear knowledge, including visual insights from the GPT-4V model, statistical analyses, and expert insights from the pathology field. Then, we propose a novel prompting methodology that consists of two pivotal vision-language contrastive learning components: Prompting Nuclei Representation Learning (PNuRL) and Prompting Nuclei Dense Prediction (PNuDP), which integrate the expertise embedded in pre-trained VLMs and multifaceted prompts into the feature extraction and prediction processes, respectively. Comprehensive experiments on six datasets with extensive WSI scenarios demonstrate the effectiveness of our method for both nuclei instance segmentation and classification tasks. The code is available at https://github.com/NucleiDet/PromptNu
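The framework above builds on vision-language contrastive learning. As a generic illustration (not the PromptNu loss), the sketch below shows a symmetric image-text contrastive (InfoNCE) objective of the kind used in CLIP-style training; the encoder names in the usage comment and the temperature value are assumptions.

```python
import torch
import torch.nn.functional as F

def image_text_contrastive_loss(image_feats, text_feats, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    image_feats, text_feats: (B, D) tensors; row i of each is a matched pair.
    """
    img = F.normalize(image_feats, dim=-1)
    txt = F.normalize(text_feats, dim=-1)
    logits = img @ txt.t() / temperature           # (B, B) scaled cosine similarities
    targets = torch.arange(img.size(0), device=img.device)
    loss_i2t = F.cross_entropy(logits, targets)    # match each image to its own text
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i2t + loss_t2i)

# Hypothetical usage with nuclei-patch and prompt embeddings:
# loss = image_text_contrastive_loss(vision_encoder(patches), text_encoder(prompts))
```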
{"title":"Prompting Vision-Language Model for Nuclei Instance Segmentation and Classification","authors":"Jieru Yao;Guangyu Guo;Zhaohui Zheng;Qiang Xie;Longfei Han;Dingwen Zhang;Junwei Han","doi":"10.1109/TMI.2025.3579214","DOIUrl":"10.1109/TMI.2025.3579214","url":null,"abstract":"Nuclei instance segmentation and classification are a fundamental and challenging task in whole slide Imaging (WSI) analysis. Most dense nuclei prediction studies rely heavily on crowd labelled data on high-resolution digital images, leading to a time-consuming and expertise-required paradigm. Recently, Vision-Language Models (VLMs) have been intensively investigated, which learn rich cross-modal correlation from large-scale image-text pairs without tedious annotations. Inspired by this, we build a novel framework, called PromptNu, aiming at infusing abundant nuclei knowledge into the training of the nuclei instance recognition model through vision-language contrastive learning and prompt engineering techniques. Specifically, our approach starts with the creation of multifaceted prompts that integrate comprehensive nuclear knowledge, including visual insights from the GPT-4V model, statistical analyses, and expert insights from the pathology field. Then, we propose a novel prompting methodology that consists of two pivotal vision-language contrastive learning components: the Prompting Nuclei Representation Learning (PNuRL) and the Prompting Nuclei Dense Prediction (PNuDP), which adeptly integrates the expertise embedded in pre-trained VLMs and multifaceted prompts into the feature extraction and prediction process, respectively. Comprehensive experiments on six datasets with extensive WSI scenarios demonstrate the effectiveness of our method for both nuclei instance segmentation and classification tasks. The code is available at <uri>https://github.com/NucleiDet/PromptNu</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4567-4578"},"PeriodicalIF":0.0,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144487884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-25 DOI: 10.1109/TMI.2025.3580659
Zailong Chen;Yingshu Li;Zhanyu Wang;Peng Gao;Johan Barthelemy;Luping Zhou;Lei Wang
Radiology report generation using large language models has recently produced reports with more realistic styles and better language fluency. However, their clinical accuracy remains inadequate. Considering the significant imbalance between clinical phrases and general descriptions in a report, we argue that using an entire report for supervision is problematic, as it fails to emphasize the crucial clinical phrases, which require focused learning. To address this issue, we propose a multi-phased supervision method inspired by curriculum learning, in which models are trained with gradually increasing task complexity. Our approach organizes the learning process into structured phases at different levels of semantic granularity, each building on the previous one to enhance the model. During the first phase, disease labels are used to supervise the model, equipping it with the ability to identify underlying diseases. The second phase uses entity-relation triples to guide the model to describe associated clinical findings. Finally, in the third phase, we introduce conventional whole-report-based supervision to quickly adapt the model for report generation. Throughout the phased training, the model remains the same and consistently operates in generation mode. As experimentally demonstrated, this change in the way of supervision enhances report generation, achieving state-of-the-art performance in both language fluency and clinical accuracy. Our work underscores the importance of training process design in radiology report generation. Our code is available at https://github.com/zailongchen/MultiP-R2Gen
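A minimal sketch of the curriculum-style, phase-by-phase supervision loop described above: the same generation model is trained through an ordered list of phases, each with its own supervision target. The phase structure, loss functions, and dataloader names are placeholders, not the authors' implementation.

```python
import torch

def train_multi_phased(model, optimizer, phases, epochs_per_phase=5):
    """Train one generation model through successive supervision phases.

    phases: list of (dataloader, loss_fn) pairs ordered by increasing semantic
    granularity, e.g. disease labels -> entity-relation triples -> whole reports.
    The same model is reused and updated across all phases.
    """
    for dataloader, loss_fn in phases:
        for _ in range(epochs_per_phase):
            for images, targets in dataloader:
                optimizer.zero_grad()
                outputs = model(images)           # model always runs in generation mode
                loss = loss_fn(outputs, targets)  # phase-specific supervision signal
                loss.backward()
                optimizer.step()
    return model
```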
{"title":"Enhancing Radiology Report Generation via Multi-Phased Supervision","authors":"Zailong Chen;Yingshu Li;Zhanyu Wang;Peng Gao;Johan Barthelemy;Luping Zhou;Lei Wang","doi":"10.1109/TMI.2025.3580659","DOIUrl":"10.1109/TMI.2025.3580659","url":null,"abstract":"Radiology report generation using large language models has recently produced reports with more realistic styles and better language fluency. However, their clinical accuracy remains inadequate. Considering the significant imbalance between clinical phrases and general descriptions in a report, we argue that using an entire report for supervision is problematic as it fails to emphasize the crucial clinical phrases, which require focused learning. To address this issue, we propose a multi-phased supervision method, inspired by the spirit of curriculum learning where models are trained by gradually increasing task complexity. Our approach organizes the learning process into structured phases at different levels of semantical granularity, each building on the previous one to enhance the model. During the first phase, disease labels are used to supervise the model, equipping it with the ability to identify underlying diseases. The second phase progresses to use entity-relation triples to guide the model to describe associated clinical findings. Finally, in the third phase, we introduce conventional whole-report-based supervision to quickly adapt the model for report generation. Throughout the phased training, the model remains the same and consistently operates in the generation mode. As experimentally demonstrated, this proposed change in the way of supervision enhances report generation, achieving state-of-the-art performance in both language fluency and clinical accuracy. Our work underscores the importance of training process design in radiology report generation. Our code is available on <uri>https://github.com/zailongchen/MultiP-R2Gen</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4666-4677"},"PeriodicalIF":0.0,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144487975","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-20 DOI: 10.1109/TMI.2025.3581108
Yuyang Du;Kexin Chen;Yue Zhan;Chang Han Low;Mobarakol Islam;Ziyu Guo;Yueming Jin;Guangyong Chen;Pheng Ann Heng
Visual question answering (VQA) plays a vital role in advancing surgical education. However, due to privacy concerns over patient data, training VQA models on previously used data becomes restricted, making it necessary to use exemplar-free continual learning (CL) approaches. Previous CL studies in the surgical field neglected two critical issues: i) significant domain shifts caused by the wide range of surgical procedures collected from various sources, and ii) the data imbalance problem caused by the unequal occurrence of medical instruments or surgical procedures. This paper addresses these challenges with a multimodal large language model (LLM) and an adaptive weight assignment strategy. First, we develop a novel LLM-assisted multi-teacher CL framework (named LMT++), which harnesses the strength of a multimodal LLM as a supplementary teacher. The LLM's strong generalization ability, as well as its good understanding of the surgical domain, helps address the knowledge gap arising from domain shifts and data imbalances. To incorporate the LLM in our CL framework, we further propose an innovative approach to processing the training data, which converts complex LLM embeddings into logit values used within our CL training framework. Moreover, we design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of conventional VQA models obtained in earlier training stages of the CL framework. Finally, we create a new surgical VQA dataset for model evaluation. Comprehensive experimental findings on these datasets show that our approach surpasses state-of-the-art CL methods.
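The abstract describes weighting an LLM teacher against conventional VQA teachers. Below is a minimal sketch of generic weighted multi-teacher knowledge distillation, which blends the two teachers' soft targets before distilling into the student; the weighting scalar, temperature, and loss balance are assumptions, not the LMT++ adaptive scheme itself.

```python
import torch
import torch.nn.functional as F

def multi_teacher_distillation_loss(student_logits, llm_logits, vqa_logits,
                                    labels, alpha=0.5, temperature=2.0, ce_weight=1.0):
    """Weighted multi-teacher distillation: blend soft targets from an LLM teacher
    and a conventional VQA teacher, then add the usual cross-entropy on labels.

    alpha balances the two teachers (it could be produced by an adaptive scheme).
    """
    T = temperature
    soft_teacher = alpha * F.softmax(llm_logits / T, dim=-1) + \
                   (1.0 - alpha) * F.softmax(vqa_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    kd_loss = F.kl_div(log_student, soft_teacher, reduction="batchmean") * (T * T)
    ce_loss = F.cross_entropy(student_logits, labels)
    return kd_loss + ce_weight * ce_loss
```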
{"title":"LMT++: Adaptively Collaborating LLMs With Multi-Specialized Teachers for Continual VQA in Robotic Surgical Videos","authors":"Yuyang Du;Kexin Chen;Yue Zhan;Chang Han Low;Mobarakol Islam;Ziyu Guo;Yueming Jin;Guangyong Chen;Pheng Ann Heng","doi":"10.1109/TMI.2025.3581108","DOIUrl":"10.1109/TMI.2025.3581108","url":null,"abstract":"Visual question answering (VQA) plays a vital role in advancing surgical education. However, due to the privacy concern of patient data, training VQA model with previously used data becomes restricted, making it necessary to use the exemplar-free continual learning (CL) approach. Previous CL studies in the surgical field neglected two critical issues: i) significant domain shifts caused by the wide range of surgical procedures collected from various sources, and ii) the data imbalance problem caused by the unequal occurrence of medical instruments or surgical procedures. This paper addresses these challenges with a multimodal large language model (LLM) and an adaptive weight assignment strategy. First, we developed a novel LLM-assisted multi-teacher CL framework (named LMT++), which could harness the strength of a multimodal LLM as a supplementary teacher. The LLM’s strong generalization ability, as well as its good understanding of the surgical domain, help to address the knowledge gap arising from domain shifts and data imbalances. To incorporate the LLM in our CL framework, we further proposed an innovative approach to process the training data, which involves the conversion of complex LLM embeddings into logits value used within our CL training framework. Moreover, we design an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of conventional VQA models obtained in previous model training processes within the CL framework. Finally, we created a new surgical VQA dataset for model evaluation. Comprehensive experimental findings on these datasets show that our approach surpasses state-of-the-art CL methods.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4678-4689"},"PeriodicalIF":0.0,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144335331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-20 DOI: 10.1109/TMI.2025.3581605
Seungeun Lee;Seunghwan Lee;Sunghwa Ryu;Ilwoo Lyu
We present a novel learning-based spherical registration method, called SPHARM-Reg, tailored for establishing cortical shape correspondence. SPHARM-Reg aims to reduce warp distortion that can introduce biases in downstream shape analyses. To achieve this, we tackle two critical challenges: (1) joint rigid and non-rigid alignments and (2) rotation-preserving smoothing. Conventional approaches perform rigid alignment only once before a non-rigid alignment. The resulting rotation is potentially sub-optimal, and the subsequent non-rigid alignment may introduce unnecessary distortion. In addition, common velocity encoding schemes on the unit sphere often fail to preserve the rotation component after spatial smoothing of velocity. To address these issues, we propose a diffeomorphic framework that integrates spherical harmonic decomposition of the velocity field with a novel velocity encoding scheme. SPHARM-Reg optimizes harmonic components of the velocity field, enabling joint adjustments for both rigid and non-rigid alignments. Furthermore, the proposed encoding scheme using spherical functions encourages consistent smoothing that preserves the rotation component. In the experiments, we validate SPHARM-Reg on healthy adult datasets. SPHARM-Reg achieves a substantial reduction in warp distortion while maintaining a high level of registration accuracy compared to existing methods. In the clinical analysis, we show that the extent of warp distortion significantly impacts statistical significance.
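SPHARM-Reg optimizes spherical-harmonic components of the velocity field, so smoothing amounts to keeping only low-degree harmonics. The sketch below illustrates that bandlimiting idea on a scalar field sampled on the sphere via a least-squares fit with scipy.special.sph_harm; it is a generic illustration, not the paper's velocity encoding, and the sample points, degree cutoff, and test function are assumptions.

```python
import numpy as np
from scipy.special import sph_harm

def sh_lowpass(values, theta, phi, lmax=8):
    """Low-pass filter scalar samples on the unit sphere by least-squares
    projection onto spherical harmonics up to degree lmax.

    values : (N,) real samples; theta : azimuth in [0, 2*pi); phi : polar angle in [0, pi].
    Returns the band-limited (smoothed) samples at the same points.
    """
    basis = [sph_harm(m, l, theta, phi)            # complex Y_l^m evaluated at the samples
             for l in range(lmax + 1) for m in range(-l, l + 1)]
    A = np.stack(basis, axis=1)                    # (N, (lmax+1)^2) design matrix
    coeffs, *_ = np.linalg.lstsq(A, values.astype(complex), rcond=None)
    return (A @ coeffs).real                       # reconstruction keeps only low degrees

# Example on random points: a degree-3-like pattern plus noise, then smoothed
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, 500)
phi = np.arccos(rng.uniform(-1.0, 1.0, 500))       # uniform sampling on the sphere
noisy = np.cos(3.0 * phi) + 0.1 * rng.standard_normal(500)
smooth = sh_lowpass(noisy, theta, phi, lmax=8)
```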
{"title":"SPHARM-Reg: Unsupervised Cortical Surface Registration Using Spherical Harmonics","authors":"Seungeun Lee;Seunghwan Lee;Sunghwa Ryu;Ilwoo Lyu","doi":"10.1109/TMI.2025.3581605","DOIUrl":"10.1109/TMI.2025.3581605","url":null,"abstract":"We present a novel learning-based spherical registration method, called SPHARM-Reg, tailored for establishing cortical shape correspondence. SPHARM-Reg aims to reduce warp distortion that can introduce biases in downstream shape analyses. To achieve this, we tackle two critical challenges: (1) joint rigid and non-rigid alignments and (2) rotation-preserving smoothing. Conventional approaches perform rigid alignment only once before a non-rigid alignment. The resulting rotation is potentially sub-optimal, and the subsequent non-rigid alignment may introduce unnecessary distortion. In addition, common velocity encoding schemes on the unit sphere often fail to preserve the rotation component after spatial smoothing of velocity. To address these issues, we propose a diffeomorphic framework that integrates spherical harmonic decomposition of the velocity field with a novel velocity encoding scheme. SPHARM-Reg optimizes harmonic components of the velocity field, enabling joint adjustments for both rigid and non-rigid alignments. Furthermore, the proposed encoding scheme using spherical functions encourages consistent smoothing that preserves the rotation component. In the experiments, we validate SPHARM-Reg on healthy adult datasets. SPHARM-Reg achieves a substantial reduction in warp distortion while maintaining a high level of registration accuracy compared to existing methods. In the clinical analysis, we show that the extent of warp distortion significantly impacts statistical significance.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4732-4742"},"PeriodicalIF":0.0,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144334897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-19 DOI: 10.1109/TMI.2025.3581200
Nuo Tong;Yuanlin Liu;Yueheng Ding;Tao Wang;Lingnan Hou;Mei Shi;Xiaoyi Hu;Shuiping Gou
Maxillofacial cysts pose significant surgical risks due to their proximity to critical anatomical structures, such as blood vessels and nerves. Precise identification of the safe resection margins is essential for complete lesion removal while minimizing damage to surrounding at-risk tissues, and it relies heavily on accurate segmentation in CT images. However, due to the limited space and complex anatomical structures in the maxillofacial region, along with heterogeneous compositions of bone and soft tissues, accurate segmentation is extremely challenging. Thus, a Progressive Edge Perception and Completion Network (PEPC-Net) is presented in this study, which integrates three novel components: 1) a Progressive Edge Perception Branch, which progressively fuses semantic features from multiple resolution levels in a dual-stream manner, enabling the model to handle the varying forms of maxillofacial cysts at different stages; 2) an Edge Information Completion Module, which captures subtle, differentiated edge features from adjacent layers within the encoding blocks, providing more comprehensive edge information for identifying heterogeneous boundaries; and 3) an Edge-Aware Skip Connection that adaptively fuses multi-scale edge features, preserving detailed edge information to facilitate precise identification of the cyst boundaries. Extensive experiments on clinically collected maxillofacial lesion datasets validate the effectiveness of the proposed PEPC-Net, which achieves a DSC of 88.71% and an ASD of 0.489 mm. Its generalizability is further assessed using an external validation set, which includes a more diverse range of maxillofacial cyst cases and images of varying quality. These experiments highlight the superior performance of PEPC-Net in delineating the polymorphic edges of heterogeneous lesions, which is critical for deciding safe resection margins.
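The reported DSC and ASD are standard segmentation metrics. A minimal sketch of their common definitions follows (generic formulas on binary masks and surface point sets, not the authors' evaluation code; the point-set input convention is an assumption).

```python
import numpy as np
from scipy.spatial import cKDTree

def dice_coefficient(pred, gt):
    """Dice similarity coefficient (DSC) between two binary masks of the same shape."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def average_surface_distance(pred_pts, gt_pts):
    """Symmetric average surface distance (ASD) between two boundary point sets,
    given as (N, 3) arrays of surface-voxel coordinates in millimetres."""
    d_pred_to_gt = cKDTree(gt_pts).query(pred_pts)[0]   # nearest-neighbour distances
    d_gt_to_pred = cKDTree(pred_pts).query(gt_pts)[0]
    return 0.5 * (d_pred_to_gt.mean() + d_gt_to_pred.mean())
```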
{"title":"PEPC-Net: Progressive Edge Perception and Completion Network for Precise Identification of Safe Resection Margins in Maxillofacial Cysts","authors":"Nuo Tong;Yuanlin Liu;Yueheng Ding;Tao Wang;Lingnan Hou;Mei Shi;Xiaoyi Hu;Shuiping Gou","doi":"10.1109/TMI.2025.3581200","DOIUrl":"10.1109/TMI.2025.3581200","url":null,"abstract":"Maxillofacial cysts pose significant surgical risks due to their proximity to critical anatomical structures, such as blood vessels and nerves. Precise identification of the safe resection margins is essential for complete lesion removal while minimizing damage to surrounding at-risk tissues, which highly relies on accurate segmentation in CT images. However, due to the limited space and complex anatomical structures in the maxillofacial region, along with heterogeneous compositions of bone and soft tissues, accurate segmentation is extremely challenging. Thus, a Progressive Edge Perception and Completion Network (PEPC-Net) is presented in this study, which integrates three novel components: 1) Progressive Edge Perception Branch, which progressively fuses semantic features from multiple resolution levels in a dual-stream manner, enabling the model to handle the varying forms of maxillofacial cysts at different stages. 2) Edge Information Completion Module, which captures subtle, differentiated edge features from adjacent layers within the encoding blocks, providing more comprehensive edge information for identifying heterogeneous boundaries. 3) Edge-Aware Skip Connection to adaptively fuse multi-scale edge features, preserving detailed edge information, to facilitate precise identification of the cyst boundaries. Extensive experiments on clinically collected maxillofacial lesion datasets validate the effectiveness of the proposed PEPC-Net, achieving a DSC of 88.71% and an ASD of 0.489mm. It’s generalizability is further assessed using an external validation set, which includes more diverse range of maxillofacial cyst cases and images of varying qualities. These experiments highlight the superior performance of PEPC-Net in delineating the polymorphic edges of heterogeneous lesions, which is critical for safe resection margins decision.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4704-4716"},"PeriodicalIF":0.0,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144328530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-18 DOI: 10.1109/TMI.2025.3580713
Sekeun Kim;Pengfei Jin;Sifan Song;Cheng Chen;Yiwei Li;Hui Ren;Xiang Li;Tianming Liu;Quanzheng Li
Echocardiography is the first-line non-invasive cardiac imaging modality, providing rich spatio-temporal information on cardiac anatomy and physiology. Recently, foundation models trained on extensive and diverse datasets have shown strong performance in various downstream tasks. However, translating foundation models into the medical imaging domain remains challenging due to domain differences between medical and natural images and the lack of diverse patient and disease datasets. In this paper, we introduce EchoFM, a general-purpose vision foundation model for echocardiography trained on a large-scale dataset of over 20 million echocardiographic images from 6,500 patients. To enable effective learning of rich spatio-temporal representations from periodic videos, we propose a novel self-supervised learning framework based on a masked autoencoder with a spatio-temporally consistent masking strategy and periodic-driven contrastive learning. The learned cardiac representations can be readily adapted and fine-tuned for a wide range of downstream tasks, serving as a strong and flexible backbone model. We validate EchoFM through experiments across key downstream tasks in the clinical echocardiography workflow, leveraging public and multi-center internal datasets. EchoFM consistently outperforms SOTA methods, demonstrating superior generalization capabilities and flexibility. The code and checkpoints are available at: https://github.com/SekeunKim/EchoFM.git
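Spatio-temporally consistent masking for a video masked autoencoder is often realized as "tube" masking, where the same spatial patches are hidden in every frame. The sketch below shows that general idea; the patch grid and mask ratio are illustrative, and this is not claimed to be EchoFM's exact strategy.

```python
import numpy as np

def tube_mask(num_frames, grid_h, grid_w, mask_ratio=0.75, rng=None):
    """Return a boolean mask of shape (num_frames, grid_h * grid_w) where True
    means 'masked'. The same spatial patches are masked across all frames,
    which keeps the masking pattern temporally consistent."""
    rng = np.random.default_rng() if rng is None else rng
    num_patches = grid_h * grid_w
    num_masked = int(round(mask_ratio * num_patches))
    masked_idx = rng.choice(num_patches, size=num_masked, replace=False)
    frame_mask = np.zeros(num_patches, dtype=bool)
    frame_mask[masked_idx] = True
    return np.tile(frame_mask, (num_frames, 1))     # identical mask repeated per frame

# Example: 16-frame clip, 14x14 patch grid, 75% of patches hidden in every frame
mask = tube_mask(num_frames=16, grid_h=14, grid_w=14, mask_ratio=0.75)
```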
{"title":"EchoFM: Foundation Model for Generalizable Echocardiogram Analysis","authors":"Sekeun Kim;Pengfei Jin;Sifan Song;Cheng Chen;Yiwei Li;Hui Ren;Xiang Li;Tianming Liu;Quanzheng Li","doi":"10.1109/TMI.2025.3580713","DOIUrl":"10.1109/TMI.2025.3580713","url":null,"abstract":"Echocardiography is the first-line non-invasive cardiac imaging modality, providing rich spatio-temporal information on cardiac anatomy and physiology. Recently, foundation model trained on extensive and diverse datasets has shown strong performance in various downstream tasks. However, translating foundation models into the medical imaging domain remains challenging due to domain differences between medical and natural images, the lack of diverse patient and disease datasets. In this paper, we introduce EchoFM, a general-purpose vision foundation model for echocardiography trained on a large-scale dataset of over 20 million echocardiographic images from 6,500 patients. To enable effective learning of rich spatio-temporal representations from periodic videos, we propose a novel self-supervised learning framework based on a masked autoencoder with a spatio-temporal consistent masking strategy and periodic-driven contrastive learning. The learned cardiac representations can be readily adapted and fine-tuned for a wide range of downstream tasks, serving as a strong and flexible backbone model. We validate EchoFM through experiments across key downstream tasks in the clinical echocardiography workflow, leveraging public and multi-center internal datasets. EchoFM consistently outperforms SOTA methods, demonstrating superior generalization capabilities and flexibility. The code and checkpoints are available at: <uri>https://github.com/SekeunKim/EchoFM.git</uri>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 10","pages":"4049-4062"},"PeriodicalIF":0.0,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144319677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Coronary artery disease poses a significant global health challenge, often necessitating percutaneous coronary intervention (PCI) with stent implantation. Assessing stent apposition is crucial for preventing and identifying PCI complications that lead to in-stent restenosis. Here we propose a novel three-dimensional (3D) distance-color-coded assessment (DccA) for PCI stent apposition via deep-learning-based 3D multi-object segmentation in intravascular optical coherence tomography (IV-OCT). Our proposed 3D DccA accurately segments 3D vessel lumens and stents in IV-OCT images using a hybrid-dimensional spatial matching network and dual-layer training with style transfer. It quantifies and maps stent-lumen distances into a 3D color space, achieving a 3D visual assessment of PCI stent apposition. With over 95% segmentation precision for both stent struts and the lumen, together with 3D color visualization, the proposed 3D DccA improves the clinical evaluation of PCI stent deployment and facilitates personalized treatment planning.
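A minimal sketch of the distance-to-color idea described above: compute each stent-strut point's nearest distance to the segmented lumen surface and map it linearly to a color for 3D display. The point-set inputs, distance cap, and two-color ramp are assumptions for illustration, not the paper's color space.

```python
import numpy as np
from scipy.spatial import cKDTree

def distance_color_code(stent_pts, lumen_pts, max_dist_mm=1.0):
    """For each stent-strut point (N, 3), compute the nearest distance to the
    lumen-surface point set (M, 3) and map it linearly to an RGB color
    (blue = well apposed, red = large stent-lumen distance)."""
    dists, _ = cKDTree(lumen_pts).query(stent_pts)        # nearest-neighbour distance, mm
    t = np.clip(dists / max_dist_mm, 0.0, 1.0)[:, None]   # normalize to [0, 1]
    blue, red = np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 0.0])
    colors = (1.0 - t) * blue + t * red                   # (N, 3) RGB per strut point
    return dists, colors
```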
{"title":"3D Distance-Color-Coded Assessment of PCI Stent Apposition via Deep-Learning-Based Three-Dimensional Multi-Object Segmentation","authors":"Xiaoyang Qin;Hao Huang;Shuaichen Lin;Xinhao Zeng;Kaizhi Cao;Renxiong Wu;Yuming Huang;Junqing Yang;Yong Liu;Gang Li;Guangming Ni","doi":"10.1109/TMI.2025.3580619","DOIUrl":"10.1109/TMI.2025.3580619","url":null,"abstract":"Coronary artery disease poses a significant global health challenge, often necessitating percutaneous coronary intervention (PCI) with stent implantation. Assessing stent apposition is crucial for preventing and identifying PCI complications leading to in-stent restenosis. Here we propose a novel three-dimensional (3D) distancecolor-coded assessment (DccA) for PCI stent apposition via deep-learning-based 3D multi-object segmentation in intravascular optical coherence tomography (IV-OCT). Our proposed 3D DccA accurately segments 3D vessel lumens and stents in IV-OCT images, using a hybrid-dimensional spatial matching network and dual-layer training with style transfer. It quantifies and maps stent-lumen distances into a 3D color space, achieving a 3D visual assessment of PCI stent apposition. Achieving over 95% segmentation precision for both stent struts and the lumen and having 3D color visualization, our proposed 3D DccA improves the clinical evaluation of PCI stent deployment and facilitates personalized treatment planning.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4717-4731"},"PeriodicalIF":0.0,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-06-17 DOI: 10.1109/TMI.2025.3580383
Raymond Fang;Pengpeng Zhang;Tingwei Zhang;Zihang Yan;Daniel Kim;Edison Sun;Roman Kuranov;Junghun Kweon;Alex S. Huang;Hao F. Zhang
Imaging complex, non-planar anatomies with optical coherence tomography (OCT) is limited by the optical field of view (FOV) of a single volumetric acquisition. Combining linear mechanical translation with OCT extends the FOV but is inflexible for imaging non-planar anatomies. We report robotic OCT to fill this gap. To address the challenges in volumetric reconstruction associated with robotic movement accuracy being two orders of magnitude worse than the OCT imaging resolution, we developed a volumetric montaging algorithm. To test the robotic OCT, we imaged the entire circumferential aqueous humor outflow pathway, whose imaging has the potential to customize glaucoma surgeries but is typically constrained by the FOV in mice in vivo. We acquired volumetric OCT data at different robotic poses and reconstructed the entire anterior segment of the eye. From the segmented Schlemm’s canal volume, we showed its circumferentially heterogeneous morphology; we also revealed a segmental nature in the circumferential distribution of collector channels, with spatial features as small as a few micrometers.
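Point-cloud-based montaging ultimately relies on estimating rigid transforms between overlapping volumes. Below is a minimal sketch of the closed-form Kabsch/SVD least-squares rigid alignment between corresponding point sets, shown as a generic building block rather than the authors' montaging algorithm; the example correspondences and noise level are assumptions.

```python
import numpy as np

def rigid_align(src, dst):
    """Closed-form (Kabsch/SVD) least-squares rigid transform mapping src -> dst.

    src, dst : (N, 3) arrays of corresponding points from two overlapping volumes.
    Returns (R, t) such that dst is approximately src @ R.T + t.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)                           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])   # avoid reflections
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t

# Example: recover a known rotation/translation from noisy correspondences
rng = np.random.default_rng(1)
src = rng.standard_normal((200, 3))
theta = np.deg2rad(10.0)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
dst = src @ R_true.T + np.array([0.5, -0.2, 1.0]) + 0.01 * rng.standard_normal((200, 3))
R_est, t_est = rigid_align(src, dst)
```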
{"title":"Robotic Optical Coherence Tomography With Expanded Three-Dimensional Field-of-View Using Point-Cloud-Based Volumetric Montaging","authors":"Raymond Fang;Pengpeng Zhang;Tingwei Zhang;Zihang Yan;Daniel Kim;Edison Sun;Roman Kuranov;Junghun Kweon;Alex S. Huang;Hao F. Zhang","doi":"10.1109/TMI.2025.3580383","DOIUrl":"10.1109/TMI.2025.3580383","url":null,"abstract":"Imaging complex, non-planar anatomies with optical coherence tomography (OCT) is limited by the optical field of view (FOV) in a single volumetric acquisition. Combining linear mechanical translation with OCT extends the FOV but suffers from inflexibility in imaging non-planar anatomies. We report the robotic OCT to fill this gap. To address challenges in volumetric reconstruction associated with the robotic movement accuracy being two orders of magnitudes worse than OCT imaging resolution, we developed a volumetric montaging algorithm. To test the robotic OCT, we imaged the entire circumferential aqueous humor outflow pathway, whose imaging has the potential to customize glaucoma surgeries but is typically constrained by the FOV in mice in vivo. We acquired volumetric OCT data at different robotic poses and reconstructed the entire anterior segment of the eye. From the segmented Schlemm’s canal volume, we showed its circumferentially heterogeneous morphology; we also revealed a segmental nature in the circumferential distribution of collector channels with spatial features as small as a few micrometers.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4639-4651"},"PeriodicalIF":0.0,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical Visual Question Answering (Med-VQA) aims to answer questions regarding the content of medical images, crucial for enhancing diagnostics and education in healthcare. However, progress in this field is hindered by data scarcity due to the resource-intensive nature of medical data annotation. While existing Med-VQA approaches often rely on pre-training to mitigate this issue, bridging the semantic gap between pre-trained models and specific tasks remains a significant challenge. This paper presents the Dynamic Semantic-Adaptive Prompting (DSAP) framework, leveraging prompt learning to enhance model performance in Med-VQA. To this end, we introduce two prompting strategies: Semantic Alignment Prompting (SAP) and Dynamic Question-Aware Prompting (DQAP). SAP prompts multi-modal inputs during fine-tuning, reducing the semantic gap by aligning model outputs with domain-specific contexts. Simultaneously, DQAP enhances answer selection by leveraging grammatical relationships between questions and answers, thereby improving accuracy and relevance. The DSAP framework was pre-trained on three datasets—ROCO, MedICaT, and MIMIC-CXR—and comprehensively evaluated against 15 existing Med-VQA models on three public datasets: VQA-RAD, SLAKE, and PathVQA. Our results demonstrate a substantial performance improvement, with DSAP achieving a 1.9% enhancement in average results across benchmarks. These findings underscore DSAP’s effectiveness in addressing critical challenges in Med-VQA and suggest promising avenues for future developments in medical AI.
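Prompt learning of the kind discussed above is commonly realized by prepending a small set of learnable embeddings to a frozen backbone's input tokens. The sketch below shows that general pattern; the module name, embedding dimension, prompt count, and the assumption that the wrapped encoder accepts pre-embedded (B, L, D) sequences are illustrative, not the DSAP implementation.

```python
import torch
import torch.nn as nn

class PromptedEncoder(nn.Module):
    """Wrap a frozen transformer encoder and prepend learnable prompt tokens."""

    def __init__(self, encoder, embed_dim=768, num_prompts=8):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():           # keep the backbone frozen
            p.requires_grad = False
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, embed_dim) * 0.02)

    def forward(self, token_embeddings):
        # token_embeddings: (B, L, D) embeddings of image patches or question tokens;
        # the wrapped encoder is assumed to accept such embedded sequences directly.
        prompts = self.prompts.expand(token_embeddings.size(0), -1, -1)
        return self.encoder(torch.cat([prompts, token_embeddings], dim=1))
```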
{"title":"Bridging the Semantic Gap in Medical Visual Question Answering With Prompt Learning","authors":"Zilin Lu;Qingjie Zeng;Mengkang Lu;Geng Chen;Yong Xia","doi":"10.1109/TMI.2025.3580561","DOIUrl":"10.1109/TMI.2025.3580561","url":null,"abstract":"Medical Visual Question Answering (Med-VQA) aims to answer questions regarding the content of medical images, crucial for enhancing diagnostics and education in healthcare. However, progress in this field is hindered by data scarcity due to the resource-intensive nature of medical data annotation. While existing Med-VQA approaches often rely on pre-training to mitigate this issue, bridging the semantic gap between pre-trained models and specific tasks remains a significant challenge. This paper presents the Dynamic Semantic-Adaptive Prompting (DSAP) framework, leveraging prompt learning to enhance model performance in Med-VQA. To this end, we introduce two prompting strategies: Semantic Alignment Prompting (SAP) and Dynamic Question-Aware Prompting (DQAP). SAP prompts multi-modal inputs during fine-tuning, reducing the semantic gap by aligning model outputs with domain-specific contexts. Simultaneously, DQAP enhances answer selection by leveraging grammatical relationships between questions and answers, thereby improving accuracy and relevance. The DSAP framework was pre-trained on three datasets—ROCO, MedICaT, and MIMIC-CXR—and comprehensively evaluated against 15 existing Med-VQA models on three public datasets: VQA-RAD, SLAKE, and PathVQA. Our results demonstrate a substantial performance improvement, with DSAP achieving a 1.9% enhancement in average results across benchmarks. These findings underscore DSAP’s effectiveness in addressing critical challenges in Med-VQA and suggest promising avenues for future developments in medical AI.","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"44 11","pages":"4605-4616"},"PeriodicalIF":0.0,"publicationDate":"2025-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}