
Latest publications in IEEE transactions on medical imaging

IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training.
Pub Date : 2024-08-26 DOI: 10.1109/TMI.2024.3449690
Che Liu, Sibo Cheng, Miaojing Shi, Anand Shah, Wenjia Bai, Rossella Arcucci

In the field of medical Vision-Language Pretraining (VLP), significant efforts have been devoted to deriving text and image features from both clinical reports and associated medical images. However, most existing methods may have overlooked the opportunity to leverage the inherent hierarchical structure of clinical reports, which are generally split into 'findings' for descriptive content and 'impressions' for conclusive observations. Instead of utilizing this rich, structured format, current medical VLP approaches often simplify the report into either a unified entity or fragmented tokens. In this work, we propose a novel clinical prior guided VLP framework named IMITATE to learn the structure information from medical reports with hierarchical vision-language alignment. The framework derives multi-level visual features from chest X-ray (CXR) images and separately aligns these features with the descriptive and the conclusive text encoded in the hierarchical medical report. Furthermore, a new clinical-informed contrastive loss is introduced for cross-modal learning, which accounts for clinical prior knowledge when formulating sample correlations in contrastive learning. The proposed model, IMITATE, outperforms baseline VLP methods across six different datasets spanning five medical imaging downstream tasks. Comprehensive experimental results highlight the advantages of integrating the hierarchical structure of medical reports for vision-language alignment.
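
As an illustrative reading of the hierarchical alignment described above (not the authors' implementation), the sketch below aligns pooled multi-level visual features with separately encoded 'findings' and 'impression' text embeddings using a standard InfoNCE-style contrastive loss; the tensor shapes, pooling, and temperature value are assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(img_emb, txt_emb, temperature=0.07):
    """Standard InfoNCE loss between L2-normalized image and text embeddings.
    img_emb, txt_emb: (batch, dim); matching rows are positive pairs."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(img_emb.size(0), device=img_emb.device)
    # symmetric image-to-text and text-to-image cross-entropy
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

def hierarchical_alignment_loss(low_level_feat, high_level_feat, findings_emb, impression_emb):
    """Align low-level visual features with descriptive ('findings') text and
    high-level visual features with conclusive ('impression') text."""
    return info_nce(low_level_feat, findings_emb) + info_nce(high_level_feat, impression_emb)

# toy usage with random features (batch of 8, 256-dim embeddings)
b, d = 8, 256
loss = hierarchical_alignment_loss(torch.randn(b, d), torch.randn(b, d),
                                   torch.randn(b, d), torch.randn(b, d))
print(loss.item())
```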

{"title":"IMITATE: Clinical Prior Guided Hierarchical Vision-Language Pre-training.","authors":"Che Liu, Sibo Cheng, Miaojing Shi, Anand Shah, Wenjia Bai, Rossella Arcucci","doi":"10.1109/TMI.2024.3449690","DOIUrl":"https://doi.org/10.1109/TMI.2024.3449690","url":null,"abstract":"<p><p>In the field of medical Vision-Language Pretraining (VLP), significant efforts have been devoted to deriving text and image features from both clinical reports and associated medical images. However, most existing methods may have overlooked the opportunity in leveraging the inherent hierarchical structure of clinical reports, which are generally split into 'findings' for descriptive content and 'impressions' for conclusive observation. Instead of utilizing this rich, structured format, current medical VLP approaches often simplify the report into either a unified entity or fragmented tokens. In this work, we propose a novel clinical prior guided VLP framework named IMITATE to learn the structure information from medical reports with hierarchical vision-language alignment. The framework derives multi-level visual features from the chest X-ray (CXR) images and separately aligns these features with the descriptive and the conclusive text encoded in the hierarchical medical report. Furthermore, a new clinical-informed contrastive loss is introduced for cross-modal learning, which accounts for clinical prior knowledge in formulating sample correlations in contrastive learning. The proposed model, IMITATE, outperforms baseline VLP methods across six different datasets, spanning five medical imaging downstream tasks. Comprehensive experimental results highlight the advantages of integrating the hierarchical structure of medical reports for vision-language alignment.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142074850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Generative Adversarial Network with Robust Discriminator Through Multi-Task Learning for Low-Dose CT Denoising.
Pub Date : 2024-08-26 DOI: 10.1109/TMI.2024.3449647
Sunggu Kyung, Jongjun Won, Seongyong Pak, Sunwoo Kim, Sangyoon Lee, Kanggil Park, Gil-Sun Hong, Namkug Kim

Reducing the dose of radiation in computed tomography (CT) is vital to decreasing secondary cancer risk. However, the use of low-dose CT (LDCT) images is accompanied by increased noise that can negatively impact diagnoses. Although numerous deep learning algorithms have been developed for LDCT denoising, several challenges persist, including the visual incongruence experienced by radiologists, unsatisfactory performance across various metrics, and insufficient exploration of the networks' robustness in other CT domains. To address such issues, this study proposes three novel contributions. First, we propose a generative adversarial network (GAN) with a robust discriminator built through multi-task learning that simultaneously performs three vision tasks: restoration, image-level decisions, and pixel-level decisions. The more tasks the discriminator performs, the better the denoising performance of the generator, because multi-task learning enables the discriminator to provide more meaningful feedback to the generator. Second, two regulatory mechanisms, restoration consistency (RC) and non-difference suppression (NDS), are introduced to improve the discriminator's representation capabilities. These mechanisms eliminate irrelevant regions and compare the discriminator's results on the input and the restoration, thus facilitating effective GAN training. Lastly, we incorporate residual fast Fourier transform with convolution (Res-FFT-Conv) blocks into the generator to utilize both frequency and spatial representations. This approach provides mixed receptive fields by using spatial (or local), spectral (or global), and residual connections. Our model was evaluated using various pixel- and feature-space metrics in two denoising tasks. Additionally, we conducted visual scoring with radiologists. The results indicate superior performance in both quantitative and qualitative measures compared to state-of-the-art denoising techniques.
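
A minimal sketch of the Res-FFT-Conv idea mentioned above: a residual block that combines a spatial convolution path with a frequency-domain path computed via the 2-D FFT. The layer ordering, channel handling, and activation choices are assumptions, not the published block definition.

```python
import torch
import torch.nn as nn

class ResFFTConvBlock(nn.Module):
    """Residual block with a spatial conv branch and a frequency-domain branch.
    The frequency branch applies 1x1 convolutions to the real/imaginary parts of
    the 2-D FFT of the feature map, then transforms back with the inverse FFT."""
    def __init__(self, channels):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1))
        # operates on concatenated real and imaginary parts (2C channels)
        self.freq = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, 1), nn.ReLU(inplace=True),
            nn.Conv2d(2 * channels, 2 * channels, 1))

    def forward(self, x):
        s = self.spatial(x)                                    # spatial (local) path
        f = torch.fft.rfft2(x, norm="ortho")                   # complex, (B, C, H, W//2+1)
        f = torch.cat([f.real, f.imag], dim=1)                 # (B, 2C, H, W//2+1)
        f = self.freq(f)
        real, imag = f.chunk(2, dim=1)
        f = torch.fft.irfft2(torch.complex(real, imag), s=x.shape[-2:], norm="ortho")
        return x + s + f                                       # residual connection

# toy usage
y = ResFFTConvBlock(16)(torch.randn(2, 16, 64, 64))
print(y.shape)  # torch.Size([2, 16, 64, 64])
```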

{"title":"Generative Adversarial Network with Robust Discriminator Through Multi-Task Learning for Low-Dose CT Denoising.","authors":"Sunggu Kyung, Jongjun Won, Seongyong Pak, Sunwoo Kim, Sangyoon Lee, Kanggil Park, Gil-Sun Hong, Namkug Kim","doi":"10.1109/TMI.2024.3449647","DOIUrl":"https://doi.org/10.1109/TMI.2024.3449647","url":null,"abstract":"<p><p>Reducing the dose of radiation in computed tomography (CT) is vital to decreasing secondary cancer risk. However, the use of low-dose CT (LDCT) images is accompanied by increased noise that can negatively impact diagnoses. Although numerous deep learning algorithms have been developed for LDCT denoising, several challenges persist, including the visual incongruence experienced by radiologists, unsatisfactory performances across various metrics, and insufficient exploration of the networks' robustness in other CT domains. To address such issues, this study proposes three novel accretions. First, we propose a generative adversarial network (GAN) with a robust discriminator through multi-task learning that simultaneously performs three vision tasks: restoration, image-level, and pixel-level decisions. The more multi-tasks that are performed, the better the denoising performance of the generator, which means multi-task learning enables the discriminator to provide more meaningful feedback to the generator. Second, two regulatory mechanisms, restoration consistency (RC) and non-difference suppression (NDS), are introduced to improve the discriminator's representation capabilities. These mechanisms eliminate irrelevant regions and compare the discriminator's results from the input and restoration, thus facilitating effective GAN training. Lastly, we incorporate residual fast Fourier transforms with convolution (Res-FFT-Conv) blocks into the generator to utilize both frequency and spatial representations. This approach provides mixed receptive fields by using spatial (or local), spectral (or global), and residual connections. Our model was evaluated using various pixel- and feature-space metrics in two denoising tasks. Additionally, we conducted visual scoring with radiologists. The results indicate superior performance in both quantitative and qualitative measures compared to state-of-the-art denoising techniques.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142074849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
BCNet: Bronchus Classification via Structure Guided Representation Learning.
Pub Date : 2024-08-23 DOI: 10.1109/TMI.2024.3448468
Wenhao Huang, Haifan Gong, Huan Zhang, Yu Wang, Xiang Wan, Guanbin Li, Haofeng Li, Hong Shen

CT-based bronchial tree analysis is a key step for the diagnosis of lung and airway diseases. However, the topology of bronchial trees varies across individuals, which presents a challenge to automatic bronchus classification. To solve this issue, we propose the Bronchus Classification Network (BCNet), a structure-guided framework that exploits segment-level topological information using point clouds to learn voxel-level features. BCNet has two branches: a Point-Voxel Graph Neural Network (PV-GNN) for segment classification, and a Convolutional Neural Network (CNN) for voxel labeling. The two branches are trained simultaneously to learn topology-aware features for their shared backbone, while only the CNN branch needs to be run at inference. Therefore, BCNet maintains the same inference efficiency as its CNN baseline. Experimental results show that BCNet significantly exceeds the state-of-the-art methods by over 8.0% in F1-score for bronchus classification. Furthermore, we contribute BronAtlas: an open-access benchmark for bronchus imaging analysis with high-quality voxel-wise annotations of both anatomical and abnormal bronchial segments. The benchmark is available at link1.
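
To make the two-branch design concrete, here is a minimal sketch of a shared backbone trained with both a voxel-labeling head and an auxiliary segment-classification head, where only the voxel head is run at inference. The point-voxel GNN branch of BCNet is replaced by a simple pooled classifier purely for illustration, and all channel counts and the class count are assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchSegmenter(nn.Module):
    """Shared 3-D CNN backbone with two heads: a voxel-labeling head used at
    inference, and an auxiliary classification head used only during training
    to inject segment-level supervision into the shared features."""
    def __init__(self, in_ch=1, feat_ch=16, num_classes=19):  # class count is an assumption
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv3d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True))
        self.voxel_head = nn.Conv3d(feat_ch, num_classes, 1)   # dense voxel labels
        self.segment_head = nn.Linear(feat_ch, num_classes)    # per-segment labels (stand-in for the GNN branch)

    def forward(self, volume, run_aux=False):
        feat = self.backbone(volume)
        voxel_logits = self.voxel_head(feat)
        if not run_aux:                       # inference: CNN branch only
            return voxel_logits
        # training: global pooling stands in for per-segment pooling
        seg_logits = self.segment_head(feat.mean(dim=(2, 3, 4)))
        return voxel_logits, seg_logits

model = TwoBranchSegmenter()
voxel_logits = model(torch.randn(1, 1, 32, 32, 32))          # inference path
print(voxel_logits.shape)  # torch.Size([1, 19, 32, 32, 32])
```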

{"title":"BCNet: Bronchus Classification via Structure Guided Representation Learning.","authors":"Wenhao Huang, Haifan Gong, Huan Zhang, Yu Wang, Xiang Wan, Guanbin Li, Haofeng Li, Hong Shen","doi":"10.1109/TMI.2024.3448468","DOIUrl":"https://doi.org/10.1109/TMI.2024.3448468","url":null,"abstract":"<p><p>CT-based bronchial tree analysis is a key step for the diagnosis of lung and airway diseases. However, the topology of bronchial trees varies across individuals, which presents a challenge to the automatic bronchus classification. To solve this issue, we propose the Bronchus Classification Network (BCNet), a structure-guided framework that exploits the segment-level topological information using point clouds to learn the voxel-level features. BCNet has two branches, a Point-Voxel Graph Neural Network (PV-GNN) for segment classification, and a Convolutional Neural Network (CNN) for voxel labeling. The two branches are simultaneously trained to learn topology-aware features for their shared backbone while it is feasible to run only the CNN branch for the inference. Therefore, BCNet maintains the same inference efficiency as its CNN baseline. Experimental results show that BCNet significantly exceeds the state-of-the-art methods by over 8.0% both on F1-score for classifying bronchus. Furthermore, we contribute BronAtlas: an open-access benchmark of bronchus imaging analysis with high-quality voxel-wise annotations of both anatomical and abnormal bronchial segments. The benchmark is available at link<sup>1</sup>.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142044270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Self-Supervised Representation Distribution Learning for Reliable Data Augmentation in Histopathology WSI Classification.
Pub Date : 2024-08-22 DOI: 10.1109/TMI.2024.3447672
Kunming Tang, Zhiguo Jiang, Kun Wu, Jun Shi, Fengying Xie, Wei Wang, Haibo Wu, Yushan Zheng

Multiple instance learning (MIL) based whole slide image (WSI) classification is often carried out on the representations of patches extracted from the WSI with a pre-trained patch encoder. The performance of classification relies on both patch-level representation learning and MIL classifier training. Most MIL methods utilize a frozen model pre-trained on ImageNet, or a model trained with self-supervised learning on a histopathology image dataset, to extract patch image representations, and then keep these representations fixed during MIL classifier training for efficiency. However, the invariance of the representations cannot meet the diversity requirement for training a robust MIL classifier, which has significantly limited the performance of WSI classification. In this paper, we propose a Self-Supervised Representation Distribution Learning framework (SSRDL) for patch-level representation learning with an online representation sampling strategy (ORS) for both patch feature extraction and WSI-level data augmentation. The proposed method was evaluated on three datasets under three MIL frameworks. The experimental results demonstrate that the proposed method achieves the best performance in histopathology image representation learning and data augmentation and outperforms state-of-the-art methods under different WSI classification frameworks. The code is available at https://github.com/lazytkm/SSRDL.
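
One common way to realize representation-distribution sampling of this kind is the reparameterization trick; the sketch below draws several augmented variants of each patch representation from a per-patch Gaussian. The mean/log-variance parameterization and feature sizes are assumptions and not necessarily the paper's ORS strategy.

```python
import torch

def sample_augmented_representations(mu, log_var, num_samples=4):
    """Draw augmented patch representations from a learned per-patch Gaussian
    via the reparameterization trick: z = mu + sigma * eps.
    mu, log_var: (num_patches, dim) predicted by the representation encoder."""
    eps = torch.randn(num_samples, *mu.shape, device=mu.device)
    return mu.unsqueeze(0) + torch.exp(0.5 * log_var).unsqueeze(0) * eps  # (num_samples, num_patches, dim)

# toy usage: 100 patches with 384-dim features, 4 sampled variants each
mu, log_var = torch.randn(100, 384), torch.zeros(100, 384)
aug = sample_augmented_representations(mu, log_var)
print(aug.shape)  # torch.Size([4, 100, 384])
```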

{"title":"Self-Supervised Representation Distribution Learning for Reliable Data Augmentation in Histopathology WSI Classification.","authors":"Kunming Tang, Zhiguo Jiang, Kun Wu, Jun Shi, Fengying Xie, Wei Wang, Haibo Wu, Yushan Zheng","doi":"10.1109/TMI.2024.3447672","DOIUrl":"https://doi.org/10.1109/TMI.2024.3447672","url":null,"abstract":"<p><p>Multiple instance learning (MIL) based whole slide image (WSI) classification is often carried out on the representations of patches extracted from WSI with a pre-trained patch encoder. The performance of classification relies on both patch-level representation learning and MIL classifier training. Most MIL methods utilize a frozen model pre-trained on ImageNet or a model trained with self-supervised learning on histopathology image dataset to extract patch image representations and then fix these representations in the training of the MIL classifiers for efficiency consideration. However, the invariance of representations cannot meet the diversity requirement for training a robust MIL classifier, which has significantly limited the performance of the WSI classification. In this paper, we propose a Self-Supervised Representation Distribution Learning framework (SSRDL) for patch-level representation learning with an online representation sampling strategy (ORS) for both patch feature extraction and WSI-level data augmentation. The proposed method was evaluated on three datasets under three MIL frameworks. The experimental results have demonstrated that the proposed method achieves the best performance in histopathology image representation learning and data augmentation and outperforms state-of-the-art methods under different WSI classification frameworks. The code is available at https://github.com/lazytkm/SSRDL.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-Scale Spatial-Temporal Attention Networks for Functional Connectome Classification.
Pub Date : 2024-08-22 DOI: 10.1109/TMI.2024.3448214
Youyong Kong, Xiaotong Zhang, Wenhan Wang, Yue Zhou, Yueying Li, Yonggui Yuan

Many neuropsychiatric disorders are considered to be associated with abnormalities in the functional connectivity networks of the brain. Research on the classification of functional connectivity can therefore provide new perspectives for understanding the pathology of these disorders and contribute to early diagnosis and treatment. Functional connectivity changes dynamically over time; however, the majority of existing methods are unable to jointly reveal its spatial topology and time-varying characteristics. Furthermore, despite the efforts of a few spatial-temporal studies to capture rich information across different spatial scales, they have not delved into the temporal characteristics among different scales. To address the above issues, we propose novel Multi-Scale Spatial-Temporal Attention Networks (MSSTAN) to exploit the multi-scale spatial-temporal information provided by the functional connectome for classification. To fully extract the spatial features of brain regions, we propose a Topology Enhanced Graph Transformer module that guides the attention calculations in the learning of spatial features by incorporating topology priors. A Multi-Scale Pooling Strategy is introduced to obtain representations of the brain connectome at various scales. Considering the temporal dynamics of the dynamic functional connectome, we employ Locality Sensitive Hashing attention to further capture long-term dependencies in time dynamics across multiple scales and to reduce the computational complexity of the original attention mechanism. Experiments on three brain fMRI datasets of MDD and ASD demonstrate the superiority of our proposed approach. In addition, benefiting from the attention mechanism in the Transformer, our results are interpretable, which can contribute to the discovery of biomarkers. The code is available at https://github.com/LIST-KONG/MSSTAN.
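
A minimal sketch of how a topology prior can guide attention, in the spirit of the topology-enhanced attention described above: the prior is added as a bias to the attention logits before the softmax. The exact form of the prior, the single-head formulation, and the node/feature sizes are assumptions rather than the paper's module.

```python
import torch
import torch.nn.functional as F

def topology_biased_attention(q, k, v, topology_prior):
    """Scaled dot-product attention whose logits are biased by a topology prior.
    q, k, v: (num_nodes, dim); topology_prior: (num_nodes, num_nodes), e.g. a
    (log-)adjacency or connectivity-strength matrix between brain regions."""
    d = q.size(-1)
    logits = q @ k.t() / d ** 0.5 + topology_prior       # inject the structural prior
    attn = F.softmax(logits, dim=-1)
    return attn @ v

# toy usage: 90 brain regions, 64-dim node features, random symmetric prior
n, d = 90, 64
prior = torch.randn(n, n)
out = topology_biased_attention(torch.randn(n, d), torch.randn(n, d), torch.randn(n, d),
                                (prior + prior.t()) / 2)
print(out.shape)  # torch.Size([90, 64])
```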

{"title":"Multi-Scale Spatial-Temporal Attention Networks for Functional Connectome Classification.","authors":"Youyong Kong, Xiaotong Zhang, Wenhan Wang, Yue Zhou, Yueying Li, Yonggui Yuan","doi":"10.1109/TMI.2024.3448214","DOIUrl":"https://doi.org/10.1109/TMI.2024.3448214","url":null,"abstract":"<p><p>Many neuropsychiatric disorders are considered to be associated with abnormalities in the functional connectivity networks of the brain. The research on the classification of functional connectivity can therefore provide new perspectives for understanding the pathology of disorders and contribute to early diagnosis and treatment. Functional connectivity exhibits a nature of dynamically changing over time, however, the majority of existing methods are unable to collectively reveal the spatial topology and time-varying characteristics. Furthermore, despite the efforts of limited spatial-temporal studies to capture rich information across different spatial scales, they have not delved into the temporal characteristics among different scales. To address above issues, we propose a novel Multi-Scale Spatial-Temporal Attention Networks (MSSTAN) to exploit the multi-scale spatial-temporal information provided by functional connectome for classification. To fully extract spatial features of brain regions, we propose a Topology Enhanced Graph Transformer module to guide the attention calculations in the learning of spatial features by incorporating topology priors. A Multi-Scale Pooling Strategy is introduced to obtain representations of brain connectome at various scales. Considering the temporal dynamic characteristics between dynamic functional connectome, we employ Locality Sensitive Hashing attention to further capture long-term dependencies in time dynamics across multiple scales and reduce the computational complexity of the original attention mechanism. Experiments on three brain fMRI datasets of MDD and ASD demonstrate the superiority of our proposed approach. In addition, benefiting from the attention mechanism in Transformer, our results are interpretable, which can contribute to the discovery of biomarkers. The code is available at https://github.com/LIST-KONG/MSSTAN.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142037961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image Segmentation.
Pub Date : 2024-08-21 DOI: 10.1109/TMI.2024.3447071
Zhongyu Chen, Yun Bian, Erwei Shen, Ligang Fan, Weifang Zhu, Fei Shi, Chengwei Shao, Xinjian Chen, Dehui Xiang

CT and MR are currently the most common imaging techniques for pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT and MR images can provide significant help in the diagnosis and treatment of pancreatic cancer. Traditional supervised segmentation methods require a large amount of labeled CT and MR training data, which is usually time-consuming and laborious to obtain. Meanwhile, due to domain shift, traditional segmentation networks are difficult to deploy on datasets from different imaging modalities. Cross-domain segmentation can utilize labeled source-domain data to assist unlabeled target domains in solving the above problems. In this paper, a cross-domain pancreas segmentation algorithm is proposed based on Moment-Consistent Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN). MC-CCycleGAN is a style transfer network in which the encoder of its generator is used to extract features from real images and style-transferred images, constrain feature extraction through a contrastive loss, and fully extract the structural features of input images during style transfer while eliminating redundant style features. Multi-order central moments of the pancreas are proposed to describe its anatomy in high dimensions, and a contrastive loss is also proposed to constrain moment consistency, so as to maintain the consistency of pancreatic structure and shape before and after style transfer. A multi-teacher knowledge distillation framework is proposed to transfer the knowledge from multiple teachers to a single student, so as to improve the robustness and performance of the student network. The experimental results have demonstrated the superiority of our framework over state-of-the-art domain adaptation methods.
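
For intuition, the sketch below computes multi-order central moments of a 2-D pancreas mask and penalizes their change before and after style transfer. The paper formulates this constraint as a contrastive loss, whereas the sketch uses a plain L1 consistency penalty; the coordinate normalization and maximum order are assumptions.

```python
import torch
import torch.nn.functional as F

def central_moments(mask, max_order=3):
    """Multi-order central moments of a soft 2-D mask, describing the shape of
    the segmented organ. Returns a flat vector of E[(x-mx)^p (y-my)^q] for
    2 <= p + q <= max_order, with coordinates normalized to [0, 1]."""
    h, w = mask.shape
    ys, xs = torch.meshgrid(torch.linspace(0, 1, h), torch.linspace(0, 1, w), indexing="ij")
    weight = mask / (mask.sum() + 1e-8)
    mx, my = (weight * xs).sum(), (weight * ys).sum()
    moments = []
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            if p + q >= 2:
                moments.append((weight * (xs - mx) ** p * (ys - my) ** q).sum())
    return torch.stack(moments)

def moment_consistency_loss(mask_before, mask_after, max_order=3):
    """Penalize changes of the shape moments before and after style transfer."""
    return F.l1_loss(central_moments(mask_before, max_order),
                     central_moments(mask_after, max_order))

m = (torch.rand(128, 128) > 0.7).float()
print(moment_consistency_loss(m, m).item())  # 0.0 for identical masks
```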

{"title":"Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image Segmentation.","authors":"Zhongyu Chen, Yun Bian, Erwei Shen, Ligang Fan, Weifang Zhu, Fei Shi, Chengwei Shao, Xinjian Chen, Dehui Xiang","doi":"10.1109/TMI.2024.3447071","DOIUrl":"https://doi.org/10.1109/TMI.2024.3447071","url":null,"abstract":"<p><p>CT and MR are currently the most common imaging techniques for pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT and MR images can provide significant help in the diagnosis and treatment of pancreatic cancer. Traditional supervised segmentation methods require a large number of labeled CT and MR training data, which is usually time-consuming and laborious. Meanwhile, due to domain shift, traditional segmentation networks are difficult to be deployed on different imaging modality datasets. Cross-domain segmentation can utilize labeled source domain data to assist unlabeled target domains in solving the above problems. In this paper, a cross-domain pancreas segmentation algorithm is proposed based on Moment-Consistent Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN). MC-CCycleGAN is a style transfer network, in which the encoder of its generator is used to extract features from real images and style transfer images, constrain feature extraction through a contrastive loss, and fully extract structural features of input images during style transfer while eliminate redundant style features. The multi-order central moments of the pancreas are proposed to describe its anatomy in high dimensions and a contrastive loss is also proposed to constrain the moment consistency, so as to maintain consistency of the pancreatic structure and shape before and after style transfer. Multi-teacher knowledge distillation framework is proposed to transfer the knowledge from multiple teachers to a single student, so as to improve the robustness and performance of the student network. The experimental results have demonstrated the superiority of our framework over state-of-the-art domain adaptation methods.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142019954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training.
Pub Date : 2024-08-21 DOI: 10.1109/TMI.2024.3447214
Xingyue Wei, Lin Ge, Lijie Huang, Jianwen Luo, Yan Xu

Histological image registration is a fundamental task in histological image analysis. It is challenging because of the substantial appearance differences caused by multiple staining. Keypoint correspondences, i.e., matched keypoint pairs, have been introduced to guide unsupervised deep learning (DL) based registration methods in handling such a registration task. This paper proposes an iterative keypoint correspondence-guided (IKCG) unsupervised network for non-rigid histological image registration. Fixed deep features and learnable deep features are introduced as keypoint descriptors to automatically establish keypoint correspondences, the distance between which is used as a loss function to train the registration network. Fixed deep features extracted from DL networks pre-trained on natural image datasets are more discriminative than handcrafted ones, benefiting from the deep and hierarchical nature of DL networks. The intermediate layer outputs of the registration networks trained on histological image datasets are extracted as learnable deep features, which reveal information unique to histological images. An iterative training strategy is adopted to train the registration network and optimize the learnable deep features jointly. Benefiting from the excellent matching ability of the learnable deep features optimized with the iterative training strategy, the proposed method can solve the large local non-rigid displacement problem, an inevitable issue usually caused by mishandling, such as tears introduced when preparing tissue slices. The proposed method is evaluated on the Automatic Non-rigid Histology Image Registration (ANHIR) website and the AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) website. It ranked 1st on both websites as of August 6th, 2024.
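
A minimal sketch of keypoint correspondence construction from deep descriptors, assuming mutual-nearest-neighbour matching on cosine similarity; the descriptor source, matching rule, and coordinate ranges are illustrative assumptions, not the paper's exact procedure. The mean distance between matched pairs (after warping the moving points with the predicted deformation) can then serve as the training loss mentioned above.

```python
import torch
import torch.nn.functional as F

def match_keypoints(desc_fixed, desc_moving, kpts_fixed, kpts_moving):
    """Mutual-nearest-neighbour matching of deep keypoint descriptors.
    desc_*: (N, D) L2-normalized descriptors; kpts_*: (N, 2) pixel coordinates.
    Returns matched coordinate pairs from the fixed and moving images."""
    sim = desc_fixed @ desc_moving.t()                      # cosine similarity matrix
    nn_f2m = sim.argmax(dim=1)                              # fixed -> moving nearest neighbour
    nn_m2f = sim.argmax(dim=0)                              # moving -> fixed nearest neighbour
    idx_f = torch.arange(desc_fixed.size(0))
    mutual = nn_m2f[nn_f2m] == idx_f                        # keep mutual matches only
    return kpts_fixed[mutual], kpts_moving[nn_f2m[mutual]]

# toy usage: 200 keypoints with 128-dim descriptors in 512x512 images
df = F.normalize(torch.randn(200, 128), dim=1)
dm = F.normalize(torch.randn(200, 128), dim=1)
pf, pm = match_keypoints(df, dm, torch.rand(200, 2) * 512, torch.rand(200, 2) * 512)
print(pf.shape, pm.shape)
```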

{"title":"Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training.","authors":"Xingyue Wei, Lin Ge, Lijie Huang, Jianwen Luo, Yan Xu","doi":"10.1109/TMI.2024.3447214","DOIUrl":"https://doi.org/10.1109/TMI.2024.3447214","url":null,"abstract":"<p><p>Histological image registration is a fundamental task in histological image analysis. It is challenging because of substantial appearance differences due to multiple staining. Keypoint correspondences, i.e., matched keypoint pairs, have been introduced to guide unsupervised deep learning (DL) based registration methods to handle such a registration task. This paper proposes an iterative keypoint correspondence-guided (IKCG) unsupervised network for non-rigid histological image registration. Fixed deep features and learnable deep features are introduced as keypoint descriptors to automatically establish keypoint correspondences, the distance between which is used as a loss function to train the registration network. Fixed deep features extracted from DL networks that are pre-trained on natural image datasets are more discriminative than handcrafted ones, benefiting from the deep and hierarchical nature of DL networks. The intermediate layer outputs of the registration networks trained on histological image datasets are extracted as learnable deep features, which reveal unique information for histological images. An iterative training strategy is adopted to train the registration network and optimize learnable deep features jointly. Benefiting from the excellent matching ability of learnable deep features optimized with the iterative training strategy, the proposed method can solve the local non-rigid large displacement problem, an inevitable problem usually caused by misoperation, such as tears in producing tissue slices. The proposed method is evaluated on the Automatic Non-rigid Histology Image Registration (ANHIR) website and AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) website. It ranked 1st on both websites as of August 6th, 2024.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142019956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Optimized Excitation in Microwave-induced Thermoacoustic Imaging for Artifact Suppression.
Pub Date : 2024-08-21 DOI: 10.1109/TMI.2024.3447125
Qiang Liu, Weian Chao, Ruyi Wen, Yubin Gong, Lei Xi

Microwave-induced thermoacoustic imaging (M-TAI) allows the visualization of macroscopic and microscopic structures of bio-tissues. However, it suffers from severe inherent artifacts that might misguide the subsequent diagnosis and treatment of diseases. To overcome this limitation, we propose an optimized excitation strategy. Specifically, the strategy integrates a dynamically compounded specific absorption rate (SAR) with a co-planar configuration of the polarization state, incident wave vector, and imaging plane. Starting from a theoretical analysis, we interpret the underlying mechanism behind the superiority of the optimized excitation strategy, which achieves an effect equivalent to homogenizing the electromagnetic energy deposited in bio-tissues. Subsequent numerical simulations demonstrate that the strategy enables better preservation of the conductivity weighting of samples while increasing the Pearson correlation coefficient. Furthermore, the in vitro and in vivo M-TAI experiments validate the effectiveness and robustness of this optimized excitation strategy in artifact suppression, allowing the simultaneous identification of both boundary and inner fine structures within bio-tissues. All the results suggest that the optimized excitation strategy can be extended to diverse scenarios, inspiring more suitable strategies that remarkably suppress the inherent artifacts in M-TAI.
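
One simple numerical reading of "dynamically compounded SAR" is to average per-configuration SAR maps after per-map normalization, so that no single excitation configuration dominates the effective energy deposition. The sketch below is only an illustration of that averaging; the array shapes and the normalization rule are assumptions, not the paper's physical model.

```python
import numpy as np

def compound_sar(sar_maps):
    """Compound per-configuration SAR maps into one effective deposition map.
    sar_maps: (num_configs, H, W) array of simulated or measured SAR
    distributions, one per excitation configuration. Each map is normalized to
    its own peak before averaging."""
    sar_maps = np.asarray(sar_maps, dtype=float)
    normalized = sar_maps / (sar_maps.max(axis=(1, 2), keepdims=True) + 1e-12)
    return normalized.mean(axis=0)

# toy usage: 8 random 64x64 SAR maps
compound = compound_sar(np.random.rand(8, 64, 64))
print(compound.shape, float(compound.max()))
```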

{"title":"Optimized Excitation in Microwave-induced Thermoacoustic Imaging for Artifact Suppression.","authors":"Qiang Liu, Weian Chao, Ruyi Wen, Yubin Gong, Lei Xi","doi":"10.1109/TMI.2024.3447125","DOIUrl":"https://doi.org/10.1109/TMI.2024.3447125","url":null,"abstract":"<p><p>Microwave-induced thermoacoustic imaging (M-TAI) allows the visualization of macroscopic and microscopic structures of bio-tissues. However, it suffers from severe inherent artifacts that might misguide the subsequent diagnostics and treatments of diseases. To overcome this limitation, we propose an optimized excitation strategy. In detail, the strategy integrates dynamically compound specific absorption rate (SAR) and co-planar configuration of polarization state, incident wave vector and imaging plane. Starting from the theoretical analysis, we interpret the underlying mechanism supporting the superiority of the optimized excitation strategy to achieve an effect equivalent to homogenizing the deposited electromagnetic energy in bio-tissues. The following numerical simulations demonstrate that the strategy enables better preservation of the conductivity weighting of samples while increasing Pearson correlation coefficient. Furthermore, the in vitro and in vivo M-TAI experiments validate the effectiveness and robustness of this optimized excitation strategy in artifact suppression, allowing the simultaneous identification of both boundary and inside fine structures within bio-tissues. All the results suggest that the optimized excitation strategy can be expanded to diverse scenarios, inspiring more suitable strategies that remarkably suppress the inherent artifacts in M-TAI.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142019955","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
FR-MIL: Distribution Re-calibration based Multiple Instance Learning with Transformer for Whole Slide Image Classification.
Pub Date : 2024-08-20 DOI: 10.1109/TMI.2024.3446716
Philip Chikontwe, Meejeong Kim, Jaehoon Jeong, Hyun Jung Sung, Heounjeong Go, Soo Jeong Nam, Sang Hyun Park

In digital pathology, whole slide images (WSI) are crucial for cancer prognostication and treatment planning. WSI classification is generally addressed using multiple instance learning (MIL), alleviating the challenge of processing billions of pixels and curating rich annotations. Though recent MIL approaches leverage variants of the attention mechanism to learn better representations, they scarcely study the properties of the data distribution itself, i.e., different staining and acquisition protocols resulting in intra-patch and inter-slide variations. In this work, we first introduce a distribution re-calibration strategy to shift the feature distribution of a WSI bag (of instances) using the statistics of the max-instance (critical) feature. Second, we enforce class (bag) separation via a metric loss, assuming that positive bags exhibit larger magnitudes than negative ones. We also introduce a generative process leveraging Vector Quantization (VQ) for improved instance discrimination, i.e., VQ helps model bag latent factors for improved classification. To model spatial and context information, a position encoding module (PEM) is employed together with transformer-based pooling by multi-head self-attention (PMSA). Evaluation on popular WSI benchmark datasets reveals that our approach improves over state-of-the-art MIL methods. Further, we validate the general applicability of our method on classic MIL benchmark tasks and for point cloud classification with limited points. The code is available at https://github.com/PhilipChicco/FRMIL.
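
A rough sketch of the two ideas named above: re-calibrating a bag's features against its max-instance (critical) feature, and a magnitude-based bag separation loss. The running reference statistic, the use of the feature norm to pick the critical instance, and the margin value are all assumptions, not FR-MIL's published formulation.

```python
import torch
import torch.nn.functional as F

def recalibrate_bag(instance_feats, running_critical_mean):
    """Shift all instance features of a bag so that its max (critical) instance
    lines up with a running mean of critical features collected over training;
    this re-centers bags coming from different stains or scanners.
    instance_feats: (num_instances, dim); running_critical_mean: (dim,)."""
    critical = instance_feats[instance_feats.norm(dim=1).argmax()]
    return instance_feats + (running_critical_mean - critical)

def magnitude_margin_loss(bag_embedding, label, margin=1.0):
    """Metric loss built on the assumption that positive bags have larger
    magnitudes: positive-bag norms are pushed above the margin, negatives below."""
    mag = bag_embedding.norm()
    return F.relu(margin - mag) if label == 1 else F.relu(mag - margin)

feats = torch.randn(500, 512)                        # one WSI bag of 500 patch features
shifted = recalibrate_bag(feats, torch.zeros(512))
print(shifted.shape, magnitude_margin_loss(shifted.mean(dim=0), label=1).item())
```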

{"title":"FR-MIL: Distribution Re-calibration based Multiple Instance Learning with Transformer for Whole Slide Image Classification.","authors":"Philip Chikontwe, Meejeong Kim, Jaehoon Jeong, Hyun Jung Sung, Heounjeong Go, Soo Jeong Nam, Sang Hyun Park","doi":"10.1109/TMI.2024.3446716","DOIUrl":"https://doi.org/10.1109/TMI.2024.3446716","url":null,"abstract":"<p><p>In digital pathology, whole slide images (WSI) are crucial for cancer prognostication and treatment planning. WSI classification is generally addressed using multiple instance learning (MIL), alleviating the challenge of processing billions of pixels and curating rich annotations. Though recent MIL approaches leverage variants of the attention mechanism to learn better representations, they scarcely study the properties of the data distribution itself i.e., different staining and acquisition protocols resulting in intra-patch and inter-slide variations. In this work, we first introduce a distribution re-calibration strategy to shift the feature distribution of a WSI bag (instances) using the statistics of the max-instance (critical) feature. Second, we enforce class (bag) separation via a metric loss assuming that positive bags exhibit larger magnitudes than negatives. We also introduce a generative process leveraging Vector Quantization (VQ) for improved instance discrimination i.e., VQ helps model bag latent factors for improved classification. To model spatial and context information, a position encoding module (PEM) is employed with transformer-based pooling by multi-head self-attention (PMSA). Evaluation of popular WSI benchmark datasets reveals our approach improves over state-of-the-art MIL methods. Further, we validate the general applicability of our method on classic MIL benchmark tasks and for point cloud classification with limited points https://github.com/PhilipChicco/FRMIL.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142010126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Bridging MRI Cross-Modality Synthesis and Multi-Contrast Super-Resolution by Fine-Grained Difference Learning.
Pub Date : 2024-08-19 DOI: 10.1109/TMI.2024.3445969
Yidan Feng, Sen Deng, Jun Lyu, Jing Cai, Mingqiang Wei, Jing Qin

In multi-modal magnetic resonance imaging (MRI), the tasks of imputing or reconstructing the target modality share a common obstacle: the accurate modeling of fine-grained inter-modal differences, which has been only sparingly addressed in the current literature. These differences stem from two sources: 1) spatial misalignment remaining after coarse registration and 2) structural distinctions arising from modality-specific signal manifestations. This paper integrates the previously separate research trajectories of cross-modality synthesis (CMS) and multi-contrast super-resolution (MCSR) to address this pervasive challenge within a unified framework. Connected through generalized down-sampling ratios, this unification not only emphasizes their common goal of reducing structural differences, but also identifies the key task distinguishing MCSR from CMS: modeling the structural distinctions using the limited information from the misaligned target input. Specifically, we propose a composite network architecture with several key components: a label correction module to align the coordinates of multi-modal training pairs, a CMS module serving as the base model, an SR branch to handle target inputs, and a difference projection discriminator for structural-distinction-centered adversarial training. When training the SR branch as the generator, the adversarial learning is enhanced with distinction-aware incremental modulation to ensure better-controlled generation. Moreover, the SR branch integrates deformable convolutions to address cross-modal spatial misalignment at the feature level. Experiments conducted on three public datasets demonstrate that our approach effectively balances structural accuracy and realism, exhibiting overall superiority over current state-of-the-art approaches in comprehensive evaluations of both tasks. The code is available at https://github.com/papshare/FGDL.
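
To illustrate how deformable convolutions can absorb cross-modal spatial misalignment at the feature level, the sketch below predicts sampling offsets from the concatenated target and reference features and feeds them to torchvision's DeformConv2d. The offset predictor, channel sizes, and the use of a reference feature map are assumptions, not the paper's SR-branch design.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class CrossModalAlignBlock(nn.Module):
    """Predict per-position sampling offsets from both modalities, then apply a
    deformable convolution to the target-modality features so that small
    spatial misalignments between modalities are absorbed at the feature level."""
    def __init__(self, channels, kernel_size=3):
        super().__init__()
        # 2 offsets (x, y) per kernel position, predicted from the concatenation
        self.offset_pred = nn.Conv2d(2 * channels, 2 * kernel_size * kernel_size,
                                     kernel_size, padding=kernel_size // 2)
        self.deform = DeformConv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2)

    def forward(self, target_feat, reference_feat):
        offsets = self.offset_pred(torch.cat([target_feat, reference_feat], dim=1))
        return self.deform(target_feat, offsets)

# toy usage on 32-channel feature maps
block = CrossModalAlignBlock(32)
out = block(torch.randn(1, 32, 40, 40), torch.randn(1, 32, 40, 40))
print(out.shape)  # torch.Size([1, 32, 40, 40])
```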

{"title":"Bridging MRI Cross-Modality Synthesis and Multi-Contrast Super-Resolution by Fine-Grained Difference Learning.","authors":"Yidan Feng, Sen Deng, Jun Lyu, Jing Cai, Mingqiang Wei, Jing Qin","doi":"10.1109/TMI.2024.3445969","DOIUrl":"https://doi.org/10.1109/TMI.2024.3445969","url":null,"abstract":"<p><p>In multi-modal magnetic resonance imaging (MRI), the tasks of imputing or reconstructing the target modality share a common obstacle: the accurate modeling of fine-grained inter-modal differences, which has been sparingly addressed in current literature. These differences stem from two sources: 1) spatial misalignment remaining after coarse registration and 2) structural distinction arising from modality-specific signal manifestations. This paper integrates the previously separate research trajectories of cross-modality synthesis (CMS) and multi-contrast super-resolution (MCSR) to address this pervasive challenge within a unified framework. Connected through generalized down-sampling ratios, this unification not only emphasizes their common goal in reducing structural differences, but also identifies the key task distinguishing MCSR from CMS: modeling the structural distinctions using the limited information from the misaligned target input. Specifically, we propose a composite network architecture with several key components: a label correction module to align the coordinates of multi-modal training pairs, a CMS module serving as the base model, an SR branch to handle target inputs, and a difference projection discriminator for structural distinction-centered adversarial training. When training the SR branch as the generator, the adversarial learning is enhanced with distinction-aware incremental modulation to ensure better-controlled generation. Moreover, the SR branch integrates deformable convolutions to address cross-modal spatial misalignment at the feature level. Experiments conducted on three public datasets demonstrate that our approach effectively balances structural accuracy and realism, exhibiting overall superiority in comprehensive evaluations for both tasks over current state-of-the-art approaches. The code is available at https://github.com/papshare/FGDL.</p>","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142006159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0