
Latest publications in IEEE Transactions on Medical Imaging

Multi-Scale Spatial-Temporal Attention Networks for Functional Connectome Classification.
Pub Date : 2024-08-22 DOI: 10.1109/TMI.2024.3448214
Youyong Kong, Xiaotong Zhang, Wenhan Wang, Yue Zhou, Yueying Li, Yonggui Yuan

Many neuropsychiatric disorders are considered to be associated with abnormalities in the functional connectivity networks of the brain. Research on the classification of functional connectivity can therefore provide new perspectives for understanding the pathology of disorders and contribute to early diagnosis and treatment. Functional connectivity changes dynamically over time; however, the majority of existing methods are unable to jointly reveal its spatial topology and time-varying characteristics. Furthermore, although a few spatial-temporal studies have attempted to capture rich information across different spatial scales, they have not delved into the temporal characteristics among scales. To address these issues, we propose novel Multi-Scale Spatial-Temporal Attention Networks (MSSTAN) that exploit the multi-scale spatial-temporal information provided by the functional connectome for classification. To fully extract spatial features of brain regions, we propose a Topology Enhanced Graph Transformer module that guides the attention calculations in the learning of spatial features by incorporating topology priors. A Multi-Scale Pooling Strategy is introduced to obtain representations of the brain connectome at various scales. Considering the temporal dynamics of the dynamic functional connectome, we employ Locality Sensitive Hashing attention to further capture long-term dependencies in the time dynamics across multiple scales and to reduce the computational complexity of the original attention mechanism. Experiments on three brain fMRI datasets of MDD and ASD demonstrate the superiority of our proposed approach. In addition, benefiting from the attention mechanism in the Transformer, our results are interpretable, which can contribute to the discovery of biomarkers. The code is available at https://github.com/LIST-KONG/MSSTAN.
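
The abstract does not spell out how the topology prior enters the attention calculation; a common and minimal way to realize such guidance is an additive bias on the attention logits. The sketch below assumes this additive form, and `topology_biased_attention`, `alpha`, and the binary adjacency prior are illustrative names and shapes, not the paper's actual module.

```python
import torch
import torch.nn.functional as F

def topology_biased_attention(q, k, v, adj, alpha=1.0):
    """Scaled dot-product attention with an additive topology prior on the logits."""
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5  # (N, N) attention logits
    scores = scores + alpha * adj                # bias attention toward connected regions
    return F.softmax(scores, dim=-1) @ v

# Toy usage: 90 brain regions with 64-dim features and a binary adjacency prior.
N, d = 90, 64
q = k = v = torch.randn(N, d)
adj = (torch.rand(N, N) > 0.8).float()
print(topology_biased_attention(q, k, v, adj).shape)  # torch.Size([90, 64])
```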

Citations: 0
Moment-Consistent Contrastive CycleGAN for Cross-Domain Pancreatic Image Segmentation.
Pub Date : 2024-08-21 DOI: 10.1109/TMI.2024.3447071
Zhongyu Chen, Yun Bian, Erwei Shen, Ligang Fan, Weifang Zhu, Fei Shi, Chengwei Shao, Xinjian Chen, Dehui Xiang

CT and MR are currently the most common imaging techniques for pancreatic cancer diagnosis. Accurate segmentation of the pancreas in CT and MR images can provide significant help in the diagnosis and treatment of pancreatic cancer. Traditional supervised segmentation methods require a large amount of labeled CT and MR training data, which is usually time-consuming and laborious to curate. Meanwhile, due to domain shift, traditional segmentation networks are difficult to deploy on datasets from different imaging modalities. Cross-domain segmentation can utilize labeled source-domain data to assist unlabeled target domains in solving the above problems. In this paper, a cross-domain pancreas segmentation algorithm is proposed based on Moment-Consistent Contrastive Cycle Generative Adversarial Networks (MC-CCycleGAN). MC-CCycleGAN is a style transfer network in which the encoder of the generator extracts features from real images and style-transferred images, constrains feature extraction through a contrastive loss, and fully extracts structural features of input images during style transfer while eliminating redundant style features. Multi-order central moments of the pancreas are proposed to describe its anatomy in high dimensions, and a contrastive loss is also proposed to constrain moment consistency, so as to maintain the pancreatic structure and shape before and after style transfer. A multi-teacher knowledge distillation framework is proposed to transfer the knowledge of multiple teachers to a single student, so as to improve the robustness and performance of the student network. The experimental results demonstrate the superiority of our framework over state-of-the-art domain adaptation methods.
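
As a rough illustration of the moment-consistency idea, the sketch below computes multi-order central moments of a soft pancreas mask and penalizes their change across style transfer. The choice of orders, the soft-mask weighting, and the L1 penalty are assumptions for illustration; the paper's actual moment definition and contrastive formulation may differ.

```python
import torch
import torch.nn.functional as F

def central_moments(mask_probs, coords, orders=(2, 3, 4)):
    """Multi-order central moments of a (soft) pancreas mask.

    mask_probs: (H*W,) nonnegative foreground weights; coords: (H*W, 2) pixel coords.
    """
    w = mask_probs / (mask_probs.sum() + 1e-8)
    mean = (w[:, None] * coords).sum(dim=0)            # centroid (first-order moment)
    centered = coords - mean
    moments = [(w[:, None] * centered.pow(p)).sum(dim=0) for p in orders]
    return torch.cat([mean] + moments)

def moment_consistency_loss(mask_before, mask_after, coords):
    # Penalize changes in shape statistics between the input and its style transfer.
    return F.l1_loss(central_moments(mask_before, coords),
                     central_moments(mask_after, coords))
```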

Citations: 0
Unsupervised Non-rigid Histological Image Registration Guided by Keypoint Correspondences Based on Learnable Deep Features with Iterative Training.
Pub Date : 2024-08-21 DOI: 10.1109/TMI.2024.3447214
Xingyue Wei, Lin Ge, Lijie Huang, Jianwen Luo, Yan Xu

Histological image registration is a fundamental task in histological image analysis. It is challenging because of the substantial appearance differences caused by multiple stains. Keypoint correspondences, i.e., matched keypoint pairs, have been introduced to guide unsupervised deep learning (DL) based registration methods in handling such a registration task. This paper proposes an iterative keypoint correspondence-guided (IKCG) unsupervised network for non-rigid histological image registration. Fixed deep features and learnable deep features are introduced as keypoint descriptors to automatically establish keypoint correspondences, and the distance between corresponding keypoints is used as a loss function to train the registration network. Fixed deep features extracted from DL networks pre-trained on natural image datasets are more discriminative than handcrafted ones, benefiting from the deep and hierarchical nature of DL networks. The intermediate layer outputs of the registration networks trained on histological image datasets are extracted as learnable deep features, which reveal information unique to histological images. An iterative training strategy is adopted to train the registration network and optimize the learnable deep features jointly. Benefiting from the excellent matching ability of the learnable deep features optimized with the iterative training strategy, the proposed method can solve the local non-rigid large-displacement problem, which is usually caused by mishandling, such as tears introduced when preparing tissue slices. The proposed method is evaluated on the Automatic Non-rigid Histology Image Registration (ANHIR) website and the AutomatiC Registration Of Breast cAncer Tissue (ACROBAT) website, and ranked 1st on both as of August 6th, 2024.
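
The abstract does not state how correspondences are established from the descriptors; a standard choice, assumed here, is mutual-nearest-neighbor matching in descriptor space, with the mean matched-keypoint distance after warping serving as the training loss. Function names and shapes are illustrative, not the paper's implementation.

```python
import torch

def mutual_nn_matches(desc_a, desc_b):
    """Match keypoints by mutual nearest neighbors in descriptor space.

    desc_a: (Na, d), desc_b: (Nb, d) deep-feature descriptors.
    Returns index pairs (i, j) that are each other's nearest neighbor.
    """
    dist = torch.cdist(desc_a, desc_b)   # (Na, Nb) pairwise distances
    nn_ab = dist.argmin(dim=1)           # best match in B for each A
    nn_ba = dist.argmin(dim=0)           # best match in A for each B
    idx_a = torch.arange(desc_a.size(0))
    mutual = nn_ba[nn_ab] == idx_a
    return idx_a[mutual], nn_ab[mutual]

def correspondence_loss(kpts_moving_warped, kpts_fixed, i, j):
    # Mean distance between matched keypoints after warping drives training.
    return (kpts_moving_warped[i] - kpts_fixed[j]).norm(dim=-1).mean()
```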

Citations: 0
Optimized Excitation in Microwave-induced Thermoacoustic Imaging for Artifact Suppression.
Pub Date : 2024-08-21 DOI: 10.1109/TMI.2024.3447125
Qiang Liu, Weian Chao, Ruyi Wen, Yubin Gong, Lei Xi

Microwave-induced thermoacoustic imaging (M-TAI) allows the visualization of macroscopic and microscopic structures of bio-tissues. However, it suffers from severe inherent artifacts that might misguide subsequent diagnosis and treatment. To overcome this limitation, we propose an optimized excitation strategy. Specifically, the strategy integrates a dynamically compounded specific absorption rate (SAR) with a co-planar configuration of the polarization state, incident wave vector, and imaging plane. Starting from theoretical analysis, we interpret the underlying mechanism by which the optimized excitation strategy achieves an effect equivalent to homogenizing the electromagnetic energy deposited in bio-tissues. Numerical simulations then demonstrate that the strategy enables better preservation of the conductivity weighting of samples while increasing the Pearson correlation coefficient. Furthermore, in vitro and in vivo M-TAI experiments validate the effectiveness and robustness of this optimized excitation strategy in artifact suppression, allowing the simultaneous identification of both boundary and internal fine structures within bio-tissues. All the results suggest that the optimized excitation strategy can be extended to diverse scenarios, inspiring more suitable strategies that remarkably suppress the inherent artifacts in M-TAI.
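
To make the "homogenizing" intuition concrete: if each excitation configuration deposits energy with a different hot-spot pattern, averaging normalized per-excitation SAR maps flattens the overall deposition. The toy sketch below only captures this spirit under that assumption; the paper's actual compounding rule is not given in the abstract, and `compound_sar` is a hypothetical helper.

```python
import numpy as np

def compound_sar(sar_maps):
    """Compound per-excitation SAR maps into a flatter overall deposition.

    sar_maps: (n_excitations, H, W) specific absorption rate for each
    polarization / incidence configuration.
    """
    sar_maps = np.asarray(sar_maps, dtype=float)
    # Normalize each excitation so no single configuration dominates, then
    # average: hot spots of one configuration are offset by the others.
    normed = sar_maps / (sar_maps.max(axis=(1, 2), keepdims=True) + 1e-12)
    return normed.mean(axis=0)

# Toy check: two complementary gradients compound to a flatter field.
x = np.linspace(0, 1, 64)
a = np.outer(x, np.ones(64))   # energy deposition rising toward one side
b = a[::-1]                    # the mirrored configuration
flat = compound_sar([a, b])
print(flat.std() < a.std())    # True: compounded deposition is more uniform
```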

Citations: 0
FR-MIL: Distribution Re-calibration based Multiple Instance Learning with Transformer for Whole Slide Image Classification.
Pub Date : 2024-08-20 DOI: 10.1109/TMI.2024.3446716
Philip Chikontwe, Meejeong Kim, Jaehoon Jeong, Hyun Jung Sung, Heounjeong Go, Soo Jeong Nam, Sang Hyun Park

In digital pathology, whole slide images (WSI) are crucial for cancer prognostication and treatment planning. WSI classification is generally addressed using multiple instance learning (MIL), alleviating the challenge of processing billions of pixels and curating rich annotations. Though recent MIL approaches leverage variants of the attention mechanism to learn better representations, they scarcely study the properties of the data distribution itself, i.e., different staining and acquisition protocols resulting in intra-patch and inter-slide variations. In this work, we first introduce a distribution re-calibration strategy to shift the feature distribution of a WSI bag (instances) using the statistics of the max-instance (critical) feature. Second, we enforce class (bag) separation via a metric loss, assuming that positive bags exhibit larger magnitudes than negatives. We also introduce a generative process leveraging Vector Quantization (VQ) for improved instance discrimination, i.e., VQ helps model bag latent factors for improved classification. To model spatial and context information, a position encoding module (PEM) is employed with transformer-based pooling by multi-head self-attention (PMSA). Evaluation on popular WSI benchmark datasets reveals that our approach improves over state-of-the-art MIL methods. Further, we validate the general applicability of our method on classic MIL benchmark tasks and on point cloud classification with limited points. The code is available at https://github.com/PhilipChicco/FRMIL.
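
A minimal reading of the two ingredients named above, sketched under assumptions: re-calibration shifts each bag so its critical (max) instance aligns with a dataset-level reference statistic, and the metric loss pushes positive-bag embeddings to larger norms than negatives via a margin. The exact shift rule, margin form, and names are illustrative, not the released implementation.

```python
import torch
import torch.nn.functional as F

def recalibrate_bag(instances, max_feat_reference):
    """Shift a bag's feature distribution using max-instance (critical) statistics.

    instances:          (n, d) patch features of one slide (bag).
    max_feat_reference: (d,) e.g., a running mean of critical features over training.
    """
    critical = instances.max(dim=0).values            # this bag's critical feature
    return instances - critical + max_feat_reference  # align bags to a shared reference

def magnitude_metric_loss(bag_embedding, label, margin=1.0):
    # Positive bags should exhibit larger magnitudes than negative bags.
    mag = bag_embedding.norm()
    return F.relu(margin - mag) if label == 1 else mag
```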

Citations: 0
Bridging MRI Cross-Modality Synthesis and Multi-Contrast Super-Resolution by Fine-Grained Difference Learning.
Pub Date : 2024-08-19 DOI: 10.1109/TMI.2024.3445969
Yidan Feng, Sen Deng, Jun Lyu, Jing Cai, Mingqiang Wei, Jing Qin

In multi-modal magnetic resonance imaging (MRI), the tasks of imputing or reconstructing the target modality share a common obstacle: the accurate modeling of fine-grained inter-modal differences, which has been only sparingly addressed in the current literature. These differences stem from two sources: 1) spatial misalignment remaining after coarse registration, and 2) structural distinction arising from modality-specific signal manifestations. This paper integrates the previously separate research trajectories of cross-modality synthesis (CMS) and multi-contrast super-resolution (MCSR) to address this pervasive challenge within a unified framework. Connected through generalized down-sampling ratios, this unification not only emphasizes their common goal of reducing structural differences, but also identifies the key task distinguishing MCSR from CMS: modeling the structural distinctions using the limited information from the misaligned target input. Specifically, we propose a composite network architecture with several key components: a label correction module to align the coordinates of multi-modal training pairs, a CMS module serving as the base model, an SR branch to handle target inputs, and a difference projection discriminator for structural-distinction-centered adversarial training. When training the SR branch as the generator, the adversarial learning is enhanced with distinction-aware incremental modulation to ensure better-controlled generation. Moreover, the SR branch integrates deformable convolutions to address cross-modal spatial misalignment at the feature level. Experiments conducted on three public datasets demonstrate that our approach effectively balances structural accuracy and realism, exhibiting overall superiority in comprehensive evaluations for both tasks over current state-of-the-art approaches. The code is available at https://github.com/papshare/FGDL.
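
One way to picture the "generalized down-sampling ratio" connecting the two tasks: CMS is the limiting case where the target-contrast input carries no usable structure, while MCSR supplies a low-resolution target input whose limited cues the SR branch injects on top of the CMS output. The schematic below assumes this composition; `cms_module` and `sr_branch` are placeholders standing in for the paper's actual components.

```python
import torch
import torch.nn.functional as F

def composite_forward(source, target_lr, cms_module, sr_branch, ratio):
    """One forward pass of the unified CMS / MCSR view (schematic).

    source:    full-resolution image in the source contrast (B, C, H, W).
    target_lr: low-resolution target-contrast input, or None for pure CMS.
    ratio:     generalized down-sampling ratio of the target input.
    """
    base = cms_module(source)              # synthesize the target contrast
    if target_lr is None:                  # pure cross-modality synthesis
        return base
    up = F.interpolate(target_lr, scale_factor=ratio,
                       mode='bilinear', align_corners=False)
    # The SR branch injects the misaligned target's limited structural cues.
    return base + sr_branch(torch.cat([base, up], dim=1))
```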

Citations: 0
SISMIK for brain MRI: Deep-learning-based motion estimation and model-based motion correction in k-space.
Pub Date : 2024-08-19 DOI: 10.1109/TMI.2024.3446450
Oscar Dabrowski, Jean-Luc Falcone, Antoine Klauser, Julien Songeon, Michel Kocher, Bastien Chopard, Francois Lazeyras, Sebastien Courvoisier

MRI, a widespread non-invasive medical imaging modality, is highly sensitive to patient motion. Despite many attempts over the years, motion correction remains a difficult problem, and there is no general method applicable to all situations. We propose a retrospective method for motion estimation and correction that tackles the problem of in-plane rigid-body motion, apt for classical 2D Spin-Echo scans of the brain, which are regularly used in clinical practice. Due to the sequential acquisition of k-space, motion artifacts are well localized. The method leverages the power of deep neural networks to estimate motion parameters in k-space and uses a model-based approach to restore degraded images, avoiding "hallucinations". A notable advantage is its ability to estimate motion occurring at high spatial frequencies without the need for a motion-free reference. The proposed method operates on the whole k-space dynamic range and is only moderately affected by the lower SNR of higher harmonics. As a proof of concept, we provide models trained using supervised learning on 600k motion simulations based on motion-free scans of 43 different subjects. Generalization performance was tested with simulations as well as in vivo. Qualitative and quantitative evaluations are presented for motion parameter estimation and image reconstruction. Experimental results show that our approach achieves good generalization performance on simulated data and in-vivo acquisitions. We provide a Python implementation at https://gitlab.unige.ch/Oscar.Dabrowski/sismik_mri/.
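
The model-based part rests on a standard Fourier identity: an in-plane translation of the image multiplies k-space by a linear phase ramp, so an estimated shift can be undone exactly on the affected k-space data (rotations additionally rotate the k-space coordinates and need regridding, omitted here). The demo below is a self-contained check of the translation case, not the authors' pipeline.

```python
import numpy as np

def correct_translation(kspace, kx, ky, dx, dy):
    """Undo an in-plane translation directly in k-space.

    A shift (dx, dy) in image space multiplies k-space by a linear phase,
    so the correction applies the conjugate phase ramp.
    kspace: (H, W) complex data; kx, ky: frequency grids in cycles/pixel;
    dx, dy: estimated shift in pixels.
    """
    return kspace * np.exp(2j * np.pi * (kx * dx + ky * dy))

# Toy demo: shift an image, then remove the shift in k-space.
img = np.zeros((64, 64)); img[20:30, 12:22] = 1.0
k = np.fft.fft2(img)
ky, kx = np.meshgrid(np.fft.fftfreq(64), np.fft.fftfreq(64), indexing='ij')
k_shifted = k * np.exp(-2j * np.pi * (kx * 3 + ky * 5))   # simulate motion
restored = np.fft.ifft2(correct_translation(k_shifted, kx, ky, 3, 5)).real
print(np.allclose(restored, img, atol=1e-8))              # True
```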

Citations: 0
Investigating and Improving Latent Density Segmentation Models for Aleatoric Uncertainty Quantification in Medical Imaging.
Pub Date : 2024-08-19 DOI: 10.1109/TMI.2024.3445999
M M Amaan Valiuddin, Christiaan G A Viviers, Ruud J G Van Sloun, Peter H N De With, Fons van der Sommen

Data uncertainties, such as sensor noise, occlusions, or limitations in the acquisition method, can introduce irreducible ambiguities in images, which result in varying, yet plausible, semantic hypotheses. In Machine Learning, this ambiguity is commonly referred to as aleatoric uncertainty. In image segmentation, latent density models can be utilized to address this problem. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the conditional data log-likelihood Evidence Lower Bound. In this work, we demonstrate that the PU-Net latent space is severely sparse and heavily under-utilized. To address this, we introduce mutual information maximization and entropy-regularized Sinkhorn Divergence in the latent space to promote homogeneity across all latent dimensions, effectively improving gradient-descent updates and latent space informativeness. Applied to public datasets covering various clinical segmentation problems, our proposed methodology achieves up to 11% performance gains over preceding latent-variable models for probabilistic segmentation, measured by Hungarian-matched Intersection over Union. The results indicate that encouraging a homogeneous latent space significantly improves latent density modeling for medical image segmentation.
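
For readers unfamiliar with the regularizer, below is a compact debiased Sinkhorn divergence between two sets of latent samples. The cost normalization, `eps`, and iteration count are practical assumptions to keep the kernel well-conditioned; the paper's exact formulation and its placement in the objective may differ.

```python
import torch

def sinkhorn_divergence(x, y, eps=0.05, n_iters=100):
    """Debiased entropic OT: S(x,y) = OT(x,y) - (OT(x,x) + OT(y,y)) / 2."""
    scale = torch.cdist(x, y).pow(2).mean().detach()  # keep exp(-C/reg) well-conditioned

    def ot_eps(a, b):
        C = torch.cdist(a, b) ** 2                    # squared-Euclidean cost
        K = torch.exp(-C / (eps * scale + 1e-8))
        u = torch.full((a.size(0),), 1.0 / a.size(0))
        v = torch.full((b.size(0),), 1.0 / b.size(0))
        c = v.clone()
        for _ in range(n_iters):                      # Sinkhorn fixed-point updates
            r = u / (K @ c)
            c = v / (K.t() @ r)
        P = r[:, None] * K * c[None, :]               # transport plan with marginals u, v
        return (P * C).sum()

    return ot_eps(x, y) - 0.5 * (ot_eps(x, x) + ot_eps(y, y))

# Toy usage: pull posterior latent samples toward prior samples.
print(sinkhorn_divergence(torch.randn(32, 6), torch.randn(32, 6)))
```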

Citations: 0
S2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR.
Pub Date : 2024-08-15 DOI: 10.1109/TMI.2024.3444279
Jialun Pei, Diandian Guo, Jingyang Zhang, Manxi Lin, Yueming Jin, Pheng-Ann Heng

Scene graph generation (SGG) for surgical procedures is crucial for enhancing holistic cognitive intelligence in the operating room (OR). However, previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes of pose estimation and object detection. This pipeline may compromise the flexibility of learning multimodal representations, consequently constraining the overall effectiveness. In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR, aimed at complementarily leveraging multi-view 2D scenes and 3D point clouds for SGG in an end-to-end manner. Concretely, our model embraces a View-Sync Transfusion scheme to encourage multi-view visual information interaction. Concurrently, a Geometry-Visual Cohesion operation is designed to integrate the synergic 2D semantic features into 3D point cloud features. Moreover, based on the augmented feature, we propose a novel relation-sensitive transformer decoder that embeds dynamic entity-pair queries and relational trait priors, enabling the direct prediction of entity-pair relations for graph generation without intermediate steps. Extensive experiments have validated the superior SGG performance and lower computational cost of S2Former-OR on the 4D-OR benchmark compared with current OR-SGG methods, e.g., a 3-percentage-point Precision increase and a 24.2M reduction in model parameters. We further compared our method with generic single-stage SGG methods using broader metrics for a comprehensive evaluation, consistently achieving better performance. Our source code is available at: https://github.com/PJLallen/S2Former-OR.
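
A rough sketch of what a relation-sensitive decoder with entity-pair queries could look like: each (subject, object) pair of entity embeddings is projected into a query, offset by a learned relation prior, refined against the fused 2D/3D memory, and classified into relation logits. Dimensions, the prior's form, and the class count are assumptions, not the released architecture.

```python
import torch
import torch.nn as nn

class RelationDecoder(nn.Module):
    """Predict entity-pair relations directly from pairwise queries."""

    def __init__(self, d_model=256, n_relations=14, n_layers=2):
        super().__init__()
        self.pair_proj = nn.Linear(2 * d_model, d_model)
        self.rel_prior = nn.Parameter(torch.zeros(1, 1, d_model))  # relational trait prior
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=n_layers)
        self.classifier = nn.Linear(d_model, n_relations)

    def forward(self, entities, memory):
        # entities: (B, N, d) entity embeddings; memory: (B, M, d) fused 2D/3D features.
        B, N, d = entities.shape
        subj = entities.unsqueeze(2).expand(B, N, N, d)
        obj = entities.unsqueeze(1).expand(B, N, N, d)
        queries = self.pair_proj(torch.cat([subj, obj], dim=-1)).reshape(B, N * N, d)
        refined = self.decoder(queries + self.rel_prior, memory)
        return self.classifier(refined).reshape(B, N, N, -1)  # relation logits per pair
```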

Citations: 0
AutoSamp: Autoencoding k-space Sampling via Variational Information Maximization for 3D MRI.
Pub Date : 2024-08-15 DOI: 10.1109/TMI.2024.3443292
Cagan Alkan, Morteza Mardani, Congyu Liao, Zhitao Li, Shreyas S Vasanawala, John M Pauly

Accelerated MRI protocols routinely involve a predefined sampling pattern that undersamples k-space. Finding an optimal pattern can enhance reconstruction quality; however, this optimization is a challenging task. To address this challenge, we introduce a novel deep learning framework, AutoSamp, based on variational information maximization, which enables joint optimization of the sampling pattern and reconstruction of MRI scans. We represent the encoder as a non-uniform Fast Fourier Transform that allows continuous optimization of k-space sample locations on a non-Cartesian plane, and the decoder as a deep reconstruction network. Experiments on public 3D-acquired MRI datasets show improved reconstruction quality of the proposed AutoSamp method over the prevailing variable-density and variable-density Poisson-disc sampling for both compressed sensing and deep learning reconstructions. We demonstrate that our data-driven sampling optimization method achieves 4.4 dB, 2.0 dB, 0.75 dB, and 0.7 dB PSNR improvements over reconstruction with Poisson-disc masks for acceleration factors of R = 5, 10, 15, and 25, respectively. Prospectively accelerated acquisitions with 3D FSE sequences using our optimized sampling patterns exhibit improved image quality and sharpness. Furthermore, we analyze the characteristics of the learned sampling patterns with respect to changes in acceleration factor, measurement noise, underlying anatomy, and coil sensitivities. We show that all these factors contribute to the optimization result by affecting the sampling density, k-space coverage, and point spread functions of the learned sampling patterns.
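
The joint-optimization idea reduces to making the sample coordinates themselves learnable parameters that receive gradients through a differentiable NUFFT. The sketch below assumes placeholder `nufft` and `recon_net` callables (e.g., from an external NUFFT library and any reconstruction network) and substitutes a plain MSE for the paper's variational information-maximization objective; it shows only the gradient flow, not the actual training recipe.

```python
import torch

def train_step(image, coords, recon_net, nufft, optimizer, noise_std=0.01):
    """One step of jointly optimizing k-space sample locations and reconstruction.

    image:  (B, 1, H, W) ground-truth image batch.
    coords: (n_samples, 2) learnable k-space coordinates (requires_grad=True,
            registered in `optimizer` together with recon_net's parameters).
    """
    kspace = nufft(image, coords)                            # encoder: non-uniform FFT
    kspace = kspace + noise_std * torch.randn_like(kspace)   # simulated measurement noise
    recon = recon_net(kspace, coords)                        # decoder: deep reconstruction
    loss = torch.nn.functional.mse_loss(recon, image)        # simplified proxy objective
    optimizer.zero_grad()
    loss.backward()                                          # gradients flow into coords
    optimizer.step()
    return loss.item()
```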

Citations: 0