
Latest publications in IEEE Transactions on Medical Imaging

Unified Multi-Modal Image Synthesis for Missing Modality Imputation.
Pub Date : 2024-07-08 DOI: 10.1109/TMI.2024.3424785
Yue Zhang, Chengtao Peng, Qiuli Wang, Dan Song, Kaiyan Li, S Kevin Zhou

Multi-modal medical images provide complementary soft-tissue characteristics that aid in the screening and diagnosis of diseases. However, limited scanning time, image corruption, and varying imaging protocols often result in incomplete multi-modal images, thus limiting the use of multi-modal data for clinical purposes. To address this issue, in this paper we propose a novel unified multi-modal image synthesis method for missing modality imputation. Our method adopts a generative adversarial architecture that synthesizes missing modalities from any combination of available ones with a single model. To this end, we specifically design a Commonality- and Discrepancy-Sensitive Encoder for the generator to exploit both modality-invariant and modality-specific information contained in the input modalities. Incorporating both types of information facilitates the generation of images with consistent anatomy and realistic details of the desired distribution. In addition, we propose a Dynamic Feature Unification Module to integrate information from a varying number of available modalities, which makes the network robust to randomly missing modalities. The module performs both hard integration and soft integration, ensuring the effectiveness of feature combination while avoiding information loss. Validated on two public multi-modal magnetic resonance datasets, the proposed method handles various synthesis tasks effectively and outperforms previous methods.
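To make the dynamic feature unification idea concrete, here is a minimal PyTorch sketch of fusing a variable number of per-modality feature maps with both a hard (element-wise max) and a soft (learned-weight) integration path. The module name, pooling choices, and shapes are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class DynamicFeatureUnification(nn.Module):
    """Fuse features from a varying number of available modalities.

    Hard integration: element-wise max across modalities (order-invariant).
    Soft integration: softmax-weighted sum with one predicted weight per modality.
    Both branches are concatenated and projected back to the feature width.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.weight_net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, 1)
        )
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feats):
        stacked = torch.stack(feats, dim=1)                               # (B, M, C, H, W)
        hard = stacked.max(dim=1).values                                  # (B, C, H, W)
        scores = torch.stack([self.weight_net(f) for f in feats], dim=1)  # (B, M, 1)
        w = torch.softmax(scores, dim=1)[..., None, None]                 # (B, M, 1, 1, 1)
        soft = (w * stacked).sum(dim=1)                                   # (B, C, H, W)
        return self.proj(torch.cat([hard, soft], dim=1))

# Works for any number of available modalities, e.g. two out of four:
fusion = DynamicFeatureUnification(channels=64)
f1, f2 = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
fused = fusion([f1, f2])   # (1, 64, 32, 32)
```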

Citations: 0
HiDiff: Hybrid Diffusion Framework for Medical Image Segmentation
Pub Date : 2024-07-08 DOI: 10.1109/TMI.2024.3424471
Tao Chen; Chenhui Wang; Zhihao Chen; Yiming Lei; Hongming Shan
Medical image segmentation has been significantly advanced by the rapid development of deep learning (DL) techniques. Existing DL-based segmentation models are typically discriminative; i.e., they aim to learn a mapping from the input image to segmentation masks. However, these discriminative methods neglect the underlying data distribution and intrinsic class characteristics, and therefore suffer from an unstable feature space. In this work, we propose to complement discriminative segmentation methods with knowledge of the underlying data distribution from generative models. To that end, we propose a novel hybrid diffusion framework for medical image segmentation, termed HiDiff, which can synergize the strengths of existing discriminative segmentation models and new generative diffusion models. HiDiff comprises two key components: a discriminative segmentor and a diffusion refiner. First, we utilize any conventional trained segmentation model as the discriminative segmentor, which provides a segmentation mask prior for the diffusion refiner. Second, we propose a novel binary Bernoulli diffusion model (BBDM) as the diffusion refiner, which can effectively, efficiently, and interactively refine the segmentation mask by modeling the underlying data distribution. Third, we train the segmentor and BBDM in an alternate-collaborative manner so that they mutually boost each other. Extensive experimental results on abdominal organ, brain tumor, polyp, and retinal vessel segmentation datasets, covering four widely used modalities, demonstrate the superior performance of HiDiff over existing medical segmentation algorithms, including state-of-the-art transformer- and diffusion-based ones. In addition, HiDiff excels at segmenting small objects and generalizing to new datasets. Source code is available at https://github.com/takimailto/HiDiff.
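The alternate-collaborative training of a discriminative segmentor and a Bernoulli-noise diffusion refiner can be sketched roughly as follows. The noising schedule, the loss choices, and the `segmentor`/`refiner` interfaces are assumptions made for illustration, not HiDiff's actual BBDM.

```python
import torch
import torch.nn.functional as F

def bernoulli_noise(mask: torch.Tensor, t: float) -> torch.Tensor:
    """Forward Bernoulli corruption of a float 0/1 mask: flip each bit with
    probability t/2, interpolating between the clean mask (t=0) and pure noise (t=1)."""
    flip = torch.bernoulli(torch.full_like(mask, t / 2))
    return mask * (1 - flip) + (1 - mask) * flip

def training_step(segmentor, refiner, image, gt_mask, opt_seg, opt_ref, step):
    """One alternate-collaborative update: even steps train the segmentor,
    odd steps train the diffusion refiner (illustrative scheduling)."""
    seg_logits = segmentor(image)
    prior = torch.sigmoid(seg_logits).detach()   # mask prior handed to the refiner
    t = torch.rand(1).item()                     # random corruption level in [0, 1)
    noisy = bernoulli_noise(gt_mask, t)
    refined_logits = refiner(image, noisy, prior)
    loss_seg = F.binary_cross_entropy_with_logits(seg_logits, gt_mask)
    loss_ref = F.binary_cross_entropy_with_logits(refined_logits, gt_mask)
    if step % 2 == 0:
        opt_seg.zero_grad()
        loss_seg.backward()
        opt_seg.step()
    else:
        opt_ref.zero_grad()
        loss_ref.backward()
        opt_ref.step()
    return loss_seg.item(), loss_ref.item()
```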
Citations: 0
Joint regional uptake quantification of thorium-227 and radium-223 using a multiple-energy-window projection-domain quantitative SPECT method.
Pub Date : 2024-07-05 DOI: 10.1109/TMI.2024.3420228
Zekun Li, Nadia Benabdallah, Richard Laforest, Richard L Wahl, Daniel L J Thorek, Abhinav K Jha

Thorium-227 (227Th)-based α-particle radiopharmaceutical therapies (α-RPTs) are currently being investigated in several clinical and pre-clinical studies. After administration, 227Th decays to 223Ra, another α-particle-emitting isotope, which redistributes within the patient. Reliable dose quantification of both 227Th and 223Ra is clinically important, and SPECT may perform this quantification as these isotopes also emit X- and γ-ray photons. However, reliable quantification is challenging for several reasons: the orders-of-magnitude lower activity compared to conventional SPECT, resulting in a very low number of detected counts, the presence of multiple photopeaks, substantial overlap in the emission spectra of these isotopes, and the image-degrading effects in SPECT. To address these issues, we propose a multiple-energy-window projection-domain quantification (MEW-PDQ) method that jointly estimates the regional activity uptake of both 227Th and 223Ra directly using the SPECT projection data from multiple energy windows. We evaluated the method with realistic simulation studies conducted with anthropomorphic digital phantoms, including a virtual imaging trial, in the context of imaging patients with bone metastases of prostate cancer who were treated with 227Th-based α-RPTs. The proposed method yielded reliable (accurate and precise) regional uptake estimates of both isotopes and outperformed state-of-the-art methods across different lesion sizes and contrasts, as well as in the virtual imaging trial. This reliable performance was also observed with moderate levels of intra-regional heterogeneous uptake, as well as when there were moderate inaccuracies in the definitions of the support of various regions. Additionally, we demonstrated the effectiveness of using multiple energy windows, and the variance of the uptake estimates obtained with the proposed method approached the theoretical limit defined by the Cramér-Rao lower bound. These results provide strong evidence in support of this method for reliable uptake quantification in 227Th-based α-RPTs.
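At its core, joint quantification amounts to fitting the regional activities of both isotopes to Poisson-distributed counts in several energy windows through a known system model. The toy sketch below illustrates that maximum-likelihood formulation with a randomly generated system matrix `H`; all dimensions and data are synthetic placeholders, and this is not the authors' MEW-PDQ implementation.

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative dimensions: W energy windows x P projection bins, R regions per isotope.
rng = np.random.default_rng(0)
W, P, R = 3, 500, 4
# H[w] maps regional activities (Th rows, then Ra rows) to expected counts in window w.
H = rng.uniform(0.01, 0.1, size=(W, P, 2 * R))
true_uptake = rng.uniform(5.0, 20.0, size=2 * R)
counts = rng.poisson(np.einsum('wpr,r->wp', H, true_uptake))

def neg_log_likelihood(x):
    """Negative Poisson log-likelihood over all windows and bins (constants dropped)."""
    lam = np.einsum('wpr,r->wp', H, x) + 1e-9
    return float(np.sum(lam - counts * np.log(lam)))

res = minimize(neg_log_likelihood, x0=np.ones(2 * R),
               bounds=[(0, None)] * (2 * R), method='L-BFGS-B')
th_uptake, ra_uptake = res.x[:R], res.x[R:]   # joint estimates for both isotopes
```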

Citations: 0
A Denoising Diffusion Probabilistic Model for Metal Artifact Reduction in CT
Pub Date : 2024-07-04 DOI: 10.1109/TMI.2024.3416398
Grigorios M. Karageorgos; Jiayong Zhang; Nils Peters; Wenjun Xia; Chuang Niu; Harald Paganetti; Ge Wang; Bruno De Man
The presence of metal objects leads to corrupted CT projection measurements, resulting in metal artifacts in the reconstructed CT images. AI promises to offer improved solutions to estimate missing sinogram data for metal artifact reduction (MAR), as previously shown with convolutional neural networks (CNNs) and generative adversarial networks (GANs). Recently, denoising diffusion probabilistic models (DDPM) have shown great promise in image generation tasks, potentially outperforming GANs. In this study, a DDPM-based approach is proposed for inpainting of missing sinogram data for improved MAR. The proposed model is unconditionally trained, free from information on metal objects, which can potentially enhance its generalization capabilities across different types of metal implants compared to conditionally trained approaches. The performance of the proposed technique was evaluated and compared to the state-of-the-art normalized MAR (NMAR) approach as well as to CNN-based and GAN-based MAR approaches. The DDPM-based approach provided significantly higher SSIM and PSNR, as compared to NMAR (SSIM: $p < 10^{-26}$; PSNR: $p < 10^{-21}$), the CNN (SSIM: $p < 10^{-25}$; PSNR: $p < 10^{-9}$) and the GAN (SSIM: $p < 10^{-6}$; PSNR: $p < 0.05$) methods. The DDPM-MAR technique was further evaluated based on clinically relevant image quality metrics on clinical CT images with virtually introduced metal objects and metal artifacts, demonstrating superior quality relative to the other three models. In general, the AI-based techniques showed improved MAR performance compared to the non-AI-based NMAR approach. The proposed methodology shows promise in enhancing the effectiveness of MAR, and therefore improving the diagnostic accuracy of CT.
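Because the model is trained unconditionally, inpainting can be performed at sampling time by re-imposing the trusted (non-metal-trace) sinogram bins at every reverse step, in the spirit of RePaint-style samplers. The sketch below assumes a trained noise-prediction network `eps_model` and a standard DDPM beta schedule; it is illustrative rather than the paper's exact sampler.

```python
import torch

@torch.no_grad()
def inpaint_sinogram(eps_model, sino, known_mask, betas):
    """Inpaint the metal trace of a sinogram with an unconditionally trained DDPM.

    sino:       measured sinogram; values inside the metal trace are ignored
    known_mask: 1 where the measurement is trusted, 0 inside the metal trace
    betas:      DDPM noise schedule, 1-D tensor of length T
    """
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    x = torch.randn_like(sino)
    for t in reversed(range(len(betas))):
        # Keep trusted bins consistent with the forward-noised measurement at level t.
        noisy_known = torch.sqrt(alpha_bar[t]) * sino + \
                      torch.sqrt(1 - alpha_bar[t]) * torch.randn_like(sino)
        x = known_mask * noisy_known + (1 - known_mask) * x
        # Standard DDPM reverse step on the whole sinogram.
        eps = eps_model(x, torch.tensor([t], device=x.device))
        mean = (x - betas[t] / torch.sqrt(1 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    # Final result: measured data where trusted, generated data in the metal trace.
    return known_mask * sino + (1 - known_mask) * x
```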
Citations: 0
Mitigating Aberration-Induced Noise: A Deep Learning-Based Aberration-to-Aberration Approach.
Pub Date : 2024-07-03 DOI: 10.1109/TMI.2024.3422027
Mostafa Sharifzadeh, Sobhan Goudarzi, An Tang, Habib Benali, Hassan Rivaz

One of the primary sources of suboptimal image quality in ultrasound imaging is phase aberration. It is caused by spatial changes in sound speed over a heterogeneous medium, which disturbs the transmitted waves and prevents coherent summation of echo signals. Obtaining non-aberrated ground truths in real-world scenarios can be extremely challenging, if not impossible. This challenge hinders the performance of deep learning-based techniques due to the domain shift between simulated and experimental data. Here, for the first time, we propose a deep learning-based method that does not require ground truth to correct the phase aberration problem and, as such, can be directly trained on real data. We train a network wherein both the input and target output are randomly aberrated radio frequency (RF) data. Moreover, we demonstrate that a conventional loss function such as mean square error is inadequate for training such a network to achieve optimal performance. Instead, we propose an adaptive mixed loss function that employs both B-mode and RF data, resulting in more efficient convergence and enhanced performance. Finally, we publicly release our dataset, comprising over 180,000 aberrated single plane-wave images (RF data), wherein phase aberrations are modeled as near-field phase screens. Although not utilized in the proposed method, each aberrated image is paired with its corresponding aberration profile and the non-aberrated version, aiming to mitigate the data scarcity problem in developing deep learning-based techniques for phase aberration correction. Source code and trained model are also available along with the dataset at http://code.sonography.ai/main-aaa.
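A mixed RF/B-mode objective of the kind described can be sketched as follows, where the B-mode term is a log-compressed envelope computed from the analytic signal. The `bmode` helper and the blending schedule are assumptions made for illustration, not the paper's exact adaptive loss.

```python
import torch

def bmode(rf: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Log-compressed envelope along the axial (last) dimension via the analytic signal."""
    n = rf.shape[-1]
    spec = torch.fft.fft(rf, dim=-1)
    h = torch.zeros(n, device=rf.device)          # Hilbert-transform weights
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    envelope = torch.abs(torch.fft.ifft(spec * h, dim=-1))
    return 20.0 * torch.log10(envelope + eps)

def adaptive_mixed_loss(pred_rf, target_rf, epoch, max_epoch):
    """Blend RF-domain and B-mode-domain MSE; this illustrative schedule shifts
    weight from the smoother B-mode term toward the RF term as training proceeds."""
    w = min(1.0, epoch / max_epoch)
    loss_rf = torch.mean((pred_rf - target_rf) ** 2)
    loss_bmode = torch.mean((bmode(pred_rf) - bmode(target_rf)) ** 2)
    return w * loss_rf + (1.0 - w) * loss_bmode
```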

Citations: 0
A Convolutional-Transformer Model for FFR and iFR Assessment From Coronary Angiography
Pub Date : 2024-07-02 DOI: 10.1109/TMI.2024.3383283
Raffaele Mineo; F. Proietto Salanitri; G. Bellitto; I. Kavasidis; O. De Filippo; M. Millesimo; G. M. De Ferrari; M. Aldinucci; D. Giordano; S. Palazzo; F. D’Ascenzo; C. Spampinato
The quantification of stenosis severity from X-ray catheter angiography is a challenging task. Indeed, it requires fully understanding the lesion’s geometry by analyzing the dynamics of the contrast material, relying solely on clinicians’ visual observation. To support decision making for cardiac intervention, we propose a hybrid CNN-Transformer model for the assessment of angiography-based non-invasive fractional flow-reserve (FFR) and instantaneous wave-free ratio (iFR) of intermediate coronary stenosis. Our approach predicts whether a coronary artery stenosis is hemodynamically significant and provides direct FFR and iFR estimates. This is achieved through a combination of regression and classification branches that forces the model to focus on the cut-off region of FFR (around an FFR value of 0.8), which is highly critical for decision-making. We also propose a spatio-temporal factorization mechanism that redesigns the transformer’s self-attention mechanism to capture both local spatial and temporal interactions between vessel geometry, blood flow dynamics, and lesion morphology. The proposed method achieves state-of-the-art performance on a dataset of 778 exams from 389 patients. Unlike existing methods, our approach employs a single angiography view and does not require knowledge of the key frame; supervision at training time is provided by a classification loss (based on a threshold of the FFR/iFR values) and a regression loss for direct estimation. Finally, the analysis of model interpretability and calibration shows that, in spite of the complexity of angiographic imaging data, our method can robustly identify the location of the stenosis and correlate prediction uncertainty to the provided output scores.
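The combination of a classification branch and a cut-off-focused regression branch might look roughly like the following joint loss, where regression errors near the 0.8 decision boundary are up-weighted. The Gaussian weighting and the loss balance are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def ffr_joint_loss(reg_pred, cls_logit, ffr_true, cutoff=0.8, sigma=0.1, lam=1.0):
    """Combine direct FFR regression with significance classification.

    The regression term is up-weighted for samples whose true FFR lies near the
    clinical cut-off (illustrative Gaussian weighting), so errors around 0.8
    cost more than errors far from the decision boundary.
    """
    significant = (ffr_true <= cutoff).float()          # hemodynamically significant
    cls_loss = F.binary_cross_entropy_with_logits(cls_logit, significant)
    w = 1.0 + torch.exp(-((ffr_true - cutoff) ** 2) / (2 * sigma ** 2))
    reg_loss = torch.mean(w * (reg_pred - ffr_true) ** 2)
    return reg_loss + lam * cls_loss

# Example with a batch of 4 exams:
reg_pred = torch.tensor([0.75, 0.90, 0.82, 0.60])
cls_logit = torch.tensor([-0.5, 2.0, 0.1, -2.0])
ffr_true = torch.tensor([0.78, 0.92, 0.81, 0.55])
print(ffr_joint_loss(reg_pred, cls_logit, ffr_true))
```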
Citations: 0
Uni4Eye++: A General Masked Image Modeling Multi-modal Pre-training Framework for Ophthalmic Image Classification and Segmentation.
Pub Date : 2024-07-02 DOI: 10.1109/TMI.2024.3422102
Zhiyuan Cai, Li Lin, Huaqing He, Pujin Cheng, Xiaoying Tang

A large-scale labeled dataset is a key factor for the success of supervised deep learning in most ophthalmic image analysis scenarios. However, limited annotated data is very common in ophthalmic image analysis, since manual annotation is time-consuming and labor-intensive. Self-supervised learning (SSL) methods bring huge opportunities for better utilizing unlabeled data, as they do not require massive annotations. To utilize as many unlabeled ophthalmic images as possible, it is necessary to break the dimension barrier, simultaneously making use of both 2D and 3D images while alleviating the issue of catastrophic forgetting. In this paper, we propose a universal self-supervised Transformer framework named Uni4Eye++ to discover intrinsic image characteristics and capture domain-specific feature embeddings in ophthalmic images. Uni4Eye++ serves as a global feature extractor built on a Masked Image Modeling task with a Vision Transformer architecture. Building on our previous work Uni4Eye, we further employ an image-entropy-guided masking strategy to reconstruct more-informative patches and a dynamic head generator module to alleviate modality confusion. We evaluate the performance of our pre-trained Uni4Eye++ encoder by fine-tuning it on multiple downstream ophthalmic image classification and segmentation tasks. The superiority of Uni4Eye++ is successfully established through comparisons to other state-of-the-art SSL pre-training methods. Our code is publicly available on GitHub.
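An entropy-guided masking strategy can be sketched as below: per-patch intensity entropy drives the sampling probability of masked patches, so reconstruction concentrates on informative regions. The patch size, histogram binning, and sampling scheme are illustrative assumptions rather than Uni4Eye++'s actual procedure.

```python
import torch

def entropy_guided_mask(images, patch=16, mask_ratio=0.75, bins=32):
    """Select patches to mask with probability proportional to their intensity entropy.

    images: (B, 1, H, W) in [0, 1], with H and W divisible by `patch`.
    Returns a boolean mask of shape (B, N) over the N patches, True = masked.
    """
    B, _, H, W = images.shape
    patches = images.unfold(2, patch, patch).unfold(3, patch, patch)  # (B,1,h,w,p,p)
    patches = patches.reshape(B, -1, patch * patch)                   # (B, N, p*p)
    N = patches.shape[1]
    entropies = torch.zeros(B, N)
    for b in range(B):
        for n in range(N):
            hist = torch.histc(patches[b, n], bins=bins, min=0.0, max=1.0)
            p = hist / hist.sum()
            entropies[b, n] = -(p[p > 0] * torch.log(p[p > 0])).sum()
    n_mask = int(mask_ratio * N)
    # Sample high-entropy patches more often (without replacement).
    idx = torch.multinomial(torch.softmax(entropies, dim=1), n_mask)
    mask = torch.zeros(B, N, dtype=torch.bool)
    batch_idx = torch.arange(B).unsqueeze(1).expand(-1, n_mask)
    mask[batch_idx, idx] = True
    return mask
```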

Citations: 0
IEEE Nuclear Science Symposium
Pub Date : 2024-07-01 DOI: 10.1109/TMI.2024.3372492
{"title":"IEEE Nuclear Science Symposium","authors":"","doi":"10.1109/TMI.2024.3372492","DOIUrl":"10.1109/TMI.2024.3372492","url":null,"abstract":"","PeriodicalId":94033,"journal":{"name":"IEEE transactions on medical imaging","volume":"43 7","pages":"2730-2730"},"PeriodicalIF":0.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10579890","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141489286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An Organ-aware Diagnosis Framework for Radiology Report Generation.
Pub Date : 2024-07-01 DOI: 10.1109/TMI.2024.3421599
Shiyu Li, Pengchong Qiao, Lin Wang, Munan Ning, Li Yuan, Yefeng Zheng, Jie Chen

Radiology report generation (RRG) is crucial for saving radiologists’ valuable time in drafting reports, thereby increasing their work efficiency. Compared to typical methods that directly transfer image captioning technologies to RRG, our approach incorporates organ-wise priors into the report generation. Specifically, in this paper, we propose Organ-aware Diagnosis (OaD) to generate diagnostic reports containing descriptions of each physiological organ. During training, we first develop a task distillation (TD) module to extract organ-level descriptions from reports. We then introduce an organ-aware report generation module that provides a specific description for each organ and additionally simulates clinical situations to provide short descriptions for normal cases. Furthermore, we design an auto-balance mask loss to ensure balanced training for normal/abnormal descriptions and various organs simultaneously. Being intuitively reasonable and practically simple, our OaD outperforms SOTA alternatives by large margins on the commonly used IU-Xray and MIMIC-CXR datasets, as evidenced by a 3.4% BLEU-1 improvement on MIMIC-CXR and a 2.0% BLEU-2 improvement on IU-Xray.
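One way to picture an auto-balance mask loss is a token-level cross-entropy re-weighted so that each organ or normal/abnormal group contributes equally to the report-generation objective, regardless of how many tokens it owns. The grouping and averaging below are illustrative assumptions, not the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def auto_balance_mask_loss(logits, targets, group_ids, ignore_index=-100):
    """Token-level cross-entropy averaged per group, then across groups.

    logits:    (T, V) decoder outputs for a report of T tokens over vocab V
    targets:   (T,)   ground-truth token ids (ignore_index for padding)
    group_ids: (T,)   id of the organ / normal-abnormal group each token describes
    """
    token_loss = F.cross_entropy(logits, targets, reduction='none',
                                 ignore_index=ignore_index)
    groups = torch.unique(group_ids[targets != ignore_index])
    loss = 0.0
    for g in groups:
        m = (group_ids == g) & (targets != ignore_index)
        loss = loss + token_loss[m].mean()   # each group weighs the same
    return loss / max(len(groups), 1)
```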

Citations: 0
Multi-Label Chest X-Ray Image Classification with Single Positive Labels.
Pub Date : 2024-07-01 DOI: 10.1109/TMI.2024.3421644
Jiayin Xiao, Si Li, Tongxu Lin, Jian Zhu, Xiaochen Yuan, David Dagan Feng, Bin Sheng

Deep learning approaches for multi-label chest X-ray (CXR) image classification usually require large-scale datasets. However, acquiring such datasets with full annotations is costly, time-consuming, and prone to noisy labels. Therefore, we introduce a weakly supervised learning problem called Single Positive Multi-label Learning (SPML) into CXR image classification (abbreviated as SPML-CXR), in which only one positive label is annotated per image. A simple solution to the SPML-CXR problem is to assume that all unannotated pathological labels are negative; however, this may introduce false negative labels and degrade model performance. To this end, we present a Multi-level Pseudo-label Consistency (MPC) framework for SPML-CXR. First, inspired by pseudo-labeling and consistency regularization in semi-supervised learning, we construct a weak-to-strong consistency framework, where the model prediction on a weakly-augmented image is treated as the pseudo label for supervising the model prediction on a strongly-augmented version of the same image, and we define an Image-level Perturbation-based Consistency (IPC) regularization to recover potentially mislabeled positive labels. In addition, we incorporate Random Elastic Deformation (RED) as an additional strong augmentation to enhance the perturbation. Second, aiming to expand the perturbation space, we add a feature-level perturbation stream to the consistency framework and introduce a Feature-level Perturbation-based Consistency (FPC) regularization as a supplement. Third, we design a Transformer-based encoder module to explore sample relationships within each mini-batch via a Batch-level Transformer-based Correlation (BTC) regularization. Extensive experiments on the CheXpert and MIMIC-CXR datasets have shown the effectiveness of our MPC framework for solving the SPML-CXR problem.
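The image-level perturbation-based consistency idea, with only a single annotated positive per image, can be sketched as follows: confident pseudo-labels from the weak view supervise the strong view alongside the annotated positive. The thresholding and weighting scheme are assumptions made for illustration, not the exact MPC formulation.

```python
import torch
import torch.nn.functional as F

def spml_ipc_loss(model, weak_img, strong_img, single_pos, threshold=0.7):
    """Single-positive multi-label training with image-level perturbation consistency.

    single_pos: (B, C) tensor with exactly one annotated positive (1) per image.
    The weak view produces per-class pseudo-labels; only confident pseudo-labels
    and the annotated positive supervise the strong view (illustrative scheme).
    """
    with torch.no_grad():
        p_weak = torch.sigmoid(model(weak_img))                      # (B, C)
        pseudo = (p_weak > threshold).float()
        confident = (p_weak > threshold) | (p_weak < 1 - threshold)  # confident either way
    logits_strong = model(strong_img)
    # Supervised term: only the single annotated positive label contributes.
    sup = F.binary_cross_entropy_with_logits(logits_strong, single_pos,
                                             weight=single_pos)
    # Consistency term: confident pseudo-labels from the weak view.
    cons = F.binary_cross_entropy_with_logits(logits_strong, pseudo,
                                              weight=confident.float())
    return sup + cons
```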

Citations: 0