
Latest publications in IEEE Transactions on Medical Imaging

SISMIK for brain MRI: Deep-learning-based motion estimation and model-based motion correction in k-space.
Pub Date : 2024-08-19 DOI: 10.1109/TMI.2024.3446450
Oscar Dabrowski, Jean-Luc Falcone, Antoine Klauser, Julien Songeon, Michel Kocher, Bastien Chopard, Francois Lazeyras, Sebastien Courvoisier

MRI, a widespread non-invasive medical imaging modality, is highly sensitive to patient motion. Despite many attempts over the years, motion correction remains a difficult problem and there is no general method applicable to all situations. We propose a retrospective method for motion estimation and correction to tackle the problem of in-plane rigid-body motion, apt for classical 2D Spin-Echo scans of the brain, which are regularly used in clinical practice. Due to the sequential acquisition of k-space, motion artifacts are well localized. The method leverages the power of deep neural networks to estimate motion parameters in k-space and uses a model-based approach to restore degraded images while avoiding "hallucinations". A notable advantage is its ability to estimate motion occurring at high spatial frequencies without the need for a motion-free reference. The proposed method operates on the whole k-space dynamic range and is moderately affected by the lower SNR of higher harmonics. As a proof of concept, we provide models trained using supervised learning on 600k motion simulations based on motion-free scans of 43 different subjects. Generalization performance was tested with simulations as well as in-vivo. Qualitative and quantitative evaluations are presented for motion parameter estimation and image reconstruction. Experimental results show that our approach obtains good generalization performance on simulated data and in-vivo acquisitions. We provide a Python implementation at https://gitlab.unige.ch/Oscar.Dabrowski/sismik_mri/.
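
As context for the k-space formulation above, the sketch below (our own illustration, not the authors' SISMIK code) simulates in-plane rigid-body motion for a line-by-line 2D Spin-Echo acquisition by taking each phase-encode line from the FFT of the image at its assumed pose at that moment; the image, motion trajectory, and helper names are hypothetical.

```python
import numpy as np
from scipy.ndimage import rotate, shift

def simulate_motion_kspace(image, angles_deg, shifts_px):
    """Build a motion-corrupted k-space one phase-encode line at a time.

    image      : 2D array (motion-free image, phase-encode along axis 0)
    angles_deg : per-line in-plane rotation angle, len == image.shape[0]
    shifts_px  : per-line (dy, dx) translation in pixels, shape (n_pe, 2)
    """
    n_pe, n_ro = image.shape
    kspace = np.zeros((n_pe, n_ro), dtype=complex)
    for line in range(n_pe):
        # Pose of the object while this k-space line is being acquired.
        moved = rotate(image, angles_deg[line], reshape=False, order=1)
        moved = shift(moved, shifts_px[line], order=1)
        k_full = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(moved)))
        kspace[line, :] = k_full[line, :]      # keep only the line acquired at this time
    return kspace

# Example: an abrupt 3-pixel shift and 2-degree rotation halfway through the scan.
img = np.random.rand(128, 128)
angles = np.r_[np.zeros(64), np.full(64, 2.0)]
shifts = np.r_[np.zeros((64, 2)), np.tile([3.0, 0.0], (64, 1))]
corrupted_k = simulate_motion_kspace(img, angles, shifts)
corrupted_img = np.abs(np.fft.fftshift(np.fft.ifft2(np.fft.ifftshift(corrupted_k))))
```

Estimating the per-line rotations and translations from such corrupted data, and undoing them in a model-based reconstruction, is the task the paper's network addresses.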

Citations: 0
Investigating and Improving Latent Density Segmentation Models for Aleatoric Uncertainty Quantification in Medical Imaging.
Pub Date : 2024-08-19 DOI: 10.1109/TMI.2024.3445999
M M Amaan Valiuddin, Christiaan G A Viviers, Ruud J G Van Sloun, Peter H N De With, Fons van der Sommen

Data uncertainties, such as sensor noise, occlusions or limitations in the acquisition method, can introduce irreducible ambiguities in images, which result in varying, yet plausible, semantic hypotheses. In Machine Learning, this ambiguity is commonly referred to as aleatoric uncertainty. In image segmentation, latent density models can be utilized to address this problem. The most popular approach is the Probabilistic U-Net (PU-Net), which uses latent Normal densities to optimize the conditional data log-likelihood Evidence Lower Bound. In this work, we demonstrate that the PU-Net latent space is severely sparse and heavily under-utilized. To address this, we introduce mutual information maximization and entropy-regularized Sinkhorn Divergence in the latent space to promote homogeneity across all latent dimensions, effectively improving gradient-descent updates and latent space informativeness. Our results show that, when applied to public datasets of various clinical segmentation problems, our proposed methodology achieves up to 11% performance gains over preceding latent variable models for probabilistic segmentation on the Hungarian-Matched Intersection over Union. The results indicate that encouraging a homogeneous latent space significantly improves latent density modeling for medical image segmentation.
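
For readers unfamiliar with the entropy-regularized Sinkhorn divergence mentioned above, a minimal PyTorch sketch follows; it is a generic debiased Sinkhorn computation between two batches of latent samples, with illustrative shapes and regularization strength, and is not the authors' implementation.

```python
import torch

def sinkhorn_cost(x, y, eps=0.1, n_iters=100):
    """Entropy-regularized OT cost between two uniform point clouds x: (n, d), y: (m, d)."""
    C = torch.cdist(x, y, p=2) ** 2                       # squared Euclidean cost matrix
    K = torch.exp(-C / eps)                               # Gibbs kernel
    a = torch.full((x.shape[0],), 1.0 / x.shape[0], device=x.device)
    b = torch.full((y.shape[0],), 1.0 / y.shape[0], device=y.device)
    u, v = torch.ones_like(a), torch.ones_like(b)
    for _ in range(n_iters):                              # Sinkhorn fixed-point iterations
        u = a / (K @ v + 1e-12)
        v = b / (K.T @ u + 1e-12)
    P = u[:, None] * K * v[None, :]                       # transport plan
    return (P * C).sum()

def sinkhorn_divergence(x, y, eps=0.1):
    """Debiased divergence S(x, y) = OT(x, y) - (OT(x, x) + OT(y, y)) / 2."""
    return sinkhorn_cost(x, y, eps) - 0.5 * (sinkhorn_cost(x, x, eps)
                                             + sinkhorn_cost(y, y, eps))

# Example: penalize mismatch between posterior latent samples and a reference batch.
z_post = torch.randn(32, 6)        # latent samples from the posterior network
z_prior = torch.randn(32, 6)       # samples from the latent prior
loss_latent = sinkhorn_divergence(z_post, z_prior)
```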

Citations: 0
S2Former-OR: Single-Stage Bi-Modal Transformer for Scene Graph Generation in OR.
Pub Date : 2024-08-15 DOI: 10.1109/TMI.2024.3444279
Jialun Pei, Diandian Guo, Jingyang Zhang, Manxi Lin, Yueming Jin, Pheng-Ann Heng

Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistic cognitive intelligence in the operating room (OR). However, previous works have primarily relied on multi-stage learning, where the generated semantic scene graphs depend on intermediate processes with pose estimation and object detection. This pipeline may potentially compromise the flexibility of learning multimodal representations, consequently constraining the overall effectiveness. In this study, we introduce a novel single-stage bi-modal transformer framework for SGG in the OR, termed S2Former-OR, which aims to complementarily leverage multi-view 2D scenes and 3D point clouds for SGG in an end-to-end manner. Concretely, our model embraces a View-Sync Transfusion scheme to encourage multi-view visual information interaction. Concurrently, a Geometry-Visual Cohesion operation is designed to integrate the synergic 2D semantic features into 3D point cloud features. Moreover, based on the augmented feature, we propose a novel relation-sensitive transformer decoder that embeds dynamic entity-pair queries and relational trait priors, which enables the direct prediction of entity-pair relations for graph generation without intermediate steps. Extensive experiments have validated the superior SGG performance and lower computational cost of S2Former-OR on the 4D-OR benchmark compared with current OR-SGG methods, e.g., a 3-percentage-point increase in Precision and a 24.2M reduction in model parameters. We further compared our method with generic single-stage SGG methods using broader metrics for a comprehensive evaluation, with consistently better performance achieved. Our source code is available at: https://github.com/PJLallen/S2Former-OR.
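
The notion of directly predicting entity-pair relations from decoder embeddings can be illustrated with the simplified sketch below; the module, feature shapes, and relation count are assumptions for illustration and do not reproduce the released S2Former-OR architecture.

```python
import torch
import torch.nn as nn

class PairRelationHead(nn.Module):
    """Classify a relation for every ordered (subject, object) pair of entity embeddings."""
    def __init__(self, d_model=256, n_relations=8):
        super().__init__()
        self.pair_mlp = nn.Sequential(
            nn.Linear(2 * d_model, d_model), nn.ReLU(),
            nn.Linear(d_model, n_relations),
        )

    def forward(self, entities):
        """entities: (B, N, d) decoder embeddings -> relation logits (B, N, N, R)."""
        B, N, d = entities.shape
        subj = entities.unsqueeze(2).expand(B, N, N, d)   # subject side of each pair
        obj = entities.unsqueeze(1).expand(B, N, N, d)    # object side of each pair
        return self.pair_mlp(torch.cat([subj, obj], dim=-1))

head = PairRelationHead()
rel_logits = head(torch.randn(2, 6, 256))                 # 6 entities -> a 6x6 grid of relations
```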

Citations: 0
AutoSamp: Autoencoding k-space Sampling via Variational Information Maximization for 3D MRI.
Pub Date : 2024-08-15 DOI: 10.1109/TMI.2024.3443292
Cagan Alkan, Morteza Mardani, Congyu Liao, Zhitao Li, Shreyas S Vasanawala, John M Pauly

Accelerated MRI protocols routinely involve a predefined sampling pattern that undersamples the k-space. Finding an optimal pattern can enhance the reconstruction quality; however, this optimization is a challenging task. To address this challenge, we introduce a novel deep learning framework, AutoSamp, based on variational information maximization that enables joint optimization of the sampling pattern and reconstruction of MRI scans. We represent the encoder as a non-uniform Fast Fourier Transform that allows continuous optimization of k-space sample locations on a non-Cartesian plane, and the decoder as a deep reconstruction network. Experiments on public 3D acquired MRI datasets show improved reconstruction quality of the proposed AutoSamp method over the prevailing variable density and variable density Poisson disc sampling for both compressed sensing and deep learning reconstructions. We demonstrate that our data-driven sampling optimization method achieves PSNR improvements of 4.4 dB, 2.0 dB, 0.75 dB, and 0.7 dB over reconstruction with Poisson disc masks for acceleration factors of R = 5, 10, 15, and 25, respectively. Prospectively accelerated acquisitions with 3D FSE sequences using our optimized sampling patterns exhibit improved image quality and sharpness. Furthermore, we analyze the characteristics of the learned sampling patterns with respect to changes in acceleration factor, measurement noise, underlying anatomy, and coil sensitivities. We show that all these factors contribute to the optimization result by affecting the sampling density, k-space coverage and point spread functions of the learned sampling patterns.
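
The key idea of treating k-space sample locations as continuous, differentiable parameters can be illustrated with the toy sketch below, which uses an explicit non-uniform DFT (rather than the paper's NUFFT encoder and deep decoder) so that gradients of a reconstruction loss flow back into the sampling pattern; the image size, sample count, and optimizer settings are arbitrary assumptions.

```python
import torch

H = W = 32
n_samples = 256

# Centered pixel grid reused by the forward and adjoint operators.
ys, xs = torch.meshgrid(torch.arange(H) - H / 2, torch.arange(W) - W / 2, indexing="ij")
grid = torch.stack([ys.flatten(), xs.flatten()], dim=1).float()          # (H*W, 2)

def nudft(image, ktraj):
    """Type-2 non-uniform DFT: sample the spectrum of `image` at continuous
    coordinates ktraj (n, 2), expressed in cycles per field of view."""
    phase = -2j * torch.pi * (ktraj @ grid.T) / H                         # (n, H*W)
    return torch.exp(phase) @ image.flatten().to(torch.complex64)         # (n,)

# Learnable sample locations, optimized jointly with a (deliberately trivial)
# "reconstruction network": adjoint NUDFT followed by a learnable per-pixel scale.
ktraj = torch.nn.Parameter((torch.rand(n_samples, 2) - 0.5) * H)
recon_scale = torch.nn.Parameter(torch.ones(H, W))
opt = torch.optim.Adam([ktraj, recon_scale], lr=1e-2)

target = torch.randn(H, W)                                                # stand-in training image
for it in range(200):
    kdata = nudft(target, ktraj)                                          # "acquire" k-space samples
    adj_phase = 2j * torch.pi * (grid @ ktraj.T) / H                      # adjoint operator
    recon = (torch.exp(adj_phase) @ kdata).real.reshape(H, W) / n_samples
    loss = torch.mean((recon * recon_scale - target) ** 2)                # reconstruction loss
    opt.zero_grad()
    loss.backward()                                                       # gradients reach ktraj too
    opt.step()
```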

Citations: 0
Bi-Constraints Diffusion: A Conditional Diffusion Model with Degradation Guidance for Metal Artifact Reduction.
Pub Date : 2024-08-15 DOI: 10.1109/TMI.2024.3442950
Mengting Luo, Nan Zhou, Tao Wang, Linchao He, Wang Wang, Hu Chen, Peixi Liao, Yi Zhang

In recent years, score-based diffusion models have emerged as effective tools for estimating score functions from empirical data distributions, particularly in integrating implicit priors with inverse problems like CT reconstruction. However, score-based diffusion models are rarely explored in challenging tasks such as metal artifact reduction (MAR). In this paper, we introduce the Bi-Constraints Diffusion Model for Metal Artifact Reduction (BCDMAR), an innovative approach that enhances iterative reconstruction with a conditional diffusion model for MAR. This method employs a metal artifact degradation operator in place of the traditional metal-excluded projection operator in the data-fidelity term, thereby preserving structure details around metal regions. However, score-based diffusion models tend to be susceptible to grayscale shifts and unreliable structures, making it challenging to reach an optimal solution. To address this, we utilize a pre-corrected image as a prior constraint, guiding the generation of the score-based diffusion model. By iteratively applying the score-based diffusion model and the data-fidelity step in each sampling iteration, BCDMAR effectively maintains reliable tissue representation around metal regions and produces highly consistent structures in non-metal regions. Through extensive experiments focused on metal artifact reduction tasks, BCDMAR demonstrates superior performance over other state-of-the-art unsupervised and supervised methods, both quantitatively and in terms of visual results.
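
The alternation between a learned denoising update and a data-fidelity correction described above can be sketched schematically as follows; the degradation operator A, the score network, the pre-corrected prior image, and the step sizes are placeholders, not the BCDMAR implementation.

```python
import torch

def sample_with_fidelity(score_net, A, y, x_pre, n_steps=50, step=0.1,
                         lam_data=1.0, lam_prior=0.1):
    """score_net(x, t) -> score estimate; A(x) -> simulated degraded data;
    y -> measured data; x_pre -> pre-corrected prior image."""
    x = torch.randn_like(x_pre)                              # start from pure noise
    for i in reversed(range(n_steps)):
        t = torch.full((x.shape[0],), i / n_steps, device=x.device)
        x = x + step * score_net(x, t)                        # learned denoising update
        x = x + (2.0 * step) ** 0.5 * torch.randn_like(x) * (i > 0)

        # Data-fidelity plus prior-constraint correction (one gradient step).
        x = x.detach().requires_grad_(True)
        loss = (lam_data * ((A(x) - y) ** 2).sum()
                + lam_prior * ((x - x_pre) ** 2).sum())
        grad = torch.autograd.grad(loss, x)[0]
        x = (x - step * grad).detach()
    return x

# Toy usage with stand-in operators: identity "degradation" and a score pulling toward zero.
dummy_score = lambda x, t: -x
A = lambda x: x
y = torch.zeros(1, 1, 16, 16)
x_pre = torch.zeros_like(y)
x_hat = sample_with_fidelity(dummy_score, A, y, x_pre)
```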

Citations: 0
Domain-interactive Contrastive Learning and Prototype-guided Self-training for Cross-domain Polyp Segmentation.
Pub Date : 2024-08-14 DOI: 10.1109/TMI.2024.3443262
Ziru Lu, Yizhe Zhang, Yi Zhou, Ye Wu, Tao Zhou

Accurate polyp segmentation from colonoscopy images plays a critical role in the diagnosis and treatment of colorectal cancer. While deep learning-based polyp segmentation models have made significant progress, they often suffer from performance degradation when applied to unseen target domain datasets collected from different imaging devices. To address this challenge, unsupervised domain adaptation (UDA) methods have gained attention by leveraging labeled source data and unlabeled target data to reduce the domain gap. However, existing UDA methods primarily focus on capturing class-wise representations, neglecting domain-wise representations. Additionally, uncertainty in pseudo labels could hinder the segmentation performance. To tackle these issues, we propose a novel Domain-interactive Contrastive Learning and Prototype-guided Self-training (DCL-PS) framework for cross-domain polyp segmentation. Specifically, domain-interactive contrastive learning (DCL) with a domain-mixed prototype updating strategy is proposed to discriminate class-wise feature representations across domains. Then, to enhance the feature extraction ability of the encoder, we present a contrastive learning-based cross-consistency training (CL-CCT) strategy, which is imposed on both the prototypes obtained from the outputs of the main decoder and the perturbed auxiliary outputs. Furthermore, we propose a prototype-guided self-training (PS) strategy, which dynamically assigns a weight to each pixel during self-training, filtering out unreliable pixels and improving the quality of pseudo-labels. Experimental results demonstrate the superiority of DCL-PS in improving polyp segmentation performance in the target domain. The code will be released at https://github.com/taozh2017/DCLPS.
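
The prototype-guided weighting of pseudo-labels can be illustrated with the simplified sketch below; the feature shapes, the use of cosine similarity, and the mapping to [0, 1] weights are our assumptions rather than the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def prototype_weights(feats, probs):
    """feats: (B, C, H, W) decoder features; probs: (B, K, H, W) softmax predictions.
    Returns hard pseudo-labels (B, H, W) and per-pixel confidence weights in [0, 1]."""
    B, C, H, W = feats.shape
    K = probs.shape[1]
    f = feats.permute(0, 2, 3, 1).reshape(-1, C)                 # (N, C) pixel features
    p = probs.permute(0, 2, 3, 1).reshape(-1, K)                 # (N, K)
    pseudo = p.argmax(dim=1)                                     # hard pseudo-labels

    # Soft class prototypes: probability-weighted mean feature per class.
    protos = (p.T @ f) / (p.sum(dim=0, keepdim=True).T + 1e-6)   # (K, C)

    # Cosine similarity to the prototype of each pixel's pseudo-class, mapped from
    # [-1, 1] to a [0, 1] weight: far-from-prototype pixels are down-weighted.
    sim = F.cosine_similarity(f, protos[pseudo], dim=1)
    weights = (sim + 1.0) / 2.0
    return pseudo.reshape(B, H, W), weights.reshape(B, H, W)

# Usage: weight the self-training loss on unlabeled target images.
feats = torch.randn(2, 64, 32, 32)
logits = torch.randn(2, 2, 32, 32)
pseudo, w = prototype_weights(feats, logits.softmax(dim=1))
loss = (w * F.cross_entropy(logits, pseudo, reduction="none")).mean()
```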

Citations: 0
Prompt-driven Latent Domain Generalization for Medical Image Classification.
Pub Date : 2024-08-13 DOI: 10.1109/TMI.2024.3443119
Siyuan Yan, Zhen Yu, Chi Liu, Lie Ju, Dwarikanath Mahapatra, Brigid Betz-Stablein, Victoria Mar, Monika Janda, Peter Soyer, Zongyuan Ge

Deep learning models for medical image analysis easily suffer from distribution shifts caused by dataset artifact bias, camera variations, differences in the imaging station, etc., leading to unreliable diagnoses in real-world clinical settings. Domain generalization (DG) methods, which aim to train models on multiple domains so that they perform well on unseen domains, offer a promising direction to solve the problem. However, existing DG methods assume that domain labels of each image are available and accurate, which is typically feasible for only a limited number of medical datasets. To address these challenges, we propose a unified DG framework for medical image classification without relying on domain labels, called Prompt-driven Latent Domain Generalization (PLDG). PLDG consists of unsupervised domain discovery and prompt learning. This framework first discovers pseudo domain labels by clustering the bias-associated style features, then leverages collaborative domain prompts to guide a Vision Transformer to learn knowledge from the discovered diverse domains. To facilitate cross-domain knowledge learning between different prompts, we introduce a domain prompt generator that enables knowledge sharing between domain prompts and a shared prompt. A domain mixup strategy is additionally employed to allow more flexible decision margins and mitigate the risk of incorrect domain assignments. Extensive experiments on three medical image classification tasks and one debiasing task demonstrate that our method can achieve comparable or even superior performance to conventional DG algorithms without relying on domain labels. Our code is publicly available at https://github.com/SiyuanYan1/PLDG/tree/main.
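
A hedged sketch of the unsupervised domain discovery step follows: per-image style statistics (channel-wise mean and standard deviation of shallow CNN features) are clustered to produce pseudo domain labels. The choice of backbone layer and the number of clusters are assumptions, not the released PLDG code.

```python
import torch
import torchvision.models as models
from sklearn.cluster import KMeans

backbone = models.resnet18(weights=None)
stem = torch.nn.Sequential(backbone.conv1, backbone.bn1, backbone.relu,
                           backbone.maxpool, backbone.layer1)
stem.eval()

@torch.no_grad()
def style_vector(images):
    """images: (B, 3, H, W) -> (B, 2*C) concatenated channel-wise mean and std."""
    feats = stem(images)                                     # (B, C, h, w) shallow features
    mu = feats.mean(dim=(2, 3))
    sigma = feats.std(dim=(2, 3))
    return torch.cat([mu, sigma], dim=1)

images = torch.rand(16, 3, 224, 224)                         # stand-in unlabeled training batch
styles = style_vector(images).numpy()
pseudo_domains = KMeans(n_clusters=3, n_init=10).fit_predict(styles)   # pseudo domain labels
```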

Citations: 0
A New Benchmark: Clinical Uncertainty and Severity Aware Labeled Chest X-Ray Images with Multi-Relationship Graph Learning.
Pub Date : 2024-08-09 DOI: 10.1109/TMI.2024.3441494
Mengliang Zhang, Xinyue Hu, Lin Gu, Liangchen Liu, Kazuma Kobayashi, Tatsuya Harada, Yan Yan, Ronald M Summers, Yingying Zhu

Chest radiography, commonly known as CXR, is frequently utilized in clinical settings to detect cardiopulmonary conditions. However, even seasoned radiologists might offer different evaluations regarding the seriousness and uncertainty associated with observed abnormalities. Previous research has attempted to utilize clinical notes to extract abnormality labels for training deep-learning models in CXR image diagnosis. However, these methods often neglected the varying degrees of severity and uncertainty linked to different labels. In our study, we first assembled a comprehensive new dataset of CXR images based on clinical textual data, which incorporates radiologists' assessments of uncertainty and severity. Using this dataset, we introduced a multi-relationship graph learning framework that leverages spatial and semantic relationships while addressing expert uncertainty through a dedicated loss function. Our research showcases a notable enhancement in CXR image diagnosis and the interpretability of the diagnostic model, surpassing existing state-of-the-art methodologies. The disease severity and uncertainty dataset we extracted is available at: https://physionet.org/content/cad-chest/1.0/.
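
One simple way to fold per-label uncertainty and severity annotations into a training objective is sketched below as a weighted binary cross-entropy; this is a hypothetical illustration, and the paper's dedicated loss function may differ.

```python
import torch
import torch.nn.functional as F

def uncertainty_severity_bce(logits, labels, uncertainty, severity, alpha=1.0):
    """All tensors are (B, num_findings); labels are 0/1, uncertainty and severity in [0, 1]."""
    per_label = F.binary_cross_entropy_with_logits(logits, labels, reduction="none")
    weights = (1.0 - uncertainty) * (1.0 + alpha * severity)   # trust certain labels, stress severe ones
    return (weights * per_label).mean()

logits = torch.randn(4, 14)
labels = torch.randint(0, 2, (4, 14)).float()
uncertainty = torch.rand(4, 14)     # radiologist-assessed uncertainty per finding
severity = torch.rand(4, 14)        # radiologist-assessed severity per finding
loss = uncertainty_severity_bce(logits, labels, uncertainty, severity)
```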

Citations: 0
RemixFormer++: A Multi-modal Transformer Model for Precision Skin Tumor Differential Diagnosis with Memory-efficient Attention.
Pub Date : 2024-08-09 DOI: 10.1109/TMI.2024.3441012
Jing Xu, Kai Huang, Lianzhen Zhong, Yuan Gao, Kai Sun, Wei Liu, Yanjie Zhou, Wenchao Guo, Yuan Guo, Yuanqiang Zou, Yuping Duan, Le Lu, Yu Wang, Xiang Chen, Shuang Zhao

Diagnosing malignant skin tumors accurately at an early stage can be challenging due to ambiguous and even confusing visual characteristics displayed by various categories of skin tumors. To improve diagnosis precision, all available clinical data from multiple sources, particularly clinical images, dermoscopy images, and medical history, could be considered. Aligning with clinical practice, we propose a novel Transformer model, named RemixFormer++, which consists of a clinical image branch, a dermoscopy image branch, and a metadata branch. Given the unique characteristics inherent in clinical and dermoscopy images, specialized attention strategies are adopted for each type. Clinical images are processed through a top-down architecture, capturing both localized lesion details and global contextual information. Conversely, dermoscopy images undergo bottom-up processing with two-level hierarchical encoders, designed to pinpoint fine-grained structural and textural features. A dedicated metadata branch seamlessly integrates non-visual information by encoding relevant patient data. Fusing features from the three branches substantially boosts disease classification accuracy. RemixFormer++ demonstrates exceptional performance on four single-modality datasets (PAD-UFES-20, ISIC 2017/2018/2019). Compared with the previous best method on the public multi-modal Derm7pt dataset, we achieved an absolute 5.3% increase in averaged F1 and 1.2% in accuracy for the classification of five skin tumors. Furthermore, using a large-scale in-house dataset of 10,351 patients with the twelve most common skin tumors, our method obtained an overall classification accuracy of 92.6%. These promising results, on par with or better than the performance of 191 dermatologists in a comprehensive reader study, clearly indicate the potential clinical usability of our method.
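
The three-branch fusion described above can be sketched schematically as follows; the feature dimensions, the metadata fields, and the concatenation-based fusion head are illustrative assumptions and do not reproduce the RemixFormer++ modules or its memory-efficient attention.

```python
import torch
import torch.nn as nn

class ThreeBranchFusion(nn.Module):
    """Concatenate clinical-image, dermoscopy-image and metadata features, then classify."""
    def __init__(self, d_img=256, n_meta_fields=8, d_meta=32, n_classes=5):
        super().__init__()
        self.meta_encoder = nn.Sequential(nn.Linear(n_meta_fields, d_meta), nn.ReLU())
        self.head = nn.Sequential(
            nn.LayerNorm(2 * d_img + d_meta),
            nn.Linear(2 * d_img + d_meta, n_classes),
        )

    def forward(self, clin_feat, derm_feat, metadata):
        meta_feat = self.meta_encoder(metadata)                      # encode age, sex, site, ...
        fused = torch.cat([clin_feat, derm_feat, meta_feat], dim=1)  # feature-level fusion
        return self.head(fused)

model = ThreeBranchFusion()
logits = model(torch.randn(2, 256), torch.randn(2, 256), torch.randn(2, 8))
```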

Citations: 0
PRECISION: A Physics-Constrained and Noise-Controlled Diffusion Model for Photon Counting Computed Tomography.
Pub Date : 2024-08-08 DOI: 10.1109/TMI.2024.3440651
Ruifeng Chen, Zhongliang Zhang, Guotao Quan, Yanfeng Du, Yang Chen, Yinsheng Li

Recently, the use of photon counting detectors in computed tomography (PCCT) has attracted extensive attention. It is highly desirable to improve the quality of material basis images and the quantitative accuracy of elemental composition, particularly when PCCT data are acquired at lower radiation dose levels. In this work, we develop a physics-constrained and noise-controlled diffusion model, PRECISION for short, to address the degraded quality of material basis images and inaccurate quantification of elemental composition mainly caused by an imperfect noise model and/or hand-crafted regularization of material basis images, such as local smoothness and/or sparsity, leveraged in existing direct material basis image reconstruction approaches. In stark contrast, PRECISION learns distribution-level regularization to describe the features of ideal material basis images via training a noise-controlled spatial-spectral diffusion model. The optimal material basis images of each individual subject are sampled from this learned distribution under the constraint of the physical model of a given PCCT system and the measured data obtained from the subject. PRECISION exhibits the potential to improve the quality of material basis images and the quantitative accuracy of elemental composition for PCCT.
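
The physics constraint referred to above can be illustrated with the toy sketch below, in which multi-bin PCCT measurements are modeled as projections of material basis images weighted by per-bin attenuation coefficients, and the data mismatch yields a correction gradient for the sampled basis images; the projector, the two-material/three-bin setup, and all numbers are placeholders, not the PRECISION model.

```python
import torch

def pcct_forward(basis_imgs, mu_bins, projector):
    """basis_imgs: (M, H, W); mu_bins: (bins, M) per-bin attenuation weights;
    projector: (H, W) image -> (n_views, n_det) sinogram."""
    sinos = torch.stack([projector(b) for b in basis_imgs])        # (M, n_views, n_det)
    return torch.einsum("bm,mvd->bvd", mu_bins, sinos)             # (bins, n_views, n_det)

# Crude stand-in projector: one parallel projection replicated over 16 "views".
projector = lambda x: x.sum(dim=0, keepdim=True).expand(16, -1)
mu_bins = torch.tensor([[0.30, 0.020],                             # illustrative per-bin weights
                        [0.22, 0.012],                             # for two basis materials
                        [0.18, 0.008]])
basis = torch.zeros(2, 64, 64, requires_grad=True)                 # current diffusion sample
y = pcct_forward(torch.rand(2, 64, 64), mu_bins, projector)        # "measured" bin data

loss = ((pcct_forward(basis, mu_bins, projector) - y) ** 2).sum()  # physics data fidelity
grad = torch.autograd.grad(loss, basis)[0]                         # correction used during sampling
```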

Citations: 0