IEEE Transactions on Medical Imaging: Latest Articles (Page 2)

Geometry-Aware Attenuation Learning for Sparse-View CBCT Reconstruction

IEEE transactions on medical imaging

Pub Date : 2024-10-04 DOI: 10.1109/TMI.2024.3473970

Zhentao Liu;Yu Fang;Changjian Li;Han Wu;Yuan Liu;Dinggang Shen;Zhiming Cui

Cone Beam Computed Tomography (CBCT) plays a vital role in clinical imaging. Traditional methods typically require hundreds of 2D X-ray projections to reconstruct a high-quality 3D CBCT image, leading to considerable radiation exposure. This has led to a growing interest in sparse-view CBCT reconstruction to reduce radiation doses. While recent advances, including deep learning and neural rendering algorithms, have made strides in this area, these methods either produce unsatisfactory results or suffer from the time inefficiency of individual optimization. In this paper, we introduce a novel geometry-aware encoder-decoder framework to solve this problem. Our framework starts by encoding multi-view 2D features from various 2D X-ray projections with a 2D CNN encoder. Leveraging the geometry of CBCT scanning, it then back-projects the multi-view 2D features into 3D space to form a comprehensive volumetric feature map, followed by a 3D CNN decoder to recover the 3D CBCT image. Importantly, our approach respects the geometric relationship between the 3D CBCT image and its 2D X-ray projections during the feature back-projection stage, and benefits from the prior knowledge learned from the data population. This ensures its adaptability to extremely sparse-view inputs without individual training, such as scenarios with only 5 or 10 X-ray projections. Extensive evaluations on two simulated datasets and one real-world dataset demonstrate the exceptional reconstruction quality and time efficiency of our method.
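The feature back-projection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `back_project_features`, the use of 3x4 projection matrices, nearest-neighbour sampling, and plain averaging across views are all assumptions made for illustration. The abstract does not specify how features are sampled or fused.

```python
import numpy as np

def back_project_features(feats, proj_mats, grid_pts):
    """Lift per-view 2D feature maps onto 3D points (illustrative sketch).

    feats:     (V, C, H, W) feature maps, one per X-ray view
    proj_mats: (V, 3, 4) projection matrices mapping world -> detector pixels
               (hypothetical geometry description, assumed known)
    grid_pts:  (N, 3) world coordinates of the voxel centres
    Returns an (N, C) array: the mean feature over all views in which the
    point projects inside the detector (zeros if it is never visible).
    """
    V, C, H, W = feats.shape
    N = grid_pts.shape[0]
    homog = np.concatenate([grid_pts, np.ones((N, 1))], axis=1)  # (N, 4)
    acc = np.zeros((N, C))
    cnt = np.zeros((N, 1))
    for v in range(V):
        uvw = homog @ proj_mats[v].T              # (N, 3) homogeneous pixels
        w = uvw[:, 2]
        in_front = w > 1e-8                       # skip points behind the source
        safe_w = np.where(in_front, w, 1.0)       # avoid division by zero
        col = np.rint(uvw[:, 0] / safe_w).astype(int)
        row = np.rint(uvw[:, 1] / safe_w).astype(int)
        inside = in_front & (col >= 0) & (col < W) & (row >= 0) & (row < H)
        # Nearest-neighbour lookup (assumption; interpolation is also common).
        sampled = feats[v][:, row[inside], col[inside]].T  # (n_inside, C)
        acc[inside] += sampled
        cnt[inside] += 1
    return acc / np.maximum(cnt, 1)
```

Reshaping the returned array to `(D, H_vol, W_vol, C)` gives a volumetric feature map of the kind a 3D decoder would consume. Because the mapping is fixed by the scan geometry, it needs no per-patient optimization.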

Volume 44, Issue 2, pp. 1083-1097

Citations: 0

A Gradient-Based Approach to Fast and Accurate Head Motion Compensation in Cone-Beam CT

IEEE transactions on medical imaging

Pub Date : 2024-10-04 DOI: 10.1109/TMI.2024.3474250

Mareike Thies;Fabian Wagner;Noah Maul;Haijun Yu;Manuela Goldmann;Linda-Sophie Schneider;Mingxuan Gu;Siyuan Mei;Lukas Folle;Alexander Preuhs;Michael Manhart;Andreas Maier

Cone-beam computed tomography (CBCT) systems, with their flexibility, present a promising avenue for direct point-of-care medical imaging, particularly in critical scenarios such as acute stroke assessment. However, the integration of CBCT into clinical workflows faces challenges, primarily linked to long scan durations, which cause patient motion during scanning and degrade image quality in the reconstructed volumes. This paper introduces a novel approach to CBCT motion estimation using a gradient-based optimization algorithm, which leverages generalized derivatives of the backprojection operator for cone-beam CT geometries. Building on that, a fully differentiable target function is formulated which grades the quality of the current motion estimate in reconstruction space. We drastically accelerate motion estimation, yielding a 19-fold speed-up compared to existing methods. Additionally, we investigate the architecture of networks used for quality metric regression and propose predicting voxel-wise quality maps, favoring autoencoder-like architectures over contracting ones. This modification improves gradient flow, leading to more accurate motion estimation. The presented method is evaluated through realistic experiments on head anatomy. It reduces the reprojection error from an initial average of 3 mm to 0.61 mm after motion compensation and consistently demonstrates superior performance compared to existing approaches. The analytic Jacobian for the backprojection operation, which is at the core of the proposed method, is made publicly available. In summary, this paper contributes to the advancement of CBCT integration into clinical workflows by proposing a robust motion estimation approach that enhances efficiency and accuracy, addressing critical challenges in time-sensitive scenarios.

Volume 44, Issue 2, pp. 1098-1109

Citations: 0

AttriPrompter: Auto-Prompting With Attribute Semantics for Zero-Shot Nuclei Detection via Visual-Language Pre-Trained Models

IEEE transactions on medical imaging

Pub Date : 2024-10-03 DOI: 10.1109/TMI.2024.3473745

Yongjian Wu;Yang Zhou;Jiya Saiyin;Bingzheng Wei;Maode Lai;Jianzhong Shou;Yan Xu

Large-scale visual-language pre-trained models (VLPMs) have demonstrated exceptional performance in downstream object detection through text prompts for natural scenes. However, their application to zero-shot nuclei detection on histopathology images remains relatively unexplored, mainly due to the significant gap between the characteristics of medical images and the web-originated text-image pairs used for pre-training. This paper aims to investigate the potential of the object-level VLPM, Grounded Language-Image Pre-training (GLIP), for zero-shot nuclei detection. Specifically, we propose an innovative auto-prompting pipeline, named AttriPrompter, comprising attribute generation, attribute augmentation, and relevance sorting, to avoid subjective manual prompt design. AttriPrompter utilizes VLPMs’ text-to-image alignment to create semantically rich text prompts, which are then fed into GLIP for initial zero-shot nuclei detection. Additionally, we propose a self-trained knowledge distillation framework, where GLIP serves as the teacher with its initial predictions used as pseudo labels, to address the challenges posed by high nuclei density, including missed detections, false positives, and overlapping instances. Our method exhibits remarkable performance in label-free nuclei detection, outperforming all existing unsupervised methods and demonstrating excellent generality. Notably, this work highlights the astonishing potential of VLPMs pre-trained on natural image-text pairs for downstream tasks in the medical field as well. Code will be released at github.com/AttriPrompter.

Volume 44, Issue 2, pp. 982-993

Citations: 0

Multi-Organ Foundation Model for Universal Ultrasound Image Segmentation With Task Prompt and Anatomical Prior

IEEE transactions on medical imaging

Pub Date : 2024-10-03 DOI: 10.1109/TMI.2024.3472672

Haobo Chen;Yehua Cai;Changyan Wang;Lin Chen;Bo Zhang;Hong Han;Yuqing Guo;Hong Ding;Qi Zhang

Semantic segmentation of ultrasound (US) images with deep learning has played a crucial role in computer-aided disease screening, diagnosis and prognosis. However, due to the scarcity of US images and small field of view, resulting segmentation models are tailored for a specific single organ and may lack robustness, overlooking correlations among anatomical structures of multiple organs. To address these challenges, we propose the Multi-Organ FOundation (MOFO) model for universal US image segmentation. The MOFO is optimized jointly from multiple organs across various anatomical regions to overcome the data scarcity and explore correlations between multiple organs. The MOFO extracts organ-invariant representations from US images. Simultaneously, the task prompt is employed to refine organ-specific representations for segmentation predictions. Moreover, the anatomical prior is incorporated to enhance the consistency of the anatomical structures. A multi-organ US database with segmentation labels, comprising 7039 images from 10 organs across various regions of the human body, has been established to develop and evaluate our model. Results demonstrate that the MOFO outperforms single-organ methods in terms of the Dice coefficient, 95% Hausdorff distance and average symmetric surface distance with statistically sufficient margins. Our experiments in multi-organ universal segmentation for US images serve as a pioneering exploration of improving segmentation performance by leveraging semantic and anatomical relationships within US images of multiple organs.
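Of the three metrics named in this abstract, the Dice coefficient is the simplest to state: twice the overlap between prediction and ground truth, divided by the sum of their sizes. The sketch below is a generic illustration, not the paper's evaluation code. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |P ∩ T| / (|P| + |T|) for two flat binary masks.

    pred, target: equal-length sequences of 0/1 labels (flattened masks).
    Returns 1.0 when both masks are empty (a convention assumed here).
    """
    pred = list(pred)
    target = list(target)
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    overlap = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 1.0 if total == 0 else 2.0 * overlap / total
```

For example, a prediction `[1, 1, 0, 0]` against a ground truth `[1, 0, 0, 0]` shares one foreground pixel out of three in total, giving a Dice score of 2/3.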

Volume 44, Issue 2, pp. 1005-1018

Citations: 0

SPIRiT-Diffusion: Self-Consistency Driven Diffusion Model for Accelerated MRI

IEEE transactions on medical imaging

Pub Date : 2024-10-03 DOI: 10.1109/TMI.2024.3473009

Zhuo-Xu Cui;Chentao Cao;Yue Wang;Sen Jia;Jing Cheng;Xin Liu;Hairong Zheng;Dong Liang;Yanjie Zhu

Diffusion models have emerged as a leading methodology for image generation and have proven successful in the realm of magnetic resonance imaging (MRI) reconstruction. However, existing reconstruction methods based on diffusion models are primarily formulated in the image domain, making the reconstruction quality susceptible to inaccuracies in coil sensitivity maps (CSMs). k-space interpolation methods can effectively address this issue, but conventional diffusion models are not readily applicable to k-space interpolation. To overcome this challenge, we introduce a novel approach called SPIRiT-Diffusion, which is a diffusion model for k-space interpolation inspired by the iterative self-consistent SPIRiT method. Specifically, we utilize the iterative solver of the self-consistent term (i.e., k-space physical prior) in SPIRiT to formulate a novel stochastic differential equation (SDE) governing the diffusion process. Subsequently, k-space data can be interpolated by executing the diffusion process. This innovative approach highlights the optimization model’s role in designing the SDE in diffusion models, enabling the diffusion process to align closely with the physics inherent in the optimization model, a concept referred to as model-driven diffusion. We evaluated the proposed SPIRiT-Diffusion method using a 3D joint intracranial and carotid vessel wall imaging dataset. The results convincingly demonstrate its superiority over image-domain reconstruction methods, achieving high reconstruction quality even at a substantial acceleration rate of 10. Our code is available at https://github.com/zhyjSIAT/SPIRiT-Diffusion.
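For orientation, the self-consistent term mentioned in this abstract can be written in the notation commonly used for the original SPIRiT method. The symbols below are assumptions taken from that standard formulation, not from this paper, and the block does not reproduce the paper's SDE.

```latex
% x : full multi-coil Cartesian k-space (unknown)
% y : acquired (undersampled) k-space samples
% D : sampling operator that selects the acquired locations
% G : SPIRiT interpolation kernel operator, calibrated from the
%     auto-calibration region
% \lambda : weight balancing the two terms (assumed symbol)
\begin{align}
  x &= G x && \text{(self-consistency)} \\
  y &= D x && \text{(data consistency)} \\
  \hat{x} &= \arg\min_{x} \; \|(G - I)\,x\|_2^2 + \lambda \,\|D x - y\|_2^2
\end{align}
```

The first term penalizes k-space data that cannot be predicted from its own neighbourhood across coils, which is why no explicit coil sensitivity maps are required.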

Volume 44, Issue 2, pp. 1019-1031

Citations: 0

Three-Dimensional Variable Slab-Selective Projection Acquisition Imaging

IEEE transactions on medical imaging

Pub Date : 2024-09-30 DOI: 10.1109/TMI.2024.3460974

Jinil Park;Taehoon Shin;Jang-Yeon Park

Three-dimensional (3D) projection acquisition (PA) imaging has recently gained attention because of its advantages, such as very short achievable echo times, lower sensitivity to motion, and undersampled acquisition of projections without sacrificing spatial resolution. However, larger subjects impose a stricter Nyquist criterion and are more likely to be affected by outer-volume signals outside the field of view (FOV), which significantly degrades the image quality. Here, we propose a variable slab-selective projection acquisition (VSS-PA) method to relax the Nyquist criterion and effectively suppress aliasing streak artifacts in 3D PA imaging. The proposed method involves maintaining the vertical orientation of the slab-selective gradient for frequency-selective spin excitation and the readout gradient for data acquisition. As VSS-PA can selectively excite spins only in the width of the desired FOV in the projection direction during data acquisition, the effective size of the scanned object that determines the Nyquist criterion can be reduced. Additionally, unwanted signals originating from outside the FOV (e.g., aliasing streak artifacts) can be effectively avoided. The relaxation of the Nyquist criterion owing to VSS-PA was theoretically described and confirmed through numerical simulations and phantom and human lung experiments. These experiments further showed that the aliasing streak artifacts were almost entirely suppressed.

Volume 44, Issue 2, pp. 728-737

Citations: 0

Learning With Explicit Shape Priors for Medical Image Segmentation

IEEE transactions on medical imaging

Pub Date : 2024-09-27 DOI: 10.1109/TMI.2024.3469214

Xin You;Junjun He;Jie Yang;Yun Gu

Medical image segmentation is a fundamental task for medical image analysis and surgical planning. In recent years, UNet-based networks have prevailed in the field of medical image segmentation. However, convolutional neural networks (CNNs) suffer from limited receptive fields and thus fail to model the long-range dependencies of organs or tumors. Besides, these models are heavily dependent on the training of the final segmentation head. Moreover, existing methods cannot address both of these limitations simultaneously. Hence, in our work, we propose a novel shape prior module (SPM), which explicitly introduces shape priors to improve the segmentation performance of UNet-based models. The explicit shape priors consist of global and local shape priors. The former, with coarse shape representations, provides networks with the capability to model global contexts. The latter, with finer shape information, serves as additional guidance to relieve the heavy dependence on the learnable prototype in the segmentation head. To evaluate the effectiveness of SPM, we conduct experiments on three challenging public datasets. Our proposed model achieves state-of-the-art performance. Furthermore, SPM can serve as a plug-and-play module for classic CNN and Transformer-based backbones, facilitating the segmentation task on different datasets. Source codes are available at https://github.com/AlexYouXin/Explicit-Shape-Priors.

Volume 44, Issue 2, pp. 927-940

Citations: 0

Integrating Eye Tracking With Grouped Fusion Networks for Semantic Segmentation on Mammogram Images

IEEE transactions on medical imaging

Pub Date : 2024-09-27 DOI: 10.1109/TMI.2024.3468404

Jiaming Xie;Qing Zhang;Zhiming Cui;Chong Ma;Yan Zhou;Wenping Wang;Dinggang Shen

Medical image segmentation has seen great progress in recent years, largely due to the development of deep neural networks. However, unlike in computer vision, high-quality clinical data is relatively scarce, and the annotation process is often a burden for clinicians. As a result, the scarcity of medical data limits the performance of existing medical image segmentation models. In this paper, we propose a novel framework that integrates eye tracking information from experienced radiologists during the screening process to improve the performance of deep neural networks with limited data. Our approach, a grouped hierarchical network, guides the network to learn from its faults by using gaze information as weak supervision. We demonstrate the effectiveness of our framework on mammogram images, particularly for handling segmentation classes with large scale differences. We evaluate the impact of gaze information on medical image segmentation tasks and show that our method achieves better segmentation performance compared to state-of-the-art models. A robustness study is conducted to investigate the influence of distraction or inaccuracies in gaze collection. We also develop a convenient system for collecting gaze data without interrupting the normal clinical workflow. Our work offers novel insights into the potential benefits of integrating gaze information into medical image segmentation tasks.
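The abstract does not specify how gaze enters the training objective. As a rough illustration of gaze used as weak supervision, the sketch below assumes one simple possibility: a per-pixel binary cross-entropy in which pixels the radiologist looked at receive a larger weight. The function name `gaze_weighted_bce`, the weighting rule `1 + alpha * gaze`, and the parameter `alpha` are all hypothetical and are not taken from the paper.

```python
import math


def gaze_weighted_bce(probs, labels, gaze, alpha=1.0, eps=1e-7):
    """Weighted binary cross-entropy over a flat list of pixels.

    probs  : predicted foreground probabilities in [0, 1]
    labels : ground-truth labels (0 or 1)
    gaze   : gaze heatmap values in [0, 1] (hypothetical weak-supervision signal)
    alpha  : how strongly gazed-at pixels are up-weighted (assumed rule)
    """
    total = 0.0
    weight_sum = 0.0
    for p, y, g in zip(probs, labels, gaze):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        w = 1.0 + alpha * g               # assumed weighting: more gaze, more weight
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum             # weighted mean of per-pixel losses


if __name__ == "__main__":
    probs = [0.9, 0.5]
    labels = [1, 1]
    print(gaze_weighted_bce(probs, labels, gaze=[0.0, 0.0]))  # no gaze signal
    print(gaze_weighted_bce(probs, labels, gaze=[0.0, 1.0]))  # gaze on the harder pixel
```

With a zero heatmap the function reduces to the ordinary mean cross-entropy, so the gaze term only redistributes emphasis across pixels rather than changing the loss scale.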

Volume 44, Issue 2, Pages 868-879

Citations: 0

DC²T: Disentanglement-Guided Consolidation and Consistency Training for Semi-Supervised Cross-Site Continual Segmentation

IEEE transactions on medical imaging

Pub Date : 2024-09-27 DOI: 10.1109/TMI.2024.3469528

Jingyang Zhang;Jialun Pei;Dunyuan Xu;Yueming Jin;Pheng-Ann Heng

Continual Learning (CL) is recognized as a storage-efficient and privacy-protecting approach for learning from sequentially arriving medical sites. However, most existing CL methods assume that each site is fully labeled, which is impractical given budget and expertise constraints. This paper studies Semi-Supervised Continual Learning (SSCL), in which partially labeled sites arrive over time, each delivering only limited labeled data while the majority remains unlabeled. In this setting, it is challenging to use unlabeled data effectively under dynamic cross-site domain gaps, which leads to intractable model forgetting on such unlabeled data. To address this problem, we introduce a novel Disentanglement-guided Consolidation and Consistency Training (DC²T) framework, which is rooted in an Online Semi-Supervised representation Disentanglement (OSSD) perspective and excavates content representations of partially labeled data from sites arriving over time. Moreover, these content representations are consolidated for site-invariance and calibrated for style-robustness, in order to alleviate forgetting even in the absence of ground truth. Specifically, for invariance on previous sites, we retain historical content representations when learning on a new site, via a Content-inspired Parameter Consolidation (CPC) method that prevents altering the model parameters crucial for content preservation. For robustness against style variation, we develop a Style-induced Consistency Training (SCT) scheme that enforces segmentation consistency over style-related perturbations to recalibrate content encoding. We extensively evaluate our method on fundus and cardiac image segmentation, and the results indicate its advantage over existing SSCL methods in alleviating forgetting on unlabeled data.
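The abstract names CPC and SCT but does not give their formulas. The sketch below is therefore not the authors' method. It substitutes two standard stand-ins to show the general shape of such objectives: a quadratic, importance-weighted penalty in the spirit of elastic weight consolidation for the "do not alter crucial parameters" idea, and a mean-squared difference between predictions on an image and on a style-perturbed copy for the consistency idea. The names `consolidation_penalty`, `consistency_loss`, `importance`, and `strength` are hypothetical.

```python
def consolidation_penalty(params, old_params, importance, strength=1.0):
    """EWC-style stand-in: penalize moving parameters that were important before.

    params     : current parameter values
    old_params : parameter values saved after the previous site
    importance : non-negative per-parameter importance scores (assumed given)
    """
    return 0.5 * strength * sum(
        w * (p - q) ** 2 for p, q, w in zip(params, old_params, importance)
    )


def consistency_loss(pred_original, pred_perturbed):
    """Mean squared difference between two prediction vectors.

    Needs no labels, so it can be computed on unlabeled data.
    """
    n = len(pred_original)
    return sum((a - b) ** 2 for a, b in zip(pred_original, pred_perturbed)) / n


if __name__ == "__main__":
    # The second parameter moved by 2.0 but has low importance (0.5).
    print(consolidation_penalty([1.0, 2.0], [1.0, 0.0], [5.0, 0.5]))
    # Predictions on an image and on its style-perturbed copy.
    print(consistency_loss([0.0, 1.0], [1.0, 1.0]))
```

Both terms are label-free, which is the property that matters in the semi-supervised setting described above: they can be added to a supervised loss on the few labeled samples and still act on the unlabeled majority.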

Volume 44, Issue 2, Pages 903-914

Citations: 0

Leveraging Input-Level Feature Deformation With Guided-Attention for Sulcal Labeling

IEEE transactions on medical imaging

Pub Date : 2024-09-26 DOI: 10.1109/TMI.2024.3468727

Seungeun Lee;Seunghwan Lee;Ethan H. Willbrand;Benjamin J. Parker;Silvia A. Bunge;Kevin S. Weiner;Ilwoo Lyu

The identification of cortical sulci is key for understanding functional and structural development of the cortex. While large, consistent sulci (or primary/secondary sulci) receive significant attention in most studies, the exploration of smaller and more variable sulci (or putative tertiary sulci) remains relatively under-investigated. Despite its importance, automatic labeling of cortical sulci is challenging due to (1) the presence of substantial anatomical variability, (2) the relatively small size of the regions of interest (ROIs) compared to unlabeled regions, and (3) the scarcity of annotated labels. In this paper, we propose a novel end-to-end learning framework using a spherical convolutional neural network (CNN). Specifically, the proposed method learns to effectively warp geometric features in a direction that facilitates the labeling of sulci while mitigating the impact of anatomical variability. Moreover, we introduce a guided-attention mechanism that takes into account the extent of deformation induced by the learned warping. This extracts discriminative features that emphasize sulcal ROIs, while suppressing irrelevant information of unlabeled regions. In the experiments, we evaluate the proposed method on 8 sulci of the posterior medial cortex. Our method outperforms existing methods particularly in the putative tertiary sulci. The code is publicly available at https://github.com/Shape-Lab/DSPHARM-Net.
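The abstract says the guided-attention mechanism accounts for the extent of deformation induced by the learned warping, without stating how. As a minimal sketch under stated assumptions, the code below measures per-vertex deformation on the unit sphere as the great-circle angle between a vertex's original and warped positions, rescales those angles to [0, 1], and uses them to reweight per-vertex features. The choice to give larger deformations larger weights, and the names `deformation_attention` and `apply_attention`, are illustrative assumptions and are not taken from the paper or its repository.

```python
import math


def geodesic_angle(u, v):
    """Great-circle angle (radians) between two unit vectors on the sphere."""
    dot = sum(a * b for a, b in zip(u, v))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding outside [-1, 1]
    return math.acos(dot)


def deformation_attention(original, warped):
    """Per-vertex weights in [0, 1], proportional to deformation magnitude.

    Returns uniform weights of 1.0 when no vertex moved (assumed fallback).
    """
    angles = [geodesic_angle(u, v) for u, v in zip(original, warped)]
    peak = max(angles)
    if peak == 0.0:
        return [1.0] * len(angles)
    return [a / peak for a in angles]


def apply_attention(features, attention):
    """Scale each vertex's feature vector by its attention weight."""
    return [[a * f for f in row] for row, a in zip(features, attention)]


if __name__ == "__main__":
    original = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    warped = [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]   # only the second vertex moved
    attention = deformation_attention(original, warped)
    print(attention)
    print(apply_attention([[2.0, 4.0], [1.0, 3.0]], attention))
```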

Volume 44, Issue 2, Pages 915-926

Citations: 0