IEEE Transactions on Medical Imaging: Latest Articles

PAL: Boosting Skin Lesion Segmentation via Probabilistic Attribute Learning
Pub Date : 2025-07-11 DOI: 10.1109/TMI.2025.3588167
Yuchen Yuan;Xi Wang;Jinpeng Li;Guangyong Chen;Pheng-Ann Heng
Skin lesion segmentation is vital for the early detection, diagnosis, and treatment of melanoma, yet it remains challenging due to significant variations in lesion attributes (e.g., color, size, shape), ambiguous boundaries, and noise interference. Recent advances have focused on capturing contextual information and incorporating boundary priors to handle challenging lesions. However, there has been limited exploration of the explicit analysis of the inherent patterns of skin lesions, a crucial aspect of the knowledge-driven decision-making process used by clinical experts. In this work, we introduce a novel approach called Probabilistic Attribute Learning (PAL), which leverages knowledge of lesion patterns to improve performance on challenging lesions. Recognizing that the lesion patterns exhibited in each image can be depicted by disentangled attributes, we begin by explicitly modeling each attribute as a distinct Gaussian distribution, whose mean and variance indicate the most likely pattern of that attribute and its variation. Using Monte Carlo sampling, we iteratively draw multiple samples from these distributions to capture various potential patterns for each attribute. These samples are then merged through an attribute fusion technique, yielding diverse representations that comprehensively depict the lesion class. By performing pixel-class proximity matching between each pixel-wise representation and the diverse class-wise representations, we significantly enhance the model’s robustness. Extensive experiments on two public skin lesion datasets and one unified polyp lesion dataset demonstrate the effectiveness and strong generalization ability of our method. Code is available at https://github.com/IsYuchenYuan/PAL
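The pipeline described in the abstract (Gaussian attribute distributions, Monte Carlo sampling, attribute fusion, pixel-class proximity matching) can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the averaging used for fusion, and the cosine similarity used for matching are all assumptions made for illustration, and the means and standard deviations would be learned by a network rather than passed in by hand.

```python
import math
import random


def sample_attribute(mean, std, rng):
    """Draw one Monte Carlo sample from a diagonal Gaussian attribute."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def fuse(attribute_samples):
    """Hypothetical fusion: element-wise average of the attribute samples."""
    dim = len(attribute_samples[0])
    count = len(attribute_samples)
    return [sum(a[d] for a in attribute_samples) / count for d in range(dim)]


def class_prototypes(attributes, num_samples, seed=0):
    """Build several class representations, one per Monte Carlo draw.

    `attributes` is a list of (mean, std) pairs, one pair per attribute.
    """
    rng = random.Random(seed)
    prototypes = []
    for _ in range(num_samples):
        draws = [sample_attribute(mean, std, rng) for mean, std in attributes]
        prototypes.append(fuse(draws))
    return prototypes


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def match_pixel(pixel, prototypes_by_class):
    """Assign the class whose closest prototype is most similar to the pixel."""
    scores = {
        name: max(cosine(pixel, p) for p in protos)
        for name, protos in prototypes_by_class.items()
    }
    return max(scores, key=scores.get)
```

Keeping several sampled prototypes per class, rather than a single mean vector, is what lets a pixel match any plausible variant of the lesion pattern.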
IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5183-5196. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11078393
Citations: 0
Echocardiography Video Segmentation via Neighborhood Correlation Mining
Pub Date : 2025-07-11 DOI: 10.1109/TMI.2025.3588157
Xiaolong Deng;Huisi Wu
Accurate segmentation of the left ventricle in echocardiography is critical for diagnosing and treating cardiovascular diseases. However, accurate segmentation remains challenging due to the limitations of ultrasound imaging. Although numerous image and video segmentation methods have been proposed, existing methods still fail to solve this task effectively because annotations are sparse. To address this problem, we propose a novel semi-supervised segmentation framework for echocardiography named NCM-Net. We first propose the neighborhood correlation mining (NCM) module, which mines the correlations between query features and their spatiotemporal neighborhoods to resist the influence of noise. The module also captures cross-scale contextual correlations between pixels spatially to further refine features, thus alleviating the impact of noise on echocardiography segmentation. To further improve segmentation accuracy, we propose unreliable-pixels masked attention (UMA). By masking reliable pixels, it pays extra attention to unreliable pixels to refine the segmentation boundary. Further, we apply cross-frame boundary constraints to the final predictions to optimize their temporal consistency. Through extensive experiments on two publicly available datasets, CAMUS and EchoNet-Dynamic, we demonstrate the effectiveness of the proposed method, which achieves state-of-the-art performance and outstanding temporal consistency. Code is available at https://github.com/dengxl0520/NCMNet
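The idea behind masking reliable pixels so that attention concentrates on unreliable ones can be sketched as follows. This is a minimal illustration and not the UMA module itself: the confidence rule (`max(p, 1 - p)` below a threshold), the threshold value, and the function names are assumptions; the abstract does not say how reliability is measured.

```python
import math


def unreliable_mask(foreground_probs, threshold=0.8):
    """Flag pixels whose prediction confidence max(p, 1 - p) is below threshold."""
    return [max(p, 1.0 - p) < threshold for p in foreground_probs]


def masked_softmax(scores, keep):
    """Softmax over the kept positions only; masked positions get weight 0."""
    kept = [s for s, k in zip(scores, keep) if k]
    if not kept:
        return [0.0] * len(scores)
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if k else 0.0 for s, k in zip(scores, keep)]
    total = sum(exps)
    return [e / total for e in exps]
```

With this masking, confidently predicted pixels (typically far from the boundary) receive no attention weight, so the refinement step spends its capacity on ambiguous boundary pixels.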
IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5172-5182.
Citations: 0
Flow-Rate-Constrained Physics-Informed Neural Networks for Flow Field Error Correction in 4-D Flow Magnetic Resonance Imaging
Pub Date : 2025-07-10 DOI: 10.1109/TMI.2025.3587636
Jihun Kang;Eui Cheol Jung;Hyun Jung Koo;Dong Hyun Yang;Hojin Ha
In this study, we present enhanced physics-informed neural networks (PINNs) designed to address flow field errors in four-dimensional flow magnetic resonance imaging (4D flow MRI). Flow field errors, which typically occur in high-velocity regions, lead to inaccuracies in velocity fields and to flow rate underestimation. We propose incorporating flow rate constraints to ensure physical consistency across cross-sections. The proposed framework includes optimization strategies to improve convergence, stability, and accuracy. Artificial viscosity modeling, projecting conflicting gradients (PCGrad), and Euclidean norm scaling were applied to balance loss functions during training. The performance was validated using 2D computational fluid dynamics (CFD) with synthetic errors, in-vitro 4D flow MRI mimicking an aortic valve, and in-vivo 4D flow MRI from patients with aortic regurgitation and aortic stenosis. This study demonstrated considerable improvements in correcting flow field errors, denoising, and super-resolution. Notably, the proposed PINNs provided accurate flow rate reconstruction in stenotic and high-velocity regions. This approach extends the applicability of 4D flow MRI by providing reliable hemodynamics in the post-processing stage.
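Two ingredients named in the abstract can be sketched in a few lines. The flow-rate term below assumes an unbranched vessel segment, where conservation of mass for incompressible flow implies the same flow rate through every cross-section; the exact form of the authors' constraint is not given in the abstract, so `flow_rate_loss` is a hypothetical formulation. The projection step follows the published PCGrad rule: when two loss gradients conflict (negative dot product), one is projected onto the normal plane of the other.

```python
def flow_rate(normal_velocities, pixel_area):
    """Flow rate through one cross-section: sum of normal velocity times area."""
    return sum(normal_velocities) * pixel_area


def flow_rate_loss(sections, pixel_area):
    """Hypothetical penalty: variance of the flow rate across cross-sections."""
    rates = [flow_rate(v, pixel_area) for v in sections]
    mean_rate = sum(rates) / len(rates)
    return sum((q - mean_rate) ** 2 for q in rates) / len(rates)


def pcgrad_project(g_i, g_j):
    """Remove from g_i the component that conflicts with g_j (PCGrad rule)."""
    dot = sum(a * b for a, b in zip(g_i, g_j))
    if dot >= 0.0:
        return list(g_i)  # no conflict, gradient is left unchanged
    norm_sq = sum(b * b for b in g_j)
    return [a - (dot / norm_sq) * b for a, b in zip(g_i, g_j)]
```

In a full PINN this penalty would be added to the data and Navier-Stokes residual losses, with the velocities taken from the network output rather than from fixed lists.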
IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5155-5171.
Citations: 0
Coupled Diffusion Models for Metal Artifact Reduction of Clinical Dental CBCT Images
Pub Date : 2025-07-08 DOI: 10.1109/TMI.2025.3587131
Zhouzhuo Zhang;Juncheng Yan;Yuxuan Shi;Zhiming Cui;Jun Xu;Dinggang Shen
Metal dental implants may introduce metal artifacts (MA) during the CBCT imaging process, causing significant interference in subsequent diagnosis. In recent years, many deep learning methods for metal artifact reduction (MAR) have been proposed. Due to the large difference between synthetic and clinical MA, supervised learning MAR methods may perform poorly in clinical settings. Existing unsupervised MAR methods trained on clinical data often suffer from incorrect dental morphology. To alleviate these problems, in this paper, we propose a new MAR method based on Coupled Diffusion Models (CDM) for clinical dental CBCT images. Specifically, we train two separate diffusion models, one on clinical MA-degraded images and one on clinical clean images, to obtain prior information. During the denoising process, the variances of noise levels are calculated from MA images and the prior of the diffusion models. Then we develop a noise transformation module between the two diffusion models to transform the MA noise image into a new initial value for the denoising process. Our designs effectively exploit the inherent transformation between the misaligned MA-degraded images and clean images. Additionally, we introduce an MA-adaptive inference technique to better accommodate the MA degradation in different areas of an MA-degraded image. Experiments on our clinical dataset demonstrate that our CDM outperforms the comparison methods on both objective metrics and visual quality, especially for severe MA degradation. We will publicly release our code.
IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5103-5116.
Citations: 0
Dual-Source CBCT for Large FoV Imaging Under Short-Scan Trajectories
Pub Date : 2025-07-07 DOI: 10.1109/TMI.2025.3586622
Tianling Lyu;Xusheng Zhang;Xinyun Zhong;Zhan Wu;Yan Xi;Wei Zhao;Yang Chen;Yuanjing Feng;Wentao Zhu
Cone-beam CT (CBCT) is extensively used in medical diagnosis and treatment. Despite its large longitudinal field of view (FoV), the horizontal FoV of CBCT systems is severely limited by the detector width. Certain commercial CBCT systems increase the horizontal FoV by employing the offset detector method. However, this method requires a 360° full circular scanning trajectory, which increases the scanning time and is not compatible with specific CBCT system models. In this paper, we investigate the feasibility of large FoV imaging under short-scan trajectories with an additional X-ray source. A dual-source CBCT geometry is proposed, together with two corresponding image reconstruction algorithms. The first is based on cone-parallel rebinning and the second employs a modified Parker weighting scheme. Theoretical calculations demonstrate that the proposed geometry achieves a wider horizontal FoV than the 90% detector offset geometry (radius of 214.83 mm vs. 198.99 mm) with a significantly reduced rotation angle (less than 230° vs. 360°). As demonstrated by experiments, the proposed geometry and reconstruction algorithms achieve imaging quality within the FoV comparable to that of conventional CBCT imaging techniques. Implementing the proposed geometry is straightforward and does not substantially increase development expenses. It has the potential to further expand CBCT applications.
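For readers unfamiliar with Parker weighting, the sketch below implements the standard (unmodified) Parker weights for a single-source fan-beam short scan covering source angles from 0 to pi + 2 * gamma_max. It is background only: the modified scheme used for the dual-source geometry is not described in the abstract, and the function and parameter names here are illustrative.

```python
import math


def parker_weight(beta, gamma, gamma_max):
    """Standard Parker weight for a fan-beam short scan.

    beta: source rotation angle in radians, valid on [0, pi + 2 * gamma_max].
    gamma: ray angle within the fan, in [-gamma_max, gamma_max].
    gamma_max: half fan angle in radians.
    """
    if beta < 0.0 or beta > math.pi + 2.0 * gamma_max:
        return 0.0  # outside the short-scan range
    if beta < 2.0 * (gamma_max - gamma):
        # Ramp-up region: the ray is measured again near the end of the scan.
        return math.sin(math.pi / 4.0 * beta / (gamma_max - gamma)) ** 2
    if beta <= math.pi - 2.0 * gamma:
        return 1.0  # ray measured only once
    # Ramp-down region: the ray was already measured near the start of the scan.
    return math.sin(
        math.pi / 4.0 * (math.pi + 2.0 * gamma_max - beta) / (gamma_max + gamma)
    ) ** 2
```

Under the angle convention used here, the ray (beta, gamma) and its conjugate (beta + pi + 2 * gamma, -gamma) describe the same line, and their weights sum to one, so redundantly measured rays are not counted twice during filtered backprojection.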
IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5051-5064.
Citations: 0
Prompting Vision-Language Model for Nuclei Instance Segmentation and Classification
Pub Date : 2025-06-25 DOI: 10.1109/TMI.2025.3579214
Jieru Yao;Guangyu Guo;Zhaohui Zheng;Qiang Xie;Longfei Han;Dingwen Zhang;Junwei Han
Nuclei instance segmentation and classification is a fundamental and challenging task in whole slide imaging (WSI) analysis. Most dense nuclei prediction studies rely heavily on crowd-labelled data on high-resolution digital images, leading to a time-consuming paradigm that demands expertise. Recently, Vision-Language Models (VLMs), which learn rich cross-modal correlations from large-scale image-text pairs without tedious annotations, have been intensively investigated. Inspired by this, we build a novel framework, called PromptNu, that aims to infuse abundant nuclei knowledge into the training of the nuclei instance recognition model through vision-language contrastive learning and prompt engineering techniques. Specifically, our approach starts with the creation of multifaceted prompts that integrate comprehensive nuclear knowledge, including visual insights from the GPT-4V model, statistical analyses, and expert insights from the pathology field. Then, we propose a novel prompting methodology that consists of two pivotal vision-language contrastive learning components: Prompting Nuclei Representation Learning (PNuRL) and Prompting Nuclei Dense Prediction (PNuDP), which integrate the expertise embedded in pre-trained VLMs and multifaceted prompts into the feature extraction and prediction processes, respectively. Comprehensive experiments on six datasets with extensive WSI scenarios demonstrate the effectiveness of our method for both nuclei instance segmentation and classification tasks. The code is available at https://github.com/NucleiDet/PromptNu
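Vision-language contrastive learning, which both PNuRL and PNuDP are said to build on, is commonly implemented with an InfoNCE-style loss that pulls each image embedding toward its paired text embedding and pushes it away from the others in the batch. The sketch below shows that generic loss in one direction (image to text) only; it is an assumption that PromptNu uses something of this shape, and the temperature value and function names are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(image_embs, text_embs, temperature=0.1):
    """Image-to-text InfoNCE loss; pair i of each list is the positive pair."""
    n = len(image_embs)
    loss = 0.0
    for i in range(n):
        logits = [cosine(image_embs[i], text_embs[j]) / temperature for j in range(n)]
        peak = max(logits)
        # log-sum-exp, computed stably
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        loss += log_norm - logits[i]
    return loss / n
```

The loss is small when each image embedding is closest to its own prompt embedding and grows when the pairing is scrambled, which is the signal that transfers prompt knowledge into the visual features.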
IEEE Transactions on Medical Imaging, vol. 44, no. 11, pp. 4567-4578.
Citations: 0
Enhancing Radiology Report Generation via Multi-Phased Supervision
Pub Date : 2025-06-25 DOI: 10.1109/TMI.2025.3580659
Zailong Chen;Yingshu Li;Zhanyu Wang;Peng Gao;Johan Barthelemy;Luping Zhou;Lei Wang
Radiology report generation using large language models has recently produced reports with more realistic styles and better language fluency. However, their clinical accuracy remains inadequate. Considering the significant imbalance between clinical phrases and general descriptions in a report, we argue that using an entire report for supervision is problematic, as it fails to emphasize the crucial clinical phrases, which require focused learning. To address this issue, we propose a multi-phased supervision method, inspired by the spirit of curriculum learning, in which models are trained on tasks of gradually increasing complexity. Our approach organizes the learning process into structured phases at different levels of semantic granularity, each building on the previous one to enhance the model. During the first phase, disease labels are used to supervise the model, equipping it with the ability to identify underlying diseases. The second phase uses entity-relation triples to guide the model to describe associated clinical findings. Finally, in the third phase, we introduce conventional whole-report-based supervision to quickly adapt the model for report generation. Throughout the phased training, the model remains the same and consistently operates in the generation mode. As experimentally demonstrated, this change in the way of supervision enhances report generation, achieving state-of-the-art performance in both language fluency and clinical accuracy. Our work underscores the importance of training process design in radiology report generation. Our code is available at https://github.com/zailongchen/MultiP-R2Gen
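The three-phase schedule can be pictured as a generation model whose text target changes from phase to phase while the model itself stays fixed. The sketch below is only a schematic of that idea: the epoch counts, the sample dictionary keys, and the way labels and triples are serialized into text are all hypothetical and not taken from the paper.

```python
# Hypothetical schedule: (supervision type, number of epochs) per phase.
PHASES = [
    ("disease_labels", 2),
    ("entity_relation_triples", 2),
    ("full_report", 1),
]


def phase_for_epoch(epoch, phases=PHASES):
    """Return the supervision type that is active at a zero-based epoch."""
    end = 0
    for name, num_epochs in phases:
        end += num_epochs
        if epoch < end:
            return name
    raise ValueError(f"epoch {epoch} is beyond the training schedule")


def build_target(sample, phase):
    """Serialize the phase-specific supervision as text for a generation model."""
    if phase == "disease_labels":
        return "; ".join(sample["labels"])
    if phase == "entity_relation_triples":
        return "; ".join(" ".join(triple) for triple in sample["triples"])
    return sample["report"]
```

Because every target is plain text, the same model can be trained in generation mode throughout, which matches the abstract's statement that the model does not change between phases.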
使用大型语言模型的放射学报告生成最近产生了更逼真的风格和更好的语言流畅性的报告。然而,其临床准确性仍然不足。考虑到报告中临床短语和一般描述之间的显著不平衡,我们认为使用整个报告进行监督是有问题的,因为它没有强调需要集中学习的关键临床短语。为了解决这个问题,我们提出了一种多阶段监督方法,受课程学习精神的启发,通过逐渐增加任务复杂性来训练模型。我们的方法将学习过程组织成不同语义粒度级别的结构化阶段,每个阶段都建立在前一个阶段的基础上,以增强模型。在第一阶段,使用疾病标签来监督模型,使其具有识别潜在疾病的能力。第二阶段使用实体-关系三元组来指导模型描述相关的临床发现。最后,在第三阶段,我们引入传统的基于全报告的监管,以快速适应报告生成模式。在整个分阶段训练过程中,模型保持不变,始终以生成模式运行。正如实验证明的那样,这种提出的监督方式的改变增强了报告的生成,在语言流畅性和临床准确性方面都达到了最先进的表现。我们的工作强调了培训流程设计在放射学报告生成中的重要性。我们的代码可以在https://github.com/zailongchen/MultiP-R2Gen上找到。
IEEE Transactions on Medical Imaging, vol. 44, no. 11, pp. 4666-4677. Code: https://github.com/zailongchen/MultiP-R2Gen
Citations: 0
LMT++: Adaptively Collaborating LLMs With Multi-Specialized Teachers for Continual VQA in Robotic Surgical Videos
Pub Date : 2025-06-20 DOI: 10.1109/TMI.2025.3581108
Yuyang Du;Kexin Chen;Yue Zhan;Chang Han Low;Mobarakol Islam;Ziyu Guo;Yueming Jin;Guangyong Chen;Pheng Ann Heng
Visual question answering (VQA) plays a vital role in advancing surgical education. However, due to privacy concerns over patient data, training a VQA model with previously used data is restricted, making it necessary to use an exemplar-free continual learning (CL) approach. Previous CL studies in the surgical field neglected two critical issues: i) significant domain shifts caused by the wide range of surgical procedures collected from various sources, and ii) the data imbalance problem caused by the unequal occurrence of medical instruments or surgical procedures. This paper addresses these challenges with a multimodal large language model (LLM) and an adaptive weight assignment strategy. First, we developed a novel LLM-assisted multi-teacher CL framework (named LMT++), which harnesses the strength of a multimodal LLM as a supplementary teacher. The LLM’s strong generalization ability, as well as its good understanding of the surgical domain, helps to address the knowledge gap arising from domain shifts and data imbalances. To incorporate the LLM in our CL framework, we further proposed an innovative approach to process the training data, which involves the conversion of complex LLM embeddings into logit values used within our CL training framework. Moreover, we designed an adaptive weight assignment approach that balances the generalization ability of the LLM and the domain expertise of conventional VQA models obtained in previous model training processes within the CL framework. Finally, we created a new surgical VQA dataset for model evaluation. Comprehensive experimental findings on these datasets show that our approach surpasses state-of-the-art CL methods.
IEEE Transactions on Medical Imaging, vol. 44, no. 11, pp. 4678-4689.
Citations: 0
SPHARM-Reg: Unsupervised Cortical Surface Registration Using Spherical Harmonics
Pub Date : 2025-06-20 DOI: 10.1109/TMI.2025.3581605
Seungeun Lee;Seunghwan Lee;Sunghwa Ryu;Ilwoo Lyu
We present a novel learning-based spherical registration method, called SPHARM-Reg, tailored for establishing cortical shape correspondence. SPHARM-Reg aims to reduce warp distortion that can introduce biases in downstream shape analyses. To achieve this, we tackle two critical challenges: (1) joint rigid and non-rigid alignments and (2) rotation-preserving smoothing. Conventional approaches perform rigid alignment only once before a non-rigid alignment. The resulting rotation is potentially sub-optimal, and the subsequent non-rigid alignment may introduce unnecessary distortion. In addition, common velocity encoding schemes on the unit sphere often fail to preserve the rotation component after spatial smoothing of velocity. To address these issues, we propose a diffeomorphic framework that integrates spherical harmonic decomposition of the velocity field with a novel velocity encoding scheme. SPHARM-Reg optimizes harmonic components of the velocity field, enabling joint adjustments for both rigid and non-rigid alignments. Furthermore, the proposed encoding scheme using spherical functions encourages consistent smoothing that preserves the rotation component. In the experiments, we validate SPHARM-Reg on healthy adult datasets. SPHARM-Reg achieves a substantial reduction in warp distortion while maintaining a high level of registration accuracy compared to existing methods. In the clinical analysis, we show that the extent of warp distortion significantly impacts statistical significance.
IEEE Transactions on Medical Imaging, vol. 44, no. 11, pp. 4732-4742.
Citations: 0
PEPC-Net: Progressive Edge Perception and Completion Network for Precise Identification of Safe Resection Margins in Maxillofacial Cysts
Pub Date : 2025-06-19 DOI: 10.1109/TMI.2025.3581200
Nuo Tong;Yuanlin Liu;Yueheng Ding;Tao Wang;Lingnan Hou;Mei Shi;Xiaoyi Hu;Shuiping Gou
Maxillofacial cysts pose significant surgical risks due to their proximity to critical anatomical structures, such as blood vessels and nerves. Precise identification of the safe resection margins is essential for complete lesion removal while minimizing damage to surrounding at-risk tissues, which relies heavily on accurate segmentation in CT images. However, due to the limited space and complex anatomical structures in the maxillofacial region, along with heterogeneous compositions of bone and soft tissues, accurate segmentation is extremely challenging. Thus, a Progressive Edge Perception and Completion Network (PEPC-Net) is presented in this study, which integrates three novel components: 1) Progressive Edge Perception Branch, which progressively fuses semantic features from multiple resolution levels in a dual-stream manner, enabling the model to handle the varying forms of maxillofacial cysts at different stages. 2) Edge Information Completion Module, which captures subtle, differentiated edge features from adjacent layers within the encoding blocks, providing more comprehensive edge information for identifying heterogeneous boundaries. 3) Edge-Aware Skip Connection to adaptively fuse multi-scale edge features, preserving detailed edge information, to facilitate precise identification of the cyst boundaries. Extensive experiments on clinically collected maxillofacial lesion datasets validate the effectiveness of the proposed PEPC-Net, achieving a DSC of 88.71% and an ASD of 0.489 mm. Its generalizability is further assessed using an external validation set, which includes a more diverse range of maxillofacial cyst cases and images of varying quality. These experiments highlight the superior performance of PEPC-Net in delineating the polymorphic edges of heterogeneous lesions, which is critical for deciding safe resection margins.
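For readers unfamiliar with the DSC figure quoted in the abstract above: DSC conventionally denotes the Dice similarity coefficient, 2|A∩B| / (|A| + |B|) for a predicted mask A and a reference mask B. The sketch below is only an illustration of that standard formula on toy binary masks. It is not the authors' evaluation code; the function name `dice_coefficient`, the nested-list mask format, and the choice to score two empty masks as 1.0 are all assumptions made for this example.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks given as nested lists.

    Illustrative only: the name, input format, and empty-mask convention
    are assumptions, not taken from the PEPC-Net paper.
    """
    # Flatten both 2D masks and coerce every entry to a boolean.
    pred_flat = [bool(v) for row in pred for v in row]
    target_flat = [bool(v) for row in target for v in row]
    if len(pred_flat) != len(target_flat):
        raise ValueError("masks must contain the same number of pixels")

    # |A ∩ B|: pixels marked as lesion in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, target_flat))
    # |A| + |B|: total lesion pixels across the two masks.
    total = sum(pred_flat) + sum(target_flat)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy example: each mask has 3 foreground pixels, 2 of which overlap.
pred_mask = [[0, 1, 1],
             [0, 1, 0]]
true_mask = [[0, 1, 0],
             [0, 1, 1]]
print(dice_coefficient(pred_mask, true_mask))  # 2*2 / (3+3), about 0.667
```

A DSC of 88.71% corresponds to a value of 0.8871 on this 0-to-1 scale. The ASD (average surface distance) figure measures boundary error in millimetres and requires surface extraction, so it is not sketched here.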
IEEE Transactions on Medical Imaging, vol. 44, no. 11, pp. 4704-4716.
Citations: 0