
Biomedical Signal Processing and Control: Latest Publications

Elevating MRI reconstruction: A novel enhanced framework integrating deep learning and traditional algorithms for sampling, reconstruction, and training optimization
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109486
Congchao Bian, Ning Cao, Hua Yan, Yue Liu
Deep learning (DL)-based methods have shown great promise in accelerating magnetic resonance imaging (MRI) reconstruction. However, effectively exploiting the inherent phase structure of raw k-space data as prior knowledge remains a critical yet underexplored direction for further enhancing reconstruction quality. To address this challenge, we propose POCS-DLNet, a novel enhancement framework that integrates deep learning with the traditional Projection Onto Convex Sets (POCS) algorithm. The framework enforces phase consistency by incorporating priors derived from the low-frequency components of k-space, guiding both model training and inference. By embedding existing DL-MRI models — trained to map undersampled inputs to fully sampled outputs — into the POCS iteration process, POCS-DLNet enables efficient phase correction and accelerated reconstruction while minimizing the number of trainable parameters. Furthermore, we introduce an asymmetric undersampling mask design strategy, termed CrossMask, which leverages the conjugate symmetry of k-space to improve sampling efficiency and reconstruction fidelity. Extensive experiments on two public datasets demonstrate that POCS-DLNet significantly enhances the reconstruction accuracy of representative DL-MRI models while maintaining low computational overhead. Comprehensive ablation studies further validate the contribution of each proposed component and confirm the robustness and generalization capability of the framework.
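The projection loop at the heart of this framework is straightforward to sketch. Below is a minimal NumPy version of phase-constrained POCS for single-coil Cartesian data, assuming a row undersampling mask with a fully sampled center band; `pocs_phase_recon` and all parameter values are illustrative, and the step where POCS-DLNet would invoke a trained DL-MRI model is only marked by a comment, since the network itself is not reproduced here.

```python
import numpy as np

def pocs_phase_recon(kspace_under, mask, n_iter=10, center_frac=0.08):
    # Phase prior from the fully sampled low-frequency (center) k-space rows.
    ny, nx = kspace_under.shape
    lowpass = np.zeros_like(mask)
    c = int(ny * center_frac / 2)
    lowpass[ny // 2 - c:ny // 2 + c, :] = 1
    phase = np.angle(np.fft.ifft2(np.fft.ifftshift(kspace_under * lowpass)))

    img = np.fft.ifft2(np.fft.ifftshift(kspace_under))
    for _ in range(n_iter):
        # Phase-consistency projection: keep magnitude, impose the prior phase.
        img = np.abs(img) * np.exp(1j * phase)
        # <-- POCS-DLNet would apply its trained DL-MRI model to the image here.
        # Data-consistency projection: restore the measured k-space samples.
        k = np.fft.fftshift(np.fft.fft2(img))
        k = k * (1 - mask) + kspace_under * mask
        img = np.fft.ifft2(np.fft.ifftshift(k))
    return np.abs(img)

# Toy usage: 4x row undersampling plus a fully sampled 10-row center band.
rng = np.random.default_rng(0)
truth = rng.random((128, 128))
mask = np.zeros((128, 128))
mask[::4, :] = 1
mask[59:69, :] = 1
k_under = np.fft.fftshift(np.fft.fft2(truth)) * mask
recon = pocs_phase_recon(k_under, mask)
```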
Citations: 0
Morphology-enhanced CAM-guided SAM for weakly supervised breast lesion segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109509
Xin Yue, Qing Zhao, Xiaoling Liu, Jianqiang Li, Jing Bai, Changwei Song, Suqin Liu, Rodrigo Moreno, Zhikai Yang, Stefano E. Romero, Gabriel Jimenez, Guanghui Fu
Ultrasound imaging is vital for the early detection of breast cancer, where accurate lesion segmentation supports clinical diagnosis and treatment planning. However, existing deep learning-based methods rely on pixel-level annotations, which are costly and labor-intensive to obtain. This study presents a weakly supervised framework for breast lesion segmentation in ultrasound images. The framework combines morphological enhancement with Class Activation Map (CAM)-guided lesion localization and utilizes the Segment Anything Model (SAM) for refined segmentation without pixel-level labels. By adopting a lightweight region synthesis strategy and relying solely on SAM inference, the proposed approach substantially reduces model complexity and computational cost while maintaining high segmentation accuracy. Experimental results on the BUSI dataset show that our method achieves a Dice coefficient of 0.7063 under five-fold cross-validation and outperforms several fully supervised models in Hausdorff distance metrics. These results demonstrate that the proposed framework effectively balances segmentation accuracy, computational efficiency, and annotation cost, offering a practical and low-complexity solution for breast ultrasound analysis. The code for this study is available at: https://github.com/YueXin18/MorSeg-CAM-SAM-Segmentation.
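The CAM-to-prompt step can be illustrated compactly. The sketch below, under assumed thresholds and structuring elements, normalizes a CAM, cleans it with morphological opening and closing, keeps the largest connected component, and returns its bounding box in the (x0, y0, x1, y1) form that SAM-style predictors accept; `cam_to_box_prompt` and the commented `sam_predictor` call are illustrative names, not the paper's API.

```python
import numpy as np
from scipy import ndimage

def cam_to_box_prompt(cam, threshold=0.4, min_area=50):
    cam = (cam - cam.min()) / (np.ptp(cam) + 1e-8)   # normalize to [0, 1]
    binary = cam > threshold
    # Morphological enhancement: remove speckle, close small holes.
    binary = ndimage.binary_opening(binary, structure=np.ones((3, 3)))
    binary = ndimage.binary_closing(binary, structure=np.ones((5, 5)))
    labels, n = ndimage.label(binary)
    if n == 0:
        return None
    sizes = ndimage.sum(binary, labels, index=range(1, n + 1))
    largest = int(np.argmax(sizes)) + 1
    if sizes[largest - 1] < min_area:
        return None
    ys, xs = np.where(labels == largest)
    return np.array([xs.min(), ys.min(), xs.max(), ys.max()])

# box = cam_to_box_prompt(cam)             # cam: HxW activation map
# masks = sam_predictor.predict(box=box)   # hypothetical SAM predictor call
```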
Citations: 0
BT-MCT: Brain tumor regions segmentation using multi-class token transformers for weakly supervised semantic segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109498
JiaHuan Lin, Juan Chen, Lei Guo, Zengnan Wang
Brain tumor segmentation is pivotal in medical diagnostics, enabling precise determination of tumor size, shape, and location for accurate and timely interventions. However, the annotation process often demands specialized expertise, making it time-intensive, labor-intensive, and costly. Existing weakly supervised methods for brain tumor segmentation primarily focus on whole tumor (WT) segmentation, which typically treats the problem as binary (tumor present/absent), thereby overlooking the pathological significance of individual sub-regions: enhancing tumors (ET) often indicate active malignancy, peritumoral edema (ED) reflects tumor invasion, and necrotic/non-enhancing tumor cores (NET) correlate with progression or therapeutic response. To address this limitation, we propose a weakly supervised multi-region segmentation network, termed Brain Tumor Region Segmentation Network Using Multi-Class Token Transformers, which explicitly treats ET, ED, and NET as distinct labels, i.e., a multilabel supervision setting. The network employs class tokens to capture region-specific localization and generates accurate localization maps for each sub-region. A patch-to-patch transformer is then utilized to compute patch-level pairwise affinities, serving as pseudo-labels. Supervised by a lightweight MLP decoder tailored to the Multi-Class Token Transformer Encoder, the network produces precise predictions. Additionally, a Boundary-Constrained Transformer (BCT) module enhances transformer block guidance, refining pseudo-label generation. Comprehensive experiments on public datasets (BraTS2018, BraTS2019, BraTS2020) demonstrate the proposed method’s superior performance compared to state-of-the-art weakly supervised multi-region segmentation approaches, validating its effectiveness and potential in clinical applications. This study highlights that multi-label supervision, unlike traditional binary WSSS, enables more precise and clinically meaningful segmentation of tumor sub-regions, addressing critical gaps in previous weakly supervised approaches.
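A minimal version of the multi-class-token idea fits in a few lines of PyTorch: one learnable token per sub-region rides through a standard transformer encoder alongside the patch tokens, and token-to-patch similarity afterwards yields one localization map per class. Dimensions, depth, and the linear image-level head are illustrative stand-ins, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MultiClassTokenEncoder(nn.Module):
    def __init__(self, dim=128, n_classes=3, depth=4, heads=4):
        super().__init__()
        # One learnable token per tumor sub-region (ET, ED, NET).
        self.class_tokens = nn.Parameter(torch.randn(1, n_classes, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, 1)        # image-level logit per class token
        self.n_classes = n_classes

    def forward(self, patches):              # patches: (B, N, dim)
        b = patches.size(0)
        tokens = self.class_tokens.expand(b, -1, -1)
        x = self.encoder(torch.cat([tokens, patches], dim=1))
        cls, pat = x[:, :self.n_classes], x[:, self.n_classes:]
        # Class-token-to-patch affinity -> one localization map per class.
        maps = torch.softmax(cls @ pat.transpose(1, 2) / pat.size(-1) ** 0.5, dim=-1)
        logits = self.head(cls).squeeze(-1)   # (B, n_classes) multilabel logits
        return maps, logits

model = MultiClassTokenEncoder()
maps, logits = model(torch.randn(2, 64, 128))   # 64 patch tokens of dim 128
print(maps.shape, logits.shape)                 # (2, 3, 64) and (2, 3)
```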
Citations: 0
Construction of a multi-scale feature fusion algorithm for precise lung nodule segmentation
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109568
Weitao Chen, Yuntian Zhao, Lu Gao, ZhaoLi Yao, Shenglan Qin, Liangquan Jia, Chong Yao, Feng Hua
Lung nodules, as one of the key imaging indicators for early lung cancer, play a crucial role in early screening and diagnosis, which are essential for the early prevention and intervention of lung cancer. However, existing deep learning-based lung nodule segmentation methods often fail to perform well when faced with practical issues such as complex background interference and blurry nodule boundaries in CT images. To address these challenges, this paper proposes a novel U-Net model for precise lung nodule segmentation. The model is based on the U-Net architecture, incorporating Transformer structures to optimize the skip connections and overcome the semantic gap in feature transmission. Furthermore, a channel-space dual-domain feature fusion module is designed to enhance the complementary fusion of shallow and deep features during the decoding phase. In the encoding phase, a Haar Wavelet DownSample module is employed to effectively alleviate the information loss problem. To evaluate the performance of the proposed model, this study uses the LUNA16 public dataset. Experimental results show that the proposed segmentation model achieves Dice Similarity Coefficient, Sensitivity, and Accuracy scores of 77.92%, 91.79%, and 83.91%, respectively. The comprehensive performance significantly outperforms current mainstream lung nodule segmentation methods, providing an efficient and reliable new solution for precise lung nodule segmentation and early lung cancer diagnosis software. Our implementation is available at https://github.com/shmookpup/EMR-Unet.
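The Haar Wavelet DownSample module mentioned above has a simple closed form: each 2x2 block is rearranged into four half-resolution sub-bands that are stacked along the channel axis, so downsampling rearranges detail into channels instead of discarding it. A minimal PyTorch sketch follows; sign conventions of the Haar transform vary, and this is one common choice rather than necessarily the paper's.

```python
import torch
import torch.nn as nn

class HaarDownsample(nn.Module):
    def forward(self, x):                    # x: (B, C, H, W), H and W even
        a = x[:, :, 0::2, 0::2]              # top-left of each 2x2 block
        b = x[:, :, 0::2, 1::2]              # top-right
        c = x[:, :, 1::2, 0::2]              # bottom-left
        d = x[:, :, 1::2, 1::2]              # bottom-right
        ll = (a + b + c + d) / 2             # low-low (approximation)
        lh = (-a - b + c + d) / 2            # horizontal detail
        hl = (-a + b - c + d) / 2            # vertical detail
        hh = (a - b - c + d) / 2             # diagonal detail
        return torch.cat([ll, lh, hl, hh], dim=1)   # (B, 4C, H/2, W/2)

x = torch.randn(1, 32, 64, 64)
print(HaarDownsample()(x).shape)             # torch.Size([1, 128, 32, 32])
```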
Citations: 0
Evaluating spatial normalization for SVM-based EEG decoding: A within- and between-subjects perspective
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109535
Yuan Qin, Qi Xu, Tuomo Kujala, Xiaoshuang Wang, Fengyu Cong
Normalization is widely used in electroencephalogram (EEG)-based multivariate pattern classification (MVPC) to reduce magnitude differences across trials and subjects. However, spatial normalization as applied to EEG channel-based brain maps has rarely been investigated in EEG-based decoding tasks such as event-related potential (ERP) experiments. Meanwhile, the effectiveness of spatial normalization across diverse experimental paradigms remains unclear. This study evaluated the impact of spatial normalization on decoding accuracy using the support vector machine (SVM). The analysis included nine experimental paradigms: seven binary ERP paradigms, one four-class facial expression paradigm, and one sixteen-class orientation paradigm. Results showed that spatial normalization significantly improved between-subjects decoding accuracy (Cohen’s d=1.39, p<0.001) but did not enhance within-subjects decoding accuracy. Additionally, the morphological fidelity of the difference wave was preserved after spatial normalization, as evidenced by the high similarity between the normalized and original ERP difference waves across the seven binary paradigms. We validated our findings across diverse experimental paradigms and demonstrated that spatial normalization effectively enhances between-subjects decoding accuracy using SVM while preserving the temporal consistency of ERP, offering a generalizable preprocessing approach for EEG-based cognitive, clinical, and brain–computer interface (BCI) applications.
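Operationally, spatial normalization amounts to z-scoring each trial's scalp map across the channel axis before classification. The sketch below demonstrates the operation with scikit-learn on synthetic data, where random per-trial gains stand in for the magnitude differences the paper targets; the paper's actual feature extraction and cross-validation scheme are not reproduced.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def spatial_normalize(X):
    # Z-score each trial's scalp map (row) across the channel axis.
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, keepdims=True) + 1e-12
    return (X - mu) / sd

# Synthetic demo: a fixed spatial pattern added for class 1, with random
# per-trial gains mimicking magnitude differences across trials/subjects.
rng = np.random.default_rng(0)
n_trials, n_channels = 200, 64
pattern = rng.normal(size=n_channels)
y = rng.integers(0, 2, size=n_trials)
gains = rng.uniform(0.5, 2.0, size=(n_trials, 1))
X = gains * (rng.normal(size=(n_trials, n_channels)) + y[:, None] * pattern)

print(cross_val_score(SVC(kernel="linear"), spatial_normalize(X), y, cv=5).mean())
```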
Citations: 0
SimNet-SFDR: PET motion artifact correction via Sim(3)-Equivariant and Frequency-Based registration
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109569
Hui Zhou, Zhihui Wu, Longxi He, Yangsheng Hu, Zhouyuan Qin, Feng Wang, Jianfeng He
Respiratory motion introduces non-rigid anatomical deformation and signal blurring in PET imaging, leading to bias in lesion quantification and standardized uptake value (SUV) estimation. This study proposes SimNet-SFDR, an unsupervised 3D registration framework that integrates Sim(3)-equivariant encoding with structure-frequency domain regularization for accurate compensation of respiratory motion artifacts. The architecture couples geometric invariance with structure-aware refinement, enabling anatomically consistent and functionally stable deformation estimation.
Comprehensive experiments were conducted on simulated phantoms and multi-center 3D clinical PET/CT datasets, providing large-scale validation across heterogeneous scanners and acquisition protocols. On clinical data, SimNet-SFDR achieved an average SSIM of 0.953 ± 0.021 and CC of 0.958 ± 0.017, yielding improvements of approximately 4% over uncorrected images and 6% over the learning-based baseline VoxelMorph. Compared with the uncorrected images, the normalized mutual information increased by about 8%, whereas the target registration error and 95th percentile Hausdorff distance were reduced by 56% and 22%, respectively, demonstrating markedly improved geometric precision and deformation regularity.
Lesion-level subgroup analyses further demonstrated consistent performance across tumor sizes. The method maintained SUVmean deviations within ± 10% and sub-millimeter geometric error for small lesions (<10 mm), indicating stable quantification and effective suppression of motion- and partial-volume related bias.
These results confirm that SimNet-SFDR provides a robust and anatomically consistent motion-correction framework, offering practical potential for integration into quantitative and motion-aware PET imaging workflows in clinical environments.
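The quantitative ingredients above are easy to express in code. The sketch below computes the Pearson correlation coefficient (CC) reported in the results and a frequency-domain consistency term of the kind a structure-frequency regularizer might use, on toy volumes; it is a hedged illustration of such loss components, not the paper's exact formulation.

```python
import torch

def cc(a, b):
    # Pearson correlation coefficient between two volumes.
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / (a.norm() * b.norm() + 1e-8)

def frequency_consistency(moved, fixed):
    # Compare log-magnitude spectra so the term reflects structural
    # content rather than raw voxel intensity.
    fm = torch.fft.fftn(moved)
    ff = torch.fft.fftn(fixed)
    return torch.mean((torch.log1p(fm.abs()) - torch.log1p(ff.abs())) ** 2)

moved = torch.rand(1, 1, 32, 64, 64)   # warped PET volume (toy data)
fixed = torch.rand(1, 1, 32, 64, 64)   # reference respiratory gate (toy data)
loss = (1 - cc(moved, fixed)) + 0.1 * frequency_consistency(moved, fixed)
print(float(loss))
```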
Citations: 0
Medical image segmentation based on 3D PDC with Swin Transformer
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109516
Lin Fan, Xiaojia Ding, Zhongmin Wang, Hai Wang, Rong Zhang
3D medical image segmentation is vital for disease diagnosis and effective treatment strategies. Despite the advancements in Convolutional Neural Networks (CNN), their fixed receptive fields constrain global context modeling, leading to suboptimal performance, particularly with complex shapes and multi-scale variations. Transformers enhance global modeling capability through the self-attention mechanism. However, challenges persist in capturing fine-grained local features and extracting edge features. To overcome these issues, this study proposes a dual-path encoder consisting of a 3D Pixel Difference Convolution (PDC) branch and a Swin Transformer branch, combined with a CNN decoder. The 3D PDC module extracts local features by calculating the differences between neighboring pixels. To improve the ability to capture edge structures, three 3D PDC variants are proposed: 3D Central Pixel Difference Convolution (CPDC), 3D Angular Pixel Difference Convolution (APDC), and 3D Radial Pixel Difference Convolution (RPDC), which optimize the handling of complex edges, multi-directional edges, and multi-scale structures, respectively. After performance evaluation, the selected CARV combinations (3D CPDC, 3D APDC, 3D RPDC, and 3D ordinary convolution) achieve a compromise between high segmentation accuracy and low computational cost. The Swin Transformer encoder captures global context using a hierarchical shifted-window mechanism, while the CNN decoder fuses features progressively to produce precise pixel-level segmentation. The proposed method exceeds the performance of existing state-of-the-art techniques, as shown by experiments conducted on the BTCV, FLARE21, and AMOS22 datasets.
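Of the three variants, the central pixel difference convolution is the most compact to write down: the usual 3D convolution output is offset by a center-weighted term, so the filter responds to local intensity differences (edges) rather than absolute intensity. A PyTorch sketch follows, with `theta` and all sizes as illustrative values rather than the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CPDC3d(nn.Module):
    # theta interpolates between ordinary conv (0) and pure difference (1).
    def __init__(self, in_ch, out_ch, k=3, theta=0.7):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, k, padding=k // 2, bias=False)
        self.theta = theta

    def forward(self, x):
        out = self.conv(x)
        # An equivalent 1x1x1 kernel holding each filter's total weight;
        # subtracting its response removes the "center intensity" component.
        w_sum = self.conv.weight.sum(dim=(2, 3, 4), keepdim=True)
        center = F.conv3d(x, w_sum)
        return out - self.theta * center

x = torch.randn(1, 8, 16, 32, 32)
print(CPDC3d(8, 16)(x).shape)      # torch.Size([1, 16, 16, 32, 32])
```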
Citations: 0
Electroencephalogram decoding driven by shared semantic information for perception and imagination cognitive processes
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109586
Jinze Tong, Wanzhong Chen
Accurately decoding the semantics of visual perception and imagination Electroencephalogram (EEG) signals is vital for understanding brain functions. It also contributes to the improvement and expansion of brain–computer interfaces. However, existing EEG-based decoding methods often treat perception and imagination independently, ignoring their potential correlations and thus limiting decoding performance. To address this, inspired by the invariance of semantic information across cognitive processes, this paper proposes Shared Semantic Information Driven Network (SSIDNet). This model incorporates three modules: a dual-branch pre-trained Specialized Part for extracting features from EEG data of a single cognitive process; a Kolmogorov–Arnold Network (KAN)-based Public Part designed as a parameter-sharing parallel structure to extract shared semantic information from both perception and imagination EEG signals; and a Capsule Network (ccCapsNet)-based Fusion Part that integrates features and performs classification. Experiments on two public datasets demonstrate that SSIDNet increases accuracy by 12.4 and 15.47 percentage points over the Specialized Part alone, leading to notably better semantic decoding performance. Furthermore, the success of SSIDNet provides supporting algorithmic evidence for the existence of shared data patterns and semantic features between perception and imagination in the brain, and demonstrates the feasibility of leveraging this shared information to enhance EEG-based semantic decoding.
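The key structural trick of the Public Part, one set of weights serving both cognitive processes in parallel, can be shown in a few lines: whatever the shared branch learns must hold for perception and imagination EEG alike, which is the sense in which it captures shared semantic information. In the sketch below an ordinary MLP stands in for the paper's KAN layers (a deliberate simplification), and all feature dimensions are arbitrary.

```python
import torch
import torch.nn as nn

class SharedSemanticEncoder(nn.Module):
    def __init__(self, in_dim=128, hid=256, out_dim=64):
        super().__init__()
        # Single parameter set, applied to both branches in parallel.
        self.shared = nn.Sequential(
            nn.Linear(in_dim, hid), nn.GELU(), nn.Linear(hid, out_dim))

    def forward(self, feat_perc, feat_imag):
        z_p = self.shared(feat_perc)    # identical weights on both inputs
        z_i = self.shared(feat_imag)
        return torch.cat([z_p, z_i], dim=-1)   # passed on to a fusion head

enc = SharedSemanticEncoder()
fused = enc(torch.randn(4, 128), torch.randn(4, 128))
print(fused.shape)                      # torch.Size([4, 128])
```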
Citations: 0
Veress needle guidance in pneumoperitoneum creation using optical coherence tomography and machine learning
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109495
Meng-Chun Kao, Eric Yi-Hsiu Huang, Ting Chang, Wen-Chuan Kuo
Pneumoperitoneum creation is a crucial step in laparoscopic surgery. Blind insertion of a Veress needle into the abdominal wall relies mainly on the surgeon’s experience and fails in more than 3% of clinical cases, with complications including preperitoneal insufflation, gas embolism, and vascular or visceral injury. This study proposes a novel method for creating pneumoperitoneum using fiber-probe optical coherence tomography (OCT) as a real-time imaging guide. Our image analysis process and automatic identification techniques for the peritoneum and its intra- and extraperitoneal tissues reveal that four distinct image features—Mad, root mean square, standard deviation, and coarseness—can effectively describe the various tissue structures encountered during needle puncture. Combining these four features as inputs, various classifiers were employed to differentiate the peritoneum from intra- and extraperitoneal tissues. The Cubic Support Vector Machine (CSVM) classifier achieves an average precision of 98.6% in identifying the peritoneum and its intra- and extraperitoneal tissues. Using intelligent and objective OCT image-guided puncture for real-time recognition of the needle tip, pneumoperitoneum can be established effectively and safely, avoiding failures caused by human judgment error, reliance on the surgeon’s subjective impression, and repeated insertion attempts. Reducing failure can significantly lower medical insurance costs and the ongoing expenses associated with post-operative complications. Adopting these advanced imaging technologies as laparoscopic techniques evolve is crucial for improving surgical precision and patient care.
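The feature-plus-classifier pipeline is easy to prototype. The sketch below computes the four named statistics per OCT window and trains a cubic SVM, i.e. an SVC with a degree-3 polynomial kernel; the coarseness measure is a simple neighborhood-difference proxy of our own choosing (the abstract does not define it), and the data and labels are synthetic.

```python
import numpy as np
from sklearn.svm import SVC

def texture_features(window):
    # window: 2D array of OCT intensities.
    x = window.astype(float).ravel()
    mad = np.mean(np.abs(x - x.mean()))                # mean absolute deviation
    rms = np.sqrt(np.mean(x ** 2))                     # root mean square
    std = x.std()                                      # standard deviation
    coarse = np.mean(np.abs(np.diff(window, axis=0)))  # proxy for coarseness
    return np.array([mad, rms, std, coarse])

# Cubic SVM = polynomial kernel of degree 3, matching the CSVM classifier.
rng = np.random.default_rng(0)
windows = [rng.random((32, 32)) * s for s in rng.uniform(0.5, 2.0, 300)]
X = np.stack([texture_features(w) for w in windows])
y = (X[:, 1] > np.median(X[:, 1])).astype(int)         # toy tissue labels
clf = SVC(kernel="poly", degree=3).fit(X[:200], y[:200])
print(clf.score(X[200:], y[200:]))
```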
Citations: 0
CDCLIP: An interpretable zero-shot medical image classification framework based on concept decomposition
IF 4.9 CAS Tier 2 (Medicine) Q1 ENGINEERING, BIOMEDICAL Pub Date: 2026-01-10 DOI: 10.1016/j.bspc.2026.109585
Zheng Li, Xiangwei Zheng, Dejian Su, Mingzhe Zhang
Contrastive Language-Image Pre-Training (CLIP) has shown superior performances in zero-shot natural image classification. However, its effective application to medical-related tasks remains a challenge. Some existing studies suffer from decreased generalization ability and catastrophic forgetting for utilizing self-collected insufficient datasets to fine-tune CLIP. In this paper, we propose a novel Concept Decomposition-based CLIP framework (CDCLIP) aimed at improving the classification accuracy of previously unseen medical images, obviating the need for retraining. Specifically, CDCLIP exploits external prior knowledge and the multi-level structural approach to disentangle medical diseases into several regular visual concepts. Notably, CDCLIP shifts the analytical focus from the disease category to the correlation degree between the images and the prompts derived from the decomposed concepts, which helps medical attributes be better evaluated. By introducing inference mechanism, the prompts composed of specific attributes serve to infer the final medical diagnosis. Comprehensive experiments are conducted on four datasets (including multiple diseases under endoscopy, CT, X-ray, and retina images) and the results demonstrate that CDCLIP owns better generalization ability. Compared to CLIP, CDCLIP achieves significant average accuracy improvement of intestinal metaplasia identification (+3.64%), lung cancer identification (+22.71%), tuberculosis detection (+11.96%), glaucoma analysis (+10.4%), and breast tumor identification (+49.87%).
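The scoring side of concept decomposition reduces to cosine similarities aggregated per class. The NumPy sketch below uses random vectors in place of CLIP image and text embeddings (assumed inputs, not computed here); the concept names, the class-to-concept mapping, and aggregation by mean are all illustrative choices rather than the paper's exact inference mechanism.

```python
import numpy as np

def zero_shot_scores(img_emb, concept_embs, concept_to_class):
    # Cosine similarity between the image and each decomposed concept prompt,
    # then aggregated per disease class.
    img = img_emb / np.linalg.norm(img_emb)
    sims = {name: float(img @ (emb / np.linalg.norm(emb)))
            for name, emb in concept_embs.items()}
    return {cls: float(np.mean([sims[c] for c in concepts]))
            for cls, concepts in concept_to_class.items()}

rng = np.random.default_rng(0)
concepts = {c: rng.normal(size=512) for c in
            ["irregular margin", "white patch", "smooth mucosa"]}
classes = {"intestinal metaplasia": ["white patch", "irregular margin"],
           "normal": ["smooth mucosa"]}
print(zero_shot_scores(rng.normal(size=512), concepts, classes))
```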
Citations: 0