Elevating MRI reconstruction: A novel enhanced framework integrating deep learning and traditional algorithms for sampling, reconstruction, and training optimization
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109486
Congchao Bian, Ning Cao, Hua Yan, Yue Liu
Deep learning (DL)-based methods have shown great promise in accelerating magnetic resonance imaging (MRI) reconstruction. However, effectively exploiting the inherent phase structure of raw k-space data as prior knowledge remains a critical yet underexplored direction for further enhancing reconstruction quality. To address this challenge, we propose POCS-DLNet, a novel enhancement framework that integrates deep learning with the traditional Projection Onto Convex Sets (POCS) algorithm. The framework enforces phase consistency by incorporating priors derived from the low-frequency components of k-space, guiding both model training and inference. By embedding existing DL-MRI models — trained to map undersampled inputs to fully sampled outputs — into the POCS iteration process, POCS-DLNet enables efficient phase correction and accelerated reconstruction while minimizing the number of trainable parameters. Furthermore, we introduce an asymmetric undersampling mask design strategy, termed CrossMask, which leverages the conjugate symmetry of k-space to improve sampling efficiency and reconstruction fidelity. Extensive experiments on two public datasets demonstrate that POCS-DLNet significantly enhances the reconstruction accuracy of representative DL-MRI models while maintaining low computational overhead. Comprehensive ablation studies further validate the contribution of each proposed component and confirm the robustness and generalization capability of the framework.
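As a concrete illustration of how a trained reconstruction network can sit inside a POCS-style loop that combines a low-frequency phase prior with k-space data consistency, here is a minimal single-coil sketch; the function and parameter names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pocs_dl_recon(kspace_us, mask, dl_model, n_iter=10, center_frac=0.08):
    """Toy single-coil POCS-style loop around a learned reconstructor.

    kspace_us : undersampled k-space (H, W), complex, DC at the center
    mask      : sampling mask (H, W), 1 where k-space was acquired
    dl_model  : callable mapping a magnitude image to a refined magnitude image
    """
    H, W = kspace_us.shape
    # Phase prior from the fully sampled low-frequency (calibration) region.
    lf_mask = np.zeros_like(mask)
    h, w = int(H * center_frac), int(W * center_frac)
    lf_mask[H // 2 - h // 2:H // 2 + h // 2, W // 2 - w // 2:W // 2 + w // 2] = 1
    phase = np.angle(np.fft.ifft2(np.fft.ifftshift(kspace_us * lf_mask)))

    img = np.abs(np.fft.ifft2(np.fft.ifftshift(kspace_us)))   # zero-filled start
    for _ in range(n_iter):
        img = dl_model(img)                                   # learned refinement of the magnitude
        k_est = np.fft.fftshift(np.fft.fft2(img * np.exp(1j * phase)))  # enforce the phase prior
        k_est = kspace_us * mask + k_est * (1 - mask)                   # data consistency projection
        img = np.abs(np.fft.ifft2(np.fft.ifftshift(k_est)))
    return img
```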
{"title":"Elevating MRI reconstruction: A novel enhanced framework integrating deep learning and traditional algorithms for sampling, reconstruction, and training optimization","authors":"Congchao Bian , Ning Cao , Hua Yan , Yue Liu","doi":"10.1016/j.bspc.2026.109486","DOIUrl":"10.1016/j.bspc.2026.109486","url":null,"abstract":"<div><div>Deep learning (DL)-based methods have shown great promise in accelerating magnetic resonance imaging (MRI) reconstruction. However, effectively exploiting the inherent phase structure of raw k-space data as prior knowledge remains a critical yet underexplored direction for further enhancing reconstruction quality. To address this challenge, we propose POCS-DLNet, a novel enhancement framework that integrates deep learning with the traditional Projection Onto Convex Sets (POCS) algorithm. The framework enforces phase consistency by incorporating priors derived from the low-frequency components of k-space, guiding both model training and inference. By embedding existing DL-MRI models — trained to map undersampled inputs to fully sampled outputs — into the POCS iteration process, POCS-DLNet enables efficient phase correction and accelerated reconstruction while minimizing the number of trainable parameters. Furthermore, we introduce an asymmetric undersampling mask design strategy, termed CrossMask, which leverages the conjugate symmetry of k-space to improve sampling efficiency and reconstruction fidelity. Extensive experiments on two public datasets demonstrate that POCS-DLNet significantly enhances the reconstruction accuracy of representative DL-MRI models while maintaining low computational overhead. Comprehensive ablation studies further validate the contribution of each proposed component and confirm the robustness and generalization capability of the framework.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109486"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Morphology-enhanced CAM-guided SAM for weakly supervised breast lesion segmentation
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109509
Xin Yue, Qing Zhao, Xiaoling Liu, Jianqiang Li, Jing Bai, Changwei Song, Suqin Liu, Rodrigo Moreno, Zhikai Yang, Stefano E. Romero, Gabriel Jimenez, Guanghui Fu
Ultrasound imaging is vital for the early detection of breast cancer, where accurate lesion segmentation supports clinical diagnosis and treatment planning. However, existing deep learning-based methods rely on pixel-level annotations, which are costly and labor-intensive to obtain. This study presents a weakly supervised framework for breast lesion segmentation in ultrasound images. The framework combines morphological enhancement with Class Activation Map (CAM)-guided lesion localization and utilizes the Segment Anything Model (SAM) for refined segmentation without pixel-level labels. By adopting a lightweight region synthesis strategy and relying solely on SAM inference, the proposed approach substantially reduces model complexity and computational cost while maintaining high segmentation accuracy. Experimental results on the BUSI dataset show that our method achieves a Dice coefficient of 0.7063 under five-fold cross-validation and outperforms several fully supervised models in Hausdorff distance metrics. These results demonstrate that the proposed framework effectively balances segmentation accuracy, computational efficiency, and annotation cost, offering a practical and low-complexity solution for breast ultrasound analysis. The code for this study is available at: https://github.com/YueXin18/MorSeg-CAM-SAM-Segmentation.
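A minimal sketch of the prompt-generation idea, using the peak of a class activation map as a positive point prompt for SAM; it assumes Meta's segment-anything package and a placeholder checkpoint path, and omits the morphological enhancement and region synthesis steps described above.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def cam_guided_sam(image_rgb, cam, checkpoint="sam_vit_b.pth"):
    """Use the CAM maximum as a positive point prompt for SAM (no pixel labels needed).

    image_rgb : HxWx3 uint8 ultrasound image (converted to RGB)
    cam       : HxW class activation map, higher = more lesion-like
    """
    sam = sam_model_registry["vit_b"](checkpoint=checkpoint)
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)

    y, x = np.unravel_index(np.argmax(cam), cam.shape)   # strongest CAM response
    masks, scores, _ = predictor.predict(
        point_coords=np.array([[x, y]]),                 # SAM expects (x, y) order
        point_labels=np.array([1]),                      # 1 = foreground point
        multimask_output=True,
    )
    return masks[np.argmax(scores)]                      # keep the highest-scoring mask
```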
{"title":"Morphology-enhanced CAM-guided SAM for weakly supervised breast lesion segmentation","authors":"Xin Yue , Qing Zhao , Xiaoling Liu , Jianqiang Li , Jing Bai , Changwei Song , Suqin Liu , Rodrigo Moreno , Zhikai Yang , Stefano E. Romero , Gabriel Jimenez , Guanghui Fu","doi":"10.1016/j.bspc.2026.109509","DOIUrl":"10.1016/j.bspc.2026.109509","url":null,"abstract":"<div><div>Ultrasound imaging is vital for the early detection of breast cancer, where accurate lesion segmentation supports clinical diagnosis and treatment planning. However, existing deep learning-based methods rely on pixel-level annotations, which are costly and labor-intensive to obtain. This study presents a weakly supervised framework for breast lesion segmentation in ultrasound images. The framework combines morphological enhancement with Class Activation Map (CAM)-guided lesion localization and utilizes the Segment Anything Model (SAM) for refined segmentation without pixel-level labels. By adopting a lightweight region synthesis strategy and relying solely on SAM inference, the proposed approach substantially reduces model complexity and computational cost while maintaining high segmentation accuracy. Experimental results on the BUSI dataset show that our method achieves a Dice coefficient of 0.7063 under five-fold cross-validation and outperforms several fully supervised models in Hausdorff distance metrics. These results demonstrate that the proposed framework effectively balances segmentation accuracy, computational efficiency, and annotation cost, offering a practical and low-complexity solution for breast ultrasound analysis. The code for this study is available at: <span><span>https://github.com/YueXin18/MorSeg-CAM-SAM-Segmentation</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109509"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
BT-MCT: Brain tumor regions segmentation using multi-class token transformers for weakly supervised semantic segmentation
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109498
JiaHuan Lin, Juan Chen, Lei Guo, Zengnan Wang
Brain tumor segmentation is pivotal in medical diagnostics, enabling precise determination of tumor size, shape, and location for accurate and timely interventions. However, the annotation process often demands specialized expertise, making it time-intensive, labor-intensive, and costly. Existing weakly supervised methods for brain tumor segmentation primarily focus on whole tumor (WT) segmentation, which typically treats the problem as binary (tumor present/absent), thereby overlooking the pathological significance of individual sub-regions: enhancing tumors (ET) often indicate active malignancy, peritumoral edema (ED) reflects tumor invasion, and necrotic/non-enhancing tumor cores (NET) correlate with progression or therapeutic response. To address this limitation, we propose a weakly supervised multi-region segmentation network, termed Brain Tumor Region Segmentation Network Using Multi-Class Token Transformers, which explicitly treats ET, ED, and NET as distinct labels, i.e., a multi-label supervision setting. The network employs class tokens to capture region-specific localization and generates accurate localization maps for each sub-region. A patch-to-patch transformer is then utilized to compute patch-level pairwise affinities, which serve as pseudo-labels. A lightweight MLP decoder, tailored to the Multi-Class Token Transformer encoder and supervised by these pseudo-labels, produces precise predictions. Additionally, a Boundary-Constrained Transformer (BCT) module enhances transformer block guidance, refining pseudo-label generation. Comprehensive experiments on public datasets (BraTS2018, BraTS2019, BraTS2020) demonstrate the proposed method’s superior performance compared to state-of-the-art weakly supervised multi-region segmentation approaches, validating its effectiveness and potential in clinical applications. This study highlights that multi-label supervision, unlike traditional binary WSSS, enables more precise and clinically meaningful segmentation of tumor sub-regions, addressing critical gaps in previous weakly supervised approaches.
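The following toy sketch shows one common way patch-level pairwise affinities can refine class localization maps into pseudo-labels, by propagating scores through a row-normalized affinity matrix; it is a simplified stand-in, not the paper's exact formulation.

```python
import torch

def refine_with_affinity(cams, affinity, n_steps=2):
    """Propagate class localization maps through patch-level pairwise affinities.

    cams     : (C, N) localization scores for C sub-regions over N patches
    affinity : (N, N) non-negative patch-to-patch affinity (e.g. from attention)
    """
    # Row-normalize so each propagation step is a weighted average over patches.
    trans = affinity / affinity.sum(dim=1, keepdim=True).clamp(min=1e-6)
    refined = cams
    for _ in range(n_steps):
        refined = refined @ trans.T            # spread scores toward strongly affine patches
    pseudo_label = refined.argmax(dim=0)       # (N,) hard pseudo-label per patch
    return refined, pseudo_label
```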
{"title":"BT-MCT: Brain tumor regions segmentation using multi-class token transformers for weakly supervised semantic segmentation","authors":"JiaHuan Lin , Juan Chen , Lei Guo , Zengnan Wang","doi":"10.1016/j.bspc.2026.109498","DOIUrl":"10.1016/j.bspc.2026.109498","url":null,"abstract":"<div><div>Brain tumor segmentation is pivotal in medical diagnostics, enabling precise determination of tumor size, shape, and location for accurate and timely interventions. However, the annotation process often demands specialized expertise, making it time-intensive, labor-intensive, and costly. Existing weakly supervised methods for brain tumor segmentation primarily focus on whole tumor (WT) segmentation, which typically treats the problem as binary (tumor present/absent), thereby overlooking the pathological significance of individual sub-regions: enhancing tumors (ET) often indicate active malignancy, peritumoral edema (ED) reflects tumor invasion, and necrotic/non-enhancing tumor cores (NET) correlate with progression or therapeutic response. To address this limitation, we propose a weakly supervised multi-region segmentation network, termed Brain Tumor Region Segmentation Network Using Multi-Class Token Transformers, which explicitly treats ET, ED, and NET as distinct labels, i.e., a multilabel supervision setting. The network employs class tokens to capture region-specific localization and generates accurate localization maps for each sub-region. A patch-to-patch transformer is then utilized to compute patch-level pairwise affinities, serving as pseudo-labels. Supervised by a lightweight MLP decoder tailored to the Multi-Class Token Transformer Encoder, the network produces precise predictions. Additionally, a Boundary-Constrained Transformer (BCT) module enhances transformer block guidance, refining pseudo-label generation. Comprehensive experiments on public datasets (BraTS2018, BraTS2019, BraTS2020) demonstrate the proposed method’s superior performance compared to state-of-the-art weakly supervised multi-region segmentation approaches, validating its effectiveness and potential in clinical applications. This study highlights that multi-label supervision, unlike traditional binary WSSS, enables more precise and clinically meaningful segmentation of tumor sub-regions, addressing critical gaps in previous weakly supervised approaches.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109498"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Construction of a multi-scale feature fusion algorithm for precise lung nodule segmentation
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109568
Weitao Chen, Yuntian Zhao, Lu Gao, ZhaoLi Yao, Shenglan Qin, Liangquan Jia, Chong Yao, Feng Hua
Lung nodules are a key imaging indicator of early lung cancer, and their detection plays a crucial role in early screening, diagnosis, and intervention. However, existing deep learning-based lung nodule segmentation methods often fail to perform well when faced with practical issues such as complex background interference and blurry nodule boundaries in CT images. To address these challenges, this paper proposes a novel U-Net model for precise lung nodule segmentation. The model is based on the U-Net architecture and incorporates Transformer structures to optimize the skip connections and overcome the semantic gap in feature transmission. Furthermore, a channel-space dual-domain feature fusion module is designed to enhance the complementary fusion of shallow and deep features during the decoding phase. In the encoding phase, a Haar Wavelet DownSample module is employed to effectively alleviate information loss. To evaluate the performance of the proposed model, this study uses the LUNA16 public dataset. Experimental results show that the proposed segmentation model achieves Dice Similarity Coefficient, Sensitivity, and Accuracy scores of 77.92%, 91.79%, and 83.91%, respectively. Its overall performance significantly outperforms current mainstream lung nodule segmentation methods, providing an efficient and reliable solution for precise lung nodule segmentation and early lung cancer diagnosis. Our implementation is available at https://github.com/shmookpup/EMR-Unet.
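For reference, a minimal 2D Haar wavelet downsampling block is sketched below: it replaces pooling or strided convolution with a lossless 2x2 Haar decomposition followed by a 1x1 convolution. This is a generic sketch, and the paper's Haar Wavelet DownSample module may differ in details.

```python
import torch
import torch.nn as nn

class HaarDownsample(nn.Module):
    """Halve spatial resolution with a 2x2 Haar transform instead of pooling/striding.

    Produces 4 sub-bands (LL, LH, HL, HH) per input channel, so no information
    is discarded; a 1x1 conv then mixes them to the desired channel count.
    """
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.mix = nn.Conv2d(4 * in_ch, out_ch, kernel_size=1)

    def forward(self, x):
        a = x[..., 0::2, 0::2]        # top-left of each 2x2 block
        b = x[..., 0::2, 1::2]        # top-right
        c = x[..., 1::2, 0::2]        # bottom-left
        d = x[..., 1::2, 1::2]        # bottom-right
        ll = (a + b + c + d) / 2      # low-frequency average
        lh = (a - b + c - d) / 2      # horizontal detail
        hl = (a + b - c - d) / 2      # vertical detail
        hh = (a - b - c + d) / 2      # diagonal detail
        return self.mix(torch.cat([ll, lh, hl, hh], dim=1))
```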
{"title":"Construction of a multi-scale feature fusion algorithm for precise lung nodule segmentation","authors":"Weitao Chen , Yuntian Zhao , Lu Gao , ZhaoLi Yao , Shenglan Qin , Liangquan Jia , Chong Yao , Feng Hua","doi":"10.1016/j.bspc.2026.109568","DOIUrl":"10.1016/j.bspc.2026.109568","url":null,"abstract":"<div><div>Lung nodules, as one of the key imaging indicators for early lung cancer, play a crucial role in early screening and diagnosis, which are essential for the early prevention and intervention of lung cancer. However, existing deep learning-based lung nodule segmentation methods often fail to perform well when faced with practical issues such as complex background interference and blurry nodule boundaries in CT images. To address these challenges, this paper proposes a novel U-Net model for precise lung nodule segmentation. The model is based on the U-Net architecture, incorporating Transformer structures to optimize the skip connections and overcome the semantic gap in feature transmission. Furthermore, a channel-space dual-domain feature fusion module is designed to enhance the complementary fusion of shallow and deep features during the decoding phase. In the encoding phase, a Haar Wavelet DownSample module is employed to effectively alleviate the information loss problem. To evaluate the performance of the proposed model, this study uses the LUNA16 public dataset. Experimental results show that the proposed segmentation model achieves Dice Similarity Coefficient, Sensitivity, and Accuracy scores of 77.92%, 91.79%, and 83.91%, respectively. The comprehensive performance significantly outperforms current mainstream lung nodule segmentation methods, providing an efficient and reliable new solution for precise lung nodule segmentation and early lung cancer diagnosis software. Our implementation is available at <span><span>https://github.com/shmookpup/EMR-Unet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109568"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Evaluating spatial normalization for SVM-based EEG decoding: A within- and between-subjects perspective
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109535
Yuan Qin, Qi Xu, Tuomo Kujala, Xiaoshuang Wang, Fengyu Cong
Normalization is widely used in electroencephalogram (EEG)-based multivariate pattern classification (MVPC) to reduce magnitude differences across trials and subjects. However, spatial normalization as applied to EEG channel-based brain maps has rarely been investigated in EEG-based decoding tasks such as event-related potential (ERP) experiments. Meanwhile, the effectiveness of spatial normalization across diverse experimental paradigms remains unclear. This study evaluated the impact of spatial normalization on decoding accuracy using the support vector machine (SVM). The analysis included nine experimental paradigms: seven binary ERP paradigms, one four-class facial expression paradigm, and one sixteen-class orientation paradigm. Results showed that spatial normalization significantly improved the between-subjects decoding accuracy (Cohen’s d = 1.39, p < 0.001) but did not enhance the within-subjects decoding accuracy. Additionally, the morphological fidelity of the difference wave was preserved after spatial normalization, as evidenced by the high similarity between the normalized and original ERP difference waves across the seven binary paradigms. We validated our findings across diverse experimental paradigms and demonstrated that spatial normalization effectively enhances between-subjects decoding accuracy using SVM while preserving the temporal consistency of ERPs, offering a generalizable preprocessing approach for EEG-based cognitive, clinical, and brain–computer interface (BCI) applications.
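A minimal sketch of spatial normalization as z-scoring each trial's channel topography at every time point, followed by a linear SVM decoder; this is one common implementation and may not match the paper's exact preprocessing.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def spatial_normalize(X):
    """Z-score each trial's channel topography at every time point.

    X : (n_trials, n_channels, n_times) ERP data
    """
    mean = X.mean(axis=1, keepdims=True)          # average over channels
    std = X.std(axis=1, keepdims=True) + 1e-12    # spread over channels
    return (X - mean) / std

def decode(X, y):
    """Hypothetical usage: flatten normalized trials and run a linear SVM with 5-fold CV."""
    Xn = spatial_normalize(X).reshape(len(X), -1)
    return cross_val_score(SVC(kernel="linear", C=1.0), Xn, y, cv=5).mean()
```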
{"title":"Evaluating spatial normalization for SVM-based EEG decoding: A within- and between-subjects perspective","authors":"Yuan Qin , Qi Xu , Tuomo Kujala , Xiaoshuang Wang , Fengyu Cong","doi":"10.1016/j.bspc.2026.109535","DOIUrl":"10.1016/j.bspc.2026.109535","url":null,"abstract":"<div><div>Normalization is widely used in electroencephalogram (EEG)-based multivariate pattern classification (MVPC) to reduce magnitude differences across trials and subjects. However, the spatial normalization method as applied to EEG channel-based brain maps has been rarely investigated in EEG-based decoding tasks like event-related potential (ERP) experiments. Meanwhile, the effectiveness of spatial normalization across diverse experimental paradigms remains unclear. This study evaluated the impact of spatial normalization on decoding accuracy using the support vector machine (SVM). The analysis included nine experimental paradigms, with seven binary ERP paradigms, one four-class facial expression paradigm, and one sixteen-class orientation paradigm. Results showed that spatial normalization significantly improved the between-subjects decoding accuracy (Cohen’s <span><math><mrow><mi>d</mi><mo>=</mo><mn>1</mn><mo>.</mo><mn>39</mn></mrow></math></span>, <span><math><mrow><mi>p</mi><mo><</mo><mn>0</mn><mo>.</mo><mn>001</mn></mrow></math></span>) but did not enhance the within-subjects decoding accuracy. Additionally, the morphological fidelity of the difference wave was preserved after spatial normalization, as evidenced by the high similarity between the normalized and original ERP difference waves across the seven binary paradigms. We validated our findings across diverse experimental paradigms and demonstrated that spatial normalization effectively enhances between-subjects decoding accuracy using SVM while preserving the temporal consistency of ERP, offering a generalizable preprocessing approach for EEG-based cognitive, clinical, and brain–computer interface (BCI) applications.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109535"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940709","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SimNet-SFDR: PET motion artifact correction via Sim(3)-Equivariant and Frequency-Based registration
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109569
Hui Zhou, Zhihui Wu, Longxi He, Yangsheng Hu, Zhouyuan Qin, Feng Wang, Jianfeng He
Respiratory motion introduces non-rigid anatomical deformation and signal blurring in PET imaging, leading to bias in lesion quantification and standardized uptake value (SUV) estimation. This study proposes SimNet-SFDR, an unsupervised 3D registration framework that integrates Sim(3)-equivariant encoding with structure-frequency domain regularization for accurate compensation of respiratory motion artifacts. The architecture couples geometric invariance with structure-aware refinement, enabling anatomically consistent and functionally stable deformation estimation.
Comprehensive experiments were conducted on simulated phantoms and multi-center 3D clinical PET/CT datasets, providing large-scale validation across heterogeneous scanners and acquisition protocols. On clinical data, SimNet-SFDR achieved an average SSIM of 0.953 ± 0.021 and CC of 0.958 ± 0.017, yielding improvements of approximately 4% over uncorrected images and 6% over the learning-based baseline VoxelMorph. Compared with the uncorrected images, the normalized mutual information increased by about 8%, whereas the target registration error and 95th percentile Hausdorff distance were reduced by 56% and 22%, respectively, demonstrating markedly improved geometric precision and deformation regularity.
Lesion-level subgroup analyses further demonstrated consistent performance across tumor sizes. The method maintained SUVmean deviations within ± 10% and sub-millimeter geometric error for small lesions (<10 mm), indicating stable quantification and effective suppression of motion- and partial-volume related bias.
These results confirm that SimNet-SFDR provides a robust and anatomically consistent motion-correction framework, offering practical potential for integration into quantitative and motion-aware PET imaging workflows in clinical environments.
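For context, two of the reported similarity metrics (Pearson correlation coefficient and normalized mutual information) can be computed between a motion-corrected volume and its reference as below; this is a generic sketch, not the authors' evaluation code.

```python
import numpy as np

def pearson_cc(a, b):
    """Correlation coefficient between two volumes (flattened)."""
    a, b = a.ravel(), b.ravel()
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def normalized_mutual_information(a, b, bins=64):
    """NMI = (H(A) + H(B)) / H(A, B), estimated from a joint intensity histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    h_xy = -np.sum(pxy[nz] * np.log(pxy[nz]))
    h_x = -np.sum(px[px > 0] * np.log(px[px > 0]))
    h_y = -np.sum(py[py > 0] * np.log(py[py > 0]))
    return float((h_x + h_y) / (h_xy + 1e-12))
```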
{"title":"SimNet-SFDR: PET motion artifact correction via Sim(3)-Equivariant and Frequency-Based registration","authors":"Hui Zhou , Zhihui Wu , Longxi He , Yangsheng Hu , Zhouyuan Qin , Feng Wang , Jianfeng He","doi":"10.1016/j.bspc.2026.109569","DOIUrl":"10.1016/j.bspc.2026.109569","url":null,"abstract":"<div><div>Respiratory motion introduces non-rigid anatomical deformation and signal blurring in PET imaging, leading to bias in lesion quantification and standardized uptake value (SUV) estimation. This study proposes SimNet-SFDR, an unsupervised 3D registration framework that integrates Sim(3)-equivariant encoding with structure-frequency domain regularization for accurate compensation of respiratory motion artifacts. The architecture couples geometric invariance with structure-aware refinement, enabling anatomically consistent and functionally stable deformation estimation.</div><div>Comprehensive experiments were conducted on simulated phantoms and multi-center 3D clinical PET/CT datasets, providing large-scale validation across heterogeneous scanners and acquisition protocols. On clinical data, SimNet-SFDR achieved an average SSIM of 0.953 ± 0.021 and CC of 0.958 ± 0.017, yielding improvements of approximately 4% over uncorrected images and 6% over the learning-based baseline VoxelMorph. Compared with the uncorrected images, the normalized mutual information increased by about 8%, whereas the target registration error and 95th percentile Hausdorff distance were reduced by 56% and 22%, respectively, demonstrating markedly improved geometric precision and deformation regularity.</div><div>Lesion-level subgroup analyses further demonstrated consistent performance across tumor sizes. The method maintained SUVmean deviations within ± 10% and sub-millimeter geometric error for small lesions (<10 mm), indicating stable quantification and effective suppression of motion- and partial-volume related bias.</div><div>These results confirm that SimNet-SFDR provides a robust and anatomically consistent motion-correction framework, offering practical potential for integration into quantitative and motion-aware PET imaging workflows in clinical environments.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109569"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Medical image segmentation based on 3D PDC with Swin Transformer
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109516
Lin Fan, Xiaojia Ding, Zhongmin Wang, Hai Wang, Rong Zhang
3D medical image segmentation is vital for disease diagnosis and effective treatment strategies. Despite the advancements of Convolutional Neural Networks (CNNs), their fixed receptive fields constrain global context modeling, leading to suboptimal performance, particularly with complex shapes and multi-scale variations. Transformers enhance global modeling capability through the self-attention mechanism; however, challenges persist in capturing fine-grained local features and extracting edge features. To overcome these issues, this study proposes a dual-path encoder consisting of a 3D Pixel Difference Convolution (PDC) branch and a Swin Transformer branch, combined with a CNN decoder. The 3D PDC module extracts local features by computing the differences between neighboring pixels. To improve the ability to capture edge structures, three 3D PDC variants are proposed: 3D Central Pixel Difference Convolution (CPDC), 3D Angular Pixel Difference Convolution (APDC), and 3D Radial Pixel Difference Convolution (RPDC), which optimize the handling of complex edges, multi-directional edges, and multi-scale structures, respectively. After performance evaluation, the selected CARV combination (3D CPDC, 3D APDC, 3D RPDC, and 3D ordinary convolution) strikes a balance between high segmentation accuracy and low computational cost. The Swin Transformer encoder captures global context using a hierarchical shifted-window mechanism, while the CNN decoder fuses features progressively to produce precise pixel-level segmentation. The proposed method exceeds the performance of existing state-of-the-art techniques, as shown by experiments conducted on the BTCV, FLARE21, and AMOS22 datasets.
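A minimal sketch of the central variant (3D CPDC) is given below, using the standard reformulation of pixel difference convolution as a vanilla convolution minus the center voxel scaled by the summed kernel weights; the angular and radial variants are omitted, and this is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CPDC3d(nn.Module):
    """3D central pixel difference convolution.

    Computes sum_k w_k * (x_k - x_center), which equals a vanilla 3D convolution
    minus the center voxel scaled by the sum of the kernel weights.
    """
    def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
        super().__init__()
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size, padding=padding, bias=False)

    def forward(self, x):
        vanilla = self.conv(x)
        # 1x1x1 response with the summed kernel weights -> x_center * sum(w).
        w_sum = self.conv.weight.sum(dim=(2, 3, 4), keepdim=True)
        center = F.conv3d(x, w_sum)
        return vanilla - center
```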
{"title":"Medical image segmentation based on 3D PDC with Swin Transformer","authors":"Lin Fan , Xiaojia Ding , Zhongmin Wang , Hai Wang , Rong Zhang","doi":"10.1016/j.bspc.2026.109516","DOIUrl":"10.1016/j.bspc.2026.109516","url":null,"abstract":"<div><div>3D medical image segmentation is vital for disease diagnosis and effective treatment strategies. Despite the advancements in Convolutional Neural Networks (CNN), their fixed receptive fields constrain global context modeling, leading to suboptimal performance, particularly with complex shapes and multi-scale variations. Transformer enhances the global modeling capability through self-attention mechanism. However, challenges persist in capturing fine-grained local features and extracting edge features. To overcome these issues, In this study, a dual-path encoder is proposed, which consists of a 3D Pixel Difference Convolution (PDC) and a Swin Transformer and is combined with a CNN for the decoder. The 3D PDC module extracts local features by calculating the differences between neighboring pixels. To improve the ability of capturing edge structures, three 3D PDC variants: 3D Central Pixel Difference Convolution (CPDC), 3D Angular Pixel Difference Convolution (APDC), and 3D Radial Pixel Difference Convolution (RPDC) are proposed to optimize the processing ability of complex edges, multi-directional edges, and multi-scale structures, respectively. After performance evaluation, the selected CARV combinations (3D CPDC, 3D APDC, 3D RPDC, and 3D ordinary convolution) achieve a compromise between high segmentation accuracy and low computational cost. The Swin Transformer encoder captures global context using a hierarchical shift-window mechanism, while the CNN decoder fuses features progressively to produce precise pixel-level segmentation. The proposed method exceeds the performance of existing state-of-the-art techniques, as shown by experiments conducted on the BTCV, FLARE21, and AMOS22 datasets.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109516"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Electroencephalogram decoding driven by shared semantic information for perception and imagination cognitive processes
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109586
Jinze Tong, Wanzhong Chen
Accurately decoding the semantics of visual perception and imagination Electroencephalogram (EEG) signals is vital for understanding brain functions. It also contributes to the improvement and expansion of brain–computer interfaces. However, existing EEG-based decoding methods often treat perception and imagination independently, ignoring their potential correlations and thus limiting decoding performance. To address this, inspired by the invariance of semantic information across cognitive processes, this paper proposes Shared Semantic Information Driven Network (SSIDNet). This model incorporates three modules: a dual-branch pre-trained Specialized Part for extracting features from EEG data of a single cognitive process; a Kolmogorov–Arnold Network (KAN)-based Public Part designed as a parameter-sharing parallel structure to extract shared semantic information from both perception and imagination EEG signals; and a Capsule Network (ccCapsNet)-based Fusion Part that integrates features and performs classification. Experiments on two public datasets demonstrate that SSIDNet increases accuracy by 12.4 and 15.47 percentage points over the Specialized Part alone, leading to notably better semantic decoding performance. Furthermore, the success of SSIDNet provides supporting algorithmic evidence for the existence of shared data patterns and semantic features between perception and imagination in the brain, and demonstrates the feasibility of leveraging this shared information to enhance EEG-based semantic decoding.
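The parameter-sharing idea behind the Public Part can be illustrated with the toy module below, where a single set of weights processes perception and imagination features in parallel; a plain MLP stands in for the KAN layers, and the fusion head is a simplified placeholder for the capsule-based Fusion Part.

```python
import torch
import torch.nn as nn

class SharedPublicPart(nn.Module):
    """One set of weights applied in parallel to perception and imagination features,
    forcing both branches onto a common (shared-semantic) representation.
    A plain MLP stands in here for the KAN layers described in the paper."""
    def __init__(self, feat_dim, shared_dim):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(feat_dim, shared_dim), nn.GELU(), nn.Linear(shared_dim, shared_dim)
        )

    def forward(self, f_perception, f_imagination):
        # Identical parameters process both cognitive processes.
        return self.shared(f_perception), self.shared(f_imagination)

class FusionHead(nn.Module):
    """Hypothetical concatenation-based fusion and classification head."""
    def __init__(self, shared_dim, n_classes):
        super().__init__()
        self.cls = nn.Linear(2 * shared_dim, n_classes)

    def forward(self, z_p, z_i):
        return self.cls(torch.cat([z_p, z_i], dim=-1))
```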
{"title":"Electroencephalogram decoding driven by shared semantic information for perception and imagination cognitive processes","authors":"Jinze Tong, Wanzhong Chen","doi":"10.1016/j.bspc.2026.109586","DOIUrl":"10.1016/j.bspc.2026.109586","url":null,"abstract":"<div><div>Accurately decoding the semantics of visual perception and imagination Electroencephalogram (EEG) signals is vital for understanding brain functions. It also contributes to the improvement and expansion of brain–computer interfaces. However, existing EEG-based decoding methods often treat perception and imagination independently, ignoring their potential correlations and thus limiting decoding performance. To address this, inspired by the invariance of semantic information across cognitive processes, this paper proposes Shared Semantic Information Driven Network (SSIDNet). This model incorporates three modules: a dual-branch pre-trained Specialized Part for extracting features from EEG data of a single cognitive process; a Kolmogorov–Arnold Network (KAN)-based Public Part designed as a parameter-sharing parallel structure to extract shared semantic information from both perception and imagination EEG signals; and a Capsule Network (ccCapsNet)-based Fusion Part that integrates features and performs classification. Experiments on two public datasets demonstrate that SSIDNet increases accuracy by 12.4 and 15.47 percentage points over the Specialized Part alone, leading to notably better semantic decoding performance. Furthermore, the success of SSIDNet provides supporting algorithmic evidence for the existence of shared data patterns and semantic features between perception and imagination in the brain, and demonstrates the feasibility of leveraging this shared information to enhance EEG-based semantic decoding.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109586"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Veress needle guidance in pneumoperitoneum creation using optical coherence tomography and machine learning
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109495
Meng-Chun Kao, Eric Yi-Hsiu Huang, Ting Chang, Wen-Chuan Kuo
Pneumoperitoneum creation is a crucial step in laparoscopic surgery. Blind insertion of a Veress needle into the abdominal wall relies mainly on the surgeon’s experience, which leads to failure rates of over 3% in clinical practice, including preperitoneal insufflation, gas embolism, and vascular or visceral injury. This study proposes a novel method to create pneumoperitoneum using fiber-probe optical coherence tomography (OCT) as a real-time imaging guide. Our image analysis process and automatic identification techniques for the peritoneum and its intra- and extraperitoneal tissues reveal that four distinct image features—Mad, root mean square, standard deviation, and coarseness—can effectively describe the various tissue structures encountered during needle puncture. Using these four features as inputs, various classifiers were employed to differentiate the peritoneum from intra- and extraperitoneal tissues. The Cubic Support Vector Machine (CSVM) classifier achieves an average precision of 98.6% in identifying the peritoneum and its intra- and extraperitoneal tissues. With intelligent, objective OCT image-guided puncture providing real-time recognition at the needle tip, pneumoperitoneum can be established effectively and safely, avoiding failures caused by human judgment error and reducing reliance on the surgeon’s subjective assessment and on repeated insertion attempts. Reducing failures can significantly lower medical insurance costs and the ongoing expenses associated with post-operative complications. Adopting these advanced imaging technologies as laparoscopic techniques evolve is crucial for improving surgical precision and patient care.
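A hedged sketch of the feature-plus-classifier pipeline: the four statistics are computed per OCT patch and fed to a degree-3 polynomial-kernel SVM (the usual meaning of a cubic SVM). The Mad and coarseness definitions used here are assumptions, not the paper's exact formulas.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def oct_features(patch):
    """Simple intensity statistics of an OCT image patch.

    'Mad' is taken here as mean absolute deviation and coarseness as a crude
    local-contrast proxy; both are assumptions, not the paper's definitions.
    """
    x = patch.astype(float).ravel()
    mad = np.mean(np.abs(x - x.mean()))
    rms = np.sqrt(np.mean(x ** 2))
    std = x.std()
    coarseness = np.mean(np.abs(np.diff(x)))          # placeholder texture measure
    return [mad, rms, std, coarseness]

def train_cubic_svm(patches, labels):
    """Cubic SVM = SVC with a degree-3 polynomial kernel."""
    X = np.array([oct_features(p) for p in patches])
    clf = make_pipeline(StandardScaler(), SVC(kernel="poly", degree=3))
    return clf.fit(X, labels)
```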
{"title":"Veress needle guidance in pneumoperitoneum creation using optical coherence tomography and machine learning","authors":"Meng-Chun Kao , Eric Yi-Hsiu Huang , Ting Chang , Wen-Chuan Kuo","doi":"10.1016/j.bspc.2026.109495","DOIUrl":"10.1016/j.bspc.2026.109495","url":null,"abstract":"<div><div>Pneumoperitoneum creation is a crucial step in laparoscopic surgery. Blind insertion of a Veress needle into the abdominal wall mainly relies on the surgeon’s experiences, which may lead to more than 3% of failures in the clinic, including preperitoneal insufflation, gas embolism, and vascular or visceral injury. This study proposed a novel method to create pneumoperitoneum using fiber-probe optical coherence tomography (OCT) as a real-time imaging guide. Our image analysis process and automatic identification techniques for the peritoneum and its intra- and extraperitoneal tissues reveal that four distinct image features—Mad, root mean square, standard deviation, and coarseness—can effectively describe the various tissue structures encountered during needle puncture. By combining these four features as inputs, various classifiers were employed to differentiate the peritoneum from intra- and extraperitoneal tissues. The Cubic Support Vector Machine (CSVM) classifier achieves an average precision of 98.6% in identifying peritoneum and its intra- and extraperitoneal tissues. Using intelligent and objective OCT image-guided puncture for real-time recognition of the needle tip, pneumoperitoneum can be effectively and safely established, thereby avoiding the failure caused by human judgment errors, the surgeon’s opinion, and the number of tries required. Reducing failure can significantly lower medical insurance costs and the ongoing expenses associated with post-operative complications. Adopting these advanced imaging technologies as laparoscopic techniques evolve is crucial for improving surgical precision and patient care.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109495"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CDCLIP: An interpretable zero-shot medical image classification framework based on concept decomposition
Pub Date: 2026-01-10 | DOI: 10.1016/j.bspc.2026.109585
Zheng Li, Xiangwei Zheng, Dejian Su, Mingzhe Zhang
Contrastive Language-Image Pre-Training (CLIP) has shown superior performance in zero-shot natural image classification. However, its effective application to medical tasks remains a challenge. Some existing studies fine-tune CLIP on small, self-collected datasets, which reduces generalization ability and causes catastrophic forgetting. In this paper, we propose a novel Concept Decomposition-based CLIP framework (CDCLIP) aimed at improving the classification accuracy of previously unseen medical images without the need for retraining. Specifically, CDCLIP exploits external prior knowledge and a multi-level structural approach to disentangle medical diseases into several regular visual concepts. Notably, CDCLIP shifts the analytical focus from the disease category to the degree of correlation between the images and the prompts derived from the decomposed concepts, which allows medical attributes to be evaluated more effectively. By introducing an inference mechanism, prompts composed of specific attributes are used to infer the final medical diagnosis. Comprehensive experiments are conducted on four datasets (covering multiple diseases under endoscopy, CT, X-ray, and retinal imaging), and the results demonstrate that CDCLIP exhibits better generalization ability. Compared to CLIP, CDCLIP achieves significant average accuracy improvements in intestinal metaplasia identification (+3.64%), lung cancer identification (+22.71%), tuberculosis detection (+11.96%), glaucoma analysis (+10.4%), and breast tumor identification (+49.87%).
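The concept-decomposition idea can be sketched with the OpenAI clip package as below: each class is scored by averaging image-text similarities over attribute-level prompts instead of a single class-name prompt. The concept lists and the mean aggregation are illustrative assumptions, not CDCLIP's actual inference mechanism.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Each class is decomposed into visual concepts (illustrative attributes only).
concepts = {
    "intestinal metaplasia": ["pale mucosal patches", "villous surface pattern"],
    "normal gastric mucosa": ["smooth pink mucosa", "regular fold pattern"],
}

def classify(image_path):
    image = preprocess(Image.open(image_path)).unsqueeze(0).to(device)
    with torch.no_grad():
        img_feat = model.encode_image(image)
        img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
        scores = {}
        for cls, attrs in concepts.items():
            tokens = clip.tokenize([f"an endoscopic image showing {a}" for a in attrs]).to(device)
            txt_feat = model.encode_text(tokens)
            txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
            # Aggregate attribute-level similarities into a single class score.
            scores[cls] = (img_feat @ txt_feat.T).mean().item()
    return max(scores, key=scores.get)
```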
{"title":"CDCLIP: An interpretable zero-shot medical image classification framework based on concept decomposition","authors":"Zheng Li , Xiangwei Zheng , Dejian Su , Mingzhe Zhang","doi":"10.1016/j.bspc.2026.109585","DOIUrl":"10.1016/j.bspc.2026.109585","url":null,"abstract":"<div><div>Contrastive Language-Image Pre-Training (CLIP) has shown superior performances in zero-shot natural image classification. However, its effective application to medical-related tasks remains a challenge. Some existing studies suffer from decreased generalization ability and catastrophic forgetting for utilizing self-collected insufficient datasets to fine-tune CLIP. In this paper, we propose a novel Concept Decomposition-based CLIP framework (CDCLIP) aimed at improving the classification accuracy of previously unseen medical images, obviating the need for retraining. Specifically, CDCLIP exploits external prior knowledge and the multi-level structural approach to disentangle medical diseases into several regular visual concepts. Notably, CDCLIP shifts the analytical focus from the disease category to the correlation degree between the images and the prompts derived from the decomposed concepts, which helps medical attributes be better evaluated. By introducing inference mechanism, the prompts composed of specific attributes serve to infer the final medical diagnosis. Comprehensive experiments are conducted on four datasets (including multiple diseases under endoscopy, CT, X-ray, and retina images) and the results demonstrate that CDCLIP owns better generalization ability. Compared to CLIP, CDCLIP achieves significant average accuracy improvement of intestinal metaplasia identification (+3.64%), lung cancer identification (+22.71%), tuberculosis detection (+11.96%), glaucoma analysis (+10.4%), and breast tumor identification (+49.87%).</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"116 ","pages":"Article 109585"},"PeriodicalIF":4.9,"publicationDate":"2026-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145940770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}