Pub Date: 2026-01-31, DOI: 10.1016/j.bspc.2026.109715
Yawen Fan , Xiang Wang , Zhen Yue , Xinchen Zhang , Mingkai Chen , Jianxin Chen
Automated classification of brain tumors is essential for reliable diagnosis and effective treatment planning. However, deep learning-based methods require large, well-labeled MRI datasets, which can be expensive, time-consuming, and challenging to obtain in clinical settings. Moreover, real-world datasets often exhibit severe class imbalance and inter-subject variability, both of which can compromise model robustness and limit generalization to unseen cases. In this paper, we introduce a novel clustering-enhanced dynamic active learning framework for brain tumor classification. First, the proposed framework extracts high-level features from MRI images with a self-supervised learning method; these features are then clustered to form a multi-class data pool that provides a pre-classification of the samples. To reduce annotation effort while maintaining model performance, the framework dynamically selects the most informative samples from each cluster by jointly considering prediction uncertainty and cluster diversity. Additionally, we constructed a high-quality brain tumor MRI dataset that includes three tumor types: glioma, metastatic tumor, and diffuse large B-cell lymphoma; notably, the latter is scarce in existing public datasets. Extensive experiments on both public and private datasets show that the proposed method achieves competitive performance using only a small portion of labeled data. Moreover, on an external test set, the method obtained an average accuracy of 0.92. These results suggest that our method offers a practical and efficient solution for MRI-based brain tumor classification in real-world clinical settings.
{"title":"Clustering-enhanced active learning with dynamic sampling for brain tumor classification","authors":"Yawen Fan , Xiang Wang , Zhen Yue , Xinchen Zhang , Mingkai Chen , Jianxin Chen","doi":"10.1016/j.bspc.2026.109715","DOIUrl":"10.1016/j.bspc.2026.109715","url":null,"abstract":"<div><div>Automated classification of brain tumors is essential for reliable diagnosis and effective treatment planning. However, deep learning-based methods require large, well-labeled MRI datasets, which can be expensive, time-consuming, and challenging to obtain in clinical settings. Moreover, real-world datasets often exhibit severe class imbalance and inter-subject variability, both of which can compromise model robustness and limit generalization to unseen cases. In this paper, we introduce a novel dynamic active learning framework enhanced by clustering for brain tumor classification. First, the proposed framework extracts high-level features of MRI images by a self-supervised learning method, which are then clustered to form a multi-class data pool, providing a pre-classification of the samples. To reduce annotation effort while maintaining model performance, the framework dynamically selects the most informative samples from each cluster by jointly considering prediction uncertainty and cluster diversity. Additionally, we have constructed a high-quality brain tumor MRI dataset that includes three tumor types: glioma, metastatic tumor, and diffuse large B-cell lymphoma. Notably, the latter is scarce in existing public datasets. Extensive experiments on both public and private datasets show that the proposed method achieves competitive performance using only a small portion of labeled data. Also, on an external test set, the method obtained an average accuracy of 0.92. All these results suggest that our method offers a practical and efficient solution for MRI-based brain tumor classification in real-world clinical settings.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109715"},"PeriodicalIF":4.9,"publicationDate":"2026-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080740","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30, DOI: 10.1016/j.bspc.2026.109675
Saleh Lashkari , Mohammad Ali Khalilzadeh , Seyyed Ali Zendehbad , Mohammad Reza Khakzad , Shahryar Salmani Bajestani , Elias Mazrooei Rad
Introduction
Describing dynamic changes in brain signals, particularly in electroencephalogram (EEG) data, is essential for analyzing nonlinear and chaotic patterns, especially in epilepsy. This work presents an approach that combines recurrence plots with textural features for epileptic seizure detection. The method provides a detailed picture of brain activity and can be helpful for clinical neuroscience.
Materials and methods
In this study, Gray-Level Co-occurrence Matrix (GLCM) texture features were computed from recurrence plots of EEG signals to characterize neural dynamics. GLCM measures spatial relationships in the data, providing detailed insights into temporal and spatial patterns of brain activity. The method was initially validated on chaotic systems, demonstrating its ability to capture nonlinear behaviors. It was then applied to EEG data to detect seizures, highlighting its potential in clinical settings.
Results
The proposed framework outperformed traditional Recurrence Quantification Analysis (RQA) and other methods in detecting epileptic seizures. The GLCM-enhanced recurrence plots provided a more accurate and sensitive representation of brain dynamics, allowing for earlier and more reliable seizure detection, and the method shows strong potential for clinical applications. On the Bonn EEG corpus, the proposed GRP–GLCM + SVM pipeline achieved accuracies of 98.6% (Case 1: AB/CD/E), 99.6% (Case 2: ABCD/E), and 100% (Case 3: D/E) under nested cross-validation. Precision, recall, and F1 were ≥0.98 in Cases 1–2 and 1.00 in Case 3 (zero FP/FN). Compared with an RQA baseline (76.8%, 95.6%, and 91.0%), these results represent gains of +21.8, +4.0, and +9.0 percentage points, while remaining competitive with recent CNN-based approaches and preserving interpretability.
Conclusion
This study demonstrates that textural analysis of recurrence plots, particularly using GLCM features, provides a robust and efficient tool for epileptic seizure detection. By capturing subtle changes in brain activity, the framework offers a promising approach for improving early detection and intervention in clinical neuroscience.
{"title":"Improved Detection of Epileptic Seizures via EEG Signals and Texture Analysis of Recurrence Plots","authors":"Saleh Lashkari , Mohammad Ali Khalilzadeh , Seyyed Ali Zendehbad , Mohammad Reza Khakzad , Shahryar Salmani Bajestani , Elias Mazrooei Rad","doi":"10.1016/j.bspc.2026.109675","DOIUrl":"10.1016/j.bspc.2026.109675","url":null,"abstract":"<div><h3>Introduction</h3><div>It is thus essential to describe dynamic changes in brain signals, especially in the Electroencephalogram (EEG) data, to analyze nonlinear and chaotic patterns, particularly in epilepsy. This work presents an approach that seeks to incorporate recurrence plots with textural features in epilepsy seizure detection. This method provides a detailed picture of brain activity and can be helpful for clinical neuroscience.</div></div><div><h3>Materials and methods</h3><div>In this study, Gray-Level Co-occurrence Matrix (GLCM) texture features were computed from recurrence plots of EEG signals to characterize neural dynamics. GLCM measures spatial relationships in the data, providing detailed insights into temporal and spatial patterns of brain activity. The method was initially validated on chaotic systems, demonstrating its ability to capture nonlinear behaviors. It was then applied to EEG data to detect seizures, highlighting its potential in clinical settings.</div></div><div><h3>Results</h3><div>The proposed framework outperformed traditional Recurrence Quantification Analysis (RQA) and other methods in detecting epileptic seizures. The GLCM-enhanced recurrence plots provided a more accurate and sensitive representation of brain dynamics, allowing for earlier and more reliable seizure detection. This method shows strong potential for clinical applications, enhancing the ability to detect seizures early. On the Bonn EEG corpus, the proposed GRP–GLCM + SVM pipeline achieved 98.6 % (Case 1: AB/CD/E), 99.6 % (Case 2: ABCD/E), and 100 % (Case 3: D/E) Accuracy under nested cross-validation. Precision, Recall, and F1 were ≥ 0.98 in Cases 1–2 and 1.00 in Case 3 (zero FP/FN). Compared with an RQA baseline (76.8 %, 95.6 %, 91.0 %), these results reflect + 21.8, +4.0, and + 9.0 percentage-point gains while remaining competitive with recent CNN-based approaches and preserving interpretability.</div></div><div><h3>Conclusion</h3><div>This study demonstrates that textural analysis of recurrence plots, particularly using GLCM features, provides a robust and efficient tool for epileptic seizure detection. By capturing subtle changes in brain activity, the framework offers a promising approach for improving early detection and intervention in clinical neuroscience.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109675"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146081279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30, DOI: 10.1016/j.bspc.2026.109618
Mojtaba Jahanian , Abbas Karimi , Nafiseh Osati Eraghi , Faraneh Zarafshan
Background:
Pneumonia remains one of the leading causes of childhood mortality worldwide, especially in low-resource clinical settings where access to expert radiologists is limited. Automated and interpretable deep learning models can provide rapid and reliable diagnostic support.
Objective:
This study introduces AXNet+ECA, a lightweight attention-augmented convolutional neural network, designed to improve pneumonia detection from pediatric chest X-ray (CXR) images while ensuring computational efficiency and interpretability. The novelty of AXNet+ECA lies in the dual-attention integration of Convolutional Block Attention Module (CBAM) and Efficient Channel Attention (ECA) mechanisms within a lightweight backbone, jointly enhancing diagnostic accuracy and model interpretability while maintaining computational frugality.
Methods:
The proposed model builds upon the ResNet-18 backbone by embedding CBAM blocks within each residual stage and appending an ECA head for fine-grained channel calibration. AXNet+ECA was trained and evaluated on 5863 pediatric chest X-ray images from the publicly available Kaggle pneumonia dataset, using an 80–10–10 train/validation/test split. Evaluation encompassed baseline comparisons, ablation studies, robustness analysis, and statistical significance testing.
Results:
AXNet+ECA achieved a test accuracy of 93.6%, F1-score of 93.1%, and AUC of 0.964, outperforming or matching CNN baselines (ResNet-18, DenseNet-121, VGG-16, CheXNet) and recent transformer-based models (ViT-B/16, Swin-T). Despite competitive performance, AXNet+ECA requires only 13.1M parameters and 4.7 ms/image inference time, highlighting its computational efficiency. Visual interpretability via CBAM and Grad-CAM revealed 86.7% alignment with radiologist-annotated abnormalities.
Conclusion:
By integrating dual-path attention within a compact architecture, AXNet+ECA achieves an effective balance between diagnostic accuracy, interpretability, and efficiency. These characteristics underline its potential for real-time clinical deployment in resource-constrained healthcare environments and large-scale screening initiatives.
{"title":"AXNet: Attention-enhanced X-ray network for pneumonia detection","authors":"Mojtaba Jahanian , Abbas Karimi , Nafiseh Osati Eraghi , Faraneh Zarafshan","doi":"10.1016/j.bspc.2026.109618","DOIUrl":"10.1016/j.bspc.2026.109618","url":null,"abstract":"<div><h3>Background:</h3><div>Pneumonia remains one of the leading causes of childhood mortality worldwide, especially in low-resource clinical settings where access to expert radiologists is limited. Automated and interpretable deep learning models can provide rapid and reliable diagnostic support.</div></div><div><h3>Objective:</h3><div>This study introduces <strong>AXNet+ECA</strong>, a lightweight attention-augmented convolutional neural network, designed to improve pneumonia detection from pediatric chest X-ray (CXR) images while ensuring computational efficiency and interpretability. The novelty of AXNet+ECA lies in the dual-attention integration of <em>Convolutional Block Attention Module (CBAM)</em> and <em>Efficient Channel Attention (ECA)</em> mechanisms within a lightweight backbone, jointly enhancing diagnostic accuracy and model interpretability while maintaining computational frugality.</div></div><div><h3>Methods:</h3><div>The proposed model builds upon the ResNet-18 backbone by embedding CBAM blocks within each residual stage and appending an ECA head for fine-grained channel calibration. AXNet+ECA was trained and evaluated on 5863 pediatric chest X-ray images from the publicly available Kaggle pneumonia dataset, using an 80–10–10 train/validation/test split. Evaluation encompassed baseline comparisons, ablation studies, robustness analysis, and statistical significance testing.</div></div><div><h3>Results:</h3><div>AXNet+ECA achieved a test accuracy of <strong>93.6%</strong>, F1-score of <strong>93.1%</strong>, and AUC of <strong>0.964</strong>, outperforming or matching CNN baselines (ResNet-18, DenseNet-121, VGG-16, CheXNet) and recent transformer-based models (ViT-B/16, Swin-T). Despite competitive performance, AXNet+ECA requires only <strong>13.1M parameters</strong> and <strong>4.7 ms/image</strong> inference time, highlighting its computational efficiency. Visual interpretability via CBAM and Grad-CAM revealed <strong>86.7%</strong> alignment with radiologist-annotated abnormalities.</div></div><div><h3>Conclusion:</h3><div>By integrating dual-path attention within a compact architecture, AXNet+ECA achieves an effective balance between diagnostic accuracy, interpretability, and efficiency. These characteristics underline its potential for real-time clinical deployment in resource-constrained healthcare environments and large-scale screening initiatives.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109618"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30, DOI: 10.1016/j.bspc.2026.109669
Anny Maza , Sandra Goizueta , María Dolores Navarro , Enrique Noé , Joan Ferri , Valery Naranjo , Roberto Llorens
Emerging evidence suggests that personalized emotional stimuli can elicit measurable brain responses indicative of cognitive processing. This could be particularly relevant in patients with disorders of consciousness (DOC), where accurate assessment remains remarkably challenging. Despite this, the electroencephalography (EEG) features most sensitive to these personalized emotional stimuli, and their generalizability from healthy individuals to patients with DOC, remain underexplored. This study aimed to identify EEG features distinguishing brain responses to familiar versus non-familiar audiovisual stimuli in healthy controls and to assess their applicability in patients with DOC. Nineteen healthy controls and nineteen patients with DOC viewed personalized emotional videos featuring either familiar or unfamiliar individuals while EEG data were recorded. Seventeen EEG features across various domains were compared using subject-independent machine-learning models in healthy controls, and the top-performing features were validated in patients with DOC. Results indicated fuzzy entropy, common spatial pattern (CSP), and Hjorth activity were the most discriminative features. Applying models trained on healthy individuals to patients with DOC revealed statistically significant performances in 60% of patients in minimally conscious state and 33% of patients with unresponsive wakefulness syndrome. Topographical analyses identified prominent differences in temporal, parietal, and frontal regions within beta and gamma bands for healthy controls, partially replicated in responsive patients. These findings underscore fuzzy entropy, CSP, and Hjorth activity as sensitive EEG markers for detecting emotional responses to personalized videos in healthy controls. Their partial generalization to patients with DOC highlights potential clinical utility in assessing residual cognitive processing and consciousness, particularly in responsive states.
{"title":"Identifying relevant EEG features for personalized emotional videos: a cross-population analysis in healthy controls and patients with disorders of consciousness","authors":"Anny Maza , Sandra Goizueta , María Dolores Navarro , Enrique Noé , Joan Ferri , Valery Naranjo , Roberto Llorens","doi":"10.1016/j.bspc.2026.109669","DOIUrl":"10.1016/j.bspc.2026.109669","url":null,"abstract":"<div><div>Emerging evidence suggests that personalized emotional stimuli can elicit measurable brain responses indicative of cognitive processing. This could be particularly relevant in patients with disorders of consciousness (DOC), where accurate assessment remains remarkably challenging. Despite this, the electroencephalography (EEG) features most sensitive to these personalized emotional stimuli, and their generalizability from healthy individuals to patients with DOC, remain underexplored. This study aimed to identify EEG features distinguishing brain responses to familiar versus non-familiar audiovisual stimuli in healthy controls and to assess their applicability in patients with DOC. Nineteen healthy controls and nineteen patients with DOC viewed personalized emotional videos featuring either familiar or unfamiliar individuals while EEG data were recorded. Seventeen EEG features across various domains were compared using subject-independent machine-learning models in healthy controls, and the top-performing features were validated in patients with DOC. Results indicated fuzzy entropy, common spatial pattern (CSP), and Hjorth activity were the most discriminative features. Applying models trained on healthy individuals to patients with DOC revealed statistically significant performances in 60% of patients in minimally conscious state and 33% of patients with unresponsive wakefulness syndrome. Topographical analyses identified prominent differences in temporal, parietal, and frontal regions within beta and gamma bands for healthy controls, partially replicated in responsive patients. These findings underscore fuzzy entropy, CSP, and Hjorth activity as sensitive EEG markers for detecting emotional responses to personalized videos in healthy controls. Their partial generalization to patients with DOC highlights potential clinical utility in assessing residual cognitive processing and consciousness, particularly in responsive states.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109669"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30, DOI: 10.1016/j.bspc.2026.109687
Xiaolin Wang , Xiaowei Li , Jing Li , Yunfa Fu , Dan Zhang , Yan Peng
Limited characterization of the dynamic coupling relationships between multimodal signals has constrained research into the neural mechanisms of sleep disorders. Existing studies primarily focus on intra-modal or cross-frequency coupling analysis and lack methods that can simultaneously quantify nonlinear, time-varying cross-modal coupling and be applied to sleep staging, thereby failing to reveal cross-modal coupling relationships effectively. To address this issue, a novel end-to-end dynamic graph constructor-spatio-temporal graph attention network (DGC-STGAT) is proposed for modeling cross-modal dynamic coupling relationships among electroencephalogram (EEG), electrocardiogram (ECG), and electrooculogram (EOG) signals. The DGC module maps multimodal signals into graph node representations through node feature encoding and adaptively constructs weighted adjacency matrices based on intermodal similarity, thereby generating dynamic graph sequences. The ST-GAT integrates a spatial graph attention mechanism with a bidirectional long short-term memory (Bi-LSTM) network to jointly model the spatial dependency structure and temporal evolution of dynamic graph sequences. This extracts spatio-temporal features for analyzing cross-modal coupling relationships and automatically classifying five sleep stages, namely wakefulness (WAKE), non-rapid eye movement (NREM) stages N1, N2, and N3, and rapid eye movement (REM) sleep. Experimental results on a private polysomnography (PSG) dataset involving 50 subjects and the public ISRUC-Sleep dataset demonstrate the effectiveness of the proposed framework. The EEG-EOG modality pair dominates cross-modal coupling, accounting for approximately 80% of the coupling ratio, while modality pairs incorporating ECG contribute significantly less, highlighting the asymmetry in interaction patterns across modalities. In the five-class sleep staging task, DGC-STGAT achieved classification accuracies of 89.1% and 88.6% on the private and ISRUC-Sleep datasets, respectively, marking an improvement of 1.6% over the best-performing baseline model ST-GCN. DGC-STGAT also outperforms six representative baseline methods overall: DeepSleepNet, AttnSleep, SleepTransformer, ST-GCN, Sleep-CLIP, and MVF-SleepNet. By modeling dynamic cross-modal coupling relationships and applying them to sleep staging, this study not only provides interpretable coupling patterns and achieves high overall classification performance but also offers new insights into the synergistic mechanisms of multimodal physiological signals.
{"title":"Dynamic cross-modal spatio-temporal graph attention network: Multimodal coupling analysis in sleep stage classification","authors":"Xiaolin Wang , Xiaowei Li , Jing Li , Yunfa Fu , Dan Zhang , Yan Peng","doi":"10.1016/j.bspc.2026.109687","DOIUrl":"10.1016/j.bspc.2026.109687","url":null,"abstract":"<div><div>The lack of dynamic coupling relationships between multimodal signals has limited research into the neural mechanisms of sleep disorders. However, existing studies primarily focus on intra-modal or cross-frequency coupling analysis, lacking methods that can simultaneously quantify nonlinear, time-varying cross-modal coupling and be applied to sleep staging, thereby failing to reveal cross-modal coupling relationships effectively. To address this issue, a novel end-to-end dynamic graph constructor-spatio-temporal graph attention network (DGC-STGAT) is proposed for modeling cross-modal dynamic coupling relationships among electroencephalogram (EEG), electrocardiogram (ECG), and electrooculogram (EOG) signals. The DGC module maps multimodal signals into graph node representations through node feature encoding and adaptively constructs weighted adjacency matrices based on intermodal similarity, thereby generating dynamic graph sequences. The ST-GAT integrates a spatial graph attention mechanism with bidirectional long short-term memory (Bi-LSTM) network to jointly model the spatial dependency structure and temporal evolution of dynamic graph sequences. This extracts spatio-temporal features for analyzing cross-modal coupling relationships and automatically classifying five sleep stages, namely wakefulness (WAKE), non-rapid eye movement (NREM) stages N1, N2, and N3, and rapid eye movement (REM) sleep. Experimental results on a private polysomnography (PSG) dataset involving 50 subjects and the public ISRUC-Sleep dataset demonstrate the effectiveness of the proposed framework. The EEG-EOG modality pair dominates cross-modal coupling, accounting for approximately 80% of the coupling ratio, while modality pairs incorporating ECG contribute significantly less, highlighting the asymmetry in interaction patterns across modalities. In the five-class sleep staging task, DGC-STGAT achieved classification accuracies of 89.1% and 88.6% on the private and ISRUC-Sleep datasets, respectively, marking an improvement of 1.6% over the best-performing baseline model ST-GCN. The overall classification performance of DGC-STGAT outperforms six representative baseline methods DeepSleepNet, AttnSleep, SleepTransformer, ST-GCN, Sleep-CLIP, and MVF-SleepNet. By modeling dynamic cross-modal coupling relationships and applying them to sleep staging, this study not only provides interpretable coupling patterns and achieves high overall classification performance but also offers new insights into the synergistic mechanisms of multimodal physiological signals.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109687"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30, DOI: 10.1016/j.bspc.2026.109700
Diogo Mendes , João Manuel R.S. Tavares
Lumbar spine disorders are a major cause of disability worldwide, often requiring magnetic resonance imaging (MRI) for accurate diagnosis and treatment planning. Segmentation of spinal structures in MRI is a critical yet time-consuming task that is traditionally performed manually by experts. In recent years, deep learning (DL) has emerged as a powerful tool to automate this process, offering improvements in efficiency, consistency, and accuracy. This systematic review presents a comprehensive analysis of DL-based methods for lumbar spine segmentation in MRI, conducted by searching the Scopus and PubMed databases for peer-reviewed journal articles and reviews. Studies were included only if they employed MRI as the imaging modality, applied deep learning techniques for segmentation, and explicitly reported both the model architecture and segmentation results. A total of 56 studies met these criteria, comprising 49 original research articles and 7 review articles. The selected works were analyzed in two main dimensions: (1) dataset characteristics, including image orientation, modality, annotation methods, and data availability; and (2) deep learning frameworks, covering preprocessing, data augmentation, network architectures, and evaluation metrics. Convolutional neural networks (CNNs), particularly U-Net and its variants, are the most used architectures, often enhanced with residual blocks, attention mechanisms, and multi-scale feature extraction. Despite promising results, most studies relied on private datasets and lacked external validation, highlighting challenges to reproducibility.
{"title":"Deep learning for lumbar spine segmentation in magnetic resonance imaging—A systematic review","authors":"Diogo Mendes , João Manuel R.S. Tavares","doi":"10.1016/j.bspc.2026.109700","DOIUrl":"10.1016/j.bspc.2026.109700","url":null,"abstract":"<div><div>Lumbar spine disorders are a major cause of disability worldwide, often requiring magnetic resonance imaging (MRI) for accurate diagnosis and treatment planning. Segmentation of spinal structures in MRI is a critical yet time-consuming task that is traditionally performed manually by experts. In recent years, deep learning (DL) has emerged as a powerful tool to automate this process, offering improvements in efficiency, consistency, and accuracy. This systematic review presents a comprehensive analysis of DL-based methods for lumbar spine segmentation in MRI, conducted by searching the Scopus and PubMed databases for peer-reviewed journal articles and reviews. Studies were included only if they employed MRI as the imaging modality, applied deep learning techniques for segmentation, and explicitly reported both the model architecture and segmentation results. A total of 56 studies met these criteria, comprising 49 original research articles and 7 review articles. The selected works were analyzed in two main dimensions: (1) dataset characteristics, including image orientation, modality, annotation methods, and data availability; and (2) deep learning frameworks, covering preprocessing, data augmentation, network architectures, and evaluation metrics. Convolutional neural networks (CNNs), particularly U-Net and its variants, are the most used architectures, often enhanced with residual blocks, attention mechanisms, and multi-scale feature extraction. Despite promising results, most studies relied on private datasets and lacked external validation, highlighting challenges to reproducibility.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109700"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-30, DOI: 10.1016/j.bspc.2026.109688
Yuncheng Jin , Jing Zheng , Bin Sun , Kaidi Fu , Jingjing Qu , Chenyi Ren , Ting Wang , Yuncui Gan , Binggen Wu , Xinyu Jin , Jianya Zhou
Background and Objective
Advanced deep learning techniques remain underutilized in the diagnosis and treatment of checkpoint inhibitor-related pneumonia (CIP). To address this gap, this study proposed a CIP prediction algorithm based on multimodal data fusion.
Methods
Specifically, the algorithm constructed a model using patients’ computed tomography (CT) imaging, electronic medical records, and physiological examination reports. First, feature extraction modules were developed to process each data modality. Subsequently, the multimodal features were fused using a cross-attention approach. Finally, the fused features were passed to a classifier.
Results
Experimental results demonstrated that the multimodal cross-attention network (MMCA-Net) significantly outperformed single-modality models and traditional fusion methods. In 10-fold patient-level cross-validation, the proposed model achieved an average accuracy of 87.12% (±0.83%) and an area under the curve (AUC) of 0.8981 (±0.007). Furthermore, the algorithm showed excellent reproducibility, with a performance deviation of less than 0.2% in independent replication trials. Quantitative analysis of attention weights confirmed that the model effectively integrated clinical context to resolve ambiguous radiological patterns.
Conclusions
The proposed deep learning-based multimodal method provides a stable and highly accurate tool for predicting CIP. By integrating information from imaging, textual data, and laboratory results, MMCA-Net offers a valuable clinical reference for physicians, with the potential to enhance patient safety and improve treatment outcomes in the management of cancer immunotherapy.
{"title":"A deep learning-based multimodal data fusion algorithm for predicting checkpoint inhibitors related pneumonia","authors":"Yuncheng Jin , Jing Zheng , BinSun , Kaidi Fu , Jingjing Qu , Chenyi Ren , Ting Wang , Yuncui Gan , Binggen Wu , Xinyu Jin , Jianya Zhou","doi":"10.1016/j.bspc.2026.109688","DOIUrl":"10.1016/j.bspc.2026.109688","url":null,"abstract":"<div><h3>Background and Objective</h3><div>Due to the underutilization of advanced deep learning techniques in the diagnosis and treatment of checkpoint inhibitor-related pneumonia (CIP), this study proposed a CIP prediction algorithm based on multimodal data fusion.</div></div><div><h3>Methods</h3><div>Specifically, the algorithm constructed a model using patients’ computed tomography (CT) imaging, electronic medical records, and physiological examination reports. First, feature extraction modules were developed to process each data modality. Subsequently, multimodal features were fused using a cross-attention approach. Finally, these features were input into a classifier for classification.</div></div><div><h3>Results</h3><div>Experimental results demonstrated that multimodal cross-attention network (MMCA-Net) significantly outperformed single-modality models and traditional fusion methods. In10-fold patient-level cross-validation, the proposed model achieved an average accuracy of 87.12% (±0.83%) and an area under the curve (AUC) of 0.8981 (±0.007). Furthermore, the algorithm showed excellent reproducibility, with performance deviation of less than 0.2% in independent replication trials. Quantitative analysis of attention weights confirmed that the model effectively integrated clinical context to resolve ambiguous radiological patterns.</div></div><div><h3>Conclusions</h3><div>The proposed deep learning-based multimodal method provides a stable and highly accurate tool for predicting CIP. By integrating information from imaging, textual data, and laboratory results, MMCA-Net offers a valuable clinical reference for physicians, with the potential to enhance patient safety and improve treatment outcomes in the management of cancer immunotherapy.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109688"},"PeriodicalIF":4.9,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-29, DOI: 10.1016/j.bspc.2026.109673
Ivan Blekanov , Gleb Kim , Fedor Ezhov , Evgenii Larin , Lev Kovalenko , Anthony Nwohiri , Egor Razumilov
Advancements in artificial intelligence are rapidly transforming healthcare, including the diagnosis of aortic aneurysms, which relies on precise measurement of aortic parameters from CT scans. Current manual methods are time-consuming and require expert surgeons, making automation essential. Accurate automation depends on robust aortic semantic segmentation, cross-section reconstruction, and parameter extraction. Existing 2D segmentation models achieve Dice similarity coefficients (DSC) of 0.842–0.890, while 3D models reach 0.750–0.950. Despite the generally high segmentation accuracy, 3D models require substantial computational resources for both training and inference. This presents a substantial challenge for clinical deployment, especially in developing countries. Our research bridges this gap by advancing state-of-the-art 2D deep learning techniques for aortic semantic segmentation on CT scans. In this regard, we developed a pipeline leveraging novel neural network (NN) architectures and computer vision (CV) techniques. Various high-performing semantic segmentation NNs were rigorously compared. The best NNs (such as VAN-S-UNet, rViT-UNet (TransUNet), MiT-B2-UNet) achieved a DSC of 0.938–0.976 for open datasets, and 0.912 for our dataset of 50 aortic CT scans. The proposed pipeline automates the main stages of CT image processing, from raw CT scan data to quantitative aortic assessment, extracting clinically relevant parameters such as cross-sectional area, border length, and major and minor diameters for subsequent pathology diagnosis and informed clinical decision-making. Case study experiments show minor deviations between the results of the proposed method and expert assessments: approximately 5% for perimeter, 6% for major diameter, 10% for minor diameter, and 15% for cross-sectional area measurement.
{"title":"Automated measurement of aortic parameters using deep learning and computer vision","authors":"Ivan Blekanov , Gleb Kim , Fedor Ezhov , Evgenii Larin , Lev Kovalenko , Anthony Nwohiri , Egor Razumilov","doi":"10.1016/j.bspc.2026.109673","DOIUrl":"10.1016/j.bspc.2026.109673","url":null,"abstract":"<div><div>Advancements in artificial intelligence are rapidly transforming healthcare, including the diagnosis of aortic aneurysms, which relies on precise measurement of aortic parameters from CT scans. Current manual methods are time-consuming and require expert surgeons, making automation essential. Accurate automation depends on robust aortic semantic segmentation, cross-section reconstruction, and parameter extraction. Existing 2D segmentation models achieve Dice similarity coefficients (DSC) of 0.842–0.890, while 3D models reach 0.750–0.950. Despite the generally high segmentation accuracy, 3D models require substantial computational resources for both training and inference. This presents a substantial challenge for clinical deployment, especially in developing countries. Our research bridges this gap by advancing state-of-the-art 2D deep learning techniques for aortic semantic segmentation on CT scans. In this regard, we developed a pipeline leveraging novel neural network (NN) architectures and computer vision (CV) techniques. Various high-performing semantic segmentation NNs were rigorously compared. The best NNs (such as VAN-S-UNet, rViT-UNet (TransUNet), MiT-B2-UNet) achieved a DSC of 0.938–0.976 for open datasets, and 0.912 for our dataset of 50 aortic CT scans. The proposed pipeline automates the main stages of CT image processing, from raw CT scan data to quantitative aortic assessment, extracting clinically relevant parameters such as cross-sectional area, border length, and major and minor diameters for subsequent pathology diagnosis and informed clinical decision-making. Case study experiments show minor deviations between the results of the proposed method and expert assessments: approximately 5% for perimeter, 6% for major diameter, 10% for minor diameter, and 15% for cross-sectional area measurement.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109673"},"PeriodicalIF":4.9,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-29, DOI: 10.1016/j.bspc.2026.109681
Xugang Li , Guanghao Huang , Yinhua Liu , Keum-Shik Hong
Electroencephalography (EEG) provides a non-invasive, cost-effective, and objective means for detecting depression, yet existing methods often fail to integrate the full spectrum of EEG information, including time–frequency, spatial, and connectivity features. This paper proposes a multidimensional deep learning framework, termed the Local-Global Feature Fusion Network (LGFF-Net), which unifies local brain-region dynamics with whole-brain functional connectivity (FC) patterns. In LGFF-Net, local time–frequency features are extracted from eight functional brain regions using a multi-branch residual neural network enhanced with multi-head attention, highlighting the most discriminative regions. Simultaneously, global connectivity features are captured by phase-locking-value-based FC matrices across five frequency bands, combined with a learnable band-weighting mechanism and a convolutional neural network. A global-to-local fusion mechanism further allows whole-brain connectivity to adaptively modulate the contribution of each brain region before classification, thereby strengthening the complementarity between local and global features. On the MODMA dataset, LGFF-Net achieves an accuracy of 97.64 ± 0.21%, outperforming state-of-the-art approaches, and it also maintains strong performance on an additional independent EEG dataset collected in Malaysia, indicating good cross-dataset generalization. Interpretability analyses highlight the frontal and temporal lobes, together with theta and alpha band connectivity, as key markers differentiating patients with depression from healthy controls, confirming both the robustness and interpretability of the proposed framework.
{"title":"EEG-based depression detection using a local–global feature fusion deep learning network","authors":"Xugang Li , Guanghao Huang , Yinhua Liu , Keum-Shik Hong","doi":"10.1016/j.bspc.2026.109681","DOIUrl":"10.1016/j.bspc.2026.109681","url":null,"abstract":"<div><div>Electroencephalography (EEG) provides a non-invasive, cost-effective, and objective means for detecting depression, yet existing methods often fail to integrate the full spectrum of EEG information, including time–frequency, spatial, and connectivity features. This paper proposes a multidimensional deep learning framework, termed the Local-Global Feature Fusion Network (LGFF-Net), which unifies local brain-region dynamics with whole-brain functional connectivity (FC) patterns. In LGFF-Net, local time–frequency features are extracted from eight functional brain regions using a multi-branch residual neural network enhanced with multi-head attention, highlighting the most discriminative regions. Simultaneously, global connectivity features are captured by phase-locking-value-based FC matrices across five frequency bands, combined with a learnable band-weighting mechanism and a convolutional neural network. A global-to-local fusion mechanism further allows whole-brain connectivity to adaptively modulate the contribution of each brain region before classification, thereby strengthening the complementarity between local and global features. On the MODMA dataset, LGFF-Net achieves an accuracy of 97.64 ± 0.21%, outperforming state-of-the-art approaches, and it also maintains strong performance on an additional independent EEG dataset collected in Malaysia, indicating good cross-dataset generalization. Interpretability analyses highlight the frontal and temporal lobes, together with theta and alpha band connectivity, as key markers differentiating patients with depression from healthy controls, confirming both the robustness and interpretability of the proposed framework.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109681"},"PeriodicalIF":4.9,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-01-29, DOI: 10.1016/j.bspc.2026.109640
D.E. Martina Jaincy, Pattabiraman Venkatasubbu
Tumors are among the most serious diseases affecting human beings and can be fatal, so effective detection methods are essential. One of the most dangerous tumors faced by women is breast cancer (BC). Earlier diagnosis increases the survival rate and can save many lives. Several imaging modalities are available for detecting BC, and magnetic resonance imaging (MRI) is one of the most important for this task. In this review, 50 research papers on BC detection using MRI images are analyzed. The surveyed techniques are categorized as follows: machine learning (ML)-enabled approaches, including deep learning (DL) techniques as a specialized subset; clustering-based approaches; segmentation-based methods; and hybrid techniques. Compared with other modalities such as mammography and ultrasound, MRI is the most capable imaging method for detecting BC. The survey covers the categorization of research approaches, toolsets, years of publication, datasets, and performance metrics used for BC detection. Finally, the limitations of the investigated approaches are discussed, motivating researchers to develop more effective approaches for detecting BC from MRI images.
{"title":"An empirical study for breast cancer detection using MRI images","authors":"D.E. Martina Jaincy, Pattabiraman Venkatasubbu","doi":"10.1016/j.bspc.2026.109640","DOIUrl":"10.1016/j.bspc.2026.109640","url":null,"abstract":"<div><div>Tumor is a dreadful disease faced by human beings and can lead to death. Thus, ultimate methods must be applied to these diseases and save human beings from them. One of the dangerous kinds of tumor faced by women is breast cancer (BC). Earlier diagnosis increases the survival rate, and it should save more lives by protecting women from these dangerous diseases. Different types of images are there for detecting BC. Magnetic resonance imaging (MRI) is the most essential imaging method in predicting BC. Various survey papers are reviewed in this survey for detecting BC using MRI images. In this review, 50 research papers are analyzed regarding breast cancer (BC) detection using MRI images. Moreover, 50 research papers are reviewed in this study regarding breast cancer (BC) detection using MRI images. The technique-related overviews are categorized as follows: machine learning (ML)-enabled approaches, including deep learning (DL) techniques as a specialized subset; clustering-based approaches; segmentation-based methods; and hybrid techniques. MRI is the most accomplished imaging for detecting BC compared to other images like mammography, ultrasound and so on. This survey is comprised of categorization research approaches, toolset, year of publication, datasets and performance metrics to detect BC. Finally, the limitations of the investigated approaches are explained, which motivates these researchers to develop the latest effectual approaches for detecting BC by wielding MRI images.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"118 ","pages":"Article 109640"},"PeriodicalIF":4.9,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146080751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}