
Latest Publications in Biomedical Signal Processing and Control

Towards objective heart disease classification: A deep neural network approach for comprehensive diagnosis
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-16 | DOI: 10.1016/j.bspc.2026.109502
Seunghee Han, Juyeob Lee, Eunil Park
Heart disease is a major burden in modern society owing to its severity. However, its diagnosis still relies heavily on human judgment, creating a need for technology that supports objective and rapid diagnosis. Several studies have attempted data-driven approaches to classifying heart disease, but they are limited to specific diseases and may not transfer to real clinical practice. To address these challenges, we propose a suite of deep learning-based classifiers, including a CNN and a state-of-the-art ViT enhanced with an auxiliary UNet feature extractor. To classify eight types of heart disease, we utilize multi-view echocardiogram images whose class counts reflect the proportions of actual cardiac patients. The experimental results reveal that a vanilla ViT is not suitable for the echocardiogram dataset (accuracy of 0.6451). However, performance improves with the UNet auxiliary feature extraction network (an accuracy of 0.8121 for EfficientUNet+ViT). Among the comparison models, our CNN achieved the highest performance, with an accuracy of 0.8829 at minimal computational cost, demonstrating its efficacy for direct disease classification without the need for a ViT.
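As a purely illustrative reading of the ViT-plus-auxiliary-UNet design described above (the abstract does not give the wiring), the PyTorch sketch below tokenizes a UNet-style encoder's feature maps and feeds them to a transformer encoder with a classification token. All module names and sizes are assumptions, and positional embeddings are omitted for brevity.

```python
import torch
import torch.nn as nn

class UNetEncoder(nn.Module):
    """Tiny UNet-style downsampling path (skip connections omitted); stands in
    for the auxiliary feature extractor."""
    def __init__(self, in_ch=1, base=32):
        super().__init__()
        self.blocks = nn.Sequential(
            nn.Conv2d(in_ch, base, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(base * 2, base * 4, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.blocks(x)  # (B, 128, H/8, W/8)

class UNetViTClassifier(nn.Module):
    """UNet encoder features are flattened into tokens for a ViT-style encoder."""
    def __init__(self, num_classes=8, dim=128, depth=4, heads=4):
        super().__init__()
        self.extractor = UNetEncoder()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        f = self.extractor(x)                  # (B, C, h, w)
        tokens = f.flatten(2).transpose(1, 2)  # one token per spatial cell
        cls = self.cls.expand(x.size(0), -1, -1)
        z = self.encoder(torch.cat([cls, tokens], dim=1))
        return self.head(z[:, 0])              # logits over the eight classes

logits = UNetViTClassifier()(torch.randn(2, 1, 224, 224))
print(logits.shape)  # torch.Size([2, 8])
```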
Citations: 0
Neurophysiological biomarker extraction through decoupled attention mechanisms and EEG signal processing in Alzheimer’s disease and frontotemporal dementia
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-16 | DOI: 10.1016/j.bspc.2026.109559
Alaa Hussein Abdulaal, Morteza Valizadeh, Mehdi Chehel Amirani
Alzheimer’s disease (AD) and frontotemporal dementia (FTD) represent critical neurodegenerative disorders requiring early and accurate detection for optimal patient management. This study introduces a Hybrid Frequency-Spatial Attention Network (HFSAN) that integrates convolutional neural networks with specialized attention mechanisms for the automated classification of dementia using EEG. The proposed methodology employs decoupled frequency and spatial attention modules, enabling the targeted analysis of spectral and topographical features before adaptive fusion via a gating mechanism. Advanced signal processing techniques, including continuous wavelet transform and artifact subspace reconstruction, ensure optimal feature extraction from 19-channel EEG recordings.
Evaluation on the Miltiadous dataset, comprising 88 subjects, demonstrated exceptional performance: 96.7 % accuracy for AD versus healthy controls, 92.3 % for FTD versus healthy controls, and 87.6 % for AD versus FTD classification. The method significantly outperformed traditional machine learning (Random Forest: 84.7 %) and existing deep learning approaches (CNN-2D: 91.3 %, CNN-LSTM: 92.4 %). A strong correlation with Mini-Mental State Examination scores (r = 0.894) supports the clinical relevance of the approach. A comprehensive interpretability analysis, conducted through Grad-CAM visualization, reveals clinically meaningful attention patterns that are consistent with established neurophysiological markers. The proposed HFSAN represents a significant advancement toward clinically viable EEG-based dementia detection, offering superior accuracy, interpretability, and practical applicability for early diagnosis and disease monitoring.
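The core of HFSAN is the gate that fuses the decoupled frequency and spatial attention branches. The abstract does not specify its design, so the following minimal PyTorch sketch (a learned sigmoid gate blending two self-attention branches over 19 channel tokens) is an assumed approximation rather than the paper's module.

```python
import torch
import torch.nn as nn

class GatedAttentionFusion(nn.Module):
    """Blend two attention branches with a learned gate: out = g*f + (1-g)*s."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.freq_branch = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.spat_branch = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, freq_feats, spat_feats):
        f, _ = self.freq_branch(freq_feats, freq_feats, freq_feats)  # spectral view
        s, _ = self.spat_branch(spat_feats, spat_feats, spat_feats)  # topographic view
        g = self.gate(torch.cat([f, s], dim=-1))  # per-feature gate in (0, 1)
        return g * f + (1 - g) * s

# 19 tokens mirror the 19 EEG channels; the 64-dim embedding is hypothetical.
fused = GatedAttentionFusion(64)(torch.randn(2, 19, 64), torch.randn(2, 19, 64))
print(fused.shape)  # torch.Size([2, 19, 64])
```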
Citations: 0
HSA-CompSeg: Deep learning-based multi-view segmentation and automated morphological quantification for COPLL
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-16 | DOI: 10.1016/j.bspc.2026.109599
Xuning Zhang, Lu Li, Jianpeng Chen, Changlin Lv, Huan Yang, Yongming Xi
Accurate segmentation of cervical ossification and vertebrae regions in MRI is crucial for Cervical Ossification of the Posterior Longitudinal Ligament (COPLL) diagnosis, as it forms the basis for deriving quantitative indicators essential for clinical decision-making. To address this need, we propose HSA-CompSeg, a novel hybrid framework for multi-view cervical MRI segmentation. The framework efficiently captures global contextual dependencies while maintaining semantic consistency across network layers. It further integrates spatial–channel collaborative attention and cross-channel mixing to enhance multi-scale feature fusion and improve boundary delineation. Beyond segmentation, we develop an automated morphological measurement pipeline, in which Landmark-based Morphology Quantification (LMQ) extracts axial-view indicators and Defect Compression Quantification (DCQ) estimates sagittal-view clinical metrics. Experiments on a COPLL MRI dataset comprising 32 patients (582 images) demonstrate that HSA-CompSeg achieves superior accuracy over state-of-the-art methods, with consistent improvements in DSC and HD95. Moreover, the resulting quantitative measurements exhibit high agreement with expert assessments, underscoring the clinical utility of the proposed end-to-end system for objective COPLL diagnosis and severity grading.
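Of the ingredients named above, spatial–channel collaborative attention is the most self-contained. The CBAM-style stand-in below (channel reweighting followed by spatial reweighting) is an assumed approximation for illustration, not HSA-CompSeg's actual module.

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """Channel attention from global pooling, then spatial attention from
    pooled feature maps (CBAM-like sketch)."""
    def __init__(self, ch, reduction=8):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch // reduction, 1), nn.ReLU(),
            nn.Conv2d(ch // reduction, ch, 1), nn.Sigmoid(),
        )
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, x):
        x = x * self.channel(x)  # reweight channels by global context
        pooled = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * self.spatial(pooled)  # emphasize salient regions

y = SpatialChannelAttention(64)(torch.randn(2, 64, 64, 64))
print(y.shape)  # torch.Size([2, 64, 64, 64])
```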
Citations: 0
Towards practical Alzheimer’s Disease diagnosis: A lightweight and interpretable spiking neural model
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-16 | DOI: 10.1016/j.bspc.2026.109595
Changwei Wu, Yifei Chen, Yuxin Du, Jinying Zong, Jie Dong, Mingxuan Liu, Feiwei Qin, Yong Peng, Jin Fan, Chaomiao Wang
Early diagnosis of Alzheimer’s Disease (AD), particularly at the mild cognitive impairment stage, is essential for timely intervention. However, this process faces significant barriers, including reliance on subjective assessments and the high cost of advanced imaging techniques. While deep learning offers automated solutions to improve diagnostic accuracy, its widespread adoption remains constrained due to high energy requirements and computational demands, particularly in resource-limited settings. Spiking neural networks (SNNs) provide a promising alternative, as their brain-inspired design is well-suited to model the sparse and event-driven patterns characteristic of neural degeneration in AD. These networks offer the potential for developing interpretable, energy-efficient diagnostic tools. Despite their advantages, existing SNNs often suffer from limited expressiveness and challenges in stable training, which reduce their effectiveness in handling complex medical tasks. To address these shortcomings, we introduce FasterSNN, a hybrid neural architecture that combines biologically inspired Leaky Integrate-and-Fire (LIF) neurons with region-adaptive convolution and multi-scale spiking attention mechanisms. This approach facilitates efficient, sparse processing of 3D MRI data while maintaining high diagnostic accuracy. Experimental results on benchmark datasets reveal that FasterSNN delivers competitive performance with significantly enhanced efficiency and training stability, highlighting its potential for practical application in AD screening. Our source code is available at https://github.com/wuchangw/FasterSNN.
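FasterSNN builds on Leaky Integrate-and-Fire (LIF) neurons. The standard discrete-time LIF update (leak toward reset, integrate input, fire on threshold, hard reset) is sketched below; the time constant and threshold are illustrative, and the surrogate-gradient machinery needed to train such a network is omitted.

```python
import torch

def lif_forward(inputs, tau=2.0, v_threshold=1.0, v_reset=0.0):
    """Discrete-time LIF over time-major input (T, B, N):
    v[t] = v[t-1] + (x[t] - (v[t-1] - v_reset)) / tau, spike if v >= threshold."""
    T, B, N = inputs.shape
    v = torch.full((B, N), v_reset)
    spikes = []
    for t in range(T):
        v = v + (inputs[t] - (v - v_reset)) / tau       # leak + integrate
        s = (v >= v_threshold).float()                  # binary spikes
        v = torch.where(s.bool(), torch.full_like(v, v_reset), v)  # hard reset
        spikes.append(s)
    return torch.stack(spikes)  # sparse, event-like activations (T, B, N)

out = lif_forward(torch.rand(8, 2, 16))  # 8 time steps, batch 2, 16 neurons
print(out.shape, out.mean().item())      # firing rate tracks the input drive
```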
Citations: 0
A multi-modal fusion-based deep learning with finetuned LLaMA 3 for lung disease diagnosis using PACS radiology reports
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-16 | DOI: 10.1016/j.bspc.2026.109466
J Lefty Joyson, K. Ruba Soundar, P. Nancy
Delayed identification of lung disease contributes substantially to global mortality and exposes significant limitations of conventional diagnostic strategies. Existing deep learning approaches face several challenges, including, but not limited to, poor generalizability, weak integration of heterogeneous features, and slow convergence. This research presents a diagnostic system centered on Multi-level Feature Gated Spatio-temporal Fusion with Siamese Tensor Transformer (MFGSF-STT), a new fusion architecture that delivers both highly accurate disease classification and clinically coherent radiology report generation. The main contribution is the multi-level gated fusion mechanism, which combines spatial, temporal, and semantic cues more efficiently than current models. In addition, the Animated Oat Optimization Algorithm (AOOA) is incorporated to speed up convergence and provide better hyperparameter tuning than traditional optimization methods. Furthermore, LLaMA 3 is fine-tuned under supervision to produce context-rich, medically consistent radiology reports that are easier for clinicians to understand and that support clinical decision-making. The framework achieved 98.5 % diagnostic accuracy, a 97.3 % F1-score, a BLEU score of 91.6, and a clinical coherence score of 4.7/5, demonstrating a strong link between language expressiveness and medical accuracy. Overall, the framework offers a clinically useful solution for lung disease diagnosis and addresses many problems of existing models, improving both prediction capability and clinical applicability.
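The abstract does not detail how the multi-level gated fusion combines spatial, temporal, and semantic cues. One plausible, assumed reading, softmax gates weighting three linearly projected feature streams, is sketched below; every dimension and name here is hypothetical.

```python
import torch
import torch.nn as nn

class MultiLevelGatedFusion(nn.Module):
    """Project heterogeneous streams to one width, then mix with softmax gates."""
    def __init__(self, dims, fused_dim=256):
        super().__init__()
        self.projs = nn.ModuleList([nn.Linear(d, fused_dim) for d in dims])
        self.gate = nn.Linear(len(dims) * fused_dim, len(dims))

    def forward(self, streams):
        z = [p(s) for p, s in zip(self.projs, streams)]             # align widths
        w = torch.softmax(self.gate(torch.cat(z, dim=-1)), dim=-1)  # stream weights
        return sum(w[..., i:i + 1] * z[i] for i in range(len(z)))

# Hypothetical sizes: image features, temporal features, report-text embedding.
fusion = MultiLevelGatedFusion(dims=[512, 128, 768])
out = fusion([torch.randn(2, 512), torch.randn(2, 128), torch.randn(2, 768)])
print(out.shape)  # torch.Size([2, 256])
```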
Citations: 0
A modularly designed controllable generative framework for glioma and MRI editing via style representations enhancement
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-15 | DOI: 10.1016/j.bspc.2026.109579
Liangce Qi, Zhengang Jiang, Weili Shi, Yu Miao, Guodong Wei
Generative models have emerged as a powerful solution to address data scarcity in deep learning for glioma diagnosis. Among these, pairwise image generation has garnered significant attention for its ability to enhance the diversity and utility of synthetic data. However, current methods are primarily limited to single-modality generation and lack precise control over crucial morphological attributes of glioma, such as shape, location, and size. This limitation hinders their broader clinical applicability. To address these challenges, we propose a novel modular generative framework. A key contribution is the introduction of a box conditioning mechanism that complements the commonly used segmentation masks and MRI images. Furthermore, we construct three aligned feature spaces through pre-training, which enable flexible and independent control over glioma morphology, surrounding anatomical context, and semantic information in multi-modal MRI. We train a total of six generators, modified from StyleGAN2, for two pixel-wise glioma segmentation map generation tasks and four modality MRI synthesis tasks. By sharing identical control vectors across the glioma and MRI generation tasks, our framework ensures superior anatomical consistency in the synthetic paired images. Extensive experiments on the BraTS 2023 dataset demonstrate the superiority of our method in terms of both synthetic image quality and utility in downstream tasks. Our code is available at https://github.com/LcQi-mic/Gli_edit.
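The box conditioning mechanism is the framework's key addition, but its encoding is not specified in the abstract. A simple assumed realization, rasterizing each bounding box into a binary plane concatenated with the generator's other inputs, is sketched here.

```python
import torch

def box_to_channel(boxes, height, width):
    """Rasterize per-sample boxes (x0, y0, x1, y1) into binary conditioning planes."""
    cond = torch.zeros(len(boxes), 1, height, width)
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        cond[i, 0, y0:y1, x0:x1] = 1.0  # desired glioma location/extent
    return cond

mri = torch.randn(2, 4, 128, 128)           # four MRI modalities (hypothetical size)
boxes = [(32, 40, 64, 80), (50, 20, 90, 60)]
cond = box_to_channel(boxes, 128, 128)
gen_input = torch.cat([mri, cond], dim=1)   # generator sees images + box condition
print(gen_input.shape)  # torch.Size([2, 5, 128, 128])
```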
Citations: 0
Multi-scale graph convolutional EEG emotion recognition method driven by dynamic channel state labels
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-15 | DOI: 10.1016/j.bspc.2026.109515
Ming Liu, Zichong Zhang, Haifeng Guo, Jianli Yang, Peng Xiong, Jieshuo Zhang, Xiuling Liu
Emotion recognition holds significant importance in fields such as biomedicine and brain-computer interfaces. Graph Convolutional Networks (GCNs) based on electroencephalographic (EEG) signals have been widely applied to emotion recognition tasks. However, existing methods often overlook the differential responses of brain regions during emotional reactions, and shallow GCN architectures struggle to fully exploit the spatial and functional correlations between EEG channels. To dynamically capture the responses of brain channels during emotional reactions and enhance the multi-scale feature extraction capability of GCNs, this study proposes an EEG emotion recognition method that combines a Graph Convolution Parameter Optimization Network (GCPONet) and a Multi-scale Emotion Recognition Network (MERNet). Specifically, GCPONet models the response differences of each channel during emotional reactions by dynamically optimizing the weight distribution of multiple EEG frequency bands (β and γ), and classifies these channels into three state labels (“active”, “stable”, and “sluggish”) to quantify their activation levels. MERNet extracts multi-scale features by integrating label information with a graph coarsening strategy, constructing a feature representation framework from local to global levels. Subject-dependent and subject-independent experiments conducted on the SEED dataset demonstrate two key findings: (1) The channels’ state labels generated by GCPONet can accurately reflect the state of each channel, providing targeted information for emotion recognition; (2) Compared with existing popular methods, the proposed method can effectively capture multi-scale EEG features, avoid the over-smoothing issue, and achieve more favorable performance in emotion classification. This research offers a feasible new perspective for emotion recognition and brain network analysis.
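A simplified, assumed version of the channel labeling step, relative beta-plus-gamma band power per channel thresholded into “active”, “stable”, and “sluggish”, might look like the NumPy sketch below; the quantile cut-offs and sampling rate are illustrative stand-ins for GCPONet's learned band weighting.

```python
import numpy as np

def band_power(x, fs, lo, hi):
    """Mean FFT-periodogram power of 1-D signal x in [lo, hi) Hz."""
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / x.size
    return psd[(freqs >= lo) & (freqs < hi)].mean()

def channel_state_labels(eeg, fs=200):
    """eeg: (channels, samples). Label channels by relative beta+gamma power."""
    scores = np.array([
        (band_power(ch, fs, 13, 30) + band_power(ch, fs, 30, 50))
        / band_power(ch, fs, 1, 50)
        for ch in eeg
    ])
    hi_t, lo_t = np.quantile(scores, [0.75, 0.25])  # illustrative cut-offs
    return np.where(scores >= hi_t, "active",
                    np.where(scores <= lo_t, "sluggish", "stable"))

labels = channel_state_labels(np.random.randn(62, 4000))  # SEED uses 62 channels
print(labels[:5])
```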
Citations: 0
Deep feature-based approaches for brain tumor classification and segmentation in medical imaging
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-15 | DOI: 10.1016/j.bspc.2026.109603
Agnesh Chandra Yadav, Maheshkumar H. Kolekar
Accurate brain tumor classification and segmentation are critical for patient prognosis and clinical decision-making, as they directly guide treatment planning. This review article examines the major developments in brain tumor imaging research from 2020 to 2025, focusing on methods based on convolutional neural networks, U-Net and its extensions, attention mechanisms, transformer-based designs, hybrid models, and recent generative techniques. Studies conducted on widely used MRI datasets — such as BraTS, FeTS, TCGA, and Figshare — are discussed, covering different imaging sequences including T1, T2, FLAIR, and contrast-enhanced scans. Reported results are compared using common evaluation measures like accuracy, Dice coefficient, Hausdorff distance, and Intersection over Union. This article also discusses persistent difficulties faced by researchers, including variations between datasets, unequal availability of imaging modalities, limited annotated data, and the need for methods that preserve patient privacy during training. Current research directions include multimodal feature integration, learning representations without extensive manual labeling, distributed learning frameworks, and improved interpretability of model outputs. In addition to reviewing the current methods, this study points out their limitations and suggests future directions for building automated brain tumor detection and analysis systems that are reliable, scalable, and suitable for clinical use.
Citations: 0
DB-HDFFN: Dual Branch Hierarchical Dynamic Feature Fusion Network for medical image classification
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-15 | DOI: 10.1016/j.bspc.2026.109538
Yang Wen, Bai Chen, Wuzhen Shi, Song Wu, Xiaokang Yang, Bin Sheng
Automatic, accurate disease classification of medical images is of great significance for timely clinical diagnosis and intervention. However, most existing medical image classification methods still face many challenges, such as being misled by irrelevant information, inaccurate feature extraction from complex lesion regions, and the lack of domain-specific structure knowledge. To address these limitations, we propose DB-HDFFN, a Dual-Branch Hierarchical Dynamic Feature Fusion Network, which introduces three key innovations. Firstly, we develop a Dynamic Sparse Attention Block (DSA-Block). Unlike existing Transformer-based methods that apply full attention and are easily distracted by irrelevant regions, the proposed DSA-Block introduces a novel top-k dynamic sparsification strategy that selectively preserves the most informative regions, enabling noise-robust, computation-efficient global modeling. Secondly, we design an Intra-Inter Variance Feature Extracting Block (ICV-Block) that leverages the inherent property of significant intra-image variance and relatively slight inter-image variance in medical images. By modeling this unique property, the ICV-Block enables the network to adaptively emphasize subtle lesion regions while suppressing misleading anatomical variations, thereby addressing a long-standing challenge in accurately capturing fine-grained pathological patterns in complex lesion areas. Finally, we propose a Dynamic Fusion Block (DF-Block) that performs hierarchical cross-branch fusion, ensuring effective integration of multi-scale global and local representations. The ACC and F1 values of our proposed model were 80.28% and 80.31% on the COVID-19-CT dataset, 92.47% and 92.49% on the Chest X-ray PA dataset, and 87.68% and 87.63% on the Kvasir dataset. These results across multiple datasets thoroughly verify that the proposed dual-branch hierarchical dynamic feature fusion network outperforms other state-of-the-art models for medical image classification.
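The DSA-Block's top-k dynamic sparsification keeps only the k highest-scoring keys per query before the softmax. A minimal single-head PyTorch sketch of that idea follows; the token count and the value of k are assumptions.

```python
import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=16):
    """Single-head attention restricted to the top-k keys per query."""
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)    # (B, Nq, Nk)
    kth = scores.topk(top_k, dim=-1).values[..., -1:]         # k-th largest per query
    scores = scores.masked_fill(scores < kth, float("-inf"))  # drop weak regions
    return F.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 196, 64)  # e.g. 14x14 patch tokens, 64-dim
out = topk_sparse_attention(q, k, v)
print(out.shape)  # torch.Size([2, 196, 64])
```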
Citations: 0
Wavelet attention fusion for semi-supervised ultrasound segmentation
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-01-15 | DOI: 10.1016/j.bspc.2026.109566
Xiaming Wu, Wenbo Yue, Xinglong Wu, Qing Huang, Chang Li, Yajun Yu, Guoping Xu
Ultrasound image segmentation plays a vital role in medical diagnosis. However, automatic segmentation remains a significant challenge due to the presence of noise, low contrast, and the limited availability of annotated data. This paper proposes a novel semi-supervised segmentation approach, termed WAF (Wavelet Attention Fusion). The method applies discrete wavelet transform (DWT) to decompose ultrasound images into sub-bands of different frequencies, primarily utilizing the low-frequency components for global feature representation, while the high-frequency components capture fine details and edges. To improve the model’s ability to focus on critical regions, we introduce an attention fusion module that integrates both channel and spatial attention mechanisms. This design effectively enhances the model’s perception of important frequency and spatial features on low resolution ultrasound images. Experiments on multiple ultrasound segmentation datasets demonstrate that WAF consistently outperforms traditional FixMatch and other state-of-the-art semi-supervised methods. Specifically, WAF yields Dice score improvements of +1.38%, +2.41%, and +3.88% over FixMatch on HC18, DDTI, and CCAUI, respectively. Ablation studies further confirm the essential role of wavelet decomposition and dual attention in boosting performance. Our findings suggest that WAF can significantly improve semi-supervised medical image segmentation while reducing reliance on labeled data. The code is publicly available at https://github.com/wxmadm/WAF.
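The decomposition WAF starts from, a one-level discrete wavelet transform into LL/LH/HL/HH sub-bands, is shown below with PyWavelets, followed by a simple learned per-sub-band reweighting that stands in for the paper's channel attention; the attention design here is an assumption, not WAF's exact module.

```python
import numpy as np
import pywt
import torch
import torch.nn as nn

def dwt_subbands(image):
    """One-level 2-D Haar DWT -> stacked (4, H/2, W/2) sub-bands: LL, LH, HL, HH."""
    ll, (lh, hl, hh) = pywt.dwt2(image, "haar")
    return np.stack([ll, lh, hl, hh])

class SubbandChannelAttention(nn.Module):
    """Learn per-sub-band weights so LL (global shape) or HH (edges) can dominate."""
    def __init__(self, n_bands=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(n_bands, n_bands), nn.Sigmoid())

    def forward(self, bands):                # (B, 4, H, W)
        w = self.fc(bands.mean(dim=(2, 3)))  # global pooling -> band weights
        return bands * w[:, :, None, None]

img = np.random.rand(128, 128).astype(np.float32)  # stand-in ultrasound frame
bands = torch.from_numpy(dwt_subbands(img)).float().unsqueeze(0)
print(SubbandChannelAttention()(bands).shape)  # torch.Size([1, 4, 64, 64])
```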
Citations: 0