
Latest publications in Biomedical Signal Processing and Control

DBMAF: Dual-branch multimodal attention-based feature fusion network for fusing histopathology and radiology images
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-07 | DOI: 10.1016/j.bspc.2026.109739
Yingfa Li , Jialin Shi , Yufei Wang , Jiping Wei , Yaru Wei , Liang Wu , Meihao Wang , Zhifang Pan
Integrating radiology and histopathology images provides critical complementary perspectives for cancer survival prediction. However, current research faces two main challenges: (1) significant discrepancies in spatial scale and feature dimensionality between modalities; and (2) limited clinical generalizability due to existing methods being restricted to single cancer types or tasks. To overcome these barriers, we propose the Dual-Branch Multimodal Attention-based Feature Fusion Network (DBMAF). This framework employs an enhanced multi-scale channel attention mechanism for intra-modal feature extraction and an attention-guided cross-modal module to learn discriminative correlations between modalities. We validated DBMAF on four cancer cohorts, comprising three public datasets (TCGA-OV, TCGA-KIRC, TCGA-LIHC) and one private institutional dataset (WMU-CRC). Quantitative evaluations demonstrate that our method consistently outperforms all compared methods, achieving a maximum C-index of 0.910 on the TCGA-LIHC cohort. Furthermore, DBMAF showed robust performance across multiple survival endpoints (OS, TTP, and PFS) on the TCGA-OV dataset, highlighting its clinical utility for precise treatment stratification.
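The C-index (Harrell's concordance index) reported above measures how often a model's risk ordering agrees with observed survival outcomes. A minimal, illustrative sketch of computing it for censored survival data (not the authors' implementation):

```python
from itertools import combinations

def c_index(times, events, risk_scores):
    """Harrell's concordance index: the fraction of comparable pairs in
    which the subject with the higher risk score fails earlier.
    events[i] = 1 if the event was observed, 0 if censored."""
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        # order the pair so that i has the earlier time
        if times[j] < times[i]:
            i, j = j, i
        # comparable only if the earlier time is an observed event
        if times[i] == times[j] or not events[i]:
            continue
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5          # ties get half credit
    return concordant / comparable

# toy example: risk scores perfectly ordered against survival times
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.9, 0.7, 0.5, 0.3]
print(c_index(times, events, scores))  # → 1.0
```

A C-index of 0.5 corresponds to random ordering, 1.0 to perfect ranking, which is why values such as 0.910 indicate strong discriminative performance.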
Citations: 0
Data-knowledge feature fusion for PPG-based blood pressure prediction: Low-dimensional extraction via functional data analysis and knowledge constraint
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-07 | DOI: 10.1016/j.bspc.2026.109754
Qingfeng Tang , Huihui Hu , Chao Tao , Pengcheng Ding , Guowei Dai , Guangjun Wang , Xiaojuan Hu , Benyue Su , Jiatuo Xu , Hui An
Although concatenating knowledge features (KF) and data features (DF) of photoplethysmography (PPG) can improve the predictive performance of blood pressure monitoring models, this approach inevitably increases the dimensionality of the feature space. To address this limitation, we propose an innovative feature extraction method that deeply integrates KF and DF, rather than simply concatenating them.
Our method employs functional data analysis to extract DF by treating the PPG signal as a continuous functional curve. Subsequently, the distribution patterns of KF are thoroughly analyzed to construct a KF-based constrained space, which serves as a guide during DF extraction, yielding novel data-knowledge features (DKF).
The experimental results on blood pressure prediction showed that, without the need for additional dimensions, 9-dimensional DKF delivered superior predictive performance compared to both 9-dimensional DF and 8-dimensional KF. Specifically, for systolic blood pressure prediction, DKF reduces the mean absolute error (MAE) to 11.41, outperforming KF (MAE=12.11) and DF (MAE=13.24). Similarly, for diastolic blood pressure, DKF achieves an MAE of 7.27, lower than that of KF (7.41) and DF (7.84).
The proposed feature extraction method effectively overcomes the drawbacks of feature concatenation, offering a novel and effective approach to extracting low-dimensional, highly discriminative features from PPG for accurate blood pressure estimation.
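The functional-data-analysis step above can be sketched by projecting one PPG beat onto a small fixed basis and keeping the coefficients as a low-dimensional feature vector. The Fourier basis below is an illustrative assumption (the paper's exact basis is not specified here); with four harmonics it happens to yield a 9-dimensional vector, matching the dimensionality quoted in the abstract.

```python
import numpy as np

def fourier_features(signal, n_harmonics=4):
    """Represent a curve by least-squares coefficients on a small
    Fourier basis: constant + sin/cos pairs -> 2*n_harmonics + 1 dims."""
    t = np.linspace(0, 1, len(signal), endpoint=False)
    cols = [np.ones_like(t)]
    for k in range(1, n_harmonics + 1):
        cols += [np.sin(2 * np.pi * k * t), np.cos(2 * np.pi * k * t)]
    basis = np.stack(cols, axis=1)          # (n_samples, 2*n_harmonics + 1)
    coef, *_ = np.linalg.lstsq(basis, signal, rcond=None)
    return coef

# synthetic "beat": first harmonic plus a weaker second harmonic
t = np.linspace(0, 1, 200, endpoint=False)
beat = np.sin(2 * np.pi * t) + 0.3 * np.sin(4 * np.pi * t)
feat = fourier_features(beat)
print(feat.shape)  # → (9,)
```

Because the basis is orthogonal on this grid, the fit recovers the generating coefficients exactly (feat[1] ≈ 1.0, feat[3] ≈ 0.3), which is the sense in which a handful of coefficients can summarise a whole curve.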
Citations: 0
Multi-attention-aware motion estimation for cardiac MR imaging based on a feature pyramid network
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109714
Kun Wu , Xiang Chen , Nina Cheng , Nishant Ravikumar , Alejandro F. Frangi

Objective:

Unsupervised cardiac motion estimation often confronts complex scenarios, including a lack of explicit reference for the deformation fields and intramodal anatomical gaps. These factors introduce substantial obstacles to the effective representation of both smooth and accurate cardiac motion, thereby hindering the prediction of intricate structural details after registration. However, existing approaches have not sufficiently explored the explicit spatial correlations encompassed in multi-range displacements.

Methods:

To overcome these challenges, we propose MAPC-Net, a novel multi-attention-guided pyramidal network with spatial-correlation normalisation compensation in its mechanism, for high-quality cardiac motion estimation.

Results:

Extensive quantitative and qualitative experimental results indicate that MAPC-Net achieves exceptional generalisation of the effective deformation field on the private UKBiobank dataset and the publicly available ACDC cardiac cine-MRI dataset. Our model achieves an average Dice score above 75% (77.2%), a 95% Hausdorff Distance below 4.50 mm, and a Negative Jacobian Determinant fraction of 0.20% on the UKBiobank dataset, without segmentation-label guidance. We highlight the significant clinical potential of the proposed framework through a downstream analysis of the cardiac peak strain signal, improving the estimated peak radial strain value from 41.72% to 43.28%.
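The Negative Jacobian Determinant figure quoted in the Results measures how often the deformation field folds on itself. A minimal NumPy sketch of that check on a 2-D displacement field (illustrative only; cardiac registration fields may be 2-D+time or 3-D):

```python
import numpy as np

def negative_jacobian_fraction(disp):
    """disp: (H, W, 2) displacement field; disp[..., 0] is the shift
    along x (columns), disp[..., 1] along y (rows). Returns the fraction
    of locations where det(J) of phi(x) = x + u(x) is non-positive."""
    ux, uy = disp[..., 0], disp[..., 1]
    dux_dy, dux_dx = np.gradient(ux)   # gradients w.r.t. row (y), col (x)
    duy_dy, duy_dx = np.gradient(uy)
    det = (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx
    return float(np.mean(det <= 0))

# a smooth, small deformation should not fold anywhere
yy, xx = np.mgrid[0:64, 0:64]
disp = np.stack([0.5 * np.sin(xx / 10.0), 0.5 * np.cos(yy / 10.0)], axis=-1)
print(negative_jacobian_fraction(disp))  # → 0.0
```

Values near zero (such as the 0.20% reported) indicate an almost everywhere diffeomorphic, physically plausible deformation.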

Conclusion:

A novel framework was proposed for the refinement of motion estimation by introducing attention-guided correlation between warped and fixed frames.

Significance:

The architecture of our proposed model offers a new solution for predicting high-quality cardiac deformation fields, leveraging an attention-aware cost volume calculation embedded in a pyramidal network for motion estimation.
Citations: 0
CycleGAN-based prosody and spectrum modeling for Mandarin touch-controlled Electrolaryngeal speech enhancement
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109746
Jie Zhou , Li Wang , Fengji Li , Shaochuan Zhang , Fei Shen , Fan Fan , Tao Liu , Xiaohong Chen , Haijun Niu
The application of the Electrolarynx (EL) for laryngectomees who speak tonal languages remains challenging, because tonal completion is difficult without useful fundamental frequency (F0) information. This study proposes a novel Mandarin EL speech enhancement framework that integrates the prior F0 information provided by finger movements with the Cycle-Consistent Adversarial Network (CycleGAN) and the Continuous Wavelet Transform (CWT). For prosody modeling, we exploit the hierarchical structure inherent in Mandarin prosody by using CWT decomposition coefficients as a feature representation of F0. For spectral conversion, we extract Mel-frequency cepstral coefficients (MCEP) as spectral features. These two feature sets were trained separately using the CycleGAN model. Acoustic feature analysis indicates that the four converted tones are closer to normal tones in both F0 value and F0 contour. The spectrogram of the converted speech is also more similar to that of normal speech, compensating for the missing low-frequency energy below 500 Hz. Both subjective and objective evaluations demonstrate the effectiveness of the proposed method for Mandarin EL speech enhancement. This study also provides a novel approach to EL speech enhancement in other tonal languages, and it may offer valuable insights and guidance for the future development of tonal EL devices and EL speech enhancement.
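The CWT-based multi-scale F0 representation can be sketched with a hand-rolled wavelet transform over a synthetic F0 contour. The Ricker (Mexican-hat) mother wavelet and the scale grid below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def ricker(points, a):
    """Ricker (Mexican-hat) wavelet of width parameter a."""
    t = np.arange(points) - (points - 1) / 2.0
    return (1 - (t / a) ** 2) * np.exp(-0.5 * (t / a) ** 2)

def cwt(signal, scales):
    """Continuous-wavelet-style decomposition: one filtered row per scale,
    coarse scales capturing phrase-level prosody, fine scales tone-level."""
    out = np.empty((len(scales), len(signal)))
    for i, a in enumerate(scales):
        w = ricker(min(10 * int(a), len(signal)), a)
        out[i] = np.convolve(signal, w, mode="same")
    return out

# synthetic F0 contour: slow phrase declination plus a tone-level ripple
n = 256
t = np.arange(n)
f0 = 200 - 0.1 * t + 10 * np.sin(2 * np.pi * t / 32)
coeffs = cwt(f0, scales=[2, 4, 8, 16])
print(coeffs.shape)  # → (4, 256)
```

Each row of `coeffs` isolates F0 variation at one temporal scale, which is the hierarchical prosody structure the abstract exploits.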
Citations: 0
Integrating EEG microstates and functional connectivity via cross-attention for emotion recognition in virtual reality
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109685
Yicai Bai , Yucheng Zhou , Jinqi Dong , Dengjiujiu He , Chao Jiang , Jinglu Hu , Yingjie Li
Emotion is fundamental to human cognition and behavior, and electroencephalography (EEG), with its high temporal resolution, provides a powerful approach to investigate the neural activity underlying emotional processing. We combined EEG with Virtual Reality technology to conduct an EEG-based emotion experiment in a more immersive environment. Moreover, EEG microstate and functional connectivity features are closely related, yet capturing their complex nonlinear interactions remains challenging. To address this challenge, we proposed a deep learning framework that integrates Cross-Attention mechanisms with convolutional neural networks (CNN) to model these interactions. Specifically, Cross-Attention captures inter-feature dependencies, while CNN performs hierarchical feature extraction. Experimental results demonstrate that our framework significantly outperforms the baseline CNN model, particularly in recognizing subtle emotional states such as neutral emotion. Notably, this improvement is driven by the synergistic interaction between microstate and functional connectivity features, thereby improving model interpretability. These findings highlight the potential of Cross-Attention CNN to elucidate the complex nonlinear neural mechanisms underlying emotional processing.
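The cross-attention fusion step can be sketched as scaled dot-product attention in NumPy, with microstate features as queries attending over functional-connectivity features as keys/values. All shapes and the random projection matrices are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(micro, conn, d_k=16, seed=0):
    """micro: (n_micro, d_micro) microstate tokens (queries);
    conn: (n_conn, d_conn) connectivity tokens (keys/values)."""
    rng = np.random.default_rng(seed)
    Wq = rng.normal(size=(micro.shape[-1], d_k))
    Wk = rng.normal(size=(conn.shape[-1], d_k))
    Wv = rng.normal(size=(conn.shape[-1], d_k))
    Q, K, V = micro @ Wq, conn @ Wk, conn @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))   # (n_micro, n_conn) weights
    return attn @ V                           # fused (n_micro, d_k) output

micro = np.random.default_rng(1).normal(size=(4, 8))   # 4 microstate tokens
conn = np.random.default_rng(2).normal(size=(10, 32))  # 10 connectivity tokens
out = cross_attention(micro, conn)
print(out.shape)  # → (4, 16)
```

Each output row is a connectivity-informed re-encoding of one microstate token, which is how the mechanism models dependencies between the two feature families before the CNN stage.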
Citations: 0
Time frequency transform kernel enhanced ShallowConvNet for auditory selective attention decoding with steady state motion auditory evoked potential
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109736
Huanqing Zhang , Jun Xie , Kaixuan Liu , Yan Liu , Wenxiang Dong , Guanghua Xu
The steady state motion auditory evoked potential (SSMAEP) is a neural response elicited by rhythmic auditory stimuli with periodic spatial motion. SSMAEP-based brain computer interfaces (BCIs) rely on auditory selective attention to decode user intent in multi-source environments. However, the complex temporal and spectral structure of SSMAEP presents challenges for effective feature extraction from the electroencephalogram (EEG). Time-frequency transforms are suited to extracting the joint time–frequency features of SSMAEP. Notably, these transforms share structural similarities with convolution operations in convolutional neural networks. In this study, we propose a novel time frequency convolutional layer that incorporates structured kernels based on the S transform, continuous wavelet transform (CWT), and short-time Fourier transform (STFT). These time frequency kernels are embedded as learnable filters and replace the conventional first convolutional layer of ShallowConvNet. This design enables the model to more effectively capture SSMAEP signal dynamics across both time and frequency domains. The proposed method was evaluated on two SSMAEP-BCI datasets with two and three auditory targets. Experimental results demonstrate consistent improvements in classification accuracy and robustness compared to baseline models. Furthermore, analysis of the learned kernels revealed that the time frequency filters retained their interpretable structure after training, with task-relevant shifts in center frequency and bandwidth. These findings highlight not only the performance advantage but also the improved interpretability of the proposed model, offering insights into the spectral encoding of SSMAEP-BCI.
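The structured time-frequency kernels can be sketched as a bank of Hann-windowed sinusoids (an STFT-style initialization for a first convolutional layer). The kernel length, sampling rate, and frequency grid below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def stft_kernels(kernel_len=64, fs=250.0, freqs=(4, 8, 16, 32)):
    """Build STFT-style convolution kernels: for each centre frequency,
    a Hann-windowed cosine (real part) and sine (imaginary part)."""
    t = np.arange(kernel_len) / fs
    window = np.hanning(kernel_len)
    kernels = []
    for f in freqs:
        kernels.append(window * np.cos(2 * np.pi * f * t))  # real part
        kernels.append(window * np.sin(2 * np.pi * f * t))  # imaginary part
    return np.stack(kernels)  # (2 * len(freqs), kernel_len)

bank = stft_kernels()
print(bank.shape)  # → (8, 64)
```

Initializing a convolutional layer with such a bank and then letting the frequencies and windows be learned is one way the abstract's "learnable time-frequency filters" retain an interpretable structure after training.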
Citations: 0
MacroNet-enhanced energy-aware node clustering protocol for wireless body area networks
IF 4.9 | CAS Tier 2 (Medicine) | Q1 ENGINEERING, BIOMEDICAL | Pub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109792
Altaf Hussain , Shuaiyong Li , Tariq Hussain
Wireless Body Area Networks (WBANs) support continuous patient monitoring in clinical and remote settings by enabling low-power sensors to collect and forward physiological data. However, WBAN deployments are constrained by limited battery capacity and challenging on-body propagation, which increase path loss, degrade link reliability, and shorten network lifetime. Moreover, many existing routing/clustering solutions treat these issues separately rather than jointly. To address this, we propose MacroNet-Enhanced Energy-aware Node Clustering Protocol (MEE-NCP), which integrates dual energy-efficiency models with a MacroNet-based clustering design using four cluster heads (CHs). MEE-NCP forwards data to the coordinator node through CHs using a cost function that prioritizes higher residual energy and shorter distance to balance energy consumption and improve delivery reliability. We evaluate MEE-NCP in MATLAB against representative WBAN routing schemes using Packet Error Rate (PER), Packet Generation Rate (PGR), Data Generation Rate (DGR), RSSI, SNR, residual energy, throughput, end-to-end delay, and network lifetime. Simulation results indicate improved energy distribution and lifetime, reduced delay and packet errors, and stronger link quality via better path-loss handling.
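The cluster-head selection rule described above (prefer higher residual energy and shorter distance) can be sketched as follows. The exact MEE-NCP cost function is not given in the abstract, so a simple distance-to-energy ratio is assumed here to capture the stated priorities:

```python
# Hypothetical sketch of energy-aware cluster-head (CH) selection:
# a sensor node forwards to the CH minimising distance / residual_energy.
def pick_cluster_head(node, cluster_heads):
    """node: (x, y) position; cluster_heads: dicts with 'pos' and 'energy' (J)."""
    def cost(ch):
        dx = node[0] - ch["pos"][0]
        dy = node[1] - ch["pos"][1]
        dist = (dx * dx + dy * dy) ** 0.5
        return dist / ch["energy"]          # lower is better
    return min(cluster_heads, key=cost)

# four CHs, as in the MEE-NCP design (positions/energies are made up)
chs = [
    {"id": "CH1", "pos": (0.0, 0.0), "energy": 0.2},
    {"id": "CH2", "pos": (1.0, 0.0), "energy": 0.5},
    {"id": "CH3", "pos": (0.0, 2.0), "energy": 0.5},
    {"id": "CH4", "pos": (2.0, 2.0), "energy": 0.1},
]
best = pick_cluster_head((0.5, 0.1), chs)
print(best["id"])  # → CH2
```

Note how the node picks CH2 over the equally distant CH1 because CH2 has more residual energy, which is the load-balancing behaviour the protocol aims for.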
Citations: 0
Federated learning for prenatal detection of interrupted aortic arch using fetal ultrasound imaging
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2026-02-06 · DOI: 10.1016/j.bspc.2026.109795
Jiancheng Han , Heqing Wang , Yifan Feng , Qi Yang , Jingtan Li , Haojie Zhang , Yihua He , Jiang Liu , Toru Nakamura , Yang Cao , Naidi Sun , Kun Qian , Bin Hu , Xinru Gao , Yan Xia , Zongjie Weng , Björn W. Schuller , Yoshiharu Yamamoto
This study presents the first application of federated learning (FL) for prenatal detection of Interrupted Aortic Arch (IAA) using fetal ultrasound images. To address the challenges of data scarcity, privacy constraints, and inter-institutional variability, we develop a federated learning IAA detection method and systematically evaluate three representative strategies (FedAvg, FedProx, and FedBABU) across five clinical centres. Results show that FL improves model performance over local training in recall and F1-score in data-scarce centres. Among FL algorithms, FedAvg and FedProx consistently outperform FedBABU in stability and generalisation. Among the three CNN architectures compared — ResNet-50, EfficientNet-B3, and DenseNet-121 — DenseNet-121 demonstrates superior overall performance, particularly in non-independent and identically distributed (Non-IID) scenarios. Our framework demonstrates the feasibility of collaborative AI for rare disease detection without data sharing, laying the foundation for scalable, real-world prenatal screening of congenital heart defects.
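Of the three FL strategies compared, FedAvg is the simplest: the server aggregates client models by averaging their parameters, weighted by local dataset size. A minimal sketch, with parameters shown as flat lists of floats for illustration (the actual framework trains CNNs on ultrasound images):

```python
# Minimal sketch of FedAvg aggregation: the server averages client model
# parameters, weighted by each client's local dataset size. Flat float
# lists stand in for the real CNN weight tensors.

def fedavg(client_params, client_sizes):
    total = float(sum(client_sizes))
    dim = len(client_params[0])
    return [
        sum(params[i] * size for params, size in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients: the larger one (3 samples vs 1) pulls the average toward
# its own weights.
global_params = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])  # -> [2.5, 3.5]
```

This size weighting is also why data-scarce centres benefit: their models are nudged toward an aggregate shaped mostly by the larger cohorts, without any raw images leaving a site.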
Biomedical Signal Processing and Control, Volume 119, Article 109795.
Citations: 0
GMMA-Net: A CCTA image segmentation algorithm based on grouped multi-path feature fusion and multi-scale attention
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2026-02-06 · DOI: 10.1016/j.bspc.2026.109726
Yi Wang , Pei Deng , Tinghui Zheng , Haoyao Cao
Automatic and accurate segmentation of coronary arteries (CA) is a prerequisite for high-precision reconstruction of three-dimensional CA models. However, the complex structure of CA, including low contrast, significant variation in vessel diameter, and high curvature, poses significant challenges for segmentation and reconstruction. In addition, coronary computed tomography angiography (CCTA) images contain abundant background information (such as other tissues, organs, or vessels), further increasing the difficulty of segmentation. These factors often lead to vessel discontinuity and incomplete segmentation. Therefore, accurate CA segmentation remains a challenging task. In this study, we propose the GMMA-Net network to improve the continuity, robustness, and noise resistance of CA segmentation. GMMA-Net employs a grouped multi-path feature fusion module (GMFFM) in the encoder to capture richer multi-scale feature information. Furthermore, by introducing a multi-scale attention module (MAM) into the bottleneck layer of GMMA-Net, we achieve dynamic weight adjustment, capture long-range dependencies, and suppress redundant features. Experimental results show that GMMA-Net outperforms existing methods in the task of CA segmentation from CCTA images, effectively overcoming challenges caused by scale sensitivity and noise interference. GMMA-Net demonstrates superior performance on metrics such as IoU, Dice coefficient, recall rate, and HD95, especially exhibiting stronger segmentation capability when handling cases with poor image quality and large variations in vessel diameter. The code of the proposed method is available at https://github.com/DengPei-C/GMMA-Net.
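Two of the reported metrics, Dice coefficient and IoU, are standard overlap scores computable directly from binary masks. A small sketch on flattened 0/1 masks, for illustration only (the authors evaluate on full CCTA volumes):

```python
# Sketch of the IoU and Dice overlap metrics used to score segmentations.
# Masks are flat lists of 0/1 labels; perfect agreement on empty masks is
# scored 1.0 by convention here.

def dice_iou(pred, gt):
    inter = sum(1 for p, g in zip(pred, gt) if p and g)
    p_sum, g_sum = sum(pred), sum(gt)
    union = p_sum + g_sum - inter
    dice = 2.0 * inter / (p_sum + g_sum) if (p_sum + g_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou

# One overlapping voxel out of two predicted and two ground-truth voxels:
dice, iou = dice_iou([1, 1, 0, 0], [1, 0, 1, 0])  # dice = 0.5, iou = 1/3
```

Dice is always at least as large as IoU on the same masks, which is why thin, high-curvature vessels that fragment easily hurt IoU more sharply; boundary metrics such as HD95 capture the contour errors these overlap scores miss.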
Biomedical Signal Processing and Control, Volume 119, Article 109726.
Citations: 0
Unbiased diagnostic report generation via multi-modal counterfactual inference
IF 4.9 · CAS Zone 2 (Medicine) · Q1 ENGINEERING, BIOMEDICAL · Pub Date: 2026-02-06 · DOI: 10.1016/j.bspc.2026.109639
Yuting Guo , Shuai Li , Wenfeng Song , Aimin Hao
Automated diagnostic report generation is a challenging vision-and-language bridging task aimed at accurately describing medical images and performing cross-modal causal inference. Despite its significant clinical importance, widespread application remains challenging. Existing methods often rely on pre-trained models with large-scale medical report datasets, leading to data shifts between training and testing sets, resulting in irrelevant contextual biases in the visual domain and correlation biases within the knowledge graph. To address these issues, we propose a novel multimodal causal inference approach called Multimodal Counterfactual Unbiased Report Generation (MCURG), which incorporates causal inference to exploit invariant rationales. Our key innovation lies in leveraging counterfactual inference to reduce visual and knowledge biases. MCURG employs a Structural Causal Model (SCM) to elucidate the complex relationships among images, knowledge graphs, reports, confounders, and personalized features. We design two multimodal debiasing modules: a visual debiasing module and a knowledge graph debiasing module. The visual debiasing module focuses on the Total Direct Effect of image features, mitigating confounding factors, while the knowledge graph debiasing module identifies individualized treatments within the graph, reducing spurious generations. We conducted extensive experiments and comprehensive evaluations on multiple datasets, demonstrating that MCURG effectively reduces bias and improves the accuracy of generated reports. This multimodal causal inference approach, through the use of SCM and counterfactual reasoning, successfully addresses bias in automated diagnostic report generation, marking a significant innovation in the field. The codes are available at https://github.com/stellating/MCURG.
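The Total Direct Effect (TDE) idea — contrasting a factual prediction with a counterfactual one obtained from a blanked input — can be sketched as a logit subtraction. The function name and the bias-only counterfactual values below are assumptions for illustration, not MCURG's actual implementation:

```python
# Hedged sketch of counterfactual debiasing via Total Direct Effect (TDE):
# the bias captured by a counterfactual ("blanked") input is subtracted from
# the factual prediction, keeping only the effect attributable to the real
# image content.

def total_direct_effect(factual_logits, counterfactual_logits):
    return [f - c for f, c in zip(factual_logits, counterfactual_logits)]

# Class 1 looks strongest factually, but most of its score also appears for
# the blanked input (i.e. it is contextual bias); after TDE, class 0 wins.
factual = [2.0, 3.0]
counterfactual = [0.5, 2.5]  # bias-only prediction from a blanked image
debiased = total_direct_effect(factual, counterfactual)  # -> [1.5, 0.5]
```

The same subtraction pattern applies to the knowledge-graph branch: predictions driven by spurious graph correlations survive blanking and are therefore cancelled out.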
Biomedical Signal Processing and Control, Volume 119, Article 109639.
Citations: 0