DBMAF: Dual-branch multimodal attention-based feature fusion network for fusing histopathology and radiology images
Pub Date: 2026-06-15 | Epub Date: 2026-02-07 | DOI: 10.1016/j.bspc.2026.109739
Yingfa Li , Jialin Shi , Yufei Wang , Jiping Wei , Yaru Wei , Liang Wu , Meihao Wang , Zhifang Pan
Integrating radiology and histopathology images provides critical complementary perspectives for cancer survival prediction. However, current research faces two main challenges: (1) significant discrepancies in spatial scale and feature dimensionality between modalities; and (2) limited clinical generalizability due to existing methods being restricted to single cancer types or tasks. To overcome these barriers, we propose the Dual-Branch Multimodal Attention-based Feature Fusion Network (DBMAF). This framework employs an enhanced multi-scale channel attention mechanism for intra-modal feature extraction and an attention-guided cross-modal module to learn discriminative correlations between modalities. We validated DBMAF on four cancer cohorts, comprising three public datasets (TCGA-OV, TCGA-KIRC, TCGA-LIHC) and one private institutional dataset (WMU-CRC). Quantitative evaluations demonstrate that our method consistently outperforms all compared methods, achieving a maximum C-index of 0.910 on the TCGA-LIHC cohort. Furthermore, DBMAF showed robust performance across multiple survival endpoints (OS, TTP, and PFS) on the TCGA-OV dataset, highlighting its clinical utility for precise treatment stratification.
{"title":"DBMAF: Dual-branch multimodal attention-based feature fusion network for fusing histopathology and radiology images","authors":"Yingfa Li , Jialin Shi , Yufei Wang , Jiping Wei , Yaru Wei , Liang Wu , Meihao Wang , Zhifang Pan","doi":"10.1016/j.bspc.2026.109739","DOIUrl":"10.1016/j.bspc.2026.109739","url":null,"abstract":"<div><div>Integrating radiology and histopathology images provides critical complementary perspectives for cancer survival prediction. However, current research faces two main challenges: (1) significant discrepancies in spatial scale and feature dimensionality between modalities; and (2) limited clinical generalizability due to existing methods being restricted to single cancer types or tasks. To overcome these barriers, we propose the Dual-Branch Multimodal Attention-based Feature Fusion Network (DBMAF). This framework employs an enhanced multi-scale channel attention mechanism for intra-modal feature extraction and an attention-guided cross-modal module to learn discriminative correlations between modalities. We validated DBMAF on four cancer cohorts, comprising three public datasets (TCGA-OV, TCGA-KIRC, TCGA-LIHC) and one private institutional dataset (WMU-CRC). Quantitative evaluations demonstrate that our method consistently outperforms all compared methods, achieving a maximum C-index of 0.910 on the TCGA-LIHC cohort. Furthermore, DBMAF showed robust performance across multiple survival endpoints (OS, TTP, and PFS) on the TCGA-OV dataset, highlighting its clinical utility for precise treatment stratification.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109739"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146191910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Refined myocardium segmentation from CT using a Hybrid-Fusion transformer
Pub Date: 2026-06-15 | Epub Date: 2026-02-11 | DOI: 10.1016/j.bspc.2026.109712
Shihua Qin , Fangxu Xing , Jihoon Cho , Jinah Park , Xiaofeng Liu , Amir Rouhollahi , Elias J. Bou Farhat , Hoda Javadikasgari , Ashraf Sabe , Farhad R. Nezami , Jonghye Woo , Iman Aganj
Accurate segmentation of the left ventricle (LV) in cardiac CT images is crucial for assessing ventricular function and diagnosing cardiovascular diseases. Creating a sufficiently large training set with accurate manual labels of the LV can be cumbersome. More efficient semi-automatic segmentation, however, often includes unwanted structures, such as papillary muscles, owing to the low contrast between the LV wall and surrounding tissues. This study introduces a two-input-channel method within a Hybrid-Fusion Transformer deep-learning framework that produces refined LV labels from a combination of CT images and semi-automatic rough labels, effectively removing the papillary muscles. By leveraging the efficiency of semi-automatic LV segmentation, we train an automatic refined-segmentation model on a small set of images with both refined manual and rough semi-automatic labels. Evaluated through quantitative cross-validation, our method outperformed models that used only CT images or only rough masks as input.
{"title":"Refined myocardium segmentation from CT using a Hybrid-Fusion transformer","authors":"Shihua Qin , Fangxu Xing , Jihoon Cho , Jinah Park , Xiaofeng Liu , Amir Rouhollahi , Elias J. Bou Farhat , Hoda Javadikasgari , Ashraf Sabe , Farhad R. Nezami , Jonghye Woo , Iman Aganj","doi":"10.1016/j.bspc.2026.109712","DOIUrl":"10.1016/j.bspc.2026.109712","url":null,"abstract":"<div><div>Accurate segmentation of the left ventricle (LV) in cardiac CT images is crucial for assessing ventricular function and diagnosing cardiovascular diseases. Creating a sufficiently large training set with accurate manual labels of LV can be cumbersome. More efficient semi-automatic segmentation, however, often includes unwanted structures, such as papillary muscles, due to low contrast between the LV wall and surrounding tissues. This study introduces a two-input-channel method within a Hybrid-Fusion Transformer deep-learning framework to produce refined LV labels from a combination of CT images and semi-automatic rough labels, effectively removing papillary muscles. By leveraging the efficiency of semi-automatic LV segmentation, we train an automatic refined segmentation model on a small set of images with both refined manual and rough semi-automatic labels. Evaluated through quantitative cross-validation, our method outperformed models that used only either CT images or rough masks as input.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109712"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Brain tumor classification method based on segmented uniformity measure and spatial shift information fusion
Pub Date: 2026-06-15 | Epub Date: 2026-02-12 | DOI: 10.1016/j.bspc.2026.109705
Xiaorui Zhang , Peisen Lu , Wei Sun , Rui Jiang
Brain tumors are common malignant tumors, and their accurate classification is crucial for early diagnosis and prevention; appropriate feature extraction and classification methods contribute significantly to this goal. Although deep learning achieves excellent results on brain tumor datasets, traditional descriptors such as Local Ternary Patterns (LTP) remain well suited to extracting the complex texture features of brain tumors. S2-MLP is effective at processing LTP features and, when combined with LTP, enhances the correlation between input features and achieves strong classification results, but the resulting features still lack uniformity and discriminative power. The present research proposes a probability feature expression method based on a partition uniformity measure, which reduces computational complexity by transforming three-dimensional coding into two-dimensional coding through partitioning. Regions are labeled according to the uniformity measure: non-uniform regions receive distinct labels, while uniform regions share the same label. These labels are then converted into features using their occurrence probabilities. Additionally, a multi-spatial-shift segmented-attention information fusion method is proposed. The classifier is redesigned by expanding the feature maps multiple times, applying spatial shifts in different directions to each feature map, and fusing the shifted feature maps with a split-attention module, enhancing the correlation between features. The internal nodes of the MLP are also optimized to improve the model's generalization performance. The method achieves the highest classification accuracies on the Sa and SfB datasets, 95.32% and 97.26% respectively, indicating significant application potential in brain tumor classification.
{"title":"Brain tumor classification method based on segmented uniformity measure and spatial shift information fusion","authors":"Xiaorui Zhang , Peisen Lu , Wei Sun , Rui Jiang","doi":"10.1016/j.bspc.2026.109705","DOIUrl":"10.1016/j.bspc.2026.109705","url":null,"abstract":"<div><div>As a common malignant tumor, the accurate classification of brain tumors is crucial for early diagnosis and prevention. Appropriate feature extraction and classification methods can help significantly to achieve this goal. Traditional methods like Local Ternary Patterns (LTP) are suitable for extracting complex texture features of brain tumors, despite deep learning’s excellent results in classifying brain tumor datasets. S<sup>2</sup>-MLP, while effective in processing brain tumor features extracted by LTP, lacks uniformity and discriminative power, despite enhancing correlation between input features and achieving excellent classification results when combined with LTP. The present research proposes a probability feature expression method based on partition uniformity measure, which reduces computational complexity by transforming three-dimensional coding into two-dimensional coding through partitioning. Regions are labeled based on uniformity measure, with non-uniform regions receiving different labels and uniform regions receiving the same. These labels are converted into features using occurrence probabilities. Additionally, a method for multi-spatial shift segmented attention information fusion is proposed. The classifier is redesigned by expanding the feature maps multiple times, applying spatial shifts in different directions to each feature map, and using a split attention module to fuse the shifted feature maps, enhancing the correlation between features. The internal nodes of the MLP are also optimized to improve the model’s generalization performance. The experiments achieved the highest classification accuracy on the Sa and SfB datasets, achieving 95.32% and 97.26%, respectively, indicating that this method has significant potential applications in brain tumor classification.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109705"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MAR-GCNet: Multi-label abnormal detection of electrocardiograms by combining multiscale features and graph convolutional networks
Pub Date: 2026-06-15 | Epub Date: 2026-02-12 | DOI: 10.1016/j.bspc.2026.109841
Kan Luo , Haixin He , Yu Chen , Lu You , Jiajia Yang , Dengke Hong , Jianxing Li , Chitin Hon
Cardiovascular diseases (CVDs) are the leading cause of global mortality, and accurate electrocardiogram (ECG) diagnoses are essential for effective clinical interventions. This paper introduces MAR-GCNet, a novel deep learning framework for multi-label ECG anomaly detection that integrates multi-scale feature extraction and inter-class correlation modeling. It combines multi-attention residual networks (MARNs), graph convolutional networks (GCNs), and a weighted fusion strategy. The MARNs incorporate ECA-ResNet blocks with convolutional kernels of sizes 3, 5, and 7 to capture both local and global temporal characteristics of 12-lead ECG signals. The GCNs use a conditional probability matrix (CPM) and a multi-label feature matrix (MLFM) to model inter-class dependencies and mutual exclusivity among cardiac abnormalities. A weighted fusion loss function integrates the outputs of the MARN and GCN branches, enabling optimal multi-label predictions. Experiments on the PTB-XL dataset show that MAR-GCNet outperforms several state-of-the-art (SOTA) models across various annotation levels, achieving F1 scores of 72.68%, 66.80%, 69.46%, 76.84%, 52.06%, and 90.97% on the “all”, “diag.”, “sub-diag.”, “super-diag.”, “form”, and “rhythm” tasks, respectively. Ablation studies confirm that the multi-scale feature extraction and the two-layer GCN configuration significantly enhance model performance. These results indicate that MAR-GCNet is a promising approach for accurate and robust automated ECG analysis.
{"title":"MAR-GCNet: Multi-label abnormal detection of electrocardiograms by combining multiscale features and graph convolutional networks","authors":"Kan Luo , Haixin He , Yu Chen , Lu You , Jiajia Yang , Dengke Hong , Jianxing Li , Chitin Hon","doi":"10.1016/j.bspc.2026.109841","DOIUrl":"10.1016/j.bspc.2026.109841","url":null,"abstract":"<div><div>Cardiovascular diseases (CVDs) are the leading cause of global mortality, and accurate electrocardiogram (ECG) diagnoses are essential for effective clinical interventions. This paper introduces MAR-GCNet, a novel deep learning framework for multi-label ECG anomaly detection that integrates multi-scale feature extraction and inter-class correlation modeling. It combines multi-attention residual networks (MARNs), graph convolutional networks (GCNs) and a weighted fusion strategy. The MARNs incorporate ECA-ResNet blocks with convolutional kernels of sizes 3, 5, and 7 to capture both local and global temporal characteristics in 12-lead ECG signals. The GCNs use a conditional probability matrix (CPM) and a multi-label feature matrix (MLFM) to model inter-class dependencies and mutual exclusivity among cardiac abnormalities. A weighted fusion loss function is employed to integrate the outputs of the MARNs and GCNs branches, enabling optimal multi-label predictions. Experiments on the PTB-XL dataset show that MAR-GCNet outperforms several state-of-the-art (SOTA) models across various annotation levels, achieving the F1 scores of 72.68%, 66.80%, 69.46%, 76.84%, 52.06%, and 90.97% in the “all”, “diag.”, “sub-diag.”, “super-diag.”, “form”, and “rhythm” tasks, respectively. Ablation studies confirm that the integration of multi-scale feature extraction and the two-layer GCN configuration significantly enhance the model performance. These results indicate that MAR-GCNet is a promising approach for accurate and robust automated ECG analysis.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109841"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pain intensity classification and evaluation of individual differences in subjects based on hybrid CNN–BiLSTM approach
Pub Date: 2026-06-15 | Epub Date: 2026-02-12 | DOI: 10.1016/j.bspc.2026.109815
Mingxuan Sun , Yang Liu , Daoshuang Geng , Xiaobang Wu , Daoguo Yang
Pain is a complex subjective experience that requires objective assessment methods for precise diagnosis and treatment. Current approaches relying on self-reports are susceptible to bias and individual variability. This study proposes a cross-mixed model combining a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network (a hybrid CNN–BiLSTM framework) that classifies pain intensity from electroencephalography (EEG) signals while explicitly modeling interindividual differences. We introduce a quantitative pain sensitivity index derived from pain threshold and tolerance measurements during cold pressor tests, which facilitates categorizing subjects into high- and low-sensitivity groups. The CNN component extracts spatial features from EEG time–frequency representations, while the BiLSTM with self-attention captures the temporal dynamics of pain evolution. Subject-independent evaluation was performed using a Leave-One-Subject-Out (LOSO) cross-validation (LOSOCV) strategy. The model achieves accuracies of 88.64% (no pain), 95.80% (mild pain), 99.75% (moderate pain), and 82.96% (severe pain) in the undivided group. When individual sensitivity differences revealed through group-stratified training are considered, the overall accuracy increases to 93.98%, accompanied by corresponding increases in recall and F1-score. Ablation studies confirm the contribution of each architectural component (CNN: spatial feature extraction; BiLSTM: temporal modeling; attention: salient segment weighting; LOSOCV: generalization). Statistical analysis reveals a significant correlation between intersubject pain score differences and prediction loss (R² = 0.45, p < 0.01), validating the effect of individual variability. The proposed framework provides not only accurate pain classification but also a methodology for personalizing pain assessment based on individual sensitivity profiles, showing potential for precise clinical pain management.
{"title":"Pain intensity classification and evaluation of individual differences in subjects based on hybrid CNN–BiLSTM approach","authors":"Mingxuan Sun , Yang Liu , Daoshuang Geng , Xiaobang Wu , Daoguo Yang","doi":"10.1016/j.bspc.2026.109815","DOIUrl":"10.1016/j.bspc.2026.109815","url":null,"abstract":"<div><div>Pain is a complex subjective experience requiring objective assessment methods for precise diagnosis and treatment. Current approaches relying on self-reports are susceptible to bias and individual variability. This study proposes a cross-mixed model combining a convolutional neural network (CNN) and a bidirectional long short-term memory (BiLSTM) network (hybrid CNN–BiLSTM framework). It classifies pain intensity on the basis of electroencephalography (EEG) signals while explicitly modeling interindividual differences. We introduce a quantitative pain sensitivity index derived from pain threshold and tolerance measurements during cold pressor tests. It facilitates the categorization of subjects into high- and low-sensitivity groups. The CNN component extracts spatial features from EEG time–frequency representations, while the BiLSTM with self-attention captures the temporal dynamics of pain evolution. Subject-independent evaluation was performed using a Leave-One-Subject-Out (LOSO) cross-validation (LOSOCV) strategy. The model achieves accuracies of 88.64% (no pain), 95.80% (mild pain), 99.75% (moderate pain), and 82.96% (severe pain) in the undivided group. When individual sensitivity differences revealed through group-stratified training were considered, the overall accuracy increases to 93.98%, accompanied by increases in Recall and F1-scores increase. Ablation studies confirm the contributions of each architectural component (CNN: spatial feature extraction; BiLSTM: temporal modeling; attention: salient segment weighting; LOSOCV: generalization). Statistical analysis reveals significant correlation between intersubject pain score differences and prediction loss (<em>R</em><sup>2</sup> = 0.45, <em>p</em> < 0.01), validating the effect of individual variability. The proposed framework provides not only accurate pain classification but also a methodology for personalizing pain assessment based on individual sensitivity profiles, showing potential for precise clinical pain management.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109815"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Diffusion model-based medical ultrasound segmentation network in ultrasound image
Pub Date: 2026-06-15 | Epub Date: 2026-02-14 | DOI: 10.1016/j.bspc.2026.109709
Miao Li , Jing Lian , Jizhao Liu , Huaikun Zhang , Bin Shi , Qidong Liu
In medical ultrasound image segmentation, lesion areas are often blurred and difficult to distinguish from the background, which complicates segmentation. Over the past decade, deep convolutional neural networks have proven effective for medical image segmentation, but the inductive biases of convolutional architectures limit their ability to capture long-range dependencies. Recently, denoising diffusion probabilistic models (DDPMs) have emerged as powerful generative frameworks in computer vision. Yet many diffusion-based segmentation approaches overlook the semantic relationships between lesion regions (foreground) and surrounding normal tissues (background), often producing distorted segmentation outputs. To address these limitations, we propose DMUS-Net, a diffusion model-based network for medical ultrasound segmentation. DMUS-Net integrates a Multi-Scale Conditional Guidance Network (MSCGN) and Adaptive Detail-Oriented Attention (AODA) modules. By leveraging the Transformer network’s global relational capabilities, DMUS-Net effectively balances attention between global context and fine-grained features, and it dynamically integrates rich image prior information to enhance the semantic correlations between foreground and background. Additionally, we introduce Context-Aware Cross-Decoding (CACD) layers to capture global features and inter-channel correlations, improving both segmentation accuracy and efficiency. Applied to breast, thyroid, and gallbladder-stone ultrasound segmentation tasks, DMUS-Net achieves superior results in comparative experiments. These findings highlight DMUS-Net’s robust generalization ability and potential for practical clinical applications.
{"title":"Diffusion model-based medical ultrasound segmentation network in ultrasound image","authors":"Miao Li , Jing Lian , Jizhao Liu , Huaikun Zhang , Bin Shi , Qidong Liu","doi":"10.1016/j.bspc.2026.109709","DOIUrl":"10.1016/j.bspc.2026.109709","url":null,"abstract":"<div><div>In medical ultrasound image segmentation, lesion areas are often blurred, making it difficult to distinguish them from the background, thereby complicating segmentation tasks. In the past decade, deep convolutional neural networks have proven effective for medical image segmentation. However, the inductive biases in convolutional architectures limit their ability to capture long-range dependencies. Recently, denoising diffusion probabilistic models (DDPMs) have emerged as powerful generative frameworks in computer vision. Yet, many diffusion-based segmentation approaches overlook the semantic relationships between lesion regions (foreground) and surrounding normal tissues (background), often resulting in distorted segmentation outputs. To address these limitations, we propose DMUS-Net, a diffusion model-based network for medical ultrasound segmentation. DMUS-Net integrates a Multi-Scale Conditional Guidance Network (MSCGN) and Adaptive Detail-Oriented Attention (AODA) modules. By Leveraging the Transformer network’s global relational capabilities, DMUS-Net effectively balances attention between global context and fine-grained features. Subsequently, it dynamically integrates rich image prior information, enhancing semantic correlations between foreground and background. Additionally, we introduce Context-Aware Cross-Decoding layers (CACD) to capture global features and inter-channel correlations, thereby improving both segmentation accuracy and efficiency. DMUS-Net is applied to ultrasound segmentation tasks, including breast, thyroid, and gallbladder stones, achieving superior results, in comparative experiments. These findings highlight DMUS-Net’s robust generalization ability and potential for practical clinical applications.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109709"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146193104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unbiased diagnostic report generation via multi-modal counterfactual inference
Pub Date: 2026-06-15 | Epub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109639
Yuting Guo , Shuai Li , Wenfeng Song , Aimin Hao
Automated diagnostic report generation is a challenging vision-and-language bridging task aimed at accurately describing medical images and performing cross-modal causal inference. Despite its significant clinical importance, widespread application remains challenging. Existing methods often rely on models pre-trained on large-scale medical report datasets, which introduces data shifts between training and testing sets and results in irrelevant contextual biases in the visual domain and correlation biases within the knowledge graph. To address these issues, we propose a novel multimodal causal inference approach called Multimodal Counterfactual Unbiased Report Generation (MCURG), which incorporates causal inference to exploit invariant rationales. Our key innovation lies in leveraging counterfactual inference to reduce visual and knowledge biases. MCURG employs a Structural Causal Model (SCM) to elucidate the complex relationships among images, knowledge graphs, reports, confounders, and personalized features. We design two multimodal debiasing modules: a visual debiasing module and a knowledge graph debiasing module. The visual debiasing module focuses on the Total Direct Effect of image features, mitigating confounding factors, while the knowledge graph debiasing module identifies individualized treatments within the graph, reducing spurious generations. Extensive experiments and comprehensive evaluations on multiple datasets demonstrate that MCURG effectively reduces bias and improves the accuracy of generated reports. Through the use of the SCM and counterfactual reasoning, this multimodal causal inference approach successfully addresses bias in automated diagnostic report generation, marking a significant innovation in the field. The code is available at https://github.com/stellating/MCURG.
{"title":"Unbiased diagnostic report generation via multi-modal counterfactual inference","authors":"Yuting Guo , Shuai Li , Wenfeng Song , Aimin Hao","doi":"10.1016/j.bspc.2026.109639","DOIUrl":"10.1016/j.bspc.2026.109639","url":null,"abstract":"<div><div>Automated diagnostic report generation is a challenging vision-and-language bridging task aimed at accurately describing medical images and performing cross-modal causal inference. Despite its significant clinical importance, widespread application remains challenging. Existing methods often rely on pre-trained models with large-scale medical report datasets, leading to data shifts between training and testing sets, resulting in irrelevant contextual biases in the visual domain and correlation biases within the knowledge graph. To address these issues, we propose a novel multimodal causal inference approach called Multimodal Counterfactual Unbiased Report Generation (MCURG), which incorporates causal inference to exploit invariant rationales. Our key innovation lies in leveraging counterfactual inference to reduce visual and knowledge biases. MCURG employs a Structural Causal Model (SCM) to elucidate the complex relationships among images, knowledge graphs, reports, confounders, and personalized features. We design two multimodal debiasing modules: a visual debiasing module and a knowledge graph debiasing module. The visual debiasing module focuses on the Total Direct Effect of image features, mitigating confounding factors, while the knowledge graph debiasing module identifies individualized treatments within the graph, reducing spurious generations. We conducted extensive experiments and comprehensive evaluations on multiple datasets, demonstrating that MCURG effectively reduces bias and improves the accuracy of generated reports. This multimodal causal inference approach, through the use of SCM and counterfactual reasoning, successfully addresses bias in automated diagnostic report generation, marking a significant innovation in the field. The codes are available at <span><span>https://github.com/stellating/MCURG</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109639"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146193097","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GMMA-Net: A CCTA image segmentation algorithm based on grouped multi-path feature fusion and multi-scale attention
Pub Date: 2026-06-15 | Epub Date: 2026-02-06 | DOI: 10.1016/j.bspc.2026.109726
Yi Wang , Pei Deng , Tinghui Zheng , Haoyao Cao
Automatic and accurate segmentation of the coronary arteries (CA) is a prerequisite for high-precision reconstruction of three-dimensional CA models. However, the complex structure of the CA, including low contrast, large variation in vessel diameter, and high curvature, poses significant challenges for segmentation and reconstruction. In addition, coronary computed tomography angiography (CCTA) images contain abundant background information (such as other tissues, organs, or vessels), further increasing the difficulty of segmentation. These factors often lead to vessel discontinuity and incomplete segmentation, so accurate CA segmentation remains a challenging task. In this study, we propose GMMA-Net to improve the continuity, robustness, and noise resistance of CA segmentation. GMMA-Net employs a grouped multi-path feature fusion module (GMFFM) in the encoder to capture richer multi-scale feature information. Furthermore, by introducing a multi-scale attention module (MAM) into the bottleneck layer, we achieve dynamic weight adjustment, capture long-range dependencies, and suppress redundant features. Experimental results show that GMMA-Net outperforms existing methods in CA segmentation from CCTA images, effectively overcoming challenges caused by scale sensitivity and noise interference. GMMA-Net demonstrates superior performance on metrics such as IoU, Dice coefficient, recall, and HD95 (the 95th-percentile Hausdorff distance), and exhibits notably stronger segmentation capability on cases with poor image quality and large variations in vessel diameter. The code of the proposed method is available at https://github.com/DengPei-C/GMMA-Net.
{"title":"GMMA-Net: A CCTA image segmentation algorithm based on grouped multi-path feature fusion and multi-scale attention","authors":"Yi Wang , Pei Deng , Tinghui Zheng , Haoyao Cao","doi":"10.1016/j.bspc.2026.109726","DOIUrl":"10.1016/j.bspc.2026.109726","url":null,"abstract":"<div><div>Automatic and accurate segmentation of coronary arteries (CA) is a prerequisite for high-precision reconstruction of three-dimensional CA models. However, the complex structure of CA, including low contrast, significant variation in vessel diameter, and high curvature, poses significant challenges for segmentation and reconstruction. In addition, coronary computed tomography angiography (CCTA) images contain abundant background information (such as other tissues, organs, or vessels), further increasing the difficulty of segmentation. These factors often lead to vessel discontinuity and incomplete segmentation. Therefore, accurate CA segmentation remains a challenging task. In this study, we propose the GMMA-Net network to improve the continuity, robustness, and noise resistance of CA segmentation. GMMA-Net employs a grouped multi-path feature fusion module (GMFFM) in the encoder to capture richer multi-scale feature information. Furthermore, by introducing a multi-scale attention module (MAM) into the bottleneck layer of GMMA-Net, we achieve dynamic weight adjustment, capture long-range dependencies, and suppress redundant features. Experimental results show that GMMA-Net outperforms existing methods in the task of CA segmentation from CCTA images, effectively overcoming challenges caused by scale sensitivity and noise interference. GMMA-Net demonstrates superior performance on metrics such as IoU, Dice coefficient, recall rate, and <span><math><mrow><mi>H</mi><msub><mrow><mi>D</mi></mrow><mrow><mn>95</mn></mrow></msub></mrow></math></span>, especially exhibiting stronger segmentation capability when handling cases with poor image quality and large variations in vessel diameter. The code of the proposed method is available at <span><span>https://github.com/DengPei-C/GMMA-Net</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109726"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146193095","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic regulation of brain network and muscle activity in upper limb force generation among older adults: A temporal dynamic graph Fourier transform approach
Pub Date: 2026-06-15 | Epub Date: 2026-02-13 | DOI: 10.1016/j.bspc.2026.109829
Mingxia Zhang , Huijing Hu , Di Ao , Li Yan , Qinghua Huang , Zhengxiang Zhang , Le Li
This study investigates the neural mechanisms underlying age-related declines in motor control by proposing a novel Temporal Dynamic Graph Fourier Transform (TDGFT) method. TDGFT integrates graph signal processing with dynamic brain network analysis to characterize time-varying corticomuscular interactions in the spectral domain, thereby linking global and local brain connectivity patterns to motor behavior. Integrating functional near-infrared spectroscopy (fNIRS) and electromyography (EMG), we systematically examine the dynamic regulation of brain networks and muscle activity in older and younger adults during elbow flexion tasks at 30% and 70% of maximum voluntary contraction (MVC). Sixteen older adults and sixteen younger adults were recruited for the study. Our findings reveal that older adults exhibit weaker dynamic regulation of brain regions during high-load tasks, accompanied by significantly increased constraints of structural brain networks on functional activity, reflecting a decline in cognitive control. Additionally, older adults rely on multi-regional brain coordination for motor control during low-intensity tasks, while reducing cognitive load to enhance motor efficiency during high-intensity tasks. By providing an interpretable spectral representation of corticomuscular dynamics, TDGFT advances the understanding of how aging reshapes motor-related brain connectivity. These findings may help identify changes associated with age-related motor decline and facilitate the design of individualized motor rehabilitation strategies for older adults.
{"title":"Dynamic regulation of brain network and muscle activity in upper limb force generation among older adults: A temporal dynamic graph Fourier transform approach","authors":"Mingxia Zhang , Huijing Hu , Di Ao , Li Yan , Qinghua Huang , Zhengxiang Zhang , Le Li","doi":"10.1016/j.bspc.2026.109829","DOIUrl":"10.1016/j.bspc.2026.109829","url":null,"abstract":"<div><div>This study investigates the neural mechanisms underlying age-related declines in motor control by proposing a novel Temporal Dynamic Graph Fourier Transform (TDGFT) method. TDGFT integrates graph signal processing with dynamic brain networks analysis to characterize time-varying corticomuscular interactions in the spectral domain, thereby linking global and local brain connectivity patterns to motor behavior. Integrating functional near-infrared spectroscopy (fNIRS) and electromyography (EMG), we systematically examine the dynamic regulation of brain network and muscle activity in older adults and younger adults during elbow flexion tasks at 30% and 70% of maximum voluntary contraction (MVC). Sixteen older adults and sixteen younger adults are recruited for the study. Our findings reveal that older adults exhibit weaker dynamic regulation of brain regions during high-load tasks, accompanied by significantly increased constraints of structural brain networks on functional activity, reflecting a decline in cognitive control. Additionally, older adults rely on multi-regional brain coordination for motor control during low-intensity tasks, while reducing cognitive load to enhance motor efficiency during high-intensity tasks. By providing an interpretable spectral representation of corticomuscular dynamics, TDGFT advances the understanding of how aging reshapes motor-related brain connectivity. These findings may help identify changes of age-related motor decline and facilitate the design of individualized motor rehabilitation strategies for older adults.</div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109829"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192617","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
DeepOsteoCls: Deep learning-based framework for Knee Osteoarthritis Classification with qualitative explanations from radiographs and MRI volumes
Akshay Daydar , Arijit Sur , Subramani Kanagaraj , Hanif Laskar
Pub Date: 2026-06-15 | DOI: 10.1016/j.bspc.2026.109819
Knee Osteoarthritis (KOA) is a degenerative joint disorder affecting middle-aged and elderly individuals, and its diagnosis still faces challenges in achieving objective, transparent quantification and in incorporating clinical manifestations, despite advances in deep learning for medical imaging. To address these issues, this paper proposes DeepOsteoCls, a hybrid deep learning classification framework combining a Convolutional Neural Network (CNN) with a Transformer encoder, which performs binary and multi-class classification of KOA, from X-rays via the OsteoXRNet model and from MRI scans via the OsteoMRNet model, with Gradient-weighted Class Activation Mappings (Grad-CAMs) for visual explanation. The framework also introduces an Osteoarthritis Edge Detection (OAED) module, which extracts edge-based features from X-ray images, and a Multi-Resolution Feature Integration (MRFI) module, which extracts multi-scale regional features from the MRI volume. Furthermore, a disorder-aware weakly supervised training scheme, Domain Knowledge Transfer and Entropy Regularization (DoKTER), is proposed to enhance the explainability of Radiological KOA (RKOA) diagnosis by predicting region scores and Grad-CAMs for MRI scans. Comprehensive experiments on the Osteoarthritis Initiative (OAI) dataset show that the framework achieves classification accuracies of 72.10% (X-ray) and 53.16% (MRI) in the multi-class task and 85.74% (X-ray) and 81.04% (MRI) in the binary task, outperforming state-of-the-art models. The DoKTER scheme classifies the affected region with 65.15% and 62.5% accuracy on the OAI and Multi-Hospital Knee Osteoarthritis (MHKOA) datasets, respectively. Additionally, Femoral Cartilage Thickness (FCT) in non-RKOA subjects can be effectively monitored using the region score, with distinct cut-off values. The code is available at: https://github.com/adaydar/Deep-OsteoCls
{"title":"DeepOsteoCls: Deep learning-based framework for Knee Osteoarthritis Classification with qualitative explanations from radiographs and MRI volumes","authors":"Akshay Daydar , Arijit Sur , Subramani Kanagaraj , Hanif Laskar","doi":"10.1016/j.bspc.2026.109819","DOIUrl":"10.1016/j.bspc.2026.109819","url":null,"abstract":"<div><div>Knee Osteoarthritis (KOA) is a degenerative joint disorder affecting middle-aged and elderly individuals, with its diagnosis facing challenges in achieving objective, transparent quantification and incorporating clinical manifestations, despite advances in deep-learning for medical imaging. To address these issues, in this paper, a deep learning-based hybrid (Convolutional Neural Network (CNN)-Transformer encoder) classification framework, DeepOsteoCls, is proposed to perform binary and multi-class classification of KOA from X-rays and MRI scans from OsteoXRNet and OsteoMRNet models separately, with Gradient-weighted Class Activation Mappings (Grad-CAMs). The Osteoarthritis Edge Detection (OAED) and Multi-Resolution Feature Integration (MRFI) modules are also introduced in the proposed framework to facilitate the extraction of edge-based features from X-ray images and multi-scale regional features from the MRI volume, respectively. Furthermore, a disorder-aware weakly supervised training scheme—Domain Knowledge Transfer and Entropy Regularization (DoKTER) is proposed to enhance the explainability of Radiological KOA (RKOA) diagnosis by predicting the region score and GradCAMs of MRI scans. Comprehensive experiments on the Osteoarthritis Initiative (OAI) dataset demonstrated that the proposed framework achieved a classification accuracy of 72.10% for X-ray and 53.16% for MRI in a multi-class classification task, and 85.74% for X-ray and 81.04% for MRI in a binary classification task, outperforming state-of-the-art models. The DoKTER scheme is found to accurately classify the affected region with 65.15% and 62.5% for the OAI and Multi-Hospital Knee Osteoarthritis (MHKOA) datasets, respectively. Additionally, Femoral Cartilage Thickness (FCT) in non-RKOA subjects can be effectively monitored using the region score, with distinct cut-offs values. The code is available at: <span><span>https://github.com/adaydar/Deep-OsteoCls</span><svg><path></path></svg></span></div></div>","PeriodicalId":55362,"journal":{"name":"Biomedical Signal Processing and Control","volume":"119 ","pages":"Article 109819"},"PeriodicalIF":4.9,"publicationDate":"2026-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146192799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}